Great Expectations

Software Development

Revolutionizing the speed and integrity of data collaboration

About us

The mission of Great Expectations is to revolutionize the speed and integrity of data collaboration. Great Expectations is the leading open source tool for defeating pipeline debt through data testing, documentation, and profiling. Data teams all over the world use Great Expectations to boost the confidence, integrity, and speed of their work.

Website: https://greatexpectations.io
Industry: Software Development
Company size: 11-50 employees
Headquarters: Remote
Type: Privately Held
Founded: 2017
Specialties: data science, data engineering, data pipelines, pipeline debt, data quality, data monitoring, and MLOps

Locations

Primary

Remote, US

Updates

Great Expectations

3,848 followers
1d
"7...fix your DQ issues." Good advice Benjamin Rogojan!

Benjamin Rogojan

Data, Automation And Analytics Consulting | Reach Out For Data Infra Consults
5d

9 pieces of advice after 8+ years of helping companies set up their data infra:
1. Start simple: the fewer tools and pipelines you have, the less you have to maintain.
2. Focus on building process maturity along with data infrastructure maturity.
3. Get buy-in early from the business.
4. Spend time understanding the business, their problems, operations, etc.
5. Don’t get distracted by ad hoc requests (figure out what you should do, what you shouldn’t, and why).
6. Always try to fix data issues at the source.
7. Being $1 off today means you could be off thousands tomorrow; fix your DQ issues.
8. Set up a peer review process for your analytics and data science projects.
9. Communicate, communicate, communicate with your stakeholders, the business, etc., whether you have good or bad news.
What lessons have you learned?

Great Expectations

2d
Glad to be part of your stack, Zach Wilson!

Zach Wilson

Founder @ DataExpert.io | YouTube: Data with Zach | ADHD | contact: [email protected]
4d

My favorite stack to build a data analytics product:
- Apache Spark (for processing)
- Amazon S3 (for storage)
- Apache Iceberg (for metadata)
- Apache Airflow (for scheduling)
- Apache Superset (for visualization)
- Great Expectations (for data quality)
#dataengineering

Great Expectations

5d
📣 The new GX Expectations Gallery is here! 🎉 You asked, we listened! We’ve been collecting feedback and are thrilled to announce a new Expectations Gallery. 💡Learn more about the improved search and navigation, sample data and pass/fail case examples, and other updates on the GX blog: https://hubs.li/Q02CKSK50 👀 Visit the new Expectations Gallery and enjoy a better way to build your validations. https://hubs.li/Q02CKKgq0

The new GX Expectations Gallery

Great Expectations

1w
If you love the advantages of GX OSS highlighted in this post, be sure to check out GX Cloud for even more benefits! https://lnkd.in/gDMbuj43
Karl C.

Data Engineering | Data Warehousing | Analytics Engineering
3w Edited

DAY 12 - Project: Automated data quality testing using Great Expectations and BigQuery. Data engineering is all about delivering high-quality data to the right people at the right time. Data quality metrics can be grouped into business dimensions (metric accuracy, etc.) and technical dimensions (no missing data, no duplicates, etc.). Read more about data quality in the introduction to this project: https://lnkd.in/dsWA-FEe Once we understand how critical data quality can be, we will implement Great Expectations as one of the powerful tools for automating data quality testing. Follow along here: https://lnkd.in/d8ARch2w Side note: I also added a section about `dbt test`; take a look and compare it with Great Expectations to see how they differ. Next time, I will also cover Open Metadata, another exciting tool. Stay tuned! #DataEngineeringIn30Days

Great Expectations

1w
📣 Our June community meetup is tomorrow! We've got a full schedule of demos and updates to share. Join us! When: Tuesday, June 18 at 9am PT / 12pm ET / 4pm UT Register: Visit https://lnkd.in/eFYSs5Cc to sign up. See you there! PS: We've adjusted the cadence of our meetups—the next one after tomorrow's will be August 20.

Great Expectations Community Events

addevent.com

Great Expectations

2w
#dataquality is 🔑
Chad Sanderson

CEO @ Gable.ai (Shift Left Data Platform)
3w

"Data Quality is our largest barrier to AI adoption" - said by one of the world's most sophisticated technology companies.

While the hype cycle of AI is still on an exponentially upward trend (for now, anyway) many teams are starting to run into a stark reality: The data required to power the models executives want either A.) does not exist or B.) does exist but isn't trustworthy.

Andrew Ng, founder of the AI Fund once stated that the purpose of MLOps is to ensure the availability of high-quality data throughout the ML project lifecycle.

I will go one step further than Andrew. I do not believe that Data Quality is a problem that can be solved by data scientists or AI teams. Data Quality is a problem that MUST be solved by data producers. If quality does not exist at the source, then every action you might take is reactive and remedial in nature. Due to the variety of changes to data being virtually limitless, it is impossible to write a test before every possible change without understanding how that change impacts downstream systems.

Unfortunately, it will usually take a major incident before most executives acknowledge the potential risks of not having preventative / proactive data quality solutions in place, with explicitly defined ownership at the source.

Downstream teams must do the hard work of communicating to data producers the outcomes on AI and other data products when incidents occur. This is the only way engineering organizations will take this problem seriously.

Good luck!

Great Expectations

2w
Understanding your data’s quality is about finding the right answers, right? Nope. Understanding the quality of your data isn’t the process of finding the right answer; it’s the process of ruling out all the wrong answers. This is actually good news, because now we have a framing that actually empowers us in developing confidence in our data. Learn more ➡️https://hubs.ly/Q02y_VBd0 #dataquality #dataengineering #dataengineer #dataanalyst #datascience

Why data quality is actually really difficult

greatexpectations.io

Great Expectations

3w
Thanks for the shoutout, Fabiana Clemente!

Fabiana Clemente

Data quality for data science | Data-Centric AI | Data Preparation & Synthetic data
1mo

Here's a bit of a rant... #DataQuality often gets overlooked because it's not glamorous or exciting, and it lacks a dedicated champion. But ignoring it could jeopardize your entire #AI strategy. Despite being crucial to the success of AI, data quality and the development tools that ensure it have not received the attention they deserve. The #datacentricAI movement seems to be starting to change that, but #LLMs and #foundationalmodels took over the hype. In my opinion, #dataquality should take the narrative back, and here's why:
1. Poor-quality data results in inaccurate models, biased outcomes, and unreliable AI systems.
2. Neglecting data quality tools in favor of model development and productization has hindered innovation in the field. While progress is being made in improving data quality tooling, it’s not keeping pace with the rapid development of new AI models.
3. Investing in data quality is essential for the future of AI, regardless of the models or architectures used. In an AI-dominated world, the key differentiator is your data!
As someone active in the data community and #opensource, I believe that we should invest in robust data management infrastructure, promote data literacy among AI practitioners, and encourage collaboration between data engineers and data scientists. Strong data governance frameworks are also essential to ensure data quality, privacy, and security. A big shoutout to the companies that are giving visibility to the importance of #dataquality: Cleanlab, YData, and Great Expectations! What tooling do you use? Tag others that have been relevant to your journey into #ai and #data.

Great Expectations

3w
☀️ It’s almost June, and that means new GX Cloud workshops! 📆 June dates:
- Tuesday, June 11, 12pm ET (9am PT): “Getting started with GX Cloud and PostgreSQL”
- Tuesday, June 25, 12pm ET (9am PT): “Getting started with GX Cloud and Snowflake”
🖊️ Register here: https://hubs.ly/Q02y_W7M0

GX Cloud workshop signup

pages.greatexpectations.io

Great Expectations

3w
If data quality problems are so universal, why is solving them so hard? Shouldn’t we have the tools by now? Is that even the right question? 🤔 Spoiler: it’s not. We have the tools ⚒️, but we’re not putting them together in the right way. Today on the GX blog, we talk about the role of surprise in data quality 😯 and go down the rabbit hole 🐇 to discover a framing for data quality that actually empowers data teams. 📖Read it here: https://hubs.ly/Q02y_Srz0 #dataquality #dataengineering #dataengineer #dataanalyst #datascience

Why data quality is actually really difficult

Funding

Total rounds: 2
Last round: Series B, Mar 10, 2022 (US$ 40.0M)

Employees at Great Expectations

Hernan Alvarez

VP of Product

Bill Dirks

Max Gazor

General Partner at CRV

Katrina Masiak

Similar pages

Gable

Streamdal

Mage

dbt Labs

Prefect

Tabular (now part of Databricks)

Starburst

Voltron Data

Monte Carlo

Preset
