Great Expectations

Software Development

Revolutionizing the speed and integrity of data collaboration

About us

The mission of Great Expectations is to revolutionize the speed and integrity of data collaboration. Great Expectations is the leading open source tool for defeating pipeline debt through data testing, documentation, and profiling. Data teams all over the world use Great Expectations to boost the confidence, integrity, and speed of their work.

Website: https://greatexpectations.io
Industry: Software Development
Company size: 11-50 employees
Headquarters: Remote
Type: Privately Held
Founded: 2017
Specialties: data science, data engineering, data pipelines, pipeline debt, data quality, data monitoring, and MLOps

Updates

  • Great Expectations

    3,848 followers

    "7...fix your DQ issues." Good advice Benjamin Rogojan!

    Benjamin Rogojan

    Data, Automation And Analytics Consulting | Reach Out For Data Infra Consults

    9 pieces of advice after 8+ years of helping companies set up their data infra:

    1. Start simple: the fewer tools and pipelines you have, the less you have to maintain
    2. Focus on building process maturity along with data infrastructure maturity
    3. Get buy-in early from the business
    4. Spend time understanding the business, their problems, operations, etc.
    5. Don’t get distracted by ad-hoc requests (figure out what you should do and what you shouldn’t, and why)
    6. Always try to fix data issues at the source
    7. Being $1 off today means you could be off by thousands tomorrow; fix your DQ issues
    8. Set up a peer review process for your analytics and data science projects
    9. Communicate, communicate, communicate with your stakeholders, the business, etc., whether you have good or bad news

    What lessons have you learned?

  • Great Expectations

    Glad to be part of your stack, Zach Wilson!

    Zach Wilson

    Founder @ DataExpert.io | YouTube: Data with Zach | ADHD

    My favorite stack to build a data analytics product:

    - Apache Spark (for processing)
    - Amazon S3 (for storage)
    - Apache Iceberg (for metadata)
    - Apache Airflow (for scheduling)
    - Apache Superset (for visualization)
    - Great Expectations (for data quality)

    #dataengineering

  • Great Expectations

    📣 The new GX Expectations Gallery is here! 🎉 You asked, we listened! We’ve been collecting feedback and are thrilled to announce a new Expectations Gallery. 💡Learn more about the improved search and navigation, sample data and pass/fail case examples, and other updates on the GX blog: https://hubs.li/Q02CKSK50 👀 Visit the new Expectations Gallery and enjoy a better way to build your validations. https://hubs.li/Q02CKKgq0

    The new GX Expectations Gallery

  • Great Expectations

    If you love the advantages of GX OSS highlighted in this post, be sure to check out GX Cloud for even more benefits! https://lnkd.in/gDMbuj43

    Karl C.

    Data Engineering | Data Warehousing | Analytics Engineering

    DAY 12 - Project: Automated data quality testing using Great Expectations and BigQuery

    Data engineering is all about delivering high-quality data to the right people at the right time. Data quality metrics can be grouped into business dimensions (metric accuracy, etc.) and technical dimensions (no missing data, no duplicates, etc.). Read more about data quality in the introduction to this project: https://lnkd.in/dsWA-FEe

    Once we understand how critical data quality can be, we will implement Great Expectations as one of the powerful tools for automating data quality testing. Enjoy and follow along: https://lnkd.in/d8ARch2w

    Side note: I also added a section about `dbt test`; take a look and compare it with Great Expectations to see how they differ. Next time, I will also cover Open Metadata as one of the exciting tools, so stay tuned! #DataEngineeringIn30Days

  • Great Expectations

    #dataquality is 🔑

    Chad Sanderson

    CEO @ Gable.ai (Shift Left Data Platform)

    "Data Quality is our largest barrier to AI adoption" - said by one of the world's most sophisticated technology companies. While the hype cycle of AI is still on an exponentially upward trend (for now, anyway) many teams are starting to run into a stark reality: The data required to power the models executives want either A.) does not exist or B.) does exist but isn't trustworthy. Andrew Ng, founder of the AI Fund once stated that the purpose of MLOps is to ensure the availability of high-quality data throughout the ML project lifecycle. I will go one step further than Andrew. I do not believe that Data Quality is a problem that can be solved by data scientists or AI teams. Data Quality is a problem that MUST be solved by data producers. If quality does not exist at the source, then every action you might take is reactive and remedial in nature. Due to the variety of changes to data being virtually limitless, it is impossible to write a test before every possible change without understanding how that change impacts downstream systems. Unfortunately, it will usually take a major incident before most executives acknowledge the potential risks of not having preventative / proactive data quality solutions in place, with explicitly defined ownership at the source. Downstream teams must do the hard work of communicating to data producers the outcomes on AI and other data products when incidents occur. This is the only way engineering organizations will take this problem seriously. Good luck!

  • Great Expectations

    Understanding your data’s quality is about finding the right answers, right? Nope. Understanding the quality of your data isn’t the process of finding the right answer; it’s the process of ruling out all the wrong answers. This is actually good news, because now we have a framing that actually empowers us in developing confidence in our data. Learn more ➡️https://hubs.ly/Q02y_VBd0 #dataquality #dataengineering #dataengineer #dataanalyst #datascience

    Why data quality is actually really difficult

    greatexpectations.io

  • Great Expectations

    Thanks for the shoutout, Fabiana Clemente!

    Fabiana Clemente

    Data quality for data science | Data-Centric AI | Data Preparation & Synthetic data

    Here's a bit of a rant... #DataQuality often gets overlooked because it's not glamorous or exciting, and it lacks a dedicated champion. But ignoring it could jeopardize your entire #AI strategy.

    Despite being crucial to the success of AI, data quality and the development tools that ensure it have not received the attention they deserve. The #datacentricAI movement seemed to be starting to change that, but #LLMs and #foundationalmodels took over the hype. In my opinion, #dataquality should take the narrative back, and here's why:

    1. Poor-quality data results in inaccurate models, biased outcomes, and unreliable AI systems.
    2. Neglecting data quality tools in favor of model development and productization has hindered innovation in the field. While progress is being made in improving data quality tooling, it’s not keeping pace with the rapid development of new AI models.
    3. Investing in data quality is essential for the future of AI, regardless of the models or architectures used. In an AI-dominated world, the key differentiator is your data!

    As someone active in the data community and #opensource, I do believe that we should invest in robust data management infrastructure, promote data literacy among AI practitioners, and encourage collaboration between data engineers and data scientists. Strong data governance frameworks are also essential to ensure data quality, privacy, and security.

    A big shoutout to the companies that are giving visibility to the importance of #dataquality: Cleanlab, YData, Great Expectations! What tooling do you use? Tag others that have been relevant for your journey into #ai and #data.

  • Great Expectations

    If data quality problems are so universal, why is solving them so hard? Shouldn’t we have the tools by now? Is that even the right question? 🤔 Spoiler: it’s not. We have the tools ⚒️, but we’re not putting them together in the right way. Today on the GX blog, we talk about the role of surprise in data quality 😯 and go down the rabbit hole 🐇 to discover a framing for data quality that actually empowers data teams. 📖Read it here: https://hubs.ly/Q02y_Srz0 #dataquality #dataengineering #dataengineer #dataanalyst #datascience

    Why data quality is actually really difficult

Funding

Great Expectations 2 total rounds

Last round: Series B, US$ 40.0M