Description: Always know what to expect from your data.
Repository: great-expectations/great_expectations on GitHub
Great Expectations is an open-source data quality framework designed to help data teams define, validate, and document their data, ensuring reliability and preventing "data debt." The repository serves as the central hub for this powerful tool, providing the code, documentation, and community resources necessary to implement robust data quality checks across various data pipelines and systems. At its core, Great Expectations allows users to express "Expectations" – test cases for their data – in a declarative, human-readable format.
The fundamental concept revolves around Expectations, which are assertions about data. These can range from simple checks like `expect_column_to_exist` or `expect_column_values_to_be_between` to more complex statistical assertions. A collection of these Expectations forms an "Expectation Suite," which acts as a data contract or a blueprint for the expected state of a dataset. Users can build these suites interactively, using a profiler that infers expectations from existing data, or define them manually. This process helps formalize implicit knowledge about data into explicit, executable tests.
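To make the idea concrete, here is a minimal conceptual sketch in plain Python (not the Great Expectations API itself; the function and dictionary shapes are illustrative): an "expectation" is a named, declarative assertion over a batch of data, and a suite is simply a collection of them.

```python
# Conceptual sketch (illustrative, not the real GX API): each expectation
# is a named, data-driven assertion returning a structured result.

def expect_column_to_exist(batch, column):
    return {"expectation": "expect_column_to_exist",
            "success": column in batch}

def expect_column_values_to_be_between(batch, column, min_value, max_value):
    bad = [v for v in batch.get(column, []) if not (min_value <= v <= max_value)]
    return {"expectation": "expect_column_values_to_be_between",
            "success": not bad,
            "unexpected_values": bad}

# An "Expectation Suite": a data contract expressed as a list of checks.
suite = [
    lambda b: expect_column_to_exist(b, "age"),
    lambda b: expect_column_values_to_be_between(b, "age", 0, 120),
]

batch = {"age": [25, 34, 61], "name": ["Ana", "Ben", "Cy"]}  # columns -> values
results = [check(batch) for check in suite]
print(all(r["success"] for r in results))  # True: this batch meets the contract
```

The key property the real library shares with this toy version is that each check returns structured results (success flag plus details) rather than merely raising, which is what makes downstream documentation and reporting possible.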
Great Expectations connects to a wide array of data sources, including Pandas DataFrames, Spark DataFrames, and various SQL databases, making it highly versatile for different data environments. Once Expectations are defined, "Checkpoints" are used to run these suites against new batches of data. A Checkpoint is a configuration that specifies which Expectation Suite to run against which data asset, and what actions to take based on the validation results. These actions often include saving the validation results and building "Data Docs."
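The Checkpoint pattern can be sketched in a few lines of plain Python (again illustrative names, not the real GX API): bind a suite to a batch, validate, then run configured actions on the results.

```python
# Conceptual sketch (illustrative, not the real GX API): a "Checkpoint"
# runs an expectation suite against a batch and triggers actions
# (store results, rebuild docs, alert) based on the outcome.

def run_checkpoint(suite, batch, actions=()):
    results = [check(batch) for check in suite]
    summary = {"success": all(r["success"] for r in results),
               "results": results}
    for action in actions:          # side effects driven by the validation outcome
        action(summary)
    return summary

# A tiny suite: each expectation is a callable returning a result dict.
suite = [
    lambda b: {"expectation": "row_count_positive", "success": len(b) > 0},
    lambda b: {"expectation": "no_null_ids",
               "success": all(row.get("id") is not None for row in b)},
]

stored = []                          # stand-in for a validation-results store
summary = run_checkpoint(suite, [{"id": 1}, {"id": 2}], actions=[stored.append])
print(summary["success"], len(stored))  # True 1
```

Separating "what to check" (the suite) from "when and what to do about it" (the checkpoint and its actions) is what lets the same data contract be reused across scheduled jobs, ad-hoc runs, and CI.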
"Data Docs" are a cornerstone feature, transforming Expectation Suites and validation results into a beautiful, human-readable website. This site serves as a living data dictionary, documenting the expected structure and quality of data, and providing a historical record of validation outcomes. Data Docs significantly improve communication and collaboration among data engineers, data scientists, and business stakeholders, offering a transparent view into data quality over time. They are invaluable for debugging data issues, onboarding new team members, and maintaining trust in data assets.
The repository showcases Great Expectations' integration capabilities with popular data tools and workflows, including Apache Airflow, dbt, and various CI/CD pipelines. By embedding Great Expectations into existing data pipelines, teams can automatically validate data at critical junctures, catching issues before they propagate downstream. This proactive approach to data quality helps maintain data integrity, reduces rework, and ensures that analytics, machine learning models, and reports are built on reliable foundations. Ultimately, Great Expectations empowers data teams to build more robust, trustworthy, and maintainable data systems.
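The "catch issues before they propagate" pattern is typically implemented as a validation gate: a pipeline step that halts execution when a batch fails its suite. A minimal sketch (illustrative, not tied to a specific Airflow or dbt integration):

```python
# Conceptual sketch of a pipeline "validation gate" (illustrative,
# not a specific Airflow/dbt operator): stop the pipeline when a
# batch fails its expectation suite, so bad data never moves downstream.

class DataValidationError(Exception):
    pass

def validation_gate(suite, batch):
    failed = [r["expectation"]
              for r in (check(batch) for check in suite)
              if not r["success"]]
    if failed:
        raise DataValidationError(f"Failed expectations: {failed}")
    return batch                     # pass the batch downstream unchanged

suite = [lambda b: {"expectation": "non_empty", "success": len(b) > 0}]

try:
    validation_gate(suite, [])       # an empty batch fails the gate
except DataValidationError as err:
    print("pipeline halted:", err)
```

In an orchestrator, raising an exception like this marks the task as failed, which is exactly the behavior that prevents downstream models and reports from consuming bad data.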