Implements robust data validation and quality assurance using Great Expectations, dbt tests, and formal data contracts.
This skill provides standardized production patterns for establishing data integrity throughout the modern data stack. It enables developers to implement sophisticated validation suites using Great Expectations, build comprehensive dbt test layers, and define machine-readable data contracts between teams. By focusing on the six dimensions of data quality—completeness, uniqueness, validity, accuracy, consistency, and timeliness—this skill ensures reliable data pipelines and prevents downstream failures through automated CI/CD integration and real-time monitoring patterns.
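As a concrete illustration of mapping quality dimensions to checks, here is a minimal stdlib-Python sketch (the function and field names are ours, not from Great Expectations or any standard): each dimension becomes a small predicate over a batch of records.

```python
from datetime import datetime, timedelta, timezone

# Sample batch of records; field names are illustrative only.
records = [
    {"order_id": 1, "status": "shipped", "amount": 42.0,
     "updated_at": datetime.now(timezone.utc)},
    {"order_id": 2, "status": "pending", "amount": 17.5,
     "updated_at": datetime.now(timezone.utc) - timedelta(hours=2)},
]

def completeness(rows, field):
    # Completeness: no missing values in the field.
    return all(r.get(field) is not None for r in rows)

def uniqueness(rows, field):
    # Uniqueness: no duplicate values for the field.
    vals = [r[field] for r in rows]
    return len(vals) == len(set(vals))

def validity(rows, field, allowed):
    # Validity: every value belongs to the allowed domain.
    return all(r[field] in allowed for r in rows)

def timeliness(rows, field, max_age):
    # Timeliness: every record falls inside the freshness window.
    cutoff = datetime.now(timezone.utc) - max_age
    return all(r[field] >= cutoff for r in rows)

results = {
    "completeness": completeness(records, "amount"),
    "uniqueness": uniqueness(records, "order_id"),
    "validity": validity(records, "status", {"pending", "shipped", "delivered"}),
    "timeliness": timeliness(records, "updated_at", timedelta(days=1)),
}
print(results)  # all four checks pass for this sample batch
```

Accuracy and consistency, the remaining two dimensions, usually require a reference source to compare against and so are omitted from this self-contained sketch.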
Key Features
- Standardized data contract definitions using YAML and SodaCL
- Great Expectations suite and checkpoint implementation
- Automated data quality pipeline orchestration
- Advanced dbt test patterns and custom generic tests
- Validation mapping across six key data quality dimensions
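A data contract of the kind listed above is, at its core, a machine-readable schema plus an enforcement step. The YAML/SodaCL shape can be approximated in plain Python as a sketch; everything here (the `CONTRACT` dict layout, `enforce`) is an illustrative assumption, not a library API:

```python
# Illustrative machine-readable contract: field types plus constraints.
CONTRACT = {
    "dataset": "orders",
    "fields": {
        "order_id": {"type": int, "required": True, "unique": True},
        "amount":   {"type": float, "required": True, "min": 0.0},
    },
}

def enforce(contract, rows):
    """Return a list of human-readable violations (empty means the batch passes)."""
    violations = []
    for name, spec in contract["fields"].items():
        values = [r.get(name) for r in rows]
        if spec.get("required") and any(v is None for v in values):
            violations.append(f"{name}: missing values")
        typed = [v for v in values if v is not None]
        if any(not isinstance(v, spec["type"]) for v in typed):
            violations.append(f"{name}: wrong type")
        if spec.get("unique") and len(typed) != len(set(typed)):
            violations.append(f"{name}: duplicates")
        if "min" in spec and any(v < spec["min"] for v in typed):
            violations.append(f"{name}: below minimum {spec['min']}")
    return violations

good = [{"order_id": 1, "amount": 10.0}, {"order_id": 2, "amount": 5.5}]
bad  = [{"order_id": 1, "amount": -3.0}, {"order_id": 1, "amount": None}]
print(enforce(CONTRACT, good))  # → []
print(enforce(CONTRACT, bad))   # duplicates, missing values, below-minimum
```

In practice the contract would live in version-controlled YAML and the violations would feed an alerting or CI step rather than `print`.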
Use Cases
- Enforcing schema and business rules in production data warehouses
- Establishing formal data SLAs and contracts between engineering and business teams
- Preventing broken data from reaching downstream BI tools or ML models
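The "prevent broken data from reaching downstream consumers" use case typically reduces to a hard gate in the pipeline: run the checks, and fail the job, blocking promotion to BI or ML consumers, when any check fails. A minimal sketch of that pattern, with all names assumed rather than taken from any orchestrator:

```python
class DataQualityError(Exception):
    """Raised to halt the pipeline before bad data reaches downstream consumers."""

def run_gate(checks):
    # Each check is a (name, callable) pair; the callable returns True on success.
    failures = [name for name, fn in checks if not fn()]
    if failures:
        raise DataQualityError(f"blocked promotion; failed checks: {failures}")
    return "promoted"

batch = [{"id": 1, "value": 3.2}, {"id": 2, "value": 7.7}]
checks = [
    ("not_empty",  lambda: len(batch) > 0),
    ("ids_unique", lambda: len({r["id"] for r in batch}) == len(batch)),
    ("values_pos", lambda: all(r["value"] > 0 for r in batch)),
]
print(run_gate(checks))  # → promoted
```

In a real deployment the gate would wrap a Great Expectations checkpoint or a `dbt test` invocation, and the raised error would fail the CI/CD stage that promotes the data.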