Traces column-level data lineage through SQL, Kafka, Spark, and ORM codebases to map data flow and transformations.
The Lineage skill analyzes source code to map column-level data flows across heterogeneous stacks. It recursively traces data paths through SQL views, Kafka topics, Spark jobs, and JDBC writes without requiring live database connections or external tools. It produces structured reports with a confidence rating for each hop, along with visual SVG diagrams, so teams can perform impact analysis, debug data quality issues, and document data provenance directly within their development workflow.
Key Features
01 Deep recursive analysis of upstream sources and downstream consumers
02 Automated generation of visual SVG and ASCII lineage diagrams
03 Static code analysis approach with zero external dependencies or database access
04 Cross-stack tracing across SQL, Kafka, Spark, JDBC, and ORM layers
05 7 GitHub stars
06 Confidence-aware mapping with explicit ratings for every data hop
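
The recursive, confidence-rated tracing described above can be pictured with a small sketch. This is not the skill's actual implementation or output format: the `Hop` record, the `trace` and `weakest` helpers, the column names, and the three-level rating scale are all hypothetical, chosen only to illustrate how hops found by static analysis might be walked upstream or downstream.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hop:
    source: str      # upstream column
    target: str      # downstream column
    via: str         # mechanism that moves the data
    confidence: str  # assumed scale: "high", "medium", "low"

# Hypothetical hops, as a static analyzer might extract them from code.
HOPS = [
    Hop("orders_db.orders.amount", "kafka:orders.amount", "Kafka producer", "high"),
    Hop("kafka:orders.amount", "spark:daily_totals.total", "Spark job", "medium"),
    Hop("spark:daily_totals.total", "warehouse.daily_revenue.total", "JDBC write", "high"),
    Hop("warehouse.daily_revenue.total", "reporting.v_revenue.total", "SQL view", "high"),
]

def trace(column, hops, direction="upstream"):
    """Return every hop reachable from `column`, depth-first, in visit order."""
    upstream = direction == "upstream"
    index = {}
    for hop in hops:
        # Index by target to walk upstream, by source to walk downstream.
        index.setdefault(hop.target if upstream else hop.source, []).append(hop)

    seen, path = set(), []

    def visit(col):
        for hop in index.get(col, []):
            if hop in seen:  # guard against cycles in the lineage graph
                continue
            seen.add(hop)
            path.append(hop)
            visit(hop.source if upstream else hop.target)

    visit(column)
    return path

def weakest(path):
    """Lowest confidence rating along a path, or None for an empty path."""
    rank = {"low": 0, "medium": 1, "high": 2}
    return min((h.confidence for h in path), key=rank.__getitem__, default=None)

if __name__ == "__main__":
    for hop in trace("reporting.v_revenue.total", HOPS):
        print(f"{hop.target} <- {hop.source}  [{hop.via}, {hop.confidence}]")
```

Calling `trace` with `direction="downstream"` on a source column lists every consumer that a schema change could affect, and `weakest` gives a single rating for the whole path.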
Use Cases
01 Performing impact analysis before modifying a database schema or column definition
02 Debugging data discrepancies by tracing reporting columns back to their original source systems
03 Documenting data flow for compliance and auditing in complex microservice architectures