01. Semantic duplicate detection with configurable similarity thresholds
02. 69 GitHub stars
03. Automated schema validation for complex document and query structures
04. Coverage analysis to identify gaps in domains and content types
05. Difficulty distribution auditing for balanced AI evaluation benchmarks
06. Referential integrity checks between queries and document sections
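
To make two of the checks listed above more concrete, here is a minimal sketch of threshold-based duplicate detection and a query-to-section referential integrity check. It is an illustration under stated assumptions, not the project's implementation: the function names (`find_duplicates`, `find_dangling_references`), the data shapes, and the default threshold of 0.8 are all hypothetical. Bag-of-words cosine similarity also stands in for a real semantic (embedding-based) similarity measure, so that the example runs with only the standard library.

```python
import math
import re
from collections import Counter
from itertools import combinations


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two strings.

    Stand-in for a semantic similarity; a real tool would likely compare embeddings.
    """
    va = Counter(re.findall(r"\w+", a.lower()))
    vb = Counter(re.findall(r"\w+", b.lower()))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_duplicates(queries: dict[str, str], threshold: float = 0.8) -> list[tuple[str, str]]:
    """Return pairs of query IDs whose similarity meets or exceeds the threshold."""
    return [
        (id_a, id_b)
        for (id_a, text_a), (id_b, text_b) in combinations(queries.items(), 2)
        if cosine_similarity(text_a, text_b) >= threshold
    ]


def find_dangling_references(
    query_refs: dict[str, list[str]], section_ids: set[str]
) -> dict[str, list[str]]:
    """Map each query ID to the section IDs it cites that do not exist."""
    dangling = {}
    for query_id, refs in query_refs.items():
        missing = [ref for ref in refs if ref not in section_ids]
        if missing:
            dangling[query_id] = missing
    return dangling


# Hypothetical sample data, for illustration only.
queries = {
    "q1": "how do I reset my password",
    "q2": "how do I reset my password please",
    "q3": "what is the refund policy",
}
print(find_duplicates(queries, threshold=0.8))
print(find_dangling_references({"q1": ["s1", "s9"], "q2": ["s2"]}, {"s1", "s2"}))
```

Raising the threshold makes the duplicate check stricter (fewer pairs flagged), while lowering it catches looser paraphrases at the cost of more false positives. The pairwise comparison is quadratic in the number of queries, which is acceptable for a sketch but would probably need indexing or blocking on a large dataset.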