Implements Eval-Driven Development (EDD) principles for AI-assisted coding
Generates comprehensive evaluation reports with version-controlled history
Scores results with three grader types: code-based, model-based, and human
Supports automated Capability and Regression evaluation types
Tracks reliability metrics including pass@k (at least one of k attempts passes) and pass^k (all k attempts pass)
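As an illustration of the two reliability metrics named above, here is a minimal Python sketch. It assumes each eval task was run n times and c of those runs passed, then estimates pass@k (the chance that at least one of k sampled runs passes) and pass^k (the chance that all k sampled runs pass) by counting combinations. The function names `pass_at_k` and `pass_hat_k` are hypothetical and chosen for this example; they are not taken from the tool itself, which may compute these metrics differently.

```python
from math import comb


def _check(n: int, c: int, k: int) -> None:
    # n = total recorded runs, c = runs that passed, k = sample size.
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")


def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k runs drawn from n passes."""
    _check(n, c, k)
    # comb(n - c, k) counts the draws made up entirely of failing runs;
    # math.comb returns 0 when k > n - c, so the result is then 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


def pass_hat_k(n: int, c: int, k: int) -> float:
    """Probability that all k runs drawn from n pass (pass^k)."""
    _check(n, c, k)
    # comb(c, k) counts the draws made up entirely of passing runs.
    return comb(c, k) / comb(n, k)


if __name__ == "__main__":
    # Hypothetical numbers: 10 recorded runs, 5 passed, sample size 2.
    print(f"pass@2 = {pass_at_k(10, 5, 2):.3f}")
    print(f"pass^2 = {pass_hat_k(10, 5, 2):.3f}")
```

With k = 1 the two metrics coincide with the plain pass rate c / n. As k grows, pass@k rises toward 1 while pass^k falls toward 0, which is why pass@k suits capability checks and pass^k suits consistency checks.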