01 3 GitHub stars
02 Analyzes exit codes and raw logs to confirm actual success
03 Validates agent delegation results through independent diff analysis
04 Detects and blocks vague language and premature satisfaction statements
05 Enforces a strict Red-Green cycle for regression testing
06 Mandates fresh command execution for every completion claim
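
As a rough illustration of how the exit-code, vague-language, and fresh-execution checks above might fit together, here is a minimal Python sketch. It is not the tool's actual implementation: the names `Evidence`, `run_fresh`, `find_vague_language`, and `may_claim_success`, and the phrase list in `VAGUE_PATTERNS`, are all hypothetical and chosen only for this example.

```python
from __future__ import annotations

import re
import subprocess
import sys
from dataclasses import dataclass

# Hypothetical phrase list: the source does not say which wording is blocked.
VAGUE_PATTERNS = [
    r"\bshould (?:work|pass|be fine)\b",
    r"\bprobably\b",
    r"\bseems? to\b",
    r"\blooks good\b",
]


@dataclass
class Evidence:
    """Result of one fresh command run: what ran, how it exited, what it printed."""
    command: list[str]
    exit_code: int
    log: str


def run_fresh(command: list[str]) -> Evidence:
    """Run the command now (no cached result) and capture exit code and raw output."""
    proc = subprocess.run(command, capture_output=True, text=True)
    return Evidence(command, proc.returncode, proc.stdout + proc.stderr)


def find_vague_language(claim: str) -> list[str]:
    """Return the vague phrases found in a completion claim, in pattern order."""
    hits = []
    for pattern in VAGUE_PATTERNS:
        match = re.search(pattern, claim, flags=re.IGNORECASE)
        if match:
            hits.append(match.group(0))
    return hits


def may_claim_success(claim: str, evidence: Evidence) -> bool:
    """Allow a claim only if the fresh run exited 0 and the wording is not vague."""
    return evidence.exit_code == 0 and not find_vague_language(claim)


if __name__ == "__main__":
    evidence = run_fresh([sys.executable, "-c", "print('done')"])
    print(may_claim_success("Command exited 0; log attached.", evidence))
```

The design choice in this sketch is that a claim is judged against evidence gathered at the moment of the claim, so an earlier passing run cannot stand in for the current state.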