Automate job creation using standardized model configurations such as Claude Opus 4.5
Query and filter jobs by environment ID, status, or problem set
Retrieve real-time job results and pass rates for specific evaluation runs
Access detailed execution transcripts and container logs for deep-dive debugging
Python-native implementation to ensure stable authentication and header handling
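
The filtering and pass-rate features listed above could look roughly like the sketch below. It is illustrative only: the `Job` record, its field names, the status strings, and the helpers `filter_jobs` and `pass_rate` are all hypothetical, since the source does not describe the actual job schema or client API. The sample data is made up for the demonstration and involves no network calls.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Job:
    # Hypothetical job record; the real schema may use different fields.
    job_id: str
    environment_id: str
    status: str           # assumed values: "running", "completed", "failed"
    problems_passed: int
    problems_total: int


def filter_jobs(
    jobs: Iterable[Job],
    environment_id: Optional[str] = None,
    status: Optional[str] = None,
) -> List[Job]:
    """Keep jobs matching every criterion that was supplied (None = no filter)."""
    return [
        job
        for job in jobs
        if (environment_id is None or job.environment_id == environment_id)
        and (status is None or job.status == status)
    ]


def pass_rate(job: Job) -> Optional[float]:
    """Fraction of problems passed, or None when no problems have been scored."""
    if job.problems_total == 0:
        return None
    return job.problems_passed / job.problems_total


# Made-up sample records standing in for an API response.
jobs = [
    Job("j1", "env-a", "completed", problems_passed=3, problems_total=4),
    Job("j2", "env-b", "completed", problems_passed=1, problems_total=2),
    Job("j3", "env-a", "running", problems_passed=0, problems_total=0),
]

for job in filter_jobs(jobs, environment_id="env-a", status="completed"):
    print(job.job_id, pass_rate(job))
```

Returning `None` for a job with no scored problems avoids a division by zero and keeps "not yet evaluated" distinct from a pass rate of 0.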