Comprehensive parsing of inspect-ai .eval logs for multi-task performance tracking.
Intelligent epoch-to-metric mapping using SLURM job name metadata.
Advanced binary classification metrics including Balanced Accuracy and F1 scores.
Automated extraction of training loss and step counts from SLURM output files.
Consolidated Markdown report generation with structured status, training, and evaluation tables.
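
For readers unfamiliar with the binary classification metrics named above, the following is a minimal sketch of how Balanced Accuracy and F1 are conventionally computed from 0/1 labels. It is not taken from this tool's source: the names `BinaryMetrics` and `binary_metrics` are hypothetical, and the choice to return 0.0 when a denominator is empty is an assumption made for the illustration.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    balanced_accuracy: float
    f1: float


def binary_metrics(y_true, y_pred):
    """Compute Balanced Accuracy and F1 for 0/1 labels (1 = positive class).

    Hypothetical helper for illustration; empty denominators yield 0.0.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    recall = tp / (tp + fn) if (tp + fn) else 0.0       # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    # Balanced Accuracy is the mean of recall and specificity.
    balanced_accuracy = (recall + specificity) / 2
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return BinaryMetrics(balanced_accuracy, f1)


if __name__ == "__main__":
    labels = [1, 1, 1, 0, 0, 0, 0, 0]
    preds = [1, 1, 0, 0, 0, 0, 0, 1]
    print(binary_metrics(labels, preds))
```

Balanced Accuracy weights both classes equally, which is why it is often reported alongside F1 when the positive and negative classes are imbalanced.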