01. Standardized rubric generation for objective and subjective criteria
02. Mitigation strategies for position, length, and self-enhancement biases
03. Automated LLM-as-a-judge pipeline implementation
04. Pairwise comparison protocols with consistency checking
05. 39 GitHub stars
06. Structured metric selection framework for diverse evaluation tasks
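
As a rough illustration of the pairwise comparison and position-bias items above, the sketch below runs each comparison twice with the response order swapped and accepts a winner only when both runs agree. It is a minimal sketch, not this project's implementation: the `Judge` signature, the function `pairwise_with_swap`, and the two toy judges are all hypothetical names introduced here, and a real pipeline would replace the toy judges with a call to an LLM.

```python
from typing import Callable, Dict, Literal

Verdict = Literal["A", "B", "tie"]

# Hypothetical judge interface (an assumption, not an API from the source):
# takes (prompt, first_shown, second_shown) and returns "A" if the
# first-shown response wins, "B" if the second-shown wins, or "tie".
Judge = Callable[[str, str, str], Verdict]


def pairwise_with_swap(judge: Judge, prompt: str, resp_a: str, resp_b: str) -> Dict[str, object]:
    """Compare two responses in both orders and check that the verdicts agree."""
    first = judge(prompt, resp_a, resp_b)    # resp_a is shown first
    swapped = judge(prompt, resp_b, resp_a)  # resp_b is shown first

    # Translate the swapped-order verdict back to the original labels.
    flip = {"A": "B", "B": "A", "tie": "tie"}
    second = flip[swapped]

    consistent = first == second
    # If the two orderings disagree, fall back to a tie rather than
    # trusting either run.
    winner = first if consistent else "tie"
    return {"winner": winner, "consistent": consistent, "verdicts": (first, second)}


# Toy stand-ins for an LLM judge, used only to exercise the protocol.
def always_first(prompt: str, first: str, second: str) -> Verdict:
    """A fully position-biased judge: always prefers whatever is shown first."""
    return "A"


def prefer_longer(prompt: str, first: str, second: str) -> Verdict:
    """An order-independent judge that prefers the longer response."""
    if len(first) == len(second):
        return "tie"
    return "A" if len(first) > len(second) else "B"


biased = pairwise_with_swap(always_first, "q", "long answer here", "short")
stable = pairwise_with_swap(prefer_longer, "q", "long answer here", "short")
```

Here `biased` is flagged as inconsistent because the position-biased judge flips its preference when the order changes, while `stable` yields the same winner in both orders. Note that the swap only exposes position bias: `prefer_longer` passes the check even though it is purely length-biased, so length and self-enhancement biases need separate controls.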