01 RAGAS integration for specialized RAG performance metrics
02 Dataset-based evaluation runs for regression testing
03 Automated LLM-as-Judge patterns for scalable quality monitoring
04 Manual trace and generation scoring for human evaluation
05 1,449 GitHub stars
06 User feedback integration with boolean and categorical ratings