Custom LLM-as-Judge patterns for domain-specific quality metrics
Quantitative scoring for goal success, faithfulness, and coherence
13 built-in evaluators for correctness, safety, and tool accuracy
Continuous production monitoring with CloudWatch alerting
Pre-production on-demand and batch testing capabilities