About
The Langfuse Agent Evaluator is a specialized skill that brings rigorous observability and testing to AI agent development. It automates a multi-phase workflow: running dataset experiments with configured judges, performing deep-dive root cause analysis on failed traces, and comparing performance across development cycles. By identifying specific failure patterns and symptoms, it produces structured fix recommendations without auto-applying unverified changes, giving developers high-quality documentation and a clear path to improvement via Linear issues or local reports.
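The experiment phase described above can be sketched as a loop that runs each dataset item through the agent, scores the output with a judge, and collects failures for root cause analysis. This is a minimal illustrative sketch: the `agent`, `judge`, and `run_experiment` names are assumptions for illustration, not the skill's or Langfuse's actual API.

```python
from dataclasses import dataclass

@dataclass
class Item:
    input: str
    expected: str

def agent(query: str) -> str:
    # Stand-in for the agent under test (hypothetical).
    return query.upper()

def judge(output: str, expected: str) -> float:
    # Hypothetical exact-match judge; real judges are often LLM-based
    # and return graded scores rather than 0/1.
    return 1.0 if output == expected else 0.0

def run_experiment(dataset: list[Item]):
    # Run every item, score it, and separate out failures
    # so they can be queued for deep-dive analysis.
    results = []
    for item in dataset:
        out = agent(item.input)
        results.append({"input": item.input, "output": out,
                        "score": judge(out, item.expected)})
    failures = [r for r in results if r["score"] < 1.0]
    return results, failures

dataset = [Item("hello", "HELLO"), Item("world", "world")]
results, failures = run_experiment(dataset)
print(len(failures))  # prints 1: one item flagged for root cause analysis
```

Comparing two development cycles then reduces to running the same dataset against each agent version and diffing the per-item scores.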