Acerca de
The agent-test skill provides a robust framework for validating LLM agent behavior within the Treasure Data ecosystem. It allows developers to define complex, multi-round interaction scenarios in YAML and utilize a specialized judge agent to evaluate responses against specific, measurable criteria. This tool is essential for regression testing, refining agent prompts, and ensuring consistent performance across diverse user inputs without the need for manual oversight, significantly speeding up the agent development lifecycle.