Overview
This skill provides a structured framework for managing LLM prompt iterations: developers can version-control templates, run deterministic A/B tests, and monitor production performance across variants. By capturing standardized metadata with every call (template hashes, input variables, and version IDs), it enables data-driven prompt engineering. Built-in patterns cover regression detection, performance analysis (latency, cost, and success rates), and safe gradual rollouts, so prompt updates reliably improve agent behavior rather than introducing unforeseen issues.
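A minimal sketch of two of the patterns named above, under stated assumptions: standardized metadata capture (hashing the template so silent edits are detectable) and deterministic A/B bucketing for gradual rollouts. All names here (`PromptMetadata`, `capture_metadata`, `assign_variant`, the version IDs) are illustrative, not a fixed API of this skill.

```python
# Illustrative sketch only; names and version IDs are hypothetical.
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class PromptMetadata:
    """Standardized record attached to every LLM call for later analysis."""
    version_id: str      # e.g. a git tag or semantic version of the template
    template_hash: str   # content hash; detects silent template edits
    variables: dict      # values substituted into the template
    timestamp: str       # UTC time of the call


def capture_metadata(version_id: str, template: str,
                     variables: dict[str, Any]) -> PromptMetadata:
    # Hash the raw template text so any edit, even whitespace, changes the hash.
    digest = hashlib.sha256(template.encode("utf-8")).hexdigest()[:12]
    return PromptMetadata(version_id, digest, variables,
                          datetime.now(timezone.utc).isoformat())


def assign_variant(user_id: str, experiment: str, rollout_pct: int) -> str:
    # Deterministic bucketing: the same user always lands in the same bucket
    # for a given experiment, so A/B results are reproducible and a rollout
    # can be widened by raising rollout_pct without reshuffling users.
    bucket = int(hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest(), 16) % 100
    return "treatment" if bucket < rollout_pct else "control"


# Usage: assign a variant, pick the matching template version, log the metadata.
variant = assign_variant(user_id="u-42", experiment="summary-v2", rollout_pct=10)
template = "Summarize the ticket in one sentence:\n{ticket}"
meta = capture_metadata("v2.1.0" if variant == "treatment" else "v2.0.3",
                        template, {"ticket": "Login fails on Safari."})
print(variant, json.dumps(meta.__dict__, indent=2))
```

Because the bucket is a pure function of the experiment name and user ID, production logs tagged with this metadata can be joined back to variants offline, which is what makes regression detection and per-variant latency/cost analysis tractable.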