Introduction
This skill helps developers build more reliable distributed systems by applying core chaos engineering principles within their existing workflow. It provides a structured framework for defining steady-state metrics, automating failure injection at the network and infrastructure layers, and validating recovery paths with AI-driven agents. Through proactive testing and automated runbook generation, teams can identify architectural weaknesses and improve disaster recovery procedures before those weaknesses affect production users.
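The experiment loop described above (define a steady state, inject a failure, validate recovery) can be illustrated with a minimal Python sketch. This is not the skill's actual API: `SteadyState`, `run_experiment`, and the in-memory `service` dictionary are hypothetical names invented for illustration, and the "failure" is simulated by changing a value in memory rather than touching a real network or host.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SteadyState:
    """Hypothetical steady-state hypothesis: a metric probe plus its tolerated range."""
    name: str
    probe: Callable[[], float]  # returns the current metric value
    low: float
    high: float

    def holds(self) -> bool:
        return self.low <= self.probe() <= self.high


def run_experiment(steady: SteadyState,
                   inject: Callable[[], None],
                   rollback: Callable[[], None]) -> dict:
    """Check steady state, inject a failure, roll back, then re-check."""
    result = {"before": steady.holds(), "during": None, "after": None}
    if not result["before"]:
        return result  # abort: never inject faults into an already unhealthy system
    inject()
    try:
        result["during"] = steady.holds()
    finally:
        rollback()  # always undo the fault, even if the probe raises
    result["after"] = steady.holds()
    return result


# Toy in-memory "service" standing in for a real system (assumption for illustration).
service = {"error_rate": 0.01}
steady = SteadyState("error_rate", lambda: service["error_rate"], 0.0, 0.05)

report = run_experiment(
    steady,
    inject=lambda: service.update(error_rate=0.40),    # simulated outage
    rollback=lambda: service.update(error_rate=0.01),  # simulated recovery
)
print(report)  # {'before': True, 'during': False, 'after': True}
```

In this sketch, `during` being `False` shows that the injected fault pushed the metric outside its tolerated range, and `after` being `True` shows that the system returned to steady state once the fault was rolled back. A real experiment would replace the lambdas with probes against monitoring data and with actual fault-injection tooling.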