01 Automated generation of disaster recovery runbooks based on findings
02 Controlled failure injection across network, infrastructure, and application layers
03 Safety-first blast radius management with automatic rollback triggers
04 99 GitHub stars
05 Automated steady-state monitoring and deviation detection
06 Multi-agent coordination for experiment execution and production monitoring
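
To illustrate how steady-state deviation detection can drive an automatic rollback trigger, here is a minimal sketch in Python. It is not the tool's actual implementation: `SteadyState`, `run_experiment`, and the `inject`/`rollback` callbacks are hypothetical names, and a simple relative tolerance around a baseline metric is assumed as the deviation rule.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class SteadyState:
    """Hypothetical steady-state definition: a baseline metric value
    and the relative deviation from it that is still acceptable."""
    baseline: float
    tolerance: float  # 0.2 means up to 20% away from baseline is allowed

    def deviates(self, observed: float) -> bool:
        # True when the observed value is outside the allowed band.
        return abs(observed - self.baseline) > self.tolerance * abs(self.baseline)


def run_experiment(
    steady_state: SteadyState,
    inject: Callable[[], None],
    rollback: Callable[[], None],
    samples: Iterable[float],
) -> bool:
    """Inject a failure, watch the metric samples, and roll back.

    Returns True if the steady state held for every sample, False if a
    deviation was detected and the experiment was aborted early.
    """
    inject()
    try:
        for observed in samples:
            if steady_state.deviates(observed):
                return False  # abort early; rollback runs in `finally`
        return True
    finally:
        # Rollback always runs exactly once, on success, abort, or error.
        rollback()
```

In this sketch the rollback sits in a `finally` block so that the injected failure is reverted even if reading a metric sample raises an exception; a real system would likely also bound the experiment duration and limit which hosts or services `inject` may touch.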