This skill equips Claude with Site Reliability Engineering (SRE) expertise, focusing on the practical application of SLOs, SLIs, and error budgets. It provides standardized templates for reliability documentation and ready-to-use implementation patterns for critical resilience mechanisms such as circuit breakers, exponential backoff, and bulkheads. Whether you are architecting a new distributed system or hardening an existing one, this skill helps you keep your infrastructure observable, scalable, and designed to tolerate failure.
Key Features
01 Graceful degradation strategies for high-traffic services
02 72 GitHub stars
03 SLO and SLI definition templates with Prometheus query examples
04 Error budget tracking and burn rate alerting logic
05 Resilience patterns including Circuit Breakers and Bulkheads
06 Capacity planning and k6 load testing configurations
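
To make the error budget and burn rate item above concrete, here is a minimal sketch of the arithmetic involved. It is not taken from the skill itself: the function names are hypothetical, and the default paging threshold of 14.4 is an assumption borrowed from the common multi-window convention (a burn rate that would consume 2% of a 30-day budget in one hour).

```python
def error_budget(slo_target: float) -> float:
    """Fraction of requests allowed to fail under the SLO (0.999 -> about 0.001)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return 1.0 - slo_target


def burn_rate(failed: int, total: int, slo_target: float) -> float:
    """Observed error ratio divided by the error budget.

    A value of 1.0 means the budget would be spent exactly at the end of
    the SLO window; higher values mean it runs out sooner.
    """
    if total <= 0:
        return 0.0  # no traffic in the window, so nothing was burned
    return (failed / total) / error_budget(slo_target)


def should_page(short_window_rate: float, long_window_rate: float,
                threshold: float = 14.4) -> bool:
    """Page only when both windows exceed the threshold (assumed default: 14.4).

    Requiring both windows filters out brief spikes that have already ended.
    """
    return short_window_rate >= threshold and long_window_rate >= threshold
```

For example, 30 failed requests out of 10,000 against a 99.9% SLO gives a burn rate of roughly 3, which under these assumed thresholds would not page on its own.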