About
This skill equips Claude with Site Reliability Engineering (SRE) expertise, focusing on the practical application of SLIs, SLOs, and error budgets. It provides standardized templates for reliability documentation and ready-to-use implementation patterns for resilience mechanisms such as circuit breakers, exponential backoff, and bulkheads. Whether you are architecting a new distributed system or hardening an existing one, this skill helps you keep your infrastructure observable, scalable, and designed for failure.