About
This skill provides a framework for establishing Site Reliability Engineering (SRE) practices around measurable reliability targets. It helps developers and SREs implement service level indicators (SLIs), service level objectives (SLOs), and error budgets using PromQL and Prometheus recording rules. To balance feature velocity against service stability, it supports data-driven decisions about when to freeze feature work and prioritize reliability, based on current error budget burn rates and multi-window alerting.
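
The arithmetic behind error budgets and burn rates can be illustrated with a minimal Python sketch. It is not part of the skill itself: the function names, the 99.9% SLO target, and the 14.4 burn-rate threshold are assumptions chosen for illustration, and in practice these ratios would come from PromQL queries over Prometheus recording rules rather than hard-coded numbers.

```python
def error_budget(slo_target: float) -> float:
    """Fraction of requests allowed to fail under the SLO (e.g. 0.001 for 99.9%)."""
    return 1.0 - slo_target


def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How fast the budget is being spent relative to an 'exactly on budget' pace.

    A burn rate of 1.0 consumes the whole budget exactly over the SLO period;
    higher values exhaust it proportionally sooner.
    """
    return error_ratio / error_budget(slo_target)


def should_alert(
    long_window_ratio: float,
    short_window_ratio: float,
    slo_target: float,
    threshold: float,
) -> bool:
    """Multi-window check: fire only if BOTH windows exceed the threshold.

    The long window shows the burn is significant; the short window shows it
    is still happening, so the alert clears soon after recovery.
    """
    return (
        burn_rate(long_window_ratio, slo_target) >= threshold
        and burn_rate(short_window_ratio, slo_target) >= threshold
    )


if __name__ == "__main__":
    SLO = 0.999        # hypothetical 99.9% availability target
    THRESHOLD = 14.4   # illustrative fast-burn threshold

    # Hypothetical observed error ratios over a long and a short window.
    print(should_alert(0.02, 0.03, SLO, THRESHOLD))   # both windows burning fast
    print(should_alert(0.02, 0.005, SLO, THRESHOLD))  # short window has recovered
```

Requiring both windows to exceed the threshold is the design choice that keeps alerts from firing on short spikes or lingering after an incident has ended.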