About
This skill provides a framework for applying Site Reliability Engineering (SRE) practices to your services. It guides you through defining user-centric SLIs, setting realistic SLO targets, and calculating error budgets to balance innovation velocity with system stability. It includes support for Prometheus recording rules, multi-window burn rate alerts, and Grafana dashboard structures, helping teams move from reactive firefighting to proactive, data-driven reliability management.
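
To make the error budget and multi-window burn rate ideas concrete, here is a minimal Python sketch of the arithmetic. It is an illustration only, not this skill's implementation: the function names (`error_budget`, `burn_rate`, `should_page`) are hypothetical, and the default threshold of 14.4 is an assumption taken from the commonly cited fast-burn setting (2% of a 30-day budget consumed in one hour), not a value stated here.

```python
def error_budget(slo_target: float) -> float:
    """Return the fraction of requests allowed to fail under the SLO.

    Example: an SLO target of 0.999 leaves a budget of roughly 0.001.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return 1.0 - slo_target


def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Return how fast the budget is being consumed.

    A value of 1.0 means errors arrive at exactly the rate the SLO allows;
    higher values exhaust the budget before the SLO period ends.
    """
    return error_ratio / error_budget(slo_target)


def should_page(
    long_window_ratio: float,
    short_window_ratio: float,
    slo_target: float,
    threshold: float = 14.4,  # assumed fast-burn threshold, see lead-in
) -> bool:
    """Multi-window check: alert only if BOTH windows exceed the threshold.

    The long window shows the burn is significant; the short window shows
    it is still happening, which avoids paging on an already-recovered spike.
    """
    return (
        burn_rate(long_window_ratio, slo_target) > threshold
        and burn_rate(short_window_ratio, slo_target) > threshold
    )


if __name__ == "__main__":
    slo = 0.999  # hypothetical availability target
    print(f"error budget: {error_budget(slo):.4f}")
    print(f"burn rate at 2% errors: {burn_rate(0.02, slo):.1f}")
    print(f"page? {should_page(0.02, 0.02, slo)}")
```

In a Prometheus setup, the two error ratios would typically come from recording rules evaluated over a long and a short window (for example one hour and five minutes); the window lengths are likewise an assumption here.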