Overview
This skill provides a framework for implementing Site Reliability Engineering (SRE) practices around measurable reliability targets. It guides users through defining service level indicators (SLIs) for availability and latency, setting service level objective (SLO) targets based on business needs, and calculating error budgets to balance feature work against stability. It includes support for Prometheus recording rules and multi-window alerting strategies, so teams can move from basic uptime monitoring to user-centric reliability management that ties technical performance to business goals.
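To make the error-budget and multi-window alerting ideas concrete, here is a minimal Python sketch. It is not part of the skill itself: the function names, the example figures, and the default burn-rate threshold of 14.4 (a commonly cited value for spending 2% of a 30-day budget in one hour) are assumptions chosen for illustration.

```python
def error_budget(slo_target: float, total_events: int) -> float:
    """Return how many bad events the SLO tolerates over the window.

    slo_target is a fraction such as 0.999 (hypothetical example value).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return (1.0 - slo_target) * total_events


def budget_remaining(slo_target: float, total_events: int, bad_events: int) -> float:
    """Return the unspent fraction of the error budget (negative if overspent)."""
    return 1.0 - bad_events / error_budget(slo_target, total_events)


def burn_rate(slo_target: float, error_ratio: float) -> float:
    """Return how fast the budget is being spent.

    A burn rate of 1.0 spends exactly the whole budget over the SLO window.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return error_ratio / (1.0 - slo_target)


def should_page(
    slo_target: float,
    long_window_error_ratio: float,
    short_window_error_ratio: float,
    threshold: float = 14.4,  # assumed default, not taken from the skill
) -> bool:
    """Multi-window check: alert only if BOTH windows burn above the threshold.

    The long window shows the problem is significant; the short window
    shows it is still happening, which lets the alert clear quickly.
    """
    return (
        burn_rate(slo_target, long_window_error_ratio) >= threshold
        and burn_rate(slo_target, short_window_error_ratio) >= threshold
    )


if __name__ == "__main__":
    # Hypothetical service: 99.9% availability SLO, 1,000,000 requests, 250 failures.
    print(error_budget(0.999, 1_000_000))
    print(budget_remaining(0.999, 1_000_000, 250))
    print(should_page(0.999, long_window_error_ratio=0.02, short_window_error_ratio=0.03))
```

In a Prometheus setup, the two error ratios would typically come from recording rules evaluated over a long and a short range (for example one hour and five minutes); the windows and threshold should be tuned to the service's own SLO period.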