Introduction
This skill provides a framework for setting measurable reliability targets based on SRE practice. It helps developers and DevOps teams define Service Level Indicators (SLIs), set Service Level Objectives (SLOs), and calculate error budgets so they can balance release velocity against system stability. It includes ready-to-use Prometheus recording rules, multi-window burn-rate alerting configurations, and Grafana dashboard structures for turning raw metrics into reliability reporting and automated alerting policies.
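
To make the error budget and burn rate terms concrete, here is a minimal Python sketch of the arithmetic behind them. It is illustrative only and not part of the skill's shipped configuration: the function names, the 30-day window, and the 14.4 paging threshold (a commonly cited value for a 1-hour long window paired with a 5-minute short window) are assumptions chosen for the example.

```python
# Illustrative sketch: all names and defaults below are assumptions,
# not identifiers taken from the skill itself.

def error_budget(slo_target: float) -> float:
    """Fraction of requests allowed to fail, e.g. about 0.001 for a 99.9% SLO."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return 1.0 - slo_target


def allowed_downtime_minutes(slo_target: float, window_days: int = 30) -> float:
    """Error budget expressed as minutes of full outage over the SLO window."""
    return error_budget(slo_target) * window_days * 24 * 60


def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How fast the budget is being spent; 1.0 means exactly on budget."""
    return error_ratio / error_budget(slo_target)


def should_page(long_window_error_ratio: float,
                short_window_error_ratio: float,
                slo_target: float,
                threshold: float = 14.4) -> bool:
    """Multi-window check: fire only if BOTH windows exceed the threshold.

    The long window shows the burn is significant; the short window
    shows it is still happening, so the alert resets soon after recovery.
    """
    return (burn_rate(long_window_error_ratio, slo_target) >= threshold
            and burn_rate(short_window_error_ratio, slo_target) >= threshold)


if __name__ == "__main__":
    slo = 0.999  # hypothetical 99.9% availability target
    print(f"Allowed downtime per 30 days: {allowed_downtime_minutes(slo):.1f} min")
    print(f"Burn rate at 2% errors: {burn_rate(0.02, slo):.1f}x")
    print("Page?", should_page(0.02, 0.02, slo))
```

In a Prometheus setup, the two error ratios would typically come from recording rules evaluated over the long and short windows, and the comparison would live in an alerting rule rather than in application code.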