About
This skill helps teams apply Site Reliability Engineering (SRE) practices by providing structured frameworks for defining and managing system reliability. It guides users through identifying critical user journeys, selecting meaningful Service Level Indicators (SLIs) such as availability and latency, and establishing Error Budgets that balance innovation velocity against operational stability. It also provides implementation patterns for multi-window burn rate alerting and dashboard design, so that reliability is measured and managed in terms of actual user experience.
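
To make the error budget and multi-window burn rate ideas above concrete, here is a minimal Python sketch. It is not part of the skill itself: the names `BurnRateRule`, `error_budget`, `burn_rate`, and `should_alert` are hypothetical, and the 14.4 threshold over a 1-hour/5-minute window pair is an illustrative value commonly cited for a 30-day SLO period, not a setting taken from this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BurnRateRule:
    """One multi-window alert rule (hypothetical structure for illustration)."""
    long_window_hours: float   # e.g. 1 hour
    short_window_hours: float  # e.g. 5 minutes, expressed in hours
    threshold: float           # burn rate at or above which the rule fires


def error_budget(slo_target: float) -> float:
    """Fraction of requests allowed to fail, e.g. about 0.001 for a 99.9% SLO."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return 1.0 - slo_target


def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How fast the budget is being consumed; 1.0 means exactly on budget."""
    return error_ratio / error_budget(slo_target)


def should_alert(long_error_ratio: float, short_error_ratio: float,
                 slo_target: float, rule: BurnRateRule) -> bool:
    """Fire only when BOTH windows exceed the threshold.

    The long window shows the burn is significant; the short window
    shows it is still happening, so the alert resets quickly after recovery.
    """
    return (burn_rate(long_error_ratio, slo_target) >= rule.threshold
            and burn_rate(short_error_ratio, slo_target) >= rule.threshold)


if __name__ == "__main__":
    # Assumed example: 99.9% availability SLO, 1h/5m window pair.
    rule = BurnRateRule(long_window_hours=1, short_window_hours=5 / 60,
                        threshold=14.4)
    # 2% errors over the last hour and 3% over the last five minutes.
    print(should_alert(0.02, 0.03, slo_target=0.999, rule=rule))
```

In practice the two error ratios would come from a monitoring system's queries over the respective windows; the sketch takes them as plain numbers so the alerting logic stays visible.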