Implements comprehensive Service Level Objective (SLO) frameworks and error budget practices to optimize system reliability and performance.
This skill equips Claude with the expertise of a Site Reliability Engineer (SRE) to help teams define, implement, and maintain rigorous reliability standards. It facilitates the creation of meaningful Service Level Indicators (SLIs), the establishment of realistic SLOs, and the development of error budget-based engineering practices. Whether you are building monitoring dashboards or aligning technical reliability targets with business objectives, this skill provides the domain-specific guidance needed to balance rapid feature velocity with system stability.
Key Features
01 Implement error budget-based engineering and policy-making practices
02 Establish meaningful SLIs (Service Level Indicators) for availability and latency
03 Create standardized monitoring dashboards and automated alerting workflows
04 Design comprehensive SLO frameworks tailored to specific service architectures
05 Align technical reliability targets with overarching business priorities
39 GitHub stars
Use Cases
01 Establishing error budget policies to determine when to freeze feature releases
02 Standardizing observability and reliability practices across multiple engineering squads
03 Defining SLIs and SLOs for microservices to measure production health
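To make the error budget use case concrete, here is a minimal Python sketch of how a request-based availability SLO translates into an error budget and a release-freeze decision. It is not taken from the skill itself: the `ErrorBudget` class, the `should_freeze_releases` helper, the `freeze_threshold` parameter, and the sample numbers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    """Hypothetical request-based error budget for one SLO window."""
    slo_target: float      # e.g. 0.999 for a 99.9% availability SLO
    total_requests: int    # requests observed in the window
    failed_requests: int   # requests that violated the SLI

    @property
    def allowed_failures(self) -> float:
        # The budget is the share of requests the SLO permits to fail.
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def remaining_fraction(self) -> float:
        # 1.0 means the budget is untouched; 0.0 or below means it is spent.
        if self.allowed_failures == 0:
            return 0.0
        return 1.0 - self.failed_requests / self.allowed_failures


def should_freeze_releases(budget: ErrorBudget, freeze_threshold: float = 0.0) -> bool:
    """Assumed policy: freeze once the remaining budget falls to the threshold."""
    return budget.remaining_fraction <= freeze_threshold


if __name__ == "__main__":
    healthy = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=250)
    exhausted = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=1_200)
    for name, budget in (("healthy", healthy), ("exhausted", exhausted)):
        print(name, round(budget.remaining_fraction, 3), should_freeze_releases(budget))
```

In this sketch a 99.9% target over one million requests permits roughly 1,000 failures, so 250 failures leave about three quarters of the budget, while 1,200 failures overspend it and trigger the freeze. A real policy would typically also consider burn rate and the length of the SLO window, which this example leaves out.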