How does this skill handle error budgets?

The skill provides formulas to calculate error budgets (1 - SLO) and helps define policies that dictate whether a team should focus on new features or reliability based on the remaining budget.
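As a minimal sketch of the `1 - SLO` formula and a budget-based policy, the Python below may help. The function names, the request counts, and the "freeze at zero remaining budget" rule are illustrative assumptions, not the skill's actual implementation.

```python
def error_budget(slo_target: float) -> float:
    """Fraction of requests allowed to fail: 1 - SLO."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return 1.0 - slo_target


def budget_remaining(slo_target: float, total: int, failed: int) -> float:
    """Share of the error budget still unspent.

    1.0 means untouched, 0.0 means fully spent, negative means overspent.
    """
    if total <= 0:
        return 1.0  # no traffic, so nothing has been spent
    allowed_failures = error_budget(slo_target) * total
    return 1.0 - failed / allowed_failures


def policy(remaining: float, freeze_threshold: float = 0.0) -> str:
    """Hypothetical policy: stop feature work once the budget is exhausted."""
    if remaining > freeze_threshold:
        return "ship features"
    return "focus on reliability"


if __name__ == "__main__":
    # Assumed example: 99.9% SLO, 1,000,000 requests, 250 failures.
    remaining = budget_remaining(0.999, 1_000_000, 250)
    print(f"budget remaining: {remaining:.0%} -> {policy(remaining)}")
```

With a 99.9% SLO, one million requests allow roughly 1,000 failures, so 250 failures leave about three quarters of the budget.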

What are multi-window burn rate alerts?

These are alerting rules that evaluate the error budget burn rate over a short window and a long window together. Requiring both to exceed a threshold catches sudden outages as well as slow degradations while reducing false positives.
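The two-window idea can be sketched in Python as follows. The function names are hypothetical. The 14.4 threshold is the value commonly quoted for a 1-hour/5-minute window pair on a 30-day SLO; it is an assumption here, not something this page specifies.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How fast the budget is being spent; 1.0 means exactly on budget."""
    return error_ratio / (1.0 - slo_target)


def should_alert(
    long_window_error_ratio: float,
    short_window_error_ratio: float,
    slo_target: float,
    threshold: float = 14.4,  # assumed threshold, see the note above
) -> bool:
    """Fire only when BOTH windows burn faster than the threshold.

    The long window shows the problem is significant; the short window
    shows it is still happening, which suppresses stale or brief alerts.
    """
    return (
        burn_rate(long_window_error_ratio, slo_target) > threshold
        and burn_rate(short_window_error_ratio, slo_target) > threshold
    )


if __name__ == "__main__":
    slo = 0.999
    print(should_alert(0.02, 0.02, slo))    # ongoing outage
    print(should_alert(0.02, 0.001, slo))   # already recovered
    print(should_alert(0.001, 0.05, slo))   # brief blip only
```

Only the first case alerts: in the second the short window has recovered, and in the third the long window shows the spike was too brief to matter.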

Does it support Prometheus and Grafana?

Yes, it includes specific PromQL recording rules, alert definitions for Prometheus, and structural guidance for creating effective Grafana dashboards.

What is the difference between SLI and SLO in this skill?

SLIs (Service Level Indicators) are the specific metrics used to measure performance, such as request success rate. SLOs (Service Level Objectives) are the target values or ranges for those metrics, such as achieving a 99.9% success rate.
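To make the distinction concrete, here is a small Python sketch in which the SLI is a measured value and the SLO is the target it is compared against. The function names and request counts are made up for illustration.

```python
def availability_sli(successful: int, total: int) -> float:
    """SLI: the measured request success rate."""
    if total == 0:
        return 1.0  # assumption: no traffic counts as fully available
    return successful / total


def meets_slo(sli: float, slo_target: float) -> bool:
    """SLO: the target the measured SLI is compared against."""
    return sli >= slo_target


if __name__ == "__main__":
    slo_target = 0.999  # 99.9% success rate objective
    sli = availability_sli(successful=99_950, total=100_000)
    print(f"SLI={sli:.4f}, SLO={slo_target}, met={meets_slo(sli, slo_target)}")
```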

Service Level Objective (SLO) Framework

Author: amurata

Analytics and Monitoring

Implements measurable reliability targets using SLIs, SLOs, and error budgets to balance service performance with innovation speed.

This skill provides a framework for defining and implementing Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets within a Site Reliability Engineering (SRE) context. It guides users through establishing reliability targets, creating Prometheus recording and alert rules, and designing Grafana dashboards to visualize service health. By weighing the cost of downtime against development velocity, it helps engineering teams decide from data when to ship features and when to invest in reliability, so that user satisfaction is protected without giving up agility.

Key Features

01. Design standardized Grafana dashboards for reliability visualization

02. 3 GitHub stars

03. Define availability, latency, and durability SLIs with PromQL templates

04. Calculate and manage error budgets to guide deployment frequency

05. Generate Prometheus recording rules for automated SLO compliance tracking

06. Implement multi-window burn rate alerts to minimize monitoring noise

Use Cases

01. Establishing reliability benchmarks and targets for new microservices

02. Automating SRE alerting based on error budget consumption rates

03. Managing the trade-off between rapid feature delivery and system stability

Install with 🐟 Skill.Fish

npx skillfish add amurata/cc-tools slo-implementation

For use in Claude.ai and ChatGPT
