Does it include specific monitoring tool configurations?

Yes, it includes Prometheus recording rules and multi-window burn rate alerting configurations to detect reliability issues quickly while minimizing false positives.
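To make the multi-window idea concrete, here is a minimal Python sketch of the decision logic behind a burn rate alert. It is not the skill's actual Prometheus configuration, which is not shown on this page. The names `WindowErrorRatios`, `burn_rate`, and `should_page` are hypothetical, and the 14.4 threshold is an assumption: it is a commonly cited value for a 1-hour window against a 30-day SLO, where it corresponds to spending about 2% of the budget in that hour.

```python
from dataclasses import dataclass


@dataclass
class WindowErrorRatios:
    """Observed error ratios (failed / total requests) for two windows."""
    long_window: float   # e.g. measured over the last 1 hour (assumed)
    short_window: float  # e.g. measured over the last 5 minutes (assumed)


def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How fast the error budget is being consumed.

    A burn rate of 1.0 means the budget would be used up exactly at the
    end of the SLO period; higher values exhaust it sooner.
    """
    budget = 1.0 - slo_target
    if budget <= 0.0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


def should_page(ratios: WindowErrorRatios, slo_target: float,
                threshold: float = 14.4) -> bool:
    """Fire only when BOTH windows exceed the threshold.

    The long window shows the problem is significant; the short window
    shows it is still happening, which cuts down on stale alerts.
    """
    return (burn_rate(ratios.long_window, slo_target) > threshold
            and burn_rate(ratios.short_window, slo_target) > threshold)


if __name__ == "__main__":
    # Ongoing incident: both windows burn fast, so the alert fires.
    print(should_page(WindowErrorRatios(0.02, 0.03), slo_target=0.999))
    # Recovered incident: the short window is healthy, so it stays quiet.
    print(should_page(WindowErrorRatios(0.02, 0.001), slo_target=0.999))
```

Requiring both windows to agree is what keeps false positives down: a brief spike does not move the long window enough, and an incident that has already recovered no longer shows up in the short one.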

Can I customize the SLO targets?

Absolutely. The skill provides examples for various availability levels (99% to 99.99%) and guidance on how to adjust targets based on business requirements and user expectations.
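As a rough illustration of what those availability levels mean in practice, the sketch below converts a target into the downtime it permits. The function name `allowed_downtime_minutes` and the 30-day window are assumptions chosen for the example, not values taken from the skill.

```python
def allowed_downtime_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of full downtime an availability target permits per window."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo_target)


if __name__ == "__main__":
    for target in (0.99, 0.999, 0.9999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target:.2%} -> {minutes:.2f} minutes per 30 days")
    # 99.00% -> 432.00 minutes per 30 days
    # 99.90% -> 43.20 minutes per 30 days
    # 99.99% -> 4.32 minutes per 30 days
```

Each additional nine cuts the permitted downtime by a factor of ten, which is why the target should follow from user expectations rather than default to the strictest option.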

How does this skill help with error budgets?

It provides standardized formulas and policies to calculate remaining error budgets, helping teams decide when to prioritize reliability fixes over new feature development.
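The skill's own formulas are not reproduced on this page, but a request-based error budget is usually computed along the lines of the sketch below. The names `error_budget_remaining` and `release_policy`, and the 25% policy threshold, are hypothetical and only meant to show how a budget figure can drive a prioritization decision.

```python
def error_budget_remaining(slo_target: float, total_events: int,
                           bad_events: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")
    if total_events == 0:
        return 1.0  # no traffic yet, so nothing has been spent
    allowed_bad = (1.0 - slo_target) * total_events
    return 1.0 - bad_events / allowed_bad


def release_policy(remaining: float, caution_threshold: float = 0.25) -> str:
    """Map remaining budget to an action. Thresholds here are assumptions."""
    if remaining <= 0.0:
        return "freeze features, prioritize reliability fixes"
    if remaining < caution_threshold:
        return "slow down releases, review risky changes"
    return "ship features as planned"


if __name__ == "__main__":
    # 99.9% SLO over 1,000,000 requests allows about 1,000 failures.
    remaining = error_budget_remaining(0.999, 1_000_000, 250)
    print(f"{remaining:.0%} of budget left: {release_policy(remaining)}")
```

Returning a negative value when the budget is overspent, instead of clamping at zero, keeps the size of the overrun visible to whoever applies the policy.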

What is the difference between an SLI and an SLO?

An SLI (Service Level Indicator) is a specific measurement of service performance, like latency or error rate, while an SLO (Service Level Objective) is the target value or range for that measurement.
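The distinction can be shown in a few lines of Python: the SLI is a number computed from observed data, and the SLO is the target that number is compared against. The function names, the 300 ms threshold, and the sample latencies below are all illustrative assumptions.

```python
from typing import Sequence


def latency_sli(latencies_ms: Sequence[float], threshold_ms: float) -> float:
    """SLI: fraction of requests served within the latency threshold."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    good = sum(1 for latency in latencies_ms if latency <= threshold_ms)
    return good / len(latencies_ms)


def meets_slo(sli: float, slo_target: float) -> bool:
    """SLO check: is the measured SLI at or above the target?"""
    return sli >= slo_target


if __name__ == "__main__":
    samples = [120, 180, 250, 310, 90, 200, 150, 400, 220, 130]
    sli = latency_sli(samples, threshold_ms=300)  # the measurement
    slo = 0.95                                    # the objective
    print(f"SLI = {sli:.2f}, SLO = {slo:.2f}, met = {meets_slo(sli, slo)}")
    # SLI = 0.80, SLO = 0.95, met = False
```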

SLO Implementation Framework

Name: SLO Implementation Framework
Author: kivilaid


Analytics & Monitoring

Implements Site Reliability Engineering (SRE) practices by defining Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets.

This skill provides a framework for establishing and monitoring service reliability targets. It helps developers and SREs define meaningful SLIs for availability and latency, set realistic SLO targets, and calculate error budgets to balance innovation with stability. It ships with ready-to-use Prometheus recording rules, multi-window alerting strategies, and Grafana dashboard templates, so teams do not have to build this reliability tooling from scratch. Teams can then base decisions about feature velocity and infrastructure investment on measured service performance.

Key Features

01 Multi-window burn rate alert implementation to reduce noise

02 Grafana dashboard structures for reliability visualization

03 Standardized templates for SLI and SLO definitions

04 Pre-configured Prometheus recording and alerting rules

05 6 GitHub stars

06 Error budget calculation formulas and policy frameworks

Use Cases

01 Establishing internal reliability targets for production microservices

02 Implementing SRE-driven error budget policies to manage deployment risks

03 Reducing alert fatigue through multi-window burn rate monitoring


Install with 🐟 Skill.Fish

npx skillfish add kivilaid/plugin-marketplace slo-implementation

For use in Claude.ai and ChatGPT
