Can I use this with Prometheus and Grafana?

Yes, the skill includes specific recording rules, alerting logic, and dashboard structures optimized for Prometheus and Grafana environments.

What is an error budget burn rate?

It is a metric that measures how quickly your reliability budget is being consumed, allowing for proactive alerting before a Service Level Objective is officially violated.
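As a rough illustration of the idea, burn rate can be computed as the observed error ratio divided by the error ratio the SLO allows. The sketch below is not taken from the skill itself. The function names are hypothetical, and the 14.4 threshold is a commonly cited paging value for a 30-day SLO, assumed here rather than confirmed by the source.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Return how many times faster than 'exactly on budget' errors occur."""
    budget = 1.0 - slo_target  # allowed error ratio, e.g. 0.001 for 99.9%
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


def should_page(long_window_ratio: float, short_window_ratio: float,
                slo_target: float, threshold: float = 14.4) -> bool:
    """Alert only when both the long and the short window exceed the threshold.

    The threshold default is an assumption, not a value from the skill.
    """
    return (burn_rate(long_window_ratio, slo_target) > threshold
            and burn_rate(short_window_ratio, slo_target) > threshold)


if __name__ == "__main__":
    # A 0.2% error ratio against a 99.9% SLO burns budget about twice as fast
    # as the SLO allows.
    print(round(burn_rate(0.002, 0.999), 3))
    print(should_page(0.02, 0.02, 0.999))
```

A burn rate of 1 means the budget would be used up exactly at the end of the SLO window. Requiring both windows to exceed the threshold is one way to avoid paging on a spike that has already recovered.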

Does this skill help with latency monitoring?

Absolutely. It includes specific patterns for Latency SLIs (e.g., p95 thresholds) to ensure you are measuring performance from the user's perspective.
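To make the latency SLI idea concrete, here is a minimal sketch that works on a plain list of request latencies instead of Prometheus histograms. The function names, the sample data, and the 300 ms threshold are all hypothetical. The percentile uses the nearest-rank method, which is only one of several conventions.

```python
import math


def latency_sli(latencies_ms: list[float], threshold_ms: float) -> float:
    """Fraction of requests that completed within threshold_ms."""
    if not latencies_ms:
        raise ValueError("no requests observed")
    fast = sum(1 for value in latencies_ms if value <= threshold_ms)
    return fast / len(latencies_ms)


def percentile(latencies_ms: list[float], pct: int) -> float:
    """Nearest-rank percentile of the observed latencies."""
    if not latencies_ms:
        raise ValueError("no requests observed")
    ordered = sorted(latencies_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


if __name__ == "__main__":
    samples = [100.0] * 19 + [900.0]  # made-up data: one slow request in 20
    print(latency_sli(samples, 300.0))  # share of requests under 300 ms
    print(percentile(samples, 95))      # p95 latency in milliseconds
```

The first function expresses the SLI as "good events over total events", which maps directly onto an SLO such as "95% of requests finish within 300 ms". The second reports the percentile itself, which is the form a dashboard threshold usually takes.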

How does this skill help with SRE practices?

It provides standardized templates, PromQL queries, and configuration patterns to define SLIs/SLOs and manage error budgets effectively within your development workflow.

SLO & Reliability Implementation

Name: SLO & Reliability Implementation
Author: HermeticOrmus


Analytics & Monitoring

Defines and implements measurable service level objectives and error budgets to optimize system reliability and engineering velocity.

This skill provides a comprehensive framework for Site Reliability Engineering (SRE) practices, specifically focusing on Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Error Budgets. It helps developers and SREs establish clear reliability targets using Prometheus metrics, create automated alerting based on error budget burn rates, and design observability dashboards. By balancing reliability requirements against innovation goals, it ensures teams maintain high performance while making data-driven decisions about feature deployment and infrastructure stability.
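The relationship between an SLO target and its error budget can be sketched with a little arithmetic. This is an illustrative example only, not code from the skill. The function names and the 30-day default window are assumptions.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of full unavailability the SLO tolerates over the window."""
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes


def budget_remaining(slo_target: float, total_requests: int,
                     failed_requests: int) -> float:
    """Fraction of the request-based error budget still unspent.

    A negative result means the budget is overspent.
    """
    allowed_failures = (1.0 - slo_target) * total_requests
    if allowed_failures <= 0:
        raise ValueError("SLO leaves no error budget to spend")
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    # A 99.9% target over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))
    # 250 failures out of 1,000,000 requests uses about a quarter of the budget.
    print(round(budget_remaining(0.999, 1_000_000, 250), 2))
```

A team might use the remaining fraction as the input to an error budget policy, for example slowing feature rollouts once it drops below an agreed level.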

Key Features

01 0 GitHub stars

02 Automated SLI/SLO definition and hierarchy mapping

03 Multi-window error budget burn rate calculations

04 Standardized error budget policies and review workflows

05 Pre-configured Prometheus recording and alerting rules

06 Structured Grafana dashboard templates for reliability tracking

Use Cases

01 Establishing internal reliability targets for production microservices

02 Implementing SRE practices within existing Prometheus/Grafana stacks

03 Balancing feature velocity with system stability using data-driven error budgets

Install with 🐟 Skill.Fish

npx skillfish add hermeticormus/alqvimia-contador slo-implementation

For use in Claude.ai and ChatGPT
