What is the primary focus of this skill?

The skill focuses on implementing Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to maintain system reliability.

Is this specific to one cloud provider?

No, the reliability frameworks and implementation logic apply across AWS, Google Cloud, Azure, and on-premises environments.

How does this help with feature velocity?

It uses error budgets to provide a data-driven framework for deciding when to prioritize stability over new features or vice versa.
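As a rough illustration of that decision, the sketch below derives an error budget from an SLO target and observed request counts. The class name `ErrorBudget`, the helper `recommend_focus`, and the `freeze_threshold` parameter are hypothetical names chosen for this example and are not taken from the skill itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorBudget:
    """Error budget for one SLO window (all names here are illustrative)."""
    slo_target: float   # e.g. 0.999 means 99.9% of requests must succeed
    total_events: int   # requests observed in the window
    bad_events: int     # requests that violated the SLI

    @property
    def allowed_bad_events(self) -> float:
        # The budget is the share of events the SLO permits to fail.
        return (1.0 - self.slo_target) * self.total_events

    @property
    def remaining_fraction(self) -> float:
        # 1.0 means untouched, 0.0 means spent, negative means overspent.
        allowed = self.allowed_bad_events
        if allowed == 0:
            return 0.0 if self.bad_events else 1.0
        return 1.0 - self.bad_events / allowed


def recommend_focus(budget: ErrorBudget, freeze_threshold: float = 0.0) -> str:
    """Assumed policy: prioritize stability once the budget is exhausted."""
    if budget.remaining_fraction <= freeze_threshold:
        return "stability"
    return "features"


if __name__ == "__main__":
    budget = ErrorBudget(slo_target=0.999, total_events=1_000_000, bad_events=400)
    print(f"remaining budget: {budget.remaining_fraction:.0%}")
    print(f"recommended focus: {recommend_focus(budget)}")
```

Teams often set `freeze_threshold` above zero so that feature work slows down before the budget is fully spent; the right value is a policy choice, not something the arithmetic dictates.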

Can I use this for dashboard configuration?

Yes, the skill provides patterns and best practices for building comprehensive SLO dashboards and automated alerting workflows.
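One alerting pattern commonly paired with SLO dashboards is a multi-window burn-rate check, sketched below. The function names and the default threshold are assumptions for illustration: 14.4 is a value often cited for a 1-hour window against a 30-day SLO, and the skill's own patterns may differ.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How fast the error budget is being consumed.

    A burn rate of 1.0 spends exactly the whole budget over the SLO window;
    higher values exhaust it proportionally sooner.
    """
    budget = 1.0 - slo_target
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


def should_page(
    long_window_error_ratio: float,
    short_window_error_ratio: float,
    slo_target: float,
    threshold: float = 14.4,  # assumed default, tune per service
) -> bool:
    """Page only when both windows burn fast.

    The long window shows the problem is significant; the short window
    shows it is still happening, which reduces alerts on resolved spikes.
    """
    return (
        burn_rate(long_window_error_ratio, slo_target) >= threshold
        and burn_rate(short_window_error_ratio, slo_target) >= threshold
    )


if __name__ == "__main__":
    # Hypothetical readings: 2% errors over 1 hour, 3% over the last 5 minutes.
    print(should_page(0.02, 0.03, slo_target=0.999))
```

The error ratios would normally come from the same telemetry queries that feed the dashboard, so the alert and the chart agree on what counts as a bad event.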

Does this skill require existing telemetry data?

Yes, it is most effective when you have access to service metrics and telemetry to build and validate your reliability targets.

SLO & Reliability Monitoring

Name: SLO & Reliability Monitoring
Author: sickn33

GitHub stars: 31,722
Category: Analytics & Monitoring

Implements comprehensive Service Level Objective (SLO) frameworks and error budget practices to balance system reliability with feature velocity.

This expert-level skill facilitates the implementation of Site Reliability Engineering (SRE) standards by helping teams define Service Level Indicators (SLIs) and Service Level Objectives (SLOs). It guides users through creating meaningful monitoring systems, establishing error budgets, and aligning technical reliability targets with overarching business priorities. By providing domain-specific guidance on observability, it ensures that data-driven decisions govern the balance between infrastructure stability and rapid feature development.
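To make the SLI and SLO terms concrete, here is a minimal sketch that computes two common indicators from request records and compares them with a target. The `Request` type, the choice of "status below 500" as a good event, and the 300 ms latency threshold are assumptions made for this example, not definitions supplied by the skill.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Request:
    """One observed request (hypothetical telemetry record)."""
    status_code: int
    latency_ms: float


def availability_sli(requests: Sequence[Request]) -> float:
    """Fraction of requests that did not fail with a server error (5xx)."""
    if not requests:
        raise ValueError("no requests observed")
    good = sum(1 for r in requests if r.status_code < 500)
    return good / len(requests)


def latency_sli(requests: Sequence[Request], threshold_ms: float) -> float:
    """Fraction of requests served within the latency threshold."""
    if not requests:
        raise ValueError("no requests observed")
    fast = sum(1 for r in requests if r.latency_ms <= threshold_ms)
    return fast / len(requests)


def meets_slo(sli: float, target: float) -> bool:
    """An SLO is a target that the measured SLI must reach or exceed."""
    return sli >= target


if __name__ == "__main__":
    sample = [
        Request(200, 120.0),
        Request(200, 480.0),
        Request(503, 90.0),
        Request(404, 60.0),
    ]
    print("availability SLI:", availability_sli(sample))
    print("latency SLI:", latency_sli(sample, threshold_ms=300.0))
    print("meets 99% availability SLO:", meets_slo(availability_sli(sample), 0.99))
```

Whether a 4xx response counts as a good event is a judgment call that depends on the service; here client errors are treated as good because the service itself behaved correctly.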

Key Features

01 31,722 GitHub stars

02 Establishment of error budget-based engineering practices

03 Alignment of uptime targets with business objectives

04 Definition of meaningful Service Level Indicators (SLIs)

05 Design of reliability monitoring dashboards and alerts

06 Standardization of observability practices across teams

Use Cases

01 Designing stakeholder-ready observability and performance reports

02 Establishing reliability targets for new microservices

03 Integrating error budget tracking into CI/CD workflows
