About
This skill provides a framework for Site Reliability Engineering (SRE) practices that moves teams from basic monitoring to goal-oriented observability. It helps define Service Level Indicators (SLIs) for availability, latency, and durability, and set realistic Service Level Objectives (SLOs) based on user expectations. It also calculates error budgets and generates Prometheus alerting rules, including multi-window burn-rate alerts, so that reliability is measurable, actionable, and aligned with business objectives. This helps teams decide when to prioritize feature work and when to prioritize stability improvements.
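
To make the error-budget and burn-rate terms concrete, here is a minimal Python sketch of the arithmetic behind them. It is not the skill's own implementation: the `Slo` type, the function names, and the default threshold of 14.4 are assumptions chosen for illustration. The 14.4 figure corresponds to spending 2% of a 30-day budget in one hour, a commonly cited paging threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """Hypothetical helper type: an availability SLO over a rolling window."""
    target: float      # e.g. 0.999 means 99.9% of requests should succeed
    window_days: int   # e.g. 30


def error_budget_fraction(slo: Slo) -> float:
    """Fraction of requests that may fail within the window without breaching the SLO."""
    return 1.0 - slo.target


def allowed_downtime_minutes(slo: Slo) -> float:
    """Error budget expressed as minutes of full outage over the window."""
    return error_budget_fraction(slo) * slo.window_days * 24 * 60


def burn_rate(observed_error_ratio: float, slo: Slo) -> float:
    """How fast the budget is being spent; 1.0 uses it up exactly at the window's end."""
    return observed_error_ratio / error_budget_fraction(slo)


def should_page(long_window_ratio: float, short_window_ratio: float,
                slo: Slo, threshold: float = 14.4) -> bool:
    """Multi-window check: alert only if both the long and the short window burn fast.

    The short window confirms the problem is still ongoing, which lets the
    alert stop firing soon after the error rate recovers.
    """
    return (burn_rate(long_window_ratio, slo) >= threshold
            and burn_rate(short_window_ratio, slo) >= threshold)


if __name__ == "__main__":
    slo = Slo(target=0.999, window_days=30)
    print(f"budget: {error_budget_fraction(slo):.4%} of requests")
    print(f"downtime allowance: {allowed_downtime_minutes(slo):.1f} minutes")
    print(f"burn rate at 1% errors: {burn_rate(0.01, slo):.1f}x")
    print("page:", should_page(long_window_ratio=0.02, short_window_ratio=0.03, slo=slo))
```

In a Prometheus setup, the two error ratios would typically come from rate expressions over a long and a short range (for example 1h and 5m), and the comparison in `should_page` would live in the alerting rule's expression rather than in application code.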