Who is this skill designed for?

It is built for software engineers, SREs, and DevOps professionals who manage production systems and participate in on-call rotations.

How does this skill help during a SEV-1 incident?

It gives structured guidance for acknowledging the incident, notifying stakeholders, and applying mitigation steps such as rollbacks or scaling, with the aim of reducing Mean Time to Recovery (MTTR).

What infrastructure tools does this cover?

The provided templates include common bash and kubectl commands for Kubernetes environments, but can be adapted for any cloud provider or stack.
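As a rough illustration of the kind of kubectl mitigation steps mentioned here, the sketch below wraps a rollback and a scale-out in small shell functions. The function names (`run`, `rollback`, `scale_out`), the `DRY_RUN` switch, the 120-second timeout, and the `checkout-api` deployment in the usage note are assumptions made for this example, not part of the skill's templates. The kubectl subcommands themselves (`rollout undo`, `rollout status`, `scale`) are standard.

```shell
# Hypothetical mitigation helpers; names and defaults are illustrative assumptions.
# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Roll a deployment back to its previous revision, then wait for the rollout to settle.
rollback() {
  local deploy="$1" ns="${2:-default}"
  run kubectl rollout undo "deployment/${deploy}" -n "$ns"
  run kubectl rollout status "deployment/${deploy}" -n "$ns" --timeout=120s
}

# Scale a deployment to the requested replica count.
scale_out() {
  local deploy="$1" replicas="$2" ns="${3:-default}"
  run kubectl scale "deployment/${deploy}" --replicas="$replicas" -n "$ns"
}
```

Calling `rollback checkout-api prod` in dry-run mode prints the two commands it would run, which makes it easy to paste them into an incident channel for review before setting `DRY_RUN=0`.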

Can I customize the severity levels?

Yes, the skill acts as a foundation that Claude uses to draft procedures tailored to your specific organizational needs and infrastructure.
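To make the idea of tailored severity levels concrete, here is a minimal sketch of a classifier that maps two signals to a SEV-1 to SEV-4 label. The function name `classify_severity`, the inputs (error rate as a whole percent, a yes/no customer-facing flag), and every threshold are hypothetical placeholders; an organization would substitute its own criteria.

```shell
# Map an error rate (whole percent) and a customer-impact flag to a severity label.
# All thresholds below are hypothetical placeholders, not values from the skill.
classify_severity() {
  local error_rate="$1" customer_facing="$2"
  if [ "$customer_facing" = "yes" ] && [ "$error_rate" -ge 50 ]; then
    echo "SEV-1"
  elif [ "$customer_facing" = "yes" ] && [ "$error_rate" -ge 10 ]; then
    echo "SEV-2"
  elif [ "$error_rate" -ge 10 ]; then
    echo "SEV-3"
  else
    echo "SEV-4"
  fi
}
```

Keeping the rules in one small function means a change to the severity policy touches a single place, and the same function can be reused by paging and status-update scripts.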

Does it include post-incident templates?

Yes, it includes a comprehensive post-mortem template designed to identify root causes, contributing factors, and trackable action items.
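The skill's actual template is not reproduced on this page, so the following is only a sketch of what a post-mortem skeleton covering those three elements might look like. The function name `new_postmortem`, the field layout, and the placeholder wording are assumptions made for illustration.

```shell
# Print a bare post-mortem skeleton with the three elements named above.
# The layout and placeholder text are hypothetical, not the skill's template.
new_postmortem() {
  local title="$1" severity="$2"
  cat <<EOF
Post-mortem: ${title}
Severity: ${severity}

Root cause:
  (what failed and why)

Contributing factors:
  - (conditions that made the failure more likely or harder to detect)

Action items:
  - [ ] (task) | owner: (name) | due: (date)
EOF
}
```

For example, `new_postmortem "Checkout API outage" SEV-1 > postmortem.txt` writes a starting file that the incident owner can fill in; giving each action item an owner and due date is what makes it trackable.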

Incident Response Runbooks

Name: Incident Response Runbooks
Author: TheBushidoCollective

Analytics and Monitoring

Standardizes production incident management with specialized playbooks for on-call engineering and post-mortem documentation.

This skill gives Claude a framework for handling system outages and performance degradations. It supports developers and SREs in high-pressure on-call scenarios with structured templates for incident detection, severity assessment, stakeholder communication, and technical mitigation strategies such as rollbacks and service restarts. By embedding best practices for investigation and post-incident analysis, it helps teams both resolve issues quickly and capture the root causes needed for long-term system reliability.

Key Features

01. Severity-based incident management framework (SEV-1 to SEV-4)

02. Standardized communication templates for stakeholder updates

03. 97 GitHub stars

04. Comprehensive post-mortem and root cause analysis templates

05. Technical mitigation playbooks for Kubernetes and API services

06. On-call handoff procedures and escalation decision trees

Use Cases

01. Drafting immediate status updates during an active production outage

02. Creating a detailed post-mortem report following a service recovery

03. Standardizing on-call handoff documentation for engineering teams
