Automates resilience testing and fault injection for Kubernetes clusters using Chaos Mesh and LitmusChaos patterns.
This skill helps developers and SREs treat reliability as an active practice rather than an afterthought by injecting intentional faults into Kubernetes environments. It provides production-proven experiment patterns for pod deletion, network latency, and resource constraints, so teams can find system weaknesses, limit blast radius, and confirm that observability tools detect issues before they affect production.
Key Features
01 Blast radius control and safety guardrails
02 Chaos Mesh and LitmusChaos YAML configurations
03 Network latency and packet loss simulation
04 Pod deletion and resource exhaustion patterns
05 Automated validation and rollback procedures
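
To make the blast-radius and pod-deletion features concrete, here is a minimal sketch of how a guarded pod-kill experiment might be generated. It builds a Chaos Mesh `PodChaos` manifest as a Python dictionary and refuses to target protected namespaces or too large a share of pods. The function name `pod_kill_experiment`, the `PROTECTED_NAMESPACES` set, and the `MAX_PERCENT` limit are illustrative assumptions, not part of this skill or of Chaos Mesh itself.

```python
import json

# Hypothetical guardrail values chosen for illustration; tune per environment.
PROTECTED_NAMESPACES = {"kube-system", "kube-public"}
MAX_PERCENT = 50


def pod_kill_experiment(name, namespace, app_label, percent):
    """Build a Chaos Mesh PodChaos manifest that kills a bounded share of pods."""
    # Guardrail 1: never target namespaces on the protected list.
    if namespace in PROTECTED_NAMESPACES:
        raise ValueError(f"namespace {namespace!r} is protected")
    # Guardrail 2: cap the share of matching pods that may be killed.
    if not 0 < percent <= MAX_PERCENT:
        raise ValueError(f"percent must be in (0, {MAX_PERCENT}]")
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "PodChaos",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "action": "pod-kill",
            # "fixed-percent" selects the given percentage of matching pods.
            "mode": "fixed-percent",
            "value": str(percent),
            "selector": {
                "namespaces": [namespace],
                "labelSelectors": {"app": app_label},
            },
        },
    }


if __name__ == "__main__":
    manifest = pod_kill_experiment("kill-checkout", "staging", "checkout", 25)
    # JSON is a subset of YAML, so this output can be saved and applied with kubectl.
    print(json.dumps(manifest, indent=2))
```

Keeping the limits in code rather than in each manifest means an oversized experiment fails before anything reaches the cluster. The names `kill-checkout`, `staging`, and `checkout` are placeholders.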
Use Cases
01 Testing microservice resilience against sudden pod failures
02 Validating monitoring and alerting during infrastructure stress
03 Simulating network partitions and cross-region latency
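
As an illustration of the cross-region latency use case, the following sketch generates a Chaos Mesh `NetworkChaos` manifest that adds delay to traffic from matching pods for a bounded duration. The helper name `network_delay_experiment` and all argument values are hypothetical; only the manifest fields follow the Chaos Mesh `v1alpha1` schema.

```python
import json


def network_delay_experiment(name, namespace, app_label,
                             latency_ms, jitter_ms=0, duration="60s"):
    """Build a Chaos Mesh NetworkChaos manifest that injects latency."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    if jitter_ms < 0:
        raise ValueError("jitter_ms must not be negative")
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "NetworkChaos",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "action": "delay",
            # "all" applies the delay to every pod matched by the selector.
            "mode": "all",
            "selector": {
                "namespaces": [namespace],
                "labelSelectors": {"app": app_label},
            },
            "delay": {
                "latency": f"{latency_ms}ms",
                "jitter": f"{jitter_ms}ms",
            },
            # The experiment ends on its own after this duration.
            "duration": duration,
        },
    }


if __name__ == "__main__":
    manifest = network_delay_experiment(
        "cross-region-delay", "staging", "checkout",
        latency_ms=150, jitter_ms=20,
    )
    # JSON is a subset of YAML, so this output can be saved and applied with kubectl.
    print(json.dumps(manifest, indent=2))
```

Setting an explicit `duration` gives the experiment a built-in end point, which is one simple way to bound its impact while checking whether latency alerts fire.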