Overview
This skill provides end-to-end guidance for building robust observability stacks, from designing SLIs/SLOs to implementing distributed tracing and structured logging. It assists in setting up Prometheus, Grafana, and Loki, creating actionable alerts, and generating dashboards automatically. Whether you are migrating from Datadog to open-source alternatives or troubleshooting performance bottlenecks, it supplies the implementation patterns and scripts needed to keep your infrastructure measurable and resilient.