Does it include alerting templates?

It provides predefined Prometheus alerting rules covering critical scenarios like service outages, high error rates, and latency thresholds (p95/p99).

How does it handle request tracking across services?

The skill implements correlation IDs and trace context propagation, allowing you to follow a single user request through multiple backend services in your logs and traces.
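As a rough illustration of the correlation-ID idea, the sketch below uses only the Python standard library. The header name `X-Correlation-ID` and the helper names `begin_request` and `outgoing_headers` are assumptions made for this example, not the skill's actual interface.

```python
import contextvars
import logging
import uuid

# Holds the correlation ID for the request currently being handled.
correlation_id = contextvars.ContextVar("correlation_id", default="-")

HEADER = "X-Correlation-ID"  # assumed header name, not taken from the skill


class CorrelationFilter(logging.Filter):
    """Attach the current correlation ID to every log record."""

    def filter(self, record):
        record.correlation_id = correlation_id.get()
        return True


def begin_request(headers):
    """Reuse the caller's ID if present, otherwise mint a new one."""
    cid = headers.get(HEADER) or uuid.uuid4().hex
    correlation_id.set(cid)
    return cid


def outgoing_headers():
    """Headers to attach to calls made to downstream services."""
    return {HEADER: correlation_id.get()}
```

Adding `CorrelationFilter()` to a log handler and including `%(correlation_id)s` in its format string would stamp each log line with the same ID, and passing `outgoing_headers()` on downstream calls carries that ID to the next service.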

What observability standards does this skill support?

This skill implements industry-standard practices including OpenTelemetry for distributed tracing, the RED method for service metrics, and structured JSON for machine-parseable logging.
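To make two of these practices concrete, here is a minimal standard-library sketch of structured JSON logging and RED-style bookkeeping. The class names `JsonFormatter` and `RedMetrics` are hypothetical, and a real deployment would more likely export these values through a Prometheus client library than keep them in memory.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


class RedMetrics:
    """In-memory RED counters for one service (Rate, Errors, Duration)."""

    def __init__(self):
        self.requests = 0
        self.errors = 0
        self.durations = []

    def observe(self, duration_s, ok):
        """Record one finished request and whether it succeeded."""
        self.requests += 1
        if not ok:
            self.errors += 1
        self.durations.append(duration_s)

    def summary(self, window_s):
        """Summarize the requests observed over a window of window_s seconds."""
        ordered = sorted(self.durations)
        # Nearest-rank p95: ceil(0.95 * n), computed with integer arithmetic.
        rank = (95 * len(ordered) + 99) // 100
        return {
            "rate_per_s": self.requests / window_s,
            "error_ratio": self.errors / self.requests if self.requests else 0.0,
            "p95_s": ordered[rank - 1] if ordered else None,
        }
```

The summary mirrors the three RED signals: requests per second, the fraction that failed, and a latency percentile of the kind the p95/p99 alert thresholds refer to.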

Is this skill compatible with Kubernetes environments?

Yes. It includes patterns for implementing Kubernetes liveness, readiness, and startup probes, so the orchestrator can restart unhealthy containers and route traffic only to instances that are ready.
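One possible shape for the endpoints behind those probes is sketched below with Python's built-in HTTP server. The paths `/livez`, `/readyz`, and `/startupz`, the `STATE` dictionary, and the port are assumptions for illustration; the skill's own patterns may differ.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Mutable process state; a real service would update these from its own checks.
STATE = {"started": False, "dependencies_ok": False}


def probe_status(path, state):
    """Map a probe path to an HTTP status code."""
    if path == "/livez":
        return 200  # the process is up and able to answer
    if path == "/startupz":
        return 200 if state["started"] else 503
    if path == "/readyz":
        ready = state["started"] and state["dependencies_ok"]
        return 200 if ready else 503
    return 404


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(probe_status(self.path, STATE))
        self.end_headers()


def serve(port=8080):
    """Serve probe endpoints until the process is stopped."""
    HTTPServer(("", port), ProbeHandler).serve_forever()
```

Keeping the liveness check independent of downstream dependencies is a deliberate choice in this sketch: a failing database should take the pod out of rotation through the readiness probe, not trigger a restart.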

Can it help monitor LLM-specific metrics?

Yes, it includes specialized integration for Langfuse to track LLM-specific data such as token usage, execution costs, and prompt/response traces.

Observability & Monitoring

Name: Observability & Monitoring
Author: yonatangross

Analytics and Monitoring

Implements comprehensive observability frameworks including structured logging, Prometheus metrics, and distributed tracing with OpenTelemetry.

The Observability & Monitoring skill provides a standardized approach to tracking application health and performance across distributed systems. It guides developers through implementing the three pillars of observability (logs, metrics, and traces) using industry-standard tools like Prometheus and OpenTelemetry. By combining structured JSON logging, RED method metrics, and specialized Langfuse decorators for LLM-specific monitoring, the skill helps keep production systems measurable, debuggable, and resilient.

Key Features

01 Specialized LLM observability for tracking token usage, latency, and costs via Langfuse.

02 Pre-configured alerting rules and dashboard designs for proactive incident management.

03 Setup of Prometheus metrics following the RED (Rate, Errors, Duration) method for service health.

04 69 GitHub stars

05 Distributed tracing integration using OpenTelemetry for visual request waterfall analysis.

06 Implementation of structured JSON logging and correlation IDs for cross-service request tracking.

Use Cases

01 Monitoring LLM application performance and operational costs in production environments.

02 Establishing Service Level Objectives (SLOs) through automated metrics collection and alerting.

03 Debugging complex distributed systems by correlating logs and traces across multiple microservices.

Install with 🐟 Skill.Fish

npx skillfish add yonatangross/orchestkit observability-monitoring

For use in Claude.ai and ChatGPT
