What is the RED method mentioned in the documentation?

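The RED method stands for Rate, Errors, and Duration: how many requests a service handles per second, how many of them fail, and how long they take. It is a widely used methodology for monitoring request-driven services, and this skill includes patterns for it.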
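
As a rough illustration of the three RED signals, the sketch below computes them from an in-memory list of request records. The `Request` type and `red_summary` function are hypothetical names chosen for this example, not part of the skill; in practice these values would usually come from Prometheus counters and histograms.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    duration_s: float  # how long the request took, in seconds
    ok: bool           # False if the request failed


def red_summary(requests, window_s, percentile=0.95):
    """Return rate (req/s), error ratio, and a duration percentile for one window."""
    if not requests:
        return {"rate": 0.0, "error_ratio": 0.0, "duration_p": None}
    durations = sorted(r.duration_s for r in requests)
    # Nearest-rank percentile: smallest sample with at least
    # `percentile` of all samples at or below it.
    rank = max(1, math.ceil(percentile * len(durations)))
    errors = sum(1 for r in requests if not r.ok)
    return {
        "rate": len(requests) / window_s,
        "error_ratio": errors / len(requests),
        "duration_p": durations[rank - 1],
    }
```

Calling `red_summary(requests, window_s=60)` on the requests seen in the last minute gives one data point per signal; alerting would then compare each value against a threshold.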

How does it help with LLM cost management?

The skill includes rules for cost tracking and token usage monitoring through the Langfuse Metrics API, so that spending can be watched against a budget and overruns caught early.
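
The idea of checking token spend against a budget can be sketched as follows. This example does not call the Langfuse Metrics API; it assumes usage records have already been fetched. The `Usage` type, the `budget_status` function, the model name, and the prices are all hypothetical placeholders.

```python
from dataclasses import dataclass

# Hypothetical per-1K-token prices in USD; substitute your provider's real rates.
PRICES_PER_1K = {"example-model": {"input": 0.001, "output": 0.002}}


@dataclass
class Usage:
    model: str
    input_tokens: int
    output_tokens: int


def cost_usd(usage):
    """Estimate the cost of one call from its token counts."""
    price = PRICES_PER_1K[usage.model]
    total = usage.input_tokens * price["input"] + usage.output_tokens * price["output"]
    return total / 1000


def budget_status(usages, daily_budget_usd, warn_ratio=0.8):
    """Return ("ok" | "warn" | "over", spent_usd) for a day's usage records."""
    spent = sum(cost_usd(u) for u in usages)
    if spent >= daily_budget_usd:
        return "over", spent
    if spent >= warn_ratio * daily_budget_usd:
        return "warn", spent
    return "ok", spent
```

The separate "warn" level is a design choice for this sketch: it leaves time to react before the budget is actually exhausted.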

Can it detect when my AI model's quality drops?

Yes, it features statistical drift detection, using the Population Stability Index (PSI) and Kolmogorov-Smirnov (KS) tests, along with quality regression monitoring to alert you to performance changes in production.
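
To make the two statistics concrete, here is a minimal pure-Python sketch of PSI and the two-sample KS statistic. It is an illustration under simple assumptions (equal-width bins taken from the baseline sample, a small `eps` floor for empty bins), not the skill's own implementation; the function names are chosen for this example.

```python
import math
from bisect import bisect_right


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a baseline and a current sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def fractions(values):
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range values into the first or last bin.
            idx = min(bins - 1, max(0, int((v - lo) / width)))
            counts[idx] += 1
        # eps avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )
```

Both functions return 0 for identical samples and grow as the distributions diverge. Alert thresholds are a judgment call; a commonly cited rule of thumb treats a PSI above roughly 0.2 to 0.25 as a significant shift.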

Does this skill support self-hosted monitoring solutions?

Yes, the patterns are designed to work with both cloud and self-hosted instances of Langfuse and Prometheus for maximum data sovereignty.

What monitoring tools does this skill support?

It provides patterns and rules for Prometheus, Grafana, and OpenTelemetry, plus Langfuse for LLM-specific observability.

Monitoring & Observability

Name: Monitoring & Observability
Author: yonatangross


Analytics & Monitoring

Implements comprehensive infrastructure monitoring, LLM tracing, and drift detection patterns to ensure production-grade system reliability.

This skill provides a robust framework for managing the health and performance of modern applications, specifically focusing on the intersection of traditional infrastructure and LLM-driven systems. It includes standardized patterns for Prometheus metrics, Grafana dashboards, and OpenTelemetry tracing, alongside specialized modules for Langfuse LLM observability, cost tracking, and statistical drift detection. By automating the setup of RED metrics, golden signals, and quality evaluation scoring, it helps developers catch silent failures, regressions, and cost spikes before they impact production users.

Key Features

01 116 GitHub stars

02 Standardized alerting rules based on severity levels and dynamic percentile thresholds

03 LLM observability with Langfuse for tracing, cost tracking, and evaluation scoring

04 Statistical and quality drift detection to identify performance regressions in AI models

05 Silent failure detection for LLM agents including tool skipping and token spike alerting

06 Infrastructure monitoring using Prometheus metrics and Grafana dashboard templates
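
A dynamic percentile threshold for token spike alerting, as mentioned in the features above, might look like the sketch below. The function names, the 95th percentile, and the 2x multiplier are assumptions made for illustration; the skill's actual rules may differ.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile of a non-empty list, with q in (0, 1]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def is_token_spike(history, current, q=0.95, multiplier=2.0):
    """Flag `current` if it exceeds a multiple of the recent q-th percentile."""
    if not history:
        return False  # no baseline yet, so nothing to compare against
    return current > multiplier * percentile(history, q)
```

Because the threshold is derived from recent token counts rather than fixed, it adapts as normal usage grows or shrinks.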

Use Cases

01 Debugging silent agent failures and optimizing token costs through detailed trace analysis

02 Implementing automated drift detection to monitor for quality degradation in production AI models

03 Setting up full-stack observability for a new LLM-powered application using Prometheus and Langfuse


Install with 🐟 Skill.Fish

npx skillfish add yonatangross/orchestkit monitoring-observability
