Discover Agent Skills for analytics & monitoring. Browse 47 skills for Claude, ChatGPT & Codex.
Diagnoses and resolves LLM workflow issues by performing structured root-cause analysis on Langfuse traces.
Analyzes and visualizes LLM quality scores, trends, and regressions within the Langfuse observability platform.
Provides strategic guidance for evaluating and optimizing AI agents using Langfuse traces and data-driven iteration loops.
Manages the complete lifecycle of Langfuse LLM prompts, including version control, deployment labels, and side-by-side version comparisons.
Provides deep insights into multi-turn LLM conversations by analyzing and debugging Langfuse trace sessions.
Orchestrates end-to-end evaluation cycles for AI agents using Langfuse to identify performance regressions and generate actionable optimization reports.
Manages human annotations and manual scoring workflows for Langfuse LLM traces directly from Claude.
Analyzes multi-turn conversation flows and session-level metrics within Langfuse to debug user journeys and track LLM performance.
Manages Langfuse datasets for AI regression testing and golden set curation directly through the Claude Code CLI.
Analyzes tracked behaviors and outcomes to generate actionable insights and data-driven feedback loops.
Fetches and analyzes Langfuse traces with customizable output modes to debug LLM workflows and monitor application performance.
Analyzes LLM performance metrics and score trends directly within Claude Code using Langfuse observability data.
Manages Langfuse datasets and traces to streamline LLM application evaluation and regression testing.
Generates professional ASCII-based charts, progress bars, and dashboards for terminal-based data visualization.
Automates the creation and configuration of Langfuse datasets for LLM evaluation and observability workflows.
Builds interactive, publication-quality data visualizations and complex SVG-based diagrams using the D3.js library.
Configures and installs a real-time status line for Claude Code to monitor token usage, API costs, and context window limits.
Profiles and optimizes Python code to eliminate bottlenecks, reduce memory overhead, and accelerate execution using advanced profiling tools and best practices.
Facilitates LLM experiment execution and prompt evaluation using Langfuse datasets and automated LLM-as-judge scoring.
Maps high-level goals to the specific daily behaviors that produce them, creating actionable, data-driven tracking systems.
Monitors, troubleshoots, and calibrates automated quality metrics and safety evaluations for AI-generated tarot readings.
Implements end-to-end request tracking across microservices using Jaeger, Tempo, and OpenTelemetry to identify performance bottlenecks.
Configures Prometheus for comprehensive metric collection, infrastructure monitoring, and proactive alerting.
Creates and manages production-grade Grafana dashboards for real-time visualization of system, infrastructure, and application metrics.
Implements measurable reliability targets using Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets to balance stability with innovation velocity.
Manages healthcare metrics and dbt models for the Engage Analytics project, including questionnaire processing and indicator definitions.
Performs comprehensive market sizing calculations using TAM, SAM, and SOM frameworks to evaluate business opportunities and startup potential.
Streamlines the development, debugging, and optimization of scripts within SAP Analytics Cloud Analytics Designer and Optimized Story Experience.
Monitors system-wide health metrics including CPU utilization, memory consumption, disk usage, and active process states.
Analyzes financial transaction data to generate comprehensive spending insights, interactive visualizations, and personalized budget recommendations.