Discover Agent Skills for analytics & monitoring. Browse 47 skills for Claude, ChatGPT & Codex.
Configures and manages Grafana alerts for Claude Code to monitor session anomalies, error rates, and resource utilization.
Configures and enables OpenTelemetry logging, metrics, and tracing for Claude Code to monitor session performance and costs.
Deploys a comprehensive LGTM observability stack to Railway cloud for centralized monitoring and team access.
Builds bespoke, interactive data visualizations and complex codebase maps using the D3.js library for high-level data storytelling.
Monitors and reduces Anthropic API expenses through advanced token tracking and implementation of optimization patterns like prompt caching and effort selection.
Orchestrates a multi-agent debugging workflow using Claude, Gemini, and Codex to perform advanced root cause analysis and automated fix generation.
Analyzes real-time cryptocurrency market sentiment and whale activity using Grok's native X integration.
Orchestrates multiple AI agents to perform systematic root cause analysis, semantic log classification, and automated fix generation for complex system failures.
Analyzes Claude Code telemetry data to provide deep insights into performance, costs, and tool usage patterns.
Automates the setup and management of comprehensive Grafana dashboards for monitoring Claude Code performance, costs, and errors.
Performs comprehensive system audits and diagnostic health checks for Maestro skills, agents, hooks, and memory systems.
Analyzes Claude Flow swarms to detect performance bottlenecks, profile operations, and provide actionable AI-powered optimization recommendations.
Extracts actionable patterns and learnings from autonomous coding sessions to optimize future AI performance.
Deploys and manages comprehensive Grafana dashboards for monitoring Claude Code performance, costs, and session health.
Accesses and analyzes Railway build, deployment, and runtime logs for debugging and monitoring applications.
Extracts actionable insights and performance patterns from autonomous coding sessions to optimize future AI interactions.
Analyzes Claude Code telemetry to generate actionable insights into performance, costs, and tool usage patterns.
Monitors token usage and optimizes API expenditure for autonomous coding agents.
Implements robust error-handling patterns across API routes, client-side components, and data fetching logic to ensure application stability and graceful failure.
Debugs and resolves intent classification, routing errors, and detection misfires within the OpenEvent-AI workflow.
Debugs and eliminates generic fallback responses by pinpointing failure triggers and automating reproduction steps.
Validates Decision API JSONL logs against schema requirements and platform invariants to ensure data integrity.
Guides teams through creating blameless post-incident reviews, identifying root causes, and implementing actionable follow-up items.
Identifies performance bottlenecks in Python code through systematic profiling and applies targeted, measurable optimizations.
Standardizes on-call shift transitions using structured context transfer, incident documentation, and escalation procedures to ensure service reliability.
Streamlines production incident management by providing structured runbook templates and standardized response procedures.
Implements comprehensive monitoring, distributed tracing, and visualization for Istio and Linkerd service mesh environments.
Implements distributed tracing using Jaeger and Tempo to monitor request flows and optimize performance in microservices architectures.
Configures and optimizes Prometheus for robust metric collection, alerting, and observability across infrastructure and applications.
Enforces an observation-first debugging workflow by verifying system outputs and logs before making code changes.