Detects and analyzes performance regressions across software versions using statistical significance testing and automated benchmarking.
This skill enables Claude to automatically identify performance degradation by comparing current metrics against historical baselines. It integrates with load-testing tools such as k6 and Artillery to measure latency percentiles (p50 through p99), throughput, and memory utilization. By applying statistical tests such as Welch's t-test and the Mann-Whitney U test, the skill separates real regressions from environmental noise and reports them with a stated level of confidence, helping developers catch bottlenecks and memory leaks early in the software development lifecycle.
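As an illustration of the significance testing described above, here is a minimal sketch of Welch's t-test using only the Python standard library. The function name `welch_t_test`, the sample data, and the 0.05 threshold are assumptions made for this example, not part of the skill itself. The p-value uses a normal approximation, which is only reasonable when the degrees of freedom are large; a real pipeline would more likely use a statistics library with the exact t distribution.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t_test(baseline, candidate):
    """Return (t, df, p_approx) for two independent samples with unequal variances."""
    n1, n2 = len(baseline), len(candidate)
    v1, v2 = variance(baseline), variance(candidate)  # sample variances
    se2 = v1 / n1 + v2 / n2  # squared standard error of the mean difference
    t = (mean(candidate) - mean(baseline)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    # Two-sided p-value via a normal approximation (assumption: df is large)
    p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, df, p_approx


# Hypothetical latency samples in milliseconds: the candidate is about 5 ms slower
baseline_ms = [100 + (i % 7) for i in range(40)]
candidate_ms = [105 + (i % 7) for i in range(40)]

t, df, p = welch_t_test(baseline_ms, candidate_ms)
regressed = p < 0.05 and t > 0  # slower AND statistically significant
```

A positive `t` means the candidate is slower on average; requiring both a small p-value and a positive sign avoids flagging statistically significant improvements as regressions.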
Key Features
01 Automated baseline identification from git tags and historical monitoring data
02 Deep memory leak detection via heap snapshot and growth rate analysis
03 Advanced statistical significance testing to eliminate benchmark noise
04 Comprehensive reporting with latency distributions and resource utilization trends
05 Multi-tool integration for load testing including k6, Artillery, and wrk
Use Cases
01 Detecting subtle memory leaks and heap growth in high-traffic applications
02 Comparing API latency distributions (p95, p99) between production releases
03 Validating performance stability before merging major architectural refactors
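For the latency comparison use case above, one simple approach is to compute the same percentiles for both releases and report the relative change. This is a sketch under stated assumptions: the function names and the sample data are hypothetical, and in practice the samples would come from a load-testing tool's output rather than a generated list.

```python
from statistics import quantiles


def latency_percentiles(samples_ms):
    """Return p50, p95, and p99 using linear interpolation between data points."""
    cuts = quantiles(samples_ms, n=100, method="inclusive")  # 99 cut points
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def percentile_deltas(baseline_ms, candidate_ms):
    """Relative change of each percentile, in percent of the baseline value."""
    base = latency_percentiles(baseline_ms)
    cand = latency_percentiles(candidate_ms)
    return {k: (cand[k] - base[k]) / base[k] * 100 for k in base}


# Hypothetical samples: every candidate request is 10% slower than the baseline
baseline_ms = [float(x) for x in range(1, 101)]
candidate_ms = [x * 1.1 for x in baseline_ms]

deltas = percentile_deltas(baseline_ms, candidate_ms)
```

Comparing tail percentiles separately from the median matters because a release can leave p50 unchanged while noticeably worsening p99.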