Which forecasting models is this compatible with?

It works with metrics produced by Nixtla tools (StatsForecast, TimeGPT, NeuralForecast) and with any custom model whose metrics follow the same CSV format.

What statistical insights are provided in the reports?

Reports include mean, median, standard deviation, min/max values, percentiles (25th to 95th), and model win rates across all series.
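
As a rough illustration of the statistics listed above, the sketch below computes them with the Python standard library. It is not the skill's own implementation: the function names, the lowercase `smape` key, the tie-breaking rule, and the sample values are all assumptions made for this example.

```python
import statistics
from collections import Counter, defaultdict

# Hypothetical per-series sMAPE values (illustrative, not real benchmark results).
rows = [
    {"series_id": "s1", "model": "AutoETS", "smape": 10.0},
    {"series_id": "s1", "model": "AutoARIMA", "smape": 12.0},
    {"series_id": "s2", "model": "AutoETS", "smape": 8.0},
    {"series_id": "s2", "model": "AutoARIMA", "smape": 6.0},
    {"series_id": "s3", "model": "AutoETS", "smape": 9.0},
    {"series_id": "s3", "model": "AutoARIMA", "smape": 11.0},
]


def summarize(rows, metric="smape"):
    """Return per-model summary statistics for one metric."""
    by_model = defaultdict(list)
    for row in rows:
        by_model[row["model"]].append(row[metric])

    summary = {}
    for model, values in by_model.items():
        stats = {
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            "min": min(values),
            "max": max(values),
            # Sample standard deviation needs at least two values.
            "std": statistics.stdev(values) if len(values) > 1 else 0.0,
        }
        if len(values) > 1:
            # 19 cut points at 5%, 10%, ..., 95%; index 4 is the 25th, index 18 the 95th.
            cuts = statistics.quantiles(values, n=20, method="inclusive")
            stats["p25"], stats["p95"] = cuts[4], cuts[18]
        summary[model] = stats
    return summary


def win_rates(rows, metric="smape"):
    """Share of series on which each model has the lowest metric value.

    Assumption: lower is better, and on a tie the first row seen wins.
    """
    best = {}
    for row in rows:
        sid = row["series_id"]
        if sid not in best or row[metric] < best[sid][metric]:
            best[sid] = row
    wins = Counter(r["model"] for r in best.values())
    models = {r["model"] for r in rows}
    return {m: wins[m] / len(best) for m in models}


summary = summarize(rows)
rates = win_rates(rows)
```

With this sample data, AutoETS has the lower sMAPE on two of the three series, so its win rate is 2/3.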

What file formats does this skill support?

The skill parses standard CSV files containing forecast metrics with columns for series_id, model, and metrics like sMAPE, MASE, MAE, or RMSE.
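
A minimal sketch of reading such a file follows, assuming a header row with `series_id`, `model`, and at least one of the metric columns named above. The function name, the case-insensitive header matching, and the sample rows are assumptions for illustration; the skill's actual parser may behave differently.

```python
import csv
import io

REQUIRED_COLUMNS = {"series_id", "model"}
KNOWN_METRICS = ("smape", "mase", "mae", "rmse")

# Hypothetical file contents (illustrative values only).
SAMPLE_CSV = """series_id,model,sMAPE,MASE
s1,AutoETS,10.0,0.9
s1,AutoARIMA,12.0,1.1
"""


def parse_metrics_csv(text):
    """Parse metrics CSV text into a list of dicts with float metric values."""
    reader = csv.DictReader(io.StringIO(text))
    # Map lowercase column names to the names used in the file (assumed convention).
    header = {name.strip().lower(): name for name in (reader.fieldnames or [])}

    missing = REQUIRED_COLUMNS - header.keys()
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")

    metrics = [m for m in KNOWN_METRICS if m in header]
    if not metrics:
        raise ValueError("no metric columns found")

    rows = []
    for raw in reader:
        row = {
            "series_id": raw[header["series_id"]],
            "model": raw[header["model"]],
        }
        for metric in metrics:
            row[metric] = float(raw[header[metric]])
        rows.append(row)
    return rows


rows = parse_metrics_csv(SAMPLE_CSV)
```

Normalizing the metric names to lowercase keeps later steps independent of how a given file capitalizes `sMAPE` or `MASE`.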

Can I use this for regression testing in CI/CD?

Yes. Given a baseline CSV and a threshold percentage, the skill detects performance drops automatically and generates GitHub issue templates for your team.
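
The check described above might look roughly like the following. This is a sketch under stated assumptions, not the skill's code: it compares per-model mean sMAPE (lower is better), treats the threshold as a relative percentage increase over the baseline, and uses hypothetical function names, issue wording, and numbers.

```python
def detect_regressions(baseline, current, threshold_pct=5.0):
    """Return models whose metric worsened by more than threshold_pct.

    baseline and current map model name -> mean metric value (lower is better).
    """
    regressions = []
    for model, base in baseline.items():
        if model not in current or base == 0:
            continue  # nothing to compare, or relative change undefined
        change_pct = (current[model] - base) / base * 100.0
        if change_pct > threshold_pct:
            regressions.append({
                "model": model,
                "baseline": base,
                "current": current[model],
                "change_pct": change_pct,
            })
    return regressions


def format_issue(regressions, threshold_pct):
    """Render a markdown issue body (hypothetical layout)."""
    lines = [f"## Forecast accuracy regression (threshold: {threshold_pct:.1f}%)", ""]
    for r in regressions:
        lines.append(
            f"- {r['model']}: sMAPE {r['baseline']:.2f} -> {r['current']:.2f} "
            f"(+{r['change_pct']:.1f}%)"
        )
    return "\n".join(lines)


# Hypothetical mean sMAPE per model for the baseline and the current run.
baseline = {"AutoETS": 10.0, "AutoARIMA": 12.0}
current = {"AutoETS": 11.0, "AutoARIMA": 12.1}

found = detect_regressions(baseline, current, threshold_pct=5.0)
issue_text = format_issue(found, 5.0)
# In a CI job, a non-zero exit code would fail the pipeline step.
exit_code = 1 if found else 0
```

Here only AutoETS is flagged: its sMAPE rises by about 10%, while the AutoARIMA change of under 1% stays within the 5% threshold.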

How do I trigger the report generation?

You can trigger the skill in Claude Code using commands like 'generate benchmark report', 'analyze forecast metrics', or 'create performance summary'.

Nixtla Benchmark Reporter

Name: Nixtla Benchmark Reporter
Author: intent-solutions-io


Data Science & ML

Generates detailed markdown reports and regression analysis from forecasting accuracy metrics to automate model benchmarking.

This skill automates the evaluation of Nixtla forecasting experiments by turning raw metrics into comprehensive, production-ready benchmark reports. It calculates key summary statistics, identifies the best-performing models, detects performance regressions against historical baselines, and generates actionable recommendations. Designed for data scientists and ML engineers, it cuts evaluation time from hours to minutes and keeps reporting and quality control consistent across time-series forecasting workflows.

Key Features

01 Automated calculation of sMAPE, MASE, MAE, and RMSE summary statistics

02 Multi-model comparison tables with automatic winner identification

03 Support for multiple output formats including executive summaries and technical reports

04 Performance regression detection with configurable alerting thresholds

05 Generation of GitHub issue templates for automated degradation reporting

Use Cases

01 Comparing TimeGPT or StatsForecast model performance across large-scale datasets

02 Monitoring forecast quality in CI/CD pipelines to prevent performance regressions

03 Communicating forecasting experiment results to stakeholders via formatted markdown reports


Install with 🐟 Skill.Fish

npx skillfish add intent-solutions-io/plugins-nixtla nixtla-benchmark-reporter
