Can I use this for multi-class classification?

Yes, it includes patterns for macro, micro, and weighted averaging for multi-class metrics, as well as one-vs-all ROC calculations.
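
As a rough illustration of the averaging options mentioned above, here is a minimal sketch using yardstick. The toy predictions are made up for this example, and `hpc_cv` is the sample multi-class data set shipped with yardstick; the skill's own patterns may be organized differently.

```r
library(yardstick)
library(tibble)

# Toy three-class predictions (made-up data for illustration only).
lvls <- c("a", "b", "c")
preds <- tibble(
  truth    = factor(c("a", "a", "b", "b", "c", "c"), levels = lvls),
  estimate = factor(c("a", "b", "b", "b", "c", "a"), levels = lvls)
)

# Macro: unweighted mean of the per-class precision values.
macro <- precision(preds, truth, estimate, estimator = "macro")
# Micro: pool the counts across classes, then compute precision once.
micro <- precision(preds, truth, estimate, estimator = "micro")
# Weighted: per-class precision weighted by class frequency in `truth`.
weighted <- precision(preds, truth, estimate, estimator = "macro_weighted")

rbind(macro, micro, weighted)

# One-vs-all ROC AUC, macro-averaged over the four classes of hpc_cv.
data(hpc_cv, package = "yardstick")
ova_auc <- roc_auc(hpc_cv, obs, VF:L, estimator = "macro")
ova_auc
```

With balanced classes, as in the toy data, the macro and weighted estimates coincide; they diverge once class frequencies differ.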

How do I handle imbalanced datasets?

The skill includes specific patterns for PR AUC (area under the precision-recall curve) and gain/lift curves. These are often more informative than ROC curves when the positive class is rare, because ROC AUC can look strong even when precision is poor.
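
A minimal sketch of these metrics in yardstick follows. It uses `two_class_example`, the sample data set bundled with yardstick, purely for convenience; that data set is roughly balanced, so treat it as a stand-in for a real imbalanced problem.

```r
library(yardstick)
data(two_class_example, package = "yardstick")

# Areas under the ROC and precision-recall curves.
# By default the first factor level ("Class1") is treated as the event.
roc <- roc_auc(two_class_example, truth, Class1)
pr  <- pr_auc(two_class_example, truth, Class1)
rbind(roc, pr)

# Curve data, one row per threshold, ready for plotting.
pr_df   <- pr_curve(two_class_example, truth, Class1)
gain_df <- gain_curve(two_class_example, truth, Class1)
lift_df <- lift_curve(two_class_example, truth, Class1)
head(pr_df)
```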

Which R packages does this skill primarily use?

It focuses on the tidymodels ecosystem: yardstick for metrics, the probably package for calibration and threshold selection, and tidyposterior for statistical model comparison.

Does it support regression tasks?

Yes, it provides patterns for standard metrics such as RMSE and R-squared, as well as metrics that are less sensitive to outliers, such as MAE and Huber loss.
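
For instance, the four metrics named above can be bundled with yardstick's `metric_set()`. This is only a sketch on made-up numbers, not the skill's own template.

```r
library(yardstick)
library(tibble)

# Bundle standard and outlier-resistant regression metrics into one function.
reg_metrics <- metric_set(rmse, rsq, mae, huber_loss)

# Made-up predictions with one large miss (7.0 predicted as 9.0).
preds <- tibble(
  truth    = c(3.0, 5.0, 2.5, 7.0, 4.5),
  estimate = c(2.5, 5.0, 3.0, 9.0, 4.0)
)

# Returns one row per metric with columns .metric, .estimator, .estimate.
res <- reg_metrics(preds, truth = truth, estimate = estimate)
res
```

The single large error inflates RMSE more than MAE, which is why the two are usually reported together.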

Can this help with prediction uncertainty?

Yes, it provides implementations for bootstrap confidence intervals and conformal prediction intervals using the probably package.
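
The bootstrap half of that answer can be sketched as follows, assuming rsample for the resampling step (rsample is part of tidymodels but is not named on this page) and a simple percentile interval. Conformal intervals with probably are not shown here.

```r
library(rsample)
library(yardstick)
data(two_class_example, package = "yardstick")

set.seed(123)
# 200 bootstrap resamples of the prediction data frame.
boots <- bootstraps(two_class_example, times = 200)

# Recompute ROC AUC on each bootstrap sample.
auc_boot <- vapply(
  boots$splits,
  function(split) roc_auc(analysis(split), truth, Class1)$.estimate,
  numeric(1)
)

# 95% percentile interval for ROC AUC.
ci <- quantile(auc_boot, probs = c(0.025, 0.975))
ci
```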

TidyR Model Evaluation

Name: TidyR Model Evaluation
Author: choxos


Data Science & Machine Learning

Streamlines machine learning model performance assessment in R using the tidymodels ecosystem.

About

This skill provides standardized patterns and implementation guides for evaluating machine learning models in R, specifically leveraging the yardstick and probably packages. It offers a comprehensive suite of tools for binary and multi-class classification, regression analysis, and survival outcomes. Users can easily implement sophisticated evaluation workflows including probability calibration, threshold optimization, visual diagnostics like ROC and Precision-Recall curves, and statistical model comparisons using tidyposterior. It is an essential resource for data scientists seeking to move beyond simple accuracy to rigorous, production-ready model validation.

Key Features

Comprehensive metric sets for classification, regression, and cost-sensitive tasks
Advanced threshold optimization for J-index and custom misclassification costs
Probability calibration methods including Platt scaling and isotonic regression
Automated visualization patterns for ROC, PR, calibration, and gain/lift curves
Statistical model comparison and uncertainty quantification via bootstrapping

Use Cases

Performing rigorous statistical comparisons between multiple model architectures using resampled data
Optimizing decision thresholds for medical or financial models where false positives and negatives have different costs
Validating classifier performance on imbalanced datasets using PR AUC and specialized metrics
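
The threshold optimization use case above can be sketched as follows. `threshold_perf()` is from the probably package; the cost values `cost_fp` and `cost_fn` are hypothetical numbers chosen for illustration, and the cost calculation is plain base R rather than anything the skill prescribes.

```r
library(probably)
library(yardstick)
data(two_class_example, package = "yardstick")

thresholds <- seq(0.1, 0.9, by = 0.05)

# Sensitivity, specificity, and J-index at each candidate threshold.
perf <- threshold_perf(two_class_example, truth, Class1, thresholds = thresholds)

# Threshold that maximizes Youden's J (sensitivity + specificity - 1).
j <- perf[perf$.metric == "j_index", ]
best_j <- j[which.max(j$.estimate), ]
best_j$.threshold

# Cost-based alternative: hypothetical costs, a miss is 5x worse than a false alarm.
cost_fp <- 1
cost_fn <- 5
is_event <- two_class_example$truth == "Class1"
total_cost <- vapply(thresholds, function(t) {
  pred_event <- two_class_example$Class1 >= t
  cost_fp * sum(pred_event & !is_event) + cost_fn * sum(!pred_event & is_event)
}, numeric(1))
best_cost_threshold <- thresholds[which.min(total_cost)]
best_cost_threshold
```

When false negatives are costlier than false positives, the cost-minimizing threshold tends to sit below the J-index optimum, since lowering the cutoff trades extra false alarms for fewer misses.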
