Can I specify custom split percentages?

Yes. You can specify any proportions, such as 70% training, 15% validation, and 15% testing, and the skill generates the corresponding splitting code.
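As a rough illustration of what a 70/15/15 split can look like in pandas, here is a minimal sketch. The function name `split_three_way`, its parameters, and the toy DataFrame are assumptions made for this example; the code the skill actually generates may differ.

```python
import pandas as pd


def split_three_way(df, train=0.70, val=0.15, test=0.15, seed=42):
    """Shuffle df, then cut it into train/validation/test partitions.

    Hypothetical helper for illustration only.
    """
    if abs(train + val + test - 1.0) > 1e-9:
        raise ValueError("split fractions must sum to 1")

    # Shuffle all rows reproducibly before slicing.
    shuffled = df.sample(frac=1.0, random_state=seed).reset_index(drop=True)

    n = len(shuffled)
    train_end = round(n * train)
    val_end = train_end + round(n * val)

    # Whatever is left after the first two cuts becomes the test set.
    return (
        shuffled.iloc[:train_end],
        shuffled.iloc[train_end:val_end],
        shuffled.iloc[val_end:],
    )


# Toy data standing in for a real dataset.
df = pd.DataFrame({"feature": range(100), "label": [i % 2 for i in range(100)]})
train_df, val_df, test_df = split_three_way(df)
```

Giving the remainder to the test partition means rounding never drops or duplicates a row, at the cost of the test set absorbing any off-by-one difference.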

Does it support stratified splitting for imbalanced data?

Yes. The skill can stratify the split so that each subset preserves the class distribution of the full dataset.
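One way to stratify with pandas alone is to sample the same fraction from every class. The sketch below assumes a hypothetical `stratified_split` helper, a label column named `label`, and a DataFrame with a unique index; it is not necessarily how the skill implements stratification.

```python
import pandas as pd


def stratified_split(df, label_col, test_frac=0.2, seed=42):
    """Split df so each class contributes test_frac of its rows to the test set.

    Hypothetical helper; assumes df has a unique index.
    """
    # Sample within each class so rare classes are not left out by chance.
    test_df = df.groupby(label_col, group_keys=False).sample(
        frac=test_frac, random_state=seed
    )
    # Everything not drawn for the test set stays in training.
    train_df = df.drop(test_df.index)
    return train_df, test_df


# Imbalanced toy data: 10 "rare" rows and 90 "common" rows.
df = pd.DataFrame({"feature": range(100), "label": ["rare"] * 10 + ["common"] * 90})
train_df, test_df = stratified_split(df, "label")
```

With this toy data the 1:9 class ratio holds in both subsets, which a purely random 80/20 split does not guarantee.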

Will my original dataset be modified?

No, the skill is designed to create new, separate files for each partition (e.g., train.csv, test.csv) to ensure your original source data remains untouched.
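The idea of leaving the source file untouched can be sketched as follows: read the source, split in memory, and write each partition to a new path. The helper name `write_partitions`, the `splits` output directory, and the temporary demo file are assumptions for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd


def write_partitions(source_csv, out_dir, test_frac=0.2, seed=42):
    """Write train.csv and test.csv into out_dir; the source file is only read.

    Hypothetical helper for illustration only.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    df = pd.read_csv(source_csv)  # opened for reading, never written back
    test_df = df.sample(frac=test_frac, random_state=seed)
    train_df = df.drop(test_df.index)

    train_path = out_dir / "train.csv"
    test_path = out_dir / "test.csv"
    train_df.to_csv(train_path, index=False)
    test_df.to_csv(test_path, index=False)
    return train_path, test_path


# Demo in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "data.csv"
    pd.DataFrame({"x": range(10), "y": [0, 1] * 5}).to_csv(source, index=False)
    before = source.read_bytes()

    train_path, test_path = write_partitions(source, Path(tmp) / "splits")

    source_unchanged = source.read_bytes() == before
    n_train = len(pd.read_csv(train_path))
    n_test = len(pd.read_csv(test_path))
```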

What file formats does this skill support?

The skill primarily generates Python code using widely used libraries such as pandas, which makes it well suited to CSV, Excel, and other common tabular data formats.
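A loader that picks the pandas reader from the file extension might look like the sketch below. The `load_table` name and the list of handled extensions are assumptions, not a statement of what the skill supports; note that `pd.read_excel` also needs an Excel engine such as openpyxl to be installed.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_table(path):
    """Load a tabular file into a DataFrame based on its extension.

    Hypothetical helper for illustration only.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix in {".tsv", ".tab"}:
        return pd.read_csv(path, sep="\t")
    if suffix in {".xlsx", ".xls"}:
        # Requires an Excel engine (for example openpyxl) to be installed.
        return pd.read_excel(path)
    raise ValueError(f"unsupported file type: {suffix or '(none)'}")


# Demo with a small CSV written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "sample.csv"
    csv_path.write_text("id,label\n1,a\n2,b\n")
    loaded = load_table(csv_path)
```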

Dataset Splitter for ML

Name: Dataset Splitter for ML
Author: BbgnsurfTech


Data Science & ML

Automates the partitioning of datasets into training, validation, and testing sets to streamline machine learning workflows.

This skill simplifies the data preparation phase of machine learning by automatically generating and executing Python code to split datasets according to user-defined ratios. Whether you need a standard 80/20 train-test split or a more complex three-way partition including a validation set, this skill ensures data integrity through randomization and optional stratification. It is an essential utility for data scientists and developers who need to evaluate model performance with robust, properly segmented data files without writing repetitive boilerplate code.

Key Features

01 Automatic creation of separate CSV files for subsets

02 Automated Python code generation for data partitioning

03 3 GitHub stars

04 Best-practice implementation including data integrity checks

05 Randomized splitting to prevent evaluation bias

06 Support for custom split ratios (e.g., 70/15/15)
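The listing does not say which integrity checks are performed. As one plausible example, the sketch below verifies that the partitions do not overlap and that together they cover every row of the original data. The `check_partitions` helper is hypothetical and assumes the original DataFrame has a unique index that the partitions preserve.

```python
import pandas as pd


def check_partitions(original, partitions):
    """Raise ValueError if partitions overlap or miss rows of the original.

    Hypothetical helper; assumes original has a unique index that the
    partitions keep.
    """
    combined = pd.concat(partitions)

    # The same index label in two partitions means a row leaked across subsets.
    if combined.index.has_duplicates:
        raise ValueError("a row appears in more than one partition")

    # Together the partitions should contain exactly the original rows.
    if not combined.index.sort_values().equals(original.index.sort_values()):
        raise ValueError("partitions do not cover the original dataset")

    return True


df = pd.DataFrame({"feature": range(20)})
test_df = df.sample(frac=0.25, random_state=0)
train_df = df.drop(test_df.index)
ok = check_partitions(df, [train_df, test_df])
```

A leak between training and test rows inflates evaluation scores, so a check of this kind is cheap insurance after any split.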

Use Cases

01 Partitioning large-scale data for robust model evaluation

02 Preparing a raw CSV file for a supervised learning model

03 Creating a validation set to tune hyperparameters during training


Install with 🐟 Skill.Fish

npx skillfish add bbgnsurftech/claude-skills-collection skill-adapter

For use in Claude.ai and ChatGPT
