About
This skill streamlines the data preparation phase of machine learning workflows by automatically dividing raw datasets into separate subsets for training, validation, and testing. It randomizes the data before partitioning and can optionally stratify the split, so that each subset of an imbalanced dataset preserves the class proportions of the full dataset. The skill generates and executes Python code built on industry-standard libraries, which reduces manual data handling errors and shortens the path from data collection to model training within the Claude Code environment.
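As an illustration of the kind of code such a split involves, here is a minimal sketch of a stratified train/validation/test split. It assumes scikit-learn and NumPy are the libraries in use, which the description above does not confirm. The function name `split_dataset`, its parameters, and the 70/15/15 default proportions are hypothetical, chosen only for this example.

```python
import numpy as np
from sklearn.model_selection import train_test_split


def split_dataset(X, y, val_fraction=0.15, test_fraction=0.15,
                  stratify=True, seed=42):
    """Split X and y into train, validation, and test subsets.

    Hypothetical helper for illustration; not the skill's actual code.
    """
    if val_fraction + test_fraction >= 1.0:
        raise ValueError("val_fraction + test_fraction must be below 1.0")

    n_samples = len(y)
    # Use absolute sample counts so both splits have predictable sizes.
    n_test = int(round(test_fraction * n_samples))
    n_val = int(round(val_fraction * n_samples))

    # Step 1: shuffle and carve off the test set.
    X_rest, X_test, y_rest, y_test = train_test_split(
        X, y,
        test_size=n_test,
        stratify=y if stratify else None,
        random_state=seed,
    )

    # Step 2: split the remainder into training and validation sets.
    X_train, X_val, y_train, y_val = train_test_split(
        X_rest, y_rest,
        test_size=n_val,
        stratify=y_rest if stratify else None,
        random_state=seed,
    )

    return (X_train, y_train), (X_val, y_val), (X_test, y_test)


# Example: an imbalanced binary dataset with 10% positive labels.
X_demo = np.arange(1000).reshape(-1, 1)
y_demo = np.array([1] * 100 + [0] * 900)
train, val, test = split_dataset(X_demo, y_demo)
print(len(train[1]), len(val[1]), len(test[1]))
```

Splitting in two steps keeps the test set fixed before the validation set is drawn, and passing a fixed `random_state` makes the partition reproducible across runs. With `stratify=False` the same function performs a plain random split.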