概要
This skill empowers Claude to architect and implement scalable data pipelines using Ray Data, the industry standard for distributed machine learning data processing. It provides specialized guidance for loading multi-modal data, performing high-performance vectorized transformations, and integrating seamlessly with training frameworks like PyTorch and TensorFlow. By utilizing streaming execution and GPU-accelerated transforms, it allows developers to handle datasets exceeding local memory capacity while scaling from a single laptop to hundreds of cluster nodes for batch inference and ETL tasks.