Introduction
This skill provides guidance for diagnosing and resolving performance bottlenecks in Apache Spark applications. It covers production-ready patterns for memory management, join optimization (including broadcast joins and salted joins), data skew mitigation, and storage format tuning. Whether the problem is OutOfMemory (OOM) errors, slow shuffles, or pipelines that must scale to very large datasets, this skill equips Claude with the technical patterns needed to build robust, high-performance distributed data processing jobs.