About
The Spark Basics skill equips Claude with essential patterns for distributed data processing with PySpark. It provides code snippets for session management, data ingestion from formats such as Parquet and Delta Lake, complex transformations, and efficient data writing. The skill is aimed at data engineers and data scientists who build ETL pipelines and want to follow performance best practices such as broadcast joins, predicate pushdown, and caching of reused data.