关于
This skill provides expert guidance for working with Apache Parquet, the industry-standard columnar storage format for big data. It empowers developers and data engineers to implement efficient storage patterns using Python, Pandas, and PyArrow. The skill covers essential performance optimizations including row group management, predicate pushdown for faster queries, and complex schema evolution strategies. Whether you are building a high-performance data lake or managing analytical pipelines, this skill ensures your data is stored correctly, compressed efficiently, and accessible with minimal overhead.