Builds scalable, production-grade data pipelines and infrastructure using modern tools like Spark, Airflow, and dbt.
This skill empowers Claude with the specialized knowledge required to architect, implement, and optimize complex data systems. It covers the full lifecycle of data engineering, from designing robust ETL/ELT workflows and real-time streaming architectures to implementing advanced dimensional modeling and DataOps practices. Whether you are managing massive datasets with Spark, orchestrating workflows with Airflow, or ensuring data integrity with automated quality frameworks, this skill provides the best practices and implementation patterns necessary for reliable and high-performance data infrastructure.
Key Features
1. Design and implementation of batch and incremental ETL/ELT pipelines (see the Airflow sketch after this list).
2. Real-time event streaming architectures using Kafka and Spark Streaming (see the streaming sketch below).
3. Automated data quality validation and monitoring frameworks (see the quality-gate sketch below).
4. Performance tuning for SQL queries and large-scale distributed processing jobs (see the broadcast-join sketch below).
5. Advanced data modeling, including Star Schema, Snowflake, and Data Vault.
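For feature 1, a minimal sketch of an incremental ELT pipeline using Airflow's TaskFlow API. The DAG name, table semantics, and upsert logic are illustrative assumptions, not a fixed output of this skill:

```python
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def incremental_orders_elt():
    @task
    def extract(data_interval_start=None, data_interval_end=None):
        # Airflow injects the scheduled window; pull only rows inside it
        # so each run processes one incremental, repeatable slice.
        return {"since": str(data_interval_start), "until": str(data_interval_end)}

    @task
    def load(window: dict):
        # Upsert the slice keyed on its natural key so reruns stay idempotent.
        print(f"loading orders for {window['since']} .. {window['until']}")

    load(extract())


incremental_orders_elt()
```

Keying the load on the schedule's data interval is what makes the pipeline safely re-runnable: backfilling a day simply replays that day's slice.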
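For feature 2, a minimal sketch of a Kafka-to-Spark Structured Streaming job. The broker address, topic, and checkpoint path are placeholders, and the job assumes the spark-sql-kafka connector package is on the classpath:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Kafka rows arrive as key/value bytes plus metadata (topic, offset, timestamp).
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "events")                     # placeholder topic
    .load()
    .select(
        F.col("value").cast("string").alias("payload"),
        F.col("timestamp").alias("ts"),
    )
)

# Tumbling one-minute counts; the watermark bounds how late data may arrive.
counts = (
    events.withWatermark("ts", "10 minutes")
    .groupBy(F.window("ts", "1 minute"))
    .count()
)

query = (
    counts.writeStream.outputMode("update")
    .format("console")
    .option("checkpointLocation", "/tmp/checkpoints/events")  # placeholder path
    .start()
)
query.awaitTermination()
```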
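For feature 3, a minimal sketch of an automated quality gate: declarative checks that fail a run before bad data reaches downstream consumers. The column names, thresholds, and pandas input are illustrative assumptions; a production framework would run comparable checks inside the warehouse:

```python
import pandas as pd


def run_quality_checks(df: pd.DataFrame) -> list[str]:
    failures = []
    if df["order_id"].isna().any():        # completeness: no null keys
        failures.append("order_id contains nulls")
    if df["order_id"].duplicated().any():  # uniqueness: primary key never repeats
        failures.append("order_id contains duplicates")
    if (df["amount"] < 0).any():           # validity: amounts are non-negative
        failures.append("amount contains negative values")
    # Freshness: the newest record should be less than a day old.
    if pd.Timestamp.now(tz="UTC") - df["created_at"].max() > pd.Timedelta(days=1):
        failures.append("data is staler than one day")
    return failures


orders = pd.DataFrame(
    {
        "order_id": [1, 2, 3],
        "amount": [10.0, 25.5, 7.25],
        "created_at": pd.to_datetime(["2024-01-01"] * 3, utc=True),
    }
)
problems = run_quality_checks(orders)
if problems:
    # In a pipeline, raising here blocks downstream tasks from running.
    raise ValueError("quality gate failed: " + "; ".join(problems))
```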
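For feature 4, one common Spark tuning pattern, sketched with assumed table paths: broadcasting a small dimension table so a fact/dimension join avoids a full shuffle:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("join-tuning").getOrCreate()

facts = spark.read.parquet("/data/fact_sales")  # large fact table (placeholder path)
dims = spark.read.parquet("/data/dim_product")  # small dimension (placeholder path)

# The hint ships dim_product to every executor, turning a sort-merge
# (shuffle) join into a map-side broadcast hash join.
enriched = facts.join(broadcast(dims), on="product_id", how="left")
enriched.explain()  # the physical plan should now show BroadcastHashJoin
```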
Use Cases
1. Architecting a modern data lakehouse for unified analytics and reporting.
2. Migrating legacy data processes to a modern stack with dbt and Snowflake.
3. Implementing a comprehensive DataOps strategy with automated testing and observability (see the CI-gate sketch after this list).
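For use cases 2 and 3, a minimal sketch of a DataOps-style CI gate around dbt: build the models, then fail the deploy if any dbt test fails. It assumes the dbt CLI is installed, the working directory is a dbt project, and `orders+` is a hypothetical model selector:

```python
import subprocess
import sys


def run(cmd: list[str]) -> None:
    print("$", " ".join(cmd))
    if subprocess.run(cmd).returncode != 0:
        sys.exit(1)  # stop the deploy at the first failure


# Build the orders model and everything downstream of it, then test them.
run(["dbt", "run", "--select", "orders+"])
run(["dbt", "test", "--select", "orders+"])
```

Gating deploys on `dbt test` is what turns dbt's schema and data tests from documentation into an enforced contract.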