Builds and optimizes scalable data pipelines, modern data warehouses, and real-time streaming architectures using industry-standard tools.
This skill transforms Claude into a senior data engineer capable of designing, implementing, and optimizing complex data ecosystems. It covers everything from batch and streaming pipelines to data lakehouse architectures and cloud-native platforms like Snowflake, Databricks, and BigQuery. Use it to implement robust ETL/ELT workflows with dbt and Airflow, ensure data quality with Great Expectations, and manage large-scale data infrastructure using best practices in governance, security, and performance tuning.
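As a concrete sketch of the orchestration side, here is a minimal Airflow DAG using the TaskFlow API (assuming Airflow 2.4+; the DAG id, task bodies, and sample payload are illustrative placeholders, not part of the skill itself):

```python
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def nightly_orders_etl():
    """Illustrative nightly ETL run: extract raw rows, transform, load."""

    @task
    def extract() -> list[dict]:
        # Placeholder: in practice, pull from a source API or database.
        return [{"order_id": "a1", "amount": "10.50"}]

    @task
    def transform(rows: list[dict]) -> list[dict]:
        # Cast amounts to float; heavier transformation logic would
        # typically live in dbt models rather than Python tasks.
        return [{**r, "amount": float(r["amount"])} for r in rows]

    @task
    def load(rows: list[dict]) -> None:
        # Placeholder: in practice, write to the warehouse (e.g. Snowflake).
        print(f"loaded {len(rows)} rows")

    load(transform(extract()))


nightly_orders_etl()
```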
Key Features
01. Modern data warehouse and lakehouse implementation on AWS, Azure, and GCP
02. Workflow orchestration and automation using Airflow, Dagster, and Prefect
03. Advanced data modeling including dimensional, Data Vault, and One Big Table (OBT) patterns
04. Integrated data quality frameworks, lineage tracking, and governance
05. End-to-end batch and streaming pipeline design using Spark, Flink, and Kafka (see the streaming sketch after this list)
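To make the streaming item concrete, here is a minimal PySpark Structured Streaming job that reads from Kafka and lands Parquet files. It assumes the spark-sql-kafka connector is on the classpath; the broker address, topic name, schema, and output paths are all illustrative assumptions:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import DoubleType, StringType, StructField, StructType

# Requires the spark-sql-kafka-0-10 connector package on the Spark classpath.
spark = SparkSession.builder.appName("orders-stream").getOrCreate()

# Assumed event schema for the illustrative "orders" topic.
schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount", DoubleType()),
])

# Read raw Kafka records; broker address and topic are placeholders.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "orders")
    .load()
)

# Parse the JSON payload out of Kafka's binary value column.
orders = (
    events.select(from_json(col("value").cast("string"), schema).alias("o"))
    .select("o.*")
)

# Land the stream as Parquet; the paths stand in for a real lake location.
query = (
    orders.writeStream.format("parquet")
    .option("path", "/tmp/orders")
    .option("checkpointLocation", "/tmp/orders_ckpt")
    .start()
)
query.awaitTermination()  # blocks; in production this runs as a long-lived job
```

The checkpoint location is what gives the stream exactly-once file output across restarts, which is why it is set even in a throwaway sketch.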
Use Cases
01. Building a scalable dbt transformation layer for Snowflake or BigQuery environments
02. Implementing automated data quality monitoring to prevent production pipeline failures (see the quality-gate sketch after this list)
03. Designing a real-time change data capture (CDC) pipeline to sync production databases with a cloud data warehouse
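For the data quality use case, a minimal quality gate using the classic Great Expectations Pandas API (pre-1.0 releases; the column names, sample rows, and thresholds are assumptions for illustration):

```python
import great_expectations as ge
import pandas as pd

# Hypothetical batch of rows pulled from a staging table before loading.
df = pd.DataFrame({
    "order_id": ["a1", "a2", "a3"],
    "amount": [10.0, 25.5, 3.2],
})

# Wrap the frame so expectation methods become available (classic GE API).
batch = ge.from_pandas(df)

# Quality gates: null check, uniqueness, and a sanity range on amounts.
batch.expect_column_values_to_not_be_null("order_id")
batch.expect_column_values_to_be_unique("order_id")
batch.expect_column_values_to_be_between("amount", min_value=0)

# Fail fast: halt the pipeline run before bad data reaches the warehouse.
result = batch.validate()
if not result.success:
    raise ValueError("Data quality checks failed; aborting the load step")
```

Wired in as a task between the transform and load steps of a DAG like the one sketched earlier, a failure here stops the run instead of letting bad rows propagate downstream.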