About
This skill streamlines the development of PySpark transformation pipelines within medallion architecture data lakes. By querying DuckDB warehouses and parsing data dictionaries before generating code, it ensures that the code adheres to the exact column names, data types, and business rules in effect. It bridges the gap between raw data sources and analytical layers by identifying mapping conventions, applying deduplication strategies, and implementing standard ETL patterns, ultimately reducing schema-related errors and manual debugging in data engineering workflows. A sketch of the overall pattern follows.
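The snippet below is a minimal, illustrative sketch of the workflow described above, not part of the skill itself: it reads the authoritative schema from a DuckDB warehouse, then builds a PySpark bronze-to-silver transform that deduplicates on a business key and conforms to that schema. The warehouse path, table names, and key columns (`order_id`, `updated_at`) are hypothetical placeholders.

```python
import duckdb
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

# 1. Query the DuckDB warehouse for the exact column names and ordering of the
#    target table so the generated transform matches the real schema.
#    "warehouse.duckdb", the 'silver' schema, and the 'orders' table are
#    assumptions for illustration only.
con = duckdb.connect("warehouse.duckdb", read_only=True)
schema_rows = con.execute(
    """
    SELECT column_name, data_type
    FROM information_schema.columns
    WHERE table_schema = 'silver' AND table_name = 'orders'
    ORDER BY ordinal_position
    """
).fetchall()
target_columns = [name for name, _ in schema_rows]

# 2. Build the bronze-to-silver transform in PySpark: keep only the columns
#    present in the target schema and deduplicate on the business key by
#    retaining the most recent record per key.
spark = SparkSession.builder.appName("bronze_to_silver_orders").getOrCreate()
bronze = spark.read.parquet("lake/bronze/orders")  # hypothetical bronze path

dedup_window = Window.partitionBy("order_id").orderBy(F.col("updated_at").desc())
silver = (
    bronze
    .withColumn("_rn", F.row_number().over(dedup_window))
    .filter(F.col("_rn") == 1)   # keep the latest version of each order
    .drop("_rn")
    .select(*target_columns)     # enforce the warehouse column set and order
)

silver.write.mode("overwrite").parquet("lake/silver/orders")
```

In practice the skill derives the key columns, ordering column, and target layer from the data dictionary rather than hard-coding them as done here.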