Master materializations, DAGs, DRY principles, sources, packages, and Python models — the largest domain on the exam.
Domain 1 is the foundation of the exam. It tests your ability to build well-structured, performant dbt projects from scratch — converting business logic into maintainable SQL using dbt's core primitives.
Table, view, incremental, ephemeral — when and why to use each
Building clean DAGs and declaring raw dependencies
Macros, packages, and modular SQL for reusable code
dbt_project.yml, sources YAML, grants, Python models
The most common Domain 1 question type involves identifying where ref() should replace hardcoded table names. dbt infers DAG order only from ref() and source() calls; hardcoded table names are invisible to it. Replace every FROM and JOIN clause that references another dbt model with {{ ref('model_name') }}.
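A minimal before/after sketch of that replacement. The model names stg_orders and stg_customers and the analytics schema are hypothetical, chosen only for illustration.

```sql
-- Before: hardcoded relation names, so dbt sees no dependency on either model.
-- select o.order_id, c.customer_name
-- from analytics.stg_orders as o
-- join analytics.stg_customers as c on o.customer_id = c.customer_id

-- After: each ref() resolves to the relation name and adds an edge to the DAG.
select
    o.order_id,
    c.customer_name
from {{ ref('stg_orders') }} as o
join {{ ref('stg_customers') }} as c
    on o.customer_id = c.customer_id
```

With the second form, dbt builds stg_orders and stg_customers before this model.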
Incremental models use the is_incremental() macro; best for large, append-only datasets; need unique_key for upserts
{{ ref('model_name') }}: references another dbt model; dbt infers run order from these; use in every FROM/JOIN that references a dbt model
{{ source('source_name', 'table_name') }}: references a raw table declared in a sources.yml; enables source freshness checks
Without ref(), dbt does not know the dependency and models may build in the wrong order
Always use ref(), not hardcoded table names.
Macros are defined in macros/; called with {{ macro_name() }}
Packages are declared in packages.yml; installed with dbt deps
dbt_project.yml is the project config file: it defines project name, model paths, and default/folder-level configurations
grants config gives specific roles SELECT access on materialized models; defined in dbt_project.yml or inline
Seeds are CSV files in the seeds/ folder; loaded with dbt seed; referenced in models with ref('seed_name')
Python models live in models/ with a .py extension; must define a model(dbt, session) function that returns a DataFrame
Python models reference upstream nodes with dbt.ref() and dbt.source() inside the function
git pull = git fetch + git merge; use it to sync your branch with the head/main branch
To sync a branch, git pull (not just git fetch) is the correct answer: it fetches AND merges.
Check each item as you complete it. Track your readiness before the exam.
Know when to use view vs table vs incremental vs ephemeral
Build a simple 3-model DAG: staging → intermediate → mart
Understand what a source maps to (database + schema + table)
Implement a unique_key for upsert behavior
Know how folder-level configs cascade to child models
Edit packages.yml, run dbt deps, call {{ dbt_utils.generate_surrogate_key() }}
Practice Jinja syntax: {% macro %}, {{ }}, {% if %}
Understand node selection syntax: model+, +model, @model
Run dbt seed and verify it appears in the warehouse
Understand the grants config key and how it maps to GRANT SELECT
Python models: table or incremental materialization only, platform-specific, use dbt.ref() not ref()
git pull = fetch + merge; used to sync feature branch with main
Find compiled code in target/compiled/ directory
Read: "How we structure our dbt projects" — dbt Labs blog
Key syntax patterns for Domain 1 — study these cold.
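As one illustration of these patterns, here is a hedged sketch that defines a macro, calls it from a model, and uses a Jinja if block. The macro name cents_to_dollars, the model stg_payments, and the column names are hypothetical; target.name is dbt's built-in target variable.

```sql
-- macros/cents_to_dollars.sql (hypothetical macro file)
{% macro cents_to_dollars(column_name, scale=2) %}
    round({{ column_name }} / 100.0, {{ scale }})
{% endmacro %}

-- models/fct_payments.sql (hypothetical model that calls the macro)
select
    payment_id,
    {{ cents_to_dollars('amount_cents') }} as amount_usd
from {{ ref('stg_payments') }}
{% if target.name == 'dev' %}
-- Keep development builds small; this branch is skipped for other targets.
limit 100
{% endif %}
```

The three delimiters do different jobs: {% ... %} runs statements such as macro and if, {{ ... }} prints an expression into the SQL, and the macro body is substituted wherever it is called.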
Suggested 2-week approach — adjust based on your experience level.
These are the traps candidates fall into most often. Study the fix, not just the mistake.
The silent DAG breaker
Models build in the wrong order. dbt can't infer dependencies from hardcoded names — the downstream model may try to query a table that hasn't been built yet.
Replace every FROM schema.table and JOIN schema.table that references a dbt model with {{ ref('model_name') }}.
Misunderstanding incremental trade-offs
If a large share of rows changes on each run, an incremental model still has to scan and rewrite most of the table, which can make it slower than a full rebuild.
Use incremental only when rows are mostly appended (not updated) and the dataset is large. For heavily-updated tables, use table materialization.
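For the append-mostly case, an incremental model might look like the sketch below. The model stg_events and its columns are hypothetical; config(), is_incremental(), and {{ this }} are standard dbt constructs.

```sql
{{ config(
    materialized='incremental',
    unique_key='event_id'
) }}

select
    event_id,
    user_id,
    event_type,
    occurred_at
from {{ ref('stg_events') }}

{% if is_incremental() %}
-- Incremental runs only: pick up rows newer than what the target already holds.
where occurred_at > (select max(occurred_at) from {{ this }})
{% endif %}
```

On the first run, or with --full-refresh, the if block is skipped and the whole table is built. On later runs, rows whose event_id already exists are updated rather than duplicated.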
Using ref() for raw tables
Using ref() for a raw source table that isn't a dbt model will fail at compile time. source() exists specifically for raw/external tables.
Raw tables → declare in sources.yml and reference with {{ source('name','table') }}. dbt models → reference with {{ ref('model') }}.
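A short sketch of the split between the two functions, assuming a source named shop with a table orders has been declared in a sources YAML file. Both names and the column list are hypothetical.

```sql
-- models/staging/stg_orders.sql: reads the raw table, so it uses source().
select
    id as order_id,
    customer_id,
    status
from {{ source('shop', 'orders') }}

-- models/marts/fct_orders.sql: reads a dbt model, so it uses ref().
-- select order_id, customer_id, status
-- from {{ ref('stg_orders') }}
```

A common convention is that only staging models call source(), and everything downstream uses ref().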
When ephemeral becomes a performance problem
Ephemeral models are injected as CTEs — if many downstream models reference them, the same CTE is duplicated in every compiled query, causing slow compile times and large SQL.
Use ephemeral for light intermediate logic with only 1–2 consumers. For logic used by many models, use view or table instead.
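A minimal sketch of an ephemeral model with a single consumer. The model names int_orders_cleaned, stg_orders, and fct_orders are hypothetical.

```sql
-- models/intermediate/int_orders_cleaned.sql
{{ config(materialized='ephemeral') }}

select
    order_id,
    customer_id,
    lower(status) as status
from {{ ref('stg_orders') }}

-- models/marts/fct_orders.sql (the one consumer)
-- select order_id, customer_id, status
-- from {{ ref('int_orders_cleaned') }}
-- where status <> 'cancelled'
```

Nothing is created in the warehouse for int_orders_cleaned. Its query is inlined as a CTE in the compiled SQL of fct_orders, typically under a name along the lines of __dbt__cte__int_orders_cleaned, and the same inlining happens in every additional consumer.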
Package macros aren't available until installed
Adding a package to packages.yml doesn't install it. Running dbt run before dbt deps will fail because the package's macros cannot be found.
After editing packages.yml, always run dbt deps first to install the packages into the dbt_packages/ directory.
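Once dbt deps has installed dbt_utils, its macros can be called with the package prefix. The sketch below assumes dbt_utils is listed in packages.yml; the model stg_order_lines and its columns are hypothetical.

```sql
select
    -- Hashes the listed columns into a single surrogate key.
    {{ dbt_utils.generate_surrogate_key(['order_id', 'line_number']) }} as order_line_key,
    order_id,
    line_number,
    quantity
from {{ ref('stg_order_lines') }}
```

The macro takes a list of column names, so a composite key is passed as one list argument.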
dbt run executes only models. dbt build runs models, seeds, snapshots, AND tests together in DAG order. It's the recommended command for production jobs because it tests each node immediately after building it, catching issues early.
On the first run (or with --full-refresh), dbt builds the full table. The is_incremental() macro returns false on the first run, so the WHERE clause is not applied. Subsequent runs use incremental logic.
Use the schema config key. dbt appends the custom schema to the target schema by default (e.g., analytics_marketing), unless you override the generate_schema_name macro.
+model selects the model and all its parents (upstream). model+ selects the model and all its children (downstream). @model selects the model, all its children, AND all the parents of those children. This is useful for CI to ensure all dependencies of a changed model's downstream consumers are also built.
Use dbt.ref('model_name') inside the model function. The dependency is tracked in the DAG just like SQL model references.
The grants config automatically runs GRANT statements after a model is materialized, giving specified roles or users SELECT access. This removes the need to manually manage permissions and ensures access is consistent across rebuilds. It's set in dbt_project.yml or inline in a model config block.
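The schema and grants answers above can be combined in a single inline config block, sketched here. The role names reporter and bi_user, the model stg_ad_spend, and the columns are hypothetical.

```sql
{{ config(
    materialized='table',
    schema='marketing',
    grants={'select': ['reporter', 'bi_user']}
) }}

select
    campaign_id,
    sum(spend) as total_spend
from {{ ref('stg_ad_spend') }}
group by campaign_id
```

With a target schema of analytics and the default generate_schema_name macro, this model would land in analytics_marketing, and the listed roles would be granted SELECT each time it is rebuilt.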