FlashGenius Logo FlashGenius
dbt Certification · Domain 3 of 6

dbt Analytics Engineering Certification — Debugging & Pipeline Management

Error logs, compiled SQL, YAML issues, DAG failure troubleshooting, and dbt clone — the skills that separate experienced analytics engineers.

~20%Combined Weight
Domains 3+4Combined
High PracticalSkill Focus
Scenario-BasedQuestion Style

What This Domain Covers

This combined domain covers how analytics engineers diagnose and fix problems — both in development (Domains 3) and in running data pipelines (Domain 4). Exam questions are scenario-based: "you see this error, what do you do?"

📋

Error Logs

Reading and interpreting dbt log output and error messages

🔍

Compiled SQL

Using target/compiled to isolate whether issues are dbt or SQL

⚙️

YAML Errors

Diagnosing .yml compilation errors and schema mismatches

🔄

Pipeline Mgmt

DAG failures, dbt clone, and integrated tool troubleshooting

🎯 Key Exam Insight

The exam distinguishes between a pure SQL error (wrong query logic) and a dbt error that presents as SQL (e.g., wrong ref() causing a missing table). The diagnostic step is always: compile the model and run the compiled SQL directly in your warehouse to see if the SQL itself works.

Key Concepts Deep Dive

📋
Reading Error Logs
Locating the root cause in dbt output
High Frequency
  • dbt logs are in logs/dbt.log and printed to the terminal during runs
  • Errors always include: the model that failed, the error type, and the database error message
  • Look for "Compilation Error" vs "Database Error" — different root causes, different fixes
  • Compilation Error — problem in your dbt YAML or Jinja; occurs before SQL runs
  • Database Error — the SQL was compiled fine but failed in the warehouse
  • dbt run exits with a non-zero code on any error; use --warn-error to treat warnings as errors
💡 First step in any debug: identify if it's a Compilation Error (dbt/Jinja issue) or a Database Error (SQL issue). The fix is completely different for each.
🔍
Using Compiled SQL to Debug
Isolating dbt issues from SQL issues
High Frequency
  • Run dbt compile to generate compiled SQL without executing it
  • Compiled SQL lives in target/compiled/[project]/models/[path]/model.sql
  • Copy the compiled SQL and run it directly in your warehouse query editor
  • If the SQL runs fine in the warehouse → it's a dbt execution or environment issue
  • If the SQL fails in the warehouse → it's a SQL logic problem
  • Compiled SQL also reveals what ref() and source() resolved to — useful for catching wrong table references
💡 The dbt Cloud IDE has a "Compile" button that shows compiled SQL instantly — use this to rapidly debug during development.
⚙️
YAML Compilation Errors
Schema file issues and .yml debugging
Medium Frequency
  • YAML errors occur before any SQL runs — dbt can't parse the project
  • Common causes: indentation errors, wrong key names, missing required fields, duplicate model names
  • Error message will indicate the file and often the line number
  • YAML is indentation-sensitive — use a YAML linter to catch spacing issues
  • Missing version: 2 at the top of a schema file causes parsing failures
  • Test for YAML errors with dbt parse — faster than running the full project
💡 dbt parse is the fastest way to catch YAML/schema errors. Run it before dbt run when you've made schema file changes.
🔄
DAG Failure Troubleshooting
Managing failures in production pipelines
High Frequency
  • When a model fails in the DAG, all downstream models are skipped (not failed)
  • dbt retry — re-runs only the failed nodes and their downstream dependencies from the last run
  • Use --select result:error+ to manually select all errored models and their children
  • Skipped models show as "WARN" in output, not "ERROR" — the error is in the upstream model
  • Fix the root cause model first, then retry — don't just retry skipped models
💡 dbt retry uses the run_results.json from the last run. It's smarter than re-running everything — it knows exactly which nodes failed and their downstream impact.
📋
dbt clone
Zero-copy cloning for CI and development
Medium Frequency
  • dbt clone creates zero-copy clones of production models in a dev/CI schema without re-running transformations
  • Platform support: Snowflake, BigQuery, Databricks (zero-copy); others use CREATE TABLE AS SELECT
  • Common use case: in CI, clone production tables to the PR schema so downstream tests have real data to run against
  • Syntax: dbt clone --select [models] with appropriate target configuration
  • Clones are pointers to production data (on supported platforms) — changes to production are reflected; changes to the clone don't affect production
💡 dbt clone solves the "my CI environment has no data" problem. Clone production into the CI schema, then run only the changed models on top of it.

Common Error Types & Fixes

Recognizing error patterns is a core exam skill.

🔴 Compilation Error: "Model 'X' (models/X.sql) depends on a node named 'Y' which was not found"

Cause: ref('Y') references a model that doesn't exist in the project, or the model name is misspelled.

Fix: Check the ref() call for typos. Verify the model file exists. Check that the model isn't excluded by a +enabled: false config.

🔴 Database Error: "Object 'SCHEMA.TABLE' does not exist"

Cause: A hardcoded table reference resolves to the wrong environment (dev vs prod), or the model hasn't been built yet.

Fix: Replace hardcoded references with ref(). Verify you're targeting the correct environment. Check if the referenced model was skipped due to an upstream failure.

🔴 YAML Error: "Error reading models/schema.yml: Syntax error..."

Cause: Indentation error, wrong key name, or invalid YAML syntax in a schema file.

Fix: Run dbt parse to get the line number. Check indentation (use spaces, not tabs). Validate the YAML with an online linter.

🔴 Incremental Error: "Column 'X' does not exist in target relation"

Cause: The incremental model added a new column, but the existing table in the warehouse doesn't have it.

Fix: Run dbt run --full-refresh for the model to rebuild the table with the new schema. Or use on_schema_change: 'append_new_columns' config.

🔴 Test Failure: "Got N results, configured to fail if != 0"

Cause: A data test (e.g., not_null, unique) found violations in the model's output.

Fix: This is data quality issue, not a dbt bug. Investigate the source data or model logic. Use store_failures to create a table of failing rows for analysis.

🔴 CI Build Failure: "Relation not found" in PR environment

Cause: The CI environment doesn't have the production models that the changed model depends on.

Fix: Use dbt clone to copy production models into the CI schema before running tests on the changed models. This is the "slim CI" pattern.

Quick Reference

Debug Workflow

# 1. Get error details dbt run --select failing_model # 2. Check compiled SQL dbt compile --select failing_model # → target/compiled/.../model.sql # 3. Run compiled SQL in warehouse # If it fails → SQL issue # If it passes → dbt issue # 4. After fix, test before merge dbt build --select failing_model

dbt retry & State

# Retry failed nodes from last run dbt retry # Manually select errored + downstream dbt run --select result:error+ # Select skipped models dbt run --select result:skipped # Check run results # → target/run_results.json

dbt clone (Slim CI)

# Clone prod models to CI schema dbt clone \ --select state:modified+ \ --state ./prod-artifacts \ --target ci # Then run only changed models dbt build \ --select state:modified+ \ --state ./prod-artifacts

Incremental Schema Changes

{{ config( materialized='incremental', on_schema_change='append_new_columns' -- options: ignore, fail, -- append_new_columns, -- sync_all_columns ) }} # Full rebuild: dbt run --full-refresh \ --select model_name

Domain 3 Practice Quiz

Scenario-based questions — identify the error and pick the right fix.

Domain 3 Study Plan

Debugging is a hands-on skill — practice breaking and fixing things.

Days 1–2 · Learn the Debug Workflow

  • Run dbt compile on a model and find the compiled file in target/compiled/
  • Intentionally introduce a bug (wrong column name) and use compiled SQL to diagnose
  • Practice distinguishing "Compilation Error" from "Database Error" in log output

Days 3–4 · YAML & DAG Failures

  • Introduce an intentional YAML indentation error and run dbt parse to find it
  • Simulate a DAG failure: break one model, observe which downstream models are skipped
  • Fix the root cause and use dbt retry — observe that it only re-runs the failed model and its children

Days 5–6 · dbt clone & CI patterns

  • Read the dbt clone documentation and understand zero-copy cloning on your platform
  • Understand the slim CI pattern: clone prod → build only changed models → run tests
  • Do 15–20 practice questions on debugging scenarios

Common Exam Mistakes — Domain 3

1

Treating skipped models as errors

Skipped ≠ failed

What Goes Wrong

When a model fails, downstream models are skipped (not failed). Candidates try to debug the skipped models instead of the actual failed upstream model.

The Fix

Find the model marked as ERROR in the output (not SKIP/WARN). Fix that model first. The skipped models will resolve automatically on the next run.

🛡️ In dbt output: ERROR = root cause. SKIP = victim. Always fix the ERROR first.
2

Not using compiled SQL to isolate bugs

Guessing at SQL logic without inspecting the compiled output

What Goes Wrong

Developers modify the model.sql repeatedly without checking what the compiled SQL actually looks like, missing that the issue is a Jinja/ref() resolution problem, not their SQL logic.

The Fix

Always run dbt compile first and inspect target/compiled/. The compiled SQL shows you exactly what dbt will execute — no guessing.

🛡️ dbt compile is free (no warehouse cost). Use it before every debug session.
3

Running the full project to test a single fix

Using dbt run without --select

What Goes Wrong

Running dbt run or dbt build without a --select flag rebuilds the entire project, wasting time and compute when you only changed one model.

The Fix

Use dbt build --select model_name to build and test just the changed model. Add + modifiers if you need parents or children: +model_name+.

🛡️ During development, always use --select. Save full runs for CI/CD pipelines.
4

Confusing dbt clone with dbt run

Clone doesn't execute transformations

What Goes Wrong

Candidates think dbt clone builds/refreshes models. It doesn't — it creates a reference (or copy) of an existing materialized object without running the model SQL.

The Fix

dbt clone copies existing materialized tables/views to a new schema. Use it to give your CI environment access to production data without re-running all transformations.

🛡️ Clone = copy existing artifact. Run = execute SQL and create/replace artifact.
5

Overlooking on_schema_change for incremental models

Schema changes break incrementals silently

What Goes Wrong

Adding a new column to an incremental model causes a schema mismatch error because the existing table doesn't have the new column. The default behavior is 'ignore' — new columns are dropped.

The Fix

Set on_schema_change: 'append_new_columns' to add new columns to the existing table, or run --full-refresh to rebuild the table with the updated schema.

🛡️ When adding columns to an incremental model, always plan for on_schema_change behavior.

Frequently Asked Questions

Where does dbt write compiled SQL? +
Compiled SQL is written to target/compiled/[project_name]/models/[path]/model.sql. The directory structure mirrors your models/ folder. Run dbt compile to generate all compiled files without executing them against the warehouse.
What is the difference between dbt retry and re-running with --select result:error+? +
dbt retry is a convenience wrapper — it reads the last run's results from target/run_results.json and re-runs failed + skipped nodes automatically. --select result:error+ is the manual equivalent, giving you more control over which results to act on. Both achieve the same result but retry is faster for the common case.
How do I determine if an error is a dbt issue vs a SQL issue? +
Run dbt compile --select model_name to generate the compiled SQL. Then copy the compiled SQL from target/compiled/ and run it directly in your warehouse query editor. If it runs successfully → the problem is in how dbt executes or connects. If it fails → the SQL logic is incorrect and needs to be fixed in your model.
When should I use dbt clone in a CI workflow? +
Use dbt clone when your CI environment doesn't have the production data that your changed models depend on. The pattern: (1) clone the production models that are parents of your changed models into the CI schema, (2) run only the changed models with --select state:modified+. This ensures tests run on real-ish data without rebuilding the entire warehouse.
What does the --full-refresh flag do to an incremental model? +
--full-refresh causes dbt to drop and recreate the incremental model as if it were a table — ignoring the is_incremental() filter and building from all source data. Use it when you've made schema changes to the model or when you need to backfill historical data.
What is a one-off job in dbt Cloud? +
A one-off job (or ad-hoc run) in dbt Cloud is a manual job execution triggered outside of a regular schedule. You can trigger it from the dbt Cloud UI or via API. It's useful for running a specific model after a data fix or for testing a new model in production without waiting for the scheduled run.
Official Resources

Domain 3 Study Resources

dbt debugging documentation and tools