dbt Analytics Engineering Certification (DBT-AE) Practice Questions: Data Modeling and Transformations Domain

Master the Data Modeling and Transformations Domain

Test your knowledge in the Data Modeling and Transformations domain with these 10 practice questions. Each question is designed to help you prepare for the DBT-AE certification exam with detailed explanations to reinforce your learning.

Question 1

You need to perform a transformation that involves filtering rows based on a dynamic date range using Jinja templating. Which of the following is the correct way to filter data for the last 30 days in a dbt model?

A) WHERE order_date >= date_sub(current_date, interval 30 day)

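B) WHERE order_date >= '{{ (run_started_at - modules.datetime.timedelta(days=30)).strftime("%Y-%m-%d") }}'

C) WHERE order_date >= '{{ (current_timestamp() - interval '30 day') }}'

D) WHERE order_date >= '{{ dbt_utils.dateadd('day', -30, 'today') }}'

Correct Answer: B

Explanation: In dbt, Jinja can compute a date at compile time and render it into the SQL as a literal. Option B subtracts 30 days from `run_started_at` (the UTC timestamp at which the dbt invocation started) using `modules.datetime.timedelta` and formats the result as a date string. Airflow-style names such as `execution_date` or a bare `timedelta` are a common pitfall here, because they are not part of dbt's Jinja context. Option A is valid SQL on warehouses that support `date_sub(..., interval 30 day)`, but it does not use Jinja, which the question asks for. Option C is incorrect because it places a SQL expression inside `{{ ... }}`, which Jinja cannot parse. Option D is incorrect because the macro call is wrapped in quotes, so the rendered SQL becomes a string literal, and `'today'` is rendered as a bare identifier rather than a date expression. The `dateadd` macro itself does exist: it shipped in older dbt-utils releases and is available as `dbt.dateadd` in dbt Core 1.2 and later.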
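
As a rough illustration, the compile-time cutoff can be pulled into a variable so the filter stays readable. This is a minimal sketch: the file name, the upstream model `stg_orders`, and its columns are assumptions made for the example.

```sql
-- models/recent_orders.sql (hypothetical file name)
-- Compute the cutoff once, at compile time, from dbt's run timestamp (UTC).
{% set cutoff = (run_started_at - modules.datetime.timedelta(days=30)).strftime('%Y-%m-%d') %}

select
    order_id,
    order_date
from {{ ref('stg_orders') }}  -- assumed upstream model
where order_date >= '{{ cutoff }}'
```

Because the date is fixed when dbt compiles the model, the compiled SQL contains a literal date. If the warehouse should evaluate the range at query time instead, a macro such as `dbt.dateadd('day', -30, 'current_date')` can be rendered unquoted into the `where` clause.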

Question 2

Your dbt project has several models that need to be refactored into a new directory structure. Which command should you use to ensure all downstream dependencies are correctly updated after moving the models?

A) `dbt run --select +tag:new_structure`

B) `dbt deps`

C) `dbt compile`

D) `dbt run --select state:modified+`

Correct Answer: D

Explanation: After refactoring models, `dbt run --select state:modified+` runs the modified models and all of their downstream dependencies. State comparison needs a manifest from a previous run, supplied with the `--state` flag. Option A uses `+tag:new_structure`, which selects the models carrying that tag plus their upstream parents, so it depends on manual tagging and does not cover downstream models. Option B (`dbt deps`) installs the packages listed in `packages.yml` but doesn't run models. Option C (`dbt compile`) compiles models but doesn't execute them. For more information, refer to the [dbt CLI documentation](https://docs.getdbt.com/reference/commands/run).

Question 3

You have a dbt project with a model `orders` that needs to be refactored to improve performance. The model currently selects all columns from a large table. Which of the following strategies is most effective for optimizing this model?

A) Use a `LIMIT` clause to reduce the number of rows processed.

B) Select only the necessary columns instead of using `SELECT *`.

C) Convert the model to an incremental model using the `is_incremental()` macro.

D) Add a `WHERE` clause to filter data based on a specific condition.

Correct Answer: B

Explanation: Selecting only the necessary columns (option B) reduces the amount of data scanned and written, which can significantly improve performance on large tables, particularly in columnar warehouses. `SELECT *` reads every column, including ones that no downstream model uses. Option A (`LIMIT`) changes the model's output by dropping rows, so it is only suitable for development or sampling, not for production models. Option C (incremental models) helps when a large table mostly receives new or changed rows, because each run processes only those rows; it requires additional configuration and does not address the unnecessary columns. Option D (adding a `WHERE` clause) can improve performance by filtering rows, but it also leaves the unnecessary columns in place. Refer to the dbt documentation on [model performance optimization](https://docs.getdbt.com/docs/guides/best-practices/performance-optimization) for more details.
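
A minimal sketch of what the refactored model might look like, combining an explicit column list with the optional incremental filter discussed under option C. The source name `shop.raw_orders` and the column names are hypothetical.

```sql
-- models/orders.sql
{{ config(materialized='incremental', unique_key='order_id') }}

select
    order_id,
    customer_id,
    order_date,
    amount
from {{ source('shop', 'raw_orders') }}  -- hypothetical source definition

{% if is_incremental() %}
-- On incremental runs, only pick up rows at or after the latest date already loaded.
where order_date >= (select max(order_date) from {{ this }})
{% endif %}
```

The explicit column list is the change the question asks about; the `is_incremental()` block is an additional step that only pays off when the table is large and mostly appended to.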

Question 4

You want to run only the modified models and their downstream dependencies in your dbt project using a CI/CD pipeline. Which dbt CLI command should you use?

A) `dbt run --select state:modified`

B) `dbt run --select state:modified+`

C) `dbt run --select tag:modified`

D) `dbt run --select state:modified++`

Correct Answer: B

Explanation: Option B is correct because `dbt run --select state:modified+` selects the modified models and all of their downstream dependencies, not just their immediate children. A trailing `+` adds descendants; a leading `+` adds ancestors. State selectors compare the project against a previous manifest supplied with the `--state` flag. Option A (`state:modified`) selects only the modified models without dependencies. Option C (`tag:modified`) is incorrect because it uses a tag, which is not related to state-based selection. Option D (`state:modified++`) is not valid selector syntax; selecting both upstream and downstream dependencies would be written `+state:modified+`, which is more than the question asks for. Refer to the dbt documentation on [state-based selection](https://docs.getdbt.com/reference/node-selection/syntax#state-based-selection) for more information.

Question 5

You are tasked with refactoring a dbt project to improve maintainability. Which of the following steps is NOT recommended as a best practice?

A) Organize models into subdirectories based on business domain.

B) Use Jinja macros for repeated SQL logic.

C) Hardcode database credentials in model files for easy access.

D) Leverage the `ref` function to manage dependencies between models.

Correct Answer: C

Explanation: The correct answer is C. Hardcoding database credentials in model files is a security risk and not a best practice. Instead, credentials should be managed through environment variables or dbt profiles. Option A is a best practice as organizing models by business domain improves project structure. Option B is recommended to avoid code duplication by using Jinja macros for repeated logic. Option D is a best practice because the `ref` function helps manage dependencies and ensures models are built in the correct order. For best practices on project organization, refer to the [dbt documentation on project structure](https://docs.getdbt.com/docs/build/projects).

Question 6

You are setting up a CI/CD pipeline for your dbt project. You want to run tests only on models that have been modified since the last successful run. Which dbt CLI command should you use?

A) `dbt test --select state:modified+`

B) `dbt test --models modified+`

C) `dbt test --select state:changed+`

D) `dbt test --models state:modified+`

Correct Answer: A

Explanation: The correct answer is A. `dbt test --select state:modified+` runs the tests attached to models that differ from a comparison manifest (supplied with `--state`, typically the artifacts of the last successful run), plus the tests on their downstream models. Option B is incorrect because `modified+` is not a valid selector without the `state:` prefix. Option C is incorrect because `state:changed` is not a state method in dbt. Option D uses `--models`, a legacy flag that `--select` replaced; `--select` is the documented flag and the one to use in new pipelines. For more information, refer to the [dbt documentation on state-based selectors](https://docs.getdbt.com/docs/guides/best-practices/using-state).

Question 7

You are tasked with creating a custom test to ensure that the `order_id` field in your `orders` model is unique. Which of the following `schema.yml` configurations correctly implements this test using `dbt-utils`?

A)
```yaml
version: 2
models:
  - name: orders
    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns: ['order_id']
```

B)
```yaml
version: 2
models:
  - name: orders
    columns:
      - name: order_id
        tests:
          - dbt_utils.unique
```

C)
```yaml
version: 2
models:
  - name: orders
    tests:
      - unique:
          columns: ['order_id']
```

D)
```yaml
version: 2
models:
  - name: orders
    columns:
      - name: order_id
        tests:
          - unique
```

Correct Answer: D

Explanation: Option D is correct because it applies the built-in `unique` test directly to the `order_id` column, which is the standard way to test uniqueness in `schema.yml`; no dbt-utils test is needed for a single column. Option B is incorrect because `dbt_utils.unique` is not a test in dbt-utils; `unique` is a built-in test. Option A uses `dbt_utils.unique_combination_of_columns`, which is designed for uniqueness across a combination of columns rather than a single column. Option C is incorrect because it applies `unique` at the model level with a `columns` argument that the test does not accept. For details, see the [dbt Documentation on Tests](https://docs.getdbt.com/docs/build/tests).
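
To make the check concrete, the same condition can be written as a singular test, which is roughly what a uniqueness test asks the warehouse. This is a sketch under the assumption that the project keeps singular tests in a `tests/` directory; the file name is hypothetical.

```sql
-- tests/assert_order_id_unique.sql (hypothetical file name)
-- A singular test fails if this query returns any rows.
select
    order_id,
    count(*) as n_records
from {{ ref('orders') }}
where order_id is not null
group by order_id
having count(*) > 1
```

In practice the one-line `unique` entry in `schema.yml` is preferable; the hand-written query is mainly useful for understanding what a failing test is reporting.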

Question 8

You need to write a dbt model using Jinja to dynamically filter records based on the current year. What is the correct way to implement this logic in your SQL model file?

A)
```sql
SELECT * FROM sales WHERE year = {{ current_year() }}
```

B)
```sql
SELECT * FROM sales WHERE year = {{ get_current_year() }}
```

C)
```sql
SELECT * FROM sales WHERE year = {{ run_query("SELECT EXTRACT(YEAR FROM CURRENT_DATE)").columns[0] }}
```

D)
```sql
SELECT * FROM sales WHERE year = {{ macros.current_year() }}
```

Correct Answer: C

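Explanation: Option C is the only option built on a function that exists in dbt's Jinja context: `run_query` executes SQL against the warehouse and returns the result as an agate table. Options A and B call `current_year()` and `get_current_year()`, which do not exist, and option D references `macros.current_year()`, which is not a built-in macro. In a real model, the value would be read from the result with `.columns[0].values()[0]`, and the call would be wrapped in `{% if execute %}`, because `run_query` returns nothing while dbt is parsing the project. When Jinja is not required, writing `EXTRACT(YEAR FROM CURRENT_DATE)` directly in the `WHERE` clause is simpler. For more details, refer to the [dbt Jinja documentation](https://docs.getdbt.com/docs/building-a-dbt-project/jinja-context).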
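
A hedged sketch of how the `run_query` approach is usually written out in full. The `sales` table and `year` column come from the question; the variable names are illustrative, and the exact type of the returned value depends on the warehouse adapter.

```sql
{# run_query only returns results at execution time, so guard it with the execute flag. #}
{% set current_year = none %}
{% if execute %}
    {% set results = run_query("select extract(year from current_date)") %}
    {% set current_year = results.columns[0].values()[0] %}
{% endif %}

select *
from sales
where year = {{ current_year }}
```

The extra round trip to the warehouse at compile time is the cost of this pattern, which is why the pure-SQL filter is often the better choice when the value is not needed elsewhere in the Jinja logic.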

Question 9

You have a dbt model that transforms raw data. You want to ensure that all values in the 'email' column follow a valid email format. Which test configuration should you add to your `schema.yml` file?

A)
```yaml
version: 2
models:
  - name: my_model
    tests:
      - dbt_utils.email_format:
          column_name: email
```

B)
```yaml
version: 2
models:
  - name: my_model
    tests:
      - dbt_utils.expect_column_values_to_match_regex:
          column_name: email
          regex: '^\S+@\S+\.\S+$'
```

C)
```yaml
version: 2
models:
  - name: my_model
    tests:
      - unique:
          column_name: email
```

D)
```yaml
version: 2
models:
  - name: my_model
    tests:
      - not_null:
          column_name: email
```

Correct Answer: B

Explanation: The correct answer is B, because a regex test is the only option that checks the format of the values. Note, however, that `expect_column_values_to_match_regex` is provided by the dbt-expectations package and is referenced as `dbt_expectations.expect_column_values_to_match_regex`; dbt-utils does not ship a test by that name, so the `dbt_utils.` prefix shown in the option would fail in a real project. The regex `'^\S+@\S+\.\S+$'` is a coarse check (non-whitespace text, an `@`, and a dot in the domain) rather than full email validation. Option A is incorrect because there is no test named `dbt_utils.email_format`. Options C and D are incorrect because they test for uniqueness and non-null values, respectively, which say nothing about email format. See the [dbt-utils documentation](https://github.com/dbt-labs/dbt-utils) for the generic tests that package does provide.
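
Where no regex-testing package is installed, a similar coarse check can be sketched as a singular test using only standard `LIKE` patterns. The file name is hypothetical, and the pattern is deliberately loose: it requires at least one character before `@`, at least one after it, a dot, and at least one character after the dot.

```sql
-- tests/assert_email_format.sql (hypothetical file name)
-- Returns the rows that violate the pattern; any returned row fails the test.
select
    email
from {{ ref('my_model') }}
where email is not null
  and email not like '%_@_%._%'
```

Unlike the regex in option B, this pattern does not reject whitespace, so it is an approximation rather than an equivalent check.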

Question 10

A dbt model is failing due to a performance issue related to a large dataset. You suspect that the join operation is causing the slowdown. Which strategy can you use to optimize this model?

A) Convert the model to an ephemeral model

B) Use a CTE to pre-aggregate data before joining

C) Add a unique constraint to the join keys

D) Increase the compute resources of the database

Correct Answer: B

Explanation: Option B pre-aggregates data in a Common Table Expression (CTE) before the join, so the join processes fewer rows, which can improve performance. Option A does not help: an ephemeral model is not built in the database at all. dbt inlines it as a CTE in the models that reference it, so the same work is still done inside the downstream query. Option C (adding a unique constraint) concerns data integrity rather than performance, and many analytical warehouses do not enforce such constraints. Option D (increasing compute resources) may help but does not address the underlying inefficiency in the query. For best practices, see the [dbt performance documentation](https://docs.getdbt.com/docs/guides/best-practices/performance).
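
The pre-aggregation pattern from option B might look like the following sketch. The models `stg_orders` and `stg_customers` and their columns are assumptions made for the example.

```sql
-- Collapse orders to one row per customer before joining.
with order_totals as (

    select
        customer_id,
        count(*) as order_count,
        sum(amount) as total_amount
    from {{ ref('stg_orders') }}  -- assumed upstream model
    group by customer_id

)

select
    c.customer_id,
    c.customer_name,
    coalesce(t.order_count, 0) as order_count,
    coalesce(t.total_amount, 0) as total_amount
from {{ ref('stg_customers') }} as c  -- assumed upstream model
left join order_totals as t
    on c.customer_id = t.customer_id
```

Joining one aggregated row per customer, instead of every order row, keeps the join input small and avoids fanning out the customer table before aggregating.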
