FlashGenius Logo FlashGenius
Login Sign Up

dbt Analytics Engineering Certification (DBT)-AE Practice Questions: Analytics Engineering Best Practices Domain

Test your dbt Analytics Engineering Certification (DBT-AE) knowledge with 10 practice questions from the Analytics Engineering Best Practices domain. Includes detailed explanations and answers.

dbt Analytics Engineering Certification (DBT-AE) Practice Questions

Master the Analytics Engineering Best Practices Domain

Test your knowledge in the Analytics Engineering Best Practices domain with these 10 practice questions. Each question is designed to help you prepare for the DBT-AE certification exam with detailed explanations to reinforce your learning.

Question 1

Your dbt project has several models, and you want to refactor a large model into smaller, more manageable pieces. You need to ensure that each sub-model correctly references the others and that the project structure follows dbt best practices. Which of the following is the best approach to achieve this?

A) Split the model into multiple SQL files, each representing a sub-model, and use the `ref` function to reference them in a centralized model.

B) Create multiple SQL files in a new directory, and use hard-coded table names to reference them.

C) Use CTEs in a single SQL file to simulate sub-models, maintaining the large model structure.

D) Create separate dbt projects for each sub-model and use cross-project dependencies.

Show Answer & Explanation

Correct Answer: A

Explanation: Option A is correct because splitting the model into smaller SQL files and using the `ref` function to reference them maintains modularity and adheres to dbt best practices. This approach allows for easier testing, documentation, and maintenance. Option B is incorrect because hard-coding table names goes against dbt's dynamic referencing capabilities, which can lead to maintenance challenges. Option C is incorrect because using CTEs in a single file does not effectively refactor the model into manageable pieces. Option D is incorrect because creating separate dbt projects is unnecessary and overly complex for this scenario. [Refer to dbt documentation on refactoring models](https://docs.getdbt.com/docs/building-a-dbt-project/building-models).

Question 2

You want to implement a custom test to ensure that a column named `email` in your `customers` model contains valid email addresses. Which of the following is the best way to define this test using dbt and `dbt-utils`?

A) Define a test in `schema.yml` using `dbt-utils.email_validity`.

B) Create a custom SQL test file using `dbt test` with a regex pattern.

C) Use the built-in `not_null` test in `schema.yml` to check email validity.

D) Write a Jinja macro to iterate over each email and validate its format.

Show Answer & Explanation

Correct Answer: A

Explanation: Option A is correct because dbt-utils provides a `email_validity` test that can be defined in `schema.yml` to check for valid email formats. This approach leverages existing utilities and integrates well with dbt's testing framework. Option B, while possible, is not the best practice as it requires custom SQL logic that could be error-prone and less maintainable. Option C is incorrect because `not_null` only checks for non-null values, not format validity. Option D is incorrect because writing a Jinja macro for this purpose is unnecessarily complex and not the intended use of Jinja in dbt. [Refer to dbt-utils documentation for custom tests](https://github.com/dbt-labs/dbt-utils).

Question 3

You need to implement a continuous integration strategy in your dbt project. Which command would you use to run tests only on models that have been modified or are downstream of modified models?

A) `dbt test --select state:modified+`

B) `dbt run --models modified+`

C) `dbt test --select modified+`

D) `dbt test --state:modified+`

Show Answer & Explanation

Correct Answer: A

Explanation: The command `dbt test --select state:modified+` is used to run tests on models that have been modified or are downstream of modified models, which is an effective strategy for continuous integration. Option B is incorrect because it runs models, not tests. Option C is incorrect because `modified+` is not a valid selector without the `state:` prefix. Option D is incorrect because it uses an invalid syntax `--state:` instead of `--select state:`. For more information, refer to the dbt documentation on [state-based selectors](https://docs.getdbt.com/reference/node-selection/state-based-selection).

Question 4

You have a dbt model that calculates the total sales for each customer. You want to ensure that the sales data is always positive. Which of the following custom tests would you add in your `schema.yml` to validate this requirement?

A) unique

B) not_null

C) dbt_utils.expression_is_true

D) dbt_utils.not_constant

Show Answer & Explanation

Correct Answer: C

Explanation: The `dbt_utils.expression_is_true` test is used to validate that an expression holds true for all rows in a model. In this case, you can use it to ensure that all sales values are positive by passing an expression like `sales > 0`. Option A is incorrect because `unique` checks for uniqueness, not positivity. Option B is incorrect because `not_null` ensures values are not null, but doesn't check if they are positive. Option D is incorrect because `dbt_utils.not_constant` checks for variability in a column, not positivity. For more information, refer to the dbt-utils documentation on [custom tests](https://github.com/dbt-labs/dbt-utils#expression_is_true-source).

Question 5

In your dbt project, you need to ensure that models are only run if they have been modified since the last run. Which dbt command would you use to achieve this during your CI/CD process?

A) dbt run --select state:modified+

B) dbt run --models modified

C) dbt test --select state:modified+

D) dbt build --select state:new+

Show Answer & Explanation

Correct Answer: A

Explanation: The command `dbt run --select state:modified+` allows you to run models that have been modified since the last run. The `state:modified+` selector is part of dbt's state-based selection feature, which is useful in CI/CD pipelines to avoid unnecessary model runs. Option B is incorrect because `--models modified` is not a valid syntax for selecting modified models. Option C is incorrect because `dbt test` is used for running tests, not models. Option D is incorrect because `state:new+` would select newly created models, not modified ones. For more details, see the dbt documentation on [state-based selectors](https://docs.getdbt.com/reference/node-selection/syntax#state-based-selection).

Question 6

Your team uses a CI/CD pipeline to deploy dbt models. You want to ensure that only modified models and their downstream dependencies are tested before deployment. Which dbt CLI command would you use to achieve this?

A) `dbt test --select state:modified+`

B) `dbt test --select state:modified`

C) `dbt test --select state:modified+ state:modified`

D) `dbt test --select state:modified*`

Show Answer & Explanation

Correct Answer: A

Explanation: Option A is correct because `dbt test --select state:modified+` selects all modified models and their downstream dependencies for testing. This ensures comprehensive testing of changes and their impacts. Option B is incorrect because it only selects modified models without their dependencies. Option C is incorrect because it redundantly tries to select modified models twice. Option D is incorrect because `state:modified*` selects both upstream and downstream dependencies, which is not necessary for testing purposes. For more details, refer to the dbt documentation on [state-based selection](https://docs.getdbt.com/reference/node-selection/syntax#state-based-selection).

Question 7

You have a dbt model that transforms data from a source table. You need to ensure that a specific column, `email`, does not contain any null values after the transformation. Which schema test would you use in your `schema.yml` file to enforce this requirement?

A) ```yaml tests: - dbt_utils.not_null: column_name: email ```

B) ```yaml tests: - dbt_utils.unique: column_name: email ```

C) ```yaml tests: - dbt_utils.accepted_values: column_name: email values: ['not_null'] ```

D) ```yaml tests: - dbt_utils.expression_is_true: expression: 'email IS NOT NULL' ```

Show Answer & Explanation

Correct Answer: A

Explanation: Option A is correct because `dbt_utils.not_null` is a schema test specifically designed to check that a column does not contain null values. Option B is incorrect because `dbt_utils.unique` checks for unique values, not null values. Option C is incorrect because `accepted_values` is used for checking specific allowed values, not nullity. Option D, while technically possible, is not the most straightforward or idiomatic way to enforce non-null constraints in dbt. For more information, see the dbt documentation on [schema tests](https://docs.getdbt.com/docs/building-a-dbt-project/tests#schema-tests).

Question 8

During a model run, you notice that a specific incremental model is taking longer than expected to complete. Which of the following strategies would most likely optimize the performance of this model?

A) Switch the model to a full refresh run every time.

B) Review and optimize the SQL logic within the model.

C) Increase the number of partitions in the underlying data source.

D) Disable the incremental logic and run it as a batch model.

Show Answer & Explanation

Correct Answer: B

Explanation: Option B is correct because optimizing the SQL logic can significantly improve the performance of an incremental model. This might involve refining joins, filtering data earlier, or using indexes effectively. Option A is incorrect because full refresh runs counteract the benefits of incremental processing. Option C might help in some cases, but it depends on the data source and is not guaranteed to solve the issue. Option D is incorrect because disabling incremental logic would increase the time needed to process the entire dataset every time. [Refer to dbt documentation on incremental models](https://docs.getdbt.com/docs/building-a-dbt-project/building-models/incremental-models).

Question 9

You need to use a Jinja macro to dynamically filter a dbt model based on a given date range. Which of the following Jinja expressions correctly implements this logic within a SQL statement?

A) {{ date_range }}

B) {% if start_date and end_date %}WHERE date BETWEEN '{{ start_date }}' AND '{{ end_date }}'{% endif %}

C) SELECT * FROM {{ ref('orders') }} WHERE date_range

D) {% set date_range = 'date BETWEEN start_date AND end_date' %}

Show Answer & Explanation

Correct Answer: B

Explanation: The expression `{% if start_date and end_date %}WHERE date BETWEEN '{{ start_date }}' AND '{{ end_date }}'{% endif %}` uses Jinja control flow to conditionally add a `WHERE` clause based on the presence of `start_date` and `end_date` variables. This is a common pattern for dynamic filtering in dbt models. Option A is incorrect because `{{ date_range }}` is a variable substitution, not a conditional logic statement. Option C is incorrect because `date_range` is not defined as a valid condition in the SQL context. Option D is incorrect because setting a Jinja variable without using it in a conditional or loop does not affect the SQL logic. For more information, see the dbt documentation on [Jinja in dbt](https://docs.getdbt.com/docs/building-a-dbt-project/jinja-context/).

Question 10

In a dbt project, you want to create a model that filters records based on a condition using Jinja and SQL. The requirement is to only include records where `status` is 'active'. Which of the following code snippets correctly implements this logic in a dbt model?

A) ```sql select * from {{ ref('source_table') }} where status = 'active' ```

B) ```sql select * from {{ ref('source_table') }} {% if status == 'active' %} where status = 'active' {% endif %} ```

C) ```sql select * from {{ ref('source_table') }} {% set filter_condition = 'active' %} where status = {{ filter_condition }} ```

D) ```sql select * from {{ ref('source_table') }} {% set filter_condition = 'active' %} where status = '{{ filter_condition }}' ```

Show Answer & Explanation

Correct Answer: D

Explanation: Option D is correct because it uses Jinja to set a variable `filter_condition` and then correctly applies it in the SQL `where` clause using Jinja templating syntax. Option A is technically correct but does not demonstrate the use of Jinja, which is the focus of this question. Option B is incorrect because it misuses the Jinja `if` statement in a context where it is not needed. Option C is incorrect because it fails to correctly apply the Jinja variable within quotes in the SQL context. For more guidance, see the dbt documentation on [Jinja](https://docs.getdbt.com/docs/building-a-dbt-project/jinja-context).

Ready to Accelerate Your dbt Analytics Engineering Certification (DBT-AE) Preparation?

Join thousands of professionals who are advancing their careers through expert certification preparation with FlashGenius.

  • ✅ Unlimited practice questions across all DBT-AE domains
  • ✅ Full-length exam simulations with real-time scoring
  • ✅ AI-powered performance tracking and weak area identification
  • ✅ Personalized study plans with adaptive learning
  • ✅ Mobile-friendly platform for studying anywhere, anytime
  • ✅ Expert explanations and study resources
Start Free Practice Now

Already have an account? Sign in here

About dbt Analytics Engineering Certification (DBT-AE) Certification

The DBT-AE certification validates your expertise in analytics engineering best practices and other critical domains. Our comprehensive practice questions are carefully crafted to mirror the actual exam experience and help you identify knowledge gaps before test day.

Other Practice Tests: