FlashGenius Logo FlashGenius
SnowPro Core COF-C03 · Page 4 of 5 · Domain 4

Performance Optimization, Querying & Transformation

SnowPro Core COF-C03 · Domain 4: Performance Optimization, Querying & Transformation · 21% of Exam

3 Cache Layers · Query Profile · Clustering Keys · Materialized Views · Streams · Tasks · Dynamic Tables · UDFs

Study with Practice Tests →

Domain 4 — What the Exam Tests

Domain 4 accounts for 21% of the SnowPro Core COF-C03 exam and covers everything that makes Snowflake fast, flexible, and production-ready. You need to understand all three caching layers, how to read the Query Profile in Snowsight, when to apply clustering keys versus Search Optimization Service, and the mechanics of Snowflake's transformation pipeline toolset — Streams, Tasks, Dynamic Tables, Stored Procedures, and UDFs. The exam frequently tests the boundary between caching types, the cost model for features like Automatic Clustering and Materialized Views, and the CDC pattern built with Streams and Tasks.

Result Cache Metadata Cache Local Disk Cache Query Profile Spill to Disk Micro-Partitions Clustering Keys Automatic Clustering Materialized Views Search Optimization Streams Tasks Dynamic Tables Stored Procedures UDFs / UDTFs

Core Concepts at a Glance

Eight foundational pillars of Domain 4 performance and transformation

Three Caching Layers

Result Cache (24h, cloud services), Metadata Cache (persistent, cloud services), and Local Disk Cache (VW lifetime, SSD). Knowing which layer applies to which scenario is a top exam question.

🔍

Query Profile

Visual DAG in Snowsight showing every operator node — TableScan, Join, Aggregate, Sort. Reveals partition pruning efficiency, spilling to local SSD or remote storage, and exploding joins.

🗂️

Micro-Partitions & Clustering

Snowflake auto-manages columnar micro-partitions (50–500 MB compressed). Clustering keys reorder data for query-pattern-aligned pruning on large tables. Automatic Clustering keeps them fresh.

📐

VW Sizing Strategy

Scale UP (bigger size) for complex queries and large aggregations. Scale OUT (multi-cluster) for high concurrency. Each size step doubles compute and cost — roughly linear for memory-bound work.

🪟

Materialized Views

Pre-computed, auto-refreshed SELECT results stored on disk (Enterprise+). Faster reads at query time. Limitations: no JOINs, no non-deterministic functions, no subqueries in definition.

🔄

Streams & Tasks

Streams track DML changes (INSERT/UPDATE/DELETE) since last consumption. Tasks schedule SQL or stored procedures. Combined = micro-batch CDC pipeline. Check stream data with SYSTEM$STREAM_HAS_DATA().

📊

Dynamic Tables

Declarative transformation: define a SELECT query and a TARGET_LAG. Snowflake auto-refreshes when staleness threshold is reached. Simpler alternative to manual Streams + Tasks orchestration.

🛠️

UDFs & Stored Procedures

Scalar UDFs return one value per row; UDTFs return a table. Stored procedures support full procedural logic (loops, conditionals) via Snowflake Scripting, Python, JavaScript, Java, or Scala.

Three Caching Layers Quick Reference

Cache Type Scope Duration What's Cached
Result Cache Cloud Services Layer 24 hours Exact query results; no VW needed to reuse
Metadata Cache Cloud Services Layer Persistent Object metadata, min/max stats per micro-partition
Local Disk Cache (Data Cache) Virtual Warehouse VW running Decompressed micro-partition data on SSD

Exam Quick Facts

ItemDetail
ExamSnowPro Core COF-C03 (launched Feb 16, 2026)
Domain 4 Weight21% of 100 questions
Duration115 minutes
Passing Score750 / 1000
Cost$175

Deep Dive — Domain 4

Detailed coverage of all exam topics with syntax, nuance, and edge cases

Three Caching Layers

Result Cache

The Result Cache stores the exact output of a query in the Cloud Services Layer for 24 hours. Snowflake returns cached results instantly — no Virtual Warehouse credits consumed. Reuse conditions must ALL be true:

  • Exact same SQL text
  • Same user role (security context)
  • No underlying data changes in the referenced objects
  • No non-deterministic functions (e.g., CURRENT_TIMESTAMP(), RANDOM())

Metadata Cache

Persistent cache in the Cloud Services Layer storing micro-partition statistics — min/max values per column, row counts, and null counts. This enables Snowflake to answer certain queries with zero warehouse credits:

  • SELECT COUNT(*) FROM my_table — no VW required
  • SELECT MIN(col), MAX(col) FROM my_table — no VW required
  • Partition pruning decisions are also driven by metadata cache

Local Disk Cache (Data Cache)

Each Virtual Warehouse node caches decompressed micro-partition data on its local SSD during query execution. Subsequent queries on the same VW can reuse this cached data — reducing remote storage reads.

  • Cache lifetime tied to VW uptime — lost when the VW suspends
  • Key reason to avoid frequent suspend/resume for performance-sensitive workloads
  • Automatically evicted using LRU policy when SSD fills up
  • Larger VW sizes = more SSD capacity per cluster node

Query Profile & Performance Analysis

Reading the Query Profile

Access in Snowsight under Query History. The profile is a visual DAG (directed acyclic graph) where each node is an operator:

  • TableScan: shows partitions scanned vs. total — low % scanned = good pruning
  • Join: reveals join type (hash, merge, nested loop) and row counts
  • Aggregate: grouping and aggregation node
  • Sort: sorting operator — can spill to disk if result set is large

Spilling to Disk

When an operator (Sort, Aggregate, Join) runs out of in-memory space, it spills intermediate data first to local SSD, then to remote cloud storage. This is a major performance bottleneck.

  • Visible in Query Profile as "Bytes spilled to local storage" or "Bytes spilled to remote storage"
  • Fix: scale UP (larger VW size = more memory per node)
  • Also optimize the query: break large aggregations, avoid unnecessary sorts, use clustering keys
  • Remote storage spill is far worse than local SSD spill — indicates severely under-resourced VW

Exploding Joins & Partition Pruning

Exploding join: query produces far more rows than expected. Indicates missing or incorrect join predicates — a Cartesian or near-Cartesian join. Check join node's output row count vs. input.

Partition pruning: if % of partitions scanned is high on a large table, natural clustering doesn't match your filter columns. Consider adding a clustering key.

  • Use SYSTEM$CLUSTERING_INFORMATION('table', '(col1, col2)')' to assess clustering quality before committing
  • Pruning efficiency is the single biggest lever for reducing query cost on large tables

Micro-Partitions & Clustering Keys

Micro-Partition Fundamentals

Snowflake stores data in columnar micro-partitions of 50–500 MB (compressed). They are auto-created, immutable, and managed entirely by Snowflake. Each micro-partition stores metadata (min/max per column) enabling pruning without reading the data.

  • Natural clustering = insertion order — works well if data is appended in filter-column order (e.g., event logs by date)
  • Poor natural clustering = high % of partitions scanned for selective queries

Clustering Keys

Explicitly define columns to physically cluster data around. Best suited for:

  • Very large tables (multi-TB)
  • Tables frequently filtered on specific high-cardinality columns
  • Tables with poor natural clustering relative to query patterns
-- Define clustering key
ALTER TABLE my_table CLUSTER BY (region, order_date);

-- Check clustering quality
SELECT SYSTEM$CLUSTERING_INFORMATION('my_table', '(region, order_date)');
  • Automatic Clustering: Snowflake background service continuously re-clusters the table — uses serverless credits billed separately
  • Clustering depth < 1 = well-clustered; high depth = poor clustering, queries scan too many partitions
  • Use ALTER TABLE my_table SUSPEND RECLUSTER; to pause Automatic Clustering

Virtual Warehouse Sizing Strategy

Scale UP vs. Scale OUT

StrategyMechanismUse Case
Scale UPResize VW (XS → S → M → L → XL…)Complex queries, large aggregations, joins on many partitions — single query performance
Scale OUTMulti-cluster VW (additional clusters spin up)High concurrency — many simultaneous users/queries hitting the same VW
  • Each size step doubles compute resources (and cost) — roughly linear improvement for memory-bound queries
  • Queuing indicator: check QUEUED_PROVISIONING_TIME and QUEUED_OVERLOAD_TIME in QUERY_HISTORY
  • High queuing = Scale OUT. High spill-to-disk / slow single queries = Scale UP

Materialized Views

How They Work

A Materialized View (MV) stores the pre-computed result of a SELECT query on disk. When the base table changes, Snowflake's background service automatically refreshes the MV using serverless credits.

  • Enterprise edition or above required
  • Queries can use the MV transparently — Snowflake may rewrite the query automatically
  • Faster reads than standard views (no recomputation at query time)

Limitations (exam-tested):

  • No JOINs in the MV definition
  • No non-deterministic functions (e.g., CURRENT_TIMESTAMP)
  • No subqueries in the definition
  • Only one base table allowed

Best use case: expensive aggregations (SUM, COUNT, AVG) over large tables, queried frequently with the same filters.

Search Optimization Service

Point Lookup Acceleration

The Search Optimization Service (SOS) is a serverless feature that builds a persistent search access path alongside the table data. It dramatically speeds up equality and IN predicate queries on large tables.

-- Enable Search Optimization
ALTER TABLE my_table ADD SEARCH OPTIMIZATION;

-- Disable
ALTER TABLE my_table DROP SEARCH OPTIMIZATION;
  • Best for: tables with billions of rows, ad-hoc point lookups, highly selective WHERE clauses with equality or IN predicates
  • NOT for: range scans, full table scans, analytical aggregation workloads — use clustering keys instead
  • Incurs additional storage cost for the search access path
  • Serverless credits consumed to build and maintain the access path

Streams (Change Data Capture)

How Streams Work

A Stream tracks DML changes (INSERT, UPDATE, DELETE) on a source table since the last time the stream was consumed. It uses the table's Time Travel metadata to track an offset.

-- Create a standard stream
CREATE STREAM my_stream ON TABLE source_table;

-- Query changes
SELECT * FROM my_stream;

Metadata columns added by streams:

  • METADATA$ACTION: INSERT or DELETE (updates = one DELETE row + one INSERT row)
  • METADATA$ISUPDATE: TRUE if the row is part of an UPDATE operation
  • METADATA$ROW_ID: unique identifier for the row

Stream Types & Staleness

  • Standard stream: captures inserts, updates (as delete+insert pair), and deletes
  • Append-only stream: captures inserts only — more efficient for append-only source tables (event logs, Kafka landing tables)
  • Insert-only stream: for external tables; inserts only

Staleness: a stream becomes STALE if not consumed within the source table's Time Travel retention window (default 1 day, max 90 days on Enterprise). Once stale, the stream must be recreated.

  • Consuming a stream requires a DML statement inside a transaction — this advances the stream offset
  • Check: SYSTEM$STREAM_HAS_DATA('my_stream') — returns TRUE if unconsumed changes exist

Tasks

Scheduling SQL with Tasks

Tasks schedule a SQL statement or a stored procedure call on a fixed interval or cron schedule.

-- VW-based task (every 5 minutes)
CREATE TASK my_task
  WAREHOUSE = my_wh
  SCHEDULE = '5 MINUTE'
AS
  INSERT INTO target_table
  SELECT * FROM my_stream WHERE METADATA$ACTION = 'INSERT';

-- Tasks start SUSPENDED — must resume
ALTER TASK my_task RESUME;
  • Default state: SUSPENDED — you must explicitly ALTER TASK ... RESUME after creation
  • Serverless tasks: omit the WAREHOUSE clause; Snowflake manages compute via USER_TASK_MANAGED_INITIAL_WAREHOUSE_SIZE
  • Task trees: child tasks triggered on parent completion using AFTER parent_task clause
  • Stream + Task pattern: task checks SYSTEM$STREAM_HAS_DATA() before processing to avoid empty runs

Dynamic Tables

Declarative Pipelines

Dynamic Tables take a declarative approach: you define the transformation query and a TARGET_LAG (maximum staleness). Snowflake automatically determines when and how to refresh — no manual stream/task orchestration needed.

CREATE DYNAMIC TABLE customer_summary
  TARGET_LAG = '1 minute'
  WAREHOUSE = transform_wh
AS
  SELECT customer_id, SUM(amount) AS total_spend
  FROM orders
  GROUP BY customer_id;
  • TARGET_LAG: max acceptable staleness — e.g., '1 minute', '1 hour', or DOWNSTREAM (inherit from downstream dynamic tables)
  • Snowflake chooses full or incremental refresh automatically
  • Dynamic Tables can chain: output of one feeds the definition of another
  • Simpler to manage than Streams + Tasks for multi-hop transformation pipelines

Stored Procedures & UDFs

Stored Procedures

Stored procedures execute procedural logic: loops, conditionals, multiple SQL statements, and DDL. Called with CALL my_proc(args);.

  • Snowflake Scripting (SQL-based): recommended approach — native SQL with procedural constructs
  • JavaScript: legacy option, still tested on exam
  • Python, Java, Scala: available for complex logic and library access
  • Can return a scalar value or a table (TABLE return type)
  • Can execute DDL (CREATE TABLE, DROP TABLE) — unlike UDFs
  • Run with caller's rights or owner's rights — controls which role's privileges apply

UDFs (User-Defined Functions)

Scalar UDFs: return one value per input row. Used inline in SQL:

SELECT my_udf(column_name) FROM my_table;

UDTFs (Table Functions): return a table (multiple rows) per input. Called with TABLE():

SELECT * FROM TABLE(my_udtf(arg1, arg2));
  • Languages: SQL, JavaScript, Python, Java
  • Secure UDFs: hide implementation from users who lack OWNERSHIP privilege — required for data sharing scenarios where the UDF logic is proprietary
  • UDFs cannot execute DML or DDL — use stored procedures for that

Memory Hooks

Six high-yield mnemonics to lock in Domain 4 concepts before exam day

🧊

The Three Cache Layers

"Result–Metadata–Disk"

Result cache: 24h, cloud services, exact query results, no VW needed. Metadata cache: persistent, cloud services, min/max stats, enables zero-credit COUNT(*). Local Disk cache: VW lifetime only, SSD, lost on suspend.

💧

Spilling to Disk

"Spill = VW too small. Scale UP."

Spilling to local SSD is bad. Spilling to remote storage is worse. The fix is always Scale UP (bigger VW size = more memory). Scale OUT addresses concurrency, not memory pressure.

🗂️

Clustering vs. Search Optimization

"Ranges = Clustering. Equality = Search Optimization."

Clustering keys: range scans, large tables, date/region filters. Search Optimization Service: point lookups, equality and IN predicates, ad-hoc queries on billions of rows.

🔄

Stream + Task Pipeline

"Stream tracks it. Task processes it. Dynamic Table does both automatically."

Stream + Task = micro-batch CDC pipeline (manual orchestration). Dynamic Table = declarative, set TARGET_LAG and walk away. Streams go stale if not consumed within the Time Travel window.

⏱️

Stream Staleness Warning

"No consumption = stale stream. Must recreate."

Streams become STALE if not consumed within the source table's Time Travel retention window (default 1 day, up to 90 days on Enterprise). Once stale the stream is gone — you must recreate it and lose change history.

📐

VW Sizing Decision Rule

"UP = complex queries. OUT = more users."

Scale UP (larger VW): slow individual queries, spilling to disk, large aggregations and joins. Scale OUT (multi-cluster): queuing, many concurrent users, high QUEUED_OVERLOAD_TIME in QUERY_HISTORY.

Practice Quiz

10 exam-style questions · Domain 4

Score: 0 / 0
Question 1 of 10
How long does the Snowflake Result Cache retain query results in the Cloud Services Layer?
The Result Cache stores exact query results for 24 hours. No Virtual Warehouse credits are consumed when a cached result is returned.
Question 2 of 10
Which Snowflake cache allows COUNT(*) to return results with zero Virtual Warehouse credits?
The Metadata Cache stores micro-partition statistics (row counts, min/max values) in the Cloud Services Layer. Queries like COUNT(*), MIN, and MAX can be answered purely from metadata — no warehouse needed.
Question 3 of 10
A data engineer notices "Bytes spilled to remote storage" in the Query Profile. What is the most appropriate fix?
Spilling to remote storage means the VW nodes have exhausted both memory and local SSD. The fix is to Scale UP (use a larger warehouse size) to give operators more memory. Adding clusters (Scale OUT) addresses concurrency, not memory pressure.
Question 4 of 10
A 10 TB fact table is frequently queried with WHERE region = 'US' AND order_date > '2025-01-01'. The Query Profile shows 85% of partitions being scanned. What is the best solution?
High partition scan % on range-filtered queries indicates poor clustering. Adding a clustering key on (region, order_date) will physically reorder micro-partitions to match the query pattern, enabling aggressive pruning. Search Optimization is for equality/IN lookups, not range scans.
Question 5 of 10
Which metadata columns does a Snowflake standard stream add to tracked rows?
Snowflake streams add three metadata columns: METADATA$ACTION (INSERT or DELETE), METADATA$ISUPDATE (TRUE when the row is part of an UPDATE), and METADATA$ROW_ID (unique row identifier).
Question 6 of 10
A Snowflake Task is created with CREATE TASK ... SCHEDULE = '1 MINUTE' AS ... What is its state immediately after creation?
Tasks are always created in the SUSPENDED state by default. You must explicitly run ALTER TASK my_task RESUME; to activate them.
Question 7 of 10
A table has 5 billion rows and receives frequent ad-hoc queries filtering on a specific user_id (equality predicate). Which Snowflake feature provides the best acceleration?
Search Optimization Service is designed specifically for point lookup queries (equality and IN predicates) on large tables with high selectivity. Clustering keys are better for range scans on analytical queries.
Question 8 of 10
Which of the following is a valid limitation of Snowflake Materialized Views?
Materialized Views cannot contain JOINs in their definition — they are limited to a single base table with filters and aggregations. They are auto-refreshed by Snowflake (no manual task needed) and require Enterprise edition or above.
Question 9 of 10
In a Dynamic Table definition, what does TARGET_LAG = 'DOWNSTREAM' mean?
TARGET_LAG = DOWNSTREAM means the Dynamic Table does not have its own independent refresh schedule — it inherits and aligns its refresh timing from the Dynamic Tables that depend on it downstream, optimizing the overall pipeline refresh cadence.
Question 10 of 10
Under what condition does a Snowflake Stream become STALE and require recreation?
A stream becomes STALE when it is not consumed within the source table's Time Travel retention window (default 1 day, up to 90 days on Enterprise). The stream tracks changes using Time Travel — once those historical records age out, the stream offset is no longer valid.

Flashcards

Click any card to reveal the answer. Click again to flip back.

Caching
What are the three Snowflake cache types — their scope, duration, and what each caches?
Result Cache: Cloud Services, 24h, exact query results (no VW needed).

Metadata Cache: Cloud Services, persistent, micro-partition stats (min/max, row count) — enables zero-credit COUNT(*).

Local Disk Cache: Virtual Warehouse, VW lifetime, decompressed micro-partition data on SSD — lost on suspend.
Tap to flip back
Query Profile
What does "Bytes spilled to remote storage" in the Query Profile indicate, and how do you fix it?
It means an operator (Sort, Join, Aggregate) exceeded both in-memory capacity AND local SSD capacity, writing to remote cloud storage — the worst spill scenario.

Fix: Scale UP (use a larger VW size to give operators more memory). Also consider optimizing the query or adding clustering keys to reduce data scanned.
Tap to flip back
Clustering
What are the best candidates for Snowflake clustering keys?
Best candidates:
• Very large tables (multi-TB)
• Tables with high % partitions scanned (poor pruning)
• Frequently filtered on specific columns (date, region, status)
• Range scan queries

Not for: point lookup / equality queries — use Search Optimization Service instead.
Tap to flip back
Streams
What are the three metadata columns that Snowflake streams add, and what does each contain?
METADATA$ACTION: INSERT or DELETE (updates appear as one DELETE + one INSERT pair)

METADATA$ISUPDATE: TRUE if the row is part of an UPDATE DML operation

METADATA$ROW_ID: Unique identifier for the source row across stream records
Tap to flip back
Tasks
What state are Tasks in immediately after creation, and how do you activate them?
Tasks are created in SUSPENDED state by default.

Activate with:
ALTER TASK my_task RESUME;

Common pattern: create the task, verify the definition, then RESUME. To stop: ALTER TASK my_task SUSPEND;
Tap to flip back
Dynamic Tables
What does TARGET_LAG mean in a Dynamic Table definition?
TARGET_LAG defines the maximum acceptable staleness of the Dynamic Table's data relative to the source.

Examples: '1 minute', '1 hour'

DOWNSTREAM: inherit refresh schedule from dependent downstream Dynamic Tables — used to optimize multi-hop pipeline refresh cadence.
Tap to flip back
Search Optimization
When should you use Search Optimization Service vs. clustering keys?
Search Optimization Service:
• Equality and IN predicate queries
• Ad-hoc point lookups on billions of rows
• Highly selective WHERE clauses

Clustering Keys:
• Range scans (date ranges, region ranges)
• Analytical aggregation on large tables
• Regularly filtered columns with low-to-medium cardinality
Tap to flip back
Materialized Views
What are the key limitations of Snowflake Materialized Views?
Key limitations:
No JOINs in the definition
No non-deterministic functions (CURRENT_TIMESTAMP, RANDOM)
No subqueries in the definition
• Only one base table allowed
• Requires Enterprise edition or above
• Auto-refresh uses serverless credits
Tap to flip back

Study Advisor

Select a topic area to get targeted study guidance for Domain 4

⚡ Caching Strategy

Mastering Snowflake's Three Cache Layers

  • Memorize the scope of each layer: Result and Metadata caches live in Cloud Services (no VW needed); Local Disk Cache lives in the VW
  • Result cache is invalidated if the underlying data changes OR if a non-deterministic function is used — this is an exam favorite
  • Metadata cache enables zero-credit COUNT(*), MIN, MAX — remember this for "which queries use no credits?" questions
  • Local Disk Cache is lost on VW suspend — the exam may ask why frequent suspend/resume hurts performance
  • Practice distinguishing result cache reuse conditions: same SQL, same role, no data change, no non-deterministic functions
  • Know that the Local Disk Cache uses an LRU (Least Recently Used) eviction policy
  • Larger VW size = more SSD per node = larger Local Disk Cache capacity
🔍 Query Profiling

Reading the Query Profile Like an Expert

  • Open the Query Profile in Snowsight — navigate to Query History, click a query ID, then the Profile tab
  • Each node in the DAG is an operator — the numbers show rows processed and bytes transferred between nodes
  • TableScan node: check "Partitions scanned" vs "Partitions total" — high % = poor pruning = consider clustering key
  • Spill to local storage = VW is under-sized for memory; spill to remote storage = severely under-sized
  • Exploding join: the Join node shows output rows far exceeding input rows — check join predicates for missing conditions
  • Sort is expensive — always check if you can eliminate unnecessary ORDER BY or push it later in the pipeline
  • Practice the fix decision tree: spilling → Scale UP; queuing → Scale OUT; poor pruning → clustering key or SOS
🗂️ Clustering

Clustering Keys, Automatic Clustering & Search Optimization

  • Clustering key use case: large table, range scan queries, poor natural clustering — not for point lookups
  • Search Optimization Service use case: point lookups (equality, IN) on large tables, ad-hoc queries — not for range scans
  • Use SYSTEM$CLUSTERING_INFORMATION('table', '(col)') before adding a clustering key — check average depth
  • Average depth < 1: well-clustered. High depth means data is scattered across many partitions for that key column
  • Automatic Clustering: Snowflake continuously re-clusters in the background — uses serverless credits billed separately from VW credits
  • You can suspend Automatic Clustering: ALTER TABLE my_table SUSPEND RECLUSTER;
  • Materialized Views can also improve performance — but they have strict limitations (no JOINs, Enterprise only)
  • Choose 1–3 clustering columns max; high-cardinality + query-common columns work best
🔄 Streams & Tasks

Building CDC Pipelines with Streams, Tasks & Dynamic Tables

  • Stream offset advances only when a DML statement consuming the stream commits inside a transaction
  • Staleness rule: stream must be consumed within the source table's Time Travel window — default 1 day, Enterprise up to 90 days
  • Append-only stream is more efficient than standard stream for insert-only sources (event logs, Kafka)
  • Tasks are always created SUSPENDED — always remember to ALTER TASK my_task RESUME;
  • Task tree pattern: parent task runs first, child tasks execute AFTER parent_task
  • Common exam question: Stream + Task micro-batch pattern vs. Dynamic Table declarative pattern — know when each is preferred
  • Dynamic Tables simplify multi-hop pipelines — define SELECT query and TARGET_LAG, let Snowflake manage refresh
  • DOWNSTREAM target lag: useful when chaining Dynamic Tables to align refresh schedules automatically
  • Serverless tasks: omit WAREHOUSE clause, set USER_TASK_MANAGED_INITIAL_WAREHOUSE_SIZE instead
🛠️ UDFs & Procedures

UDFs, UDTFs, and Stored Procedures

  • Scalar UDF: returns one value per input row — used inline in SELECT, WHERE, GROUP BY
  • UDTF (Table Function): returns a table (zero or more rows) per input — called with TABLE(my_udtf(args))
  • Stored procedures: execute multiple SQL statements, loops, conditionals — called with CALL my_proc()
  • Stored procedures can execute DDL (CREATE TABLE, DROP) — UDFs cannot
  • Recommended SP language: Snowflake Scripting (SQL-based) — easier for SQL-fluent developers
  • JavaScript stored procedures are legacy — still on exam but Snowflake Scripting is preferred
  • Secure UDFs: hide implementation from users who lack OWNERSHIP — required for data sharing with proprietary logic
  • Caller's rights vs. Owner's rights stored procedures: controls which role's privileges apply at execution time
  • Know that Python UDFs support third-party libraries via Snowpark — referenced in IMPORTS clause

Study Resources

Official and community resources for SnowPro Core COF-C03 Domain 4

Official

SnowPro Core Certification Page

Official Snowflake certification hub — exam guide, study resources, and registration.

learn.snowflake.com →
Documentation

Snowflake Query Profile Docs

Official guide to reading the Query Profile DAG — operator nodes, spill indicators, and performance troubleshooting.

docs.snowflake.com →
Documentation

Caching in Snowflake

Official documentation on Result Cache, Metadata Cache, and Local Disk Cache behavior and configuration.

docs.snowflake.com →
Documentation

Micro-Partitions & Clustering

Deep dive into micro-partition structure, clustering keys, Automatic Clustering, and SYSTEM$CLUSTERING_INFORMATION.

docs.snowflake.com →
Documentation

Streams & Tasks

Complete reference for stream types (standard, append-only), metadata columns, task scheduling, and serverless tasks.

docs.snowflake.com →
Documentation

Dynamic Tables

Official guide to Dynamic Table syntax, TARGET_LAG settings, DOWNSTREAM refresh, and pipeline patterns.

docs.snowflake.com →
Documentation

Stored Procedures & UDFs

Reference for Snowflake Scripting, JavaScript, Python procedures; scalar UDFs and UDTFs; secure UDFs.

docs.snowflake.com →
Practice

FlashGenius Practice Tests

Full-length SnowPro Core COF-C03 practice exams with detailed explanations for every Domain 4 topic.

flashgenius.net →

Ready to Pass the SnowPro Core Exam?

Test yourself with full-length practice exams covering all 5 domains — including Domain 4 Performance & Transformation.

Start Free Practice Tests →