Optimizing Writes to Hybrid Tables
Optimizing Writes to Hybrid Tables
Overview
Note: This guide focuses on Snowflake-side write optimization. For driver-specific patterns (JDBC, Python, Node.js connection pooling and batch APIs), see Connecting Applications to Hybrid Tables.
Hybrid Table write performance depends on three factors: plan compilation, row store write throughput, and index maintenance. Of these, plan compilation is the most common source of avoidable latency. A well-optimized write path compiles a plan once and reuses it thousands of times per second. A poorly optimized path recompiles on every execution, adding 10-100ms of unnecessary overhead to every write.
This quickstart teaches you how to measure and eliminate compilation overhead, choose the right write pattern for your workload, and avoid the DML anti-patterns that silently degrade performance.
What You Will Learn
- How the plan cache works and why bound variables are critical
- How to detect plan cache misses using
query_parameterized_hash - Why stored procedures add overhead and what to use instead
- How COPY INTO behaves on Hybrid Tables (and when to use it)
- How schema qualification inconsistency splits your plan cache
- The performance hierarchy: AUTOCOMMIT > multi-statement txn > stored procedures
- How to measure the write performance of your workload
Prerequisites
- A Snowflake paid account in an AWS or Azure commercial region
- Familiarity with SQL and Snowflake query profiles
- Completion of Getting Started with Hybrid Tables (recommended)
Setup
USE ROLE ACCOUNTADMIN; CREATE OR REPLACE ROLE HT_WRITE_QS_ROLE; GRANT ROLE HT_WRITE_QS_ROLE TO ROLE ACCOUNTADMIN; CREATE OR REPLACE WAREHOUSE HT_WRITE_QS_WH WAREHOUSE_SIZE = XSMALL AUTO_SUSPEND = 300 AUTO_RESUME = TRUE; GRANT OWNERSHIP ON WAREHOUSE HT_WRITE_QS_WH TO ROLE HT_WRITE_QS_ROLE; CREATE OR REPLACE DATABASE HT_WRITE_QS_DB; GRANT OWNERSHIP ON DATABASE HT_WRITE_QS_DB TO ROLE HT_WRITE_QS_ROLE; USE ROLE HT_WRITE_QS_ROLE; CREATE OR REPLACE SCHEMA HT_WRITE_QS_DB.DATA; USE WAREHOUSE HT_WRITE_QS_WH; USE DATABASE HT_WRITE_QS_DB; USE SCHEMA DATA;
Create the Orders Table
CREATE OR REPLACE HYBRID TABLE orders ( order_id NUMBER NOT NULL, customer_id NUMBER NOT NULL, status VARCHAR(20) NOT NULL DEFAULT 'PENDING', region VARCHAR(10) NOT NULL, created_at TIMESTAMP_NTZ NOT NULL, total_amount NUMBER(12,2) NOT NULL, PRIMARY KEY (order_id), INDEX idx_orders_customer (customer_id), INDEX idx_orders_status_region (status, region) ) AS SELECT SEQ4(), UNIFORM(1, 10000, RANDOM())::NUMBER, ARRAY_CONSTRUCT('PENDING','SHIPPED','DELIVERED','CANCELLED')[UNIFORM(0,3,RANDOM())]::VARCHAR, ARRAY_CONSTRUCT('US-EAST','US-WEST','EU','APAC')[UNIFORM(0,3,RANDOM())]::VARCHAR, DATEADD(SECOND, UNIFORM(0,7776000,RANDOM()), DATEADD(DAY,-90,CURRENT_TIMESTAMP()))::TIMESTAMP_NTZ, ROUND(UNIFORM(5.00,2500.00,RANDOM()),2) FROM TABLE(GENERATOR(ROWCOUNT => 100000));
Step 1: Why Plan Compilation Matters
Every SQL statement Snowflake executes goes through two phases: compilation (parsing, optimization, plan generation) and execution (running the plan against storage). For standard analytical queries that run for seconds to minutes, compilation time is negligible. For Hybrid Table OLTP queries that target single-digit milliseconds, compilation can be 50% or more of total latency.
Measure Compilation vs Execution
SET next_id = (SELECT MAX(order_id) + 1 FROM orders); INSERT INTO orders (order_id, customer_id, status, region, created_at, total_amount) VALUES ($next_id, 4200, 'PENDING', 'US-EAST', CURRENT_TIMESTAMP()::TIMESTAMP_NTZ, 99.99);
Open the Query Profile. In the Profile Overview, note:
- Compilation time — time spent generating the query plan
- Execution time — time spent writing to the row store
For a single-row INSERT into a Hybrid Table, you will typically see compilation taking 20-60ms and execution taking 5-20ms. The write itself is fast; the compilation dominates.
The Plan Cache
When you use bound variables (parameterized queries), Snowflake stores the compiled plan in a cache. Subsequent executions with different parameter values skip compilation entirely and reuse the cached plan. This reduces per-query overhead from 30-60ms (compile + execute) to 5-20ms (execute only).
Without bound variables (using string literals), every unique combination of values produces a new query text, which requires a new compilation. At 1,000 queries per second, that is 1,000 compilations per second — each adding unnecessary latency and consuming compilation resources.
Step 2: Bound Variables and the Parameterized Hash
How the Plan Cache Identifies Reusable Plans
Snowflake computes a query_parameterized_hash for each query by replacing literal values in predicates with parameters, then hashing the canonicalized SQL text. Queries that produce the same hash reuse the same compiled plan.
What Shares the Same Hash
Two queries share the same query_parameterized_hash if they differ only in:
- Literal values used in comparison operators (
=,!=,>=,<=) - Whitespace differences
- Comment differences
- Case-insensitive identifier, session variable, and stage name differences
Demonstrating Plan Cache Behavior
Run several INSERTs with different literal values:
INSERT INTO orders VALUES (200001, 1001, 'PENDING', 'US-EAST', CURRENT_TIMESTAMP()::TIMESTAMP_NTZ, 49.99); INSERT INTO orders VALUES (200002, 2002, 'SHIPPED', 'EU', CURRENT_TIMESTAMP()::TIMESTAMP_NTZ, 149.99); INSERT INTO orders VALUES (200003, 3003, 'DELIVERED', 'APAC', CURRENT_TIMESTAMP()::TIMESTAMP_NTZ, 299.99);
Now check the parameterized hash — all three should share the same hash because only the literal values differ:
SELECT query_parameterized_hash, query_text, compilation_time, execution_time, total_elapsed_time FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY( DATEADD('minutes', -5, CURRENT_TIMESTAMP()), CURRENT_TIMESTAMP() )) WHERE query_text LIKE 'INSERT INTO orders VALUES%' ORDER BY start_time DESC LIMIT 5;
All three INSERTs should show the same query_parameterized_hash. The first one has compilation time (plan was generated); subsequent ones may show reduced compilation (plan reused from cache).
What Breaks the Hash
These changes produce a DIFFERENT parameterized hash and force recompilation:
-- Different table reference (schema-qualified vs unqualified) INSERT INTO orders VALUES (200004, 4004, 'PENDING', 'US-WEST', CURRENT_TIMESTAMP()::TIMESTAMP_NTZ, 59.99); INSERT INTO HT_WRITE_QS_DB.DATA.orders VALUES (200005, 5005, 'PENDING', 'EU', CURRENT_TIMESTAMP()::TIMESTAMP_NTZ, 79.99);
SELECT query_parameterized_hash, query_text FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY( DATEADD('minutes', -5, CURRENT_TIMESTAMP()), CURRENT_TIMESTAMP() )) WHERE query_text LIKE 'INSERT INTO%orders VALUES%' ORDER BY start_time DESC LIMIT 5;
The schema-qualified INSERT produces a different hash from the unqualified one. Your application now has two plan cache entries for what is logically the same operation. Each one compiles independently, reducing your effective cache hit rate.
Rule: Pick one naming convention (fully-qualified or unqualified) and use it consistently across all application queries. Mixing them splits your plan cache.
Step 3: Schema Qualification Consistency
This is one of the most common silent performance issues in Hybrid Table workloads. It produces no errors, no warnings — just doubled compilation overhead that is invisible unless you look at parameterized hashes.
The Problem
If your application has two code paths that write to the same table:
-- Path A (e.g., from a microservice using USE DATABASE/SCHEMA context) INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?); -- Path B (e.g., from a stored procedure or different service using fully-qualified names) INSERT INTO HT_WRITE_QS_DB.DATA.orders VALUES (?, ?, ?, ?, ?, ?);
These produce different parameterized hashes. The plan cache maintains two entries, each warming up independently. Your cache hit rate is cut in half.
How to Detect It
SELECT query_parameterized_hash, COUNT(*) AS execution_count, ANY_VALUE(query_text) AS sample_query, AVG(compilation_time) AS avg_compile_ms, AVG(total_elapsed_time) AS avg_total_ms FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY( DATEADD('hours', -1, CURRENT_TIMESTAMP()), CURRENT_TIMESTAMP() )) WHERE query_text ILIKE '%INSERT INTO%orders%' GROUP BY query_parameterized_hash ORDER BY execution_count DESC;
If you see multiple hashes for what is logically the same INSERT statement, check the sample_query column — the difference is usually schema qualification.
The Fix
Standardize on one convention. For applications that set session context (USE DATABASE, USE SCHEMA) at connection time, unqualified names are simpler:
-- At connection init (once) USE DATABASE HT_WRITE_QS_DB; USE SCHEMA DATA; -- All subsequent queries (unqualified, same hash every time) INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?);
Step 4: Stored Procedure Overhead
Snowflake documentation explicitly states: executing with AUTOCOMMIT enabled or multi-statement transactions offers better performance than calling a stored procedure.
Why Stored Procedures Add Overhead
A stored procedure introduces:
- CALL compilation — the SP call itself must be compiled
- Context switching — execution transfers from Cloud Services to the XP layer for the SP runtime
- Child statement compilation — each SQL statement inside the SP is compiled separately
- Round-trip overhead — results pass back through the SP runtime to the caller
For a simple single-INSERT stored procedure, this overhead can be 50-100ms on top of the INSERT's own execution time.
Demonstration
Create a simple SP wrapper:
CREATE OR REPLACE PROCEDURE insert_order( p_order_id NUMBER, p_customer_id NUMBER, p_status VARCHAR, p_region VARCHAR, p_amount NUMBER ) RETURNS VARCHAR LANGUAGE SQL AS BEGIN INSERT INTO orders (order_id, customer_id, status, region, created_at, total_amount) VALUES (:p_order_id, :p_customer_id, :p_status, :p_region, CURRENT_TIMESTAMP()::TIMESTAMP_NTZ, :p_amount); RETURN 'OK'; END;
Compare the two approaches:
-- Direct INSERT (faster) SET direct_id = (SELECT MAX(order_id) + 1 FROM orders); INSERT INTO orders (order_id, customer_id, status, region, created_at, total_amount) VALUES ($direct_id, 7777, 'PENDING', 'US-EAST', CURRENT_TIMESTAMP()::TIMESTAMP_NTZ, 199.99); -- SP wrapper (slower) SET sp_id = (SELECT MAX(order_id) + 2 FROM orders); CALL insert_order($sp_id, 8888, 'PENDING', 'EU', 299.99);
Compare in query history:
SELECT query_text, total_elapsed_time, compilation_time, execution_time FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY( DATEADD('minutes', -5, CURRENT_TIMESTAMP()), CURRENT_TIMESTAMP() )) WHERE query_text LIKE 'INSERT INTO orders%' OR query_text LIKE 'CALL insert_order%' ORDER BY start_time DESC LIMIT 5;
The CALL will show higher total_elapsed_time than the direct INSERT for the same logical operation.
When Stored Procedures Are Appropriate
- Complex business logic that cannot be expressed in a single SQL statement
- Error handling and retry logic
- Multi-step workflows that need procedural flow control
For simple DML wrappers (single INSERT, single UPDATE), the SP adds overhead with no benefit. Use direct parameterized SQL instead.
The Performance Hierarchy
From best to worst for Hybrid Table single-statement DML:
- Direct DML with AUTOCOMMIT=TRUE (default) — lowest overhead, plan cache works optimally
- Multi-statement transaction (explicit BEGIN/COMMIT) — slight overhead from transaction management, useful for batching related writes
- Stored procedure — highest overhead, use only when procedural logic is required
Step 5: COPY INTO Behavior on Hybrid Tables
COPY INTO is supported for Hybrid Tables with specific limitations:
Key Behaviors
ON_ERROR: OnlyABORT_STATEMENTis supported.SKIP_FILEandCONTINUEare not available.- Optimized bulk load: COPY INTO an empty Hybrid Table uses the fast path (same as CTAS and INSERT INTO SELECT on empty tables). Loading into a non-empty table uses standard INSERT performance.
- Approximate throughput on non-empty tables: ~1 million rows per minute (varies with row width and index count)
When to Use COPY INTO vs INSERT
| Scenario | Recommended Method |
|---|---|
| Initial table load (empty table) | CTAS (simplest) or COPY INTO (from stage) |
| Batch load from external stage (empty table) | COPY INTO — benefits from optimized path |
| Batch load into populated table | COPY INTO — works, but no fast path |
| Application real-time writes | Direct INSERT with bound variables |
| Kafka/streaming micro-batches | Batch INSERT via driver (see Connectors guide) |
Verify the Optimized Path Was Used
After a COPY INTO or INSERT INTO SELECT on an empty table, check the Query Profile statistics. If the optimized path was used, the statistics will show "Number of rows bulk loaded" rather than "Number of rows inserted."
Demonstration
-- Create a stage and export sample data as CSV CREATE OR REPLACE FILE FORMAT csv_fmt TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"'; CREATE OR REPLACE STAGE orders_stage; COPY INTO @orders_stage/orders_export FROM (SELECT order_id, customer_id, status, region, created_at, total_amount FROM orders LIMIT 10000) FILE_FORMAT = csv_fmt HEADER = TRUE; -- Create an empty HT and COPY INTO it (uses optimized bulk load path) CREATE OR REPLACE HYBRID TABLE orders_copy ( order_id NUMBER NOT NULL, customer_id NUMBER NOT NULL, status VARCHAR(20) NOT NULL, region VARCHAR(10) NOT NULL, created_at TIMESTAMP_NTZ NOT NULL, total_amount NUMBER(12,2) NOT NULL, PRIMARY KEY (order_id) ); CREATE OR REPLACE FILE FORMAT csv_load TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"' SKIP_HEADER = 1; COPY INTO orders_copy FROM @orders_stage/orders_export FILE_FORMAT = csv_load; SELECT COUNT(*) FROM orders_copy; -- Expected: 10000
Check the Query Profile for "Number of rows bulk loaded" in the statistics to confirm the fast path was used.
Step 6: DML Anti-Patterns
Anti-Pattern 1: String Concatenation Instead of Bound Variables
-- BAD: Every unique value creates a new parameterized hash -- (In application code, this looks like: f"INSERT INTO orders VALUES ({id}, ...)") INSERT INTO orders VALUES (300001, 1001, 'PENDING', 'US-EAST', '2026-06-16'::TIMESTAMP_NTZ, 49.99); INSERT INTO orders VALUES (300002, 1001, 'PENDING', 'US-EAST', '2026-06-16'::TIMESTAMP_NTZ, 49.99);
Both share the same hash in this case because the literals are in comparable positions. But if non-comparable positions change (e.g., table name constructed dynamically), each execution compiles separately.
-- GOOD: Use session variables or driver-level bound parameters SET v_id = 300003; SET v_cust = 1001; INSERT INTO orders VALUES ($v_id, $v_cust, 'PENDING', 'US-EAST', CURRENT_TIMESTAMP()::TIMESTAMP_NTZ, 49.99);
In application code, always use prepared statements (see the Connectors guide).
Anti-Pattern 2: CTAS Repeated Against a Populated Table
-- BAD: CREATE OR REPLACE on a non-empty HT re-creates indexes from scratch -- every execution. Appropriate for initial load, not for repeated refresh. CREATE OR REPLACE HYBRID TABLE orders (...) AS SELECT * FROM source_table;
If you need to periodically refresh a Hybrid Table from a source, use incremental patterns:
- MERGE for upsert semantics
- INSERT for new rows + DELETE for removals
- CTAS+SWAP only if you can tolerate the post-creation compaction window (see Streaming Patterns guide, Step 3)
Anti-Pattern 3: Single-Row INSERTs in a Loop
-- BAD: 1000 round-trips, 1000 compilations -- (In application code: for row in rows: cursor.execute("INSERT ...", row))
Even with bound variables, each execution is a separate network round-trip. Use batch APIs instead:
- Python:
cursor.executemany(sql, list_of_tuples) - JDBC:
stmt.addBatch()+stmt.executeBatch() - Node.js:
binds: [[row1], [row2], ...]
Recommended batch size: 500-1,000 rows per call.
Anti-Pattern 4: BI Tool Running Full-Table CTAS on a HT
-- BAD: BI materialization running GROUP BY directly on HT CREATE OR REPLACE TABLE dashboard_data AS SELECT region, status, COUNT(*), SUM(total_amount) FROM orders -- This is the Hybrid Table GROUP BY region, status;
This scans the entire HT via COLUMN_BASED mode on every materialization cycle. The fix: snapshot HT to a standard table first, then run analytics on the standard table. See Analytics Patterns for Hybrid Tables.
Anti-Pattern 5: Inconsistent Schema Qualification (Covered in Step 3)
Mixing INSERT INTO orders and INSERT INTO db.schema.orders splits the plan cache. Pick one and stick with it.
Step 7: Measuring Write Performance
Use AGGREGATE_QUERY_HISTORY for High-Throughput Workloads
Sub-second HT queries may not appear in QUERY_HISTORY. Use AGGREGATE_QUERY_HISTORY for accurate measurement:
SELECT query_parameterized_hash, ANY_VALUE(query_text) AS sample_query, SUM(calls) AS total_executions, AVG(compilation_time:"avg"::FLOAT) AS avg_compile_ms, AVG(execution_time:"avg"::FLOAT) AS avg_exec_ms, AVG(total_elapsed_time:"avg"::FLOAT) AS avg_total_ms, MAX(total_elapsed_time:"p99"::FLOAT) AS p99_total_ms FROM SNOWFLAKE.ACCOUNT_USAGE.AGGREGATE_QUERY_HISTORY WHERE interval_start_time > DATEADD(HOUR, -1, CURRENT_TIMESTAMP()) AND warehouse_name = 'HT_WRITE_QS_WH' AND query_text ILIKE '%INSERT%' GROUP BY query_parameterized_hash ORDER BY total_executions DESC;
Key Metrics to Track
| Metric | Healthy | Investigate |
|---|---|---|
| avg_compile_ms / avg_total_ms | < 30% | > 50% (plan cache not warming) |
| Distinct parameterized hashes for same logical operation | 1 | > 1 (schema qualification issue) |
| P99 / P50 ratio | < 5x | > 10x (compaction or throttling interference) |
hybrid_table_requests_throttled_count | 0 | > 0 (too many concurrent requests) |
Quick Diagnostic
If writes are slower than expected:
- Check
compilation_timevsexecution_timein query profile — if compilation dominates, it's a plan cache problem - Check
query_parameterized_hashcardinality — multiple hashes for the same operation means cache splitting - Check for SP wrapper overhead — compare CALL latency vs child INSERT latency
- Check for throttling —
hybrid_table_requests_throttled_count > 0in AGGREGATE_QUERY_HISTORY
Cleanup
USE ROLE ACCOUNTADMIN; DROP DATABASE IF EXISTS HT_WRITE_QS_DB; DROP WAREHOUSE IF EXISTS HT_WRITE_QS_WH; DROP ROLE IF EXISTS HT_WRITE_QS_ROLE;
Conclusion and Resources
You can now:
- Use bound variables to achieve plan cache reuse and eliminate unnecessary compilation
- Detect plan cache splits caused by schema qualification inconsistency
- Avoid stored procedure overhead for simple DML operations
- Use COPY INTO correctly (understanding ON_ERROR and fast-path limitations)
- Measure write performance using
query_parameterized_hashand AGGREGATE_QUERY_HISTORY - Diagnose whether compilation or execution is the latency bottleneck
Write Optimization Checklist
| Check | Action |
|---|---|
| Using bound variables? | All application queries should use prepared statements |
| One parameterized hash per logical operation? | Standardize schema qualification |
| SP wrapping simple DML? | Remove SP, use direct INSERT |
| Compilation > 50% of total latency? | Plan cache not warming — check hash cardinality |
| Loading into non-empty HT? | No fast path available — ~1M rows/min throughput |
| BI tool scanning HT directly? | Snapshot to standard table first |
Related Resources
FAQ
Q: How do I know if my plan cache is working?
Check compilation_time in query profiles. After the first execution, subsequent executions of the same parameterized query should show near-zero compilation time if the plan cache is active. You can also check AGGREGATE_QUERY_HISTORY — if compilation_time:"avg" is consistently high across thousands of executions, the cache is not being used.
Q: Should I worry about plan cache eviction?
The plan cache has a finite size. Extremely diverse workloads (thousands of distinct query shapes) may experience eviction. For typical OLTP workloads with a small number of distinct INSERT/SELECT/UPDATE patterns, eviction is unlikely. Reducing the number of distinct parameterized hashes (by fixing schema qualification issues) reduces eviction pressure.
Q: Can I force a plan cache clear?
No public mechanism exists to explicitly clear the plan cache. Suspending and resuming the warehouse will clear warehouse-level caches, but the plan cache (managed at the Cloud Services layer) has its own lifecycle.
Q: Is there a plan cache hit rate metric?
No public metric exposes plan cache hit rate directly. The closest proxy is monitoring compilation_time across repeated executions of the same parameterized hash — consistently low compilation time indicates the cache is working.
Q: Why does COPY INTO only support ON_ERROR = ABORT_STATEMENT for Hybrid Tables?
Hybrid Table writes are transactional and enforce constraints (PK uniqueness, FK integrity, NOT NULL). Partial-file loading (SKIP_FILE, CONTINUE) would require rolling back some rows while keeping others, which conflicts with the row-level constraint enforcement model. The entire COPY operation must succeed or fail atomically.
Q: What batch size should I use for INSERT?
There is no single answer — it depends on row width and index count. A practical starting point is 500-1,000 rows per batch call. Profile your specific workload: too small (10-50 rows) wastes round-trips; too large (10,000+) risks hitting query size limits or holding transactions too long.
This content is provided as is, and is not maintained on an ongoing basis. It may be out of date with current Snowflake instances