The future of AI, revealed live

Stream Summit keynotes free June 1–2.

Snowflake for Developers/Guides/Build an End-to-End Application Using CoCo on Snowflake
Quickstart

Build an End-to-End Application Using CoCo on Snowflake

Dash Desai

Overview

In this hands-on lab, you'll build a complete AI-powered retail analytics platform entirely within Snowflake — no external infrastructure required. Using Snowflake CoCo as your AI-assisted development environment, you'll work through the full data lifecycle: stream real-time orders via Snowpipe Streaming, MERGE them into production tables with Gen2 Warehouses, transform them through a 3-tier Dynamic Tables pipeline, and serve them with Interactive Tables for low-latency point lookups.

You'll build analytical models with dbt, monitor data quality with Data Metric Functions, and create custom CoCo skills for reusable workflows. Tie it all together with Snowflake CoWork — a conversational AI interface where a Cortex Agent orchestrates Cortex Analyst and Agentic Search to answer "what happened" and "why" from both structured and unstructured data. Finally, evaluate your agent with ground-truth datasets, implement row-level security, and expose your agent as a managed MCP server for external AI clients.

What You'll Learn

  • Accelerate development with Snowflake CoCo (AI-assisted SQL, deployment, and data exploration)
  • Stream real-time data with Snowpipe Streaming and transform with Dynamic Tables
  • Serve low-latency queries with Interactive Tables and Gen2 Warehouses
  • Build analytical models with dbt
  • Monitor data quality automatically with Data Metric Functions
  • Create and query managed Iceberg V3 tables (deletion vectors, row lineage) (optional)
  • Create custom CoCo skills for reusable team workflows
  • Build a Cortex Agent with Cortex Analyst (semantic view + verified queries) and Agentic Search (multi-index Cortex Search)
  • Evaluate agent quality with ground-truth datasets and LLM judges
  • Expose agents as managed MCP servers for external AI clients
  • Implement transparent row-level security with Row Access Policies

What You'll Build

A production-grade AI-powered retail analytics platform on Snowflake — from raw data to conversational AI insights, entirely within a single platform. You'll create dynamic transformation pipelines, interactive low-latency tables, dbt analytical models, a Cortex Agent that answers questions across structured and unstructured data, row-level security that works transparently through AI, and an MCP server that exposes your agent to external clients.

Prerequisites

  • Access to a Snowflake account
  • Python 3.8+ installed locally
  • Git installed locally
  • Basic familiarity with SQL and command-line tools

Setup

Install Snowflake CLI

The Snowflake CLI (snow) lets you run SQL, deploy apps, and manage Snowflake objects from your terminal.

macOS (using Homebrew):

If you don't have Homebrew installed, first install it by opening Terminal and running:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Then install Snowflake CLI:

brew install snowflake-cli

Windows:

pip install snowflake-cli

Linux:

pip install snowflake-cli

Verify the installation:

snow --version

You should see output like Snowflake CLI version: 3.x.x.

Install Snowflake CoCo

Snowflake CoCo is an AI-powered coding assistant that runs in your terminal. It helps you write SQL, build pipelines, deploy apps, and explore your data using natural language prompts.

macOS (using Homebrew):

brew install cortex

Windows / Linux:

pip install cortex

Verify the installation:

cortex --version

Configure Snowflake Connection

Before you can run any commands against Snowflake, you need to configure a connection. This tells the CLI which Snowflake account to connect to and how to authenticate.

Run the interactive connection wizard:

snow connection add

You'll be prompted for the following values (enter them one at a time):

PromptWhat to enterExample
Connection nameA short name for this connectionhol
Account identifierYour Snowflake account URL (without .snowflakecomputing.com)myorg-myaccount
UserYour Snowflake usernamejsmith
PasswordYour Snowflake password(hidden)
RoleACCOUNTADMINACCOUNTADMIN
WarehouseLeave blank (will be created by setup)(leave empty)
DatabaseLeave blank (will be created by setup)(leave empty)

Tip: Your account identifier is the part before .snowflakecomputing.com in your Snowflake URL. For example, if you log in at https://myorg-myaccount.snowflakecomputing.com, your account identifier is myorg-myaccount.

Test that your connection works:

snow connection test -c hol

You should see Status: OK.

Clone the Lab Repository

git clone https://github.com/Snowflake-Labs/automated-intelligence-dev-day-2026-hol.git
cd automated-intelligence-dev-day-2026-hol

Run Infrastructure Setup

Launch Snowflake CoCo and verify your connection:

cortex

What to expect: CoCo will start an interactive session in your terminal. You'll see your active connection, role, and warehouse displayed. You can type natural language prompts and CoCo will translate them into SQL or actions.

Then run the core infrastructure script (this takes ~10-15 minutes):

snow sql -f setup.sql -c hol

This creates the database, schemas, warehouses, tables, Dynamic Tables pipeline, Interactive Tables, Cortex Search Services, Semantic View, seed data (10M orders, 25M order items, 2M customers), and Row Access Policy.

Gen2 Warehouse: Optima Indexing

Gen2 Warehouses introduce Optima Indexing — an automatic indexing layer that prunes partitions at query time without explicit clustering keys. Gen2 warehouses learn from your query patterns and build internal indices that accelerate point lookups and filtered scans.

Demonstrate Optima Indexing in action:

Prompt CoCo:

"Run a point lookup for customer_id 5000 on the Gen2 warehouse"

Open the query profile in Snowsight to see partition pruning — only a fraction of partitions scanned despite no explicit clustering key. This is Gen2's Optima Indexing in action.

Dynamic Tables Pipeline

Dynamic Tables are declarative data pipelines — you define the target state as a SQL query and Snowflake handles incremental refresh automatically. The TARGET_LAG parameter controls freshness: set a time-based lag (e.g., 1 minute) for Tier 1 tables, or use DOWNSTREAM for tables that refresh only when their upstream sources change.

Prompt CoCo:

"Show me the Dynamic Tables pipeline status — names, target lag, last refresh time, and row counts for each tier"

CoCo displays the 3-tier pipeline:

  • Tier 1 (1-min lag): enriched_orders (10M rows), enriched_order_items (25M rows)
  • Tier 2 (DOWNSTREAM): fact_orders (25M rows)
  • Tier 3 (DOWNSTREAM): daily_business_metrics (118 rows), product_performance_metrics (4 rows)

Note: Row counts shown are for the default data_scale = '10M'. If you chose '50M' in setup.sql, expect ~50M enriched_orders, ~161M order_items, and 365 daily metrics rows.

Explore Results

Ask CoCo:

"Show me a sample of the daily business metrics — top 5 days by revenue"

Expected: Top-5 days are in September 2025 (back-to-season peak), each with ~$183M revenue and ~117K orders.

Interactive Tables

Interactive Tables are purpose-built for low-latency, high-concurrency point lookups. They maintain pre-computed results with clustering optimized for equality predicates, delivering sub-second response times for dashboard filters and application queries that would otherwise require full table scans.

Point Lookups

Run queries in Snowsight to observe sub-second latency:

USE WAREHOUSE hol_interactive_wh;
ALTER SESSION SET USE_CACHED_RESULT = FALSE;

-- Point lookup by customer ID
SELECT * FROM dash_automated_intelligence_db.interactive.customer_order_analytics
WHERE customer_id = 1;

-- Point lookup by order ID
-- Run: SELECT order_id FROM dash_automated_intelligence_db.interactive.order_lookup LIMIT 5; to get UUIDs to use
SELECT * FROM dash_automated_intelligence_db.interactive.order_lookup
WHERE order_id = '<any-order-uuid>';

Concurrency Load Test

Prompt CoCo:

"Run the interactive tables load test at interactive/load_test.py"

This fires 200 concurrent sessions (1000 queries total) against both Interactive and Standard warehouses, then compares P50/P90/P99 latencies. You should see notably lower latency and higher throughput on the Interactive warehouse.

Run the load test a second time to observe the effect of warm caches. Results may vary depending on account, region, and data scale.

Data Quality

Data Metric Functions (DMFs) let you attach automated quality checks directly to table columns. Snowflake runs them on a schedule and stores results in SNOWFLAKE.LOCAL.DATA_QUALITY_MONITORING_RESULTS. Built-in DMFs include NULL_COUNT, DUPLICATE_COUNT, UNIQUE_COUNT, and FRESHNESS — or you can write custom ones.

The setup script injected ~200 NULL values into orders.total_amount and order_items.quantity, plus ~150 NULLs into order_items.product_name. DMFs detect the first two — but there's a gap.

Discover the Gap

Prompt CoCo:

"Check the data quality monitoring results and show me which columns have NULL violations"

CoCo shows that TOTAL_AMOUNT (200 NULLs) and QUANTITY (200 NULLs) have violations — but product_name NULLs are going undetected.

"Are there any NULL values in order_items.product_name? Is that column being monitored?"

CoCo finds ~150 NULLs and reveals the DMF is mis-attached to product_category instead of product_name.

Fix the Coverage

"Fix the DMF — remove the NULL check from product_category and add it to product_name instead"

ALTER TABLE order_items DROP DATA METRIC FUNCTION SNOWFLAKE.CORE.NULL_COUNT ON (product_category);
ALTER TABLE order_items ADD DATA METRIC FUNCTION SNOWFLAKE.CORE.NULL_COUNT ON (product_name);

This demonstrates the real-world workflow: monitor, discover gaps, fix coverage.

dbt Analytics

dbt-snowflake transforms raw data into analytics-ready models using SQL SELECT statements. The dbt-snowflake adapter integrates natively with Snowflake, supporting incremental models, snapshots, and built-in testing. In this HOL, dbt creates staging views and mart tables for customer lifetime value, segmentation, product performance, and support analytics.

Prerequisites: dbt-core and dbt-snowflake must be installed (pip install dbt-snowflake). CoCo will handle this for you if not already installed.

Prompt CoCo:

"Install dbt dependencies and build all models in the dbt-analytics project"

CoCo runs dbt deps then dbt build to create all staging views and mart tables (9+ models). CoCo automatically injects your active Snowflake connection into the dbt profile — no manual configuration needed.

Expected output: 71 tests pass, 1 warning (the source_not_null_raw_orders_total_amount test detects the 200 NULLs we injected for the Data Quality exercise — this is working as designed).

Explore Results

"Show me the customer lifetime value segments — how many customers are in each value tier?"

CoCo Custom Skill

CoCo Custom Skills let you package repeatable workflows into named commands that any team member can invoke. A skill is a Markdown file (.cortex/skills/<name>/SKILL.md) that defines triggers, parameters, and step-by-step instructions CoCo follows when the skill is activated.

Create a reusable skill that automates table profiling:

Prompt CoCo:

"Create a custom CoCo skill called 'profile-table' that takes a table name, counts rows, checks for NULL columns, shows distinct value counts, and flags potential data quality issues"

CoCo creates .cortex/skills/profile-table/SKILL.md with the skill definition, triggers, and step-by-step instructions.

Note: After creating the skill, restart CoCo (type /quit then cortex) for the new skill to become active.

Test It

"$profile-table DASH_AUTOMATED_INTELLIGENCE_DB.RAW.ORDERS"

This demonstrates how teams package repeatable workflows as shareable CoCo skills.

Snowflake CoWork

Cortex Agents are multi-tool AI orchestrators that route questions to the right data source. They combine Cortex Analyst (text-to-SQL over Semantic Views) with Cortex Search (vector + keyword search over unstructured data) to answer both "what happened" and "why" from a single conversational interface.

Create the Agent

Prompt CoCo:

"Run snowflake-cowork/create_agent.sql to create the Business Insights Agent"

Test Agent Routing

Open the Snowflake CoWork interface in Snowsight: navigate to AI & ML → CoWork (or search "CoWork" in the global search bar). Select the BUSINESS_INSIGHTS_AGENT agent. Then try each question to demonstrate different tool routing:

QuestionTools Used
"Show me monthly revenue trend from June to September 2025"Cortex Analyst (text-to-SQL)
"Which month had the lowest revenue, and what do customer reviews say about that period?"Cortex Analyst + Agentic Search
"Find reviews mentioning wrong size with a rating below 3"Agentic Search (filtered)
"Why are customers returning ski boots?"Agentic Search (reviews + tickets)
"What is our total revenue and customer count by state?"Cortex Analyst (text-to-SQL)
"What are the top complaint themes in support tickets?"Agentic Search (filter + AI_AGG)
"How many reviews mention sizing issues, and which products are most affected?"Agentic Search (search + breakdown)

This is the capstone moment — the agent routes across structured data (text-to-SQL) and unstructured data (Cortex Search) to answer "what happened" and "why."

Security and Governance

Row Access Policies enforce row-level security declaratively — you define a boolean expression that determines which rows are visible to which roles, and Snowflake applies it transparently to every query (including those generated by AI agents). No application code changes required.

The Row Access Policy and WEST_COAST_MANAGER role were created by setup.sql. Demonstrate how row-level security transparently filters data based on role:

As ACCOUNTADMIN (full access):

USE ROLE ACCOUNTADMIN;
USE WAREHOUSE HOL_WH;
SELECT c.state, SUM(o.total_amount) AS total_revenue, COUNT(DISTINCT c.customer_id) AS customer_count
FROM dash_automated_intelligence_db.raw.orders o
JOIN dash_automated_intelligence_db.raw.customers c ON o.customer_id = c.customer_id
GROUP BY c.state ORDER BY total_revenue DESC;

Result: all 10 states visible.

As WEST_COAST_MANAGER (restricted):

USE ROLE WEST_COAST_MANAGER;
USE WAREHOUSE HOL_WH;
SELECT c.state, SUM(o.total_amount) AS total_revenue, COUNT(DISTINCT c.customer_id) AS customer_count
FROM dash_automated_intelligence_db.raw.orders o
JOIN dash_automated_intelligence_db.raw.customers c ON o.customer_id = c.customer_id
GROUP BY c.state ORDER BY total_revenue DESC;

Result: only CA, OR, WA appear — the Row Access Policy transparently filters data.

Key insight: Same query, same tables — different results based on who's asking. Row-level security enforces data boundaries without changing application logic.

Verify Through CoWork

Switch to the WEST_COAST_MANAGER role in Snowsight, then open CoWork and ask the agent:

"What is our total revenue and customer count by state?"

The agent returns results for only CA, OR, and WA — the Row Access Policy filters data transparently, even through AI-generated SQL.

Streamlit Dashboard

Streamlit in Snowflake lets you build and deploy interactive data applications directly within your Snowflake account — no external infrastructure needed. Apps run securely inside Snowflake, with native access to your data and governed by the same role-based access control.

Prerequisites: The dbt models must be built first (previous section). The dashboard queries DBT_ANALYTICS and DBT_STAGING tables.

Prompt CoCo:

"Deploy the Streamlit dashboard to Snowflake"

CoCo runs snow streamlit deploy from the streamlit-dashboard/ directory. Once deployed, open the app URL in Snowsight to explore:

  • Summary — Revenue KPIs, order trends, customer counts
  • Customer & Product Analytics — Lifetime value segments, product performance
  • Pipeline Health — Dynamic Tables refresh status, data freshness monitoring

Agent Evaluation

Agent Evaluation lets you measure agent quality using ground-truth datasets and LLM-as-judge scoring. You define input queries and expected answers, then Snowflake runs the agent against each question and scores responses on Answer Correctness (does the response match ground truth?) and Logical Consistency (are the reasoning steps internally coherent?).

The evaluation dataset (7 questions + ground truth) was created by setup.sql. Run the evaluation in Snowsight:

Run via Snowsight UI

  1. Switch to the ACCOUNTADMIN role in Snowsight (top-left role selector)
  2. Navigate to AI and ML > Agents > BUSINESS_INSIGHTS_AGENT > Evaluations tab
  3. Click New evaluation run, name it (e.g. hol-eval-run-1), click Next
  4. Select Create new dataset from table
  5. Under Source table, set Database and schema to DASH_AUTOMATED_INTELLIGENCE_DB.SEMANTIC, then select AGENT_EVALUATION_DATA
  6. Under New dataset location, keep DASH_AUTOMATED_INTELLIGENCE_DB.SEMANTIC
  7. Set Dataset name: hol_eval_dataset
  8. Click Next
  9. Under Define metrics, confirm Input query = INPUT_QUERY
  10. Toggle on Answer Correctness, set Expected answer = GROUND_TRUTH
  11. Toggle on Logical Consistency
  12. Click Create — evaluation starts automatically (~3 min)

Interpret Results

  • Answer Correctness — Did the agent's response match ground truth? Scored 0-1 per question.
  • Logical Consistency — Were planning steps, tool calls, and response internally consistent? (Reference-free.)
  • Per-question drill-down — Select any row to see the full thread: planning, tool invocations, response generation.

Improve Scores (Stretch)

If questions score low on logical consistency:

  1. Click a low-scoring row and view Thread details
  2. Look for vague reasoning about tool selection in the Planning step
  3. Update the agent's instructions to be more explicit
  4. Recreate the agent and re-run the evaluation

MCP Server

Snowflake MCP Servers expose your Cortex Agents, Semantic Views, and Search Services as tools discoverable via the open Model Context Protocol (MCP). Any MCP-compatible client (CoCo CLI, Claude Desktop, custom apps) can connect to the server endpoint and invoke tools programmatically — turning your Snowflake AI stack into a reusable service layer.

Expose the Business Insights Agent as a managed MCP server:

Prompt CoCo:

"Create a Snowflake-managed MCP server that exposes our Business Insights Agent, semantic view, and customer feedback search as tools"

CoCo creates the MCP server:

Note: CoCo generates tool names (like revenue-analytics, customer-feedback-search) based on your prompt. Your names may differ — what matters is that the type and identifier point to the correct objects.

CREATE MCP SERVER business_insights_mcp
  FROM SPECIFICATION $$
    tools:
      - name: "business-insights-agent"
        type: "CORTEX_AGENT_RUN"
        identifier: "DASH_AUTOMATED_INTELLIGENCE_DB.SEMANTIC.BUSINESS_INSIGHTS_AGENT"
        description: "AI agent that answers business questions using structured data and customer feedback"
        title: "Business Insights Agent"

      - name: "revenue-analytics"
        type: "CORTEX_ANALYST_MESSAGE"
        identifier: "DASH_AUTOMATED_INTELLIGENCE_DB.SEMANTIC.BUSINESS_ANALYTICS_SEMANTIC"
        description: "Text-to-SQL for revenue, orders, customers, and product metrics"
        title: "Revenue Analytics"

      - name: "customer-feedback-search"
        type: "CORTEX_SEARCH_SERVICE_QUERY"
        identifier: "DASH_AUTOMATED_INTELLIGENCE_DB.RAW.CUSTOMER_FEEDBACK_SEARCH"
        description: "Search across product reviews and support tickets"
        title: "Customer Feedback Search"
  $$;

Connect from CoCo

cortex mcp add business-insights https://<account_url>/api/v2/databases/DASH_AUTOMATED_INTELLIGENCE_DB/schemas/SEMANTIC/mcp-servers/BUSINESS-INSIGHTS-MCP --type http

Now any MCP-compatible client (CoCo, Claude Desktop, custom apps) can discover and call these tools via the standard MCP protocol.

Optional: Iceberg V3 Features

Note: This section is optional. It demonstrates Iceberg V3 capabilities (deletion vectors, default values) using CoCo-generated SQL. No other sections depend on it.

Note: CoCo may take a few attempts to generate correct SQL for Iceberg V3 features (these are newer APIs). If you see "error executing SQL," let CoCo retry — it will self-correct and the end result will work.

Create a Managed Iceberg Table

Prompt CoCo:

"Create a managed Iceberg table from RAW.ORDERS with clustering by year and month, then query it to show partition pruning"

CoCo creates the table with CATALOG='SNOWFLAKE' (no external volume needed) and demonstrates partition pruning on filtered queries.

Explore V3: Deletion Vectors

"Create an Iceberg V3 table from RAW.ORDERS (ICEBERG_VERSION=3) with merge-on-read enabled, insert 1000 rows, then update 10 of them to demonstrate deletion vectors"

CoCo creates a V3 table with ENABLE_ICEBERG_MERGE_ON_READ = TRUE, inserts data, then runs an UPDATE that uses deletion vectors instead of full file rewrites.

Explore V3: Default Values

"Add a new column 'priority' with default value 'STANDARD' to the V3 table and show that existing rows get the default without a backfill"

This demonstrates V3 schema evolution without rewriting data files.

Optional: Streaming Ingestion

Note: This section is optional. The setup.sql script already loads all 50M orders directly. This section demonstrates how you would stream data in production using the Snowpipe Streaming Python SDK.

Generate RSA Key Pair

Generate keys for Snowpipe Streaming authentication:

# Generate private key (unencrypted PEM)
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8 -nocrypt

# Generate public key
openssl rsa -in rsa_key.p8 -pubout -out rsa_key.pub

# Upload public key to your Snowflake user (replace <your-username>)
snow sql -q "ALTER USER <your-username> SET RSA_PUBLIC_KEY='$(grep -v -- '-----' rsa_key.pub | tr -d '\n')'" -c hol

# Verify
snow sql -q "DESC USER <your-username>" -c hol | grep RSA_PUBLIC_KEY_FP

Stream Data

cd snowpipe-streaming-python
pip install -r requirements.txt

# Copy and configure profile
cp profile.json.template profile.json

Edit profile.json and set your account, user, private_key (contents of rsa_key.p8), and role.

# Stream 10,000 orders
python src/automated_intelligence_streaming.py 10000

Verify Data Landed

SELECT COUNT(*) FROM dash_automated_intelligence_db.staging.orders_staging;
SELECT COUNT(*) FROM dash_automated_intelligence_db.staging.order_items_staging;

You should see 10,000 orders and ~50,000 order items in staging.

Merge into Production

Use CoCo to merge the streamed data:

"Switch to the Gen2 warehouse, check how many rows are in staging, then merge them into RAW and show me the results"

Conclusion And Resources

Congratulations! You've successfully built a complete AI-powered retail analytics platform on Snowflake — from streaming ingestion through conversational AI and MCP server exposure.

What You Learned

  • How to stream real-time data with Snowpipe Streaming and merge with Gen2 Warehouses
  • How to build incremental transformation pipelines with Dynamic Tables
  • How to serve sub-second point lookups with Interactive Tables
  • How to create and query Iceberg V3 tables with deletion vectors and default values
  • How to monitor and fix data quality gaps with Data Metric Functions
  • How to build dbt analytical models on Snowflake
  • How to create custom CoCo skills for repeatable workflows
  • How to build a Cortex Agent that routes across structured (Analyst) and unstructured (Search) data
  • How to evaluate agent quality with ground-truth datasets
  • How to implement transparent row-level security through AI agents
  • How to expose AI capabilities as managed MCP servers

Related Resources

Documentation:

Updated 2026-06-03

This content is provided as is, and is not maintained on an ongoing basis. It may be out of date with current Snowflake instances