Cortex Agents: The Platform Powering Snowflake Intelligence and Enterprise AI Agents

Snowflake Intelligence is redefining how business users work with data—turning insights into decisions and actions across the enterprise. Powering this experience are Cortex Agents—Snowflake’s platform for building and governing enterprise AI agents. Cortex Agents orchestrate multi-step work across data, workflows and external systems. Teams can also use them to directly build and scale custom agents. 

Enterprise teams are moving from prototyping AI agents to pushing them to production. Now teams want to measure agent efficiency and accuracy across business functions, so they can move from experimentation to driving real business outcomes, all within a trusted, governed environment.

Agent developers need a platform that lets them build reliably, iterate quickly and ship to production with confidence. Account admins need scalable guardrails: spending controls, data isolation and governance that work across every team.

Cortex Agents support capabilities that span the full lifecycle of enterprise agent development:

  • Build agents that can connect to any external tool via MCP and run code in a sandbox within the Snowflake perimeter.
  • Scale them across thousands of users while enforcing per-tenant data isolation using familiar Snowflake row access policies. 
  • Govern with those same Snowflake policies and manage costs through comprehensive budgets.
  • Iterate with an evaluation framework to continuously improve quality. 

Together, they represent the infrastructure enterprises need to move agents from pilots to production at scale, with a managed solution that runs inside Snowflake's security and governance perimeter, the same foundation that powers Snowflake Intelligence.

“With Snowflake Intelligence, our teams across more than 1,600 locations can use natural language to better understand operational performance and access real-time insights without relying on analysts. This is accelerating decision-making and creating stronger alignment across the business, grounded in a single source of governed data. Looking ahead, Cortex Code is helping us build and scale AI agents to accelerate sales growth and improve fleet availability, advancing how we operate every day.”

Tony Leopold
Chief Technology and Strategy Officer, United Rentals

Build: Connect to everything, run anything

Thousands of users rely on AI to query their data daily. Taking action (creating a Jira ticket, updating a Salesforce record, running an analysis or generating a chart) is the harder step: it requires connecting to external systems, executing code and turning domain knowledge into repeatable workflows. Until now, this meant writing custom tool integrations, standing up infrastructure and maintaining it all yourself. The integration burden pushed many teams to defer production deployments. Snowflake Intelligence now natively brings together MCP connectors, code execution and skills, making it easy to build agents that answer questions and take real action across enterprise systems.

MCP connectors: A standard for external data integration

Model Context Protocol (MCP) has emerged as the open standard for connecting AI agents to external tools and services. Cortex Agents now support MCP connectors (generally available soon) — a native integration layer that lets you connect your agent to any MCP-compatible server, including Atlassian (Jira and Confluence), GitHub, Salesforce, Google Workspace and Slack, with minimal configuration.

Setup follows a standardized, straightforward pattern. A sales ops team can connect its Salesforce and Jira instances quickly and work through the Snowflake Intelligence interface, powered by Cortex Agents, to track pipeline, upload customer reports and create tickets automatically.

The same governance model that applies to Snowflake objects (roles, grants, audit logging) also applies to external MCP servers, which simplifies management.
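To make the connector idea concrete: MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method. The sketch below only builds the kind of request an MCP client would send. The tool name `create_issue` and its arguments are hypothetical, and with Cortex Agents the platform handles this exchange, so this is an illustration of the protocol shape rather than something you would need to write.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC requires an id per request.
_request_ids = count(1)

def build_tool_call(tool_name, arguments):
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(
    "create_issue",
    {"project": "SALES", "summary": "Follow up on renewal"},
)
print(json.dumps(request, indent=2))
```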

Fig 1: MCP Connectors: Seamless Integration with Enterprise Tools

Agent skills: Modular, reusable task packages

Agents built solely for data retrieval can only answer questions. Agent skills (generally available soon) allow agents to perform repeatable, multistep tasks — running forecasts, generating reports, executing workflows — using modular packages that users define and deploy once.

Enterprises can now codify the expertise spread across the organization into skills that can be reused by multiple teams. For example, a forecasting skill built by the analytics team can be used by agents in operations, sales and marketing without duplicating code.

Agents equipped with the right skills can execute those tasks on demand — freeing up employees to focus on higher-value work. As more skills are contributed and shared across the organization, the agent becomes progressively more capable without additional development effort, compounding productivity gains across teams.

Fig 2: Agent Skills: Reusable Workflows Across Teams

“Snowflake has become a core part of how we’re applying AI across our operations. With Snowflake Intelligence, our teams can analyze manufacturing performance, surface insights faster, and even anticipate equipment and process issues before they happen. We’ve already deployed dozens of AI agents across manufacturing, quality, supply chain, and finance, giving teams faster access to trusted data and critical knowledge. This is helping us improve efficiency and accelerate insights enabling faster actions on the factory floor. It’s a meaningful step forward in how we operate and scale as a business.”

Priya Almelkar
CIO, Wolfspeed

Code Execution Tool: Sandboxed Python in every agent

Some tasks require code that doesn't exist yet and has to be generated and run as part of the planned task. The Code Execution Tool (public preview soon) gives agents a sandboxed Python environment to generate and run code as part of a conversation. Developers can use it to analyze data, solve problems and generate documents in formats such as PDF.

The sandbox enforces session-level isolation, so agents can't access data beyond what's passed into the current conversation. The agent decides when code execution is the right approach and invokes it accordingly. If a user asks for a revenue trend chart, the agent generates the Python, runs it and returns the visualization — without any additional configuration.
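As a rough illustration of the revenue trend example, the snippet below is the sort of self-contained Python an agent might generate and run in the sandbox. The revenue figures are made up, and a plain-text bar chart stands in for a real visualization so the sketch needs only the standard library; what the Code Execution Tool actually generates will differ.

```python
# Hypothetical monthly revenue rows passed into the conversation.
MONTHLY_REVENUE = [
    ("2024-01", 120.0),
    ("2024-02", 135.0),
    ("2024-03", 150.0),
    ("2024-04", 165.0),
]

def linear_trend(rows):
    """Least-squares slope and intercept of revenue against month index."""
    n = len(rows)
    xs = range(n)
    ys = [amount for _, amount in rows]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def text_chart(rows, width=40):
    """Render one bar per month, scaled so the largest value fills `width`."""
    peak = max(amount for _, amount in rows)
    return "\n".join(
        f"{month} {'#' * round(width * amount / peak)} {amount:.1f}"
        for month, amount in rows
    )

slope, intercept = linear_trend(MONTHLY_REVENUE)
print(text_chart(MONTHLY_REVENUE))
print(f"Trend: {slope:+.1f} per month")
```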

For more advanced use cases, the sandbox can be extended. Additional Python packages are available through the Snowflake Artifact Repository, which hosts curated packages approved for use within Snowflake. External network access can be granted via network rules and external access integrations, enabling the agent to reach APIs, pull data from external sources and push results to downstream systems.

When skills reference Python scripts, they rely on the Code Execution Tool to run them — the two capabilities are designed to work together. An organization can build a library of skills that execute code, and the Code Execution Tool handles the runtime.

Fig 3: Code Execution: Secure, On-Demand Python in Every Agent

Scale: Deploy across teams without compromise

A single successful pilot agent often creates pressure to expand: Can we deploy this to all 5,000 sales reps? Can we share it across multiple sales regions? The answers usually expose a gap between what works for one team and what works at scale. Multi-tenant deployments require data isolation between groups. Rolling out updates requires testing without disrupting production. Both problems have historically required weeks of custom engineering.

Multi-tenancy: One agent, many tenants, isolated data

Multi-tenancy (generally available) enables a single Cortex Agent to serve multiple tenants — different teams, regions or customers — while enforcing strict data isolation between them without deploying separate agent instances.

The model uses session attributes and row access policies. When calling the agent, the application passes tenant-specific values that are persisted in the session. Before the agent runs any SQL, those values are set as session attributes. A row access policy on your table references those attributes to filter rows, so a regional sales agent can only see data for the region the calling application specified.

Snowflake supports immutable session attributes, meaning tenant context is set once and cannot be modified by any generated SQL, code execution or tool invocation during the session. Snowflake strongly recommends using immutable session attributes for row access policies so that even if an adversarial query attempts to modify the tenant context, the isolation requirement holds.
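The isolation logic can be modeled in a few lines of plain Python. This is a conceptual sketch, not Snowflake syntax: `open_session`, `row_access_policy` and the sample rows are hypothetical names and data, and a read-only mapping stands in for immutable session attributes.

```python
from types import MappingProxyType

# Hypothetical shared table holding rows for several tenants.
ORDERS = [
    {"region": "EMEA", "account": "Acme", "amount": 1200},
    {"region": "APAC", "account": "Globex", "amount": 800},
    {"region": "EMEA", "account": "Initech", "amount": 450},
]

def open_session(tenant_attributes):
    """Freeze the tenant context so later code cannot modify it."""
    return MappingProxyType(dict(tenant_attributes))

def row_access_policy(row, session):
    """Return True only for rows that belong to the session's tenant."""
    return row["region"] == session["region"]

def run_query(rows, session):
    """Apply the policy to every row before anything is returned."""
    return [row for row in rows if row_access_policy(row, session)]

session = open_session({"region": "EMEA"})
visible = run_query(ORDERS, session)
print([row["account"] for row in visible])

# An attempt to switch tenants mid-session fails instead of leaking data.
try:
    session["region"] = "APAC"
except TypeError as error:
    print(f"Tenant context is read-only: {error}")
```

Because the filter reads the frozen context rather than anything the generated query supplies, the same check applies no matter which tool or statement triggered the read.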

Agent versioning: Lifecycle management for production

If a new application configuration breaks something in production, the question isn't, "What went wrong?" — it's "How quickly can I roll back?" Agent versioning (generally available) solves this by introducing a commit-based lifecycle model that separates development from production.

Every agent has a live version, a mutable working copy for development, and can have any number of named versions, which are immutable snapshots created by committing the live version. Each commit creates a system-assigned identifier (VERSION$1, VERSION$2 and so on). Named versions cannot be modified; their immutability is the foundation of reliable deployments.

Aliases provide human-readable routing labels — production, staging, canary — that you assign to named versions. Reassigning an alias from one version to another redirects all traffic without any change to the calling application. Promoting a new version is a one-line command; rolling back is the same command pointing to the previous version.

Teams managing agent configurations in Git can create named versions by importing directly from a Git-connected stage, bypassing the live version entirely. The result is an agent deployment model that engineers already know: Build in development, test in staging, promote to production and roll back if needed.
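The lifecycle above can be pictured as a small in-memory model. The class and method names here are hypothetical and do not correspond to Snowflake commands; the sketch only shows how a mutable live copy, immutable snapshots and alias routing fit together.

```python
import copy

class AgentVersions:
    """Toy model: one mutable live config, immutable named snapshots, aliases."""

    def __init__(self, live_config):
        self.live = dict(live_config)
        self._named = {}
        self._aliases = {}

    def commit(self):
        """Snapshot the live version under the next system-style identifier."""
        name = f"VERSION${len(self._named) + 1}"
        self._named[name] = copy.deepcopy(self.live)
        return name

    def set_alias(self, alias, version):
        """Point an alias such as 'production' at an existing named version."""
        if version not in self._named:
            raise KeyError(f"unknown version: {version}")
        self._aliases[alias] = version

    def resolve(self, alias):
        """Return a copy of the config that traffic for this alias receives."""
        return copy.deepcopy(self._named[self._aliases[alias]])

agent = AgentVersions({"instructions": "Answer sales questions."})
v1 = agent.commit()
agent.live["instructions"] = "Answer sales and finance questions."
v2 = agent.commit()

agent.set_alias("production", v2)   # promote
agent.set_alias("production", v1)   # roll back with the same call
print(v1, v2, agent.resolve("production"))
```

Callers only ever reference the alias, which is why promotion and rollback need no change on the application side.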

Fig 4: Agent Versioning: Safe Deployments with Instant Rollback

Govern: Control costs at every level

AI agent usage translates directly into credits consumed. Enterprises that deploy agents broadly need answers to questions that aren't yet standard in most platforms: How much is the finance team spending on the shared analytics agent? What happens when a team exhausts its monthly budget: does access get cut off automatically? Can I give the sales team a limit independent of engineering's limit on the same agent? With comprehensive cost management and usage tracking, teams can feel confident in scaling AI without worrying about budget overruns.

Resource budgets: Agent-level spending control

Resource budgets for Cortex Agents (generally available) apply a monthly credit spending limit to a specific Cortex Agent object, tracked through Snowflake's tag-based cost attribution model. You can set and track team-level and org-wide budgets, so spending is visible, controlled and aligned to usage. Snowflake tracks credit consumption for the tagged agent and performs configured actions when thresholds are reached.

The threshold action model is flexible. At 80%, you can trigger an alert to the team lead. At 100%, a stored procedure revokes access. For teams with needs that exceed the limit (an earnings season for a financial analysis agent, for example), you can configure a reinstatement action at a threshold beyond 100%, followed by a hard stop at 200%. A cycle-start action reinstates access automatically at the beginning of each new budget period.
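A minimal sketch of the threshold model, assuming a 1,000-credit monthly limit: the action names and the `actions_due` helper are hypothetical, and the reinstatement threshold is left out because the text gives no percentage for it. Real budgets are configured in Snowflake, not in application code.

```python
# Percent-of-budget thresholds mapped to hypothetical action names.
THRESHOLD_ACTIONS = {
    80: "alert_team_lead",
    100: "revoke_access",
    200: "hard_stop",
}

def actions_due(limit, consumed, threshold_actions, already_fired=()):
    """Return actions whose threshold has been crossed and not yet handled."""
    percent_used = 100 * consumed / limit
    return [
        action
        for threshold, action in sorted(threshold_actions.items())
        if percent_used >= threshold and threshold not in already_fired
    ]

MONTHLY_LIMIT = 1000  # assumed credit limit for the tagged agent

print(actions_due(MONTHLY_LIMIT, 850, THRESHOLD_ACTIONS))
print(actions_due(MONTHLY_LIMIT, 1000, THRESHOLD_ACTIONS, already_fired={80}))
```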

Budgets give engineering, finance and all other teams the same control over AI spending that they already have over compute and storage. Runaway agent usage doesn't go unnoticed until the month-end bill.

Fig 5: Resource Budgets: Control Agent Spend with Precision

Shared resource budgets: Per-team spending on shared agents

Resource budgets operate at the agent level: they apply to all usage of that agent, regardless of which team is running it. Shared resource budgets (generally available) operate at the user group level: Multiple teams can share the same agent while each team works against an independent spending limit. Snowflake tracks credit consumption for each team's tagged users against the shared resource, independently of other groups using the same instance.

When a user is subject to multiple budgets — a resource-level budget on the agent and a shared resource-level budget for their team — each budget is evaluated independently, and the user is stopped by whichever threshold is reached first. For example, a finance team that exhausts its 500-credit shared budget will lose access even if the overall agent hasn't reached its 1,000-credit resource limit. Other teams continue unaffected.
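The "whichever threshold is reached first" rule can be sketched as follows. The 500-credit finance budget and 1,000-credit agent limit come from the example above; the sales budget, the usage figures and the function names are assumptions added for illustration.

```python
# Credit limits: one agent-level budget plus per-team shared budgets.
LIMITS = {"agent": 1000, "finance": 500, "sales": 500}

# Hypothetical consumption so far this budget period.
USAGE = {"agent": 800, "finance": 500, "sales": 300}

def exhausted_budgets(budget_names, usage, limits):
    """Evaluate each applicable budget on its own and list the exhausted ones."""
    return [name for name in budget_names if usage[name] >= limits[name]]

def has_access(budget_names, usage, limits):
    """A user keeps access only while none of their budgets is exhausted."""
    return not exhausted_budgets(budget_names, usage, limits)

finance_user = ["agent", "finance"]
sales_user = ["agent", "sales"]

print(has_access(finance_user, USAGE, LIMITS), exhausted_budgets(finance_user, USAGE, LIMITS))
print(has_access(sales_user, USAGE, LIMITS))
```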

This model gives platform teams the flexibility to deploy shared infrastructure while giving individual business units control over their own consumption.

Iterate: Measure what matters, fix what breaks

LLMs are inherently nondeterministic. Tool orchestration can fail silently: the agent might hallucinate data retrieval and still produce a plausible-looking answer. Generic quality metrics don't capture domain-specific failures. Without structured measurement, teams can't tell whether an agent is getting better or worse over time. With Cortex Agent Evaluations, you can now monitor and iterate on agentic applications directly within the secure Snowflake boundary.

Cortex Agent Evaluations: From subjective review to quantified performance

Cortex Agent Evaluations are now generally available. The framework lets you define a data set of test queries and expected agent behaviors, run evaluations against your agent and receive quantified metrics for every run.

Evaluations are powered by Snowflake's Agent GPA (Goal-Plan-Action) framework — a research-backed approach that surfaces three built-in metrics:

  • Tool selection and execution accuracy: Did the agent choose the right tools at the right stages and execute them correctly with the expected inputs and outputs?
  • Answer correctness: How closely did the agent's response match the expected answer?
  • Logical consistency: Are the agent's instructions, planning and tool calls internally consistent? 

The GPA framework's performance on the TRAIL/GAIA benchmark speaks to its reliability: in benchmark testing, GPA judges captured 95% of human-annotated errors (versus 55% baseline) and localized errors to specific trace spans with 86% accuracy. This isn't a pass/fail system — it tells you exactly where in the reasoning chain the agent broke down.

The practical benefit is that teams can move from subjective human review to structured, repeatable quality measurement. Before the first production deployment, you run an evaluation to establish a baseline. After every configuration change, you run it again to confirm quality held. 
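The baseline-then-recheck loop might look like the sketch below once evaluation scores are in hand. The metric keys mirror the three built-in metrics but are hypothetical identifiers, and the 0-to-1 scores and tolerance are invented; how results are actually retrieved from Cortex Agent Evaluations is not shown.

```python
METRICS = ("tool_selection_accuracy", "answer_correctness", "logical_consistency")

def regressions(baseline, candidate, tolerance=0.02):
    """Return metrics that dropped by more than `tolerance`, with the change."""
    return {
        metric: round(candidate[metric] - baseline[metric], 4)
        for metric in METRICS
        if candidate[metric] < baseline[metric] - tolerance
    }

# Hypothetical scores from a baseline run and a run after a config change.
baseline_run = {
    "tool_selection_accuracy": 0.90,
    "answer_correctness": 0.85,
    "logical_consistency": 0.88,
}
candidate_run = {
    "tool_selection_accuracy": 0.91,
    "answer_correctness": 0.78,
    "logical_consistency": 0.87,
}

dropped = regressions(baseline_run, candidate_run)
print("Quality held" if not dropped else f"Regressions: {dropped}")
```

A check like this could gate promotion of a new agent version, so a configuration change ships only when no metric has regressed beyond the tolerance.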

Fig 6: Agent Evaluations: Measure, Diagnose, and Improve with GPA

The Future of Work, Powered by Snowflake Intelligence

Snowflake Intelligence brings a new way of working to the enterprise, where insights don’t stop at answers but drive real action across your business.

Behind that experience is a fully managed platform designed to meet every requirement for production AI: Connectivity to enterprise tools, secure code execution, multiuser scalability with data isolation, granular cost governance and built-in evaluation to improve quality.

These capabilities remove the hardest barriers to deploying AI at scale — eliminating integration complexity, ensuring governance and cost control and providing the foundation for trusted, measurable outcomes.

Snowflake Intelligence sits at the center of this system. Users ask questions, explore data and take action in one place — while the platform securely connects data, workflows and systems behind the scenes. This integration is what allows enterprises to move from experimentation to real impact, without stitching together multiple tools or compromising on governance.

Because it runs on Snowflake, every interaction is grounded in governed data, existing access controls and unified policies. Teams don’t need to rebuild governance for AI — they extend what they already trust.

At its core, Snowflake Intelligence acts as a personal work agent, one that understands your context, helps you reason through complexity and takes action on your behalf. It enables teams to move faster, collaborate more effectively and ultimately work at the speed of AI.

Snowflake Intelligence is production-ready. It turns data into decisions and decisions into action, securely and at scale, on the data platform you already rely on.

Get started

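Multi-tenancy, agent versioning, resource budgets, shared resource budgets and Cortex Agent Evaluations are generally available. MCP connectors and agent skills are expected to reach general availability soon, and the Code Execution Tool is expected to enter public preview soon.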
