Snowflake Introduces Prompt Injection and Jailbreak Prevention with Cortex AI Guardrails

As enterprises transition AI workloads from experimental pilots to full-scale production, it is becoming clear that high-quality models and clean data are only half the battle. True operational readiness requires granular control over AI behavior: defining its boundaries, auditing its outputs and defending it against adversarial manipulation.

To address this, we are introducing Cortex AI Guardrails in Horizon Catalog. This new governance and security capability gives organizations centralized, policy-driven control over AI interactions across Snowflake.

With today's launch, we are introducing advanced prompt injection and jailbreak prevention: real-time protection in Cortex Code against adversarial attempts to override system instructions or manipulate model behavior. At Snowflake, security is a core pillar, not an afterthought. We provide native, enterprise-grade defense-in-depth controls that harden every layer of your AI stack against emerging vulnerabilities.

Snowflake's layered approach

Snowflake takes a defense-in-depth approach, combining built-in baseline protection with an advanced guardrails layer for zero-day style defense. All AI workloads on Cortex Code and Snowflake Intelligence are protected by an always-on security baseline that covers indirect attack protection and semantic matching against known injection techniques, with no configuration required. Cortex AI Guardrails extend this baseline by applying advanced contextual reasoning to detect and block sophisticated, zero-day style injection and jailbreak attempts in real time, even those originating from entirely new attack vectors.
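
To illustrate how the two layers relate, here is a conceptual sketch in Python: a fast, always-on baseline check against known injection phrasings, followed by a contextual LLM check for novel attempts. This is not Snowflake's implementation; the pattern list, the `classify` callable and the `Verdict` type are hypothetical stand-ins for illustration only.

```python
# Conceptual sketch of layered guardrails; the pattern list, `classify`
# callable and `Verdict` type are hypothetical, not Snowflake APIs.
from dataclasses import dataclass
from typing import Callable, Optional

# Stand-in for the baseline's semantic matching against known techniques.
KNOWN_INJECTION_PATTERNS = [
    "ignore previous instructions",
    "disregard your system prompt",
]

@dataclass
class Verdict:
    blocked: bool
    layer: str    # which layer produced the verdict
    reason: str

def baseline_check(text: str) -> Optional[Verdict]:
    """Always-on layer: cheap matching against known injection techniques."""
    lowered = text.lower()
    for pattern in KNOWN_INJECTION_PATTERNS:
        if pattern in lowered:
            return Verdict(True, "baseline", f"matched known pattern: {pattern!r}")
    return None

def contextual_check(text: str, classify: Callable[[str], bool]) -> Optional[Verdict]:
    """Advanced layer: a model post-trained on adversarial examples reasons
    about intent, catching zero-day style attempts the baseline misses."""
    if classify(text):
        return Verdict(True, "advanced", "contextual reasoning flagged intent")
    return None

def guard(text: str, classify: Callable[[str], bool]) -> Verdict:
    """Defense in depth: fall through the layers, cheapest check first."""
    return (
        baseline_check(text)
        or contextual_check(text, classify)
        or Verdict(False, "none", "clean")
    )
```

Ordering the layers cheapest-first mirrors the design described above: most traffic is cleared by the baseline, and the heavier contextual model is only consulted when needed.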

How it works

Cortex AI Guardrails operate at the agent orchestration layer, intercepting tool responses before they reach the underlying model. The guardrail engine uses a specialized LLM post-trained on adversarial prompt injection attacks, enabling it to detect zero-day style attacks such as malicious instructions hidden in open source repositories. Guardrails scan the tool responses produced by each input prompt; when malicious intent is detected, the underlying LLM is notified, so malicious inputs are blocked while clean requests pass through untouched. All blocked prompts are recorded in their respective service's logs for full auditability. Guardrails run in parallel with the agent loop, so protection comes with no added latency or performance impact.
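
To make that flow concrete, here is a minimal, hypothetical sketch of the interception pattern in Python. It is not Snowflake's implementation: `scan` stands in for the post-trained guardrail LLM, and `guarded_tool_call` and `agent_step` are illustrative names for the orchestration hooks.

```python
# Minimal, hypothetical sketch of guardrail interception at the agent
# orchestration layer; `scan` is a placeholder for the guardrail LLM.
import asyncio
import logging

logger = logging.getLogger("guardrails.audit")  # blocked prompts get logged

async def scan(tool_output: str) -> bool:
    """Placeholder for the guardrail LLM call: returns True when the
    tool response carries a hidden malicious instruction."""
    await asyncio.sleep(0)  # the real call would be a model inference
    return "ignore previous instructions" in tool_output.lower()

async def guarded_tool_call(tool, prompt: str) -> str:
    """Run a tool and scan its response before the model ever sees it."""
    response = await tool(prompt)
    if await scan(response):
        # Record the block for auditability, then notify the underlying
        # LLM instead of silently dropping the content.
        logger.warning("blocked tool response for prompt=%r", prompt)
        return "[guardrail notice: a tool response was blocked as malicious]"
    return response

async def agent_step(calls):
    """One orchestration step: tool calls and their scans run
    concurrently, so the check adds no extra wall-clock latency."""
    return await asyncio.gather(*(guarded_tool_call(t, p) for t, p in calls))
```

Here `tool` is any async callable; in a real deployment the verdict and the audit record would come from the service's own guardrail engine and logs.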

Centralized governance in Horizon Catalog

Cortex AI Guardrails are managed through Snowflake Horizon Catalog, the universal catalog that provides unified governance capabilities for AI across all data, clouds and formats. By integrating guardrails into the same control plane that secures your tables, views and Iceberg data, platform teams can treat guardrails as a first-class part of their broader security and governance strategy rather than a separate, bolt-on middleware layer. This centralization makes it easier to prove compliance, standardizes risk posture across business units and allows new protections to be rolled out without rewriting application code. Because these guardrails are native to Horizon Catalog rather than tied to a specific model or gateway, your security and governance posture scales automatically as your AI footprint grows. This enables "define once, enforce everywhere" protection that stays consistent as you build for the future.

Getting started

Cortex AI Guardrails are available today for customers on Enterprise edition or above. Account administrators can enable advanced prompt injection guardrails in minutes with a simple account-level configuration; no infrastructure changes or custom middleware are required. To get started, visit the Cortex AI Guardrails documentation.
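
For teams that script their rollouts, the snippet below sketches what an account-level switch could look like via the Snowflake Python connector. The parameter name `CORTEX_GUARDRAILS_ENABLED` is a placeholder for illustration only; consult the Cortex AI Guardrails documentation for the actual setting and its accepted values.

```python
# Hypothetical rollout sketch using the Snowflake Python connector.
# CORTEX_GUARDRAILS_ENABLED is a placeholder name; check the Cortex AI
# Guardrails documentation for the actual account-level parameter.
import snowflake.connector

conn = snowflake.connector.connect(
    account="<account_identifier>",
    user="<admin_user>",
    role="ACCOUNTADMIN",  # account administrators enable the feature
    authenticator="externalbrowser",
)
try:
    with conn.cursor() as cur:
        cur.execute("ALTER ACCOUNT SET CORTEX_GUARDRAILS_ENABLED = TRUE")
finally:
    conn.close()
```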
