Product and Technology

Announcing DeepSeek-R1 in Preview on Snowflake Cortex AI

We are excited to bring DeepSeek-R1 to Snowflake Cortex AI! As described by DeepSeek, this model, trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT), can achieve performance comparable to OpenAI-o1 across math, code and reasoning tasks. Based on DeepSeek’s posted benchmarking, DeepSeek-R1 tops the leaderboard among open source models and rivals the most advanced closed source models globally. Customers can now request an early preview of DeepSeek-R1 on Cortex AI.  

As part of the private preview, we will focus on providing access in line with our product principles of ease, efficiency and trust.

  • The model is available in private preview for serverless inference, in both batch (SQL function) and interactive (Python and REST API) modes. To request access during the preview, please reach out to your sales team. The model will be available only in the requested account.

  • The model is hosted in the U.S. within the Snowflake service boundary. We do not share data with the model provider. 

  • Once the model is generally available, customers can manage access to the model via role-based access control (RBAC). Account admins can restrict access by selecting the models approved per governance policies, as sketched below.
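
For illustration, here is a minimal sketch of what this kind of access management can look like with today's Cortex controls, assuming the SNOWFLAKE.CORTEX_USER database role that governs access to Cortex LLM functions; the role name data_science_role is a hypothetical example:

-- Remove the default grant so Cortex functions are not open to every user,
-- then grant access only to an approved role (run as ACCOUNTADMIN).
USE ROLE ACCOUNTADMIN;
REVOKE DATABASE ROLE SNOWFLAKE.CORTEX_USER FROM ROLE PUBLIC;
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE data_science_role;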

Snowflake Cortex AI

Snowflake Cortex AI is a suite of integrated features and services that includes fully managed LLM inference, fine-tuning, and retrieval-augmented generation (RAG) for structured and unstructured data. It enables customers to quickly analyze unstructured data alongside their structured data and to expedite the building of AI apps. Customers can access industry-leading LLMs, both open source and proprietary, and easily integrate them into their workflows and applications. Snowflake has embraced the open source ecosystem with support for multiple LLMs from Meta, Mistral and Snowflake. We believe this open access and collaboration will pave the way for expedited innovation in this space.

DeepSeek-R1

Based on DeepSeek’s GitHub post, the team applied reinforcement learning (RL) directly to the base model without relying on supervised fine-tuning (SFT) as a preliminary step. This approach allowed the model to explore chain-of-thought (CoT) reasoning for solving complex problems, resulting in the development of DeepSeek-R1-Zero. They further mention that this initial model demonstrated capabilities such as self-verification, reflection and generating long CoTs, but encountered challenges such as endless repetition, poor readability and language mixing. To address these issues, the DeepSeek team describes how they incorporated cold-start data before RL to enhance reasoning performance.

[Image: DeepSeek-R1 accuracy benchmark results, as published by DeepSeek]

The team implemented low-precision FP8 training and an auxiliary-loss-free load-balancing strategy, leading to state-of-the-art performance with significantly reduced training computational costs. 

Using DeepSeek-R1 in Cortex AI

With Snowflake Cortex AI, accessing large language models (LLMs) is easy: there are no integrations or API keys to manage, and governance controls can be applied consistently across data and AI. You can access the models in any of the supported regions and, with cross-region inference enabled, from other regions as well. You can also enable Cortex Guard to filter out potentially inappropriate or unsafe responses; these guardrails strengthen governance by enforcing policies that screen out harmful content.
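
As an example, cross-region inference is controlled at the account level. Here is a minimal sketch, assuming the CORTEX_ENABLED_CROSS_REGION account parameter and an ACCOUNTADMIN role:

-- Allow Cortex AI to route inference calls to another region when the
-- requested model is not available in the account's home region.
ALTER ACCOUNT SET CORTEX_ENABLED_CROSS_REGION = 'ANY_REGION';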

SQL and Python

The model can be integrated into a data pipeline or a Streamlit in Snowflake app to process multiple rows in a table. The COMPLETE function, accessible in both SQL and Python, can be used for this integration. Within the Cortex AI COMPLETE function used for LLM inference, simply add 'guardrails': true to the options to filter out harmful content. You can also access DeepSeek models from a Snowflake Notebook or your IDE of choice, using OAuth for custom clients. Access additional templates and details on how to use the SQL function here, or learn about the Python syntax here.

SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'deepseek-r1',
    [{'role': 'user',
      'content': CONCAT('Summarize this customer feedback in bullet points: <feedback>', content, '</feedback>')}],
    {'guardrails': true}
);
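
To process multiple rows, the same function can be applied across a table. Here is a minimal sketch, where customer_feedback, feedback_id and feedback_text are hypothetical table and column names; the guardrails option shown above can be added by using the messages-plus-options form instead of the simple prompt string:

-- Summarize every row of a feedback table in a single batch query.
SELECT
    feedback_id,
    SNOWFLAKE.CORTEX.COMPLETE(
        'deepseek-r1',
        CONCAT('Summarize this customer feedback in bullet points: <feedback>',
               feedback_text, '</feedback>')
    ) AS summary
FROM customer_feedback;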

Once you activate Cortex Guard, language model responses associated with harmful content — such as violent crimes, hate, sexual content, self-harm and others — will be automatically filtered out, and the model will return a "Response filtered by Cortex Guard" message. For more information on Snowflake’s perspective on AI safety, read our white paper on our AI Security Framework.
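
If filtered rows should be excluded downstream, one approach is to screen for that message later in the pipeline. Here is a minimal sketch, assuming a hypothetical summaries table holding the model output and assuming filtered responses contain the message quoted above:

-- Keep only rows whose responses were not filtered by Cortex Guard.
SELECT *
FROM summaries
WHERE summary NOT ILIKE '%Response filtered by Cortex Guard%';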

REST API 

To enable services or applications running outside of Snowflake to make low-latency inference calls to Cortex AI, the REST API is the way to go. Here is an example of what a request can look like (the headers assume a bearer token, such as an OAuth token, for authentication):

curl -X POST \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{
    "model": "deepseek-r1",
    "messages": [{ "content": "Summarize this customer feedback in bullet points: <feedback>" }],
    "top_p": 0,
    "temperature": 0.6
    }' \
  "https://<account_identifier>.snowflakecomputing.com/api/v2/cortex/inference:complete"

What’s next?

Per DeepSeek, this is the first open source model to demonstrate that the reasoning capabilities of LLMs can be incentivized purely through RL, without the need for SFT. Cortex AI provides easy integration via SQL functions and REST APIs, and Cortex Guard allows customers to implement the requisite safety controls. The Snowflake AI Research team plans to enhance DeepSeek-R1 to further reduce inference costs. Customers can achieve cost-performance efficiencies with DeepSeek-R1 and expedite the delivery of generative AI applications. This breakthrough paves the way for future advancements in this area.

 

Note: This article contains forward-looking statements, including about our future product offerings, which are not commitments to deliver any product offerings. Actual results and offerings may differ and are subject to known and unknown risks and uncertainties. See our latest 10-Q for more information.
