
Modernizing the ML Stack: Announcing Agentic, Multimodal and Real-Time Workflows


Traditional machine learning (ML) remains critical in today's AI landscape, serving as the backbone for the predictive insights that drive core business value — from supply chain optimization to real-time fraud detection. However, the path from experimentation to production remains challenging, with fragmented tools across ecosystems that require complex setup, multiple iterations to optimize and ongoing maintenance. At Snowflake, we are committed to delivering a modern ML platform that is tightly integrated with your data for unified security and accelerated workflows that scale with your business needs. 

We are thrilled to announce that the following capabilities are now available for your model workflows in Snowflake ML:

  • Automate development of fully functional ML pipelines from simple natural language prompts with Cortex Code in Snowsight (generally available soon), working in a Jupyter-based environment in Snowflake Notebooks (generally available)

  • Efficiently deploy the best-performing model by utilizing natively integrated Experiment Tracking (generally available) to easily identify, share and reproduce top results across training runs

  • Serve low-latency predictions in milliseconds with the online Snowflake Feature Store (generally available) and online ML inference (generally available) to power real-time use cases such as personalized recommendations and fraud detection

  • Run large-scale inference workloads for multimodal models (public preview) on unstructured data such as images and audio

Agentic model development 

At Snowflake, we are continuing to invest in modern development experiences that improve developer productivity. Today, we are reimagining production ML with the launch of agentic ML capabilities built into a brand-new integrated development environment (IDE) experience in Snowflake Notebooks.

Cortex Code for ML Pipelines

Data scientists often spend lengthy cycles developing and troubleshooting their ML workflows, leading to operational bottlenecks and fewer ML models making their way to production. Now Snowflake is bringing agentic AI to ML workflows with Cortex Code in Snowsight (generally available soon), which can autonomously iterate, adjust and generate a fully executable ML pipeline in Snowflake Notebooks from simple natural language prompts.

Figure 1: Cortex Code automates ML workflows with simple natural language prompts.

Cortex Code breaks down problems associated with ML workflows into distinct steps, such as data analysis, data preparation, feature engineering and training. Combining advanced techniques such as multistep reasoning, contextual understanding and action execution, Cortex Code provides verified solutions in the form of fully functional ML pipelines that can be easily executed from a Snowflake Notebook. With suggested improvements or user-provided follow-ups, Cortex Code helps users easily iterate toward the best version. By automating this tedious work, data science teams save hours of time that they would typically spend on experimentation or debugging, and can instead focus on higher-impact initiatives.
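For example, a single prompt along these lines (table and column names are purely illustrative) is enough for Cortex Code to plan, generate and run the pipeline:

    "Using the TRANSACTIONS table, build a binary classifier that predicts
    IS_FRAUD. Profile the data, engineer features from the transaction
    timestamp and amount, train and tune a model, and report evaluation
    metrics for each candidate."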

Snowflake Notebooks

Cortex Code can be leveraged directly from Snowflake Notebooks for building and iterating on production workflows. Next-gen development for Snowflake Notebooks is now generally available in Workspaces. With these Jupyter-based notebooks, you can bring existing notebooks, scripts and model training into Snowflake’s unified platform while keeping your preferred libraries, Jupyter runtime features and familiar IDE attributes and file-based organization within Workspaces.

Figure 2: Supercharge your data science and advanced model development workflows in Snowflake Notebooks.

This new development experience includes the following enhancements: 

  • Managed Jupyter/IPython kernel: Notebooks now run on a Snowflake managed Jupyter/IPython kernel, ensuring compatibility with magics and existing notebooks. This includes running SQL, Python and Markdown code and easily transferring data across cells (see the sketch after this list). View your results in the Results Explorer underneath each cell, complete with tables and a visualization builder.

  • Workspace-native organization: Notebooks can now be created directly in Workspaces alongside your SQL files, dbt projects and Python utilities and any other assets you use to develop on Snowflake. This lets you organize everything in one place and makes multi-file workflows feel natural. You can refactor logic into helpers, break flows into smaller components and stitch them together as needed. Our new terminal and variable explorer also give you a faster, more productive development loop.

  • Seamless Git-backed collaboration: Git-backed Workspaces now let you work seamlessly across your entire repo — branching, committing and diffing right from Snowflake. And if Git isn’t your preferred workflow, Shared Workspaces provide an alternative for your teams: role-based-access-control-governed collaboration on a set of files with built-in versioning and change tracking.

  • Ability to run on Snowflake Container Runtime (CPU and GPU): Our new experience runs exclusively on Snowflake Container Runtime, a prebuilt environment for data science and machine learning running directly on Snowpark Container Services. This offers a set of popular ML frameworks and multiple Python versions, and it speeds up training and data loading by distributing compute resources. The runtime version you use to prototype is the same one you’ll use to schedule and productionize, eliminating common “but it worked on my laptop ...” issues.
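To make the managed-kernel bullet above concrete, here is a minimal sketch of moving data between cells in one of these notebooks. The table name is illustrative, and the exact cell-reference syntax may vary by version, so check the Notebooks documentation:

    # Python cell in a Snowflake Notebook. The notebook already has an
    # authenticated Snowpark session attached.
    from snowflake.snowpark.context import get_active_session

    session = get_active_session()

    # Run SQL and pull the result into pandas (table name is illustrative).
    df = session.sql(
        "SELECT region, SUM(amount) AS total_sales FROM sales GROUP BY region"
    ).to_pandas()

    # The result of a named SQL cell can also be referenced directly from a
    # Python cell, e.g. for a SQL cell named monthly_sales:
    #     monthly_sales.to_pandas()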

Global companies such as Aimpoint Digital, a leader in data and AI consulting, are using Snowflake Notebooks to drive production-ready development workflows.

“The GA of Snowflake Notebooks is a revolutionary moment for the developer experience. We've been able to develop & productionalize ML workloads from dynamic pricing to graph-based user behaviour prediction for clients with ease. Development on notebooks in Workspaces means we can centralise common code, while decentralising what developers build on top of it. Being able to reference your SQL cells in Python, and vice versa, and to parameterize your Notebooks are a paradigm shift. Gone are the days of scheduling stored procedures, Notebooks enable ultimate flexibility for dynamic workflows whether they're ML, AI, or engineering.”

Christopher Marland
Snowflake Practice Lead, Aimpoint Digital

To get started with Snowflake Notebooks, try this topic modeling quickstart.

Experiment Tracking

After building and iterating with Snowflake Notebooks and Cortex Code, you can quickly move from raw hypothesis to a high-performing model using natively integrated Experiment Tracking (now generally available). This allows ML teams to systematically identify, share and reproduce the best-performing models across training runs. This simplifies collaboration, improves reproducibility and accelerates model iterations across the enterprise. With the latest release of Snowflake’s Experiment Tracking, you can seamlessly log millions of metrics generated during large training runs along with model parameters, artifacts and metadata.
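As a minimal sketch, logging a training run might look like the following. This assumes the experiment-tracking module in the snowflake.ml Python package; exact module paths and signatures may differ by release, so treat the names below as illustrative and confirm against the documentation:

    from snowflake.ml.experiment import ExperimentTracking
    from snowflake.snowpark.context import get_active_session

    session = get_active_session()

    # Experiments live alongside your governed data (names are illustrative).
    exp = ExperimentTracking(session, database_name="ML_DB", schema_name="PUBLIC")
    exp.set_experiment("CHURN_BASELINES")

    # Each run records the parameters and metrics needed to reproduce it.
    with exp.start_run():
        exp.log_param("learning_rate", 0.05)
        exp.log_param("max_depth", 6)
        exp.log_metric("auc", 0.91)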

Figure 3: Easily identify the best-performing model to visualize and compare model versions with natively integrated Experiment Tracking.

Many enterprises are using Experiment Tracking to store, track and compare critical information across model training runs, including EnergyHub, which empowers utilities and their customers to create a clean, distributed energy future.

“As early adopters of Snowflake's Experiment Tracker, we've found it meets our needs while eliminating the hassle of maintaining a separate MLFlow server. Consolidating ML experiment tracking within our existing Snowflake platform has been a significant operational win. Plus, Snowflake has been incredibly responsive to feedback, rolling out enhancements at an impressive pace.”

Dr. William Franklin
Principal Machine Learning Scientist, EnergyHub

Real-time serving 

Once you have trained your model in Snowflake or on any external platform, you can easily deploy it for inference on Snowflake data to generate predictive results. We are introducing new production-ready online ML capabilities in general availability to open up real-time use cases such as personalized recommendations and fraud detection, with no extra infrastructure or complicated configuration required. Developers can now eliminate the latency, cost and security risks associated with exporting sensitive data to external platforms by unifying batch and online ML use cases on a single platform.

Figure 4: Real-time feature and model serving enables low-latency predictions in milliseconds.

Snowflake Feature Store

We are excited to announce the general availability of online feature serving from Snowflake Feature Store. Snowflake Feature Store is an integrated solution for data scientists and ML engineers to create, store, manage and serve ML features for model training and inference. It consists of Python APIs and SQL interfaces for defining, managing and retrieving features, along with managed infrastructure for feature metadata management and continuous feature processing. With online serving, Snowflake Feature Store becomes a unified solution for batch as well as low-latency online use cases, delivering features in 30 ms.

Snowflake Feature Store is seamlessly integrated with your Snowflake data, features and models, so large-scale ML pipelines can be productionized easily and efficiently. This eliminates redundancy and duplication of feature pipelines, ensuring that customers have consistent, up-to-date, accurate features available with enterprise-grade security and governance capabilities. A centralized feature store UI in Snowsight makes it easy to search for and discover features and models and to visualize data flow through lineage.
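As a sketch of the Python API (database, warehouse and table names are illustrative), defining and registering a continuously refreshed feature view looks roughly like the following; enabling online serving for a registered feature view is an additional option covered in the Feature Store documentation:

    from snowflake.ml.feature_store import (
        CreationMode, Entity, FeatureStore, FeatureView,
    )
    from snowflake.snowpark.context import get_active_session

    session = get_active_session()

    fs = FeatureStore(
        session=session,
        database="ML_DB",
        name="FRAUD_FEATURES",
        default_warehouse="ML_WH",
        creation_mode=CreationMode.CREATE_IF_NOT_EXIST,
    )

    # Entities declare the join keys that feature views are keyed on.
    customer = Entity(name="CUSTOMER", join_keys=["CUSTOMER_ID"])
    fs.register_entity(customer)

    # The feature pipeline is a Snowpark DataFrame; Snowflake manages the
    # continuous refresh on the schedule below.
    feature_df = session.sql(
        "SELECT customer_id, COUNT(*) AS txn_count, AVG(amount) AS avg_amount "
        "FROM transactions GROUP BY customer_id"
    )
    fv = FeatureView(
        name="CUSTOMER_TXN_FEATURES",
        entities=[customer],
        feature_df=feature_df,
        refresh_freq="1 hour",
    )
    fs.register_feature_view(feature_view=fv, version="V1")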

You can get started with Snowflake Feature Store today for online feature serving by trying this quickstart.

Online ML inference

Online ML inference is now also generally available, enabling you to serve models from the Snowflake Model Registry for real-time inference in under 100 ms. 

To meet the rigorous demands of production workloads, online ML inference integrates intelligent autoscaling, low-latency performance and comprehensive observability into a cohesive workflow. This starts with cost-efficient performance; our autoscaling logic handles massive traffic spikes instantly and scales to zero when demand drops, eliminating the expensive overhead of over-provisioned GPUs. Crucially, as traffic ramps back up, the system is designed to respond immediately, ensuring that models scale up rapidly to maintain sub-100-ms performance.

Deployment is equally resilient, allowing users to transition to new model versions through automated rolling updates that ensure application traffic is never dropped, backed by the safety of easy version rollbacks. Teams can also leverage shadow mode to safely validate new models by monitoring performance in a parallel environment, isolated from the production version, before committing to a full cutover. Snowflake is also making it easy to get out-of-the-box visibility into latency, throughput and error rates with integrated observability that logs all requests and responses directly to a Snowflake table for deep debugging and long-term auditing.
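Putting this together, deploying a registered model version as an online service might look like the following sketch with the Model Registry Python API (service, compute pool and image repository names are illustrative; consult the model serving documentation for current parameters):

    from snowflake.ml.registry import Registry
    from snowflake.snowpark.context import get_active_session

    session = get_active_session()
    reg = Registry(session=session)
    mv = reg.get_model("FRAUD_MODEL").version("V1")

    # Deploy the version as a long-running service on a compute pool;
    # ingress_enabled exposes a REST endpoint for real-time requests.
    mv.create_service(
        service_name="FRAUD_MODEL_SERVICE",
        service_compute_pool="INFERENCE_POOL",
        image_repo="ML_DB.PUBLIC.IMAGE_REPO",
        ingress_enabled=True,
        max_instances=3,
    )

    # Low-latency scoring can then be routed to the service from Python
    # (or called via the REST endpoint directly).
    input_df = session.table("LIVE_TRANSACTIONS")  # illustrative feature rows
    predictions = mv.run(input_df, service_name="FRAUD_MODEL_SERVICE")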

Inference for multimodal models

Additionally, running large-scale online and batch inference is now easy with Snowflake's inference support for open source multimodal models from hubs such as Hugging Face. Inference support for unstructured data is now in public preview and includes data types such as images, video and audio. This capability unlocks AI use cases such as object detection, visual Q&A and automatic speech recognition on Snowflake without complex pipelines or data movement.

Snowflake supports both real-time and batch processing needs. Users can deploy multimodal models as a service for online inference via REST APIs or log them in the Snowflake Model Registry for immediate batch invocation. By utilizing Snowflake’s distributed compute layer, teams can run massive inference tasks over large data sets without leaving their familiar environment.
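As a rough sketch (model choice and table name are illustrative, and handling of image inputs follows the public preview documentation), logging a Hugging Face pipeline for batch invocation could look like this:

    import transformers
    from snowflake.ml.registry import Registry
    from snowflake.snowpark.context import get_active_session

    session = get_active_session()

    # An open source image model from Hugging Face (illustrative choice).
    pipe = transformers.pipeline(
        "image-classification", model="google/vit-base-patch16-224"
    )

    # Log it to the Snowflake Model Registry for immediate batch invocation.
    reg = Registry(session=session)
    mv = reg.log_model(pipe, model_name="VIT_CLASSIFIER", version_name="V1")

    # Batch inference over a table of images runs on Snowflake's
    # distributed compute layer.
    results = mv.run(session.table("PRODUCT_IMAGES"))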

Getting started

With the latest round of innovations in agentic, online and multimodal capabilities, Snowflake ML further accelerates your machine learning from prototype to production on the same platform as your governed data. 

Check out our product documentation and try Snowflake ML today with this intro quickstart from the 30-day free trial. 

 

Quickstart

Build an End-to-End ML Workflow in Snowflake

Learn how to build and deploy a complete machine learning workflow entirely within Snowflake ML.
