Build and Deploy ML with Ease Using Snowpark ML, Snowflake Notebooks, and Snowflake Feature Store

Snowflake has invested heavily in extending the Data Cloud to AI/ML workloads, starting in 2021 with the introduction of Snowpark, the set of libraries and runtimes in Snowflake that securely deploy and run code written in Python and other popular programming languages.

Since then, we’ve significantly expanded the ways Snowflake’s platform, including its elastic compute engine, can be used to accelerate the path from AI/ML development to production. Because Snowpark takes advantage of the scale and performance of Snowflake’s logically integrated but physically separated storage and compute, our customers are seeing a median of 3.5 times faster performance and 34% lower costs for their AI/ML and data engineering use cases. As of September 2023, more than 35% of Snowflake customers use Snowpark on a weekly basis, and many organizations are already benefiting from bringing processing directly to the data.

To further accelerate the entire ML workflow from development to production, the Snowflake platform continues to evolve with a new development interface and more functionality to securely productionize both features and models. Let’s unpack these announcements! 

Figure 1. Snowflake’s latest ML announcements

Develop interactively with SQL and Python in Snowflake Notebooks

Snowflake Notebooks, in private preview, is a new development interface that offers an interactive, cell-based programming environment for Python and SQL users to explore, process and experiment with data in Snowpark. Snowflake’s built-in notebooks allow developers to write and execute code, train and deploy models using Snowpark ML, visualize results with Streamlit chart elements, and much more, all within Snowflake’s unified, secure platform. Because notebooks are natively integrated with Snowflake’s role-based access controls (RBAC), it’s easy to securely share and collaborate on code and results without compromising enterprise data. For data science and machine learning, the cell-based layout suits experimentation and exploration: developers can capture notes and share insights alongside their code and visualizations, all in one place.

Figure 2. Snowflake Notebooks provide cell-based SQL and Python development directly in Snowsight

Streamline AI/ML workflows with the Snowpark ML library

Snowpark ML includes the Python library and underlying infrastructure for end-to-end ML workflows in Snowflake, including the Snowpark ML Modeling API and the Snowpark ML Operations API. It unifies data pre-processing, feature engineering, model training and integrated deployment into a single, easy-to-use Python library. We recently announced the Snowpark ML Modeling API (generally available soon), which enables the use of popular ML frameworks such as scikit-learn and XGBoost for feature engineering and model training through familiar Python APIs, without moving data out of Snowflake. Behind the scenes, Snowpark ML parallelizes data processing operations by taking advantage of Snowflake’s scalable computing platform.
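
To make the “familiar Python APIs” point concrete, here is a minimal sketch of training an XGBoost classifier through the Snowpark ML Modeling API. It assumes the snowflake-ml-python package and an existing Snowpark DataFrame; the CHURNED label column, the table schema and both helper function names are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def feature_columns(all_columns: Sequence[str], label_col: str) -> List[str]:
    """Return every column except the label, preserving the original order."""
    return [c for c in all_columns if c != label_col]


def train_churn_model(train_df, label_col: str = "CHURNED"):
    """Fit an XGBoost classifier on a Snowpark DataFrame.

    Assumption: `train_df` is a snowflake.snowpark.DataFrame whose columns
    are numeric features plus one label column (hypothetical schema).
    """
    # Imported lazily so this module loads without snowflake-ml-python installed.
    from snowflake.ml.modeling.xgboost import XGBClassifier

    model = XGBClassifier(
        input_cols=feature_columns(train_df.columns, label_col),
        label_cols=[label_col],
        output_cols=["PREDICTED_" + label_col],
    )
    # fit() takes the Snowpark DataFrame directly; the data is not
    # pulled into local memory first.
    model.fit(train_df)
    return model
```

With a live session, something like `train_churn_model(session.table("CUSTOMER_FEATURES"))` would return a fitted estimator whose `predict()` accepts another Snowpark DataFrame; the table name here is again only an example.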

For Snowpark ML Operations, the Snowpark Model Registry (public preview soon) enables scalable, secure deployment and management of models in Snowflake. It includes expanded support for deploying deep learning models from TensorFlow and PyTorch and open-source LLMs from Hugging Face to Snowpark Container Services (which includes GPU compute pools). The Snowpark Model Registry now builds on a native Snowflake model entity with built-in versioning support, role-based access control and a SQL API, giving both SQL and Python users more streamlined model management.
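
As a rough illustration of versioned model management, the sketch below logs a trained model to the registry from Python. It assumes the `Registry` class in later releases of snowflake-ml-python, so the preview API described here may differ in detail. The database and schema names, the `V<n>` naming scheme and both helper functions are hypothetical.

```python
import re
from typing import Iterable


def next_version_name(existing: Iterable[str]) -> str:
    """Return 'V<n+1>' given names like 'V1', 'V2' (hypothetical naming scheme).

    Names that do not match the 'V<number>' pattern are ignored.
    """
    numbers = [
        int(match.group(1))
        for name in existing
        if (match := re.fullmatch(r"V(\d+)", name))
    ]
    return f"V{max(numbers, default=0) + 1}"


def register_model(session, model, model_name: str,
                   existing_versions: Iterable[str] = ()):
    """Log `model` as a new version and return the resulting model version."""
    # Imported lazily so this module loads without snowflake-ml-python installed.
    from snowflake.ml.registry import Registry

    # ML_DB and MODELS are placeholder names, not taken from the article.
    registry = Registry(session=session, database_name="ML_DB", schema_name="MODELS")
    return registry.log_model(
        model,
        model_name=model_name,
        version_name=next_version_name(existing_versions),
    )
```

The returned model version object can then be used for inference, for example `version.run(input_df, function_name="predict")`. Models that were not trained through Snowpark ML typically also need sample input data or an explicit signature when they are logged.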

Store, manage and automate feature pipelines with the Snowflake Feature Store

The Snowflake Feature Store (in private preview) is an integrated solution for data scientists and ML engineers to create, store, manage and serve ML features for model training and inference. It consists of Python APIs accessible through the Snowpark ML library, and SQL interfaces for defining, managing and retrieving features, along with managed infrastructure for feature metadata management and continuous feature processing. By using the Snowflake Feature Store, ML teams can maintain a single, up-to-date source of truth for features used in model training and inference.
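
The sketch below shows one way the define-and-register flow might look from Python. It assumes the `snowflake.ml.feature_store` module as shipped in later releases of snowflake-ml-python, so the private preview API may differ. The ORDERS table, its columns, the ML_DB and FEATURES names and both helper functions are hypothetical.

```python
from typing import Dict


def feature_query(source_table: str, join_key: str, aggregates: Dict[str, str]) -> str:
    """Build a GROUP BY query with one output column per feature.

    `aggregates` maps a feature name to a SQL aggregate expression.
    """
    select_list = ", ".join(f"{expr} AS {name}" for name, expr in aggregates.items())
    return f"SELECT {join_key}, {select_list} FROM {source_table} GROUP BY {join_key}"


def register_customer_features(session, warehouse: str):
    """Register a per-customer feature view that refreshes once a day."""
    # Imported lazily so this module loads without snowflake-ml-python installed.
    from snowflake.ml.feature_store import (
        CreationMode,
        Entity,
        FeatureStore,
        FeatureView,
    )

    fs = FeatureStore(
        session=session,
        database="ML_DB",   # placeholder database name
        name="FEATURES",    # placeholder schema that backs the feature store
        default_warehouse=warehouse,
        creation_mode=CreationMode.CREATE_IF_NOT_EXIST,
    )

    # An entity names the join key that features are looked up by.
    customer = Entity(name="CUSTOMER", join_keys=["CUSTOMER_ID"])
    fs.register_entity(customer)

    feature_df = session.sql(
        feature_query(
            "ORDERS",
            "CUSTOMER_ID",
            {"TOTAL_SPEND": "SUM(AMOUNT)", "ORDER_COUNT": "COUNT(*)"},
        )
    )
    view = FeatureView(
        name="CUSTOMER_SPEND",
        entities=[customer],
        feature_df=feature_df,
        refresh_freq="1 day",  # managed, continuous refresh of the features
    )
    return fs.register_feature_view(feature_view=view, version="1")
```

Once registered, the same feature view can be joined to a spine of customer IDs for both training and inference, which is what keeps the two paths reading from one source of truth.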

What’s Next?

Snowflake continues to make it easier for customers to seamlessly and securely build and deploy features and models on a single platform, unlocking the ability to bring more AI/ML development to the data. Check out the Snowpark ML demo from Snowday to see the latest launches in action. Snowflake is also making it easier for any user to get value from GenAI; see the recent announcements on Snowflake Cortex and on the LLM-powered experiences built on Snowflake Cortex.

Resources:

Intro to Machine Learning with Snowpark ML for Python


Copyright 2022, Snowflake Site. All rights reserved.