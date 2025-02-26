To build models while preserving the privacy of sensitive data sets, or to generate new data to enrich training, Snowflake also supports easy and secure synthetic data generation (public preview). This capability lets data scientists build pipelines and models on data without compromising sensitive attributes and without waiting on lengthy, cumbersome approval processes. The synthetic data set has the same characteristics as the source data set, such as the names, number and data types of its columns, and the same number of rows or fewer.
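The guarantees described above (same column names, count and data types, and no more rows than the source) can be expressed as a quick sanity check. The following is a minimal sketch in plain Python and is not part of Snowflake's API: the function name, the representation of a schema as (name, type) pairs and the sample column names are all hypothetical, and column order is assumed not to matter.

```python
from typing import List, Tuple

# Hypothetical representation: a schema is a list of (column name, data type) pairs.
Schema = List[Tuple[str, str]]


def matches_source_shape(
    source_schema: Schema,
    source_rows: int,
    synthetic_schema: Schema,
    synthetic_rows: int,
) -> bool:
    """Return True if the synthetic set mirrors the source's columns and row bound."""
    # Same names, same number of columns, same data types (order-insensitive).
    same_columns = sorted(source_schema) == sorted(synthetic_schema)
    # The synthetic set may have the same number of rows or fewer, never more.
    rows_ok = 0 <= synthetic_rows <= source_rows
    return same_columns and rows_ok


# Illustrative values only; these tables and counts are made up.
source = [("CUSTOMER_ID", "NUMBER"), ("EMAIL", "VARCHAR")]
synthetic = [("EMAIL", "VARCHAR"), ("CUSTOMER_ID", "NUMBER")]
print(matches_source_shape(source, 1000, synthetic, 950))
```

A check like this could run as a pipeline step before a synthetic table is handed to model training.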

Serving models in production

No matter where your model is built, Snowflake ML makes it easy to run production-scale inference and manage the model lifecycle with built-in security and governance. After a model is logged in the Snowflake Model Registry, it can be served for distributed inference using Model Serving in Snowpark Container Services (SPCS). With this capability, your inference workloads can take advantage of GPU compute clusters, run large models such as Hugging Face embeddings or other transformer models, and use any Python packages from open source or private repositories. For low-latency applications, you can also deploy models to a REST API endpoint that your applications invoke for inference (the online endpoint is in public preview). With the model registry and inference solutions, users can bring any ML model trained within or outside Snowflake, either as one of the built-in model types or through the custom model API, which covers any other type of model, including pre- and post-processing pipelines and partitioned models. They can then run scalable, distributed inference in virtual warehouses or in SPCS, depending on workload needs.
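The log-then-serve flow described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a definitive implementation: `registry` is assumed to be a `snowflake.ml.registry.Registry` created from an existing Snowpark session, the helper name `log_and_infer` and all object names (model, service, compute pool, image repository) are hypothetical, and the keyword arguments reflect the snowflake-ml-python API as understood at the time of writing, so check them against the version you have installed.

```python
from typing import Any, Optional


def log_and_infer(
    registry: Any,                        # assumed: snowflake.ml.registry.Registry
    model: Any,                           # a trained model of a supported type
    sample_input: Any,                    # e.g. a pandas or Snowpark DataFrame
    model_name: str = "MY_MODEL",         # hypothetical name
    version_name: str = "V1",             # hypothetical version
    service_name: Optional[str] = None,   # set to serve from SPCS
    compute_pool: Optional[str] = None,   # hypothetical SPCS compute pool
    image_repo: Optional[str] = None,     # hypothetical image repository
    gpu_requests: Optional[str] = None,   # e.g. "1" for a GPU compute pool
) -> Any:
    """Log a model to the registry, then run inference in a warehouse or in SPCS."""
    # Step 1: log the model; the sample input lets the registry infer signatures.
    model_version = registry.log_model(
        model,
        model_name=model_name,
        version_name=version_name,
        sample_input_data=sample_input,
    )

    # Step 2a: with no service requested, inference runs in a virtual warehouse.
    if service_name is None:
        return model_version.run(sample_input, function_name="predict")

    # Step 2b: otherwise deploy the version as an SPCS service.
    # ingress_enabled=True requests an HTTP endpoint for online inference.
    model_version.create_service(
        service_name=service_name,
        service_compute_pool=compute_pool,
        image_repo=image_repo,
        ingress_enabled=True,
        gpu_requests=gpu_requests,
    )
    return model_version.run(
        sample_input, function_name="predict", service_name=service_name
    )
```

In real use the registry would be built with something like `Registry(session=session, database_name=..., schema_name=...)`. Passing it in as a parameter keeps the sketch free of connection details and makes the warehouse-versus-SPCS choice explicit at the call site.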