Product and Technology

Sharing and Monetizing AI Models Safely and Securely in the AI Data Cloud

The rise of generative AI models is spurring organizations to incorporate AI and large language models (LLMs) into their business strategies. After all, these models open up new opportunities to extract greater value from a company’s data and IP and make it accessible to a wider audience across the organization.

One key to successfully leveraging gen AI models is the ability to share data. Companies whose valuable data can be used to fine-tune LLMs want to monetize it for that purpose without granting access to the original data sources. They also want to ensure that all usage is appropriately attributed back to them.

Unfortunately, many of the currently available solutions do not give enterprises the tools to share data safely and securely while:

  • Ensuring that an organization’s valuable data is always managed by that organization and never handed to other parties, where it could be put to inappropriate or malicious use

  • Ensuring that third-party models used within the enterprise are safely sandboxed

  • Carefully monitoring access to data and models

At Snowflake, we are tackling these challenges head-on and making it easier for developers to deliver trusted AI with enterprise data.

Diagram showing the span of Snowflake Collaboration capabilities from within an organization to between organizations
Collaboration in the AI Data Cloud enables enterprises to discover, share and monetize data, apps and AI products across clouds.

At our recent BUILD 2024 dev conference, we highlighted three features to help you share your fine-tuned LLMs, share data sets to train your LLMs, and share traditional AI/ML models safely and securely both within and outside your organization across the AI Data Cloud. We provided an overview of these features in a previous blog post, but now let’s take a closer look at how you can put them to work in your projects.

Share Snowflake Cortex AI fine-tuned LLMs from Meta and Mistral AI

To fully leverage foundation models, enterprises need to customize and fine-tune them to their specific domains and data sets. This task generally comes with two mandates: no data leaves their premises at any time, and no heavy investments are made in building infrastructure.

Snowflake now offers enterprises the ability to fine-tune leading models from Meta and Mistral AI using data within their own security perimeter and without having to manage any infrastructure. Better yet, developers can easily govern and manage their custom LLMs with Snowflake Model Registry.

With Secure Model Sharing (currently in public preview), you can fine-tune and share custom foundation models in three steps:

  1. Select the base model and provide your training data set as part of the FINETUNE function or by using the no-code experience in Snowflake AI & ML Studio. The fine-tuned models can be used through the COMPLETE function.

  2. Securely share your fine-tuned models with other Snowflake accounts in your region.

  3. Replicate your fine-tuned models across regions within your organization.
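Steps 2 and 3 build on Snowflake’s standard sharing primitives. As a rough sketch of the provider side (the share, database, schema and model names are illustrative, and the exact grant syntax for model objects should be verified against current documentation):

```sql
-- Provider side (sketch): expose a fine-tuned model to another account in-region.
-- All object names below are hypothetical.
CREATE SHARE tuned_llm_share;
GRANT USAGE ON DATABASE ml_models TO SHARE tuned_llm_share;
GRANT USAGE ON SCHEMA ml_models.llms TO SHARE tuned_llm_share;
GRANT USAGE ON MODEL ml_models.llms.support_assistant TO SHARE tuned_llm_share;

-- Make the share visible to a specific consumer account.
ALTER SHARE tuned_llm_share ADD ACCOUNTS = my_org.consumer_account;
```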

Screenshot of using Mistral to fine-tune an LLM in Snowflake
Enterprises can easily fine-tune and share custom AI models with Secure Model Sharing.
SELECT SNOWFLAKE.CORTEX.FINETUNE(
	'CREATE',
	'<model_name>',
	'<base_model>',
	'<training_data>',
	'<validation_data>'
);
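Once the fine-tuning job completes, the custom model can be invoked through the COMPLETE function like any other Cortex model. A minimal sketch, assuming a hypothetical model name and source table:

```sql
-- Invoke the fine-tuned model by name; table and column names are illustrative.
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'my_tuned_model',
    'Summarize this support ticket: ' || ticket_text
) AS summary
FROM support_tickets
LIMIT 10;
```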

Unlock the power of Cortex LLMs with Cortex Knowledge Extensions

Enterprises want an easy way to augment their foundation models with domain-specific information to make them provide more relevant responses. Traditionally, it takes a lot of time and effort to find and procure the right data sets, and then more time and technical skill to prepare the data for consumption and fine-tune the LLMs. Snowflake has already streamlined the first part of that process — locating appropriate data — with Snowflake Marketplace, which offers one central location to quickly find, try and buy more than 2,900 data sets, apps and data products (as of October 31, 2024). Now, with Cortex Knowledge Extensions (currently in private preview), we’re making it easier to prepare and transform third-party data.

Cortex Knowledge Extensions give customers an “easy button” for augmenting their chosen foundation model with up-to-date information in a particular domain without requiring additional technical expertise to fine-tune and massage raw data from a content provider. Critically, customers will have the confidence that they are using officially licensed content.

Cortex Knowledge Extensions allow gen AI applications to draw responses from providers' unstructured, licensed data while giving them appropriate attribution and isolating the original full dataset from exposure. This helps providers monetize and participate in gen AI while minimizing the risk of their content being used for model training purposes. 

To make their data available on Snowflake Marketplace, the content provider sets up a Cortex Search service on their data and publishes it to Snowflake Marketplace. Once published, a consumer can find the listing and acquire the data from Snowflake Marketplace. Consumers can then use Cortex AI APIs to prompt LLMs with the acquired Snowflake Marketplace data.
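On the provider side, that setup centers on a CREATE CORTEX SEARCH SERVICE statement over the licensed content. A sketch, assuming hypothetical column, warehouse and table names:

```sql
-- Build a search service over licensed articles so consumer apps can
-- retrieve grounded, attributed passages via Cortex AI.
CREATE CORTEX SEARCH SERVICE licensed_news_search
  ON article_body                      -- text column to index for search
  ATTRIBUTES source, published_at      -- returned with results for attribution
  WAREHOUSE = provider_wh
  TARGET_LAG = '1 day'                 -- refresh cadence as source data changes
  AS (
    SELECT article_body, source, published_at
    FROM licensed_content.public.articles
  );
```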

Share traditional AI/ML models in the AI Data Cloud

More and more enterprises are building custom AI/ML models for specific tasks such as predicting churn or forecasting revenues. These models may be developed within the organization by data scientists or externally by partners. Enterprises can now unlock the power of these models and share them with partners, customers and users within the enterprise using Snowflake Native Apps on both Internal Marketplace and external-facing Snowflake Marketplace. 

With Snowflake Secure Data Sharing, organizations can allow end users to run ML models securely within fine-grained role-based access control on their data. The data itself never leaves the organization’s security boundary. Packaging the models with Snowflake Native Apps ensures that they inherit Snowflake Native Apps’ security posture, including security scanning, sandboxing and access to local or external resources based on specific privileges granted to the model.

Sharing a model is as simple as adding the model artifacts to an application package and granting the appropriate consumer usage privileges on that package. Consumers are then free to install the application and invoke its model functions.
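In SQL terms, the flow might look like the following sketch. The package, stage, listing and function names are all illustrative, and the manifest and setup script that register the model function are assumed to live under the staged path:

```sql
-- Provider: package the model and its setup code as a Snowflake Native App.
CREATE APPLICATION PACKAGE churn_model_pkg;
ALTER APPLICATION PACKAGE churn_model_pkg
  ADD VERSION v1 USING '@app_artifacts/churn_model/';  -- manifest.yml + model artifacts

-- Consumer: install from the Marketplace listing, then call the model's function.
CREATE APPLICATION churn_app FROM LISTING 'marketplace_listing_id';
SELECT churn_app.core.predict_churn(customer_features)
FROM analytics.public.customers;
```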

Diagram showing the process involved in sharing an AI model via Snowflake Native Apps
The process of sharing AI models, from model provider to consumers

With Snowflake collaboration and data sharing, enterprises can easily create and share AI/ML models — both traditional models and fine-tuned LLMs — and extend their benefits across the enterprise. To learn more and try out some of these features, check out these resources:

Data Cloud Academy

Snowflake Native App Bootcamp

Learn how to build, operate, maintain and monetize Snowflake Native Apps in 120 minutes of expert-led sessions, hands-on labs and customer examples.