Product and Technology

Sharing and Monetizing AI Models Safely and Securely in the AI Data Cloud

The rise of generative AI models is spurring organizations to incorporate AI and large language models (LLMs) into their business strategies. After all, these models open up new opportunities to extract greater value from a company’s data and IP, and to make that value accessible to a wider audience across the organization.

One key to successfully leveraging gen AI models is the ability to share data. Companies with valuable data want to monetize it for LLM fine-tuning without granting access to the original data sources. They also want to ensure that all usage is appropriately attributed back to them.

Unfortunately, many of the currently available solutions do not give enterprises the tools to share data safely and securely while:

  • Ensuring that an organization’s valuable data is always managed by that organization and not made available to other parties, which may result in inappropriate or possibly malicious use

  • Ensuring that third-party models used within the enterprise are safely sandboxed

  • Carefully monitoring access to data and models

At Snowflake, we are tackling these challenges head-on and making it easier for developers to deliver trusted AI with enterprise data.

Diagram showing the span of Snowflake Collaboration capabilities from within an organization to between organizations
Collaboration in the AI Data Cloud enables enterprises to discover, share and monetize data, apps and AI products across clouds.

At our recent BUILD 2024 dev conference, we highlighted three features to help you share your fine-tuned LLMs, share data sets to train your LLMs, and share traditional AI/ML models safely and securely both within and outside your organization across the AI Data Cloud. We provided an overview of these features in a previous blog post, but now let’s take a closer look at how you can put them to work in your projects.

Share Snowflake Cortex AI fine-tuned LLMs from Meta and Mistral AI

To fully leverage foundational AI models, enterprises need to customize and fine-tune them to their specific domains and data sets. This task generally comes with two mandates: no data leaves their premises at any time, and no heavy investments are made in building infrastructure. 

Snowflake now offers enterprises the ability to fine-tune leading models from Meta and Mistral AI using data within their own security perimeter and without having to manage any infrastructure. Better yet, developers can easily govern and manage their custom LLMs with Snowflake Model Registry.

With Secure Model Sharing (currently in public preview), you can fine-tune and share custom foundation models in three steps:

  1. Select the base model and provide your training data set as part of the FINETUNE function or by using the no-code experience in Snowflake AI & ML Studio. The fine-tuned models can be used through the COMPLETE function.

  2. Securely share your fine-tuned models with other Snowflake accounts in your region.

  3. Replicate your fine-tuned models across regions within your organization.

Screenshot of using Mistral to fine-tune an LLM in Snowflake
Enterprises can easily fine-tune and share custom AI models with Secure Model Sharing.
SELECT SNOWFLAKE.CORTEX.FINETUNE(
	'CREATE',
	<model_name>,
	<base_model>,
	<training_data>,
	<validation_data>
);
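The steps above can be sketched in SQL. This is a minimal sketch, not a complete workflow: the job ID, model name and prompt below are hypothetical placeholders. FINETUNE called with 'CREATE' returns a job ID, and the resulting model is invoked through the COMPLETE function:

```sql
-- Check the status of a fine-tuning job (the job ID is returned by the 'CREATE' call)
SELECT SNOWFLAKE.CORTEX.FINETUNE('DESCRIBE', '<job_id>');

-- Once training succeeds, call the fine-tuned model like any other Cortex model
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'my_tuned_mistral',                             -- hypothetical fine-tuned model name
    'Summarize this support ticket: <ticket text>'  -- prompt
);
```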

Unlock the power of Cortex LLMs with Cortex Knowledge Extensions

Enterprises want an easy way to augment their foundation models with domain-specific information to make them provide more relevant responses. Traditionally, it takes a lot of time and effort to find and procure the right data sets, and then more time and technical skill to prepare the data for consumption and fine-tune the LLMs. Snowflake has already streamlined the first part of that process — locating appropriate data — with Snowflake Marketplace, which offers one central location to quickly find, try and buy more than 2,900 data sets, apps and data products (as of October 31, 2024). Now, with Cortex Knowledge Extensions (currently in private preview), we’re making it easier to prepare and transform third-party data.

Cortex Knowledge Extensions give customers an “easy button” for augmenting their chosen foundation model with up-to-date information in a particular domain without requiring additional technical expertise to fine-tune and massage raw data from a content provider. Critically, customers will have the confidence that they are using officially licensed content.

Cortex Knowledge Extensions allow gen AI applications to draw responses from providers' unstructured, licensed data while giving providers appropriate attribution and isolating the original full data set from exposure. This helps providers monetize their content and participate in gen AI while minimizing the risk of that content being used for model training.

To make their data available, the content provider sets up a Cortex Search service on the data and publishes it as a listing on Snowflake Marketplace. Consumers can then find and acquire the listing, and use Cortex AI APIs to prompt LLMs with the acquired data.
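As a rough sketch of the provider-side setup, a Cortex Search service can be created over a table of licensed content before publishing. All table, column and warehouse names here are hypothetical:

```sql
-- Create a search service over licensed articles (all object names are hypothetical)
CREATE OR REPLACE CORTEX SEARCH SERVICE licensed_news_search
  ON article_text                       -- column to index for search
  ATTRIBUTES publish_date, section      -- columns available for filtering
  WAREHOUSE = provider_wh               -- warehouse used to refresh the index
  TARGET_LAG = '1 day'                  -- how fresh the index must stay
  AS (SELECT article_text, publish_date, section FROM licensed_articles);
```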

Share traditional AI/ML models in the AI Data Cloud

More and more enterprises are building custom AI/ML models for specific tasks such as predicting churn or forecasting revenue. These models may be developed within the organization by data scientists or externally by partners. Enterprises can now unlock the power of these models and share them with partners, customers and internal users using Snowflake Native Apps on both the Internal Marketplace and the external-facing Snowflake Marketplace.

With Snowflake Secure Data Sharing, organizations can allow end users to run ML models securely within fine-grained role-based access control on their data. The data itself never leaves the organization’s security boundary. Packaging the models with Snowflake Native Apps ensures that they inherit Snowflake Native Apps’ security posture, including security scanning, sandboxing and access to local or external resources based on specific privileges granted to the model.

Sharing a model is as simple as adding model artifacts to an application package and granting application-specific consumer usage privileges. Consumers are then free to install the application and invoke model functions.
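A minimal provider-and-consumer sketch, assuming a hypothetical churn model packaged as a Snowflake Native App. All object names and the `predict_churn` function are illustrative; in practice consumers typically install from a Marketplace listing, and installing directly from the package is shown here only for brevity:

```sql
-- Provider: create an application package and add a version from staged files
-- (manifest.yml, setup script and model artifacts uploaded to @app_stage)
CREATE APPLICATION PACKAGE churn_model_pkg;
ALTER APPLICATION PACKAGE churn_model_pkg
  ADD VERSION v1 USING '@app_stage/app_files';

-- Consumer: install the application and invoke the shared model's function
CREATE APPLICATION churn_app
  FROM APPLICATION PACKAGE churn_model_pkg
  USING VERSION v1;
SELECT churn_app.core.predict_churn(customer_features)  -- hypothetical function
  FROM analytics.customers;
```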

Diagram showing the process involved in sharing an AI model via Snowflake Native Apps
The process of sharing AI models, from model provider to consumers

With Snowflake collaboration and data sharing, enterprises can easily create and share AI/ML models, both traditional models and fine-tuned LLMs, and extend their benefits to the rest of the enterprise. To learn more and try out some of these features, check out these resources:

Data Cloud Academy

Snowflake Native App Bootcamp

Learn how to build, operate, maintain and monetize Snowflake Native Apps in 120 minutes of expert-led sessions, hands-on labs and customer examples.

Watch BUILD On Demand

Join developers, data scientists, engineers and all data professionals for exclusive product announcements, “how to” technical sessions, and hands-on labs focused on Snowflake’s latest innovations.

