What Is a Data Clean Room? How It Works and Use Cases

Data clean rooms enable secure, privacy-compliant data collaboration. Learn how they work, their benefits, and how businesses use them to drive growth.

Overview

Sharing data while adhering to privacy regulations has always been challenging. Distributed data clean rooms now make it possible to collaborate on data securely and in line with privacy rules. Their capabilities are especially valuable as advertisers and the media industry face signal loss from reduced access to key data signals such as cookies and device IDs. Data clean rooms allow organizations to manage, de-identify and share data effectively. Let’s explore what a data clean room is, how it works, the benefits it offers, and how companies use data clean rooms for business growth.

What is a data clean room?

A data clean room is a secure and controlled environment that allows multiple companies, or divisions of a company, to bring data together for joint analysis. Internal clean room guidelines can be established to keep data handling and sharing aligned with core privacy regulations such as GDPR, HIPAA and the CCPA. In the data clean room, personally identifiable information (PII) can be anonymized.
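As an illustration of how PII might be de-identified before it is used for matching, here is a minimal sketch that replaces an email address with a keyed hash. It is not how any particular clean room product works: the function names, the `SHARED_KEY` value and the sample records are all hypothetical. Keyed hashing is strictly a form of pseudonymization, so it is usually combined with other controls.

```python
import hashlib
import hmac


def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase so both parties hash the same string."""
    return email.strip().lower()


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest (HMAC) of the normalized identifier."""
    message = normalize_email(identifier).encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


# Hypothetical key that the collaborating parties would agree on out of band.
SHARED_KEY = b"example-key-agreed-by-both-parties"

# Hypothetical first-party records containing raw PII.
records = [
    {"email": "Ada@Example.com ", "segment": "loyalty"},
    {"email": "bob@example.com", "segment": "new"},
]

# Only the match key and non-identifying attributes are kept for collaboration.
deidentified = [
    {"match_key": pseudonymize(r["email"], SHARED_KEY), "segment": r["segment"]}
    for r in records
]
```

Because the email is normalized before hashing, two parties holding the same address in different formats still produce the same match key, while the raw address itself is never included in the shared rows.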

The most popular use case of data clean rooms is to link anonymized marketing and advertising data from multiple parties for attribution. Data clean rooms can be set up to prevent the exposure of data that could identify specific users, thereby helping clean room users comply with privacy requirements.

How does a data clean room work?

Data clean room configurations control what data comes in, how it can be joined with other data in the clean room, the types of analytics each party can perform on the data and what data can leave. Data clean rooms may permit PII loaded into the clean room to be secured and encrypted. Data clean rooms generally grant data owners full control over their data in the clean room, while approved partners can get a feed with anonymized data.
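The controls described above (what can be joined, which analytics are allowed and what can leave) can be pictured as a policy that every query is checked against. The sketch below is only a conceptual model under assumed names: `CleanRoomPolicy`, `QueryRequest`, `check_query` and `filter_output` are hypothetical and do not correspond to any vendor's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CleanRoomPolicy:
    """Hypothetical policy set by the data owner."""
    allowed_join_keys: frozenset   # columns that may be used to join data sets
    allowed_aggregates: frozenset  # analytics a partner may run
    allowed_group_by: frozenset    # columns a partner may group results by
    min_group_size: int            # smallest group that may leave the clean room


@dataclass(frozen=True)
class QueryRequest:
    """Hypothetical description of a query submitted by a partner."""
    join_key: str
    aggregate: str
    group_by: str


def check_query(policy: CleanRoomPolicy, request: QueryRequest) -> list[str]:
    """Return a list of policy violations; an empty list means the query may run."""
    violations = []
    if request.join_key not in policy.allowed_join_keys:
        violations.append(f"join on '{request.join_key}' is not permitted")
    if request.aggregate not in policy.allowed_aggregates:
        violations.append(f"analysis '{request.aggregate}' is not permitted")
    if request.group_by not in policy.allowed_group_by:
        violations.append(f"grouping by '{request.group_by}' is not permitted")
    return violations


def filter_output(policy: CleanRoomPolicy, grouped_counts: dict[str, int]) -> dict[str, int]:
    """Drop groups too small to release, so single users cannot be singled out."""
    return {g: n for g, n in grouped_counts.items() if n >= policy.min_group_size}


policy = CleanRoomPolicy(
    allowed_join_keys=frozenset({"hashed_email"}),
    allowed_aggregates=frozenset({"count", "sum"}),
    allowed_group_by=frozenset({"region", "segment"}),
    min_group_size=50,
)

ok = check_query(policy, QueryRequest("hashed_email", "count", "region"))
rejected = check_query(policy, QueryRequest("raw_email", "select_rows", "region"))
released = filter_output(policy, {"north": 120, "south": 3})
```

In this model the first request passes, the second is rejected for both its join key and its analysis type, and only the "north" group is large enough to be released.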

Traditional data clean rooms versus distributed data clean rooms

It’s important to distinguish between traditional data clean rooms and distributed data clean rooms. In traditional data clean rooms, all data is stored in a single physical location, limiting how the data can be shared. With advances in cloud technology, distributed data clean rooms eliminate the need to move data from one storage location to another, since the data can live in the cloud. This allows each partner to control its own data while enabling governed analytics with another partner, or even with multiple partners simultaneously.

Benefits of distributed data clean rooms

Data clean rooms offer key advantages for advertisers, media companies and retailers, including:

1. Enhanced data access

Data clean rooms enable media companies and publishers to combine their audience data with partners’ data without exposing personally identifiable information, while advertisers can improve attribution tracking.

2. Custom audience creation

Data clean rooms facilitate the creation of custom audiences for advertising platforms, allowing marketers to optimize their ad targeting.

3. Advanced data analysis

Organizations can use data clean rooms to perform in-depth analysis on combined data sets, gaining insights into customer behavior, segmentation and customer lifetime value.

Use cases for data clean rooms

Let’s look at three specific use cases for data clean rooms.

Audience insights for advertising

Suppose a company has first-party data containing attributes about its customers and their associated sales SKUs. The company can use a data clean room to improve audience insights for advertising. Let’s say it wants to find new customers with the same attributes as its best customers and to combine those attributes with other characteristics to drive upsell opportunities.

To build target segments while complying with privacy regulations, a company can upload its data into a clean room hosted either by the company itself or by an advertising partner. It can use the clean room to implement privacy-enhancing technologies that allow participants to securely join and analyze first-party data without exposing raw user identities. Without the configurable settings provided by a clean room, data sharing between parties would be more tightly restricted by privacy laws, regulations and competitive concerns.
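To make the idea of joining first-party data without exposing identities more concrete, here is a small sketch of an overlap analysis. It assumes both parties have already replaced raw identifiers with a de-identified `match_key`. The column names, sample rows and the `overlap_by_attribute` function are hypothetical, and the threshold of 2 is far lower than would be used in practice.

```python
from collections import Counter


def overlap_by_attribute(brand_rows, partner_rows, min_group_size):
    """Count the brand's best customers per partner attribute, aggregates only.

    Rows are joined on a de-identified match key. Groups smaller than
    min_group_size are suppressed so the output cannot point to individuals.
    """
    partner_lookup = {row["match_key"]: row["interest"] for row in partner_rows}
    counts = Counter(
        partner_lookup[row["match_key"]]
        for row in brand_rows
        if row["is_best_customer"] and row["match_key"] in partner_lookup
    )
    return {interest: n for interest, n in counts.items() if n >= min_group_size}


# Hypothetical brand data: de-identified keys plus a first-party attribute.
brand_rows = [
    {"match_key": "k1", "is_best_customer": True},
    {"match_key": "k2", "is_best_customer": True},
    {"match_key": "k3", "is_best_customer": True},
    {"match_key": "k4", "is_best_customer": True},
    {"match_key": "k5", "is_best_customer": False},
]

# Hypothetical partner data: the same kind of key plus an audience attribute.
partner_rows = [
    {"match_key": "k1", "interest": "cooking"},
    {"match_key": "k2", "interest": "cooking"},
    {"match_key": "k3", "interest": "travel"},
    {"match_key": "k5", "interest": "cooking"},
    {"match_key": "k7", "interest": "travel"},
]

insights = overlap_by_attribute(brand_rows, partner_rows, min_group_size=2)
```

The brand learns that its best customers skew toward the "cooking" interest, but it never sees which partner records matched, and the single "travel" match is withheld because the group is too small.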

Monetizing proprietary data

The omnichannel customer journey is complex, and it rarely starts with a brand’s advertisement. For example, if a consumer is planning an upcoming purchase of a kitchen appliance, the journey is likely to start with online review sites. A reviews site collects top-of-funnel data that would be invaluable to the appliance brand. With a data clean room that can manage PII, the site owner could create a privacy-preserving third-party data product.

Retail and consumer packaged goods (CPG) industry collaboration

Data clean rooms allow retailers and CPG companies to collaborate with brands that advertise with them. For example, a retailer can share transaction data in a privacy- and governance-minded manner to provide insights into conversion signals and enable better targeting, personalization and attribution.
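A simple way to picture attribution in this setting is to compare conversion rates for shoppers who were exposed to a brand's ads with those who were not, releasing only aggregate numbers. The sketch below assumes de-identified keys on both sides. The `conversion_rates` function, the sample keys and the threshold are hypothetical, and a real analysis would need to account for factors this comparison ignores.

```python
def conversion_rates(exposed_keys, shopper_keys, purchaser_keys, min_group_size):
    """Compare purchase rates for ad-exposed and unexposed shoppers.

    All inputs are sets of de-identified match keys. Only group sizes and
    rates are returned; a rate is None when its group is too small to release.
    """
    exposed = exposed_keys & shopper_keys   # shoppers who saw the ad
    unexposed = shopper_keys - exposed      # shoppers who did not

    def summarize(group):
        if len(group) < min_group_size:
            return {"shoppers": len(group), "rate": None}
        converted = len(group & purchaser_keys)
        return {"shoppers": len(group), "rate": converted / len(group)}

    return {"exposed": summarize(exposed), "unexposed": summarize(unexposed)}


# Hypothetical de-identified keys: the retailer's shoppers and purchasers,
# and the brand's list of ad-exposed users ("x1" is not a retailer shopper).
shopper_keys = {f"s{i}" for i in range(1, 11)}
purchaser_keys = {"s1", "s2", "s5"}
exposed_keys = {"s1", "s2", "s3", "s4", "x1"}

report = conversion_rates(exposed_keys, shopper_keys, purchaser_keys, min_group_size=3)
```

Here the exposed group contains four matched shoppers, two of whom purchased, while the unexposed group contains six shoppers, one of whom purchased. Neither party receives the other's row-level data.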

Real-world example: Merkle’s Merkury Clean Rooms

A real-world example of an organization using a data clean room to achieve business growth is Snowflake customer Merkle. With Merkury Clean Rooms, powered by Snowflake Data Clean Rooms, Merkle’s clients and partners combine data from multiple sources without gaining unauthorized access to sensitive data.

Sharing data — and gleaning insights — more safely

A growing number of Merkle’s clients and partners have adopted Snowflake, which opens up new opportunities to collaborate. “When I joined the customer-facing side of our business three years ago, we were frequently having to consider competing offerings,” says John Gajewski, Senior Vice President of Architecture at Merkle. “Now the conversation has pivoted a lot to using Snowflake first.”

It used to be the case that working together securely across organizations, clouds and regions required moving data, which introduced unwanted risk. Merkle can now use Snowflake Secure Data Sharing to give its clients and partners live access to data, with less reliance on extract, transform and load (ETL) processes or SFTP. And Snowflake Data Clean Rooms, on which Merkle’s Merkury Clean Rooms are built, allow multiple parties with appropriate permissions to analyze data without exposing PII and other sensitive information.

Snowflake for Data Clean Rooms

Snowflake enables companies to securely and privately share data within data clean rooms for efficient, real-time analytics and deeper analysis. Participants can "list" the data they want to share — making it visible only to authorized parties — without necessarily needing to move the data.

Essentially, Snowflake provides a platform that companies can configure so that selected partners can securely access and analyze data while data privacy and security are preserved.
