Distributed Data Clean Rooms Powered by Snowflake
Jan 27, 2020 | 4 Min Read
Author: Justin Langseth
Modern Data Sharing, Snowflake Private Data Exchange
Under the pressure of increased privacy regulation in the marketing world, many Snowflake customers are becoming interested in the concept of data clean rooms. A data clean room is a safe place that allows multiple companies, or divisions of a single company, to bring data together for joint analysis under defined guidelines and restrictions that keep the data secure.
Data clean rooms have use cases in marketing attribution and sales. For example, if you know customers saw an advertisement and later bought products, you can understand whether you’re getting a return on your marketing dollars. This kind of attribution analysis has been done for years, but the California Consumer Privacy Act (CCPA) and the General Data Protection Regulation (GDPR) now block or complicate it.
Whether a clean room contains PII or anonymized data, data privacy practices are critical. Even anonymized data can often be tied back to actual people through creative analytics.
Snowflake Distributed Data Clean Rooms
Data clean rooms must control the following:
- What data comes in
- How the data in the clean room can be joined to other data in the clean room
- What types of analytics each party can perform on the data
- What data, if any, can leave
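One way to picture these four controls is as a single policy object that every upload and query is checked against. The sketch below is illustrative only and is not a Snowflake API: the `CleanRoomPolicy` class, its field names, and the sample column names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CleanRoomPolicy:
    """Hypothetical policy covering the four controls listed above."""
    allowed_input_columns: frozenset  # what data comes in
    allowed_join_keys: frozenset      # how data can be joined
    allowed_aggregates: frozenset     # what analytics are permitted
    exportable_columns: frozenset     # what data, if any, can leave

    def check_upload(self, columns):
        """Return the columns that may not enter the clean room."""
        return sorted(set(columns) - self.allowed_input_columns)

    def check_query(self, join_key, aggregate, output_columns):
        """Return a list of policy violations; an empty list means allowed."""
        violations = []
        if join_key not in self.allowed_join_keys:
            violations.append(f"join on '{join_key}' is not permitted")
        if aggregate not in self.allowed_aggregates:
            violations.append(f"aggregate '{aggregate}' is not permitted")
        blocked = sorted(set(output_columns) - self.exportable_columns)
        if blocked:
            violations.append(f"columns may not leave the clean room: {blocked}")
        return violations


# Assumed example values; a real policy would be agreed on by the participants.
policy = CleanRoomPolicy(
    allowed_input_columns=frozenset({"hashed_id", "region", "sku"}),
    allowed_join_keys=frozenset({"hashed_id"}),
    allowed_aggregates=frozenset({"count", "sum"}),
    exportable_columns=frozenset({"region", "sku"}),
)

rejected_columns = policy.check_upload(["hashed_id", "region", "email"])
violations = policy.check_query("hashed_id", "count", ["region", "hashed_id"])
```

Here the upload check flags `email` as a column that should not enter the clean room, and the query check flags `hashed_id` as a column that may be used for joining but may not appear in the output.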
Snowflake and its customers have started to realize that Snowflake Cloud Data Platform is ideal for building and operating data clean rooms. Snowflake’s data sharing technology, private data exchange platform, secure function and secure join capabilities, and underlying multi-tenant cloud data platform provide the foundation needed to run a data clean room.
And, unlike other companies, Snowflake can power distributed data clean rooms where each participant controls its own data while allowing governed, controlled analytics with another party—or even with multiple other parties simultaneously. Snowflake’s data sharing capabilities allow this without copying all the data into one database, and without having to trust a single party with all of the data.
An Advertising Example
For example, let’s say a brand has its own first-party data containing attributes about its customers and their associated sales SKUs. The brand wants to advertise to find new customers with the same attributes, and to combine those attributes with other characteristics to drive upsell opportunities. To create the target segments and comply with privacy requirements, the brand uploads its data into a clean room operated either by the brand or by its ad partner. Participants can securely join any first-party data without exposing IDs.
In a scenario like this, only limited amounts of data can flow between the various companies because of data privacy, regulatory, and competitive concerns. In particular, it has been difficult to link a campaign with a business outcome for a customer. Analyzing this data can be opaque and inefficient because parties cannot share personal information about individuals and because of the strict controls between online and offline advertising.
How a Distributed Data Clean Room Can Help
To make analytics more efficient and closer to real time, and to allow for deeper analysis, a company can configure a Snowflake Private Data Exchange where the participants privately “list” the data that would be useful for this analysis. The listings are visible only to the right parties, and each participant's data stays in that participant's own database account. Each participant can then configure access controls, use secure functions, and leverage secure joins to protect the data while still permitting joint analysis. This can happen immediately if both parties have Snowflake accounts, or if the Snowflake customer sets up a secure sub-account for a participant who is not a Snowflake customer.
For example, an advertising platform could use secure functions to allow analysis of advertising displays by geography, but prevent results from geographic areas or sample sizes that are too small. Secure functions can also disallow queries that may violate privacy principles.
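The small-sample rule described above can be sketched in a few lines of Python. This is a minimal illustration of the idea, not Snowflake's secure function mechanism: the function name `impressions_by_region`, the `region` field, and the threshold of 50 are all assumptions chosen for the example.

```python
from collections import Counter

# Hypothetical minimum group size; the right value depends on each party's
# privacy policy and is not specified in this article.
MIN_GROUP_SIZE = 50


def impressions_by_region(rows, min_group_size=MIN_GROUP_SIZE):
    """Count ad impressions per region, suppressing groups that are too small."""
    counts = Counter(row["region"] for row in rows)
    # Only regions that meet the threshold are returned to the caller.
    return {region: n for region, n in counts.items() if n >= min_group_size}


# Made-up sample data: 120 impressions in one region and 3 in another.
rows = [{"region": "CA"}] * 120 + [{"region": "NV"}] * 3
result = impressions_by_region(rows)
```

With this sample data, `result` contains only the larger region. The three-impression region is dropped, so the caller cannot use the output to reason about a handful of individuals.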
Secure joins, in turn, can establish linkages at the individual level (people, devices, cookies, or other identifiers) without exchanging or exposing any PII.
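To give a feel for how records can be matched without revealing raw identifiers, the sketch below joins two lists on keyed hashes of the identifiers rather than on the identifiers themselves. It swaps in a generic technique (HMAC-SHA256 tokens) purely for illustration and does not describe how Snowflake implements secure joins. The names `blind_id` and `count_overlap`, the shared key, and the email addresses are hypothetical, and both lists are processed in one program here only to keep the example short.

```python
import hashlib
import hmac


def blind_id(identifier: str, shared_key: bytes) -> str:
    """Turn a raw identifier into a keyed hash so the raw value is not exposed."""
    normalized = identifier.strip().lower()
    return hmac.new(shared_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def count_overlap(brand_ids, platform_ids, shared_key: bytes) -> int:
    """Count identifiers present on both sides, comparing only hashed tokens."""
    brand_tokens = {blind_id(i, shared_key) for i in brand_ids}
    platform_tokens = {blind_id(i, shared_key) for i in platform_ids}
    return len(brand_tokens & platform_tokens)


# Assumed example inputs; a real key would be generated and stored securely.
key = b"example-shared-secret"
brand = ["alice@example.com", "bob@example.com", "carol@example.com"]
platform = [" Alice@Example.com", "dave@example.com", "carol@example.com"]
overlap = count_overlap(brand, platform, key)
```

Only the aggregate overlap count comes out of the comparison. A keyed hash is used instead of a plain hash so that someone without the key cannot rebuild the tokens from a list of guessed identifiers.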
How to Get Started Building Your Data Clean Room
You can use Snowflake to enable data clean rooms today. Follow these steps:
- Get a Snowflake account for each company or group (or have an existing Snowflake customer provide a secure sub-account of their account).
- Load data into Snowflake (done by each company or group).
- Establish a Snowflake Private Data Exchange between the participants.
- Configure secure functions and secure joins to protect the data.
- Perform analysis on the joint data using standard analysis tools, while respecting each party’s privacy policies and boundaries.
Let Us Know About Your Data Clean Room Needs
Although this article focused on marketing, the same approaches apply to other verticals like healthcare, financial services, government, high tech, media, and telco. Let us know if you are thinking about establishing a data clean room—we’d be happy to help!