Snowflake on Snowflake: How We Strengthened Data Governance Using Dynamic Data Masking
Aug 26, 2020
Author: Trask Dunlap | Contributing Authors: Falguni Sonawala, Tamir Rozenberg, and Andrew Seitz
Data Governance, Snowflake Technology
Managing access to sensitive data is the name of the game when it comes to security and data governance. It’s required to protect sensitive data from unauthorized changes or exposure, and it’s now a mandate as part of privacy regulations such as GDPR and the California Consumer Privacy Act (CCPA). Companies all over the world are now focused on protecting sensitive PII associated with their customers and employees.
Traditional role-based access control (RBAC) can be used to enforce least-privilege access. But what happens when traditional RBAC controls don’t offer enough flexibility? At Snowflake, we used the new Dynamic Data Masking feature to design an RBAC model that gives us granular control over what our employees can view, in support of our data governance model.
Role-Based Access Control
Snowflake offers a robust RBAC system where administrators can create custom roles to meet their organization’s requirements. Snowflake’s internal RBAC design is based on the principle of least privilege, where users are granted only the privileges required for their job duties. Roles are designed to create a clear separation of duties between read-only access and write access: enterprise user roles get read-only access, and administrative roles get write access. More details of this design will be a topic for a future blog post.
However, even though Snowflake’s RBAC model provided a solid framework to solve most access control requirements, it still presented limitations:
- RBAC worked well to control write access to data, but read-only access offered a different set of challenges.
- If roles and permissions are too granular, role management becomes very manual and unruly.
- If roles and permissions are too broad, they do not adhere to the least-privilege model.
Managing access to sensitive data using the RBAC model meant that we had to require and track special approvals for each access request. We could create multiple views, which would cut out the information that a certain group of users should not see. However, this would require an administrator to manage access to multiple slices of the same set of data. As you can imagine, this approach gets messy really quickly.
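To make the "messy really quickly" point concrete, here is a minimal sketch of the view-per-slice approach. It assumes a hypothetical `employees` table with three sensitive columns and generates one view for every combination of columns that some group might need hidden. The table, column, and view names are illustrative only and are not taken from our environment.

```python
from itertools import combinations


def view_definitions(table, all_columns, sensitive_columns):
    """Build one CREATE VIEW statement per non-empty set of hidden columns."""
    statements = []
    for size in range(1, len(sensitive_columns) + 1):
        for hidden in combinations(sensitive_columns, size):
            # Keep every column that is not hidden for this slice.
            visible = [c for c in all_columns if c not in hidden]
            view_name = f"{table}_no_" + "_".join(hidden)
            statements.append(
                f"CREATE VIEW {view_name} AS "
                f"SELECT {', '.join(visible)} FROM {table};"
            )
    return statements


# Hypothetical table layout, used only for illustration.
views = view_definitions(
    "employees",
    ["id", "name", "ssn", "salary", "email"],
    ["ssn", "salary", "email"],
)
for statement in views:
    print(statement)
```

With three sensitive columns this worst case already produces seven views, each needing its own grants, and the count roughly doubles with every additional sensitive column.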
The Pressure Was On
Confronted with the need to create a powerful and flexible access control framework, our teams put their heads together to enumerate all possible methods. Should we revoke access to the tables and re-create views with the sensitive columns removed? Should we then adjust the permissions of the roles? Or should we just revoke the roles from the users and then see who complains? Could these approaches break any existing workflows and reports?
Everything seemed like a patchwork approach that would cause disruption and also make the environment more cluttered and obscure. However, with Snowflake, there is always a smarter way to solve tough challenges.
Data Masking to the Rescue
Snowflake recently released the Dynamic Data Masking feature, which allows a designated administrator to create and apply column-level masking policies (see Figure 1). A policy restricts access to the data in each column (of a table or view) to which it is applied. Based on the policy, certain authorized roles (green-lit roles) will see the column values “as is,” while the other roles (red-lit roles) will see obfuscated values. The policy definition is also flexible. For example, if you don’t want to enumerate every role that shouldn’t see the exact value of a column, you can name only the green-lit roles, or only the red-lit ones. Users with the red-lit roles would still be able to query the table or view but would not see the actual values of the masked column.
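The green-lit and red-lit behavior described above can be modeled in a few lines of Python. This is only a sketch of the decision logic, not how Snowflake evaluates policies internally; the function names, the role names, and the sample value are all hypothetical.

```python
from typing import List, Optional, Set

MASKED_VALUE = "000000"  # placeholder shown to unauthorized roles (illustrative)


def apply_mask(
    value: str,
    current_role: str,
    green_lit: Optional[Set[str]] = None,
    red_lit: Optional[Set[str]] = None,
) -> str:
    """Return the value as is for authorized roles, otherwise the placeholder.

    The policy names either the green-lit roles (everyone else is masked)
    or the red-lit roles (everyone else sees the value), never both.
    """
    if (green_lit is None) == (red_lit is None):
        raise ValueError("name exactly one of green_lit or red_lit")
    if green_lit is not None:
        authorized = current_role in green_lit
    else:
        authorized = current_role not in red_lit
    return value if authorized else MASKED_VALUE


def query_column(values: List[str], current_role: str, **policy) -> List[str]:
    """Every role can still run the query; only the returned values differ."""
    return [apply_mask(v, current_role, **policy) for v in values]


# Hypothetical roles: HR_ADMIN is green-lit, ANALYST is not.
print(query_column(["123-45-6789"], "HR_ADMIN", green_lit={"HR_ADMIN"}))
print(query_column(["123-45-6789"], "ANALYST", green_lit={"HR_ADMIN"}))
```

The query succeeds for both roles, which is why existing reports keep working: the masking decision is made per value at read time rather than by denying access to the table.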
This approach was the solution we had been looking for! We could create a set of policies to mask every column with sensitive data across an entire account. Only authorized roles assigned to the individuals who required access would be able to see those columns, and others would just see the value defined in the policy (for example, “000000”). The best part was that applying the new data masking policy did not negatively affect or break any queries or reports. Nothing changed for the majority of users.
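As a rough illustration of what "a set of policies" can look like, the sketch below generates the SQL for one masking policy and attaches it to several columns. The statement shapes follow the `CREATE MASKING POLICY` and `ALTER TABLE ... MODIFY COLUMN ... SET MASKING POLICY` forms as we understand them from the Snowflake documentation, so check them against the current docs before running anything. The policy, role, table, and column names are hypothetical, and the helpers do no quoting or escaping, so they assume trusted input.

```python
def masking_policy_sql(policy, green_lit_roles, masked_value="000000"):
    """Build a CREATE MASKING POLICY statement for a string column."""
    roles = ", ".join(f"'{role}'" for role in sorted(green_lit_roles))
    return (
        f"CREATE OR REPLACE MASKING POLICY {policy} "
        f"AS (val STRING) RETURNS STRING ->\n"
        f"  CASE WHEN CURRENT_ROLE() IN ({roles}) "
        f"THEN val ELSE '{masked_value}' END;"
    )


def apply_policy_sql(policy, columns):
    """Build one ALTER TABLE statement per (table, column) pair."""
    return [
        f"ALTER TABLE {table} MODIFY COLUMN {column} "
        f"SET MASKING POLICY {policy};"
        for table, column in columns
    ]


# Hypothetical names, for illustration only.
statements = [masking_policy_sql("pii_mask", {"HR_ADMIN", "PAYROLL"})]
statements += apply_policy_sql(
    "pii_mask",
    [("hr.employees", "ssn"), ("hr.candidates", "ssn")],
)
for statement in statements:
    print(statement)
```

One policy can be attached to many columns, so granting a new role access means editing the policy in one place instead of touching every table.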
This approach was a quick, efficient, and thorough solution to a tough problem, implemented in days instead of weeks. No overhaul of RBAC or of the current setup of databases and tables was required. Instead, we just had to write a precise set of policies and apply them to the right tables and the right roles. This new feature showed us the power of Snowflake: It can make data accessible while making sure it is accessible only to the right people.
“Who can see what data?” is a question every business must be able to answer. With Dynamic Data Masking added to the traditional RBAC kit of problem-solving tools, Snowflake account and database administrators have reason to rejoice. This feature is a natural addition to the security and data governance wheelhouse, working harmoniously with RBAC. To learn more about Dynamic Data Masking and how it can complement your existing Snowflake access controls, refer to the following Snowflake documentation: