Govern an Open Lakehouse with Snowflake Open Catalog, a Managed Service for Apache Polaris™

To enhance security and ease operational burden, many organizations with data lakes or lakehouses want the flexibility to securely integrate their tools of choice on a single copy of data. Open standards for the storage format and the catalog API have helped, but there’s still a need for open standards for the catalog itself, including a consistent way to apply security access controls to data. Many catalog options don’t support such a standard, so security remains siloed and access controls must be enforced in multiple places for different engines.

To address these security and governance challenges, Snowflake worked with the community to form open standards with Apache Polaris™ (incubating). Now, we are thrilled to announce that Snowflake Open Catalog is generally available (GA). As a managed service for Apache Polaris, Snowflake Open Catalog helps you more effectively integrate and secure open lakehouses. Teams in your organization can finally collaborate on data lakes in a governed manner with consistent access controls for many engines — both readers and writers. Snowflake Open Catalog leverages Apache Polaris, which allows you to more easily adapt as your organization grows and its needs evolve by integrating new engines and applying consistent governance controls.

"We're big believers in the Apache Iceberg ecosystem because of the flexibility and power it provides users, which only grows as more tools and products adopt it. The catalog is a critical piece of this ecosystem, and Snowflake Open Catalog is the most fully functional, managed solution for Apache Polaris on the market today. It's simple to use, offers RBAC out of the box, and easy to integrate with core services of the Iceberg ecosystem, like Upsolver's real-time data ingestion and Iceberg table optimizer." – Roy Hasson, VP of Product and Marketing, Upsolver

Built on the Apache Polaris open source project

Apache Polaris, which is currently undergoing incubation at the Apache Software Foundation, was first open sourced in July 2024 as Polaris Catalog. Since open sourcing, more people have become involved to help improve documentation, fix bugs and propose new features. As seen with Apache projects, such as Apache Iceberg™, transparent and vendor-neutral project governance is a powerful way to build a community — translating to greater longevity and more optionality for adopters.

Snowflake Open Catalog uses the same catalog implementation as Apache Polaris, so you don’t have to worry about feature incompatibilities that could lead to lock-in. As the functionality in the Apache Polaris project evolves, so too will Snowflake Open Catalog. And with Snowflake Open Catalog, you gain the reliability, security, scalability and support of a Snowflake managed service, freeing your team from the complexities of setup, maintenance and updates, so you can focus on innovation rather than infrastructure.

How Snowflake Open Catalog enables governed interoperability

While some catalogs have added an “open front door” by supporting the Apache Iceberg REST catalog API for reads and writes, they use a proprietary implementation. These catalogs let you integrate many engines, but the proprietary implementation may make it harder to switch catalogs later.

Other catalogs have built a “one-way front door” by supporting a read-only Apache Iceberg REST endpoint, and they provide an open source implementation. But the one-way front door significantly hampers your flexibility by limiting which engines you can use to write. And the open source implementation lacks the transparency that comes with vendor-neutral governance.

Apache Polaris provides an “open, two-way front door,” granting freedom to use a variety of engines to both read and write. Any engine that you can connect will integrate with the access controls defined in Apache Polaris, giving you a unified security plane across all the Apache Iceberg tables in your data lake or lakehouse. When any of these engines try to query a table, write to a table or see a list of namespaces and tables, Apache Polaris will first check if they have permission to do so. If they do, then Apache Polaris will provide a scoped set of temporary storage credentials needed to perform the operation.
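
The check-then-vend flow described above can be illustrated with a small, self-contained Python model. This is only a sketch for intuition, not Apache Polaris code: the class ToyCatalog, the privilege names, the principals, the bucket path and the 15-minute lifetime are all hypothetical, and the rule that a grant on a catalog or namespace also covers the tables beneath it is an assumption made for this example.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical privilege names for this sketch; not Apache Polaris identifiers.
READ, WRITE = "read", "write"


@dataclass(frozen=True)
class ScopedCredential:
    token: str            # opaque stand-in for a temporary storage credential
    path_prefix: str      # the only storage location the credential covers
    privilege: str        # the single operation it was issued for
    expires_at: datetime  # short lifetime, so a leaked token ages out


class ToyCatalog:
    """In-memory model: check the grant first, then vend a scoped credential."""

    def __init__(self, ttl_seconds=900):
        self._grants = {}     # (principal, securable path) -> set of privileges
        self._locations = {}  # table path -> storage prefix
        self._ttl = timedelta(seconds=ttl_seconds)

    def register_table(self, table, location):
        self._locations[table] = location

    def grant(self, principal, securable, privilege):
        self._grants.setdefault((principal, securable), set()).add(privilege)

    def is_allowed(self, principal, table, privilege):
        # Assumption: a grant on a parent (catalog or namespace) covers the table.
        # Walk from the table itself up to the catalog root.
        for depth in range(len(table), 0, -1):
            if privilege in self._grants.get((principal, table[:depth]), set()):
                return True
        return False

    def vend(self, principal, table, privilege):
        # Permission check happens before any credential is created.
        if not self.is_allowed(principal, table, privilege):
            raise PermissionError(f"{principal} lacks {privilege} on {'.'.join(table)}")
        return ScopedCredential(
            token=secrets.token_urlsafe(16),
            path_prefix=self._locations[table],
            privilege=privilege,
            expires_at=datetime.now(timezone.utc) + self._ttl,
        )


catalog = ToyCatalog()
orders = ("lake", "sales", "orders")  # (catalog, namespace, table)
catalog.register_table(orders, "s3://example-bucket/lake/sales/orders/")
catalog.grant("spark_etl", ("lake", "sales"), WRITE)  # namespace-level grant
catalog.grant("bi_reader", orders, READ)              # table-level grant

cred = catalog.vend("spark_etl", orders, WRITE)
print(cred.path_prefix, cred.privilege)
```

In this model, two engines with different grants reach the same table through one policy store, and a request without a matching grant fails before any credential exists.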

In addition to these benefits by way of Apache Polaris, Snowflake Open Catalog provides more functionality, like multi-factor authentication (MFA), availability in multiple clouds and regions, and a web interface to make it even more secure and easier to use.

Snowflake Open Catalog can be integrated with Snowflake’s engine, but this is not required. It can serve as a catalog that provides governed interoperability for any Apache Iceberg REST-compatible engine.

What’s new in GA?

Based on customer feedback from public preview, this GA release of Snowflake Open Catalog comes with some new features to meet enterprise security requirements:

  • Multi-user support: You can now create multiple users to manage a Snowflake Open Catalog account.

  • Multi-factor authentication: Admins of Snowflake Open Catalog accounts can also enforce MFA at the account level.

  • Object-level privileges: Apache Polaris supports scoping privileges at the catalog, namespace or table levels. This is now available in the Snowflake Open Catalog interface.

  • Table details: You can now view table schema and privileges in the Snowflake Open Catalog user interface.

  • Billing: Snowflake Open Catalog accounts can be created in a Snowflake free trial organization. Accounts created in the trial organization are rate limited to an average of five requests per second, with a burst allowance of 50 requests per 10-second interval. Snowflake Open Catalog accounts created in a billable Snowflake organization don’t have this rate limit and are charged based on the number of requests per month. Snowflake Open Catalog will provide the service for free for a limited period of time. For more details on billing, please refer to the documentation.
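
The trial rate limit works out to the same budget either way: 5 requests per second × 10 seconds = 50 requests per interval. A client running against a trial account may want to pace itself to stay under that budget. The sketch below is one hypothetical way to do so on the client side with a sliding window. The source does not describe how the service measures the interval, so the sliding-window choice, the class name SlidingWindowLimiter and its methods are assumptions for illustration only.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side throttle: at most max_requests in any window_seconds span."""

    def __init__(self, max_requests=50, window_seconds=10.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock   # injectable so the limiter can be tested without sleeping
        self._sent = deque()  # timestamps of requests still inside the window

    def _evict(self, now):
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()

    def try_acquire(self):
        """Record a request and return True if the budget allows it, else False."""
        now = self._clock()
        self._evict(now)
        if len(self._sent) < self.max_requests:
            self._sent.append(now)
            return True
        return False

    def wait_time(self):
        """Seconds until the next request would be allowed (0.0 if allowed now)."""
        now = self._clock()
        self._evict(now)
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])


limiter = SlidingWindowLimiter()
if limiter.try_acquire():
    pass  # send the catalog request here
else:
    time.sleep(limiter.wait_time())
```

A sliding window is used here because a token bucket with a capacity of 50 and a refill of five per second could admit close to 100 requests in a single 10-second span, which would exceed the stated burst allowance.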

[Image: Multi-user capabilities in Snowflake Open Catalog, a Managed Service for Apache Polaris™]
[Image: Table details and schema in Snowflake Open Catalog, a Managed Service for Apache Polaris™]

Get started with Snowflake Open Catalog

To learn more about Snowflake Open Catalog, visit our documentation, or try it out for yourself with a tutorial. Learn more about Apache Polaris on GitHub, and discover how you can contribute to the project. For updates, follow Apache Polaris on LinkedIn and X.

 
