Data Vault

What is a Data Vault?

The Data Vault System of Business Intelligence or simply Data Vault (DV) modeling provides a method and approach to modeling your enterprise data warehouse (EDW) that is agile, flexible, and scalable.

The formal definition, as written by its inventor, Dan Linstedt:

«The Data Vault is a detail-oriented, historical tracking, and uniquely linked set of normalized tables that support one or more functional areas of business. It is a hybrid approach encompassing the best of breed between 3rd normal form (3NF) and star schema. The design is flexible, scalable, consistent, and adaptable to the needs of the enterprise. It is a data model that is architected specifically to meet the needs of today’s enterprise data warehouses.»

The DV was developed specifically to address the agility, flexibility, and scalability issues found in the other mainstream data modeling approaches used in the data warehousing space. It was built to be a granular, non-volatile, auditable, historical repository of enterprise data.
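
To make "historical tracking" and "uniquely linked set of normalized tables" a little more concrete, here is a minimal sketch of two table types commonly used in Data Vault models: a hub (one row per business key) and a satellite (insert-only descriptive history). The terms hub and satellite, the table and column names, the SHA-256 hash key, and the sample rows are all assumptions made for illustration; none of them come from the definition quoted above. SQLite, via Python's standard library, is used only so the sketch runs locally. It is a stand-in, not Snowflake.

```python
import hashlib
import sqlite3


def hash_key(*business_keys: str) -> str:
    """Derive a deterministic key from one or more business keys.

    Hashing the normalized business key is one common convention;
    it is an assumption of this sketch, not a requirement.
    """
    normalized = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical hub and satellite tables for a "customer" business concept.
DDL = """
CREATE TABLE hub_customer (
    customer_hk   TEXT PRIMARY KEY,      -- hash of the business key
    customer_id   TEXT NOT NULL UNIQUE,  -- the business key itself
    load_ts       TEXT NOT NULL,         -- when the key was first seen
    record_source TEXT NOT NULL          -- where it came from (auditability)
);
CREATE TABLE sat_customer_details (
    customer_hk   TEXT NOT NULL REFERENCES hub_customer (customer_hk),
    load_ts       TEXT NOT NULL,         -- one row per change: history is kept
    name          TEXT,
    city          TEXT,
    record_source TEXT NOT NULL,
    PRIMARY KEY (customer_hk, load_ts)
);
"""


def build_demo() -> sqlite3.Connection:
    """Create the tables in memory and load one customer with two versions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    hk = hash_key("C-1001")
    conn.execute(
        "INSERT INTO hub_customer VALUES (?, ?, ?, ?)",
        (hk, "C-1001", "2024-01-01T00:00:00", "crm"),
    )
    # Changes are appended as new rows; existing rows are never updated.
    conn.executemany(
        "INSERT INTO sat_customer_details VALUES (?, ?, ?, ?, ?)",
        [
            (hk, "2024-01-01T00:00:00", "Ada", "London", "crm"),
            (hk, "2024-03-01T00:00:00", "Ada", "Paris", "crm"),
        ],
    )
    conn.commit()
    return conn


def current_city(conn: sqlite3.Connection, customer_id: str):
    """Return the city from the most recently loaded satellite row, if any."""
    row = conn.execute(
        """
        SELECT s.city
        FROM hub_customer AS h
        JOIN sat_customer_details AS s ON s.customer_hk = h.customer_hk
        WHERE h.customer_id = ?
        ORDER BY s.load_ts DESC
        LIMIT 1
        """,
        (customer_id,),
    ).fetchone()
    return row[0] if row else None
```

Calling `current_city(build_demo(), "C-1001")` reads the latest satellite row, while the earlier row stays in the table as history. Because changes arrive as new rows rather than updates, the model stays non-volatile and auditable in the sense described above.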

How does Snowflake support Data Vault?

The Snowflake Data Cloud offers an ANSI SQL RDBMS with pay-as-you-go pricing. It supports tables and views like all the relational solutions on the market today. Since, from a data modeling perspective, Data Vault is a specific way and pattern for designing tables for your data warehouse, there are no issues implementing one in Snowflake.

In fact, with the combination of MPP compute clusters, an optimized columnar storage format, and patent-pending Adaptive Data Warehouse technology, you may get better results with your Data Vault loads and queries, with less effort, than with legacy data warehouse solutions. Remember that with Snowflake you don’t need to pre-plan partitioning or distribution keys, or build indexes, to get great performance. That is all handled by our Dynamic Query Optimization feature, which uses our secure cloud-based metadata store and a sophisticated feedback loop to monitor and tune your queries based on data access patterns and resource availability, among other things.