A data hub is a hub-and-spoke system for data integration in which data from multiple sources, each with its own requirements, is reconfigured for efficient storage, access, and delivery of information.
A data hub differs from a data lake by homogenizing data and possibly serving it in multiple desired formats, rather than simply storing it in one place, and by adding other value such as de-duplication, quality checks, security, and a standardized set of query services. A data lake tends to store data in one place for availability and leaves it to the consumer to process or add value to the data.
A data hub is similar to a data lake in that both approaches involve transferring data from disparate silos into a single new system. The similarities end there, though. A traditional data lake unites data in one place but leaves it in its original, mutually incompatible formats. Without a common format, querying across sources is difficult and real-time processing is hard to support.
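The homogenizing, de-duplication, and multi-format serving described above can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular product works: the two source systems, their field names, the sample records, and the helper functions (normalize, build_hub, as_json, as_csv) are all hypothetical, and it assumes that a normalized email address is a good enough key for spotting duplicates.

```python
import csv
import io
import json

# Hypothetical records from two source systems with incompatible field names.
CRM_ROWS = [
    {"FullName": "Ada Lovelace", "Email": "Ada@Example.com"},
    {"FullName": "Alan Turing", "Email": "alan@example.com"},
]
BILLING_ROWS = [
    {"name": "Ada Lovelace", "email_address": "ada@example.com "},
    {"name": "Grace Hopper", "email_address": "grace@example.com"},
]


def normalize(row, name_key, email_key, source):
    """Map a source-specific row onto the hub's common schema."""
    return {
        "name": row[name_key].strip(),
        "email": row[email_key].strip().lower(),
        "source": source,
    }


def build_hub(crm_rows, billing_rows):
    """Homogenize both sources, then de-duplicate on the normalized email."""
    unified = [normalize(r, "FullName", "Email", "crm") for r in crm_rows]
    unified += [
        normalize(r, "name", "email_address", "billing") for r in billing_rows
    ]
    hub = {}
    for record in unified:
        # Assumption: the first record seen for an email wins; later ones are dropped.
        hub.setdefault(record["email"], record)
    return list(hub.values())


def as_json(records):
    """Serve the hub's records as a JSON document."""
    return json.dumps(records, indent=2)


def as_csv(records):
    """Serve the same records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["name", "email", "source"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = build_hub(CRM_ROWS, BILLING_ROWS)
print(as_json(records))
print(as_csv(records))
```

A data lake, by contrast, would correspond to keeping CRM_ROWS and BILLING_ROWS side by side as they arrived and leaving the normalize and de-duplication steps to each consumer.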
Snowflake and Data Hubs
The Snowflake Cloud Data Platform allows organizations to easily acquire, publish, or share data sets without deploying cumbersome ETL processes. With both public and private options, the Snowflake Data Exchange creates a powerful internal or external data hub that redefines how companies acquire, utilize, share, and monetize data in the Cloud economy.