How to Make Data Protection and High Availability for Analytics Fast and Easy

When moving enterprise data warehouse analytic workloads to the cloud, it’s important to consider data protection and high availability (HA) services that will keep your valuable data preserved and your analytics running. Possible events such as human error, infrastructure failure, an unfortunate act of nature or any activity that places your data at risk can’t be ignored.

All the while, data protection and HA should be fast and easy. It shouldn’t take days, weeks or months, nor an army of technical specialists or a big budget, to make these critical safety measures happen.  

What data-driven companies depend on

With real-time dashboards, organizations depend on data analytics to communicate the status of business operations. Increasingly, companies embed self-service analytics into customer-facing applications. Therefore, enterprises spend enormous amounts of effort, energy and resources to gather and cultivate data about their customers. With all this activity around data, loss of any data or processing capability could result in catastrophic consequences for an organization.

How Snowflake protects your data and services

For these reasons, Snowflake innovates and integrates data protection and HA as core capabilities of our cloud-built data warehouse-as-a-service. In Figure 1, you’ll see what makes Snowflake different: our protection capabilities are all built in and orchestrated with metadata across your entire service. The figure also illustrates how Snowflake resilience is automatically distributed across three availability zones.

  • Built-in data protection: Over and above standard cloud data protection, Snowflake Time Travel enables you to recover data as it existed at any point within the retention period, which can be up to 90 days. It is all accomplished automatically. Other than specifying the number of days for Time Travel retention at setup (default is 24 hours for Snowflake Enterprise and above), you do not have to initiate anything or manage snapshots.

This brings significant advantages for business analysts performing scenario-based analytics on changed datasets, or for data scientists who want to train new models and algorithms on old data.
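As a rough illustration of how little setup this involves, the sketch below builds the kind of SQL statements a user might issue for Time Travel. It is a minimal sketch, not an official client: the helper function names and the `orders` table are hypothetical, and the statements are only assembled as strings rather than sent to Snowflake. The 90-day ceiling in the validation comes from the limit mentioned above.

```python
# Minimal sketch: build Time Travel SQL statements as plain strings.
# Helper names and the "orders" table are hypothetical; nothing here
# connects to Snowflake or executes the statements.

MAX_RETENTION_DAYS = 90  # upper limit quoted in this article


def set_retention_sql(table: str, days: int) -> str:
    """Statement that sets the Time Travel retention period for a table."""
    if not 0 <= days <= MAX_RETENTION_DAYS:
        raise ValueError(f"retention must be 0..{MAX_RETENTION_DAYS} days")
    return f"ALTER TABLE {table} SET DATA_RETENTION_TIME_IN_DAYS = {days};"


def query_as_of_sql(table: str, seconds_ago: int) -> str:
    """Statement that reads a table as it existed `seconds_ago` seconds ago."""
    if seconds_ago < 0:
        raise ValueError("seconds_ago must be non-negative")
    return f"SELECT * FROM {table} AT(OFFSET => -{seconds_ago});"


def undrop_sql(table: str) -> str:
    """Statement that restores a dropped table still within retention."""
    return f"UNDROP TABLE {table};"


if __name__ == "__main__":
    print(set_retention_sql("orders", 90))
    print(query_as_of_sql("orders", 3600))  # state one hour ago
    print(undrop_sql("orders"))
```

A business analyst comparing today's data with last week's, for example, would only need the second statement with a larger offset; no snapshot has to be created beforehand.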

  • Built-in service protection against node failures: The impact of a node failure can be hard to assess because cloud data warehouse vendors implement their services differently. While other cloud data warehouse or querying services may provide some level of redundancy for current data, the mechanisms that protect against data corruption or data loss when a node fails vary.

    In most cases, the burden is on you to create a cluster (i.e., a system with a node count greater than one) to protect against node failures. Typically, this means added cost (hardware, storage and software instances), as well as added complexity, to account for the additional nodes. Some competing services may also impose a performance penalty on data writes because, under the covers, the redundant nodes are written using compute resources. We see this most frequently with on-premises data warehouse environments retrofitted for the cloud. Moreover, there could be hidden costs if your cluster goes down and is not accessible for queries or updates while a failed node is being reconstructed.

    Because the Snowflake architecture separates the compute, storage and service layers, Snowflake assures resiliency and data consistency in the event of node failures. Depending on the severity of the failure, Snowflake may automatically reissue (retry) affected queries without any user involvement. There is also no impact on write (or read) performance. In addition, you can take advantage of lower-cost storage, whereas competing services may strongly encourage, or even restrict you to, premium-cost storage.
  • Built-in high availability: Providing an even higher degree of data protection and service resilience, within the same deployment region, Snowflake provides standard failover protection across three availability zones (including the primary active zone). Your data and business are protected. As you ingest your data, it is synchronously and transparently replicated across availability zones. This protection is automatically extended from Snowflake to customers, at no added charge.

    Further, all the metadata, the magic of Snowflake services, is also protected.
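To make the idea of synchronous replication across availability zones concrete, here is a toy in-memory model. It is only a sketch under stated assumptions: the `Zone` and `Region` classes and the zone names are hypothetical, and the code says nothing about how Snowflake implements replication internally. It shows only the general property that a write is acknowledged after every zone holds a copy, so a read can still be served when one zone is unavailable.

```python
# Toy model of synchronous replication across availability zones.
# Zone, Region and the zone names are hypothetical illustrations;
# this is not Snowflake's internal implementation.
from dataclasses import dataclass, field


@dataclass
class Zone:
    name: str
    available: bool = True
    store: dict = field(default_factory=dict)


class Region:
    def __init__(self, zone_names):
        self.zones = [Zone(name) for name in zone_names]

    def write(self, key, value) -> bool:
        """Copy the value to every zone, then acknowledge the write."""
        for zone in self.zones:
            zone.store[key] = value
        return True  # acknowledged only after all copies exist

    def read(self, key):
        """Serve the read from the first zone that is still available."""
        for zone in self.zones:
            if zone.available:
                return zone.store[key]
        raise RuntimeError("no availability zone reachable")


if __name__ == "__main__":
    region = Region(["az-1", "az-2", "az-3"])
    region.write("row-1", "ingested data")
    region.zones[0].available = False  # simulate losing the primary zone
    print(region.read("row-1"))
```

The model assumes all zones are healthy at write time; handling a zone that fails in the middle of a write is deliberately left out to keep the sketch short.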

Summary

Bottom line, within the same deployment region, you do not have to configure or struggle with manually building an HA infrastructure. Our data warehouse-as-a-service takes care of this for you, automatically. Snowflake makes data protection and high availability fast and easy. You can mitigate risks with speed, cost-effectiveness, confidence and peace of mind.
