OCT 03, 2023|3 min read

Configure and Manage Data Pipelines Replication in Snowflake with Ease

We are excited to announce that data pipelines replication is now available in public preview. This new capability lets you replicate your entire data ingestion and transformation pipelines in Snowflake and, in the event of an outage, fail over with minimal downtime.

Turnkey data pipelines replication and failover

Snowflake provides a best-in-class experience for data engineering workloads. Thousands of customers use external stages, internal stages, Snowpipe, and storage integrations to build ingest pipelines, which are critical to ensuring that users and applications have access to the most recent, accurate data.

Being able to fail over instantly and resume data pipelines seamlessly during an outage is critical for businesses that prioritize high availability and data integrity. Until now, customers had to either take on the tedious task of manually recreating these crucial objects or develop custom scripts. To achieve zero data loss, customers also had to replay their data pipelines and de-duplicate data in their application layer. These steps not only introduced complexity but also increased the likelihood of user error during an outage, leading to long recovery times, which is suboptimal for mission-critical workloads. Now, Snowflake ensures you get idempotent loads even after you fail over, so you can replay your pipelines with minimal effort.

This new functionality enables the replication and failover of storage integrations, Snowpipe, internal stages, external stages, and load history. When you fail over to a secondary region due to an outage, your data pipelines resume and continue to ingest data with idempotent loads guaranteed, allowing you to achieve zero data loss. This further simplifies the business continuity experience in Snowflake by removing the need for complex custom tooling and workarounds, and allows customers to achieve a lower recovery point objective (RPO) and recovery time objective (RTO).
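To illustrate why replicating load history matters for idempotent loads, here is a minimal Python sketch. It is a toy model, not Snowflake's implementation: the `Account` class, the `replicate` function, and the file names are all hypothetical, and the only assumption is that a file already recorded in the load history is skipped when the pipeline is replayed.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Toy stand-in for an account (hypothetical, not a Snowflake API)."""
    name: str
    load_history: set = field(default_factory=set)  # names of files already loaded
    rows: list = field(default_factory=list)        # rows ingested so far

    def load(self, staged_files: dict) -> list:
        """Load every staged file not yet in the load history.

        Returns the names of the files that were actually loaded.
        """
        newly_loaded = []
        for file_name, file_rows in sorted(staged_files.items()):
            if file_name in self.load_history:
                continue  # already ingested: skipping keeps the replay idempotent
            self.rows.extend(file_rows)
            self.load_history.add(file_name)
            newly_loaded.append(file_name)
        return newly_loaded


def replicate(primary: Account, secondary: Account) -> None:
    """Copy the ingested rows and the load history to the secondary (assumed behavior)."""
    secondary.rows = list(primary.rows)
    secondary.load_history = set(primary.load_history)


if __name__ == "__main__":
    account_a = Account("A")
    account_b = Account("B")
    staged = {"f1.csv": [1, 2], "f2.csv": [3]}

    account_a.load(staged)            # normal operation on the primary
    replicate(account_a, account_b)   # periodic replication
    staged["f3.csv"] = [4]            # a file that arrives around the outage

    # After failover, replaying the whole stage only ingests the new file.
    print(account_b.load(staged))
    print(account_b.rows)
```

Without the replicated load history, the replay on the secondary would have no record of `f1.csv` and `f2.csv` and would ingest them a second time, which is the de-duplication work customers previously handled in their application layer.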

Let's take an example where primary ‘Account A’ is in the US West region. This account has an auto-ingest pipe configured to continuously load data from a storage bucket into a database in Snowflake. During normal operations, replication is configured to periodically replicate the pipeline objects from Account A to Account B (the secondary, in US East). When an incident impacts the availability of Account A, you can fail over to Account B. On failover, the auto-ingest pipe in Account B automatically begins to load data with idempotent guarantees, ensuring zero data loss.
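The Account A and Account B scenario can be sketched as three SQL statements, generated here by a small Python helper. This is a sketch under stated assumptions rather than a verified configuration: the statements follow Snowflake's failover group commands as we understand them, and the group, database, organization, and account names, as well as the 10-minute schedule, are hypothetical placeholders. Check the replication documentation for the exact options your pipeline needs (an auto-ingest pipe may also require its notification integration to be included).

```python
def create_primary_group_sql(group, database, target_account, schedule="10 MINUTE"):
    """Statement to run in the primary account (Account A in the example)."""
    return (
        f"CREATE FAILOVER GROUP {group}\n"
        f"  OBJECT_TYPES = DATABASES, INTEGRATIONS\n"
        f"  ALLOWED_INTEGRATION_TYPES = STORAGE INTEGRATIONS\n"
        f"  ALLOWED_DATABASES = {database}\n"
        f"  ALLOWED_ACCOUNTS = {target_account}\n"
        f"  REPLICATION_SCHEDULE = '{schedule}';"
    )


def create_secondary_group_sql(group, source_account):
    """Statement to run once in the secondary account (Account B)."""
    return f"CREATE FAILOVER GROUP {group} AS REPLICA OF {source_account}.{group};"


def failover_sql(group):
    """Statement to run in the secondary account to promote it during an outage."""
    return f"ALTER FAILOVER GROUP {group} PRIMARY;"


if __name__ == "__main__":
    # All identifiers below are hypothetical placeholders.
    group = "pipeline_fg"
    print(create_primary_group_sql(group, "ingest_db", "myorg.account_b"))
    print(create_secondary_group_sql(group, "myorg.account_a"))
    print(failover_sql(group))
```

The first statement would be executed in Account A, the second once in Account B to set up the replica, and the third in Account B only when you decide to fail over.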

Snowflake is committed to further simplifying the cross-region and cross-cloud replication experience. With this new feature, Snowflake is making it easier than ever for customers to configure and operationalize replication at scale, allowing you to safeguard your critical workloads from outages. We can't wait for you to try out this new feature and see how it can streamline your business continuity experience.

Get started

Data pipeline replication and failover is available in public preview in all Snowflake regions. To get started today, review the Snowflake documentation.

