Business continuity is a big deal. Business operations rely on access to data and the insights it can provide, and that access has become harder to guarantee as unpredictable events multiply. Everything from human error, power outages, and shifting weather patterns to wide-scale hardware malfunctions can constitute a “disaster,” disrupting a company’s computing capacity and ability to serve its customers and workforce.
As more data workloads move to the cloud, old backup and restore playbooks need to be rethought for the cloud era. Disasters will strike whether a company is ready or not, so it’s imperative to develop a sound, tested, and coordinated strategy well in advance.
Below are five steps to carry over into your disaster recovery and business continuity planning, each with actionable best practices and parameters.
Step 1: Lay out the potential types of risk
First, it’s crucial to understand the potential failures that could occur in your company. Take time to assess the most common risks across all organizations to see where your time and effort are best spent.
Five of the most common types of failure you can encounter are human error, single instance failure, zone failure, region failure, and multi-region failure. Many of these can be remediated with availability zones, built-in redundancy, and cross-cloud replication.
Once you’ve assessed the potential risks and challenges in your organization, it’s time to choose the regions for your primary and secondary storage. Keeping a primary and secondary region in sync, known as replication, mitigates the risk of failure at any level. Consider the following parameters when selecting regions:
- Decide between a single cloud or a multi-cloud strategy.
- Choose regions far enough away that a disaster won’t affect both.
- Consider what data you’re storing and whether regional regulations apply.
- Consider moving consumption to more affordable regions and across cloud providers.
- Capitalize on regional footprints to leverage the best cloud provider by region.
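As a rough illustration of the "far enough away" parameter above, the sketch below filters candidate secondary regions by great-circle distance from the primary. The region names, coordinates, and the 1,000 km threshold are all hypothetical; real region locations and a suitable minimum distance would come from your cloud provider and your own risk assessment.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(a, b):
    """Haversine distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def candidate_secondaries(primary, regions, min_km):
    """Names of regions at least min_km from the primary, farthest first."""
    origin = regions[primary]
    distances = {
        name: great_circle_km(origin, coords)
        for name, coords in regions.items()
        if name != primary
    }
    far_enough = [name for name, km in distances.items() if km >= min_km]
    return sorted(far_enough, key=lambda name: -distances[name])


# Hypothetical regions with rough, illustrative coordinates (assumptions).
REGIONS = {
    "primary-east": (39.0, -77.5),
    "nearby-east": (40.0, -83.0),
    "west": (45.8, -119.7),
    "europe": (53.3, -6.3),
}

if __name__ == "__main__":
    print(candidate_secondaries("primary-east", REGIONS, min_km=1000))
```

Distance is only one of the parameters listed above; regulatory limits on where data may live and regional pricing would narrow the list further.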
Step 2: Consider the impact of different risks and how to respond
Now that you’ve laid out the potential risks to your organization, it’s time to assess how they may impact your business and create a plan for a timely response to data disruptions. Begin by quantifying the business impact of an outage to each stakeholder. The questions to ask may include: What would happen if a daily sales report were delayed or an inventory dashboard were not refreshed on schedule? What data supports critical or client-facing applications?
One important step is ranking your business use cases by criticality. How does each case compare in terms of recovery time objective (RTO), recovery point objective (RPO), and granularity? This exercise will help you build a business continuity plan that addresses each failure scenario.
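The ranking exercise can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the `UseCase` type, the RTO and RPO values, and the choice to sort by tightest RTO first and then tightest RPO are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    name: str
    rto_minutes: int  # longest tolerable downtime
    rpo_minutes: int  # longest tolerable window of lost data


def rank_by_criticality(use_cases):
    """Most critical first: tightest RTO, then tightest RPO."""
    return sorted(use_cases, key=lambda u: (u.rto_minutes, u.rpo_minutes))


# Hypothetical objectives for the kinds of workloads mentioned above.
USE_CASES = [
    UseCase("daily sales report", rto_minutes=24 * 60, rpo_minutes=24 * 60),
    UseCase("inventory dashboard", rto_minutes=240, rpo_minutes=60),
    UseCase("client-facing application", rto_minutes=15, rpo_minutes=5),
]

if __name__ == "__main__":
    for rank, use_case in enumerate(rank_by_criticality(USE_CASES), start=1):
        print(rank, use_case.name, use_case.rto_minutes, use_case.rpo_minutes)
```

A real ranking would likely weigh more than two numbers, such as revenue at risk or contractual obligations, but a sortable table like this gives stakeholders a concrete starting point.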
Your approach will determine which actions must be taken, how to inform an application team before initiating failover, which dependent systems need to be activated, and how to share these answers with your stakeholders so you can revise your plan.
And last but not least, be sure to run disaster recovery drills frequently to identify the weakest link in your end-to-end failover plan. Assumption is the mother of all failure.
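One simple way to find the weakest link is to time each step of a drill and compare the total against the RTO. The sketch below assumes a hypothetical drill log with made-up step names and durations; it is an illustration of the idea rather than a complete drill framework.

```python
def review_drill(step_minutes, rto_minutes):
    """Summarize one failover drill.

    step_minutes maps each step of the failover to how long it took.
    Returns (total minutes, whether the RTO was met, slowest step).
    """
    total = sum(step_minutes.values())
    slowest = max(step_minutes, key=step_minutes.get)
    return total, total <= rto_minutes, slowest


# Hypothetical timings recorded during a drill (assumed values).
DRILL = {
    "notify application team": 10,
    "initiate failover": 5,
    "activate dependent systems": 35,
}

if __name__ == "__main__":
    total, met, slowest = review_drill(DRILL, rto_minutes=30)
    print(f"total={total} min, RTO met={met}, slowest step={slowest}")
```

Tracking these numbers from drill to drill shows whether changes to the plan are actually shortening recovery.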
Step 3: Specify rules, roles, and responsibilities
A well-defined hierarchy that decides who is able to access data and which data is available to whom is crucial during a disaster. Before one strikes, make sure to map out which databases will be protected, which regions will serve as primary backup centers, and who will make these governance decisions if some level of data is inaccessible.
The same applies during an ongoing recovery or failover. Authentication tools can be lifesavers during these challenging periods, and the most successful business continuity plans will support all forms of authentication. Maintaining access and privileges during these periods comes down to four practices:
- Keep role-based access controls consistent across accounts.
- Keep data masking policies consistent and in sync across accounts.
- Maintain resource allocations and govern consumption across replicas.
- Establish point-in-time consistency.
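To make the first practice concrete, the sketch below compares role grants between a primary and a secondary account and reports any drift. The role names, privilege strings, and the dictionary-of-sets representation are hypothetical; in practice the grants would be exported from whatever access-control system each account uses.

```python
def grant_drift(primary, secondary):
    """Return (missing, extra) grants on the secondary account.

    Each argument maps a role name to the set of privileges it holds.
    missing: privileges the primary grants that the secondary lacks.
    extra: privileges the secondary grants that the primary does not.
    """
    missing, extra = {}, {}
    for role in primary.keys() | secondary.keys():
        want = primary.get(role, set())
        have = secondary.get(role, set())
        if want - have:
            missing[role] = want - have
        if have - want:
            extra[role] = have - want
    return missing, extra


# Hypothetical grants exported from each account (assumed data).
PRIMARY = {
    "analyst": {"read:sales"},
    "engineer": {"read:sales", "write:sales"},
}
SECONDARY = {
    "analyst": {"read:sales", "write:sales"},
    "engineer": {"read:sales"},
}

if __name__ == "__main__":
    missing, extra = grant_drift(PRIMARY, SECONDARY)
    print("missing on secondary:", missing)
    print("extra on secondary:", extra)
```

The same comparison could be applied to data masking policies, since those also need to match across accounts.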
Step 4: Assess the true range of costs for business continuity
Budgeting for business continuity can seem daunting. Your first step can be to establish your company’s minimum needs during an outage, starting with its RPO: how much recent data it can afford to lose. The two main cost drivers will be the number of databases that will be replicated and the frequency of that replication.
Some questions to consider: What functionalities and teams must continue operating at all costs? Are there functionalities and teams your firm could live without for periods of time? How much data can you afford to lose, and for how long?
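The two cost drivers above can be combined into a back-of-the-envelope estimate. The model below is a deliberately simplified assumption: it treats the replication interval as equal to the RPO (ignoring how long a refresh itself takes) and charges the same flat, hypothetical amount per refresh. Actual pricing depends on your provider and on how much data changes between refreshes.

```python
from math import ceil

MINUTES_PER_DAY = 24 * 60


def refreshes_per_day(rpo_minutes):
    """Fewest evenly spaced refreshes per day that keep the gap within the RPO."""
    return ceil(MINUTES_PER_DAY / rpo_minutes)


def monthly_replication_cost(rpo_by_database, cost_per_refresh, days=30):
    """Toy estimate: every refresh of every database costs the same flat amount."""
    daily_refreshes = sum(refreshes_per_day(rpo) for rpo in rpo_by_database.values())
    return daily_refreshes * cost_per_refresh * days


# Hypothetical databases and their RPOs in minutes (assumed values).
RPO_BY_DATABASE = {"sales": 60, "inventory": 15, "archive": 24 * 60}

if __name__ == "__main__":
    # cost_per_refresh is an arbitrary illustrative unit price.
    print(monthly_replication_cost(RPO_BY_DATABASE, cost_per_refresh=0.5))
```

Even a toy model like this makes the trade-off visible: tightening one database's RPO from an hour to 15 minutes quadruples its refresh count, so tight objectives are best reserved for the data that needs them.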
Other helpful actions that could inform your budgeting decisions include examining your industry’s potential vulnerabilities, determining which data matters most, and running real-world testing to gauge the cost of an outage.
Step 5: Explore ways to make contingency plans pay off
Business continuity isn’t just a way to plan for disaster—it can also be a way to boost performance via data sharing, collaboration, and insights.
While some recovery systems lie dormant until a company needs them, a growing number do more. Some serve as value-creating assets with clear ROI regardless of whether recovery is ever needed. And for companies trying to eliminate silos, these systems can move petabytes of data quickly and maintain incremental synchronization.
Remember that you can do more with your recovery system than just recover. For more help building the best business continuity plan, read How to Build a Successful Business Continuity Strategy in the Cloud in 5 Steps for detailed steps, diagrams, and best practices.