Disasters happen. Unfortunately, they're a fact of life - and business - that sometimes not even the most prepared companies can avoid. Whether through error, malicious act or other circumstances that may be completely beyond a firm's control, you have to expect the unexpected.
But while you can't always prevent disasters, what you can do is be prepared to ensure any disruption is minimal, and this is where having a disaster recovery plan comes in. This is essential in getting critical IT systems and applications up and running again, and as more activities within enterprises become digitally dependent, no firm can afford to be operating without contingencies in place.
The importance of having a clear disaster recovery plan
It should go without saying that technology is at the heart of any business today, so any incident that disrupts access to data, or leaves firms unable to use critical applications can have serious consequences. But do you know just how damaging an incident could be?
According to one 2019 study by ITIC, 98% of firms say an hour of downtime will cost them at least $100,000, while 86% of businesses report the cost for one hour of downtime as $300,000 or higher. One in three firms even say being offline for an hour will cost them between $1 million and $5 million.
This only covers the direct costs, without taking into account other expenses, such as litigation, regulatory fines for any breaches, or reputational damage leading to longer-term lost business. It's therefore clear that firms need to react quickly to mitigate these effects.
7 steps to develop a disaster recovery plan
Reacting quickly means having a clear, comprehensive plan for what to do in the event of a disaster. While prevention is always better than cure, it won't always be possible, so being able to refer to a detailed document that sets out exactly what needs to be done to get up and running again helps take much of the stress and confusion out of any situation.
Here are seven key steps you should follow in order to create a disaster recovery (DR) plan.
1. Initial steps - setting up your plan
The first step in formulating any disaster recovery plan is determining how broad its scope will be, who needs to be involved in the process, and how it should be documented. Setting out your goals - such as how long you expect a recovery to take - will guide every stage of the development of your plan, so it's important to go in with a clear objective and your eyes open.
This will often take the form of a DR strategy document that details the aims of the plan, as well as names and contact information for key personnel. It should also set out the resources that’ll be needed, including technology and budgetary requirements, to ensure you stay on track.
2. Understanding the types of disasters
Assess what kinds of disasters pose a risk to the business, and the relative likelihood of each. Different issues will demand very different responses, so you need to make sure your plan covers all of them. While there is a huge range of potential problems that may arise, they tend to fall into a few key categories, including:
- Hardware failures - this is one of the most common causes of disasters for businesses, accounting for 45% of incidents
- Power failure - power disruptions to your data center or issues that affect how your network communicates can leave firms isolated and unable to operate at all
- Data loss or corruption - whether due to human error or misconfiguration, lost or corrupted data is responsible for almost a quarter of issues
- Criminal activity - this can include data breaches, ransomware, theft of intellectual property or even sabotage, and can originate inside or outside the business
- Natural disasters - fire, flooding, or even hurricanes or earthquakes may impact your ability to do business, depending on where you're located
3. Identifying your risks
Many large enterprises will have substantial IT estates, and not all of these will need the same amount of attention, so it's vital to identify where your highest risk areas are and what the likelihood is of anything going wrong. This should involve looking at all assets to determine both the range of disasters that could affect them, and the potential business impact of an incident.
This allows you to prioritize mission-critical assets, such as data centers or applications that the business cannot afford to be without for even the shortest amount of time. However, it should also ensure you're devoting an appropriate amount of resources to them - there's little point in spending large amounts of time and money safeguarding against scenarios that have a very small chance of happening if there are other situations that are more likely to occur.
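One common way to weigh likelihood against business impact is to rank each risk by its expected loss (likelihood multiplied by impact). The snippet below is a minimal sketch of that idea, not a prescribed method: the Risk class, the prioritize function, and every asset name, probability, and cost figure are hypothetical placeholders you would replace with your own assessment.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    asset: str
    disaster: str
    likelihood: float  # assumed yearly probability of the incident, 0.0 to 1.0
    impact: float      # assumed cost of a single incident, in dollars

    @property
    def expected_loss(self) -> float:
        # Likelihood weighted by impact: a rough yearly cost of the risk.
        return self.likelihood * self.impact


def prioritize(risks: list[Risk]) -> list[Risk]:
    """Return risks ordered from highest to lowest expected loss."""
    return sorted(risks, key=lambda r: r.expected_loss, reverse=True)


# Illustrative figures only; none of these come from a real assessment.
risks = [
    Risk("primary data center", "earthquake", 0.01, 2_000_000),
    Risk("order database", "hardware failure", 0.30, 400_000),
    Risk("internal wiki", "data corruption", 0.20, 10_000),
]

for risk in prioritize(risks):
    print(f"{risk.asset}: {risk.disaster} -> ${risk.expected_loss:,.0f} per year")
```

In this made-up example, the frequent hardware failure outranks the rare but costly earthquake, which is the kind of trade-off the ranking is meant to surface.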
4. Determining downtime
Once you know what your critical assets are, you can clearly see how quickly you'll need to get them back up and running. All DR plans should include a recovery time objective (RTO), which defines the maximum amount of time that a system, network, or application can be offline after a disaster occurs before serious harm results. Depending on the system in question, this can be anything from a few seconds to a day or more.
A related, but different, concept is recovery point objective, or RPO, which defines the maximum amount of data, measured as a span of time, that the business can afford to lose. For example, if a system has an RPO of 30 minutes, it’ll need to be backed up at least every 30 minutes, so that restoring from the latest backup never loses more than half an hour of data. Knowing both your RPO and RTO is essential in formulating an effective recovery plan.
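To make the two objectives concrete, here is a small sketch of how a backup schedule and a measured recovery time might be checked against them. It assumes the worst case for data loss is one full backup interval (a failure just before the next backup runs). The function names and the target values are illustrative assumptions, not part of any standard tool.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst case, a failure hits just before the next backup runs, so the
    data lost equals one full backup interval."""
    return backup_interval <= rpo


def meets_rto(measured_recovery: timedelta, rto: timedelta) -> bool:
    """Compare a recovery time measured in a drill against the target."""
    return measured_recovery <= rto


# Hypothetical targets for a single system.
rpo = timedelta(minutes=30)
rto = timedelta(hours=4)

print(meets_rpo(timedelta(hours=1), rpo))     # False: hourly backups are too sparse
print(meets_rpo(timedelta(minutes=15), rpo))  # True
print(meets_rto(timedelta(hours=3), rto))     # True
```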
5. Setting out responsibilities
A DR strategy must also set out who’ll be responsible for taking control of any recovery procedure, and what everyone else's role will be in the disaster recovery team. When speed is of the essence, firms cannot afford to waste time hunting down contact details or ringing around various professionals before they reach the person with the right skills and experience to deal with the problem.
This must extend beyond internal personnel. For example, if there are third-party suppliers such as cloud computing providers or backup specialists that need to be involved, there should be clear information about who’s responsible for contacting these people and what the procedure will be. Having these responsibilities and details set out clearly in the plan is essential in ensuring swift resolutions to whatever the issue is.
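As a rough illustration, the contact details and call order could be kept in a simple structured form so nobody has to hunt for them during an incident. Everything in this sketch is a hypothetical placeholder: the Contact class, the ROSTER layout, the call_list function, and all names, roles, and phone numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    name: str
    role: str
    phone: str
    external: bool = False  # True for third-party suppliers


# Hypothetical roster: every name, role, and number here is a placeholder.
ROSTER: dict[str, list[Contact]] = {
    "default": [Contact("A. Lead", "DR coordinator", "555-0100")],
    "data loss": [
        Contact("A. Lead", "DR coordinator", "555-0100"),
        Contact("B. Admin", "Database administrator", "555-0101"),
        Contact("Backup vendor support", "Backup specialist", "555-0102", external=True),
    ],
}


def call_list(incident_type: str) -> list[Contact]:
    """Return contacts in call order, falling back to the default list."""
    return ROSTER.get(incident_type, ROSTER["default"])


for contact in call_list("data loss"):
    tag = "external" if contact.external else "internal"
    print(f"{contact.role}: {contact.name} ({tag}) {contact.phone}")
```

Falling back to a default list means an incident type nobody anticipated still reaches the DR coordinator.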
6. Testing, testing, testing
Perhaps the most important part of any DR strategy is the testing phase. This is essential, as you don't want to be in the middle of a crisis, only to find your carefully designed plan has flaws that make it impossible to implement. Therefore, it's vital that every part of the process is rigorously tested and evaluated before it’s put into action.
Just like having regular fire drills, this is something that needs to be done on a frequent basis. There are many ways to do this, from paper-based walkthroughs to full simulations, but however you do it, the results must be scrutinized closely for any weaknesses or delays, which can be used to improve the strategy, as well as eliminate any errors.
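One simple way to scrutinize drill results for delays is to compare how long each step took against how long the plan said it should take. The sketch below assumes you record both figures per step; the DrillStep class, the find_delays function, and the step names and timings are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class DrillStep:
    name: str
    planned_minutes: float
    actual_minutes: float

    @property
    def overrun(self) -> float:
        # Positive when the step took longer than planned.
        return self.actual_minutes - self.planned_minutes


def find_delays(steps: list[DrillStep]) -> list[DrillStep]:
    """Return steps that ran over plan, worst overrun first."""
    late = [s for s in steps if s.overrun > 0]
    return sorted(late, key=lambda s: s.overrun, reverse=True)


# Made-up results from a single simulated drill.
drill = [
    DrillStep("Declare incident and notify team", 10, 8),
    DrillStep("Restore database from backup", 60, 95),
    DrillStep("Repoint application to standby", 20, 30),
]

for step in find_delays(drill):
    print(f"{step.name}: {step.overrun:.0f} min over plan")
```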
7. Refining, improving and documenting
An IT system is a constantly evolving environment, so one of the key things testing will highlight is whether any of your processes have become outdated or are no longer relevant. Similarly, if new applications or hardware solutions are being brought into the business or retired, businesses must ensure that these have been fully factored into their DR strategy.
Therefore, your DR plan must also constantly evolve. At the same time, it's essential that all of the updates you make are fully documented and communicated to everyone involved in implementing the plan. If the worst should happen, your employees will need to know exactly where they have to look to get access to the DR plan, and be confident that the documents they're referring to are fully up to date.