In the previous years Business Continuity and Disaster Recovery have been big buzz words. All companies small and large vowed to launch initiatives to implement either or both in their current IT strategies. My question is what happened? Why is it that I rarely see organizations that have implemented or even have a plan to implement Disaster Recovery?
Is it a lack of understanding? Is it that most companies believe it is to expensive or complicated to implement? Well it doesn’t have to be either. Most companies that are undergoing virtualization initiatives already have half if not more of what they need to implement Disaster Recovery. The simple fact is if you already have at least two data centers and are virtualizing you are a prime candidate. Here are some common question and my answers regarding this subject:
1.) Do I need to utilize SAN replication to implement Disaster Recovery in a virtualized environment?
No! There are other option to achieve Disaster Recovery without SAN replication. If you are running VMware you can utilize some of what you already have. VMware VCB in conjunction with VMware converter can be used to implement Disaster Recovery. Now this wouldn’t be as elegant as doing SAN replication but you could implement scheduled V2V’s of your Virtual Machines from one site to another and it’s a very simple solution to implement.
What about the hardware right….where do we get the additional hardware? The answer is simple reuse what you already have. Take those old servers you just freed up and put them to some good use. Beef them up! Need more ram in them tear ram out of some and add it to other, do the same with CPU’s to make a number of more power servers that you can use for DR. Granted you may need more of the reused servers to host all the vm’s needed but at the end of the day you would have a disaster recovery plan.
2.) What if I can’t do SAN replication but want synchronous and asynchronous replication?
This can still be achieved using software based replication in your virtual machines. Software like NSI Doubletake and Replistor provide this functionality at a a relatively low cost. With virtualization you can cut cost even more. With physical servers you traditionally needed to have a 1 to 1 mapping for replication which required a license for each host. With virtualization you can take a many to one approace cutting down on the licenses you need to replicate your data.
With this approach I would still use VCB or VMware converter to make weekly copies of your virtual machine OS drives. You can then utilize one of the mentioned applications (Doubletake or Replisor) to synchronous replication of your data volumes. You can achieve this and save licenses by installing say Doubletake on each of the source systems. The you would create a virtual machine at the DR site and add a drive to it for each of the source systems data volumes and replicate each source data to a different data volume on the destination vm. If you ever need to fail over just dismount the volumes from the destination vm and attach each one it’s respective vm that was created through the use of VCB or VMware converter.
3.) These methods are great but what would it take to bring an environment back up using them?
That’s rather hard to say because it depends on the size of your environment and how many vm’s you are relocating to your DR site. If your environment is large and you have specific SLA’s to adhere to regarding RTO (Recovery Time Objective’s) and RPO (Recovery Point Objectives) then you should consider SAN to SAN replication and utilizing something like VMware SRM which does an outstanding job of handling this. VMware SRM also allows you to run disaster recovery simulations to determine the effectiveness of your DR strategy that allows you to determine if you are meeting your SLA.
If you are doing DR on the cheap the real answer is to this question is you will be able to recover your systems a heck of a lot quicker than if had to restore via backups of rebuild your systems.
4.) This is great but where do we begin?
Don’t know where to begin, the answer is easy. Start small and grow into it. Find at least 2 servers that you can reuse beef’em up determine a configuration for them and deploy ESX to the servers. You need to have some infrastructure in place at your DR location to make DR work so that is a good place to start. You need to add the following service at your DR location:
- Active Directory Servers
- DNS Servers
- NTP Servers
- Virtual Center Server
It may be required to to deploy additional servers for your specific environment but I think you get the idea.
Next pick a few development machines or test machines that you can replicate to the DR site. Develop a plan and schedule down time and perform a test fail over to the remote site. Once you have work out the kinks and have a written DR plan determine your first phase of servers to incorporate into your DR site. Generally at this point you would want to pick some of your most valuable servers to ensure they are protected.
You can then break all your servers that need to be replicated into phases and determine the host requirements at the DR site and develop a plan for each phase of your DR implementation. It would be a good idea to have a remote replication vm for every 20 or so source vm’s. This really would depend on the data chance rate of your servers but 20 is a good starting point.
This article is obviously not all inclusive and is very high level but hopefully it inspires some of you to start developing a DR strategy and at least start testing some of these solutions in your environment because data is a terrible thing to waste.