DailyHypervisor Forums are online.

We have just launched our DailyHypervisor Forum, located at http://www.dailyhypervisor.com/forum. Stop by, contribute, and be a part of our community. The DH Forum is intended for all things cloud. Currently we have forums for vCAC, vCD, vCO, Cloud General, and OpenStack. More forum categories will be added based on demand. If you have a category you would like to see, shoot us a note and let us know.

Our goal is to create a common place where anyone can come to learn, get help, share ideas, or do just about anything else that helps foster knowledge about cloud computing. Since this very blog post is the announcement of our forum, you can imagine there isn’t a whole lot happening yet. So what are you waiting for? Be the first. Go ask a question, post an issue, or share a thought, and let’s get things rolling.

Keep it simple stupid – registering unregistered vm's

Last week my boss came to me and asked if I could write a script for a customer to register VMs after they had been replicated from one VI environment to another. I agreed to take on the project and go for it.

Like everything I do these days, I decided to use PowerShell to write the script. I have taken a liking to it, and the fact that I can run the scripts on both ESX and ESXi hosts saves me from having to re-create scripts all the time. So I plugged away until 3am, wrote the script, and tested it inside out and sideways in my lab. Confident in the script’s ability to register all VMs from all datastores, I went ahead and sent it off to the customer.

A few days later I was on a conference call with the customer. They were having problems with the script: it wasn’t registering all the VMs. After a few hours of troubleshooting I realized that I needed to go back and try to recreate the problems in my lab to fix the script, but the customer didn’t have that kind of time.

A short while after getting off the call with the customer, I received an email from them saying not to worry, they had gotten a shell script that worked. Then I started to think... I went into my lab and created a shell script that would do the job. The shell script was 5 lines long, as opposed to the PowerShell script, which is about 40 lines.

The shell script if anyone needs it looks like this:

for v in `find /vmfs/volumes/ -name "*.vmx"`
do
echo "Registering $v" >> /log/registeredvms.log
vmware-cmd -s register "$v"
done

So the short of the story is that sometimes it is best to keep it simple, stupid. Using PowerShell for this problem was overkill, and in the end there were issues that were overlooked and that I still can’t reproduce in my lab. A simple shell script was all that was required, and it is what I should have decided on originally.

So in the end this is a lesson learned and hopefully it will prevent someone else from making the same mistake.

Business Continuity and Disaster Recovery with Virtualization

In previous years Business Continuity and Disaster Recovery have been big buzzwords. Companies small and large vowed to launch initiatives to implement either or both in their IT strategies. My question is, what happened? Why is it that I rarely see organizations that have implemented, or even have a plan to implement, Disaster Recovery?

Is it a lack of understanding? Is it that most companies believe it is too expensive or complicated to implement? Well, it doesn’t have to be either. Most companies that are undergoing virtualization initiatives already have half, if not more, of what they need to implement Disaster Recovery. The simple fact is that if you already have at least two data centers and are virtualizing, you are a prime candidate. Here are some common questions and my answers on this subject:

1.) Do I need to utilize SAN replication to implement Disaster Recovery in a virtualized environment?

No! There are other options for achieving Disaster Recovery without SAN replication. If you are running VMware you can use some of what you already have. VMware VCB in conjunction with VMware Converter can be used to implement Disaster Recovery. This wouldn’t be as elegant as SAN replication, but you could implement scheduled V2Vs of your virtual machines from one site to another, and it’s a very simple solution to implement.

What about the hardware, right? Where do we get the additional hardware? The answer is simple: reuse what you already have. Take those old servers you just freed up and put them to good use. Beef them up! Need more RAM in them? Tear RAM out of some and add it to others, and do the same with CPUs, to make a number of more powerful servers that you can use for DR. Granted, you may need more of the reused servers to host all the VMs needed, but at the end of the day you would have a disaster recovery plan.

2.) What if I can’t do SAN replication but want synchronous and asynchronous replication?

This can still be achieved using software-based replication in your virtual machines. Software like NSI Doubletake and Replistor provides this functionality at a relatively low cost. With virtualization you can cut costs even more. With physical servers you traditionally needed a 1-to-1 mapping for replication, which required a license for each host. With virtualization you can take a many-to-one approach, cutting down on the licenses you need to replicate your data.

With this approach I would still use VCB or VMware Converter to make weekly copies of your virtual machine OS drives. You can then use one of the applications mentioned (Doubletake or Replistor) to synchronously replicate your data volumes. You can achieve this and save licenses by installing, say, Doubletake on each of the source systems. Then you would create a virtual machine at the DR site, add a drive to it for each source system’s data volume, and replicate each source’s data to a different data volume on the destination VM. If you ever need to fail over, just dismount the volumes from the destination VM and attach each one to its respective VM that was created with VCB or VMware Converter.

3.) These methods are great but what would it take to bring an environment back up using them?

That’s rather hard to say, because it depends on the size of your environment and how many VMs you are relocating to your DR site. If your environment is large and you have specific SLAs to adhere to regarding RTO (Recovery Time Objective) and RPO (Recovery Point Objective), then you should consider SAN-to-SAN replication and something like VMware SRM, which does an outstanding job of handling this. VMware SRM also allows you to run disaster recovery simulations to gauge the effectiveness of your DR strategy and determine whether you are meeting your SLA.

If you are doing DR on the cheap, the real answer to this question is that you will be able to recover your systems a heck of a lot quicker than if you had to restore from backups or rebuild your systems.

4.) This is great but where do we begin?

Don’t know where to begin? The answer is easy: start small and grow into it. Find at least 2 servers that you can reuse, beef ’em up, determine a configuration for them, and deploy ESX to them. You need to have some infrastructure in place at your DR location to make DR work, so that is a good place to start. You need to add the following services at your DR location:

  • Active Directory Servers
  • DNS Servers
  • NTP Servers
  • Virtual Center Server

You may need to deploy additional servers for your specific environment, but I think you get the idea.

Next, pick a few development or test machines that you can replicate to the DR site. Develop a plan, schedule downtime, and perform a test failover to the remote site. Once you have worked out the kinks and have a written DR plan, determine your first phase of servers to incorporate into your DR site. Generally at this point you would want to pick some of your most valuable servers to ensure they are protected.

You can then break all the servers that need to be replicated into phases, determine the host requirements at the DR site, and develop a plan for each phase of your DR implementation. It would be a good idea to have one remote replication VM for every 20 or so source VMs. This really depends on the data change rate of your servers, but 20 is a good starting point.
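The "one replication VM per 20 or so source VMs" rule of thumb can be turned into a quick per-phase estimate. This is only a sketch: the function name, the phase names, and the VM counts are made up for illustration, and the ratio of 20 is the starting point suggested above, which you would tune for your own data change rate.

```python
import math


def replication_vms_needed(source_vm_count, sources_per_replication_vm=20):
    """Estimate how many replication VMs a DR phase needs.

    Uses the rule of thumb of one replication VM per ~20 source VMs.
    The ratio is a starting point, not a hard limit.
    """
    if source_vm_count < 0:
        raise ValueError("source_vm_count must not be negative")
    return math.ceil(source_vm_count / sources_per_replication_vm)


# Hypothetical phases and VM counts, for illustration only.
phases = {"phase 1": 12, "phase 2": 45, "phase 3": 80}

for name, count in phases.items():
    print(name, "->", replication_vms_needed(count), "replication VM(s)")
```

Rounding up means a phase with even one source VM still gets its own replication VM, which matches the idea of planning host requirements at the DR site phase by phase.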

This article is obviously not all inclusive and is very high level but hopefully it inspires some of you to start developing a DR strategy and at least start testing some of these solutions in your environment because data is a terrible thing to waste.

ESX Datastore sizing and allocation

I have been seeing a lot of activity in the VMTN forums regarding datastore sizing and free space. That said, I decided to write a post about this topic. There are endless possibilities when it comes to datastore sizing and configurations, but I’m going to focus on a few key points that should be considered when structuring your ESX datastores.

All VM files kept together

In this configuration all VM files are kept together on one datastore.  This includes the vmdk file for each drive allocated to the VM, the vmx file, log files, the nvram file, and the vswap file.  When storing virtual machines this way there are some key considerations that need to be taken into account.  You should always allow for 20% overhead on your datastores to allow enough space for snapshots and vmdk growth if necessary.   When allocating for this overhead you have to realize that when a VM is powered on a vswap file is created for the virtual machine equal in size to the VM’s memory.  This has to be accounted for when allocating your 20% overhead.

For Fibre Channel and iSCSI SANs you should also limit the number of VMs per datastore to no more than 16. With these types of datastores, file locking and SCSI reservations create extra overhead. Limiting the number of VMs to 16 or fewer reduces the risk of contention on the datastore. So how big should you make your datastores? That’s a good question, and it will vary from environment to environment. I always recommend 500GB as a good starting point. This is not a number that works for everyone, but I use it because it helps limit the number of VMs per datastore.

Consider the following: your standard VM template consists of two drives, an OS drive and a data drive. Your OS drive is standardized at 25GB and your data drive starts at 20GB by default, with larger drives when needed. Your standard template also allocates 2GB of memory to your VM. Anticipating a max of 16 VMs per datastore, I would allocate as follows:

Total VM disk space = (OS drive + data drive) * 16
VM disk and vswap = total VM disk space + (memory * 16)
Total VM space needed = VM disk and vswap + (16 * 100MB of log files)
Datastore size = total VM space needed + 20% overhead

(25GB + 20GB) * 16 = 720GB
720GB + (2GB * 16 = 32GB) = 752GB
752GB + (16 * 100MB = 1.6GB) = 753.6GB
753.6GB * 1.2 = 904.32GB, rounded up to 910GB needed

Depending on how you carve up your storage, you may want to bump this to 960GB or 1024GB, so as you can see the 500GB rule was proven wrong for this scenario. The point is that you should standardize your OS and data partition sizes so you can properly estimate a standardized datastore size. This will never be perfect, as there will always be VMs that are anomalies.
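If you want to try the same arithmetic with your own template sizes, here is a minimal Python sketch of the calculation. The function and parameter names are my own, 100MB of logs per VM is counted as 0.1GB to match the 1.6GB figure in the example, and the 20% overhead is applied by multiplying the total by 1.2.

```python
def datastore_size_gb(os_gb, data_gb, mem_gb, vm_count=16,
                      log_mb_per_vm=100, overhead=0.20):
    """Estimate the datastore size in GB for a standard VM template."""
    disk = (os_gb + data_gb) * vm_count      # vmdk space for all VMs
    swap = mem_gb * vm_count                 # one vswap file per powered-on VM
    logs = vm_count * log_mb_per_vm / 1000   # 1600MB is treated as 1.6GB
    return (disk + swap + logs) * (1 + overhead)


# The template from the example: 25GB OS drive, 20GB data drive, 2GB memory.
size = datastore_size_gb(25, 20, 2)
print(f"{size:.2f}GB")  # prints 904.32GB
```

Changing `vm_count` or the drive sizes shows quickly how far a standard template pushes you past a 500GB starting point.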

Keep in mind that if you fill your datastore and don’t leave room for the vswap file that is created when a VM powers on, you will not be able to power on the VM. Also, if a snapshot grows to fill a datastore, the VM will crash, and your only option for committing the snapshot will be to add an extent to the datastore, because you need free space to commit the changes. Extents are not recommended and should be avoided as much as possible.

Separate VM vswap files

There are a number of options available in Virtual Infrastructure on how to handle the VM’s vswap file.  You can set the location of this file at the vm, the ESX Server, or the cluster.  You can choose to locate it on a local datastore or one or more shared datastores. Below are some examples:

Assign a local datastore per ESX server for all VM’s running on that server.

This option allows you to utilize a local vmfs datastore to store the VM’s vswap saving valuable disk space.  When using a local datastore I recommend allocating enough storage for all the available memory in the host + 25% for memory over subscription.

Create one shared datastore per ESX cluster.

In this option you can set one datastore at the cluster level for all vswap files.  This allows you to create one large datastore and set the configuration option once and never worry about it again.  Again I would allocate enough space for the total amount of memory for the whole cluster +25% for over subscription.
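The "total memory + 25%" rule works the same way for a single host’s local datastore and for a cluster-wide shared datastore. Here is a rough sketch; the function name and the host memory sizes are hypothetical, and 25% is the over subscription allowance recommended above.

```python
def vswap_datastore_gb(host_memory_gb, oversubscription=0.25):
    """Size a vswap datastore: total host memory plus an
    allowance for memory over subscription."""
    total_memory = sum(host_memory_gb)
    return total_memory * (1 + oversubscription)


# Hypothetical hardware: hosts with 64GB of memory each.
print(vswap_datastore_gb([64]))              # local datastore for one host
print(vswap_datastore_gb([64, 64, 64, 64]))  # shared datastore for a 4-host cluster
```

Passing a list of hosts keeps the same function usable whether you set the vswap location per ESX server or once at the cluster level.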

Multiple shared datastores in a cluster.

In this option you have different scenarios. You can have one shared datastore per ESX host in the cluster, one datastore for every two servers in the cluster, and so on. You would need to assign the vswap datastore at the ESX host level for this configuration.

Note: When moving the vswap to a separate location it can impact the performance of vmotion.  It could extend the amount of time it takes for the vm to fully migrate from one host to another.

Hybrid Configuration.

Just as it’s possible to locate the vswap on another datastore it is also possible to split the vmdk disks on to separate datastores.  For instance you could have datastores for:

  • OS Drives
  • Data Drives
  • Page Files
  • vSwap files

To achieve this you would tell the VM where to create each drive and have different datastores allocated for these different purposes. This is especially handy when planning to implement DR. It allows you to replicate only the data you want and skip the data you don’t, such as the vswap and page files. With this configuration you can also have different replication strategies for the data drives and OS drives.
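One way to picture the hybrid layout is as a simple table of datastores and what you do with each for DR. The sketch below is purely illustrative: the datastore names and the replication labels are invented, not taken from any VMware tool or setting.

```python
# Hypothetical datastore layout; names and strategy labels are illustrative only.
DATASTORES = {
    "ds_os": {"holds": "OS drives", "replication": "scheduled copy"},
    "ds_data": {"holds": "Data drives", "replication": "continuous"},
    "ds_pagefile": {"holds": "Page files", "replication": None},
    "ds_vswap": {"holds": "vSwap files", "replication": None},
}


def datastores_to_replicate(layout):
    """Return the names of datastores that have a replication strategy."""
    return sorted(name for name, cfg in layout.items() if cfg["replication"])


print(datastores_to_replicate(DATASTORES))  # prints ['ds_data', 'ds_os']
```

Keeping page files and vswap on their own datastores is what makes it possible to leave them out of replication entirely.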

Hope you found this post useful.