Saturday, June 30, 2007

The vmHA Project

Welcome to my blog for the vmHA project. The purpose of this project is to create a Linux-based platform for hosting virtual machines that provides high availability services for those VMs. Right now I'm basing the project on VMware Server (formerly GSX Server). I'm most familiar with the VMware Server VM format and its tools, so I'm going to start with that and then hopefully expand to Xen and some of the other virtualization products. I'm building this as an alternative to expensive enterprise products like VMware's ESX Server. Don't get me wrong - I'm a HUGE ESX fan, but not everyone can afford ESX, and I'd like to offer those people an alternative that is still very easy to manage and still provides high availability for virtual machines.

Among the things I'll be attempting to implement are a shared/cluster filesystem for storing the VMs, a heartbeat/Linux-HA setup so that VMs running on a failed server are automatically restarted on another server in the cluster, an easy-to-use web-based management interface, and quick moving of a VM from one server to another via suspend/resume of the VM.
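To make the restart-on-failure idea a bit more concrete, here is a rough sketch of how the cluster might decide where to restart VMs when a server drops out. Nothing here is actual project code: the function name plan_failover, the VM and host names, and the "least-loaded surviving host" rule are all assumptions made up for illustration.

```python
from typing import Dict, Set


def plan_failover(placement: Dict[str, str], live_hosts: Set[str]) -> Dict[str, str]:
    """Return a new VM -> host mapping with VMs from dead hosts moved to live ones.

    Hypothetical helper: the placement rule (least-loaded surviving host)
    is an assumption, not something the project has settled on.
    """
    if not live_hosts:
        raise ValueError("no surviving hosts to restart VMs on")

    # Count how many VMs each surviving host already runs.
    load = {host: 0 for host in live_hosts}
    for host in placement.values():
        if host in load:
            load[host] += 1

    new_placement = {}
    for vm in sorted(placement):
        host = placement[vm]
        if host not in live_hosts:
            # Pick the least-loaded survivor; ties are broken by host name
            # so the result is deterministic.
            host = min(load, key=lambda h: (load[h], h))
            load[host] += 1
        new_placement[vm] = host
    return new_placement


if __name__ == "__main__":
    before = {"web": "node1", "db": "node1", "mail": "node2"}
    print(plan_failover(before, live_hosts={"node2", "node3"}))
```

Because the VMs live on the shared filesystem, any surviving host can in principle start any VM, which is what makes a simple rule like this possible at all.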

For the base platform I've chosen rPath Linux, mainly due to the ease and speed of updates, rollback capabilities, ease of package management and building new packages, and the rPath appliance agent for managing the server. I've already built quite a few packages for this distribution that are missing from the main repositories.

I've decided on OCFS2 as the cluster filesystem that I'll support in this project, at least initially. OCFS2 was one of two candidates, GFS2 being the other. I shortlisted these two because of their ease of configuration. There are other cluster filesystems out there, but many of them are very convoluted to configure, which is a poor fit when ease of configuration is one of my goals. I eventually chose OCFS2 because GFS2 is currently unstable in the Linux 2.6 kernels that are considered stable and are supported by VMware Server. This is unfortunate because I like the fact that GFS2 supports ACLs and extended attributes, but for now OCFS2 is going to have to work.

The heartbeat setup will allow the operator to choose the method(s) of heartbeat communication - serial port, dedicated ethernet port, or unicast over an existing ethernet port. Obviously the operator must provide the right type of cable. When users create VMs via the web interface they will be given the option of making the VM an "HA" VM, in which case a heartbeat configuration will be created for that VM. That configuration will allow heartbeat to periodically monitor the status of the VM and its host machine, and to start the VM on a different machine if either one fails.
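As a rough illustration of the "choose your heartbeat method" idea, here is a sketch of how the management interface might turn the operator's choices into an ha.cf fragment for Linux-HA heartbeat. The directives it writes (keepalive, deadtime, baud, serial, bcast, ucast, node) are standard ha.cf directives, but everything else is an assumption on my part: the function name render_ha_cf, its parameters, the default timings, the baud rate, and the use of bcast for the dedicated ethernet port.

```python
from typing import List, Optional


def render_ha_cf(
    nodes: List[str],
    serial_device: Optional[str] = None,   # e.g. "/dev/ttyS0"
    bcast_iface: Optional[str] = None,     # dedicated ethernet port (assumed bcast)
    ucast_iface: Optional[str] = None,     # existing ethernet port
    ucast_peer: Optional[str] = None,      # peer IP address for unicast
    keepalive: int = 2,                    # seconds between heartbeats (assumed default)
    deadtime: int = 30,                    # seconds before a node is declared dead (assumed default)
) -> str:
    """Build an ha.cf fragment from the operator's chosen heartbeat media."""
    media = []
    if serial_device:
        media.append("baud 19200")  # assumed rate; must match on both ends
        media.append(f"serial {serial_device}")
    if bcast_iface:
        media.append(f"bcast {bcast_iface}")
    if ucast_iface and ucast_peer:
        media.append(f"ucast {ucast_iface} {ucast_peer}")
    if not media:
        raise ValueError("at least one heartbeat method must be selected")

    lines = [f"keepalive {keepalive}", f"deadtime {deadtime}"]
    lines.extend(media)
    lines.extend(f"node {name}" for name in nodes)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_ha_cf(["vmhost1", "vmhost2"],
                       serial_device="/dev/ttyS0",
                       ucast_iface="eth0", ucast_peer="192.168.1.2"))
```

Selecting more than one method simply produces more than one media line, so the cluster keeps a heartbeat path if a single cable or port fails.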

Well, that's about all I can think of for the initial post. I'm sure I'll think of other things later, and I'll make some additional posts. If you'd like to see what I've done so far on the project, visit the rBuilder project page at http://www.rpath.org/rbuilder/project/vmware-ha/.