To test failover, I configured LifeKeeper on the primary server to fail over upon shutdown, then shut down the server. In less than a minute, my file share was again accessible through the IP address I had assigned to the virtual server name I had created and defined in DNS. The share took about 5 minutes to become available through the DNS resource (the virtual host name that LifeKeeper altered to point to the active server) because my client’s cached copy of the DNS record hadn’t yet expired, even though LifeKeeper had already updated the DNS server with the new IP address. After bringing the primary server back up, I failed the resources back by bringing the top-level resource back in service on the primary server. Again, the IP resource was accessible within about a minute, and the DNS resource a few minutes later.
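The lag between the IP resource and the DNS resource can be illustrated with a toy model of a client-side DNS cache. This is only a sketch: the class and function names are hypothetical, the 300-second time-to-live is an assumption chosen to match the roughly 5-minute delay I saw, and none of it is LifeKeeper code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class CachedRecord:
    address: str
    expires_at: float  # clock value at which the cached answer becomes stale


class TtlResolverCache:
    """Toy client-side DNS cache (hypothetical; for illustration only)."""

    def __init__(
        self,
        lookup: Callable[[str], Tuple[str, float]],
        clock: Callable[[], float],
    ) -> None:
        self._lookup = lookup  # returns (address, ttl_seconds) from the DNS server
        self._clock = clock    # returns the current time in seconds
        self._cache: Dict[str, CachedRecord] = {}

    def resolve(self, name: str) -> str:
        now = self._clock()
        record = self._cache.get(name)
        if record is not None and now < record.expires_at:
            # Cached answer has not expired, so the server is not asked again.
            # After a failover this answer may still point at the old server.
            return record.address
        address, ttl = self._lookup(name)
        self._cache[name] = CachedRecord(address, now + ttl)
        return address
```

In this model, a client that resolved the virtual host name shortly before the failover keeps getting the old address until its cached record expires, even though the DNS server already holds the new one. Connecting by IP address bypasses the cache entirely, which is why that path recovered first.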
Although I had defined the resources and tested failover successfully, the LifeKeeper GUI showed both the primary and standby servers in a “warning” state. Through trial and error, later confirmed in the documentation, I found that LifeKeeper expects you to define more than one heartbeat communication path. The warning icon changed to the OK icon after I defined an additional heartbeat path on the primary IP network.
Implementing and testing the Microsoft IIS Recovery Kit was similarly easy, although you must accommodate the kit’s prerequisites. First, the kit supports failover only to standby servers on the same logical LAN segment, because it relies on the IP Recovery Kit to move the Web site’s IP address to the standby server.
Second, the IIS Recovery Kit currently supports IIS 5.0 and 6.0. For my test, I configured two virtual servers with a disk volume shared on a common SCSI bus and placed the Web site files on this volume. I selected a free IP address for the Web site, defined a virtual host name for that address in DNS, and configured IIS for the Web site on both servers to use that IP address. Using the LifeKeeper administrative GUI, I created the IP and IIS resources and made them dependent upon the volume resource. LifeKeeper added the switchable IP address to the primary server’s IP configuration and showed the Web site as “protected.”
I tested failover in several ways—by simulating power-off for the primary server, by bringing the Web site resource “in service” on the standby server, and by configuring the primary server to fail over upon shutdown. In all cases, the switchover proceeded smoothly, making the Web site accessible on the new server in only a minute or two.
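If you want to time a switchover yourself rather than watch a clock, a small polling script is enough. The following is a minimal sketch, not part of LifeKeeper: the function names, the 5-second polling interval, and the 10-minute deadline are all my own assumptions.

```python
import socket
import time
from typing import Callable, Optional


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def seconds_until_up(
    probe: Callable[[], bool],
    deadline: float = 600.0,
    interval: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll probe() until it returns True; return the elapsed seconds.

    Returns None if the deadline passes without a successful probe.
    """
    start = clock()
    while True:
        if probe():
            return clock() - start
        if clock() - start >= deadline:
            return None
        sleep(interval)
```

Started just after triggering the failover, a call such as `seconds_until_up(lambda: tcp_probe("192.0.2.10", 80))` reports roughly how long the Web site was unreachable, to within one polling interval. The address shown is a placeholder; substitute the switchable IP address, or the virtual host name if you also want to include the DNS delay.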
LifeKeeper’s documentation is informative and useful. The Planning and Installation Guide thoroughly describes how to install, uninstall, configure, and troubleshoot. It also includes a nice introduction to the online documentation, where more detailed configuration and administration procedures are documented. The documentation for SteelEye Data Replication is similarly well done, though I had little need to reference it. From the perspective of the Protection Suite, the PDF documentation lacks integration, and only the online documentation based on the .chm file seems to reflect the combined feature set.
Worthy of Your Short List
Overall, I found that LPSW worked well and was easy to implement. However, although the documentation for the underlying software components and recovery kits was well organized and easy to follow, it lacked the level of integration you would expect, considering the single-product image that SteelEye is marketing. The wizards made creation of resources and dependencies easy, though you really need to use the documentation to see which resources and dependencies you need to create to protect your application. I liked LPSW’s support for both replicated and shared storage, its ease of configuration for both scenarios, and its support for more complex failover scenarios involving multiple local and remote servers. If you’re looking for an easy-to-implement high-availability solution, I recommend that you put LPSW on your short list.