Clustering and Exchange 2000
How does Exchange 2000 use clustering? The basic unit of failover for Exchange is the storage group (SG). When a node fails, all its SGs fail over to another node. This failover mechanism represents an interesting change from Exchange 5.5 clustering, in which the failover unit is a service, such as the Information Store (IS) or Message Transfer Agent (MTA). Exchange 2000's use of an SG as the failover unit simplifies the failover process: The IS on the receiving node simply needs to mount the SG and its databases, instead of requiring you to start the store and wait for all the logs to play back.
Two DLLs form the basis of Exchange 2000's cluster support. Excluadm.dll ties Exchange to the Windows cluster manager, and exres.dll ties the Exchange services and resources to the cluster service's resource manager. Of course, much more is going on beneath the surface. Each cluster-ready Exchange 2000 component must use the proper APIs and cluster interfaces. Notice my use of the term "cluster-ready." Not all Exchange 2000 services can benefit from clustering. The System Attendant, IS, Routing service, and SMTP service are all cluster-ready. The MTA is cluster-ready but only in active/passive mode; if the MTA fails on one node, you must restart it from scratch on the other node. Services that you can't cluster include the Network News Transfer Protocol (NNTP) server, the Instant Messaging service, the Active Directory Connector (ADC), the chat service, and the Key Management Service (KMS). When you're designing your clustering strategy, keep in mind that a failover might still leave you with lost capacity.
Practical Considerations
The most obvious benefit of clustering is that it can provide better service by minimizing the effect of failures. Users connect to a virtual server, using either a Messaging API (MAPI) profile or an Internet protocol client. When the underlying physical server goes offline, the client reconnects to the virtual server, now running on another box, and keeps working.
A second, less obvious benefit of clustering is that it lets you perform maintenance whenever you want. Consider the process of installing an Exchange service pack or updating your antivirus software. Without clustering, you must take down a production server, which means you need to perform the upgrade on Christmas Day (or another day when users won't scream about inaccessible email) or hurry through the process and hope that nothing goes wrong. By using Exchange clustering, you simply fail the Exchange virtual server over to another node and go about your business. Users continue to work as usual. After you finish your maintenance, you fail the virtual server back to its original hardware.
Clustering has a few limitations that you need to be aware of as you plan. Originally, Microsoft didn't specify any firm limits for the number of concurrent users that active/active clusters can support, so administrators tried stuffing as many users onto a server as they could fit. If you have a two-node cluster, each node of which can handle 2000 concurrent users, one server will end up with 4000 concurrent users when a failover occurs, which is not a recipe for continued Exchange server availability. To help solve this problem (which is exacerbated by some internal Exchange architecture considerations), Microsoft now recommends that you use N+1 clusters (i.e., active/passive clusters for two-node setups) with a maximum of 1500 concurrent users per node. Using an N+1 design will prepare you for future releases of Exchange components that might include improved clustering features.
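The failover arithmetic above can be sketched in a few lines of Python. This is only an illustrative model, not anything Exchange or the cluster service exposes: the function name load_after_failover, the standby flag, and the assumption that a failed node's entire load lands on a single target node are all hypothetical choices made for the example. The 2000- and 1500-user figures come from the text.

```python
RECOMMENDED_MAX = 1500  # per-node concurrent-user figure quoted in the article


def load_after_failover(node_users, failed, standby=False):
    """Return per-node concurrent users after node `failed` goes down.

    node_users: concurrent users on each active node before the failure.
    failed:     index of the node that fails.
    standby:    True if an idle (passive) node is available to take over.

    Assumption (hypothetical model): the failed node's whole load moves
    to exactly one target node.
    """
    survivors = [u for i, u in enumerate(node_users) if i != failed]
    moved = node_users[failed]
    if standby:
        # N+1 / active/passive: the idle node picks up the failed node's load.
        return survivors + [moved]
    if not survivors:
        raise ValueError("no node left to fail over to")
    # Active/active: the least-loaded surviving node absorbs the load.
    target = survivors.index(min(survivors))
    survivors[target] += moved
    return survivors


# Two-node active/active cluster, 2000 users per node.
active_active = load_after_failover([2000, 2000], failed=0)
# Two-node active/passive cluster, 1500 users on the single active node.
active_passive = load_after_failover([1500], failed=0, standby=True)

print(active_active, max(active_active) > RECOMMENDED_MAX)
print(active_passive, max(active_passive) > RECOMMENDED_MAX)
```

In this model, the active/active case leaves the surviving node carrying both nodes' users, while the active/passive case leaves the per-node load unchanged after the failover, which is the point of the N+1 recommendation.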
Clustering isn't a panacea. The primary cause of cluster failures isn't hardware or software; it's people. Clustering won't solve poor operational practices, such as failing to keep good backups, and it won't protect you from failures in your infrastructure, such as loss of power or Internet connectivity. But if you understand the underlying technologies and clustering's limitations, clustering can provide a more reliable Exchange experience for you and your users.
janak April 20, 2004