Internals of SQL Cluster Setup and Troubleshooting Tips
We will discuss and explain the following points in detail in this part of the article series:
- SQL Server Setup Functions
- How does SQL Server cluster setup configure a second node to be cluster-aware?
- What all-registry entries are created during the setup and their importance in a SQL Cluster environment?
I will start with the first topic mentioned in the above list.
SQL Server Setup Functions
SQL Setup mainly uses the SQLCluster.DLL from its BINN Directory to execute the following functions in it:
The explanation of each function is given below:
DoSQLClusterSetUpWork: This function is defined in the SQLCluster.dll module. It enables users to install the new SQL Instance in the cluster. Using this function, the setup performs a couple of tasks including gathering the list of nodes, list of disks, list of network interfaces etc. If you add any new disks to the cluster while the SQL Server setup is running, the disk will not be recognized by the SQL setup process unless you exit the setup and start it again from scratch. This is because the DoSQLClusterSetUpWork function executed from SQLCluster.dll gathers the cluster configuration information before it comes to select the disks or nodes available in the SQL Setup.
DoUnClusterSetup: This function's job is to prepare your setup to uninstall the SQL Instance from the cluster.
CheckDatabasesForInstance: The responsibility of the last function in SQLCluster.DLL is to make sure no instances of the same name are available in the cluster. This function is useful when running SQL Setup in a cluster environment or in a stand-alone environment. You cannot install two duplicate instances of the SQL. This function makes sure there is no duplicate SQL Instance in the cluster. When you run SQL Setup, the function checks the available SQL Instances, and based on that it will make sure the user-supplied instance is not already created in the SQL cluster.
How Does SQL Server Cluster Setup Configure a Second Node to Be Cluster-Aware?
A second node is generally a passive node on which SQL Setup, when installing SQL database instance in a cluster environment, will make it cluster-aware. We will explain this in detail. The process is very tricky for SQL Server setup when configuring a second node to be cluster-aware in a SQL Cluster. The setup performs several steps that are internal to SQL Setup. The On-Screen setup doesn't really tell what setup is actually doing in the background. These are logged in the SQL Setup log files. We will limit our discussion and will not discuss anything about the Setup Log files here.
The next part of this article series will explain SQL Setup log files.
For an SQL Node to be cluster-aware, SQL Setup must perform the following steps to ensure nodes in the cluster will be operable after installation. The Setup makes sure:
- The binary files have been installed and copied, and their modules are not registered with the system.
- The necessary registry entries have been created.
- The necessary SQL Cluster Modules have been copied.
- The necessary SQL Cluster Resource Types have been created and registered with clustering software.
- The SQL Service instances have been created in the registry.
- The necessary client registry entries have been created.
Setup should copy the binary files to the list of nodes to make sure when the failover occurs these binary files re-register themselves automatically and are ready to host the SQL database instance. For example, on a passive node (a node that doesn't hold the SQL resources currently and not serving any client), the SQL Server Agent and SQL Service modules are installed and their services will be in stopped state. Their modules also are not registered with the OS.
In a cluster environment the things work differently. The SQL Agent module located at
HKEY_LOCAL_MACHINESYSTEMCurrentControlsetServicesSQLSERVERAGENT is not registered with the system on a passive node. Rather, I would say the SQL Agent Service is not started. The module for SQLSERVERAGENT is sqlagent.exe. Even if you try to start the Agent service or SQL Instance Service you will get the error message. This is because the things are different in a cluster environment. The SQL Service, when you start, will try to mount its databases, which are located on Active Node this time. The Active node or the cluster will not allow the Passive node to gain control of the disks on which the SQL database is hosted. This is because of the clustering technologies in use. Microsoft clustering is a shared-nothing model in which only one node can gain control of one disk at a time.
When you try to start SQL Service on a passive node the service will search the registry at
HKEY_LCOAL_MACHINESOFTWAREMicrosoftMSSQLServerSetup to find the location of its database. If it can't find them, it will simply fail. In my setup, the SQL Server Service will mount the database from the Z: drive as shown in the below figure:
|SQL Database location in registry|
Note SQL Service will always read the master database and then query the user databases as shown in the below figure. The registry key for master database is located at
|SQL Master.mdf path in the registry|
Troubleshooting Tip: If you get an error message related to the database on an Active Node while starting the SQL Service you must check the location of your database at the above-mentioned registry key.
The SQL Setup copies the SQL binary files to the location you mentioned during the setup. The SQL Services and its sub-components identify the binary files location by querying the value of SQLPath entry in the registry at
What All-Registry Entries Are Created During the Setup and Their Importance in SQL Cluster and Stand-Alone?
Setup also creates the necessary registry entries to make sure SQL Services can search what they want when they are about to be active on a passive node. The purpose of these registry entries is to help client computers and SQL Services to locate the Virtual IP Address of the SQL Cluster and location of the SQL Databases respectively.
The following registry entries, as shown in the below figure, are created by the SQL Setup on each node:
|Non-Cluster-Specific Registry Entries|
As shown in above figure, the non-cluster specific registry entries are created generally by the SQL Setup to make sure the SQL Services can serve the client's requests coming over the network via Port 1433 as defined in the registry entry: DefaultPort. This is the default port client computers use to communicate with the SQL Instance. These registry entries are not related to the cluster in anyway. These registry entries are not important in failover also. This is because the registry entries created under the above key are meant only for the SQL Services (SQL Instance Service and SQL Agent Service).
|Cluster-Specific Registry Entries|
The cluster-specific registry entries play an important role on an active node. A client computer querying a SQL Server over the network will always connect to the Virtual IP Address listed in the "ClusterIPAddr" registry entry as shown in Figure 4. The Setup registers the host-IP mapping with the DNS Server configured. Without these registry entries it is not possible for a client computer to find the location of the SQL Virtual IP Address in the cluster.
When a node becomes active for the first time, the SQL engine configures the "Cluster-specific registry entries" on the Active Node to make sure the client requests coming for "ClusterIPAddr" are served by this node. Generally, these registry entries are created when a failover occurs.
Troubleshooting Tip: If your client computers are unable to access SQL Instance in a cluster, always make sure the above registry entries are created in the Active Node of the cluster where the SQL Instance is currently running.
This article examined the functions of the SQL Server Setup and how SQL Setup configures a node to be cluster-aware in the cluster. We also saw the necessary registry entries created by the SQL Setup. Finally, we saw the importance of the registry entries in a SQL Cluster. The next article of this series will continue to explore the following points:
- How do registry entries play an important role in failover?
- What all SQL "Cluster" modules are copied during the setup and their importance in a SQL Cluster?
- What all SQL modules are loaded when a SQL Server Instance is running as Active Node?
- A technique to manually uninstall SQL Server Instance from a cluster (without running SQL Server Setup wizard)
- What has been changed in SQL Server 2005 setup process for cluster?
- SQL Server Cluster Setup Troubleshooting Techniques.
- SQL Server Setup Log Files.