High Availability (Active/Passive) sipXecs implementation using existing iSCSI shared storage and CentOS clustering.
Caveats
- It should be noted this is a highly complex topic.
- This document does not cover all aspects of clustering and failover scenarios and is not intended to go through the exact installation steps.
- It is advisable to keep the cluster nodes on their own subnet as each node broadcasts "heartbeats" to the other node to ensure it is still alive.
System requirements
- Two high-powered servers (8 GB RAM minimum), each with three NICs
- Dedicated networks for each service
- A fencing device on each server, such as Dell DRAC or HP iLO, so that a misbehaving cluster node can be powered off immediately to prevent data corruption and to trigger a proper failover.
Example Diagram
The following is a diagram of the clustering scenario we will be setting up:
Install OS
Install CentOS 5.6 on two identical servers. You may also configure the NICs at this point. One NIC will be for the iSCSI storage network (eth2), one will be for clustering heartbeat (eth1), and one NIC will be for all other communications, including sipXecs (eth0).
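As an illustration only (all addresses below are examples, not values taken from this document), a static configuration for the iSCSI storage interface might look like this:
Code Block
# /etc/sysconfig/network-scripts/ifcfg-eth2  (example values)
DEVICE=eth2
BOOTPROTO=static
IPADDR=172.16.5.21
NETMASK=255.255.255.0
ONBOOT=yes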
...
When prompted for software selection, deselect Desktop - Gnome and select Clustering and Storage Clustering. Perform the install.
Post Installation
After each system reboots you'll need to disable the firewall and turn off SELinux. This is done when the Setup Agent appears: choose Firewall, then disable all security settings.
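If you prefer to make these changes from the command line instead of the Setup Agent, the equivalent steps are roughly the following sketch (a reboot or setenforce is still needed for SELinux to be fully out of the way):
Code Block
# Stop and disable the firewall
service iptables stop
chkconfig iptables off
# Disable SELinux permanently, then switch the running system to permissive
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
setenforce 0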
...
Code Block
yum install xauth
Connect to Shared iSCSI disk
Each server needs to have iSCSI turned on to connect. Run the following commands to enable iSCSI:
Code Block
chkconfig iscsi on
service iscsi start
On each server you need to connect to a blank (fresh) shared iSCSI disk. For example, if your iSCSI shared disk is located at IP address 172.16.5.10 you would run the following command on each server:
...
You will now see the addition of a drive to the server, usually /dev/sda or something similar.
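For reference, a typical open-iscsi discovery and login sequence against the example portal above looks like the following sketch; the target IQNs reported by discovery will differ in your environment:
Code Block
# Discover targets offered by the iSCSI portal
iscsiadm -m discovery -t sendtargets -p 172.16.5.10
# Log in to all discovered targets
iscsiadm -m node -l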
Configure LVM to allow for filesystem clustering
Edit /etc/lvm/lvm.conf on each node and change the following line:
...
and run the following command for this change to take effect:
Code Block
vgscan
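If the line in question is not obvious, the usual change to lvm.conf on a CentOS 5 cluster is the LVM locking type, switching it from local to built-in clustered locking. The value below is the typical one; confirm it against your cluster documentation:
Code Block
# /etc/lvm/lvm.conf
locking_type = 3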
Add Clustered Network Script
Create a file on both nodes called /usr/local/bin/clusnet and fill it with the following, modifying SIPX_IP='172.16.1.5' to be the IP address of the sipXecs system, as well as the Ethernet interface variables for your services:
Code Block
#!/bin/bash
#
# Simple script for cluster services to use for the bare metal ethernet interface
#
# chkconfig: 345 89 14
# description: Starts and stops the clustered service network interface
# processname: net
#
# Source function library.
. /etc/init.d/functions

SIPX_IP='172.16.1.5'
SIPX_INTERFACE=eth0
CLUS_INTERFACE=eth2
ISCSI_INTERFACE=eth1
PROG_NAME=clusnet
RETVAL=0
# Read the default gateway and netmask from the interface's normal configuration
DEFAULT_GW=`cat /etc/sysconfig/network-scripts/ifcfg-$SIPX_INTERFACE | grep GATEWAY | sed 's/GATEWAY=//g'`
SUBNET_MASK=`cat /etc/sysconfig/network-scripts/ifcfg-$SIPX_INTERFACE | grep NETMASK | sed 's/NETMASK=//g'`
INTUP=`/sbin/ifconfig | grep -c $SIPX_INTERFACE`

start() {
    echo -n "Starting $SIPX_INTERFACE: "
    # Drop the node's own address and bring the interface up on the sipXecs IP
    ip address flush dev $SIPX_INTERFACE
    sleep 3
    ifconfig $SIPX_INTERFACE $SIPX_IP netmask $SUBNET_MASK up
    sleep 3
    route add default gw $DEFAULT_GW dev $SIPX_INTERFACE
    echo
    echo -n $"$SIPX_INTERFACE is up."
    success $"$SIPX_INTERFACE is up."
    echo
    # postgresql is managed by the cluster, so keep it out of the normal boot sequence
    chkconfig postgresql off
    return $RETVAL
}

stop() {
    echo -n "Shutting down $SIPX_INTERFACE: "
    # Return all interfaces to the addresses defined in their ifcfg files
    ifdown $SIPX_INTERFACE
    ifup $SIPX_INTERFACE
    ifup $ISCSI_INTERFACE
    ifup $CLUS_INTERFACE
    echo
    echo -n $"Interface $SIPX_INTERFACE is down."
    success $"Interface $SIPX_INTERFACE is down."
    echo
    chkconfig postgresql off
    return $RETVAL
}

getstatus() {
    if [ $INTUP -lt 1 ]
    then
        echo
        echo -n $"Interface $SIPX_INTERFACE is down."
        failure $"Interface $SIPX_INTERFACE is down."
        echo
        return 1
    fi
    echo
    echo -n $"Interface $SIPX_INTERFACE is up."
    success $"Interface $SIPX_INTERFACE is up."
    echo
    return $RETVAL
}

case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    status)
        getstatus
        ;;
    restart)
        stop
        sleep 5
        start
        ;;
    *)
        echo "Usage: $PROG_NAME {start|stop|status|restart}"
        exit 1
        ;;
esac
exit $RETVAL
...
Note
This script replaces your primary communication network's IP address with the sipXecs server's IP address so that there is no IP address conflict within sipXecs services.
Configure Clustering
Initial cluster configuration is done with the utility system-config-cluster. If you're running a headless (text-only) install, this requires X forwarding, for example via PuTTY and Xming. Download Xming from http://sourceforge.net/projects/xming/files/Xming/6.9.0.31/Xming-6-9-0-31-setup.exe/download, install it, then start it. In PuTTY, enable the option Enable X11 Forwarding under Connection >> SSH >> X11. You may then connect to the primary node and run the following command:
...
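If you are connecting from a Linux or macOS workstation rather than Windows, the same X-forwarded session is simply the following (the node address is an example):
Code Block
ssh -X root@172.16.1.6
system-config-cluster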
For a simple two node cluster it is not necessary to use a quorum disk. Click OK.
Add Nodes
We now need to add all of our cluster information into the cluster configuration manager. The first thing to do is add the cluster nodes: click on Cluster Nodes, then click Add a Cluster Node. Be sure to use IP addresses and not DNS host names:
Click OK then add the second node in the same fashion.
Add Fencing Device
In the system requirements section of this document it was noted that you would need a fencing device so the cluster can power down a troublesome node. To add a fencing device, click on Fence Devices, then click Add a Fencing Device. You will now need to select the type of fencing device you have installed:
...
Add one fencing device into the cluster manager for each iLO/DRAC you have (for two servers, you'd have two fencing devices).
Configuring hardware fencing in detail, however, is beyond the scope of this document. For demonstration purposes we will use manual fencing.
Add Fencing Device to Cluster Nodes
Each cluster node needs to have its own fencing device assigned to itself so that the cluster manager knows how to power down the device. To add a fencing device to a cluster node, click on Cluster Nodes, click on the cluster node you wish to modify, then click Manage Fencing For This Node. You will now see the following:
...
Click Close to return to the cluster configuration system. Repeat the cluster node fence configuration for the other node in this cluster.
Add Resources
All items utilized by the cluster nodes are known as resources. In our case, our resources are:
...
- Name: clusnet
- File (with path): /usr/local/bin/clusnet
Add Cluster Service
Once you've created all the necessary resources you must now assign them to a cluster service.
...
For now we don't want this service to start automatically when the cluster service starts, so uncheck the Autostart This Service checkbox, then click Close.
Save cluster configuration
To save the cluster configuration, click File, then Save, and save the configuration to /etc/cluster/cluster.conf.
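For orientation, the resource and service definitions created by the GUI end up in /etc/cluster/cluster.conf as something like the following sketch. The names and attributes here are illustrative only; let system-config-cluster generate the real file:
Code Block
<rm>
  <resources>
    <script name="clusnet" file="/usr/local/bin/clusnet"/>
  </resources>
  <service name="sipXpbx" autostart="0" recovery="relocate">
    <script ref="clusnet"/>
  </service>
</rm>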
Propagate the Cluster Configuration
For the initial propagation of the cluster configuration to the second cluster node you will need to manually copy /etc/cluster/cluster.conf from the primary node to the secondary node. This can be done with SCP or by copying and pasting the contents of the file.
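Assuming the secondary node is reachable over SSH (the address below is a placeholder), the copy is a one-liner run from the primary node:
Code Block
scp /etc/cluster/cluster.conf root@<secondary-node-ip>:/etc/cluster/cluster.conf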
Start Clustering Service
To start the clustering service and all of the necessary dependencies, run the following commands on both nodes:
...
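On a CentOS 5 cluster using GFS and rgmanager these are typically the following init scripts, started in this order (a sketch; use chkconfig as well if you want them enabled at boot). Running clustat afterwards shows output similar to the block below:
Code Block
service cman start
service clvmd start
service gfs start
service rgmanager start
clustat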
Code Block
Cluster Status for uc_cluster @ Fri Jul  1 00:56:23 2011
Member Status: Quorate

 Member Name          ID   Status
 ------ ----          ---- ------
 172.16.1.6           1    Online, Local, rgmanager
 172.16.1.8           2    Online, rgmanager

 Service Name         Owner (Last)    State
 ------- ----         ----- ------    -----
 service:sipXpbx      (172.16.1.6)    disabled
Create Clustered Storage
As of this writing the best filesystem to use for clustered storage is GFS. Unlike most other filesystems, GFS is safe to mount on both nodes at the same time, which allows for quick failover with less risk of data corruption.
Create LVM
GFS requires LVM to operate properly, so we will need to create an LVM physical volume (PV), volume group (VG), and logical volumes (LVs). This only needs to be done on one node of the cluster, preferably the primary node.
...
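A sketch of the PV and VG creation, assuming the shared disk appeared as /dev/sda; the -c y flag marks the volume group as clustered. An etc_sipxpbx volume is also referenced later in this document, so it is included here with an assumed size. The remaining logical volumes are then carved out as shown in the block below:
Code Block
pvcreate /dev/sda
vgcreate -c y sipx-vol /dev/sda
lvcreate -L1G -netc_sipxpbx sipx-vol   # size is an assumption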
Code Block
lvcreate -L5G -nopenfire sipx-vol
lvcreate -L5G -nfreeswitch sipx-vol
lvcreate -L30G -nsipxdata sipx-vol
lvcreate -L30G -npgsql sipx-vol
Create GFS file systems
Run the following command to create a GFS partition on the etc_sipxpbx volume (this only needs to be done on one node of the cluster, preferably the primary node):
...
Code Block
gfs_mkfs -p lock_dlm -t uc_cluster:freeswitch -j 2 /dev/sipx-vol/freeswitch
gfs_mkfs -p lock_dlm -t uc_cluster:openfire -j 2 /dev/sipx-vol/openfire
gfs_mkfs -p lock_dlm -t uc_cluster:pgsql -j 2 /dev/sipx-vol/pgsql
gfs_mkfs -p lock_dlm -t uc_cluster:sipxdata -j 2 /dev/sipx-vol/sipxdata
Mount Filesystems in Appropriate Locations
First we need to create the directories where these filesystems will be located. Run the following commands on both nodes:
...
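The mount points implied by the fstab entries below can be created with a single command on each node:
Code Block
mkdir -p /etc/sipxpbx /opt/freeswitch /opt/openfire /var/sipxdata /var/lib/pgsql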
Code Block
/dev/sipx-vol/freeswitch    /opt/freeswitch    gfs    defaults    0 0
/dev/sipx-vol/openfire      /opt/openfire      gfs    defaults    0 0
/dev/sipx-vol/sipxdata      /var/sipxdata      gfs    defaults    0 0
/dev/sipx-vol/etc_sipxpbx   /etc/sipxpbx       gfs    defaults    0 0
/dev/sipx-vol/pgsql         /var/lib/pgsql     gfs    defaults    0 0
...
Code Block
mount /etc/sipxpbx
mount /opt/freeswitch
mount /opt/openfire
mount /var/sipxdata
mount /var/lib/pgsql
Install sipXecs Packages
Now that we have all the shared storage mounted we can install and set up sipXecs. First we need to install the repository. Edit /etc/yum.repos.d/sipxecs.repo on both nodes and add the following information:
...
Note
In the unlikely event that your postgres and sipxchange UIDs and GIDs do not match up on the two nodes, you will need to change /etc/passwd and /etc/group on the first node to match the values present on the second node. This is because the installation is performed last on the second node, so filesystem permissions on the shared storage are set to the UIDs of the users on the second node.
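A quick way to compare the relevant IDs is shown below; run it on both nodes and compare the output (usermod and groupmod can then be used on the first node if the values differ):
Code Block
# Show UID/GID entries for the postgres and sipxchange accounts
grep -E '^(postgres|sipxchange):' /etc/passwd
grep -E '^(postgres|sipxchange):' /etc/group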
Run sipXecs setup
On the primary node, enable the sipXecs NIC by running the following command:
...
You'll also need to change the IP address of your primary node's communication interface (eth0) to an IP address that is different from the sipXecs system's IP address (but on the same subnet) so that sipXecs can function on both nodes. When you are finished, switch the interface back to its original settings by running:
Code Block
/usr/local/bin/clusnet stop
DNS Caching Nameserver Configuration
Because running sipxecs-setup-system would have broken a number of things, we'll have to perform a few DNS-related steps manually.
...
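The exact steps depend on your environment, but a minimal caching nameserver on CentOS 5 is usually set up along the lines of the sketch below (package names vary slightly between point releases, and sipXecs also needs proper SIP SRV records in your zone):
Code Block
yum install bind caching-nameserver
chkconfig named on
service named start
# Point the resolver at the local caching nameserver
echo "nameserver 127.0.0.1" > /etc/resolv.conf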
Reboot both nodes simultaneously for all configuration changes to take effect.
Starting sipXecs Cluster Service
To start sipXecs you'll need to start system-config-cluster then click on the Cluster Management tab at the top of the window. Here you'll see the active nodes and services:
...
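The same can be done from the command line with the rgmanager tools. Assuming the service name sipXpbx shown earlier, enabling it and checking its state looks like this:
Code Block
clusvcadm -e sipXpbx
clustat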