High Availability Cluster with Pacemaker, Chronyd, and HAProxy
A Comprehensive Guide for Debian Systems
Introduction
This guide documents the setup of a 4-node high availability cluster using:
- Pacemaker/Corosync for cluster management
- HAProxy for load balancing
- Chronyd for time synchronization
- Automated DNS updates during failover scenarios
Prerequisites
Hardware Requirements
- 4 identical Debian servers (≥ 2GB RAM, ≥ 2 CPU cores recommended)
- Network interfaces:
- Primary: 1Gbps (for client traffic)
- Secondary: 100Mbps minimum (for heartbeat)
Software Requirements
- Debian 10/11 (tested on both)
- Root access to all nodes
- DNS server supporting dynamic updates
Network Requirements
- Static IP assignments for all nodes
- Fully qualified domain names for each node
- Unrestricted communication on ports:
- 2224 (pcsd)
- 5404-5405 (corosync)
- 80/443 (HAProxy)
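These reachability requirements can be spot-checked from any node with a small loop over bash's /dev/tcp redirection (a sketch; the default peer name is an assumption, pass whichever node you are testing against):

<syntaxhighlight lang="bash">
# Sketch: probe the cluster ports on a peer node using bash's /dev/tcp.
# The default peer name is an assumption; pass any node as the first argument.
check_ports() {
  local node="$1"
  for port in 2224 5404 5405 80 443; do
    if timeout 2 bash -c "echo > /dev/tcp/$node/$port" 2>/dev/null; then
      echo "$node:$port open"
    else
      echo "$node:$port closed"
    fi
  done
}

check_ports "${1:-node2.cluster.local}"
</syntaxhighlight>

Run this from each node against each peer once the firewall rules below are in place; a "closed" line points at either a firewall rule or a service that is not listening yet.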
Initial Node Setup
1. System Preparation
On all nodes (node1-node4):
<syntaxhighlight lang="bash">
# Update system
sudo apt update && sudo apt upgrade -y

# Install base packages
sudo apt install -y \
  pacemaker \
  pcs \
  corosync \
  crmsh \
  haproxy \
  chrony \
  bind9utils \
  mailutils \
  vim
</syntaxhighlight>
2. Hosts File Configuration
Add to /etc/hosts on every node:
<syntaxhighlight lang="conf">
192.168.1.101 node1.cluster.local node1
192.168.1.102 node2.cluster.local node2
192.168.1.103 node3.cluster.local node3
192.168.1.104 node4.cluster.local node4
</syntaxhighlight>
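If you prefer to generate these entries rather than type them on each node, a small helper can emit them from the addressing scheme above (a sketch; adjust the IPs and names to your environment):

<syntaxhighlight lang="bash">
# Sketch: emit the /etc/hosts entries for the node1-node4 addressing above.
gen_hosts() {
  local i=1
  for ip in 192.168.1.101 192.168.1.102 192.168.1.103 192.168.1.104; do
    printf '%s node%d.cluster.local node%d\n' "$ip" "$i" "$i"
    i=$((i + 1))
  done
}

gen_hosts   # review the output first, then: gen_hosts | sudo tee -a /etc/hosts
</syntaxhighlight>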
3. Firewall Configuration
<syntaxhighlight lang="bash">
sudo apt install -y ufw
sudo ufw allow from 192.168.1.0/24
sudo ufw enable
</syntaxhighlight>
Cluster Configuration
1. Initialize Cluster Services
<syntaxhighlight lang="bash">
sudo systemctl enable --now pcsd corosync pacemaker
</syntaxhighlight>
2. Set hacluster Password (on all nodes)
<syntaxhighlight lang="bash">
echo "hacluster:SecureP@ssw0rd123" | sudo chpasswd
</syntaxhighlight>
3. Authenticate Nodes (from node1)
<syntaxhighlight lang="bash">
sudo pcs cluster auth \
  node1.cluster.local \
  node2.cluster.local \
  node3.cluster.local \
  node4.cluster.local \
  -u hacluster \
  -p SecureP@ssw0rd123 \
  --force
</syntaxhighlight>
Note: with pcs 0.10 and later this subcommand was renamed; use <code>sudo pcs host auth</code> with the same node list and credentials.
4. Create Cluster
<syntaxhighlight lang="bash">
sudo pcs cluster setup \
  --name haproxy_cluster \
  node1.cluster.local \
  node2.cluster.local \
  node3.cluster.local \
  node4.cluster.local \
  --force
</syntaxhighlight>
Note: with pcs 0.10 and later, drop <code>--name</code> and pass the cluster name as the first argument: <code>sudo pcs cluster setup haproxy_cluster node1.cluster.local …</code>
5. Start Cluster
<syntaxhighlight lang="bash">
sudo pcs cluster start --all
sudo pcs cluster enable --all
</syntaxhighlight>
6. Verify Status
<syntaxhighlight lang="bash">
sudo pcs status
</syntaxhighlight>
Expected output:
<syntaxhighlight lang="text">
Cluster name: haproxy_cluster
Stack: corosync
Current DC: node1.cluster.local (version x.x.x-x)
4 nodes configured
0 resources configured
</syntaxhighlight>
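The failover script later in this guide parses this output to find the current DC; that extraction can be isolated as a small filter (a sketch based on the status format shown above; field positions can shift between Pacemaker versions, so verify against your own output):

<syntaxhighlight lang="bash">
# Sketch: pull the hostname out of the "Current DC:" line of `pcs status`.
# Field 3 matches the output format shown above.
current_dc() {
  awk '/Current DC:/ {print $3}'
}

# Usage: sudo pcs status | current_dc
</syntaxhighlight>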
HAProxy Setup
1. Install HAProxy on All Nodes
<syntaxhighlight lang="bash">
sudo apt install -y haproxy
</syntaxhighlight>
2. Configuration Template (/etc/haproxy/haproxy.cfg)
<syntaxhighlight lang="conf">
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon
    maxconn 4000
    tune.ssl.default-dh-param 2048

defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5s
    timeout client 50s
    timeout server 50s
    option forwardfor
    option http-server-close

frontend http_front
    bind *:80
    bind *:443 ssl crt /etc/ssl/private/example.com.pem
    http-request set-header X-Forwarded-Port %[dst_port]
    http-request add-header X-Forwarded-Proto https if { ssl_fc }
    default_backend http_back

backend http_back
    balance roundrobin
    cookie SERVERID insert indirect nocache
    server webserver1 192.168.1.201:80 check cookie s1
    server webserver2 192.168.1.202:80 check cookie s2

listen stats
    bind *:1936
    stats enable
    stats uri /
    stats hide-version
    stats auth admin:SecureStatsP@ss
</syntaxhighlight>
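Before handing the service to Pacemaker, it is worth validating the file syntax on each node; HAProxy's check mode (<code>haproxy -c</code>) parses the configuration and exits non-zero on errors. A small wrapper (a sketch; the default path is the one used above) makes this scriptable:

<syntaxhighlight lang="bash">
# Sketch: validate an HAProxy configuration file before (re)loading it.
# haproxy -c parses the file and exits non-zero on errors.
validate_cfg() {
  local cfg="${1:-/etc/haproxy/haproxy.cfg}"
  if [ ! -r "$cfg" ]; then
    echo "missing: $cfg"
    return 1
  fi
  haproxy -c -f "$cfg"
}

# Usage: validate_cfg                    (checks /etc/haproxy/haproxy.cfg)
#        validate_cfg /path/to/test.cfg
</syntaxhighlight>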
3. Add as Cluster Resource
<syntaxhighlight lang="bash">
sudo pcs resource create haproxy systemd:haproxy \
  op monitor interval=10s timeout=20s \
  op start interval=0s timeout=30s \
  op stop interval=0s timeout=30s \
  --group haproxy_group
</syntaxhighlight>
Time Synchronization
1. Configure Chronyd (/etc/chrony/chrony.conf)
<syntaxhighlight lang="conf">
pool 0.debian.pool.ntp.org iburst
pool 1.debian.pool.ntp.org iburst
pool 2.debian.pool.ntp.org iburst
pool 3.debian.pool.ntp.org iburst

allow 192.168.1.0/24
leapsectz right/UTC
makestep 1.0 3
</syntaxhighlight>
2. Verify Time Sync
<syntaxhighlight lang="bash">
sudo systemctl restart chronyd
sudo chronyc tracking
</syntaxhighlight>
Failover Automation
1. DNS Update Script (/usr/local/bin/update_dns.sh)
<syntaxhighlight lang="bash">
#!/bin/bash
CLUSTER_NAME="haproxy_cluster"
VIP="192.168.1.100"
DNS_RECORD="haproxy.example.com"
DNS_SERVER="ns1.example.com"
DNS_KEY="/etc/bind/cluster.key"
TTL=300

# Field 3 of the "Current DC:" line is the hostname
ACTIVE_NODE=$(sudo pcs status | grep "Current DC:" | awk '{print $3}')
logger -t haproxy-cluster "Failover detected. Active node: $ACTIVE_NODE"

nsupdate -k "$DNS_KEY" <<EOF
server $DNS_SERVER
update delete $DNS_RECORD A
update add $DNS_RECORD $TTL A $VIP
send
EOF

# Confirm the record now resolves to the VIP
dig +short "$DNS_RECORD" @"$DNS_SERVER"
</syntaxhighlight>
2. Make Script Executable
<syntaxhighlight lang="bash">
sudo chmod 750 /usr/local/bin/update_dns.sh
sudo chown hacluster:haclient /usr/local/bin/update_dns.sh
</syntaxhighlight>
3. Configure Pacemaker Resource
<syntaxhighlight lang="bash">
sudo pcs resource create dns_update ocf:pacemaker:ClusterMon \
  user=hacluster \
  extra_options="-E /usr/local/bin/update_dns.sh" \
  op monitor interval=15s timeout=30s \
  meta target-role=Started

sudo pcs constraint colocation add dns_update with haproxy_group INFINITY
sudo pcs constraint order haproxy_group then dns_update
</syntaxhighlight>
Note: <code>crm_mon</code> takes the external agent program with <code>-E</code> (lowercase <code>-e</code> sets the external recipient).
Testing Procedures
1. Manual Failover Test
<syntaxhighlight lang="bash">
sudo pcs resource move haproxy_group node2.cluster.local
sudo pcs status | grep -A5 "haproxy_group"
sudo pcs resource clear haproxy_group
</syntaxhighlight>
2. Simulate Node Failure
<syntaxhighlight lang="bash">
# On the node being "failed":
sudo systemctl stop corosync

# From a surviving node, watch the cluster recover:
sudo pcs status
watch -n 1 sudo pcs status
</syntaxhighlight>
3. Verify DNS Updates
<syntaxhighlight lang="bash">
dig +short haproxy.example.com
</syntaxhighlight>
Maintenance Operations
1. Node Maintenance
<syntaxhighlight lang="bash">
sudo pcs node standby node3.cluster.local
sudo pcs node unstandby node3.cluster.local
</syntaxhighlight>
2. Cluster Maintenance
<syntaxhighlight lang="bash">
sudo pcs cluster standby --all
sudo pcs cluster unstandby --all
</syntaxhighlight>
3. Adding/Removing Nodes
<syntaxhighlight lang="bash">
sudo pcs cluster node add node5.cluster.local
sudo pcs cluster node remove node4.cluster.local
</syntaxhighlight>
Troubleshooting
Common Issues
Corosync fails to start:
<syntaxhighlight lang="bash">
sudo corosync-cfgtool -s
sudo corosync-cmapctl | grep members
</syntaxhighlight>
Split-brain scenario:
<syntaxhighlight lang="bash">
sudo pcs property set stonith-enabled=true
sudo pcs stonith create fence_ipmi fence_ipmilan ...
</syntaxhighlight>
Resource failures:
<syntaxhighlight lang="bash">
sudo pcs resource debug-start haproxy
sudo pcs resource failcount show
sudo pcs resource cleanup haproxy
</syntaxhighlight>
Log Locations
- Corosync: /var/log/corosync/corosync.log
- Pacemaker: journalctl -u pacemaker
- HAProxy: /var/log/haproxy.log
Appendix
Sample DNS Key File (/etc/bind/cluster.key)
<syntaxhighlight lang="conf">
key "cluster-key" {
    algorithm hmac-sha256;
    secret "Base64EncodedKeyHere==";
};
</syntaxhighlight>
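Rather than inventing the secret by hand, the file can be generated: <code>tsig-keygen -a hmac-sha256 cluster-key</code> (shipped with the BIND utilities installed earlier) emits exactly this format, and the same structure can be produced with plain shell (a sketch, assuming <code>base64</code> and <code>/dev/urandom</code> are available):

<syntaxhighlight lang="bash">
# Sketch: generate a cluster.key file with a random 256-bit secret.
# tsig-keygen -a hmac-sha256 cluster-key produces the same format.
gen_key_file() {
  local secret
  secret=$(head -c 32 /dev/urandom | base64 | tr -d '\n')
  printf 'key "cluster-key" {\n'
  printf '    algorithm hmac-sha256;\n'
  printf '    secret "%s";\n' "$secret"
  printf '};\n'
}

gen_key_file   # then: gen_key_file | sudo tee /etc/bind/cluster.key
</syntaxhighlight>

Remember to distribute the same key file to every node that runs the DNS update script, and keep its permissions restrictive.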
Useful Commands
<syntaxhighlight lang="bash">
sudo pcs config show
sudo pcs stonith show
sudo pcs resource operations haproxy
</syntaxhighlight>
References & Further Reading
- Pacemaker Documentation
- HAProxy Configuration Manual
- Debian Cluster Suite
- Corosync Documentation
- Chrony Documentation
This comprehensive guide includes: step-by-step instructions, configuration examples, automated failover procedures, maintenance and troubleshooting, and a reference appendix.