High Availability Cluster with Pacemaker, Chronyd
Introduction
This guide documents the setup of a 4-node high availability cluster using:
- Pacemaker/Corosync for cluster management
- HAProxy for load balancing
- Chronyd for time synchronization
- Automated DNS updates during failover scenarios
Prerequisites
Hardware Requirements
- 4 identical Debian servers (≥ 2GB RAM, ≥ 2 CPU cores recommended)
- Network interfaces:
  - Primary: 1 Gbps (for client traffic)
  - Secondary: 100 Mbps minimum (for heartbeat)
Software Requirements
- Debian 10/11 (tested on both)
- Root access to all nodes
- DNS server supporting dynamic updates
Network Requirements
- Static IP assignments for all nodes
- Fully qualified domain names for each node
- Unrestricted communication between nodes on ports:
  - 2224/tcp (pcsd)
  - 5404-5405/udp (corosync)
  - 80/tcp and 443/tcp (HAProxy)
Initial Node Setup
1. System Preparation
On all nodes (node1-node4):
sudo apt update && sudo apt upgrade -y
# dnsutils provides nsupdate and dig, which the DNS update script relies on
sudo apt install -y \
    pacemaker \
    pcs \
    corosync \
    crmsh \
    haproxy \
    chrony \
    bind9utils \
    dnsutils \
    mailutils \
    vim
2. Hosts File Configuration
Add to /etc/hosts on every node:
192.168.1.101 node1.cluster.local node1
192.168.1.102 node2.cluster.local node2
192.168.1.103 node3.cluster.local node3
192.168.1.104 node4.cluster.local node4
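For larger clusters, the same entries can be generated rather than typed. The sketch below assumes nodes are numbered sequentially and share one domain, as in the example above; the function name gen_hosts is hypothetical.

```shell
# Print /etc/hosts entries for sequentially numbered cluster nodes.
# Usage: gen_hosts <network-prefix> <first-host-octet> <node-count> <domain>
gen_hosts() {
  prefix="$1"
  first="$2"
  count="$3"
  domain="$4"
  i=1
  while [ "$i" -le "$count" ]; do
    # Each entry has the form: IP, FQDN, short name
    printf '%s.%d node%d.%s node%d\n' \
      "$prefix" "$((first + i - 1))" "$i" "$domain" "$i"
    i=$((i + 1))
  done
}
```

Running gen_hosts 192.168.1 101 4 cluster.local prints the four lines shown above; review the output before appending it to /etc/hosts.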
3. Firewall Configuration
sudo apt install -y ufw
sudo ufw allow from 192.168.1.0/24
sudo ufw enable
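The single rule above opens every port to the cluster subnet. A tighter rule set, limited to the ports listed under Network Requirements, might look like the sketch below. It makes assumptions that are not part of this guide: SSH runs on port 22 and is reachable only from the cluster subnet, and client traffic on 80/443 is accepted from anywhere. The function name print_ufw_rules is hypothetical, and it only prints the commands so they can be reviewed first.

```shell
# Print ufw commands that restrict cluster traffic to the documented ports.
# Usage: print_ufw_rules <cluster-subnet-in-CIDR-notation>
print_ufw_rules() {
  subnet="$1"
  # Assumption: SSH on port 22, limited to the cluster subnet
  printf 'ufw allow from %s to any port 22 proto tcp\n' "$subnet"
  # pcsd listens on TCP 2224
  printf 'ufw allow from %s to any port 2224 proto tcp\n' "$subnet"
  # corosync uses UDP 5404-5405
  printf 'ufw allow from %s to any port 5404:5405 proto udp\n' "$subnet"
  # Assumption: HAProxy client traffic is accepted from any source
  printf 'ufw allow 80/tcp\n'
  printf 'ufw allow 443/tcp\n'
}
```

After reviewing the output of print_ufw_rules 192.168.1.0/24, it can be applied with print_ufw_rules 192.168.1.0/24 | sudo sh, followed by sudo ufw enable.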
Cluster Configuration
1. Initialize Cluster Services
sudo systemctl enable --now pcsd corosync pacemaker
2. Set hacluster Password
echo "hacluster:SecureP@ssw0rd123" | sudo chpasswd
3. Authenticate Nodes (from node1)
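# pcs 0.10+ (the series packaged for Debian 10/11) authenticates nodes with "pcs host auth".
# On pcs 0.9 the equivalent is: pcs cluster auth <nodes> -u hacluster -p <password>
sudo pcs host auth \
    node1.cluster.local \
    node2.cluster.local \
    node3.cluster.local \
    node4.cluster.local \
    -u hacluster \
    -p SecureP@ssw0rd123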
4. Create Cluster
# pcs 0.10+ takes the cluster name as the first positional argument (pcs 0.9 used --name)
sudo pcs cluster setup haproxy_cluster \
    node1.cluster.local \
    node2.cluster.local \
    node3.cluster.local \
    node4.cluster.local \
    --force
5. Start Cluster
sudo pcs cluster start --all
sudo pcs cluster enable --all
6. Verify Status
sudo pcs status
Expected output:
Cluster name: haproxy_cluster
Stack: corosync
Current DC: node1.cluster.local (version x.x.x-x)
4 nodes configured
0 resources configured
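The node count in the expected output can also be checked by a script, for example from a provisioning job. This is a minimal sketch that assumes the status text contains a line of the form "N nodes configured" (optionally prefixed with "*", as newer Pacemaker releases print it); the function name check_node_count is hypothetical.

```shell
# Read `pcs status` text on stdin and succeed only if it reports
# the expected number of configured nodes.
# Usage: sudo pcs status | check_node_count 4
check_node_count() {
  expected="$1"
  # Extract N from a line such as "4 nodes configured" or "  * 4 nodes configured"
  actual=$(sed -n 's/^[ *]*\([0-9][0-9]*\) nodes\{0,1\} configured.*/\1/p' | head -n 1)
  [ -n "$actual" ] && [ "$actual" -eq "$expected" ]
}
```

The function returns a non-zero exit status when the line is missing or the count differs, so it can gate later steps such as resource creation.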
HAProxy Setup
1. Install HAProxy on All Nodes
sudo apt install -y haproxy
2. Configuration Template (/etc/haproxy/haproxy.cfg)
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon
    maxconn 4000
    tune.ssl.default-dh-param 2048

defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5s
    timeout client 50s
    timeout server 50s
    option forwardfor
    option http-server-close

frontend http_front
    bind *:80
    bind *:443 ssl crt /etc/ssl/private/example.com.pem
    http-request set-header X-Forwarded-Port %[dst_port]
    http-request add-header X-Forwarded-Proto https if { ssl_fc }
    default_backend http_back

backend http_back
    balance roundrobin
    cookie SERVERID insert indirect nocache
    server webserver1 192.168.1.201:80 check cookie s1
    server webserver2 192.168.1.202:80 check cookie s2

listen stats
    bind *:1936
    stats enable
    stats uri /
    stats hide-version
    stats auth admin:SecureStatsP@ss
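When the list of web servers changes often, the server lines of the backend can be generated from a list of IP addresses. The sketch below follows the naming used in the template (webserverN, cookie sN, port 80); the function name gen_backend_servers is hypothetical.

```shell
# Print one HAProxy "server" line per backend IP address.
# Usage: gen_backend_servers <ip> [<ip> ...]
gen_backend_servers() {
  n=1
  for ip in "$@"; do
    # Same layout as the template: name, address:port, health check, cookie value
    printf '    server webserver%d %s:80 check cookie s%d\n' "$n" "$ip" "$n"
    n=$((n + 1))
  done
}
```

After editing the file, haproxy -c -f /etc/haproxy/haproxy.cfg checks the configuration for syntax errors without starting the service.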
3. Add as Cluster Resource
sudo pcs resource create haproxy systemd:haproxy \
    op monitor interval=10s timeout=20s \
    op start interval=0s timeout=30s \
    op stop interval=0s timeout=30s \
    --group haproxy_group
Time Synchronization
1. Configure Chronyd (/etc/chrony/chrony.conf)
pool 0.debian.pool.ntp.org iburst
pool 1.debian.pool.ntp.org iburst
pool 2.debian.pool.ntp.org iburst
pool 3.debian.pool.ntp.org iburst
allow 192.168.1.0/24
leapsectz right/UTC
makestep 1.0 3
2. Verify Time Sync
sudo systemctl restart chronyd
sudo chronyc tracking
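To turn the tracking output into a pass/fail check, the "System time" line can be compared against a limit. This is a rough sketch: the 0.1 second limit in the usage line is an arbitrary example rather than a recommendation from this guide, and the function name check_time_offset is hypothetical.

```shell
# Read `chronyc tracking` output on stdin and succeed if the reported
# system clock offset is at most <limit> seconds.
# Usage: chronyc tracking | check_time_offset 0.1
check_time_offset() {
  limit="$1"
  awk -v limit="$limit" '
    # Example line: "System time     : 0.000012345 seconds slow of NTP time"
    # The offset is always printed as a positive number in field 4.
    /^System time/ { found = 1; offset = $4 + 0 }
    END { exit !(found && offset <= limit) }
  '
}
```

The check fails when the "System time" line is absent, which also covers the case where chronyd is not running.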
Failover Automation
1. DNS Update Script (/usr/local/bin/update_dns.sh)
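#!/bin/bash
# Point the service DNS record at the node that currently runs HAProxy.
CLUSTER_NAME="haproxy_cluster"
DNS_RECORD="haproxy.example.com"
DNS_SERVER="ns1.example.com"
DNS_KEY="/etc/bind/cluster.key"
TTL=300

# List your cluster node IPs here for reference (optional, not used in script logic)
NODE_IPS=("192.168.1.101" "192.168.1.102" "192.168.1.103" "192.168.1.104")

# Detect the node that runs the haproxy resource. The Current DC is only the
# cluster coordinator and is not necessarily the node serving traffic.
# No sudo here: the script runs non-interactively as hacluster.
ACTIVE_NODE=$(crm_resource --resource haproxy --locate 2>/dev/null \
    | awk -F': ' '/is running on/ { print $2; exit }')
ACTIVE_NODE_IP=$(getent hosts "$ACTIVE_NODE" | awk '{ print $1; exit }')

# Do not touch DNS if the active node could not be resolved
if [ -z "$ACTIVE_NODE_IP" ]; then
    logger -t haproxy-cluster "Could not determine the active node; DNS not updated"
    exit 1
fi

logger -t haproxy-cluster "Failover detected. Active node: $ACTIVE_NODE ($ACTIVE_NODE_IP)"

# Replace the A record in a single dynamic update transaction
nsupdate -k "$DNS_KEY" <<EOF
server $DNS_SERVER
update delete $DNS_RECORD A
update add $DNS_RECORD $TTL A $ACTIVE_NODE_IP
send
EOF

# Print the record as the DNS server now reports it
dig +short "$DNS_RECORD" "@$DNS_SERVER"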
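Before letting the script modify live DNS, it can help to look at the exact batch that nsupdate would receive. The sketch below builds the same four commands from arguments and prints them instead of sending them; the function name build_nsupdate_batch is hypothetical.

```shell
# Print the nsupdate commands that replace an A record.
# Usage: build_nsupdate_batch <dns-server> <record> <ttl> <ip>
build_nsupdate_batch() {
  server="$1"
  record="$2"
  ttl="$3"
  ip="$4"
  # Refuse to build an update that would add an empty address
  [ -n "$ip" ] || return 1
  printf 'server %s\n' "$server"
  printf 'update delete %s A\n' "$record"
  printf 'update add %s %s A %s\n' "$record" "$ttl" "$ip"
  printf 'send\n'
}
```

Once the output looks right, it can be piped to nsupdate -k /etc/bind/cluster.key to apply it.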
2. Make Script Executable
sudo chmod 750 /usr/local/bin/update_dns.sh
sudo chown hacluster:haclient /usr/local/bin/update_dns.sh
3. Configure Pacemaker Resource
# crm_mon runs the program given with -E (--external-agent); -e only sets a recipient for it
sudo pcs resource create dns_update ocf:pacemaker:ClusterMon \
    user=hacluster \
    extra_options="-E /usr/local/bin/update_dns.sh" \
    op monitor interval=15s timeout=30s \
    meta target-role=Started
sudo pcs constraint colocation add dns_update with haproxy_group INFINITY
sudo pcs constraint order haproxy_group then dns_update
Testing Procedures
1. Manual Failover Test
sudo pcs resource move haproxy_group node2.cluster.local
sudo pcs status | grep -A5 "haproxy_group"
sudo pcs resource clear haproxy_group
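Instead of reading the grep output by eye, the node that hosts a resource can be extracted with crm_resource. This sketch assumes the usual "resource NAME is running on: NODE" output line; the function name resource_node is hypothetical.

```shell
# Print the name of the node that currently runs the given resource.
# Prints nothing if the resource is not running anywhere.
# Usage: resource_node haproxy
resource_node() {
  crm_resource --resource "$1" --locate 2>/dev/null \
    | sed -n 's/^resource .* is running on: \([^ ]*\).*/\1/p' \
    | head -n 1
}
```

After the move, [ "$(resource_node haproxy)" = "node2.cluster.local" ] succeeds once the resource has started on node2.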
2. Simulate Node Failure
# On the node that should fail:
sudo systemctl stop corosync
# On one of the surviving nodes, watch the resources move:
sudo pcs status
watch -n 1 sudo pcs status
3. Verify DNS Updates
dig +short haproxy.example.com
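Because the update is applied shortly after the failover, a single dig can run too early. The sketch below polls the DNS server until the record matches the expected address; the one-second interval, the default of 10 attempts, and the function name wait_for_dns are illustrative assumptions.

```shell
# Poll a DNS server until a record resolves to the expected IP address.
# Usage: wait_for_dns <record> <expected-ip> <dns-server> [<attempts>]
wait_for_dns() {
  record="$1"
  expected="$2"
  server="$3"
  tries="${4:-10}"
  while [ "$tries" -gt 0 ]; do
    # Only the first answer is compared
    current=$(dig +short "$record" "@$server" | head -n 1)
    if [ "$current" = "$expected" ]; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}
```

For example, wait_for_dns haproxy.example.com 192.168.1.102 ns1.example.com 30 returns success as soon as the record points at node2 and fails after roughly 30 seconds otherwise.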
Maintenance Operations
1. Node Maintenance
sudo pcs node standby node3.cluster.local
sudo pcs node unstandby node3.cluster.local
2. Cluster Maintenance
# pcs 0.10+ form; pcs 0.9 used "pcs cluster standby --all"
sudo pcs node standby --all
sudo pcs node unstandby --all
3. Adding/Removing Nodes
sudo pcs cluster node add node5.cluster.local
sudo pcs cluster node remove node4.cluster.local
Troubleshooting
Common Issues
Corosync fails to start
sudo corosync-cfgtool -s
sudo corosync-cmapctl | grep members
Split-brain scenario
sudo pcs property set stonith-enabled=true
sudo pcs stonith create fence_ipmi fence_ipmilan ...
Resource failures
sudo pcs resource debug-start haproxy
sudo pcs resource failcount show
sudo pcs resource cleanup haproxy
Log Locations
- Corosync: /var/log/corosync/corosync.log
- Pacemaker: journalctl -u pacemaker
- HAProxy: /var/log/haproxy.log
Appendix
Sample DNS Key File (/etc/bind/cluster.key)
key "cluster-key" {
    algorithm hmac-sha256;
    secret "Base64EncodedKeyHere==";
};
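A key file in this format can be produced with a short helper. The sketch below assumes that 32 random bytes from /dev/urandom, Base64-encoded, are an acceptable HMAC-SHA256 secret; the function name gen_tsig_key is hypothetical. BIND also ships a tsig-keygen utility that produces an equivalent file.

```shell
# Print a TSIG key definition with a freshly generated random secret.
# Usage: gen_tsig_key <key-name>
gen_tsig_key() {
  name="$1"
  # 32 random bytes encode to a single 44-character Base64 string
  secret=$(head -c 32 /dev/urandom | base64)
  printf 'key "%s" {\n' "$name"
  printf '    algorithm hmac-sha256;\n'
  printf '    secret "%s";\n' "$secret"
  printf '};\n'
}
```

The same key definition must be present on the DNS server and referenced in the zone's update policy; keep the file readable only by the accounts that run nsupdate.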
Useful Commands
sudo pcs config show
sudo pcs stonith show
sudo pcs resource operations haproxy