High Availability Cluster with Pacemaker, Chronyd
Jump to navigation
Jump to search
Contents
Introduction
This guide documents the setup of a 4-node high availability cluster using:
- Pacemaker/Corosync for cluster management
- HAProxy for load balancing
- Chronyd for time synchronization
- Automated DNS updates during failover scenarios
Prerequisites
Hardware Requirements
- 4 identical Debian servers (≥ 2GB RAM, ≥ 2 CPU cores recommended)
- Network interfaces:
- Primary: 1Gbps (for client traffic)
- Secondary: 100Mbps minimum (for heartbeat)
Software Requirements
- Debian 10/11 (tested on both)
- Root access to all nodes
- DNS server supporting dynamic updates
Network Requirements
- Static IP assignments for all nodes
- Fully qualified domain names for each node
- Unrestricted communication on ports:
- 2224 (pcsd)
- 5404-5405 (corosync)
- 80/443 (HAProxy)
Initial Node Setup
1. System Preparation
On all nodes (node1-node4):
sudo apt update && sudo apt upgrade -y
sudo apt install -y \
pacemaker \
pcs \
corosync \
crmsh \
haproxy \
chrony \
bind9utils \
mailutils \
vim
2. Hosts File Configuration
Add to /etc/hosts on every node:
192.168.1.101 node1.cluster.local node1 192.168.1.102 node2.cluster.local node2 192.168.1.103 node3.cluster.local node3 192.168.1.104 node4.cluster.local node4
3. Firewall Configuration
sudo apt install -y ufw sudo ufw allow from 192.168.1.0/24 sudo ufw enable
Cluster Configuration
1. Initialize Cluster Services
sudo systemctl enable --now pcsd corosync pacemaker
2. Set hacluster Password
echo "hacluster:SecureP@ssw0rd123" | sudo chpasswd
3. Authenticate Nodes (from node1)
sudo pcs cluster auth \
node1.cluster.local \
node2.cluster.local \
node3.cluster.local \
node4.cluster.local \
-u hacluster \
-p SecureP@ssw0rd123 \
--force
4. Create Cluster
sudo pcs cluster setup \
--name haproxy_cluster \
node1.cluster.local \
node2.cluster.local \
node3.cluster.local \
node4.cluster.local \
--force
5. Start Cluster
sudo pcs cluster start --all sudo pcs cluster enable --all
6. Verify Status
sudo pcs status
Expected output:
Cluster name: haproxy_cluster Stack: corosync Current DC: node1.cluster.local (version x.x.x-x) 4 nodes configured 0 resources configured
HAProxy Setup
1. Install HAProxy on All Nodes
sudo apt install -y haproxy
2. Configuration Template (/etc/haproxy/haproxy.cfg)
global
log /dev/log local0
log /dev/log local1 notice
chroot /var/lib/haproxy
stats socket /run/haproxy/admin.sock mode 660 level admin
stats timeout 30s
user haproxy
group haproxy
daemon
maxconn 4000
tune.ssl.default-dh-param 2048
defaults
log global
mode http
option httplog
option dontlognull
timeout connect 5s
timeout client 50s
timeout server 50s
option forwardfor
option http-server-close
frontend http_front
bind *:80
bind *:443 ssl crt /etc/ssl/private/example.com.pem
http-request set-header X-Forwarded-Port %[dst_port]
http-request add-header X-Forwarded-Proto https if { ssl_fc }
default_backend http_back
backend http_back
balance roundrobin
cookie SERVERID insert indirect nocache
server webserver1 192.168.1.201:80 check cookie s1
server webserver2 192.168.1.202:80 check cookie s2
listen stats
bind *:1936
stats enable
stats uri /
stats hide-version
stats auth admin:SecureStatsP@ss
3. Add as Cluster Resource
sudo pcs resource create haproxy systemd:haproxy \
op monitor interval=10s timeout=20s \
op start interval=0s timeout=30s \
op stop interval=0s timeout=30s \
--group haproxy_group
Time Synchronization
1. Configure Chronyd (/etc/chrony/chrony.conf)
pool 0.debian.pool.ntp.org iburst pool 1.debian.pool.ntp.org iburst pool 2.debian.pool.ntp.org iburst pool 3.debian.pool.ntp.org iburst allow 192.168.1.0/24 leapsectz right/UTC makestep 1.0 3
2. Verify Time Sync
sudo systemctl restart chronyd sudo chronyc tracking
Failover Automation
1. DNS Update Script (/usr/local/bin/update_dns.sh)
#!/bin/bash
CLUSTER_NAME="haproxy_cluster"
DNS_RECORD="haproxy.example.com"
DNS_SERVER="ns1.example.com"
DNS_KEY="/etc/bind/cluster.key"
TTL=300
# List your cluster node IPs here for reference (optional, not used in script logic)
NODE_IPS=("192.168.1.101" "192.168.1.102" "192.168.1.103" "192.168.1.104")
# Detect the IP of the current active node (Current DC)
ACTIVE_NODE=$(sudo pcs status | grep "Current DC:" | awk '{print $4}')
ACTIVE_NODE_IP=$(getent hosts $ACTIVE_NODE | awk '{ print $1 }')
logger -t haproxy-cluster "Failover detected. Active node: $ACTIVE_NODE ($ACTIVE_NODE_IP)"
nsupdate -k $DNS_KEY <<EOF
server $DNS_SERVER
update delete $DNS_RECORD A
update add $DNS_RECORD $TTL A $ACTIVE_NODE_IP
send
EOF
dig +short $DNS_RECORD @$DNS_SERVER
2. Make Script Executable
sudo chmod 750 /usr/local/bin/update_dns.sh sudo chown hacluster:haclient /usr/local/bin/update_dns.sh
3. Configure Pacemaker Resource
sudo pcs resource create dns_update ocf:pacemaker:ClusterMon \
user=hacluster \
extra_options="-e /usr/local/bin/update_dns.sh" \
op monitor interval=15s timeout=30s \
meta target-role=Started
sudo pcs constraint colocation add dns_update with haproxy_group INFINITY
sudo pcs constraint order haproxy_group then dns_update
Testing Procedures
1. Manual Failover Test
sudo pcs resource move haproxy_group node2.cluster.local sudo pcs status | grep -A5 "haproxy_group" sudo pcs resource clear haproxy_group
2. Simulate Node Failure
sudo systemctl stop corosync sudo pcs status watch -n 1 sudo pcs status
3. Verify DNS Updates
dig +short haproxy.example.com
Maintenance Operations
1. Node Maintenance
sudo pcs node standby node3.cluster.local sudo pcs node unstandby node3.cluster.local
2. Cluster Maintenance
sudo pcs cluster standby --all sudo pcs cluster unstandby --all
3. Adding/Removing Nodes
sudo pcs cluster node add node5.cluster.local sudo pcs cluster node remove node4.cluster.local
Troubleshooting
Common Issues
Corosync fails to start
sudo corosync-cfgtool -s sudo corosync-cmapctl | grep members
Split-brain scenario
sudo pcs property set stonith-enabled=true sudo pcs stonith create fence_ipmi fence_ipmilan ...
Resource failures
sudo pcs resource debug-start haproxy sudo pcs resource failcount show sudo pcs resource cleanup haproxy
Log Locations
- Corosync: /var/log/corosync/corosync.log
- Pacemaker: journalctl -u pacemaker
- HAProxy: /var/log/haproxy.log
Appendix
Sample DNS Key File (/etc/bind/cluster.key)
key "cluster-key" {
algorithm hmac-sha256;
secret "Base64EncodedKeyHere==";
};
Useful Commands
sudo pcs config show sudo pcs stonith show sudo pcs resource operations haproxy
References & Further Reading
- Pacemaker Documentation
- HAProxy Configuration Manual
- Debian Cluster Suite
- Corosync Documentation
- Chrony Documentation
This comprehensive guide includes: step-by-step instructions, configuration examples, automated failover procedures, maintenance and troubleshooting, and a reference appendix.