High availability redundancy failover: Difference between revisions

Latest revision as of 21:39, 30 June 2025

This is an expert-level guide for creating a high-availability (HA) VoIPmonitor cluster using an Active-Passive failover model. This architecture ensures that if your primary sensor server fails, a secondary server will automatically and seamlessly take over its responsibilities.

Overview: How Active-Passive Failover Works

This setup consists of two identical VoIPmonitor servers that share a single "virtual" IP address.

Two Identical Nodes: Both servers (Node 1 and Node 2) receive the exact same mirrored network traffic from your switch (or other source). Both run the VoIPmonitor sensor.
One Active Node: At any given time, only one server is the Active (or Master) node. This node holds the Virtual IP (VIP) and is the only one actively writing Call Detail Records (CDRs) to the database.
One Passive Node: The other server is Passive (or Backup). It is sniffing traffic and writing PCAP files to its local disk, but its CDR writing is disabled to prevent database conflicts.
Keepalived & The Heartbeat: A small, efficient service called `keepalived` runs on both nodes. The Active node constantly sends "I'm alive" heartbeat messages to the Passive node.
Automatic Failover: If the Passive node stops receiving heartbeats, it assumes the Active node has failed. It immediately takes over the Virtual IP and runs a script to enable CDR writing, thus seamlessly becoming the new Active node.

This architecture requires a Master-Master MySQL/MariaDB replication setup, ensuring both nodes can write to the database without causing conflicts when a failover occurs.

Advantages vs. Disadvantages

Pros: Simpler to set up and manage than a full database cluster like Galera. It has a very clear and predictable failover path.
Cons: Only one node is "active" at a time (no load balancing). Failover time, while fast (a few seconds), is not instantaneous.
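
The "few seconds" of failover time can be estimated from the VRRP timers. The following is a minimal sketch, assuming keepalived runs its default VRRP version 2 and uses the RFC 3768 timer formula. The function name `master_down_interval` is illustrative only, and the inputs are the `advert_int` and backup `priority` values used in the configuration in Step 3.

```shell
#!/bin/sh
# Estimate how long a BACKUP node waits before declaring the MASTER dead.
# VRRPv2 (RFC 3768):
#   Master_Down_Interval = 3 * Advertisement_Interval + Skew_Time
#   Skew_Time            = (256 - Priority) / 256
# $1 = advert_int in seconds, $2 = priority of the BACKUP node
master_down_interval() {
    awk -v adv="$1" -v prio="$2" \
        'BEGIN { printf "%.3f\n", 3 * adv + (256 - prio) / 256 }'
}

# Values from this guide: advert_int 1, backup priority 100
master_down_interval 1 100
```

This covers only the detection delay. Moving the VIP and running the notify script add a little time on top of it.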

Architectural Diagram

Node 1 (Primary): IP: 10.0.0.1; State: Initially ACTIVE

Node 2 (Secondary): IP: 10.0.0.2; State: Initially PASSIVE

Shared Virtual IP: VIP: 10.0.0.128 (This is the IP your GUI and other services will use)

Network: Both nodes are connected to the same mirrored traffic source. They are also connected to each other, preferably via a direct, dedicated crossover cable for the heartbeat signal to ensure reliability.

Step 1: System Preparation (Both Nodes)

Before configuring `keepalived`, prepare both servers.

1. Install Keepalived

# For Debian/Ubuntu
sudo apt-get update && sudo apt-get install keepalived

# For CentOS/RHEL/AlmaLinux
sudo yum install keepalived

2. Allow Binding to a Non-Local IP

This kernel parameter lets services bind to the Virtual IP even while it is assigned to the other node, so they are ready to serve as soon as `keepalived` moves the VIP. Edit `/etc/sysctl.conf` and add this line:

net.ipv4.ip_nonlocal_bind=1

Apply the change immediately without rebooting:

sudo sysctl -p
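
To confirm the setting took effect, the value can be read back from `/proc`. This is a small sketch; the helper name `nonlocal_bind_status` and its optional path argument are assumptions added for illustration, not part of keepalived or VoIPmonitor.

```shell
#!/bin/sh
# Print "enabled" if non-local binding is switched on, otherwise "disabled".
# $1 = path to the kernel setting (optional; defaults to the real /proc entry)
nonlocal_bind_status() {
    setting_file=${1:-/proc/sys/net/ipv4/ip_nonlocal_bind}
    if [ -r "$setting_file" ] && [ "$(cat "$setting_file")" = "1" ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}

# Run on each node after `sudo sysctl -p`
nonlocal_bind_status
```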

Step 2: Create the Failover Script (Both Nodes)

`keepalived` will execute this script whenever a node's state changes (e.g., from Backup to Master). This script uses VoIPmonitor's manager API to enable or disable CDR writing.

Create the script file

sudo nano /etc/keepalived/voipmonitor_failover.sh

Copy and paste the following content

#!/bin/bash
#
# This script is managed by keepalived to control the VoIPmonitor CDR writing state.
#

TYPE=$1  # "GROUP" or "INSTANCE"
NAME=$2  # The name of the group or instance
STATE=$3 # "MASTER", "BACKUP", or "FAULT"

LOG_FILE="/var/log/keepalived-voipmonitor.log"
MANAGER_PORT=5029 # Ensure this matches your voipmonitor.conf

log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}

case $STATE in
    "MASTER")
        log "State changed to MASTER. Enabling CDR writing."
        echo "enablecdr" | nc localhost "$MANAGER_PORT"
        exit 0
        ;;
    "BACKUP")
        log "State changed to BACKUP. Disabling CDR writing."
        echo "disablecdr" | nc localhost "$MANAGER_PORT"
        exit 0
        ;;
    "FAULT")
        log "State changed to FAULT. Disabling CDR writing as a precaution."
        echo "disablecdr" | nc localhost "$MANAGER_PORT"
        exit 0
        ;;
    *)
        log "Unknown state '$STATE' received. Exiting."
        exit 1
        ;;
esac

Make the script executable

sudo chmod +x /etc/keepalived/voipmonitor_failover.sh
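
Before wiring the script into `keepalived`, it can help to check the state-to-command mapping on its own. The sketch below mirrors the `case` statement of the failover script in a standalone function. The name `manager_command_for_state` is hypothetical, and nothing is sent to the manager port.

```shell
#!/bin/sh
# Map a keepalived state to the VoIPmonitor manager command that the
# failover script sends. Prints nothing and returns 1 for an unknown state.
manager_command_for_state() {
    case "$1" in
        MASTER)       echo "enablecdr" ;;
        BACKUP|FAULT) echo "disablecdr" ;;
        *)            return 1 ;;
    esac
}

# Dry run: show what would be sent for each state keepalived can report
for state in MASTER BACKUP FAULT; do
    printf '%s -> %s\n' "$state" "$(manager_command_for_state "$state")"
done
```

The real script can also be run by hand with the same three arguments that `keepalived` passes, for example `sudo /etc/keepalived/voipmonitor_failover.sh INSTANCE VI_1 BACKUP`, and the result checked in `/var/log/keepalived-voipmonitor.log`.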

Step 3: Configure Keepalived

The configuration is nearly identical for both nodes; only the `state` and `priority` values differ.

Configuration for Node 1 (Primary/Master)

Edit `/etc/keepalived/keepalived.conf`

# /etc/keepalived/keepalived.conf on NODE 1

global_defs {
   router_id voipmonitor_ha
}

vrrp_script check_voipmonitor {
    script "/usr/bin/pgrep voipmonitor" # Check if the voipmonitor process is running
    interval 2                         # Check every 2 seconds
    weight -60                         # While this check fails, reduce priority by 60 (150 - 60 = 90, below Node 2's 100)
}

vrrp_instance VI_1 {
    state MASTER            # Start as the MASTER node
    interface eth0          # Network interface for heartbeats
    virtual_router_id 51    # Must be the same on both nodes
    priority 150            # Higher priority becomes master
    advert_int 1            # Send heartbeat every 1 second
    
    authentication {
        auth_type PASS
        auth_pass your_secret_password   # Must match on both nodes; keepalived uses only the first 8 characters
    }
    
    virtual_ipaddress {
        10.0.0.128/24 dev eth0
    }

    track_script {
        check_voipmonitor
    }

    notify /etc/keepalived/voipmonitor_failover.sh
}
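
How a `track_script` weight changes the election is easy to get wrong, so here is a rough sketch of the arithmetic. It assumes keepalived's documented rule: a positive weight is added to the priority while the check succeeds, and a negative weight is subtracted while it fails. The function name `effective_priority` is illustrative, and the sketch ignores the special case of weight 0.

```shell
#!/bin/sh
# Compute the priority a node advertises for a given track_script result.
# $1 = base priority, $2 = track_script weight, $3 = "ok" or "fail"
effective_priority() {
    base=$1
    weight=$2
    result=$3
    if [ "$weight" -gt 0 ] && [ "$result" = "ok" ]; then
        echo $((base + weight))   # positive weight: bonus while the check passes
    elif [ "$weight" -lt 0 ] && [ "$result" = "fail" ]; then
        echo $((base + weight))   # negative weight: penalty while the check fails
    else
        echo "$base"              # otherwise the base priority is unchanged
    fi
}

# Node 1 (priority 150) with a failing check, for two candidate weights
echo "weight  20, check fails: $(effective_priority 150 20 fail)"
echo "weight -60, check fails: $(effective_priority 150 -60 fail)"
```

With Node 2 at priority 100, Node 1 gives up the MASTER role on a failed check only if its effective priority drops below Node 2's. A negative weight larger than the gap between the two priorities achieves that; a positive weight does not.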

Configuration for Node 2 (Secondary/Backup)

Edit `/etc/keepalived/keepalived.conf` on Node 2. It is identical to Node 1's file except for the `state` and `priority` lines

# /etc/keepalived/keepalived.conf on NODE 2

vrrp_instance VI_1 {
    state BACKUP            # Start as the BACKUP node
    interface eth0
    virtual_router_id 51
    priority 100            # LOWER priority than the master
    # ... rest of the file is identical ...
}

Step 4: Configure VoIPmonitor

Finally, on both nodes, edit `/etc/voipmonitor.conf` to disable CDR writing by default. `keepalived` will be responsible for enabling it on the active node.

# /etc/voipmonitor.conf on BOTH nodes

# Start with CDR writing disabled. The failover script will enable it on the active node.
nocdr = yes
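
A quick way to confirm that both nodes start with CDR writing disabled is to read the option back from the config file. This is a minimal sketch; the helper name `nocdr_value` is hypothetical. It assumes the `key = value` layout shown above and that the last uncommented `nocdr` line is the one that counts.

```shell
#!/bin/sh
# Print the value of the last uncommented "nocdr" line in a
# voipmonitor.conf-style file. Prints nothing if the option is absent.
# $1 = path to the config file
nocdr_value() {
    sed -n 's/^[[:space:]]*nocdr[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*$/\1/p' "$1" \
        | tail -n 1
}

# Usage on each node:
#   nocdr_value /etc/voipmonitor.conf    # expect "yes"
```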

Step 5: Start and Test

Start and enable the `keepalived` service on both nodes

sudo systemctl start keepalived
sudo systemctl enable keepalived

You can test the failover by rebooting the primary node (`reboot` on Node 1) or by stopping its `keepalived` service. You should see the Virtual IP `10.0.0.128` automatically appear on Node 2, and the log file `/var/log/keepalived-voipmonitor.log` should show that it has transitioned to the MASTER state and enabled CDRs.
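
During a failover test it is handy to ask each node whether it currently holds the VIP. The sketch below parses the one-line output of `ip -o -4 addr show`. The helper name `vip_role` is an assumption for illustration, and `10.0.0.128` is the VIP used throughout this guide.

```shell
#!/bin/sh
# Read `ip -o -4 addr show` output on stdin and print "active" if the given
# VIP is assigned to any interface on this node, otherwise "passive".
# $1 = virtual IP address without the prefix length
vip_role() {
    awk -v vip="$1" '
        { split($4, addr, "/"); if (addr[1] == vip) found = 1 }
        END { print (found ? "active" : "passive") }
    '
}

# Usage on each node:
#   ip -o -4 addr show | vip_role 10.0.0.128
```

Running it on both nodes before and after stopping `keepalived` on Node 1 should show the "active" result move from Node 1 to Node 2.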

AI Summary for RAG

Summary: This guide provides a detailed tutorial for setting up a high-availability (HA) Active-Passive VoIPmonitor cluster using `keepalived`. It explains the architecture, where two identical sensors receive the same mirrored traffic, but only one "Active" node holds a shared Virtual IP (VIP) and writes CDRs to a master-master replicated database. The guide replaces the outdated `heartbeat` software with the modern `keepalived` service. The process is broken down into clear steps: 1) Preparing the system by installing `keepalived` and setting the `net.ipv4.ip_nonlocal_bind` sysctl parameter. 2) Creating a `voipmonitor_failover.sh` script that `keepalived` uses to enable or disable CDR writing via the manager API. 3) Providing complete `keepalived.conf` examples for both the primary (MASTER) and secondary (BACKUP) nodes, highlighting the use of different priorities. 4) Configuring VoIPmonitor itself with `nocdr=yes` to ensure CDR writing is disabled by default and only activated by the failover script. Finally, it explains how to start the services and test the failover mechanism.

Keywords: high availability, ha, failover, active-passive, keepalived, heartbeat, cluster, redundancy, virtual ip, vip, floating ip, master-master replication, `vrrp_instance`, `ip_nonlocal_bind`, `nocdr`, manager api

Key Questions:

How can I set up a high-availability failover for VoIPmonitor?
What is an Active-Passive cluster and how does it work?
How to configure `keepalived` for VoIPmonitor?
What is a Virtual IP (VIP) and how does it provide failover?
What is the modern alternative to the `heartbeat` service on Linux?
How do I test my `keepalived` failover setup?

