Scaling

Latest revision as of 21:52, 20 January 2026

This guide covers performance tuning for high-traffic VoIPmonitor deployments, addressing the three primary system bottlenecks.

Understanding Performance Bottlenecks

A VoIPmonitor deployment's capacity is limited by three potential bottlenecks:

Bottleneck	Description	Monitor
1. Packet Capture	Single CPU core reading packets from NIC	`t0CPU` in syslog
2. Disk I/O	Writing PCAP files to storage	`iostat`, `ioping`
3. Database	CDR ingestion and GUI queries	`SQLq` in syslog

Capacity: A modern server (24-core Xeon, 10Gbit NIC) can handle ~10,000 concurrent calls with full RTP recording, or 60,000+ with SIP-only analysis.

Optimizing Packet Capture (CPU & Network)

The packet capture thread (t0) runs on a single CPU core. If t0CPU approaches 100%, you've hit the capture limit.

With a modern kernel and a current VoIPmonitor build, a standard Intel 10Gbit NIC handles up to 3 Gbit/s of VoIP traffic without special drivers, and almost the full 10Gbit rate with DPDK.

Threading (Automatic)

Since version 2023.11, VoIPmonitor uses threading_expanded=yes by default, which automatically spawns threads based on CPU load. No manual threading configuration is needed.

For very high traffic (≥1500 Mbit/s), set:

threading_expanded = high_traffic

See Threading Model for details.
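
To decide which side of the 1500 Mbit/s threshold a deployment falls on, it helps to translate concurrent calls into bandwidth. The following is a rough sketch, not a VoIPmonitor tool: it assumes G.711 with 20 ms packetization and plain Ethernet/IPv4 framing, and the function names are hypothetical. Other codecs, SRTP, or VLAN tags will change the numbers.

```python
# Assumed codec and framing figures (G.711, 20 ms packets, no VLAN tag).
RTP_PAYLOAD_BYTES = 160            # G.711 payload per 20 ms packet
HEADER_BYTES = 12 + 8 + 20 + 14    # RTP + UDP + IPv4 + Ethernet headers
PACKETS_PER_SECOND = 50            # one packet every 20 ms
STREAMS_PER_CALL = 2               # one RTP stream per direction


def estimate_mbit_per_s(concurrent_calls: int) -> float:
    """Estimate RTP traffic in Mbit/s for a number of concurrent calls."""
    bytes_per_packet = RTP_PAYLOAD_BYTES + HEADER_BYTES
    bits_per_call = bytes_per_packet * 8 * PACKETS_PER_SECOND * STREAMS_PER_CALL
    return concurrent_calls * bits_per_call / 1_000_000


def suggest_threading_expanded(mbit_per_s: float) -> str:
    """Map a traffic rate to the threading_expanded value named in this guide."""
    return "high_traffic" if mbit_per_s >= 1500 else "yes"


if __name__ == "__main__":
    for calls in (1_000, 5_000, 10_000):
        rate = estimate_mbit_per_s(calls)
        print(f"{calls} calls ~ {rate:.0f} Mbit/s -> "
              f"threading_expanded = {suggest_threading_expanded(rate)}")
```

Under these assumptions one call is about 0.17 Mbit/s, so roughly 8,800 concurrent calls reach the 1500 Mbit/s mark. Measured traffic on the capture interface is always a better input than this estimate.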

NIC Tuning (>500 Mbit/s)

# Increase ring buffer (prevents packet loss during CPU spikes)
ethtool -g eth0                  # Check max size
ethtool -G eth0 rx 16384         # Set to the maximum reported above (16384 is an example)

# Enable interrupt coalescing (reduces CPU overhead)
ethtool -C eth0 rx-usecs 1022
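
The "check max size, then set to max" step can be scripted. The sketch below parses the "Pre-set maximums" section of `ethtool -g` output and builds the matching `ethtool -G` command. The sample output is illustrative (the maximum differs per NIC and driver), and the helper names are hypothetical.

```python
import re


def max_rx_ring(ethtool_g_output: str) -> int:
    """Return the pre-set maximum RX ring size from `ethtool -g` output."""
    # Keep only the "Pre-set maximums" section, which ends where the
    # "Current hardware settings" section begins.
    preset = ethtool_g_output.split("Current hardware settings")[0]
    match = re.search(r"^RX:\s+(\d+)", preset, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no RX maximum found in ethtool output")
    return int(match.group(1))


def ring_buffer_command(iface: str, ethtool_g_output: str) -> str:
    """Build the ethtool command that sets the RX ring to its maximum."""
    return f"ethtool -G {iface} rx {max_rx_ring(ethtool_g_output)}"


# Illustrative output only; real values depend on the NIC.
SAMPLE = (
    "Ring parameters for eth0:\n"
    "Pre-set maximums:\n"
    "RX:\t\t4096\n"
    "RX Mini:\t0\n"
    "RX Jumbo:\t0\n"
    "TX:\t\t4096\n"
    "Current hardware settings:\n"
    "RX:\t\t512\n"
    "RX Mini:\t0\n"
    "RX Jumbo:\t0\n"
    "TX:\t\t512\n"
)

if __name__ == "__main__":
    print(ring_buffer_command("eth0", SAMPLE))
```

On a live system the input would come from something like `subprocess.run(["ethtool", "-g", "eth0"], capture_output=True, text=True).stdout`.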

Persistent settings (Debian/Ubuntu /etc/network/interfaces):

auto eth0
iface eth0 inet manual
    up ip link set $IFACE up
    up ip link set $IFACE promisc on
    up ethtool -G $IFACE rx 16384
    up ethtool -C $IFACE rx-usecs 1022

Configuration Optimizations

Parameter	Purpose	Recommendation
`interface_ip_filter`	IP-based filtering	More efficient than BPF `filter`
`pcap_dump_writethreads_max`	Compression threads	Set to CPU core count
`jitterbuffer_f1/f2/adapt`	Jitter simulation	Keep `f2=yes`, disable f1 and adapt to save CPU while keeping MOS

# /etc/voipmonitor.conf

# Efficient IP filtering (replaces BPF filter)
interface_ip_filter = 192.168.0.0/24
interface_ip_filter = 10.0.0.0/8

# Compression scaling
pcap_dump_writethreads = 1
pcap_dump_writethreads_max = 32
pcap_dump_asyncwrite = yes

ℹ️ Note: The recommended combination is jitterbuffer_f1=no, jitterbuffer_f2=yes, jitterbuffer_adapt=no. This saves CPU while preserving MOS-F2 metrics. Disable f2 only if you do not need quality monitoring at all.

Kernel-Bypass Solutions

For extreme loads, bypass the kernel network stack entirely:

Solution	Type	CPU Reduction	Use Case
DPDK	Open-source	~70%	Multi-gigabit on commodity hardware
PF_RING ZC	Commercial	CPU usage 90% → 20%	High-volume enterprise
Napatech SmartNICs	Hardware	<3% at 10Gbit/s	Extreme performance

Optimizing Disk I/O

VoIPmonitor Storage Strategy

VoIPmonitor groups all calls that start within the same minute into a single compressed .tar archive. This turns thousands of random writes into a few sequential writes and cuts IOPS by a factor of 10 or more.

Typical capacity: a single 7200 RPM SATA drive handles ~2,000 concurrent calls with full recording.
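
For a sense of scale, the raw write rate behind that figure can be estimated. This is a back-of-the-envelope sketch under stated assumptions: G.711 with 20 ms packetization, Ethernet/IPv4 framing, the 16-byte per-packet record header of the classic pcap format, and no compression. The function names are hypothetical, and actual on-disk usage will be lower once compression is applied.

```python
# Assumed figures: G.711 payload plus RTP/UDP/IPv4/Ethernet headers.
FRAME_BYTES = 160 + 12 + 8 + 20 + 14
PCAP_RECORD_HEADER = 16            # per-packet header in the classic pcap format
PACKETS_PER_CALL_PER_S = 2 * 50    # two directions, one packet every 20 ms


def raw_write_rate_mb_s(concurrent_calls: int) -> float:
    """Uncompressed RTP write rate in MB/s (decimal megabytes)."""
    bytes_per_s = concurrent_calls * PACKETS_PER_CALL_PER_S * (FRAME_BYTES + PCAP_RECORD_HEADER)
    return bytes_per_s / 1_000_000


def raw_storage_per_day_gb(concurrent_calls: int) -> float:
    """Uncompressed storage per day in GB if the call count stays constant."""
    return raw_write_rate_mb_s(concurrent_calls) * 86_400 / 1_000


if __name__ == "__main__":
    calls = 2_000
    print(f"{calls} calls: {raw_write_rate_mb_s(calls):.1f} MB/s, "
          f"{raw_storage_per_day_gb(calls):.0f} GB/day before compression")
```

Under these assumptions 2,000 concurrent calls produce about 46 MB/s of sequential writes, which is why batching into per-minute archives matters more than raw throughput on a spinning disk.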

Filesystem Tuning (ext4)

# Format without journal (requires battery-backed RAID)
mke2fs -t ext4 -O ^has_journal /dev/sda2

# /etc/fstab
/dev/sda2  /var/spool/voipmonitor  ext4  errors=remount-ro,noatime,data=writeback,barrier=0  0 0

⚠️ Warning: Disabling the journal removes crash protection. Use this only with a battery-backed RAID controller (BBU).

RAID Controller

Set the cache policy to WriteBack (not WriteThrough). This requires a healthy BBU. Commands vary by vendor (megacli, ssacli, perccli).

Optimizing Database Performance

Memory Configuration

The most critical parameter is innodb_buffer_pool_size.

⚠️ Warning: Setting this value too high causes OOM killer events, CDR delays, and crashes. See OOM Troubleshooting.

Buffer Pool Sizing:

Server Type	Formula	Example (32GB RAM)
Shared (VoIPmonitor + MySQL)	(Total RAM - VoIPmonitor - OS) / 2	14GB
Dedicated MySQL server	50-70% of total RAM	16-22GB
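
The two formulas can be written out as a small calculator. This is a sketch with hypothetical function names. The 2 GB reservations for VoIPmonitor and for the OS are assumptions chosen to reproduce the 14GB shared-server example, so substitute measured values for your own host.

```python
from typing import Tuple


def shared_buffer_pool_gb(total_ram_gb: float, voipmonitor_gb: float, os_gb: float) -> float:
    """Shared host: (Total RAM - VoIPmonitor - OS) / 2."""
    return (total_ram_gb - voipmonitor_gb - os_gb) / 2


def dedicated_buffer_pool_range_gb(total_ram_gb: float) -> Tuple[float, float]:
    """Dedicated MySQL host: 50-70% of total RAM, returned as (low, high)."""
    return (0.5 * total_ram_gb, 0.7 * total_ram_gb)


if __name__ == "__main__":
    # Assumed reservations: 2 GB for VoIPmonitor, 2 GB for the OS.
    print(f"shared 32 GB host: {shared_buffer_pool_gb(32, 2, 2):.0f}G")
    low, high = dedicated_buffer_pool_range_gb(64)
    print(f"dedicated 64 GB host: {low:.0f}G to {high:.0f}G")
```

The result goes into innodb_buffer_pool_size. Starting at the lower end of the range and watching memory use leaves headroom against the OOM risk described above.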

RAM Recommendations:

Deployment Size	Minimum	Recommended
Small (<500 calls)	8GB	16GB
Medium (500-2000)	16GB	32GB
Large (>2000)	32GB	64GB+

Key MySQL Parameters

# /etc/mysql/my.cnf or mariadb.conf.d/50-server.cnf
[mysqld]
innodb_buffer_pool_size = 14G
innodb_flush_log_at_trx_commit = 2  # Faster; up to ~1 s of transactions can be lost on OS crash or power failure
innodb_file_per_table = 1           # Essential for partitioning
innodb_compression_algorithm = lz4  # MariaDB only

Slow Query Log

The slow query log can consume significant memory. Consider disabling it on high-traffic systems:

[mysqld]
slow_query_log = 0
# Or increase threshold: long_query_time = 600

Database Partitioning

VoIPmonitor automatically partitions large tables (like cdr) by day. This is enabled by default and highly recommended.

See Database Partitioning for details.

Troubleshooting: Connection Refused

Symptoms: GUI crashes, "Connection refused" errors, intermittent issues during peak volumes.

Cause: innodb_buffer_pool_size is too low (the 128M default is insufficient).

Solution: Increase it to 6G or more, depending on available RAM:

[mysqld]
innodb_buffer_pool_size = 6G

systemctl restart mariadb

Component Separation (Multi-Host Architecture)

For deployments exceeding 5,000-10,000 concurrent calls, separate VoIPmonitor components onto dedicated hosts.

Architecture Overview

Host	Component	Primary Resources	Scaling Strategy
Host 1	MySQL Database	RAM, fast SSD	Add RAM, read replicas
Host 2	Sensor(s)	CPU (t0 thread), network	DPDK/PF_RING, more sensors
Host 3	GUI	CPU, network	Load balancer, caching

Configuration

MySQL Server:

# /etc/mysql/my.cnf
[mysqld]
bind-address = 0.0.0.0
innodb_buffer_pool_size = 50G  # 50-70% RAM for dedicated server

CREATE USER 'voipmonitor'@'%' IDENTIFIED BY 'strong_password';
GRANT ALL PRIVILEGES ON voipmonitor.* TO 'voipmonitor'@'%';

Sensor:

# /etc/voipmonitor.conf
id_sensor = 1
mysqlhost = mysql.server.ip
mysqldb = voipmonitor
mysqlusername = voipmonitor
mysqlpassword = strong_password

GUI: Configure via Settings > System Configuration > Database, or edit config/system_configuration.php.

Firewall Rules:

Source	Destination	Port	Purpose
Sensor	MySQL	3306	CDR writes
GUI	MySQL	3306	Queries
GUI	Sensor(s)	5029	PCAP retrieval
Users	GUI	80, 443	Web access
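
Once the rules are in place, a quick TCP connect test from each source host confirms that the ports are reachable. The sketch below uses only the Python standard library. The hostnames are hypothetical placeholders, and a successful connect shows only that the port is reachable, not that authentication works.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def run_checks(checks):
    """Return {(host, port): reachable} for a list of (host, port) pairs."""
    return {(host, port): port_open(host, port) for host, port in checks}


# Hypothetical hostnames; run this list from the GUI host.
CHECKS_FROM_GUI = [
    ("mysql.example.internal", 3306),    # GUI -> MySQL (queries)
    ("sensor1.example.internal", 5029),  # GUI -> sensor (PCAP retrieval)
]

# Example: results = run_checks(CHECKS_FROM_GUI)
```

Run the equivalent check for port 3306 from each sensor host, since the table lists the sensor as a separate source.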

ℹ️ Note: Component separation can be combined with Client-Server mode for multi-site deployments.

Monitoring Performance

VoIPmonitor logs performance metrics every 10 seconds to syslog. Key metrics to watch:

Metric	Warning Sign	Bottleneck Type
`t0CPU`	>90%	CPU (packet capture limit)
`heap[A|B|C]`	A >50%	I/O or CPU (buffer filling)
`SQLq`	Growing	Database
`comp`	Maxed out	I/O (compression waiting for disk)

# Monitor in real-time
journalctl -u voipmonitor -f
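
The thresholds in the table can be checked automatically. The sketch below assumes the status line contains fields shaped like `t0CPU[93.5%]` and `heap[62|0|0]`. That field layout and the sample line are assumptions for illustration, so verify them against your own syslog output and the Syslog Status Line reference before relying on the patterns.

```python
import re

# Assumed field layouts; adjust if your status line differs.
T0CPU_RE = re.compile(r"t0CPU\[(\d+(?:\.\d+)?)%?\]")
HEAP_RE = re.compile(r"heap\[(\d+(?:\.\d+)?)\|(\d+(?:\.\d+)?)\|(\d+(?:\.\d+)?)\]")


def check_status_line(line: str) -> list:
    """Return warnings for one status line, using the thresholds from the table."""
    warnings = []
    t0 = T0CPU_RE.search(line)
    if t0 and float(t0.group(1)) > 90:
        warnings.append("t0CPU above 90%: packet capture limit")
    heap = HEAP_RE.search(line)
    if heap and float(heap.group(1)) > 50:
        warnings.append("heap A above 50%: buffer filling (I/O or CPU)")
    return warnings


# Hypothetical sample line, shortened for readability.
SAMPLE = "calls[315][355] SQLq[0] heap[62|0|0] comp[54.19] t0CPU[93.5%]"

if __name__ == "__main__":
    for warning in check_status_line(SAMPLE):
        print(warning)
```

SQLq is left out on purpose: the warning sign there is growth over time, which needs a comparison across several consecutive lines and not a fixed threshold.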

Main article: Syslog Status Line. Complete reference for all metrics, with detailed explanations and troubleshooting guidance.

For bottleneck diagnosis: See I/O vs CPU Bottleneck Diagnosis for a step-by-step diagnostic procedure using syslog metrics and Linux tools.
