Scaling

Latest revision as of 16:50, 8 January 2026

This guide covers performance tuning for high-traffic VoIPmonitor deployments, addressing the three primary system bottlenecks.

Understanding Performance Bottlenecks

A VoIPmonitor deployment's capacity is limited by three potential bottlenecks:

Bottleneck	Description	Monitor
1. Packet Capture	Single CPU core reading packets from NIC	`t0CPU` in syslog
2. Disk I/O	Writing PCAP files to storage	`iostat`, `ioping`
3. Database	CDR ingestion and GUI queries	`SQLq` in syslog

Capacity: A modern server (24-core Xeon, 10Gbit NIC) can handle ~10,000 concurrent calls with full RTP recording, or 60,000+ with SIP-only analysis.

Optimizing Packet Capture (CPU & Network)

The packet capture thread (t0) runs on a single CPU core. If t0CPU approaches 100%, you've hit the capture limit.

Prerequisites

Linux kernel 3.2+ with TPACKET_V3 support
Latest VoIPmonitor static binary

With a modern kernel and VoIPmonitor build, a standard Intel 10Gbit NIC handles up to 2 Gbit/s VoIP traffic without special drivers.

NIC Tuning (>500 Mbit/s)

# Increase ring buffer (prevents packet loss during CPU spikes)
ethtool -g eth0                  # Check max size
ethtool -G eth0 rx 16384         # Set to max

# Enable interrupt coalescing (reduces CPU overhead)
ethtool -C eth0 rx-usecs 1022
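
The "check max size, then set to max" steps above could be scripted roughly as follows. This is only a sketch: the helper name `max_rx_ring` is made up for illustration, and it assumes `ethtool -g` prints a "Pre-set maximums:" section containing an "RX:" line, which is the usual layout but may differ between drivers or ethtool versions.

```shell
# Hypothetical helper: read `ethtool -g` output on stdin and print the
# maximum supported RX ring size, i.e. the "RX:" value listed under
# "Pre-set maximums:" (not the one under "Current hardware settings:").
max_rx_ring() {
    awk '
        /^Pre-set maximums:/          { in_max = 1; next }
        /^Current hardware settings:/ { in_max = 0 }
        in_max && $1 == "RX:"         { print $2; exit }
    '
}

# Usage (needs root and a real interface, so it is left commented out):
#   max=$(ethtool -g eth0 | max_rx_ring)
#   [ -n "$max" ] && ethtool -G eth0 rx "$max"
```

Reading the maximum from the NIC avoids hard-coding 16384 on hardware that supports a smaller (or larger) ring.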

Persistent settings (Debian/Ubuntu /etc/network/interfaces):

auto eth0
iface eth0 inet manual
    up ip link set $IFACE up
    up ip link set $IFACE promisc on
    up ethtool -G $IFACE rx 16384
    up ethtool -C $IFACE rx-usecs 1022

Configuration Optimizations

Parameter	Purpose	Recommendation
`interface_ip_filter`	IP-based filtering	More efficient than BPF `filter`
`pcap_dump_writethreads_max`	Compression threads	Set to CPU core count
`jitterbuffer_f1/f2/adapt`	Jitter simulation	Disable to save CPU (loses MOS metrics)

# /etc/voipmonitor.conf

# Efficient IP filtering (replaces BPF filter)
interface_ip_filter = 192.168.0.0/24
interface_ip_filter = 10.0.0.0/8

# Compression scaling
pcap_dump_writethreads = 1
pcap_dump_writethreads_max = 32
pcap_dump_asyncwrite = yes

⚠️ Warning: Disabling jitterbuffer_* removes MOS/jitter metrics. Only disable if you don't need quality monitoring.

Kernel-Bypass Solutions

For extreme loads, bypass the kernel network stack entirely:

Solution	Type	CPU Reduction	Use Case
DPDK	Open-source	~70%	Multi-gigabit on commodity hardware
PF_RING ZC	Commercial	90% → 20%	High-volume enterprise
Napatech SmartNICs	Hardware	<3% at 10Gbit/s	Extreme performance

Optimizing Disk I/O

VoIPmonitor Storage Strategy

VoIPmonitor groups all calls starting within the same minute into a single compressed .tar archive. This turns thousands of random writes into a few sequential writes, cutting IOPS by a factor of 10 or more.

Typical capacity: 7200 RPM SATA handles ~2,000 concurrent calls with full recording.

Filesystem Tuning (ext4)

# Format without journal (requires battery-backed RAID)
mke2fs -t ext4 -O ^has_journal /dev/sda2

# /etc/fstab
/dev/sda2  /var/spool/voipmonitor  ext4  errors=remount-ro,noatime,data=writeback,barrier=0  0 0

⚠️ Warning: Disabling journal removes crash protection. Only use with battery-backed RAID controller (BBU).

RAID Controller

Set cache policy to WriteBack (not WriteThrough). Requires healthy BBU. Commands vary by vendor (megacli, ssacli, perccli).

Optimizing Database Performance

Memory Configuration

The most critical parameter is innodb_buffer_pool_size.

⚠️ Warning: Setting it too high can trigger the OOM killer, delay CDR writes, and crash the database. See OOM Troubleshooting.

Buffer Pool Sizing:

Server Type	Formula	Example (32GB RAM)
Shared (VoIPmonitor + MySQL)	(Total RAM - VoIPmonitor - OS) / 2	14GB
Dedicated MySQL server	50-70% of total RAM	16-22GB
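
As a worked example of the shared-server formula, the sketch below computes the value in whole gigabytes. The function name is hypothetical, and the 2 GB figures for VoIPmonitor and the OS are assumptions chosen to reproduce the 14GB example; measure the real footprint on your own host.

```shell
# Hypothetical helper: suggest innodb_buffer_pool_size (whole GB) for a host
# shared by VoIPmonitor and MySQL: (total RAM - VoIPmonitor - OS) / 2.
# Shell arithmetic is integer-only, so the result is rounded down.
shared_buffer_pool_gb() {
    total_gb=$1
    voipmonitor_gb=$2
    os_gb=$3
    echo $(( (total_gb - voipmonitor_gb - os_gb) / 2 ))
}

# 32 GB host, assuming 2 GB for VoIPmonitor and 2 GB for the OS
shared_buffer_pool_gb 32 2 2    # prints 14
```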

RAM Recommendations:

Deployment Size	Minimum	Recommended
Small (<500 calls)	8GB	16GB
Medium (500-2000)	16GB	32GB
Large (>2000)	32GB	64GB+

Key MySQL Parameters

# /etc/mysql/my.cnf or mariadb.conf.d/50-server.cnf
[mysqld]
innodb_buffer_pool_size = 14G
innodb_flush_log_at_trx_commit = 2  # Faster; up to ~1 s of commits can be lost on OS crash or power loss
innodb_file_per_table = 1           # Essential for partitioning
innodb_compression_algorithm = lz4  # MariaDB only

Slow Query Log

The slow query log can consume significant memory. Consider disabling it, or raising its threshold, on high-traffic systems:

[mysqld]
slow_query_log = 0
# Or increase threshold: long_query_time = 600

Database Partitioning

VoIPmonitor automatically partitions large tables (like cdr) by day. This is enabled by default and highly recommended.

See Database Partitioning for details.

Troubleshooting: Connection Refused

Symptoms: GUI crashes, "Connection refused" errors, intermittent issues during peak volumes.

Cause: innodb_buffer_pool_size too low (default 128M is insufficient).

Solution: Increase to 6G+ based on available RAM:

[mysqld]
innodb_buffer_pool_size = 6G

systemctl restart mariadb

Component Separation (Multi-Host Architecture)

For deployments exceeding 5,000-10,000 concurrent calls, separate VoIPmonitor components onto dedicated hosts.

Architecture Overview

Host	Component	Primary Resources	Scaling Strategy
Host 1	MySQL Database	RAM, fast SSD	Add RAM, read replicas
Host 2	Sensor(s)	CPU (t0 thread), network	DPDK/PF_RING, more sensors
Host 3	GUI	CPU, network	Load balancer, caching

Configuration

MySQL Server:

# /etc/mysql/my.cnf
[mysqld]
bind-address = 0.0.0.0
innodb_buffer_pool_size = 50G  # 50-70% RAM for dedicated server

CREATE USER 'voipmonitor'@'%' IDENTIFIED BY 'strong_password';
GRANT ALL PRIVILEGES ON voipmonitor.* TO 'voipmonitor'@'%';

Sensor:

# /etc/voipmonitor.conf
id_sensor = 1
mysqlhost = mysql.server.ip
mysqldb = voipmonitor
mysqlusername = voipmonitor
mysqlpassword = strong_password

GUI: Configure via Settings > System Configuration > Database, or edit config/system_configuration.php.

Firewall Rules:

Source	Destination	Port	Purpose
Sensor	MySQL	3306	CDR writes
GUI	MySQL	3306	Queries
GUI	Sensor(s)	5029	PCAP retrieval
Users	GUI	80, 443	Web access
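
The MySQL rows of this table could be turned into host firewall rules along these lines. This is a dry-run sketch under assumptions: the function name is invented, the addresses in the usage note are documentation-range placeholders, and it only prints `iptables` commands rather than applying them, so the output can be reviewed first.

```shell
# Hypothetical helper: print (not apply) iptables rules for the MySQL host.
# Each argument is a sensor or GUI address allowed to reach port 3306;
# the final rule drops every other connection to that port.
mysql_host_rules() {
    for src in "$@"; do
        echo "iptables -A INPUT -p tcp -s $src --dport 3306 -j ACCEPT"
    done
    echo "iptables -A INPUT -p tcp --dport 3306 -j DROP"
}
```

For example, `mysql_host_rules 192.0.2.10 192.0.2.20` prints two ACCEPT rules followed by the DROP rule. Port 5029 on the sensors and ports 80/443 on the GUI host would need equivalent rules on those machines.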

ℹ️ Note: Component separation can be combined with Client-Server mode for multi-site deployments.

Monitoring Performance

VoIPmonitor logs metrics every 10 seconds:

tail -f /var/log/syslog | grep voipmonitor

Sample output:

calls[315][355] PS[C:4 S:29 R:6354] SQLq[0] heap[0|0|0] t0CPU[5.2%] RSS/VSZ[323|752]MB

Metric	Description	Warning Sign
`calls[X][Y]`	Active / total calls in memory	-
`SQLq[N]`	SQL queries queued	Growing = DB bottleneck
`heap[A|B|C]`	Buffer usage %	A=100% = packet drops
`t0CPU[X%]`	Capture thread CPU	>90% = limit reached
`RSS/VSZ[X|Y]`	Memory usage (MB)	RSS growing = leak
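
To watch these values without reading raw log lines, the fields can be pulled out with a small filter. The sketch below is based only on the sample line shown above; the helper names `metric` and `status_summary` are hypothetical, and other VoIPmonitor versions may format the status line differently.

```shell
# Hypothetical helper: print the text inside NAME[...] from a line on stdin.
# "#" is the sed delimiter so that names such as RSS/VSZ also work.
metric() {
    sed -n "s#.*$1\[\([^]]*\)\].*#\1#p"
}

# Hypothetical helper: summarise one status line and flag a busy capture thread.
status_summary() {
    line=$1
    t0=$(printf '%s\n' "$line" | metric t0CPU | tr -d '%')
    sqlq=$(printf '%s\n' "$line" | metric SQLq)
    echo "t0CPU=$t0 SQLq=$sqlq"
    # awk does the comparison because t0CPU is fractional and
    # shell arithmetic is integer-only.
    if awk -v v="$t0" 'BEGIN { exit !(v + 0 > 90) }'; then
        echo "WARNING: t0CPU above 90%"
    fi
}
```

A single line cannot show whether SQLq is growing, so the summary only reports its current value; comparing successive summaries over time is what reveals a database bottleneck.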

AI Summary for RAG

Summary: VoIPmonitor scaling guide covering three bottlenecks: (1) Packet Capture - use TPACKET_V3, NIC tuning (ethtool ring buffer/interrupt coalescing), interface_ip_filter instead of BPF, kernel-bypass (DPDK, PF_RING, Napatech); (2) Disk I/O - TAR-based storage, ext4 tuning, RAID WriteBack cache; (3) Database - innodb_buffer_pool_size formula: shared servers = (RAM - VoIPmonitor - OS) / 2, dedicated = 50-70% RAM. Capacity: ~10,000 calls with full RTP, 60,000+ SIP-only. Component separation for >5,000 calls: MySQL/Sensor/GUI on separate hosts. Monitor t0CPU and SQLq in syslog.

Keywords: scaling, performance, bottleneck, t0CPU, SQLq, TPACKET_V3, DPDK, PF_RING, Napatech, ethtool, ring buffer, interrupt coalescing, interface_ip_filter, pcap_dump_writethreads, jitterbuffer, innodb_buffer_pool_size, OOM, database partitioning, component separation, three host architecture, dedicated MySQL, remote database

Key Questions:

How do I scale VoIPmonitor for thousands of concurrent calls?
What are the main performance bottlenecks?
How do I fix high t0CPU usage?
What is DPDK and when should I use it?
How do I calculate innodb_buffer_pool_size?
What causes "MariaDB connection refused" errors?
How do I deploy MySQL, GUI, and Sensor on separate servers?
How do I interpret syslog performance metrics?
How much RAM does VoIPmonitor need?