Sniffer Troubleshooting
This page covers common VoIPmonitor sniffer/sensor problems organized by symptom. For configuration reference, see Sniffer_configuration. For performance tuning, see Scaling.
Critical First Step: Is Traffic Reaching the Interface?
⚠️ Warning: Before any sensor tuning, verify packets are reaching the network interface. If packets aren't there, no amount of sensor configuration will help.
# Check for SIP traffic on the capture interface
tcpdump -i eth0 -nn "host <PROBLEMATIC_IP> and port 5060" -c 10
# If no packets: Network/SPAN issue - contact network admin
# If packets visible: Proceed with sensor troubleshooting below
Quick Diagnostic Checklist
| Check | Command | Expected Result |
|---|---|---|
| Service running | systemctl status voipmonitor | Active (running) |
| Traffic on interface | tshark -i eth0 -c 5 -Y "sip" | SIP packets displayed |
| Interface errors | ip -s link show eth0 | No RX errors/drops |
| Promiscuous mode | ip link show eth0 | PROMISC flag present |
| Logs | grep voip | No critical errors |
| GUI rules | Settings → Capture Rules | No unexpected "Skip" rules |
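If these checks are run repeatedly, they can be combined into a small script. A minimal sketch, assuming the capture interface is eth0 (adjust to your environment):

#!/bin/bash
# quick-check.sh - run the basic checks from the table above in one pass
IFACE=eth0
systemctl is-active voipmonitor                 # expect: active
timeout 10 tshark -i "$IFACE" -c 5 -Y "sip"     # expect: SIP packets printed
ip -s link show "$IFACE" | grep -A1 "RX:"       # expect: no errors/drops in the RX counters
ip link show "$IFACE" | grep -c PROMISC         # expect: 1 (promiscuous mode enabled)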
No Calls Being Recorded
Service Not Running
# Check status
systemctl status voipmonitor
# View recent logs
journalctl -u voipmonitor --since "10 minutes ago"
# Start/restart
systemctl restart voipmonitor
Common startup failures:
- Interface not found: Check that interface in voipmonitor.conf matches the ip a output
- Port already in use: Another process using the management port
- License issue: Check License for activation problems
Wrong Interface or Port Configuration
# Check current config
grep -E "^interface|^sipport" /etc/voipmonitor.conf
# Example correct config:
# interface = eth0
# sipport = 5060
GUI Capture Rules Blocking
Navigate to Settings → Capture Rules and check for rules with action "Skip" that may be blocking calls. Rules are processed in order - a Skip rule early in the list will block matching calls.
See Capture_rules for detailed configuration.
SPAN/Mirror Not Configured
If tcpdump shows no traffic:
- Verify switch SPAN/mirror port configuration
- Check that both directions (ingress + egress) are mirrored
- Confirm VLAN tagging is preserved if needed
- Test physical connectivity (cable, port status)
See Sniffing_modes for SPAN, RSPAN, and ERSPAN configuration.
Filter Parameter Too Restrictive
If filter is set in voipmonitor.conf, it may exclude traffic:
# Check filter
grep "^filter" /etc/voipmonitor.conf
# Temporarily disable to test
# Comment out the filter line and restart
Missing id_sensor Parameter
Symptom: SIP packets visible in Capture/PCAP section but missing from CDR, SIP messages, and Call flow.
Cause: The id_sensor parameter is not configured or is missing. This parameter is required to associate captured packets with the CDR database.
Solution:
# Check if id_sensor is set
grep "^id_sensor" /etc/voipmonitor.conf
# Add or correct the parameter
echo "id_sensor = 1" >> /etc/voipmonitor.conf
# Restart the service
systemctl restart voipmonitor
💡 Tip: Use a unique numeric identifier (1-65535) for each sensor. Essential for multi-sensor deployments. See id_sensor documentation.
Missing Audio / RTP Issues
One-Way Audio (Asymmetric Mirroring)
Symptom: SIP recorded but only one RTP direction captured.
Cause: SPAN port configured for only one direction.
Diagnosis:
# Count RTP packets per direction
tshark -i eth0 -Y "rtp" -T fields -e ip.src -e ip.dst | sort | uniq -c
If one direction shows 0 or very few packets, configure the switch to mirror both ingress and egress traffic.
RTP Not Associated with Call
Symptom: Audio plays in sniffer but not in GUI, or RTP listed under wrong call.
Possible causes:
1. SIP and RTP on different interfaces/VLANs:
# voipmonitor.conf - enable automatic RTP association
auto_enable_use_blocks = yes
2. NAT not configured:
# voipmonitor.conf - for NAT scenarios
natalias = <public_ip> <private_ip>
# If not working, try reversed order:
natalias = <private_ip> <public_ip>
3. External device modifying media ports:
If SDP advertises one port but RTP arrives on a different port (SBC/media server issue):
# Compare SDP ports vs actual RTP
tshark -r call.pcap -Y "sip.Method == INVITE" -V | grep "m=audio"
tshark -r call.pcap -Y "rtp" -T fields -e udp.dstport | sort -u
If ports don't match, the external device must be configured to preserve SDP ports - VoIPmonitor cannot compensate.
RTP Incorrectly Associated with Wrong Call (PBX Port Reuse)
Symptom: RTP streams from one call appear associated with a different CDR when your PBX aggressively reuses the same IP:port across multiple calls.
Cause: When PBX reuses media ports, VoIPmonitor may incorrectly correlate RTP packets to the wrong call based on weaker correlation methods.
Solution: Enable rtp_check_both_sides_by_sdp to require verification of both source and destination IP:port against SDP:
# voipmonitor.conf - require both source and destination to match SDP
rtp_check_both_sides_by_sdp = yes
# Alternative (strict) mode - allows initial unverified packets
rtp_check_both_sides_by_sdp = strict
⚠️ Warning: Enabling this may prevent RTP association for calls using NAT, as the source IP:port will not match the SDP. Use natalias mappings or the strict setting to mitigate this.
Snaplen Truncation
Symptom: Large SIP messages truncated, incomplete headers.
Solution:
# voipmonitor.conf - increase packet capture size
snaplen = 8192
For Kamailio siptrace, also check trace_msg_fragment_size in Kamailio config. See snaplen documentation.
PACKETBUFFER Saturation
Symptom: Log shows PACKETBUFFER: memory is FULL, truncated RTP recordings.
⚠️ Warning: This alert refers to VoIPmonitor's internal packet buffer (max_buffer_mem), NOT system RAM. High system memory availability does not prevent this error. The root cause is always a downstream bottleneck (disk I/O or CPU) preventing packets from being processed fast enough.
Before testing solutions, gather diagnostic data:
- Check sensor logs: /var/log/syslog (Debian/Ubuntu) or /var/log/messages (RHEL/CentOS)
- Generate debug log via GUI: Tools → Generate debug log
Diagnose: I/O vs CPU Bottleneck
⚠️ Warning: Do not guess the bottleneck source. Use proper diagnostics first to identify whether the issue is disk I/O, CPU, or database-related. Disabling storage as a test is valid but should be used to confirm findings, not as the primary diagnostic method.
Step 1: Check IO[] Metrics (v2026.01.3+)
Starting with version 2026.01.3, VoIPmonitor includes built-in disk I/O monitoring that directly shows disk saturation status:
[283.4/283.4Mb/s] IO[B1.1|L0.7|U45|C75|W125|R10|WI1.2k|RI0.5k]
Quick interpretation:
| Metric | Meaning | Problem Indicator |
|---|---|---|
| C (Capacity) | % of disk's sustainable throughput used | C ≥ 80% = Warning, C ≥ 95% = Saturated |
| L (Latency) | Current write latency in ms | L ≥ 3× B (baseline) = Saturated |
| U (Utilization) | % time disk is busy | U > 90% = Disk at limit |
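The C (capacity) value is the quickest saturation indicator. A minimal sketch that pulls the latest IO[] block from the journal and warns when C reaches 80%, assuming the field layout shown in the example above and the voipmonitor systemd unit name:

# Pull the most recent IO[] block and flag capacity >= 80%
line=$(journalctl -u voipmonitor -n 200 --no-pager | grep -o 'IO\[[^]]*\]' | tail -1)
cap=$(echo "$line" | sed -n 's/.*|C\([0-9]*\)|.*/\1/p')
echo "latest: $line (C=${cap}%)"
[ -n "$cap" ] && [ "$cap" -ge 80 ] && echo "WARNING: disk capacity usage at ${cap}% - approaching saturation"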
If you see DISK_SAT or WARN after IO[]:
IO[B1.1|L8.5|U98|C97|W890|R5|WI12.5k|RI0.1k] DISK_SAT
→ This confirms I/O bottleneck. Skip to I/O Bottleneck Solutions.
For older versions or additional confirmation, continue with the steps below.
ℹ️ Note: See Syslog Status Line - IO[] section for detailed field descriptions.
Step 2: Read the Full Syslog Status Line
VoIPmonitor outputs a status line every 10 seconds. This is your first diagnostic tool:
# Monitor in real-time
journalctl -u voipmonitor -f
# or
tail -f /var/log/syslog | grep voipmonitor
Example status line:
calls[424] PS[C:4 S:41 R:13540] SQLq[C:0 M:0] heap[45|30|20] comp[48] [25.6Mb/s] t0CPU[85%] t1CPU[12%] t2CPU[8%] tacCPU[8|8|7|7%] RSS/VSZ[365|1640]MB
Key metrics for bottleneck identification:
| Metric | What It Indicates | I/O Bottleneck Sign | CPU Bottleneck Sign |
|---|---|---|---|
| heap[A\|B\|C] | Buffer fill % (primary / secondary / processing) | High A with low t0CPU | High A with high t0CPU |
| t0CPU[X%] | Packet capture thread (single-core, cannot parallelize) | Low (<50%) | High (>80%) |
| comp[X] | Active compression threads | Very high (maxed out) | Normal |
| SQLq[C:X M:Y] | Pending SQL queries | Growing = database bottleneck | Stable |
| tacCPU[...] | TAR compression threads | All near 100% = compression bottleneck | Normal |
Interpretation flowchart:
- heap values rising → check t0CPU:
  - t0CPU > 80% → CPU bottleneck → CPU optimization
  - t0CPU < 50% → check comp and tacCPU:
    - comp maxed and tacCPU high → I/O bottleneck (disk cannot keep up with writes) → faster storage
    - comp normal → check SQLq:
      - SQLq growing → database bottleneck → MySQL tuning
      - SQLq stable → mixed/other issue
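The same decision logic can be scripted against the live status line. A minimal sketch, assuming the field layout from the example above (heap[...], t0CPU[...], SQLq[C:...]) and that heap above 20% indicates buffer pressure; it prints a hint, not a definitive diagnosis:

# Parse the latest status line and print a bottleneck hint
line=$(journalctl -u voipmonitor -n 200 --no-pager | grep 'heap\[' | tail -1)
heap=$(echo "$line" | sed -n 's/.*heap\[\([0-9]*\).*/\1/p')
t0=$(echo "$line"   | sed -n 's/.*t0CPU\[\([0-9]*\)%.*/\1/p')
sqlq=$(echo "$line" | sed -n 's/.*SQLq\[C:\([0-9]*\).*/\1/p')
if [ "${heap:-0}" -gt 20 ]; then
  if [ "${t0:-0}" -gt 80 ]; then
    echo "heap=${heap}% t0CPU=${t0}% -> CPU bottleneck"
  elif [ "${sqlq:-0}" -gt 100 ]; then
    echo "heap=${heap}% SQLq=${sqlq} -> database bottleneck (compare successive samples to confirm growth)"
  else
    echo "heap=${heap}% t0CPU=${t0}% -> likely I/O bottleneck (check comp/tacCPU and iostat)"
  fi
else
  echo "heap=${heap:-?}% -> buffers healthy"
fi

Run it a few times a minute apart; a steadily rising heap or SQLq value is more meaningful than a single sample.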
Step 3: Linux I/O Diagnostics
Use these standard Linux tools to confirm I/O bottleneck:
Install required tools:
# Debian/Ubuntu
apt install sysstat iotop ioping
# CentOS/RHEL
yum install sysstat iotop ioping
3a) iostat - Disk utilization and wait times
# Run for 10 intervals of 2 seconds
iostat -xz 2 10
Key output columns:
Device r/s w/s rkB/s wkB/s await %util
sda 12.50 245.30 50.00 1962.40 45.23 98.50
| Column | Description | Problem Indicator |
|---|---|---|
| %util | Device utilization percentage | > 90% = disk saturated |
| await | Average I/O wait time (ms) | > 20ms for SSD, > 50ms for HDD = high latency |
| w/s | Writes per second | Compare with disk's rated IOPS |
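To watch just these columns during peak traffic, filter iostat for the device backing the spool directory (sda below is an assumption; use df to identify yours):

# Identify the device backing the spool directory, then stream only its iostat line
df /var/spool/voipmonitor
# Replace sda with the device reported above; watch the await and %util columns
iostat -xz 2 | grep --line-buffered '^sda'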
3b) iotop - Per-process I/O usage
# Show I/O by process (run as root)
iotop -o
Look for voipmonitor or mysqld dominating I/O. If voipmonitor shows high DISK WRITE while the system %util is at 100%, the disk cannot keep up.
3c) ioping - Quick latency check
# Test latency on VoIPmonitor spool directory
cd /var/spool/voipmonitor
ioping -c 20 .
Expected results:
| Storage Type | Healthy Latency | Problem Indicator |
|---|---|---|
| NVMe SSD | < 0.5 ms | > 2 ms |
| SATA SSD | < 1 ms | > 5 ms |
| HDD (7200 RPM) | < 10 ms | > 30 ms |
Step 4: Linux CPU Diagnostics
4a) top - Overall CPU usage
# Press '1' to show per-core CPU
top
Look for:
- Individual CPU core at 100% (t0 thread is single-threaded)
- High %wa (I/O wait) vs high %us/%sy (CPU-bound)
4b) Verify voipmonitor threads
# Show voipmonitor threads with CPU usage
top -H -p $(pgrep voipmonitor)
If one thread shows ~100% CPU while others are low, you have a CPU bottleneck on the capture thread (t0).
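As an alternative to top, ps can list all voipmonitor threads sorted by CPU so the saturated thread is immediately visible:

# List voipmonitor threads sorted by CPU usage (highest first)
# note: pcpu here is averaged over thread lifetime; use top -H for instantaneous values
ps -L -p "$(pgrep -o voipmonitor)" -o tid,pcpu,comm --sort=-pcpu | head -15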
Step 5: Decision Matrix
| Observation | Likely Cause | Go To |
|---|---|---|
| heap high, t0CPU > 80%, iostat %util low | CPU Bottleneck | CPU Solution |
| heap high, t0CPU < 50%, iostat %util > 90% | I/O Bottleneck | I/O Solution |
| heap high, t0CPU < 50%, iostat %util < 50%, SQLq growing | Database Bottleneck | Database Solution |
| heap normal, comp maxed, tacCPU all ~100% | Compression Bottleneck (type of I/O) | I/O Solution |
Step 6: Confirmation Test (Optional)
After identifying the likely cause with the tools above, you can confirm with a storage disable test:
# /etc/voipmonitor.conf - temporarily disable all storage
savesip = no
savertp = no
savertcp = no
savegraph = no
systemctl restart voipmonitor
# Monitor for 5-10 minutes during peak traffic
journalctl -u voipmonitor -f | grep heap
- If heap values drop to near zero → confirms I/O bottleneck
- If heap values remain high → confirms CPU bottleneck
⚠️ Warning: Remember to re-enable storage after testing! This test causes call recordings to be lost.
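To make the rollback foolproof, back up the configuration before the test and restore it afterwards; a minimal sketch using the config path referenced elsewhere on this page:

# Back up the config before the test, restore it afterwards
cp /etc/voipmonitor.conf /etc/voipmonitor.conf.pre-test
# ... set savesip/savertp/savertcp/savegraph = no, restart, observe heap ...
cp /etc/voipmonitor.conf.pre-test /etc/voipmonitor.conf
systemctl restart voipmonitor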
Solution: I/O Bottleneck
ℹ️ Note: If you see IO[...] DISK_SAT or WARN in the syslog status line (v2026.01.3+), disk saturation is already confirmed. See IO[] Metrics for details.
Quick confirmation (for older versions):
Temporarily save only RTP headers to reduce disk write load:
# /etc/voipmonitor.conf
savertp = header
Restart the sniffer and monitor. If heap usage stabilizes and "MEMORY IS FULL" errors stop, the issue is confirmed to be storage I/O.
Check storage health before upgrading:
# Check drive health
smartctl -a /dev/sda
# Check for I/O errors in system logs
dmesg | grep -i "i/o error\|sd.*error\|ata.*error"
Look for reallocated sectors, pending sectors, or I/O errors. Replace failing drives before considering upgrades.
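The relevant SMART attributes can also be filtered directly; a short sketch, assuming the drive is /dev/sda:

# Show only the SMART attributes that indicate media problems (non-zero raw values = failing drive)
smartctl -A /dev/sda | egrep -i 'Reallocated_Sector|Current_Pending|Offline_Uncorrectable|UDMA_CRC'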
Storage controller cache settings:
| Storage Type | Recommended Cache Mode |
|---|---|
| HDD / NAS | WriteBack (requires battery-backed cache) |
| SSD | WriteThrough (or WriteBack with power loss protection) |
Use vendor-specific tools to configure cache policy (megacli, ssacli, perccli).
Storage upgrades (in order of effectiveness):
| Solution | IOPS Improvement | Notes |
|---|---|---|
| NVMe SSD | 50-100x vs HDD | Best option, handles 10,000+ concurrent calls |
| SATA SSD | 20-50x vs HDD | Good option, handles 5,000+ concurrent calls |
| RAID 10 with BBU | 5-10x vs single disk | Enable WriteBack cache (requires battery backup) |
| Separate storage server | Variable | Use client/server mode |
Filesystem tuning (ext4):
# Check current mount options
mount | grep voipmonitor
# Recommended mount options for /var/spool/voipmonitor
# Add to /etc/fstab: noatime,data=writeback,barrier=0
# WARNING: barrier=0 requires battery-backed RAID
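For illustration, a hedged example of the corresponding /etc/fstab entry; the device name /dev/sdb1 is an assumption, and barrier=0 must only be used with battery-backed write cache:

# /etc/fstab - example entry for a dedicated spool partition (device name is an assumption)
/dev/sdb1  /var/spool/voipmonitor  ext4  noatime,data=writeback,barrier=0  0  2
# data=writeback cannot be switched on a mounted filesystem with a simple remount;
# plan a maintenance window or reboot to apply it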
Verify improvement:
# After changes, monitor iostat
iostat -xz 2 10
# %util should drop below 70%, await should decrease
Solution: CPU Bottleneck
Identify CPU Bottleneck Using Manager Commands
VoIPmonitor provides manager commands to monitor thread CPU usage in real-time. This is essential for identifying which thread is saturated.
Connect to manager interface:
# Via Unix socket (local, recommended)
echo 'sniffer_threads' | nc -U /tmp/vm_manager_socket
# Via TCP port 5029 (remote or local)
echo 'sniffer_threads' | nc 127.0.0.1 5029
# Monitor continuously (every 2 seconds)
watch -n 2 "echo 'sniffer_threads' | nc -U /tmp/vm_manager_socket"
ℹ️ Note: TCP port 5029 is encrypted by default. For unencrypted access, set manager_enable_unencrypted = yes in voipmonitor.conf (security risk on public networks).
Example output:
t0 - binlog1 fifo pcap read ( 12345) : 78.5 FIFO 99 1234
t2 - binlog1 pb write ( 12346) : 12.3 456
rtp thread binlog1 binlog1 0 ( 12347) : 8.1 234
rtp thread binlog1 binlog1 1 ( 12348) : 6.2 198
t1 - binlog1 call processing ( 12349) : 4.5 567
tar binlog1 compression 0 ( 12350) : 3.2 89
Column interpretation:
| Column | Description |
|---|---|
| Thread name | Descriptive name (t0=capture, t1=call processing, t2=packet buffer write) |
| (TID) | Linux thread ID (useful for top -H -p TID) |
| CPU % | Current CPU usage percentage - key metric |
| Sched | Scheduler type (FIFO = real-time, empty = normal) |
| Priority | Thread priority |
| CS/s | Context switches per second |
Critical threads to watch:
| Thread | Role | If at 90-100% |
|---|---|---|
| t0 (pcap read) | Packet capture from NIC | Single-core limit reached! Cannot parallelize. Need DPDK/Napatech. |
| t2 (pb write) | Packet buffer processing | Processing bottleneck. Check t2CPU breakdown. |
| rtp thread | RTP packet processing | Threads auto-scale. If still saturated, consider DPDK/Napatech. |
| tar compression | PCAP archiving | I/O bottleneck (compression waiting for disk) |
| mysql store | Database writes | Database bottleneck. Check SQLq metric. |
⚠️ Warning: If t0 thread is at 90-100%, you have hit the fundamental single-core capture limit. The t0 thread reads packets from the kernel and cannot be parallelized. Disabling features like jitterbuffer will NOT help - those run on different threads. The only solutions are:
- Reduce captured traffic using interface_ip_filter or a BPF filter
- Use kernel bypass (DPDK or Napatech), which eliminates kernel overhead entirely
Interpreting t2CPU Detailed Breakdown
The syslog status line shows t2CPU with detailed sub-metrics:
t2CPU[pb:10/ d:39/ s:24/ e:17/ c:6/ g:6/ r:7/ rm:24/ rh:16/ rd:19/]
| Code | Function | High Value Indicates |
|---|---|---|
| pb | Packet buffer output | Buffer management overhead |
| d | Dispatch | Structure creation bottleneck |
| s | SIP parsing | Complex/large SIP messages |
| e | Entity lookup | Call table lookup overhead |
| c | Call processing | Call state machine processing |
| g | Register processing | High REGISTER volume |
| r, rm, rh, rd | RTP processing stages | High RTP volume (threads auto-scale) |
Thread auto-scaling: VoIPmonitor automatically spawns additional threads when load increases:
- If d > 50% → SIP parsing thread (s) starts
- If s > 50% → Entity lookup thread (e) starts
- If e > 50% → Call/register/RTP threads start
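To observe this auto-scaling in practice, the t2CPU breakdown can be sampled over time; a minimal sketch assuming the voipmonitor systemd unit:

# Print the most recent t2CPU breakdown every 10 seconds (matches the status line interval)
watch -n 10 "journalctl -u voipmonitor -n 50 --no-pager | grep -o 't2CPU\[[^]]*\]' | tail -1"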
Configuration for High Traffic (>10,000 calls/sec)
# /etc/voipmonitor.conf
# Increase buffer to handle processing spikes (value in MB)
# 10000 = 10 GB - can go higher (20000, 30000+) if RAM allows
# Larger buffer absorbs I/O and CPU spikes without packet loss
max_buffer_mem = 10000
# Use IP filter instead of BPF (more efficient)
interface_ip_filter = 10.0.0.0/8
interface_ip_filter = 192.168.0.0/16
# Comment out any 'filter' parameter
CPU Optimizations
# /etc/voipmonitor.conf
# Reduce jitterbuffer calculations to save CPU (keeps MOS-F2 metric)
jitterbuffer_f1 = no
jitterbuffer_f2 = yes
jitterbuffer_adapt = no
# If MOS metrics are not needed at all, disable everything:
# jitterbuffer_f1 = no
# jitterbuffer_f2 = no
# jitterbuffer_adapt = no
Kernel Bypass Solutions (Extreme Loads)
When t0 thread hits 100% on standard NIC, kernel bypass is the only solution:
| Solution | Type | CPU Reduction | Use Case |
|---|---|---|---|
| DPDK | Open-source | ~70% | Multi-gigabit on commodity hardware |
| Napatech | Hardware SmartNIC | >97% (< 3% at 10Gbit) | Extreme performance requirements |
Verify Improvement
# Monitor thread CPU after changes
watch -n 2 "echo 'sniffer_threads' | nc -U /tmp/vm_manager_socket | head -10"
# Or monitor syslog
journalctl -u voipmonitor -f
# t0CPU should drop, heap values should stay < 20%
ℹ️ Note: After changes, monitor syslog heap[A|B|C] values - should stay below 20% during peak traffic. See Syslog_Status_Line for detailed metric explanations.
Storage Hardware Failure
Symptom: Sensor shows disconnected (red X) with "DROPPED PACKETS" at low traffic volumes.
Diagnosis:
# Check disk health
smartctl -a /dev/sda
# Check RAID status (if applicable)
cat /proc/mdstat
mdadm --detail /dev/md0
Look for reallocated sectors, pending sectors, or RAID degraded state. Replace failing disk.
OOM (Out of Memory)
Identify OOM Victim
# Check for OOM kills
dmesg | grep -i "out of memory\|oom\|killed process"
journalctl --since "1 hour ago" | grep -i oom
MySQL Killed by OOM
Reduce InnoDB buffer pool:
# /etc/mysql/my.cnf
innodb_buffer_pool_size = 2G # Reduce from default
Voipmonitor Killed by OOM
Reduce buffer sizes in voipmonitor.conf:
max_buffer_mem = 2000 # Reduce from default
ringbuffer = 50 # Reduce from default
Runaway External Process
# Find memory-hungry processes
ps aux --sort=-%mem | head -20
# Kill orphaned/runaway process
kill -9 <PID>
For servers limited to 16GB RAM or when experiencing repeated MySQL OOM kills:
# /etc/my.cnf or /etc/mysql/mariadb.conf.d/50-server.cnf
[mysqld]
# On 16GB server: 6GB buffer pool + 6GB MySQL overhead = 12GB total
# Leaves 4GB for OS + GUI, preventing OOM
innodb_buffer_pool_size = 6G
# Enable write buffering (may lose up to 1s of data on crash but reduces memory pressure)
innodb_flush_log_at_trx_commit = 2
Restart MySQL after changes:
systemctl restart mysql
# or
systemctl restart mariadb
SQL Queue Growth from Non-Call Data
If sip-register, sip-options, or sip-subscribe are enabled, non-call SIP messages (OPTIONS, REGISTER, SUBSCRIBE, NOTIFY) can accumulate in the database and cause the SQL queue to grow unbounded. This increases MySQL memory usage and leads to OOM kills of mysqld.
⚠️ Warning: Even with reduced innodb_buffer_pool_size, SQL queue will grow indefinitely without cleanup of non-call data.
Solution: Enable automatic cleanup of old non-call data
# /etc/voipmonitor.conf
# cleandatabase=2555 automatically deletes partitions older than 7 years
# Covers: CDR, register_state, register_failed, and sip_msg (OPTIONS/SUBSCRIBE/NOTIFY)
cleandatabase = 2555
Restart the sniffer after changes:
systemctl restart voipmonitor
ℹ️ Note: See Data_Cleaning for detailed configuration options and other cleandatabase_* parameters.
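To verify that non-call tables are actually what is growing, list the largest tables in the database; a sketch assuming the schema is named voipmonitor (adjust to your installation):

# List the ten largest tables - sip_msg / register_* dominating confirms non-call data growth
mysql -e "SELECT table_name, ROUND((data_length+index_length)/1024/1024) AS size_mb
          FROM information_schema.tables WHERE table_schema='voipmonitor'
          ORDER BY size_mb DESC LIMIT 10;"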
Service Startup Failures
Interface No Longer Exists
After OS upgrade, interface names may change (eth0 → ensXXX):
# Find current interface names
ip a
# Update all config locations
grep -r "interface" /etc/voipmonitor.conf /etc/voipmonitor.conf.d/
# Also check GUI: Settings → Sensors → Configuration
Missing Dependencies
# Install common missing package
apt install libpcap0.8 # Debian/Ubuntu
yum install libpcap # RHEL/CentOS
Network Interface Issues
Promiscuous Mode
Required for SPAN port monitoring:
# Enable
ip link set eth0 promisc on
# Verify
ip link show eth0 | grep PROMISC
ℹ️ Note: Promiscuous mode is NOT required for ERSPAN/GRE tunnels where traffic is addressed to the sensor.
Interface Drops
# Check for drops
ip -s link show eth0 | grep -i drop
# If drops present, increase ring buffer
ethtool -G eth0 rx 4096
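Before raising the ring size, check the hardware maximum reported by the driver, since ethtool -G fails if the requested value exceeds it:

# Show current and maximum ring sizes - stay within the reported "Pre-set maximums"
ethtool -g eth0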
Bonded/EtherChannel Interfaces
Symptom: False packet loss when monitoring bond0 or br0.
Solution: Monitor physical interfaces, not logical:
# voipmonitor.conf - use physical interfaces
interface = eth0,eth1
Network Offloading Issues
Symptom: Kernel errors like bad gso: type: 1, size: 1448
# Disable offloading on capture interface
ethtool -K eth0 gso off tso off gro off lro off
Packet Ordering Issues
If SIP messages appear out of sequence:
First: Rule out Wireshark display artifact - disable "Analyze TCP sequence numbers" in Wireshark. See FAQ.
If genuine reordering: Usually caused by packet bursts in network infrastructure. Use tcpdump to verify packets arrive out of order at the interface. Work with network admin to implement QoS or traffic shaping. For persistent issues, consider dedicated capture card with hardware timestamping (see Napatech).
ℹ️ Note: For out-of-order packets in client/server mode (multiple sniffers), see Sniffer_distributed_architecture for pcap_queue_dequeu_window_length configuration.
Solutions for SPAN/Mirroring Reordering
If packets arrive out of order at the SPAN/mirror port (e.g., 302 responses before INVITE causing "000 no response" errors):
1. Configure switch to preserve packet order: Many switches allow configuring SPAN/mirror ports to maintain packet ordering. Consult your switch documentation for packet ordering guarantees in mirroring configuration.
2. Replace SPAN with TAP or packet broker: Unlike software-based SPAN mirroring, hardware TAPs and packet brokers guarantee packet order. Consider upgrading to a dedicated TAP or packet broker device for mission-critical monitoring.
Database Issues
SQL Queue Overload
Symptom: Growing SQLq metric, potential coredumps.
# voipmonitor.conf - reduce SQL write overhead
mysqlstore_concat_limit_cdr = 1000
cdr_check_exists_callid = 0
Error 1062 - Lookup Table Limit
Symptom: Duplicate entry '16777215' for key 'PRIMARY'
Quick fix:
# voipmonitor.conf
cdr_reason_string_enable = no
See Database Troubleshooting for complete solution.
Bad Packet Errors
Symptom: bad packet with ether_type 0xFFFF detected on interface
Diagnosis:
# Run diagnostic (let run 30-60 seconds, then kill)
voipmonitor --check_bad_ether_type=eth0
# Find and kill the diagnostic process
ps ax | grep voipmonitor
kill -9 <PID>
Causes: corrupted packets, driver issues, VLAN tagging problems. Check ethtool -S eth0 for interface errors.
Useful Diagnostic Commands
tshark Filters for SIP
# All SIP INVITEs
tshark -r capture.pcap -Y "sip.Method == INVITE"
# Find specific phone number
tshark -r capture.pcap -Y 'sip contains "5551234567"'
# Get Call-IDs
tshark -r capture.pcap -Y "sip.Method == INVITE" -T fields -e sip.Call-ID
# SIP errors (4xx, 5xx)
tshark -r capture.pcap -Y "sip.Status-Code >= 400"
Interface Statistics
# Detailed NIC stats
ethtool -S eth0
# Watch packet rates
watch -n 1 'cat /proc/net/dev | grep eth0'
See Also
- Sniffer_configuration - Configuration parameter reference
- Sniffer_distributed_architecture - Client/server deployment
- Capture_rules - GUI-based recording rules
- Sniffing_modes - SPAN, ERSPAN, GRE, TZSP setup
- Scaling - Performance optimization
- Database_troubleshooting - Database issues
- FAQ - Common questions and Wireshark display issues
AI Summary for RAG
Summary
Comprehensive troubleshooting guide for VoIPmonitor sniffer/sensor problems. Covers: verifying traffic reaches interface (tcpdump/tshark), diagnosing no calls recorded (service, config, capture rules, SPAN), missing audio/RTP issues (one-way audio, NAT, natalias, rtp_check_both_sides_by_sdp), PACKETBUFFER FULL errors (I/O vs CPU bottleneck diagnosis using syslog metrics heap/t0CPU/SQLq and Linux tools iostat/iotop/ioping), manager commands for thread monitoring (sniffer_threads via socket or port 5029), t0 single-core capture limit and solutions (DPDK/Napatech kernel bypass), I/O solutions (NVMe/SSD, async writes, pcap_dump_writethreads), CPU solutions (max_buffer_mem 10GB+, jitterbuffer tuning), OOM issues (MySQL buffer pool, voipmonitor buffers), network interface problems (promiscuous mode, drops, offloading), packet ordering, database issues (SQL queue, Error 1062).
Keywords
troubleshooting, sniffer, sensor, no calls, missing audio, one-way audio, RTP, PACKETBUFFER FULL, memory is FULL, buffer saturation, I/O bottleneck, CPU bottleneck, heap, t0CPU, t1CPU, t2CPU, SQLq, comp, tacCPU, iostat, iotop, ioping, sniffer_threads, manager socket, port 5029, thread CPU, t0 thread, single-core limit, DPDK, Napatech, kernel bypass, NVMe, SSD, async write, pcap_dump_writethreads, tar_maxthreads, max_buffer_mem, jitterbuffer, interface_ip_filter, OOM, out of memory, innodb_buffer_pool_size, promiscuous mode, interface drops, ethtool, packet ordering, SPAN, mirror, SQL queue, Error 1062, natalias, NAT, id_sensor, snaplen, capture rules, tcpdump, tshark
Key Questions
- Why are no calls being recorded in VoIPmonitor?
- How to diagnose PACKETBUFFER FULL or memory is FULL error?
- How to determine if bottleneck is I/O or CPU?
- What do heap values in syslog mean?
- What does t0CPU percentage indicate?
- How to use sniffer_threads manager command?
- How to connect to manager socket or port 5029?
- What to do when t0 thread is at 100%?
- How to fix one-way audio or missing RTP?
- How to configure natalias for NAT?
- How to increase max_buffer_mem for high traffic?
- How to disable jitterbuffer to save CPU?
- What causes OOM kills of voipmonitor or MySQL?
- How to check disk I/O performance with iostat?
- How to enable promiscuous mode on interface?
- How to fix packet ordering issues with SPAN?
- What is Error 1062 duplicate entry?
- How to verify traffic reaches capture interface?