Syslog Status Line
Every 10 seconds, VoIPmonitor outputs a status line to syslog containing real-time metrics about calls, CPU usage, memory, and queue sizes. This page explains each metric and what to do when values indicate problems.
Overview
The status line is logged to:
- Debian/Ubuntu: /var/log/syslog
- CentOS/RHEL: /var/log/messages
Example status line:
Nov 26 07:22:06 voipmonitor voipmonitor[2518]: calls[424][424] PS[C:4 S:41/41 R:13540 A:13583] SQLq[C:0 M:0 Cl:0] heap[0|0|0] comp[48] [25.6Mb/s] t0CPU[7.0%] t1CPU[2.5%] t2CPU[1.6%] tacCPU[8.0|8.0|6.8|6.9%] RSS/VSZ[365|1640]MB
Monitor in real-time:
journalctl -u voipmonitor -f
# or
tail -f /var/log/syslog | grep voipmonitor
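For scripted checks it is often enough to pull only the most recent status line. A minimal sketch, assuming the Debian/Ubuntu log path:
# latest status line emitted by the sniffer
grep 'voipmonitor\[' /var/log/syslog | grep 'calls\[' | tail -n 1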
Metric Reference
calls[X][Y] - Active Calls
calls[424][424]
│ └── Total calls in memory (including finishing)
└─────── Active calls (in progress)
| Value | Meaning | Action |
|---|---|---|
| Both equal | Normal operation | None |
| Second > First | Calls finishing, cleanup pending | Normal |
| Very high numbers | High traffic or stuck calls | Check for call leaks |
PS[C:x S:y/z R:a A:b] - Packet Statistics
PS[C:4 S:41/41 R:13540 A:13583]
   │   │       │       └── A: All packets processed
   │   │       └────────── R: RTP packets
   │   └────────────────── S: SIP packets per interval (current/total)
   └────────────────────── C: Control packets
💡 Tip: If R (RTP) is 0 but S (SIP) shows traffic, check that RTP ports are included in capture or NAT aliases are configured.
SQLq[C:x M:y Cl:z] - SQL Queue
Critical metric for database health.
SQLq[C:0 M:0 Cl:0]
│ │ └── Cl: Cleanup queries pending
│ └─────── M: Message/register queries pending
└─────────── C: CDR queries pending
| Value | Status | Action |
|---|---|---|
| All zeros | Healthy - DB keeping up | None |
| Growing slowly | Warning - DB slightly behind | Monitor trend |
| Growing continuously | Critical - DB bottleneck | See SQL Queue Growing |
| Stuck at high value | Critical - DB connection issue | Check MySQL connectivity |
Alternative format (older versions):
SQLq[cdr: 5234] SQLf[cdr: 12]
└── In-memory queue └── On-disk queue (query_cache files)
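Because a steadily growing SQL queue is the earliest warning of a database bottleneck, it is worth alerting on it automatically. A minimal sketch for the current format (the 1000-entry threshold and the Debian log path are assumptions):
# warn when the CDR queue (C:) in the newest status line exceeds a threshold
LAST=$(grep 'SQLq\[' /var/log/syslog | tail -n 1)
CDRQ=$(echo "$LAST" | sed -n 's/.*SQLq\[C:\([0-9]*\).*/\1/p')
[ "${CDRQ:-0}" -gt 1000 ] && echo "WARNING: SQLq CDR queue at $CDRQ"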
heap[A|B|C] - Memory Buffers
heap[0|0|0]
│ │ └── C: Processing heap usage %
│ └──── B: Secondary buffer usage %
└────── A: Primary packet buffer usage %
| Value | Status | Action |
|---|---|---|
| All < 20% | Healthy | None |
| Any > 50% | Warning - Buffer filling | Investigate bottleneck |
| Any approaching 100% | Critical | Increase max_buffer_mem or fix bottleneck |
⚠️ Warning: If heap reaches 100%, you'll see PACKETBUFFER: MEMORY IS FULL and packets will be dropped.
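A quick way to confirm whether buffer exhaustion has already occurred is to count that log message directly (path assumes Debian/Ubuntu):
# number of buffer-full events recorded in the current log
grep -c 'PACKETBUFFER: MEMORY IS FULL' /var/log/syslog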
comp[x] - Compression Threads
comp[48]
└── Number of active compression threads for PCAP/audio files
A high thread count indicates heavy compression activity and correspondingly high CPU usage.
[X.X Mb/s] - Traffic Rate
[25.6Mb/s]
└── Current network traffic rate being processed
Useful for capacity planning and detecting traffic spikes.
IO[...] - Disk I/O Monitoring (v2026.01.3+)
[283.4/283.4Mb/s] IO[B1.1|L0.7|Q3.2|U45|C75|W125|R10|WI1.2k|RI0.5k]
│ │ │ │ │ │ │ │ └── RI: Read IOPS (k=thousands)
│ │ │ │ │ │ │ └───────── WI: Write IOPS (k=thousands)
│ │ │ │ │ │ └───────────── R: Read throughput MB/s
│ │ │ │ │ └────────────────── W: Write throughput MB/s
│ │ │ │ └────────────────────── C: Capacity % (current/knee throughput)
│ │ │ └────────────────────────── U: Disk utilization %
│ │ └─────────────────────────────── Q: I/O queue depth
│ └──────────────────────────────────── L: Current write latency (ms)
└───────────────────────────────────────── B: Baseline latency (calibrated, ms)
Key concept: The Capacity (C) metric shows what percentage of the disk's sustainable throughput ("knee point") is currently being used. This is the primary indicator for disk saturation risk.
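As a worked example with hypothetical numbers: if the calibrated knee point is 250 MB/s and the sniffer is currently pushing 125 MB/s to disk, the line reports roughly C50 (125 / 250 × 100). The same 125 MB/s against a slower disk with a 150 MB/s knee already yields about C83 and puts the sensor into the WARN state described below.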
| Field | Description | Source |
|---|---|---|
| B | Baseline latency from calibration profile (ms) | Measured at minimal I/O load |
| L | Current write latency (ms) | Calculated from /proc/diskstats |
| Q | I/O queue depth (pending operations) | /proc/diskstats weighted_io_time |
| U | Disk utilization % (time spent doing I/O) | /proc/diskstats io_time |
| C | Capacity % = current throughput / knee throughput × 100 | Compared to calibration profile |
| W | Write throughput (MB/s) | /proc/diskstats sectors_written |
| R | Read throughput (MB/s) | /proc/diskstats sectors_read |
| WI | Write IOPS (always shown with 'k' suffix) | /proc/diskstats writes_completed |
| RI | Read IOPS (always shown with 'k' suffix) | /proc/diskstats reads_completed |
Status Indicators
| Status | Condition | Meaning | Action |
|---|---|---|---|
| (none) | C < 80% | Normal operation | None required |
| WARN | C ≥ 80% | Approaching disk capacity limit | Monitor closely, plan storage upgrade |
| DISK_SAT | C ≥ 95% OR L ≥ 3×B | Disk cannot keep up with writes | Immediate action required - see I/O Bottleneck Solutions |
Example with saturation warning:
[450.2/450.2Mb/s] IO[B1.1|L8.5|U98|C97|W890|R5|WI12.5k|RI0.1k] DISK_SAT
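When feeding these values into an external monitoring system instead of the built-in RRD graphs, the IO[] block can be split into one metric per line with standard tools. A minimal sketch (field codes follow the legend above; path assumes Debian/Ubuntu):
# split the IO[...] block of the newest status line into individual metrics
grep -o 'IO\[[^]]*\]' /var/log/syslog | tail -n 1 | sed 's/^IO\[//; s/\]$//' | tr '|' '\n'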
Calibration
On first startup, VoIPmonitor performs automatic disk calibration (~2-3 minutes, runs in background):
- Measures baseline latency with minimal I/O
- Progressively increases write load to find the "knee point" (where latency starts increasing)
- Stores the profile in .disk_io_calibration.conf in the spool directory
During calibration:
IO[calibrating 45%]
After calibration:
IO[B1.1|L0.7|U12|C8|W65|R2|WI0.8k|RI0.1k]
Recalibration
Calibration is automatically triggered when:
- No calibration profile exists
- Filesystem UUID changes (disk replaced)
To force recalibration, delete the profile:
rm /var/spool/voipmonitor/.disk_io_calibration.conf
systemctl restart voipmonitor
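To confirm that calibration ran and a new profile was written, check the spool directory and watch the progress indicator in syslog (paths as in the example above):
# the profile file reappears once calibration completes
ls -l /var/spool/voipmonitor/.disk_io_calibration.conf
# progress is reported in the status line while calibration runs
tail -f /var/log/syslog | grep -o 'IO\[calibrating [0-9]*%\]'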
💡 Tip: The IO[] metrics are written to RRD for graphing. The chart 2db-diskio contains all 9 metrics: io-latency, io-qdepth, io-util, io-capacity, io-write-throughput, io-read-throughput, io-write-iops, io-read-iops, io-reserve.
t0CPU[X%] - Packet Capture Thread
Most critical CPU metric.
t0CPU[7.0%]
└── CPU usage of the main packet capture thread
| Value | Status | Action |
|---|---|---|
| < 50% | Healthy | None |
| 50-80% | Warning | Plan capacity upgrade |
| > 90% | Critical | Packets will be dropped! |
⚠️ Warning: The t0 thread cannot be parallelized. If it hits 100%, you must reduce load (filters, disable features) or use kernel bypass (DPDK, Napatech).
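Because a saturated t0 thread drops packets silently, it pays to alert before it reaches the critical range. A minimal sketch against the newest status line (the 80% threshold and the log path are assumptions):
# extract t0CPU and warn above a chosen threshold
T0=$(grep -o 't0CPU\[[0-9.]*%\]' /var/log/syslog | tail -n 1 | sed 's/t0CPU\[\([0-9.]*\)%\]/\1/')
awk -v t0="$T0" 'BEGIN { if (t0+0 > 80) print "WARNING: t0CPU at " t0 "%" }'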
t1CPU[X%], t2CPU[X%] - Processing Threads
t1CPU[2.5%] t2CPU[1.6%]
Secondary processing threads. These can scale with traffic automatically.
t2CPU Detailed Breakdown
When t2CPU shows high usage, the detailed breakdown helps identify the bottleneck:
t2CPU[pb:10.5/ d:39.2/ s:24.6/ e:17.3/ c:6.8/ g:6.4/ r:7.3/ rm:24.6/ rh:16.7/ rd:19.3/]
| Code | Function | Description |
|---|---|---|
| pb | Packet buffer | Output from packet buffer |
| d | Dispatch | Creating structures for processing |
| s | SIP parsing | Parsing SIP headers |
| e | Entity lookup | Finding/creating calls |
| c | Call processing | Processing call packets |
| g | Register processing | Processing REGISTER packets |
| r | RTP processing | Processing RTP packets |
| rm | RTP move | Moving RTP packets for processing |
| rh | RTP hash | RTP hash table lookups |
| rd | RTP dispatch | Dispatching to RTP read queue |
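To spot the dominant stage without reading the breakdown by eye, the pairs can be split and sorted by value. A minimal sketch using the example line above:
# list the t2CPU stages sorted from busiest to least busy
echo 't2CPU[pb:10.5/ d:39.2/ s:24.6/ e:17.3/ c:6.8/ g:6.4/ r:7.3/ rm:24.6/ rh:16.7/ rd:19.3/]' | grep -o '[a-z]*:[0-9.]*' | sort -t: -k2 -rn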
Thread auto-scaling:
- If d > 50% → s thread starts (SIP parsing)
- If s > 50% → e thread starts (entity lookup)
- If e > 50% → c, g, r threads start
tacCPU[A|B|C|D%] - TAR Compression
tacCPU[8.0|8.0|6.8|6.9%]
└── CPU usage of TAR archive compression threads
High values indicate heavy PCAP archiving. Controlled by tar_maxthreads (default: 8).
RSS/VSZ[X|Y]MB - Memory Usage
RSS/VSZ[365|1640]MB
│ └── VSZ: Virtual memory (pre-allocated, not all used)
└─────── RSS: Resident Set Size (actual physical memory used)
| Metric | Meaning | Concern Level |
|---|---|---|
| RSS growing continuously | Possible memory leak | Investigate |
| High VSZ, normal RSS | Normal - virtual pre-allocation | None |
| RSS approaching server RAM | OOM risk | Reduce buffers or add RAM |
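The same figures can be cross-checked straight from the kernel, which helps decide whether a growing RSS really belongs to the sniffer process. A minimal sketch (assumes a single voipmonitor process):
# RSS and VSZ of the running sniffer, converted from KB to MB
ps -o rss=,vsz= -p "$(pgrep -x voipmonitor | head -n 1)" | awk '{ printf "RSS %d MB / VSZ %d MB\n", $1/1024, $2/1024 }'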
Common Issues
SQL Queue Growing
Symptoms: SQLq[C:5234...] increasing over time
Diagnosis:
# Check MySQL CPU
top -p $(pgrep mysqld)
# Check disk I/O
iostat -x 1 5
Solutions:
| Bottleneck | Solution |
|---|---|
| I/O (high iowait) | Upgrade to SSD/NVMe storage |
| CPU (mysqld at 100%) | Upgrade CPU or optimize queries |
| RAM (swapping) | Increase innodb_buffer_pool_size |
Immediate mitigations:
# /etc/voipmonitor.conf
query_cache = yes # Prevent OOM by using disk queue
mysqlstore_max_threads_cdr = 8 # Parallel DB writes
See SQL_queue_is_growing_in_a_peaktime and Database_troubleshooting for detailed guidance.
High t0CPU
Symptoms: t0CPU[>90%]
Solutions (in order of preference):
- Use interface_ip_filter instead of a BPF filter
- Disable unused features (e.g. jitterbuffer if not needed)
- Upgrade to kernel bypass: DPDK or Napatech
PACKETBUFFER FULL
Symptoms: heap[90|80|70] approaching 100%, or log message "PACKETBUFFER: MEMORY IS FULL"
⚠️ Warning: This refers to VoIPmonitor's internal packet buffer, NOT system RAM. The root cause is always a downstream bottleneck (disk I/O, CPU, or database) - simply increasing buffer sizes will only delay the problem.
Diagnosis and Solutions: See PACKETBUFFER Saturation Troubleshooting for comprehensive diagnostic steps including:
- How to identify I/O vs CPU bottleneck using syslog metrics
- Linux diagnostic tools (iostat, iotop, ioping)
- Decision matrix for identifying root cause
- Solutions for I/O, CPU, and database bottlenecks
Accessing Status Programmatically
Via Manager API
The sniffer_stat command returns JSON including pbStatString:
echo 'sniffer_stat' | nc -U /tmp/vm_manager_socket | jq '.pbStatString'
See Manager_API for connection details.
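For continuous monitoring, the same call can be wrapped in a loop and fed to any alerting system. A minimal sketch, assuming the socket path shown above:
# poll the sniffer every 10 seconds and print the raw status string
while true; do
  echo 'sniffer_stat' | nc -U /tmp/vm_manager_socket | jq -r '.pbStatString'
  sleep 10
done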
Via GUI
Navigate to Settings → Sensors → Status to see real-time metrics for all connected sensors.
See Also
- Manager_API - Query status programmatically
- Sniffer_detailed_architecture - Internal architecture overview
- SQL_queue_is_growing_in_a_peaktime - Database bottleneck troubleshooting
- Scaling - Performance tuning guide
- Database_troubleshooting - MySQL optimization