{{DISPLAYTITLE:Understanding the Sniffer's Performance Log}}


'''This guide provides a detailed explanation of the real-time performance and status logs generated by the VoIPmonitor sensor. Understanding these metrics is essential for diagnosing bottlenecks, tuning performance, and ensuring the health of your capture environment.'''


== Accessing and Managing the Logs ==
By default, the VoIPmonitor sensor sends all its status messages to the system's `syslog` service using the `daemon` facility.


=== Default Log Locations ===
*'''Debian/Ubuntu:''' <code>/var/log/syslog</code>
*'''CentOS/RHEL/AlmaLinux:''' <code>/var/log/messages</code>


You can watch the log in real-time with the `tail` command:
<pre>tail -f /var/log/syslog | grep voipmonitor</pre>


=== Redirecting Logs to a Custom File ===
For easier analysis, you can redirect all VoIPmonitor logs to a separate file.
;Edit `/etc/rsyslog.conf` and add the following lines at the top:
<pre>
# Send all messages from the 'voipmonitor' program to a dedicated file
if $programname == 'voipmonitor' then /var/log/voipmonitor.log

# Stop processing these messages further to avoid duplication
& ~
</pre>
Remember to restart the `rsyslog` service after making changes.
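On a systemd-based distribution the restart typically looks like this (service and file names can differ slightly between distributions):
<pre>
# Apply the new rsyslog rule
systemctl restart rsyslog

# Confirm that voipmonitor messages now arrive in the dedicated file
tail -n 5 /var/log/voipmonitor.log
</pre>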


== Anatomy of the Status Line ==
Every 10 seconds, the sensor outputs a detailed status line. Here is a breakdown of each component; together they act as a real-time dashboard for your sensor's health.


A sample log line:
<code>voipmonitor[15567]: calls[315][355] PS[C:4 S:29 R:6354] SQLq[0] heap[0|0|0] [12.6Mb/s] t0CPU[5.2%] t2CPU[pb:43/d:19] tRTP_CPU[658%/46m/15t] tarQ[1865] RSS/VSZ[323|752]MB LA[11.9]</code>
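If you only want to follow one field over time rather than the whole line, you can filter it out of the live stream. A minimal sketch using `grep`, assuming the default Debian/Ubuntu log path and the `heap` field as the example metric:
<pre>
# Print only the heap[...] portion of each incoming voipmonitor status line
tail -f /var/log/syslog | grep --line-buffered voipmonitor | grep --line-buffered -o 'heap\[[^]]*\]'
</pre>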


=== Overall Call & Packet Statistics ===


;<code>calls[14527,r:5241][14428,r:5350]</code>
:The number of call structures currently held in memory.
:* '''First Bracket:''' Active calls being processed. `14527` SIP calls and `r:5241` REGISTER dialogs.
:* '''Second Bracket:''' Calls in the final processing queue before being cleared.


;<code>PS[C:29 S:2017 R:327512 A:269886]</code>
:'''P'''ackets per '''S'''econd. This shows the rate of incoming traffic.
:* '''C:''' New '''C'''alls per second.
:* '''S:''' Valid '''S'''IP packets per second.
:* '''R:''' '''R'''TP packets per second from monitored calls.
:* '''A:''' '''A'''ll packets per second seen by the interface.


;<code>[12.6Mb/s]</code>
:The total network throughput being processed by the sniffer in Megabits per second.


=== Database Health ===


;<code>SQLq[C:0 M:0 R:0]</code>
:The number of SQL queries waiting in the queue for each data type.
:* '''C:''' CDRs, '''M:''' Messages, '''R:''' Registers. Depending on the sensor configuration, additional queues such as '''L''' (live sniffer) and '''Cl''' (cleanspool) may also be shown, and the metric is reported as '''SQLf''' when <code>query_cache</code> is enabled.
:* '''What to watch for:''' If these numbers are consistently high and growing, your MySQL database cannot keep up with the write load from the sensor. This is a critical indicator of a database bottleneck. See the [[Scaling|Performance Tuning Guide]] for solutions.
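To tell a short spike apart from a steadily growing backlog, you can pull the recent SQLq values out of the log file; a quick sketch (adjust the path if you redirected the log as described above):
<pre>
# Show the SQLq[...] field from the most recent status lines
grep -o 'SQLq\[[^]]*\]' /var/log/syslog | tail -n 30
</pre>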


=== CPU Load & Thread Performance ===
This is the most critical section for diagnosing performance bottlenecks.


;<code>t0CPU[5.2%]</code>
:'''The single most important CPU metric.''' This is the CPU usage of Thread 0, which is solely responsible for reading packets from the network card.
:* '''What to watch for:''' If `t0CPU` consistently exceeds '''90-95%''', your server is at its absolute limit for packet capture. No amount of other tuning will help; you have either a CPU or a network stack bottleneck that requires a faster CPU core or a kernel-bypass solution like [[High-Performance_Packet_Capture_with_DPDK|DPDK]].


;<code>t1CPU[1.2%]</code>
:CPU usage for Thread 1, which moves packets from the capture buffer into internal processing queues.
 
;<code>t2CPU[pb:43.8/d:19.5/s:12.9/...]</code>
:CPU usage for the main processing pipeline, broken down by task. This helps identify where the most time is spent.
:* '''pb:''' Reading packets out of the packet buffer.
:* '''d:''' Creating structures for further processing.
:* '''s:''' Parsing '''S'''IP packets.
:* '''e:''' Searching calls/messages by Call-ID and creating call structures.
:* '''c:''' Handling '''c'''all/message logic.
:* '''g:''' Processing re'''g'''isters.
:* '''r:''' Processing '''R'''TP (with '''rm''', '''rh''' and '''rd''' covering packet shifting, hash lookup and hand-off to the read queue).
:* VoIPmonitor spawns these `t2` threads automatically as load increases: when a stage stays above roughly 50% CPU, the next stage is split off into its own thread (`d` after `pb`, then `s`, `e`, `c`, `g`, `r`), and threads are removed again once their load drops.
 
;<code>tRTP_CPU[658.8%/46.7m/15t]</code>
:A summary of all RTP processing threads.
:* '''658.8%:''' Sum of CPU usage across all RTP threads.
:* '''46.7m:''' The '''m'''aximum CPU usage of the busiest single RTP thread.
:* '''15t:''' The total number of active RTP processing '''t'''hreads.
 
;<code>tarCPU[3.3|4.8|9.6|...]</code>
:CPU usage for each thread responsible for compressing and writing `.tar` archives to disk. The number of compression threads grows automatically with load.
 
;<code>LA[11.90 10.93 10.71/8h]</code>
:The system '''L'''oad '''A'''verage for the last 1, 5, and 15 minutes, followed by the number of CPU cores and a flag for hyperthreading (`h`).
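To cross-check the per-thread figures above against the operating system's own view, you can inspect the sensor's threads directly; a minimal sketch with standard tools, assuming a single running voipmonitor process:
<pre>
# Show live per-thread CPU usage for the voipmonitor process
top -H -p "$(pidof voipmonitor)"
</pre>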
 
=== Memory & Buffer Status ===
 
;<code>heap[0|0|0]</code>
:Shows the percentage fullness of the three main memory buffers.
:* '''First value (A):''' Main packet heap. '''What to watch for:''' If this reaches 100%, it means the sniffer cannot process packets as fast as they are coming in, and packet loss will occur. This indicates a severe CPU or I/O bottleneck.
:* '''Second value (B):''' Kernel ring buffer.
:* '''Third value (C):''' Asynchronous disk write buffer. If this reaches 100%, disk I/O is blocking; the heap will then grow, the ring buffer will fill, and packet loss will follow.
 
;<code>comp[23]</code>
:The current compression ratio being achieved for in-memory packet buffers (if `packetbuffer_compress` is enabled).
 
;<code>tarQ[1865]</code>
:The number of call recordings (PCAP files) waiting in the '''q'''ueue to be written to a `.tar` archive. A related '''tarB''' value, when shown, reports the megabytes currently held in the tar buffer.
 
;<code>RSS/VSZ[323|752]MB</code>
:Standard Linux memory metrics.
:* '''RSS (Resident Set Size):''' The actual physical memory (RAM) the process is currently consuming. This is the most accurate measure of memory usage.
:* '''VSZ (Virtual Memory Size):''' The total address space the process has reserved, including memory-mapped files and shared libraries.
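These figures can be compared against what the kernel reports for the process; note that `ps` prints RSS and VSZ in kilobytes rather than the megabytes shown in the log:
<pre>
# Report PID, RSS and VSZ (in KiB) for all voipmonitor processes
ps -C voipmonitor -o pid,rss,vsz,cmd
</pre>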
 
== AI Summary for RAG ==
'''Summary:''' This document is a reference guide for interpreting the real-time performance status line logged by the VoIPmonitor sensor to syslog. It explains where to find the logs on different Linux distributions and how to redirect them to a custom file using `rsyslog`. The guide provides a detailed breakdown of each metric in the status line, grouping them into logical categories. The "Overall Call & Packet Statistics" section covers `calls` and `PS` (Packets per Second). "Database Health" focuses on the critical `SQLq` (SQL Queue) metric, which indicates database write performance. The "CPU Load" section provides an in-depth explanation of the most important thread metrics, especially `t0CPU` (the master packet capture thread), `t2CPU` (the processing pipeline), and `tRTP_CPU` (RTP processing threads). "Memory & Buffer Status" explains the `heap`, `tarQ` (TAR Queue), and `RSS` (Resident Set Size) metrics. For each key metric, the guide includes a "What to Watch For" tip, explaining how to use the value to diagnose specific system bottlenecks like CPU limitations, database overload, or I/O issues.
'''Keywords:''' log, logging, syslog, messages, status line, performance, metrics, diagnostics, troubleshooting, calls, ps, packets per second, SQLq, SQL queue, heap, t0CPU, t2CPU, tRTP_CPU, tarCPU, RSS, VSZ, load average, bottleneck, cpu, database, memory
'''Key Questions:'''
* How do I read the VoIPmonitor sniffer log?
* What do the metrics in the syslog status line mean?
* How can I tell if my database is a bottleneck? (Check `SQLq`)
* How do I know if my server can handle the traffic? (Check `t0CPU`)
* What does `t0CPU` represent?
* What does the `heap` metric mean?
* How can I diagnose performance problems with my sniffer?
* Where are the VoIPmonitor log files located?
