Data Cleaning

Revision as of 19:04, 4 January 2026

This guide explains how VoIPmonitor manages data retention for both captured packets (PCAP files) and Call Detail Records (CDRs) in the database. Proper configuration is essential for managing disk space and maintaining long-term database performance.

Overview

VoIPmonitor generates two primary types of data that require periodic cleaning:

PCAP Files: Raw packet captures of SIP/RTP/GRAPH data stored on the filesystem in the spool directory. These can consume significant disk space.
CDR Data: Call metadata stored in the MySQL database. Large tables can slow down GUI performance if not managed properly.

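The system uses two separate, independent mechanisms to manage the retention of this data: filesystem cleaning for the PCAP spool directory and database cleaning for the CDR tables. Each is described in its own section below.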

Filesystem Cleaning (PCAP Spool Directory)

The sensor stores captured call data in a structured directory tree on the local filesystem.

Spool Directory Location

By default, all data is stored in /var/spool/voipmonitor. This location can be changed by setting the spooldir option in voipmonitor.conf.

Retention Configuration

The cleaning process runs automatically every 5 minutes and removes the oldest data first, based on the rules you define in voipmonitor.conf. You can set limits on total size (in megabytes) or on age (in days). If both a size limit and a day limit are set for the same data type, whichever limit is reached first triggers the cleaning.

Parameter	Default Value	Description
`maxpoolsize`	102400 (100 GB)	The total maximum disk space for all captured data (SIP, RTP, GRAPH, AUDIO).
`maxpooldays`	(unset)	The maximum number of days to keep all captured data.
`maxpoolsipsize`	(unset)	A specific size limit for SIP PCAP files only.
`maxpoolsipdays`	(unset)	A specific age limit for SIP PCAP files only.
`maxpoolrtpsize`	(unset)	A specific size limit for RTP PCAP files only.
`maxpoolrtpdays`	(unset)	A specific age limit for RTP PCAP files only.
`maxpoolgraphsize`	(unset)	A specific size limit for GRAPH files only.
`maxpoolgraphdays`	(unset)	A specific age limit for GRAPH files only.
`maxpoolaudiosize`	(unset)	A specific size limit for converted audio files (WAV/OGG) only.
`maxpoolaudiodays`	(unset)	An age limit for converted audio files (WAV/OGG) only.
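
The size options take values in megabytes, and a size limit and an age limit can both apply to the same data type. The following is a minimal sketch of that arithmetic and of the "first limit reached" rule, not VoIPmonitor's actual implementation. The helper names gb_to_mb and limit_reached are hypothetical, and treating 0 as "unset" is an assumption made for illustration.

```shell
# Convert a size in GB to the MB value expected by maxpool*size options.
gb_to_mb() {
    echo $(( $1 * 1024 ))
}

# Hypothetical helper modelling the documented rule: cleaning is due as soon
# as either the size limit or the age limit is reached. 0 means "unset" here.
# Arguments: used_mb max_mb oldest_age_days max_days
limit_reached() {
    used_mb=$1; max_mb=$2; age_days=$3; max_days=$4
    if [ "$max_mb" -gt 0 ] && [ "$used_mb" -ge "$max_mb" ]; then
        echo "size"
    elif [ "$max_days" -gt 0 ] && [ "$age_days" -ge "$max_days" ]; then
        echo "age"
    else
        echo "none"
    fi
}

gb_to_mb 700                        # prints 716800
limit_reached 110000 102400 12 90   # prints "size"
```

In the second call the spool holds 110000 MB against a 102400 MB limit, so the size limit triggers cleaning even though the oldest data is only 12 days old.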

Troubleshooting: Disk Full / Files Disappearing

If you see errors when attempting to extract older calls from the GUI, or if call files are disappearing too quickly, your spool directory may have reached its size limit.

Diagnosis: Check Disk Usage

Identify the sensor/probe responsible for the missing data.
SSH into the sensor/probe and navigate to the spooldir.
Check the disk usage:

cd /var/spool/voipmonitor
du -h --max-depth=1 ./

# Example output:
# 150G    ./2025-01
# 120G    ./2024-12
# 90G     ./2024-11
# 360G    .

Compare with the configured limit:

grep maxpoolsize /etc/voipmonitor.conf
# Example output: maxpoolsize = 102400  (100 GB in MB)
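
The two checks above can be combined into one comparison. The sketch below assumes the configuration file uses plain `name = value` lines and that `du -sm` is available. The function names are hypothetical, and the paths in the usage comment are the defaults named in this guide.

```shell
# Print the last maxpoolsize value (in MB) found in a config file.
# Trailing comments and whitespace are stripped.
read_maxpoolsize() {
    awk -F= '/^[ \t]*maxpoolsize[ \t]*=/ {
        v = $2; sub(/#.*/, "", v); gsub(/[ \t]/, "", v); last = v
    } END { if (last != "") print last }' "$1"
}

# Print the disk usage of a directory in MB.
spool_usage_mb() {
    du -sm "$1" | awk '{ print $1 }'
}

# Report whether usage has reached the limit. Arguments: conf_file spool_dir
compare_spool() {
    limit=$(read_maxpoolsize "$1")
    used=$(spool_usage_mb "$2")
    if [ -z "$limit" ]; then
        echo "maxpoolsize not set in $1"
    elif [ "$used" -ge "$limit" ]; then
        echo "at limit: ${used} MB used, limit ${limit} MB"
    else
        echo "below limit: ${used} MB used, limit ${limit} MB"
    fi
}

# Example:
# compare_spool /etc/voipmonitor.conf /var/spool/voipmonitor
```

The pattern matches only the `maxpoolsize` option itself, so lines such as `maxpoolsipsize` are ignored.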

Resolution: Increase Spooldir Size

If the actual usage has reached or exceeds the configured limit, and the disk has capacity to spare, increase maxpoolsize:

# Edit /etc/voipmonitor.conf
[general]
maxpoolsize = 716800   # 700 GB in MB
maxpooldays = 90       # Optional: Keep data for last 90 days

Apply changes:

systemctl restart voipmonitor

Maintenance: Re-indexing the Spool Directory

VoIPmonitor maintains an index of all created PCAP files to perform cleaning efficiently without scanning the entire directory tree. If this index becomes corrupt, or if you manually move files into the spool, old data may not be deleted correctly.

To trigger a manual re-index via the manager API:

# Open a manager API session
echo 'manager_file start /tmp/vmsck' | nc 127.0.0.1 5029

# Send the re-index command
echo reindexfiles | nc -U /tmp/vmsck

Note: This command requires netcat with support for UNIX sockets (-U). For alternative methods, see the Manager API documentation.

Database Cleaning (CDR Retention)

Managing the size of the cdr table and other large tables is critical for GUI performance.

Partitioning Method (Recommended)

Since version 7, VoIPmonitor utilizes database partitioning, which splits large tables into smaller, daily segments. This is the recommended method for managing database retention.

Aspect	Description
How it works	Set `cleandatabase = 30` in `voipmonitor.conf` to keep the last 30 days of data.
Why it's better	Dropping an old partition is near-instantaneous regardless of row count and puts almost no load on the database, unlike a row-by-row DELETE.
Requirement	Partitioning is enabled by default on new installations.
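
To see which daily partitions currently exist for a table, you can query MySQL's information_schema.PARTITIONS view. This is a sketch only: the helper name partition_query is hypothetical, and the schema name voipmonitor and table name cdr are assumptions you should adjust to your installation. TABLE_ROWS is an estimate for InnoDB tables.

```shell
# Build a query listing one table's partitions with row estimates and sizes.
# Arguments: schema table
partition_query() {
    cat <<EOF
SELECT PARTITION_NAME, TABLE_ROWS,
       ROUND(DATA_LENGTH / 1024 / 1024) AS data_mb
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = '$1' AND TABLE_NAME = '$2'
ORDER BY PARTITION_ORDINAL_POSITION;
EOF
}

# Print the query; pipe it to the mysql client to run it, for example:
#   partition_query voipmonitor cdr | mysql -u root -p
partition_query voipmonitor cdr
```

With a retention setting in effect, the oldest partition in the output should be no older than roughly the configured number of days.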

Quick Start: Global Retention

For most deployments, configure one parameter in /etc/voipmonitor.conf:

# Keep all records for 30 days
cleandatabase = 30

The cleandatabase parameter acts as a global default for all cleandatabase_* options and applies to:

cdr - Call Detail Records
message - SIP MESSAGE texts
sip_msg - SIP OPTIONS/SUBSCRIBE/NOTIFY messages
register_state - SIP registration states
register_failed - Failed registration attempts

Retention Parameters

Parameter	Default	Description
`cleandatabase`	0 (disabled)	Master retention setting in days.
`cleandatabase_cdr`	0	Specific retention for `cdr` and `message` tables.
`cleandatabase_rtp_stat`	2	Retention for detailed RTP statistics.
`cleandatabase_sip_msg`	0	Retention for OPTIONS/SUBSCRIBE/NOTIFY.
`cleandatabase_size`	(unset)	Alternative: size-based limit in MB (requires version 2024.05.1+).
`partition_operations_enable_fromto`	1-5	Time window for partition operations (e.g., 1-5 AM).

More details: Sniffer Configuration - Database Cleaning.

Legacy Method: Manual Deletion (Not Recommended)

For very old, non-partitioned databases, you would need custom scripts with DELETE FROM cdr WHERE calldate < ... queries.

Warning: Manual DELETE on large tables is extremely slow and resource-intensive. A single operation on millions of rows can take hours and impact GUI performance.

Troubleshooting Disk Space Issues

Disk Space Not Reclaimed After Cleanup

If automatic cleanup runs but disk space is not freed from the MySQL data directory, check the innodb_file_per_table setting:

SHOW GLOBAL VARIABLES LIKE 'innodb_file_per_table';

Value	Behavior
ON	Each table/partition has its own `.ibd` file. Dropping partitions reclaims space immediately.
OFF	All data in shared `ibdata1` file. Dropping partitions does not reduce file size.
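
When the setting is ON, each partition appears as its own .ibd file, so listing the largest files shows where the space is held. The sketch below assumes read access to the MySQL data directory (usually root only). The function name largest_ibd is hypothetical, and the path in the usage comment is an assumed location that depends on your datadir setting.

```shell
# List the N largest .ibd files in a database directory as "size_kb path".
# Arguments: db_dir [count]  (count defaults to 10)
largest_ibd() {
    for f in "$1"/*.ibd; do
        [ -f "$f" ] || continue          # skip when the glob matches nothing
        size=$(wc -c < "$f")
        printf '%s %s\n' "$(( size / 1024 ))" "$f"
    done | sort -rn | head -n "${2:-10}"
}

# Example (assumed data directory path):
# largest_ibd /var/lib/mysql/voipmonitor 5
```

If the directory contains few or small .ibd files while ibdata1 is large, the data most likely lives in the shared tablespace.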

Solutions

Option 1: Enable for Future Tables

Add to /etc/my.cnf or /etc/mysql/my.cnf:

[mysqld]
innodb_file_per_table = 1

Then restart MySQL (the service may be named mysqld or mariadb, depending on the distribution):

systemctl restart mysql

Note: This only affects NEW tables/partitions. Existing data in ibdata1 remains.

Option 2: Reclaim Space from Existing Tables

OPTIMIZE TABLE cdr;

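Warning: OPTIMIZE TABLE rebuilds the table, so it requires enough free disk space to hold a full copy of the table data and may fail if the disk is nearly full. With innodb_file_per_table enabled, the rebuilt table moves into its own .ibd file, but the shared ibdata1 file itself does not shrink.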

Option 3: Export and Re-import

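# 1. Dump the database (needs enough free space for the dump file)
mysqldump -u root -p voipmonitor > voipmonitor_backup.sql
# 2. Drop and recreate the database; run this only after verifying the dump
mysql -u root -p -e "DROP DATABASE voipmonitor; CREATE DATABASE voipmonitor;"
# 3. Re-import; tables are recreated under the current innodb_file_per_table setting
mysql -u root -p voipmonitor < voipmonitor_backup.sql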

Monitoring Database Health

SQL Queue Metrics

The sensor tracks queue metrics visible in GUI → Settings → Sensors → Status:

Metric	Description	Healthy Range
SQLq	CDRs waiting to be written to database	Near 0, sporadic spikes OK
SQLf	Failed database write attempts	Zero (not growing)

Consistently high/growing SQLq → database cannot keep up
Non-zero/growing SQLf → database errors or connectivity issues
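
The difference between a sporadic spike and a steadily growing queue can be illustrated with a few SQLq samples noted at regular intervals. The classification rule and the function name sqlq_trend below are illustrative assumptions, not a VoIPmonitor feature, and the sample values are made up.

```shell
# Classify a series of SQLq samples (oldest first).
# Prints "growing" when the queue never shrinks and ends higher than it
# started, otherwise "ok".
sqlq_trend() {
    first=$1; prev=$1; shrank=0
    for v in "$@"; do
        [ "$v" -lt "$prev" ] && shrank=1
        prev=$v
    done
    if [ "$shrank" -eq 0 ] && [ "$prev" -gt "$first" ]; then
        echo "growing"
    else
        echo "ok"
    fi
}

sqlq_trend 10 50 200 900   # prints "growing"
sqlq_trend 0 120 0 3       # prints "ok"
```

The second series contains a spike to 120 that drains back to 0, which matches the "sporadic spikes OK" case in the table above.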

See SQL Queue Troubleshooting for details.

System Monitoring

# Check system load
cat /proc/loadavg

# Monitor disk I/O (shows only active processes)
iotop -o

High I/O from mysqld processes may indicate slow storage or poorly tuned MySQL settings.

MySQL Performance Settings

For high-performance operation with partitioning:

[mysqld]
# Use 50-70% of available RAM for caching
innodb_buffer_pool_size = 4G

# Write the redo log at each commit but flush it to disk only about once per second
# (faster; an OS crash or power loss can lose up to about a second of writes)
innodb_flush_log_at_trx_commit = 2

# Enable per-table tablespaces (.ibd files) for easy space reclamation
innodb_file_per_table = 1
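
The 50-70% guideline can be turned into concrete numbers for a given host. This is a rough sketch that assumes a Linux host dedicated to MySQL and uses total RAM as the basis. The function name buffer_pool_range_mb is hypothetical.

```shell
# Print the 50% and 70% marks of total RAM in MB, given MemTotal in kB.
buffer_pool_range_mb() {
    total_mb=$(( $1 / 1024 ))
    echo "$(( total_mb * 50 / 100 )) $(( total_mb * 70 / 100 ))"
}

# On Linux, MemTotal is reported in kB in /proc/meminfo:
# buffer_pool_range_mb "$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)"
buffer_pool_range_mb 8388608   # 8 GiB host: prints "4096 5734"
```

The 4G value in the example configuration corresponds to the lower end of this range on an 8 GiB host. Leave more headroom if the sensor or GUI runs on the same machine.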

For comprehensive tuning, see Scaling and Performance Guide.

AI Summary for RAG

Summary: VoIPmonitor has two independent data retention mechanisms: (1) Filesystem cleaning for PCAP files using maxpoolsize/maxpooldays parameters, running every 5 minutes to delete oldest data first; (2) Database cleaning using cleandatabase parameter with daily partitioning for instant partition drops. Troubleshooting covers disk full scenarios (check with du -h --max-depth=1, increase maxpoolsize), space not reclaimed issues (innodb_file_per_table setting), and database health monitoring (SQLq/SQLf metrics).

Keywords: data retention, cleaning, delete old calls, disk space, spooldir, maxpoolsize, maxpooldays, cleandatabase, partitioning, reindexfiles, innodb_file_per_table, SQLq, SQLf

Key Questions:

How do I automatically delete old PCAP files?
What is the difference between maxpoolsize and maxpooldays?
My spool directory is full but old files are not deleted. How do I fix it?
How do I configure database retention with cleandatabase?
Why is disk space not reclaimed after MySQL cleanup?
What do SQLq and SQLf metrics mean?