Data Cleaning

From VoIPmonitor.org
Revision as of 05:24, 4 January 2026 by Admin (talk | contribs) (Add missing info: cleandatabase_size, disk space troubleshooting (innodb_file_per_table), SQLq/SQLf monitoring, iotop/loadavg monitoring)

This guide explains how VoIPmonitor manages data retention for both captured packets (PCAP files) and Call Detail Records (CDRs) in the database. Proper configuration is essential for managing disk space and maintaining long-term database performance.

Overview of Data Cleaning

VoIPmonitor generates two primary types of data that require periodic cleaning:

  • PCAP Files: Raw packet captures of SIP/RTP/GRAPH data stored on the filesystem in the spool directory. These can consume significant disk space.
  • CDR Data: Call metadata stored in the MySQL database. Large tables can slow down GUI performance if not managed properly.

The system uses two separate, independent mechanisms to manage the retention of this data.

1. Filesystem Cleaning (PCAP Spool Directory)

The sensor stores captured call data in a structured directory tree on the local filesystem.

Spool Directory Location

By default, all data is stored in `/var/spool/voipmonitor`. This location can be changed by setting the `spooldir` option in `voipmonitor.conf`.

Retention Configuration

The cleaning process runs automatically every 5 minutes and removes the oldest data based on the rules you define in `voipmonitor.conf`. You can set limits based on total size (in Megabytes) or age (in days). If both a size and day limit are set for the same data type, the first limit that is reached will trigger the cleaning.

The following options are available:

Parameter Default Value Description
`maxpoolsize` `102400` (100 GB) The total maximum disk space for all captured data (SIP, RTP, GRAPH, AUDIO).
`maxpooldays` (unset) The maximum number of days to keep all captured data.
`maxpoolsipsize` (unset) A specific size limit for SIP PCAP files only.
`maxpoolsipdays` (unset) A specific age limit for SIP PCAP files only.
`maxpoolrtpsize` (unset) A specific size limit for RTP PCAP files only.
`maxpoolrtpdays` (unset) A specific age limit for RTP PCAP files only.
`maxpoolgraphsize` (unset) A specific size limit for GRAPH files only.
`maxpoolgraphdays` (unset) A specific age limit for GRAPH files only.
`maxpoolaudiosize` (unset) A specific size limit for converted audio files (WAV/OGG) only.
`maxpoolaudiodays` (unset) An age limit for converted audio files (WAV/OGG) only.

Maintenance: Re-indexing the Spool Directory

VoIPmonitor maintains an index of all created PCAP files to perform cleaning efficiently without scanning the entire directory tree. If this index becomes corrupt, or if you manually move files into the spool, old data may not be deleted correctly.

In this case, you must trigger a manual re-index. This can be done via the sniffer's manager API.

  1. Open a manager API session:
  2. echo 'manager_file start /tmp/vmsck' | nc 127.0.0.1 5029
  3. Send the re-index command:
  4. echo reindexfiles | nc -U /tmp/vmsck

Note: This command requires `netcat` with support for UNIX sockets (`-U`). For alternative methods, see the Manager API documentation.

2. Database Cleaning (CDR Retention)

Managing the size of the `cdr` table and other large tables is critical for GUI performance.

The Modern Method: Partitioning (Recommended)

Since version 7, VoIPmonitor utilizes **database partitioning**, which splits large tables into smaller, daily segments. This is the highly recommended method for managing database retention.

  • How it works: You set a single parameter, `cleandatabase`, in `voipmonitor.conf`. This defines the number of days to keep CDRs. For example, `cleandatabase = 30` will keep the last 30 days of data.
  • Why it's better: The sniffer automatically drops old daily partitions, which is an instantaneous operation that takes milliseconds, regardless of how many millions of rows it contains. This puts zero load on the database.
  • Requirement: Partitioning is enabled by default on all new installations. If you are upgrading from a very old version, it may require starting with a fresh database.

More details can be found in the Sniffer Configuration guide.

Database Retention Parameters

Parameter Default Value Description
`cleandatabase` `0` (disabled) Master retention setting in days for CDRs and several other tables.
`cleandatabase_cdr` `0` (disabled, uses cleandatabase) Specific retention period for `cdr` and `message` tables.
`cleandatabase_rtp_stat` `2` days Retention in days for detailed RTP statistics.
`cleandatabase_sip_msg` `0` (disabled) Retention for SIP OPTIONS/SUBSCRIBE/NOTIFY messages.
`cleandatabase_size` (unset) Alternative cleaning method that removes old data to stay below a total database size limit (in Megabytes). Requires sniffer version 2024.05.1 or newer.
`partition_operations_enable_fromto` `1-5` Restricts partition-dropping operations to a specific time window (e.g., 1 AM to 5 AM) to avoid impacting performance during peak hours.

The Legacy Method: Manual Deletion (Not Recommended)

If you are running a very old, non-partitioned database, you cannot use the `cleandatabase` option. You would need to create a custom script that runs a `DELETE FROM cdr WHERE calldate < ...` query.

  • Warning: This method is extremely slow and resource-intensive on large tables. A single `DELETE` operation on millions of rows can take hours and generate significant I/O load on your database server, potentially impacting GUI performance.

3. Troubleshooting Disk Space Issues

Disk Space Not Reclaimed After Cleanup

If you have configured automatic cleanup but disk space is not being freed from the MySQL data directory, the issue is likely related to the MySQL `innodb_file_per_table` setting. This setting controls whether each InnoDB table is stored in its own `.ibd` file or in the shared `ibdata1` file.

Check the Current Setting:

mysql> SHOW GLOBAL VARIABLES LIKE 'innodb_file_per_table';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| innodb_file_per_table | ON    |
+-----------------------+-------+

What This Means:

  • If value is ON: Each table/partition is stored in its own file. When a partition is dropped, the file is deleted immediately, and disk space is reclaimed.
  • If value is OFF when database was created: All table data is stored in the shared `ibdata1` file. Dropping partitions or rows will not reduce this file size.

Solutions:

Option 1: Enable innodb_file_per_table (Preventative) Add to `/etc/my.cnf` or `/etc/mysql/my.cnf`:

[mysqld]
innodb_file_per_table = 1

After changing this setting, restart MySQL:

service mysql restart

Note: This only affects NEW tables/partitions. Existing data in `ibdata1` will remain.

Option 2: Reclaim Space from Existing Tables If `innodb_file_per_table` is ON and you have partitioned tables, run:

OPTIMIZE TABLE cdr;

Warning: This requires significant free disk space to duplicate the table data. If your disk is already nearly full, this command will fail or crash the database.

Option 3: Export and Re-import (Complete Reclaim) If `innodb_file_per_table` was OFF when the database was created and you must reclaim space:

mysqldump -u root -p voipmonitor > voipmonitor_backup.sql
mysql -u root -p -e "DROP DATABASE voipmonitor; CREATE DATABASE voipmonitor;"
mysql -u root -p voipmonitor < voipmonitor_backup.sql

Monitoring Database Health

To proactively monitor database health and detect issues before they become critical, use the following metrics and tools.

SQL Queue Metrics (SQLq/SQLf)

The VoIPmonitor sensor tracks database queue metrics that indicate if the database is falling behind processing CDRs. These are visible in:

  • GUI: Settings → Sensors → Status (expanded status line)
  • RRD files (if using collectd or similar)
  • Sniffer's internal health monitoring logs
Metric Description Healthy Range
SQLq SQL Queue count - number of CDRs waiting to be written to database Near 0, sporadic spikes
SQLf SQL Failed count - number of failed database write attempts Zero (or very low, not growing)

Interpreting SQLq/SQLf:

  • Occasionally high SQLq during traffic spikes is normal
  • Consistently high or growing SQLq indicates the database cannot keep up
  • Non-zero or growing SQLf indicates database errors or connectivity issues

For detailed troubleshooting of high SQL queue, see SQL Queue Troubleshooting.

System Load Monitoring

Check overall system load average:

cat /proc/loadavg

The three numbers represent 1-minute, 5-minute, and 15-minute load averages. Consistently high load averages indicate CPU, I/O, or memory bottlenecks.

Disk I/O Monitoring

Use `iotop` to identify processes causing disk I/O bottlenecks:

iotop -o

The `-o` flag shows only processes with active I/O. Look for high I/O from `mysql` or `mysqld` processes, which may indicate slow storage or poorly tuned MySQL settings.

4. Recommended MySQL Performance Settings

For high-performance database operations (especially with partitioning enabled), ensure your MySQL configuration includes these critical settings:

[mysqld]
# Use 50-70% of available RAM for caching
innodb_buffer_pool_size = 4G

# Flush logs to OS every second (faster, safe for VoIPmonitor)
innodb_flush_log_at_trx_commit = 2

# Enable per-table filespace for easy space reclamation
innodb_file_per_table = 1

For comprehensive performance tuning guidelines, see the Scaling and Performance guide.

AI Summary for RAG

Summary: This article explains the two distinct data retention mechanisms in VoIPmonitor: filesystem cleaning for PCAP files and database cleaning for CDRs. For filesystem storage in the `spooldir`, it details the various `maxpoolsize` and `maxpooldays` configuration options that control the retention of SIP, RTP, and other files based on size or age. It also describes the `reindexfiles` manager command for troubleshooting cases where old files are not being deleted. For database retention, it explains the modern partitioning approach with `cleandatabase` parameters (day-based and size-based), partitioning configuration, and scheduling via `partition_operations_enable_fromto`. It includes troubleshooting guidance for disk space not being reclaimed due to `innodb_file_per_table` settings, with solutions including OPTIMIZE TABLE and database export/import. It also provides monitoring guidance using SQLq/SQLf metrics from health charts and system monitoring tools like `iotop` and `cat /proc/loadavg`, and recommends key MySQL performance settings. Keywords: data retention, cleaning, delete old calls, purge data, disk space, spooldir, maxpoolsize, maxpooldays, pcap, filesystem, database, cdr, cleandatabase, cleandatabase_size, partitioning, reindexfiles, manager api, innodb_file_per_table, SQLq, SQLf, monitoring, iotop, database health, mysql performance Key Questions:

  • How do I automatically delete old PCAP files to free up disk space?
  • What is the difference between `maxpoolsize` and `maxpooldays`?
  • My spool directory is full, but old files are not being deleted. How do I fix it?
  • How do I automatically delete old CDRs from the database?
  • What is the `cleandatabase` option and how does it work?
  • Why is database partitioning important for VoIPmonitor?
  • What is `cleandatabase_size` and how does it differ from `cleandatabase`?
  • Disk space is not reclaimed after cleanup. What should i check?
  • What do SQLq and SQLf metrics mean in the health status?
  • How do I monitor database health and performance?
  • Why does MySQL not free disk space after dropping partitions?