Data Cleaning
Revision as of 14:46, 8 January 2026
This guide explains how VoIPmonitor manages data retention for PCAP files and database records.
Overview
VoIPmonitor generates two types of data requiring periodic cleanup:
| Data Type | Storage | Cleanup Mechanism | Key Parameters |
|---|---|---|---|
| PCAP Files | Filesystem (spool directory) | Cleanspool process | maxpoolsize, maxpooldays |
| CDR Records | MySQL database | Partition dropping | cleandatabase |
These are independent systems - filesystem cleanup does not affect database records and vice versa.
Filesystem Cleaning (PCAP Files)
How Cleanspool Works
The sniffer maintains a complete file index in memory during operation. Every 5 minutes, the cleanspool thread checks retention limits and deletes the oldest files when limits are exceeded.
The .cleanspool_cache files in hourly directories (e.g., /var/spool/voipmonitor/2026-01-01/15/.cleanspool_cache) serve as persistent storage for fast restart - they allow quick index reload without scanning the entire directory structure.
| Note: Legacy Indexing Method |
|---|
| Older VoIPmonitor versions used a different indexing method where file metadata was stored in the MySQL files table instead of in memory. This legacy behavior can be enabled with cleanspool_use_files = yes (the parameter name refers to the SQL files table, not the .cleanspool_cache files). The default is cleanspool_use_files = no, which uses the current in-memory indexing. The legacy method is deprecated and should not be used. |
Retention Configuration
Retention limits can be set by size (MB) or age (days). When both are configured, the first limit reached triggers cleanup.
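The "first limit reached" behavior can be pictured with a small model. The sketch below is illustrative only and is not the sniffer's implementation: the `SpoolFile` class and `select_for_deletion` function are hypothetical names, and it assumes files are removed oldest-first until neither the size limit nor the age limit is exceeded.

```python
from dataclasses import dataclass


@dataclass
class SpoolFile:
    # Hypothetical record for one indexed spool file
    path: str
    size_mb: float
    age_days: float


def select_for_deletion(files, max_size_mb=None, max_days=None):
    """Return the files a combined size/age policy would delete, oldest first."""
    remaining = sorted(files, key=lambda f: f.age_days, reverse=True)
    total_mb = sum(f.size_mb for f in remaining)
    doomed = []
    while remaining:
        oldest = remaining[0]
        over_size = max_size_mb is not None and total_mb > max_size_mb
        over_age = max_days is not None and oldest.age_days > max_days
        if not (over_size or over_age):
            break  # neither limit is exceeded any more
        doomed.append(remaining.pop(0))
        total_mb -= oldest.size_mb
    return doomed


files = [
    SpoolFile("a.pcap", 60, 10),
    SpoolFile("b.pcap", 60, 5),
    SpoolFile("c.pcap", 60, 1),
]
# Size limit reached first: 180 MB exceeds 100 MB, so the two oldest files go
print([f.path for f in select_for_deletion(files, max_size_mb=100, max_days=30)])
# Age limit reached first: only the 10-day-old file exceeds 7 days
print([f.path for f in select_for_deletion(files, max_size_mb=1000, max_days=7)])
```

In the first call the age limit never comes into play, and in the second the size limit never does, which is the practical meaning of whichever limit is reached first driving the cleanup.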
Global Limits
| Parameter | Default | Description |
|---|---|---|
| maxpoolsize | 102400 (100 GB) | Maximum total size for all PCAP data |
| maxpooldays | (unset) | Maximum age in days for all PCAP data |
Per-Type Limits
You can set different retention for each data type:
| Data Type | Size Parameter | Days Parameter |
|---|---|---|
| SIP signaling | maxpoolsipsize | maxpoolsipdays |
| RTP audio | maxpoolrtpsize | maxpoolrtpdays |
| Quality graphs | maxpoolgraphsize | maxpoolgraphdays |
| Converted audio | maxpoolaudiosize | maxpoolaudiodays |
Recommended Configuration
For most deployments, limit RTP (which consumes most space) while keeping SIP longer:
# Limit RTP to 100 GB (deleted when exceeded)
maxpoolrtpsize = 102400
# Overall limit for all data
maxpoolsize = 512000
This keeps SIP signaling (small files, useful for troubleshooting) as long as overall space allows, while limiting large RTP files.
Verifying Active Cleanup Rules
Check which rule triggered cleanup:
journalctl -u voipmonitor | grep -i clean
Log messages indicate: clean_maxpoolsize, clean_maxpooldays, clean_maxpoolrtpdays, etc.
Emergency Cleanup
Emergency cleanup acts as a safety mechanism when disk is nearly full:
| Parameter | Default | Triggers When |
|---|---|---|
| autocleanspoolminpercent | 1 | Disk usage reaches 99% (free space below 1%) |
| autocleanmingb | 5 | Free space below 5 GB |
When triggered, oldest data is deleted aggressively regardless of maxpool* settings until thresholds are cleared.
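To see how close a disk is to these thresholds, the two conditions can be evaluated by hand. This is a rough sketch, not the sniffer's code: `emergency_thresholds` is a hypothetical helper, GB is assumed to mean 1024^3 bytes, and because this page does not state how the sniffer combines the two conditions, the sketch reports each one separately.

```python
import shutil

GIB = 1024 ** 3  # assumption: "GB" in autocleanmingb means 1024^3 bytes


def emergency_thresholds(total_bytes, free_bytes, min_percent=1, min_gb=5):
    """Report which of the two default thresholds the free space has crossed."""
    free_percent = 100 * free_bytes / total_bytes
    return {
        "below_min_percent": free_percent < min_percent,
        "below_min_gb": free_bytes < min_gb * GIB,
    }


# Replace "/" with the spool directory, e.g. /var/spool/voipmonitor
usage = shutil.disk_usage("/")
print(emergency_thresholds(usage.total, usage.free))
```

On small disks the 5 GB threshold is crossed well before the 1% threshold: a 100 GB volume with 4 GB free still has 4% free space.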
Reducing Data at Source
Before tuning retention, consider reducing data volume:
Save RTP Headers Only
If you only need call quality statistics (MOS, jitter, packet loss) without audio playback:
savertp = header
This reduces storage by up to 90% while preserving all quality metrics.
Selective Audio Recording
To record full audio only for specific calls (legal holds, VIP customers):
- Set global default to headers only: savertp = header
- Create capture rules in GUI (Control Panel > Capture Rules) with recordRTP=ON for exceptions
See Capture_rules for details.
The maxpool_clean_obsolete Parameter
Controls handling of files not in the index:
| Setting | Behavior |
|---|---|
| maxpool_clean_obsolete = no (default) | Only delete indexed files. Unknown files are preserved. |
| maxpool_clean_obsolete = yes | Delete ALL files in spool, including unindexed ones. |
Database Cleaning (CDR Records)
Partitioning Method
VoIPmonitor uses daily partitioning for database tables. Dropping old partitions is instant (milliseconds) regardless of row count.
# Keep CDR records for 30 days
cleandatabase = 30
| Parameter | Default | Description |
|---|---|---|
| cleandatabase | 0 (disabled) | Global retention in days |
| cleandatabase_cdr | 0 | CDR table retention |
| cleandatabase_register_state | 0 | Registration state retention |
| cleandatabase_register_time_info | 0 | Registration timing (must be set explicitly) |
| partition_operations_enable_fromto | 1-5 | Time window for partition operations |
| Important: register_time_info |
|---|
| The register_time_info table is NOT covered by global cleandatabase. Set cleandatabase_register_time_info explicitly to enable cleanup. |
Size-Based Database Cleaning
To limit database by size instead of time:
cleandatabase_size = 512000 # 500 GB limit in MB
cleandatabase_size_force = true # Required to enable
Limits
- Partition limit: ~8000 partitions per table (~22 years with daily partitioning)
- CDR record limit: No practical limit - uses BIGINT (18 quintillion records). See Upgrade_to_bigint.
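The two figures above follow from simple arithmetic. The check below assumes one partition per day and an unsigned 64-bit BIGINT, which is an assumption inferred from the "18 quintillion" figure rather than something stated on this page.

```python
PARTITION_LIMIT = 8000  # approximate per-table limit quoted above

# One partition per day, so divide by the average number of days in a year
years = PARTITION_LIMIT / 365.25
print(f"{years:.1f} years of daily partitions")

# Largest value of an unsigned 64-bit integer (assumed BIGINT UNSIGNED)
bigint_max = 2 ** 64 - 1
print(f"{bigint_max:,} possible record IDs")
```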
Multi-Sensor Environments
When multiple sensors share a database, only ONE sensor should manage partitions:
# On all sensors EXCEPT one:
disable_partition_operations = yes
# On the designated sensor:
partition_operations_enable_fromto = 4-6
Advanced Topics
Spool Directory Location
Default location: /var/spool/voipmonitor
Directory structure: YYYY-MM-DD/HH/MM/{SIP|RTP|GRAPH|AUDIO}/files...
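Because each data type lives in its own subdirectory, it is straightforward to measure which type consumes the most space before choosing per-type limits. The following is a minimal sketch that assumes the layout shown above; `usage_by_type` is a hypothetical helper, not part of VoIPmonitor.

```python
import os
from collections import defaultdict

# Type directory names taken from the layout described above
TYPES = {"SIP", "RTP", "GRAPH", "AUDIO"}


def usage_by_type(spooldir):
    """Sum file sizes in bytes per data type under the spool directory."""
    totals = defaultdict(int)
    for root, _dirs, names in os.walk(spooldir):
        # Find which type directory, if any, this path sits under
        parts = os.path.relpath(root, spooldir).split(os.sep)
        kind = next((p for p in parts if p in TYPES), None)
        if kind is None:
            continue
        for name in names:
            try:
                totals[kind] += os.path.getsize(os.path.join(root, name))
            except OSError:
                # The file may have been removed by cleanspool in the meantime
                pass
    return dict(totals)


# Prints an empty dict if the directory does not exist
print(usage_by_type("/var/spool/voipmonitor"))
```

Running it as a user with read access to the spool gives byte totals per type, which can then be compared against maxpoolrtpsize and the other per-type limits (those are expressed in MB).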
Relocating the Spool Directory
To move spool to a larger partition:
- Step 1: Create new directory
mkdir -p /mnt/storage/voipmonitor
chown voipmonitor:voipmonitor /mnt/storage/voipmonitor
- Step 2: Update sniffer configuration
# /etc/voipmonitor.conf
spooldir = /mnt/storage/voipmonitor
- Step 3: Update GUI configuration
Edit GUI config file and set SNIFFER_DATA_PATH to match:
define('SNIFFER_DATA_PATH', '/mnt/storage/voipmonitor');
- Step 4: Restart services
systemctl restart voipmonitor
Tiered Storage (tar_move)
To extend retention using secondary storage:
# Local fast storage for live capture
spooldir = /var/spool/voipmonitor
# Archive to secondary storage
tar_move = yes
tar_move_destination_path = /mnt/archive/voipmonitor
Files in tar_move_destination_path remain accessible via GUI.
S3 Cloud Storage
Use rclone instead of s3fs to avoid GUI unresponsiveness:
rclone mount bucket-name /mnt/s3-archive \
--allow-other --dir-cache-time 30s --vfs-cache-mode off
Custom Autocleaning (GUI)
For one-time cleanup of specific recordings (e.g., by IP address):
- Navigate to Settings > Custom Autocleaning
- Create rule with filters (IP, phone number, etc.)
- Apply and remove rule after completion
Useful for cleaning old data after configuring capture rules to stop future recording.
Troubleshooting
Files Disappearing Faster Than Expected
Check these causes in order:
- 1. Emergency cleanup triggered
df -h /var/spool/voipmonitor
# Compare free space with autocleanspoolminpercent (default 1%) and autocleanmingb (default 5 GB)
- 2. GUI configuration override
When mysqlloadconfig = yes (default), GUI settings override config file. Check: Settings > Sensors > wrench icon.
- 3. Insufficient maxpoolsize
Set maxpoolsize to 90-95% of disk capacity to leave buffer for growth between cleanup cycles.
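A starting value can be derived from the size of the partition that holds the spool. The sketch below is only a suggestion calculator: `suggested_maxpoolsize_mb` is a hypothetical helper, it assumes the spool has the partition to itself, and it treats 1 MB as 1024 × 1024 bytes, consistent with 102400 being described as 100 GB on this page.

```python
import shutil


def suggested_maxpoolsize_mb(total_bytes, fraction=0.90):
    """Return the given fraction of the partition size, expressed in MB."""
    return int(total_bytes * fraction / (1024 * 1024))


# Replace "/" with the spool directory, e.g. /var/spool/voipmonitor
total = shutil.disk_usage("/").total
print(f"maxpoolsize = {suggested_maxpoolsize_mb(total)}")
```

If other data shares the partition, subtract its expected size from the total before applying the fraction.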
Disk Space Not Reclaimed After Database Cleanup
Check innodb_file_per_table:
SHOW GLOBAL VARIABLES LIKE 'innodb_file_per_table';
If OFF, space is not reclaimed when partitions drop. Enable for future tables:
[mysqld]
innodb_file_per_table = 1
MySQL Error 28: No Space Left
Primary solution - enable size-based cleaning:
cleandatabase_size = 512000
cleandatabase_size_force = true
Other causes:
- Inode exhaustion: Check with df -i
- MySQL tmpdir full: Check with SHOW VARIABLES LIKE 'tmpdir'
Database Not Cleaning
Verify tables are partitioned:
SHOW CREATE TABLE cdr\G
SELECT PARTITION_NAME, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_NAME = 'cdr'
ORDER BY PARTITION_ORDINAL_POSITION DESC
LIMIT 10;
If only expected partitions exist (matching your cleandatabase setting), cleaning IS working - you may simply have high data volume.
Spool Filling Due to Database Bottleneck
If spool fills rapidly after disk swap or MySQL upgrade:
# Check SQL queue
tail -f /var/log/syslog | grep voipmonitor | grep SQLq
Growing SQLq indicates database cannot keep up. Solution:
[mysqld]
innodb_buffer_pool_size = 4G # 50-70% of RAM
innodb_flush_log_at_trx_commit = 2 # Faster writes
See Also
AI Summary for RAG
Summary: VoIPmonitor has two independent data retention systems: (1) Filesystem cleaning for PCAP files using maxpoolsize/maxpooldays, running every 5 minutes; (2) Database cleaning using cleandatabase with daily partitioning.
Cleanspool: File index kept in memory for fast operations. .cleanspool_cache files in hourly directories (e.g., /var/spool/voipmonitor/2026-01-01/15/.cleanspool_cache) are for fast restart only. Legacy parameter cleanspool_use_files = yes enables deprecated MySQL files table indexing (the "files" in parameter name refers to SQL table, not cache files). Default is cleanspool_use_files = no (in-memory indexing).
Key parameters: maxpoolsize (default 100GB), maxpoolrtpsize/maxpoolrtpdays for RTP-specific limits, cleandatabase for CDR retention. Emergency cleanup via autocleanspoolminpercent (default 1%) and autocleanmingb (default 5GB) overrides settings when disk nearly full. GUI settings override config file when mysqlloadconfig=yes. Size-based database cleaning requires cleandatabase_size AND cleandatabase_size_force=true. Database partition limit ~8000 (~22 years); CDR uses BIGINT.
Keywords: data retention, maxpoolsize, maxpooldays, maxpoolrtpsize, cleandatabase, cleandatabase_size, .cleanspool_cache, cleanspool_use_files, files table, autocleanspoolminpercent, autocleanmingb, tar_move, innodb_file_per_table, savertp header, BIGINT
Key Questions:
- How does cleanspool work?
- Where are .cleanspool_cache files?
- What is cleanspool_use_files?
- Why are files deleted faster than expected?
- How to fix MySQL Error 28?
- How to configure size-based database cleaning?
- How to extend retention with tiered storage?
- Why is disk space not reclaimed after cleanup?