High-Performance VoIPmonitor and MySQL Setup Manual: Difference between revisions

From VoIPmonitor.org
No edit summary
(Review: add architecture diagram, warning box, note box; shorten AI summary; fix formatting)
 
(4 intermediate revisions by 2 users not shown)
Line 2: Line 2:


This manual explains the configuration for a high-call-per-second (CPS) VoIPmonitor deployment. The configuration is split into two parts: the MySQL database and the VoIPmonitor sniffer.
This manual explains the configuration for a high-call-per-second (CPS) VoIPmonitor deployment. The configuration is split into two parts: the MySQL database and the VoIPmonitor sniffer.
{{Warning|1=This configuration trades data safety for performance. Settings like <code>innodb_flush_log_at_trx_commit=0</code> and <code>innodb_doublewrite=0</code> can lead to data loss on power failure. Use only when performance is critical and data loss risk is acceptable.}}


== Overall Concept ==
== Overall Concept ==


This system is designed for maximum data ingestion performance based on a '''centralized writer architecture'''. It trades certain data safety and consistency guarantees for an extremely high write throughput, suitable for large-scale deployments. The architecture is composed of three distinct stages:
This system is designed for maximum data ingestion performance based on a '''centralized writer architecture'''. It trades certain data safety and consistency guarantees for an extremely high write throughput, suitable for large-scale deployments (10,000+ concurrent calls, 3000+ CDRs/second).
 
<kroki lang="mermaid">
%%{init: {'flowchart': {'nodeSpacing': 15, 'rankSpacing': 40}}}%%
flowchart TB
    subgraph Sensors["Remote Sensors"]
        S1[Sensor 1]
        S2[Sensor 2]
        S3[Sensor N]
    end
 
    subgraph Central["Central Writer"]
        CW[Central Sniffer<br/>mysql_enable_set_id=yes]
        Q[SQL Queue<br/>1M items]
    end
 
    subgraph DB["MySQL/MariaDB"]
        M[(Hourly Partitions<br/>innodb_buffer_pool=150G)]
    end
 
    S1 -->|CDRs| CW
    S2 -->|CDRs| CW
    S3 -->|CDRs| CW
    CW --> Q
    Q -->|Batch 10K| M
</kroki>
 
The architecture is composed of three distinct stages:


# '''Centralized CDR Ingestion (Sniffer Role):''' The primary sniffer instance is configured to act as a central data collector. Its main purpose is to receive Call Detail Records (CDRs) from all other sniffer instances in the network. In this mode, the central sniffer may not process all network packets itself; its primary function is to prepare and queue CDRs for database insertion. It is critical that in this architecture, '''only the central sniffer writes to the database'''. All other sniffers must be configured to send their data to this central instance.
# '''Centralized CDR Ingestion (Sniffer Role):''' The primary sniffer instance is configured to act as a central data collector. Its main purpose is to receive Call Detail Records (CDRs) from all other sniffer instances in the network. '''Only the central sniffer writes to the database'''. All other sniffers must be configured to send their data to this central instance.


# '''Optimized Batch Writing (Sniffer to MySQL):''' To achieve maximum insert throughput, the central sniffer writes CDRs to MySQL in large, consolidated batches. A key optimization is that the database's automatic ID generators (<code>AUTO_INCREMENT</code>) are disabled. The sniffer pre-assigns unique IDs to the records, which allows for massive parallel inserts without the locking and serialization bottlenecks that normally occur at the database level.
# '''Optimized Batch Writing (Sniffer to MySQL):''' To achieve maximum insert throughput, the central sniffer writes CDRs to MySQL in large, consolidated batches. The database's automatic ID generators (<code>AUTO_INCREMENT</code>) are disabled. The sniffer pre-assigns unique IDs to the records, which allows for massive parallel inserts without locking bottlenecks.


# '''High-Speed Database Storage (MySQL):''' The MySQL database is heavily optimized to ingest these large data batches at extremely high speeds. For environments with exceptionally high CDR-per-second rates, this configuration supports '''hourly table partitioning'''. While hourly partitioning dramatically accelerates write performance by keeping the active table small, it introduces a trade-off for data retrieval. When performing a search across an entire day, the database engine must query 24 separate partitions instead of a single daily one. This can result in slower query performance for reports or searches covering broad time ranges.
# '''High-Speed Database Storage (MySQL):''' The MySQL database is heavily optimized to ingest these large data batches at extremely high speeds. For environments with exceptionally high CDR-per-second rates, this configuration supports '''hourly table partitioning'''. While hourly partitioning dramatically accelerates write performance by keeping the active table small, it introduces a trade-off for data retrieval (queries across a full day must scan 24 partitions).


== MySQL Configuration (my.cnf) ==
== MySQL Configuration (my.cnf) ==


The goal of this configuration is to maximize the write performance of the InnoDB engine, often at the expense of strict ACID durability.
The goal of this configuration is to maximize the write performance of the InnoDB engine, often at the expense of strict ACID durability.
{{Note|1=These settings assume a dedicated database server with 256GB+ RAM and NVMe storage. Adjust <code>innodb_buffer_pool_size</code> to 50-70% of available RAM.}}


{| class="wikitable" style="background-color: #FFFFFF;"
{| class="wikitable" style="background-color: #FFFFFF;"
Line 25: Line 56:
|-
|-
| <code>innodb_flush_log_at_trx_commit=0</code>
| <code>innodb_flush_log_at_trx_commit=0</code>
| The most critical setting for write performance. Instead of flushing the transaction log to disk after every single transaction (default: <code>1</code>), the database flushes it only once per second. This dramatically speeds up writes but can lead to the loss of the last second's worth of transactions if the server crashes.
| The most critical setting for write performance. Instead of flushing the transaction log to disk after every single transaction (default: <code>1</code>), the database flushes it only once per second. This dramatically speeds up writes but can lead to the loss of the last second's worth of transactions if the server crashes. Usually <code>innodb_flush_log_at_trx_commit=2</code> is enough but for extreme CDR CPS this setting is crucial.
|-
|-
| <code>innodb_flush_log_at_timeout = 1800</code>
| <code>innodb_flush_log_at_timeout = 1800</code>
Line 116: Line 147:
| <code>partition_operations_enable_fromto = 4-6</code>
| <code>partition_operations_enable_fromto = 4-6</code>
| Schedules database partition management to run only between 4:00 and 6:00 AM, avoiding impact during peak hours.
| Schedules database partition management to run only between 4:00 and 6:00 AM, avoiding impact during peak hours.
|-
| <code>disable_partition_operations = yes</code>
| Disables partition management operations (creating new partitions, dropping old ones) on this sniffer instance. Use this on all sniffer instances '''except one''' when running multiple sniffers that write to the same database. Only a single designated sniffer should manage the database partition structure to avoid metadata table locks and database unresponsiveness.
|-
|-
| <code>cleandatabase = N</code>
| <code>cleandatabase = N</code>
Line 135: Line 169:
| Instructs the sniffer to ignore specific, non-critical database errors to prevent it from stopping during database maintenance.
| Instructs the sniffer to ignore specific, non-critical database errors to prevent it from stopping during database maintenance.
|}
|}
== AI Summary for RAG ==
'''Summary:''' Expert-level guide for configuring VoIPmonitor for extreme high-CPS deployments (10,000+ concurrent calls, 3000+ CDRs/second). Uses "centralized writer architecture": central sniffer collects CDRs from remote sensors, pre-assigns primary keys (<code>mysql_enable_set_id=yes</code>) to eliminate AUTO_INCREMENT locking, and performs batch writes (<code>mysqlstore_concat_limit=10000</code>). MySQL is tuned for maximum write throughput with relaxed durability (<code>innodb_flush_log_at_trx_commit=0</code>), disabled doublewrite, massive buffer pool (150GB), and high I/O capacity (1M IOPS). Hourly partitioning (<code>cdr_partition_by_hours=yes</code>) keeps active tables small. Resolves UI lag, unresponsive GUI, and SQL queue growth issues.
'''Keywords:''' high performance, high cps, centralized writer, batch writing, mysql_enable_set_id, innodb_flush_log_at_trx_commit, innodb_buffer_pool_size, cdr_partition_by_hours, mysqlstore_concat_limit, UI lag, SQL queue, innodb_io_capacity, 10000 concurrent calls
'''Key Questions:'''
* How do I configure VoIPmonitor for very high call volumes (10,000+ concurrent calls)?
* How do I fix unresponsive UI or GUI lag under high load?
* What causes SQL queue (SQLq) growth and how do I resolve it?
* What is the centralized writer architecture?
* What are the recommended MySQL settings for high CDR ingestion rates?
* How does mysql_enable_set_id=yes improve performance?
* What are the benefits and drawbacks of hourly partitioning?
* Why set innodb_flush_log_at_trx_commit to 0?

Latest revision as of 18:00, 6 January 2026

High-Performance VoIPmonitor and MySQL Setup Manual

This manual explains the configuration for a high-call-per-second (CPS) VoIPmonitor deployment. The configuration is split into two parts: the MySQL database and the VoIPmonitor sniffer.

⚠️ Warning: This configuration trades data safety for performance. Settings like innodb_flush_log_at_trx_commit=0 and innodb_doublewrite=0 can lead to data loss on power failure. Use only when performance is critical and data loss risk is acceptable.

Overall Concept

This system is designed for maximum data ingestion performance based on a centralized writer architecture. It trades certain data safety and consistency guarantees for an extremely high write throughput, suitable for large-scale deployments (10,000+ concurrent calls, 3000+ CDRs/second).

The architecture is composed of three distinct stages:

  1. Centralized CDR Ingestion (Sniffer Role): The primary sniffer instance is configured to act as a central data collector. Its main purpose is to receive Call Detail Records (CDRs) from all other sniffer instances in the network. Only the central sniffer writes to the database. All other sniffers must be configured to send their data to this central instance.
  1. Optimized Batch Writing (Sniffer to MySQL): To achieve maximum insert throughput, the central sniffer writes CDRs to MySQL in large, consolidated batches. The database's automatic ID generators (AUTO_INCREMENT) are disabled. The sniffer pre-assigns unique IDs to the records, which allows for massive parallel inserts without locking bottlenecks.
  1. High-Speed Database Storage (MySQL): The MySQL database is heavily optimized to ingest these large data batches at extremely high speeds. For environments with exceptionally high CDR-per-second rates, this configuration supports hourly table partitioning. While hourly partitioning dramatically accelerates write performance by keeping the active table small, it introduces a trade-off for data retrieval (queries across a full day must scan 24 partitions).

MySQL Configuration (my.cnf)

The goal of this configuration is to maximize the write performance of the InnoDB engine, often at the expense of strict ACID durability.

ℹ️ Note: These settings assume a dedicated database server with 256GB+ RAM and NVMe storage. Adjust innodb_buffer_pool_size to 50-70% of available RAM.

Directive Description
skip-log-bin Disables binary logging. This significantly increases write performance because the database doesn't have to record every data modification. The trade-off is the inability to perform point-in-time recovery or use replication.
innodb_flush_log_at_trx_commit=0 The most critical setting for write performance. Instead of flushing the transaction log to disk after every single transaction (default: 1), the database flushes it only once per second. This dramatically speeds up writes but can lead to the loss of the last second's worth of transactions if the server crashes. Usually innodb_flush_log_at_trx_commit=2 is enough but for extreme CDR CPS this setting is crucial.
innodb_flush_log_at_timeout = 1800 Defines the timeout in seconds for the log flushing. However, it is effectively overridden by the more frequent one-second flush from the setting above.
max_heap_table_size = 24G Allows for the creation of very large temporary tables in memory. This is important for complex analytical queries that the VoIPmonitor GUI might run.
table_open_cache=9999 Keeps a large number of table definitions cached. Since VoIPmonitor uses hourly partitioning, it creates a vast number of tables, and this setting prevents the overhead of constantly opening and closing table files.
innodb_log_file_size = 5G Defines a very large size for InnoDB's redo log files. Larger files reduce the frequency of I/O-intensive checkpoint operations, improving overall throughput.
innodb_log_buffer_size = 2G Allocates a large buffer in memory (2 GB) for storing transaction logs before they are flushed to disk. This allows the system to handle large bursts of write activity without immediate disk I/O.
open_files_limit = 200000 Increases the file handle limit for the MySQL process, which is necessary to support the high table_open_cache value.
key_buffer_size = 2G This setting is for the MyISAM storage engine. While most of VoIPmonitor uses InnoDB, some system or temporary tables might still use MyISAM. A 2GB buffer is allocated for its indexes.
sort_buffer_size = 65M A large sort buffer for handling sorting operations (like ORDER BY or GROUP BY) for complex queries.
max_connections = 100000 An extremely high limit for connections. It acts as a safeguard against "too many connections" errors, though the real number of active connections will be much lower.
skip-name-resolve Prevents MySQL from resolving hostnames for incoming connections. This speeds up the connection process by bypassing DNS lookups.
innodb_read_io_threads = 40 Sets a high number of threads for handling read I/O operations, suitable for systems with fast multi-channel storage (like NVMe arrays).
innodb_write_io_threads = 40 Sets a high number of threads for handling write I/O operations, matching the read threads for balanced I/O.
innodb_purge_threads = 20 A high number of threads dedicated to purging old, unneeded data versions, which is crucial in a high-volume write environment to prevent performance degradation.
innodb_flush_neighbors = 0 An optimization for SSD/NVMe drives. It disables a feature useful for rotational HDDs where adjacent blocks on disk are flushed at the same time. This is counterproductive for SSDs.
innodb_io_capacity = 1000000 Informs InnoDB that it's running on a very high-performance storage system capable of 1,000,000 IOPS. This allows InnoDB to be much more aggressive with background I/O tasks.
innodb_io_capacity_max = 10000000 Sets the absolute maximum IOPS the system can burst to, allowing InnoDB to use the full potential of the storage during intensive operations.
innodb_doublewrite = 0 Disables the doublewrite buffer. This provides a significant performance boost but increases the risk of data corruption (a "torn page") if a power failure occurs during a write operation. It's a performance-vs-safety trade-off.
innodb_buffer_pool_size = 150G The most important parameter for InnoDB performance. A massive buffer pool size of 150 GB keeps a large portion of the database (data and indexes) in RAM, minimizing slow disk access.
innodb_flush_method = O_DIRECT Bypasses the operating system's file cache. This prevents double-caching data (once in the OS, once in InnoDB) and gives the database direct control over I/O operations.
transaction-isolation = READ-UNCOMMITTED The lowest level of transaction isolation. It allows transactions to read data that has not yet been committed. This offers maximum performance by minimizing locking.
performance_schema=0 Disables the Performance Schema, a detailed performance monitoring tool. Disabling it slightly reduces overhead and can increase performance.

VoIPmonitor Sniffer Configuration (voipmonitor.conf)

This configuration is tuned to enable the sniffer to process packets as quickly as possible, queue data efficiently, and avoid becoming a bottleneck.

Directive Description
mysqlstore_concat_limit = 10000 Batches up to 10,000 CDRs into a single INSERT statement, dramatically reducing database overhead.
mysqlstore_max_threads_cdr = 40 Sets 40 threads for writing CDRs to the database, essential for high-volume parallel processing.
mysql_enable_new_store = per_query Activates an alternative internal queueing mechanism for SQL queries, which can improve parallelization.
mysql_enable_set_id = yes The sniffer generates the primary key (ID) for records itself, rather than relying on AUTO_INCREMENT in the database. This reduces database locking.
server_sql_queue_limit = 1000000 Sets a massive in-memory queue (1,000,000) for SQL commands to buffer against database slowdowns.
partition_operations_enable_fromto = 4-6 Schedules database partition management to run only between 4:00 and 6:00 AM, avoiding impact during peak hours.
disable_partition_operations = yes Disables partition management operations (creating new partitions, dropping old ones) on this sniffer instance. Use this on all sniffer instances except one when running multiple sniffers that write to the same database. Only a single designated sniffer should manage the database partition structure to avoid metadata table locks and database unresponsiveness.
cleandatabase = N Reduces the data retention period for cleaning the database to N days.
mysql_redirect_cdr_queue = yes Enables a separate, dedicated queue for CDRs being redirected from other sniffers.
partition_operations_drop_first = yes Drops the oldest partition first before creating a new one during cleanup, which can be faster.
cdr_partition_by_hours = yes Partitions CDR tables hourly instead of daily. This keeps the active write table extremely small, which is critical for high performance.
disable_cdr_indexes_rtp = yes Disables some indexes on the cdr_rtp table to speed up writes, at the cost of slower searches on those fields.
sql_errors_skip = 1136,1054 Instructs the sniffer to ignore specific, non-critical database errors to prevent it from stopping during database maintenance.

AI Summary for RAG

Summary: Expert-level guide for configuring VoIPmonitor for extreme high-CPS deployments (10,000+ concurrent calls, 3000+ CDRs/second). Uses "centralized writer architecture": central sniffer collects CDRs from remote sensors, pre-assigns primary keys (mysql_enable_set_id=yes) to eliminate AUTO_INCREMENT locking, and performs batch writes (mysqlstore_concat_limit=10000). MySQL is tuned for maximum write throughput with relaxed durability (innodb_flush_log_at_trx_commit=0), disabled doublewrite, massive buffer pool (150GB), and high I/O capacity (1M IOPS). Hourly partitioning (cdr_partition_by_hours=yes) keeps active tables small. Resolves UI lag, unresponsive GUI, and SQL queue growth issues.

Keywords: high performance, high cps, centralized writer, batch writing, mysql_enable_set_id, innodb_flush_log_at_trx_commit, innodb_buffer_pool_size, cdr_partition_by_hours, mysqlstore_concat_limit, UI lag, SQL queue, innodb_io_capacity, 10000 concurrent calls

Key Questions:

  • How do I configure VoIPmonitor for very high call volumes (10,000+ concurrent calls)?
  • How do I fix unresponsive UI or GUI lag under high load?
  • What causes SQL queue (SQLq) growth and how do I resolve it?
  • What is the centralized writer architecture?
  • What are the recommended MySQL settings for high CDR ingestion rates?
  • How does mysql_enable_set_id=yes improve performance?
  • What are the benefits and drawbacks of hourly partitioning?
  • Why set innodb_flush_log_at_trx_commit to 0?