{{DISPLAYTITLE:SQL Queue Growing During Peak Time}}
[[Category:Troubleshooting]]
[[Category:Database]]
[[Category:Performance]]


= SQL Queue Growing During Peak Time =


When the SQL queue (SQLq/SQLf) grows during peak traffic, the database cannot keep up with the CDR insertion rate. This causes delayed CDR visibility in the GUI and can lead to memory exhaustion.


== Quick Diagnosis ==
* '''"Crontab log is too old" warning''' - The cron job runs slowly (every 5-10 minutes or more) instead of every minute due to database overload
* '''Lag between call end and reporting''' - Daily reports and alerts process outdated data


<kroki lang="mermaid">
%%{init: {'flowchart': {'nodeSpacing': 15, 'rankSpacing': 30}}}%%
flowchart TD
    A[SQLq/SQLf growing] --> B{mysqld CPU 100%?}
    B -->|Yes| C[CPU bottleneck]
    B -->|No| D{High iowait?}
    D -->|Yes| E[I/O bottleneck]
    D -->|No| F[RAM/Buffer bottleneck]


    C --> G[Upgrade CPU or optimize queries]
    E --> H[Upgrade to SSD/NVMe]
    F --> I[Increase innodb_buffer_pool_size]
</kroki>


'''Monitor the queue:'''
<syntaxhighlight lang="bash">
journalctl -u voipmonitor -f | grep SQLf
</syntaxhighlight>
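
'''Check whether mysqld is CPU-bound or I/O-bound:''' the following is a minimal sketch, assuming the <code>sysstat</code> package is installed and the database process is named <code>mysqld</code> (on MariaDB it may be <code>mariadbd</code>):
<syntaxhighlight lang="bash">
# Per-process CPU usage of mysqld; cores pinned near 100% point to a CPU bottleneck
pidstat -u -p "$(pgrep -d, -x mysqld)" 5 3

# System-wide CPU breakdown; a persistently high %iowait column points to an I/O bottleneck
mpstat 5 3
</syntaxhighlight>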
 
'''Check RRD graphs:''' in the GUI, go to Settings > Sensors, click the graph icon, and select the peak-time period. Look for a growing "SQL Cache" or "SQL Cache Files" value.
 
{{Note|The SQLf value should normally stay below 10. If it consistently exceeds 10 or keeps growing during peak time, apply the mitigations below.}}
 
== Storage Speed is Critical ==
 
'''The most common cause of SQL queue growth is slow storage.''' MySQL write performance depends heavily on disk I/O speed.
 
{| class="wikitable"
! Storage Type !! Typical IOPS !! CDR Write Capacity !! Recommendation
|-
| HDD (7200 RPM) || ~150 || 10-50 CDR/s || '''Not suitable''' for production
|-
| SATA SSD || ~50,000 || 200-500 CDR/s || Suitable for small deployments
|-
| NVMe SSD || ~500,000+ || 1000+ CDR/s || '''Recommended''' for production
|}
 
'''Check your current I/O performance:'''
<syntaxhighlight lang="bash">
# Check disk utilization during peak time
iostat -x 1 5 | grep -E "Device|sda|nvme"
 
# If %util stays above 80%, storage is the bottleneck
</syntaxhighlight>
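
To measure the disk's raw random-write capability directly, you can run a short <code>fio</code> benchmark against the volume that holds the MySQL data directory. This is a sketch only; the target directory is an example and the test writes about 1 GB of data, so run it off-peak:
<syntaxhighlight lang="bash">
# Random 4k write test (example directory on the database volume)
mkdir -p /var/lib/mysql-iotest
fio --name=cdr-write-test --directory=/var/lib/mysql-iotest \
    --rw=randwrite --bs=4k --size=1G --iodepth=32 --ioengine=libaio \
    --direct=1 --runtime=60 --time_based --group_reporting

# Remove the test files afterwards
rm -rf /var/lib/mysql-iotest
</syntaxhighlight>
Compare the reported write IOPS against the table above to see which storage class your disk actually behaves like.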
 
== VoIPmonitor Tuning ==
 
These parameters in <code>/etc/voipmonitor.conf</code> control how the sniffer writes to the database:
 
{| class="wikitable"
! Parameter !! Default !! Recommended !! Description
|-
| <code>mysqlstore_max_threads_cdr</code> || 1 || 4-8 || Number of parallel threads writing CDRs to the database
|-
| <code>mysqlstore_concat_limit</code> || 400 || 1000-2000 || Number of CDRs batched into a single INSERT statement
|-
| <code>query_cache</code> || no || yes || Write queries to disk files when the queue grows (prevents out-of-memory crashes)
|}
 
'''Example configuration:'''
<syntaxhighlight lang="ini">
# /etc/voipmonitor.conf
mysqlstore_max_threads_cdr = 8
mysqlstore_concat_limit = 2000
query_cache = yes
</syntaxhighlight>
 
Restart after changes: <code>systemctl restart voipmonitor</code>
 
== MySQL Tuning ==
 
=== Essential Settings ===
 
These settings are safe for most deployments and significantly improve write performance:
 
<syntaxhighlight lang="ini">
# /etc/mysql/mysql.conf.d/mysqld.cnf


# Buffer pool - use 50-70% of available RAM
innodb_buffer_pool_size = 8G


# Faster writes - flush log once per second instead of every transaction
# Small risk: up to 1 second of data loss on crash
innodb_flush_log_at_trx_commit = 2


# I/O threads - adjust to your CPU core count
innodb_read_io_threads = 4
innodb_write_io_threads = 4
innodb_purge_threads = 2


# Skip DNS lookups for faster connections
skip-name-resolve


# Larger log files = fewer checkpoints = better performance
innodb_log_file_size = 1G
innodb_log_buffer_size = 256M
</syntaxhighlight>
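
As a rough way to translate the "50-70% of available RAM" guideline into a concrete value, the following one-line sketch prints a suggested setting (it assumes the host is dedicated to MySQL and uses 60% of installed RAM):
<syntaxhighlight lang="bash">
# MemTotal in /proc/meminfo is reported in kB; convert to GB and take ~60%
awk '/MemTotal/ {printf "innodb_buffer_pool_size = %dG\n", int($2 / 1024 / 1024 * 0.6)}' /proc/meminfo
</syntaxhighlight>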


=== SSD/NVMe Specific Settings ===


If using SSD or NVMe storage, add these optimizations:


<syntaxhighlight lang="ini">
# Disable HDD optimization (not needed for SSD)
innodb_flush_neighbors = 0


# Tell InnoDB about fast storage (adjust to your disk specs)
innodb_io_capacity = 2000
innodb_io_capacity_max = 4000


# Bypass OS cache - better for dedicated DB servers
innodb_flush_method = O_DIRECT
</syntaxhighlight>
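
To confirm that the database volume really is non-rotational before relying on these values, a quick check can help (the device name below is an example; adjust it to the disk backing your data directory):
<syntaxhighlight lang="bash">
# ROTA = 0 means non-rotational (SSD/NVMe), ROTA = 1 means a spinning disk
lsblk -d -o NAME,ROTA,MODEL

# Current I/O scheduler for the device (example device name)
cat /sys/block/nvme0n1/queue/scheduler
</syntaxhighlight>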


Restart MySQL after changes: <code>systemctl restart mysql</code>


=== Verify Settings ===


<syntaxhighlight lang="bash">
<syntaxhighlight lang="bash">
# Check current buffer pool size
mysql -e "SHOW VARIABLES LIKE 'innodb_buffer_pool_size';"
 
# Check if buffer pool is fully utilized
mysql -e "SHOW STATUS LIKE 'Innodb_buffer_pool_pages%';"
</syntaxhighlight>
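
To turn the raw page counters into a single utilization figure, a small sketch like the following can be used (it assumes the default whitespace-separated output of the <code>mysql</code> client):
<syntaxhighlight lang="bash">
# Percentage of the buffer pool currently holding pages; a pool that is always
# 100% full on a busy server is a hint that innodb_buffer_pool_size is too small
mysql -N -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_%';" \
  | awk '$1 == "Innodb_buffer_pool_pages_total" {t = $2}
         $1 == "Innodb_buffer_pool_pages_free"  {f = $2}
         END {if (t) printf "buffer pool in use: %.1f%%\n", (t - f) / t * 100}'
</syntaxhighlight>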
 
== Prevention ==
 
* '''Schedule heavy operations off-peak:''' Configure alerts and reports to run at night
* '''Separate components:''' Use dedicated database server for high-traffic deployments
* '''Monitor trends:''' Set up alerts for SQLq growth before it becomes critical; a minimal trend-logging sketch is shown below
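
A minimal sketch of the trend monitoring mentioned above: record the sniffer's SQLf status line once a minute so that queue growth is visible over time. The script name, log path, and time window are assumptions; adapt the grep and parsing to your actual log format:
<syntaxhighlight lang="bash">
#!/bin/bash
# /usr/local/bin/sqlq-trend.sh (hypothetical helper) - run from cron every minute, e.g. in /etc/cron.d:
#   * * * * * root /usr/local/bin/sqlq-trend.sh
OUT=/var/log/voipmonitor-sqlq-trend.log
LINE=$(journalctl -u voipmonitor --since "2 minutes ago" --no-pager | grep SQLf | tail -n 1)
[ -n "$LINE" ] && echo "$(date -Is) ${LINE}" >> "$OUT"
</syntaxhighlight>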
 
== Extreme High-Volume Deployments ==
 
For deployments processing '''hundreds or thousands of CDRs per second''' (10,000+ concurrent calls), standard tuning is not sufficient. These scenarios require:


* Disabling AUTO_INCREMENT and letting the sniffer generate IDs
* Hourly table partitioning instead of daily
* Batch sizes of 10,000+ CDRs
* Aggressive MySQL settings that trade durability for performance


See [[High-Performance_VoIPmonitor_and_MySQL_Setup_Manual]] for detailed configuration.


== See Also ==


* [[Database_troubleshooting]] - Comprehensive database troubleshooting guide
* [[Emergency_procedures]] - RRD graph diagnostics for bottleneck identification
* [[Scaling]] - Performance tuning and scaling guidelines
* [[High-Performance_VoIPmonitor_and_MySQL_Setup_Manual]] - Configuration for 1000+ CDR/s deployments


== AI Summary for RAG ==


'''Summary:''' Troubleshooting guide for SQL queue (SQLq/SQLf) growth during peak traffic periods. The queue grows when the database cannot keep up with CDR insertion rate. Diagnosis involves checking mysqld CPU usage, iowait, and RRD graphs in GUI (Settings > Sensors > graph icon). Immediate mitigations: enable <code>query_cache=yes</code> to prevent OOM, <code>quick_save_cdr=yes</code> to reduce GUI delay, <code>mysqlstore_max_threads_cdr=8</code> for parallel writes. Root causes are typically I/O bottleneck (upgrade HDD to SSD/NVMe), RAM bottleneck (increase <code>innodb_buffer_pool_size</code> to 50-70% of RAM), or CPU bottleneck (hardware upgrade needed). Prevention includes scheduling heavy operations off-peak and using dedicated database servers.
 
'''Keywords:''' SQL queue, SQLq, SQLf, peak time, database bottleneck, query_cache, quick_save_cdr, mysqlstore_max_threads_cdr, innodb_buffer_pool_size, I/O bottleneck, SSD upgrade, OOM, CDR delay, database performance, peak traffic
 
'''Key Questions:'''
* Why is the SQL queue growing during peak hours?
* How do I monitor SQL queue size in VoIPmonitor?
* What causes CDR delays in the GUI?
* How do I prevent out-of-memory errors when SQL queue grows?
* Should I upgrade to SSD for VoIPmonitor database?
* What is the recommended innodb_buffer_pool_size setting?
* How do I speed up CDR insertion during high traffic?
