Emergency procedures: Difference between revisions
(Review: použití šablon {{Note}} a {{Warning}} místo prostého textu) |
(Rewrite: streamlined structure, improved flowchart, clearer diagnostic steps) |
||
| Line 1: | Line 1: | ||
{{DISPLAYTITLE:Emergency Procedures: GUI Performance Crisis}} | |||
[[Category:Troubleshooting]] | |||
[[Category:Database]] | |||
= GUI Performance Crisis: Database Bottleneck Diagnosis = | |||
This guide | When VoIPmonitor GUI becomes unresponsive or PHP processes are killed by OOM, the root cause is often a '''database bottleneck''', not PHP configuration. This guide shows how to diagnose using sensor RRD charts. | ||
{{Note|For general troubleshooting, see [[Database_troubleshooting]], [[GUI_troubleshooting]], or [[Sniffer_troubleshooting]].}} | |||
== Symptoms == | |||
* GUI extremely slow or unresponsive during peak hours | |||
* PHP processes killed by OOM killer | |||
* Dashboard and CDR views take long to load | |||
* Alerts/reports fail during high traffic | |||
* System fine during off-peak, degrades during peak | |||
== Diagnostic Flowchart == | == Diagnostic Flowchart == | ||
<kroki lang="mermaid"> | <kroki lang="mermaid"> | ||
%%{init: {'flowchart': {'nodeSpacing': 15, 'rankSpacing': 35}}}%% | |||
flowchart TD | flowchart TD | ||
A[GUI | A[GUI Slow / OOM Errors] --> B{Check RRD Charts<br/>Settings → Sensors → 📊} | ||
B --> C{SQL Cache | B --> C{SQL Cache growing<br/>during peak?} | ||
C -->|No| D[ | C -->|No| D[NOT database bottleneck<br/>Check PHP/Apache config] | ||
C -->|Yes| E[Database Bottleneck<br/>Confirmed] | C -->|Yes| E[Database Bottleneck<br/>Confirmed] | ||
E --> F | E --> F{mysqld CPU ~100%?} | ||
F -->|Yes| G[CPU Bottleneck<br/>→ Upgrade CPU] | |||
F -->|No| H{High iowait?<br/>HDD storage?} | |||
H -->|Yes| I[I/O Bottleneck<br/>→ Upgrade to SSD/NVMe] | |||
H -->|No| J[Memory Bottleneck<br/>→ Add RAM + tune buffer_pool] | |||
style A fill:#f9f,stroke:#333 | style A fill:#f9f,stroke:#333 | ||
style E fill:#ff9,stroke:#333 | style E fill:#ff9,stroke:#333 | ||
</kroki> | </kroki> | ||
== | == Step 1: Access RRD Charts == | ||
The | # Navigate to '''Settings → Sensors''' | ||
# Click the '''graph icon''' (📊) next to the sensor | |||
# Select time range covering problematic peak hours | |||
== Step 2: Identify Growing SQL Cache == | |||
The key indicator is '''SQL Cache''' or '''SQL Cache Files''' growing during peak hours: | |||
{| class="wikitable" | {| class="wikitable" | ||
| Line 58: | Line 50: | ||
! Metric !! What to Look For !! Indicates | ! Metric !! What to Look For !! Indicates | ||
|- | |- | ||
| '''SQL Cache''' || Consistently increasing | | '''SQL Cache''' || Consistently increasing, never decreasing || DB cannot keep up with inserts | ||
|- | |- | ||
| '''SQL Cache Files''' || Growing over time | | '''SQL Cache Files''' || Growing over time || Buffer pool too small or storage too slow | ||
|- | |- | ||
| '''CPU | | '''mysqld CPU''' || Near 100% || CPU bottleneck | ||
|- | |- | ||
| '''Disk I/O (mysql)''' || High | | '''Disk I/O (mysql)''' || High/saturated || Storage bottleneck (HDD vs SSD) | ||
|} | |} | ||
If | {{Warning|1=If SQL Cache is NOT growing, the problem is likely NOT the database. Check PHP/Apache configuration instead.}} | ||
== | == Step 3: Identify Bottleneck Type == | ||
=== CPU Bottleneck === | === CPU Bottleneck === | ||
* | * <code>mysqld</code> at or near 100% CPU | ||
* '''Solution:''' More CPU cores or faster CPU | |||
=== Memory Bottleneck === | === Memory Bottleneck === | ||
* | * SQL cache fills up and stays full | ||
* | * Buffer pool too small for dataset | ||
* '''Solution:''' Add RAM, tune <code>innodb_buffer_pool_size</code> | |||
=== Storage I/O Bottleneck (Most Common) === | === Storage I/O Bottleneck (Most Common) === | ||
* High | * High <code>iowait</code> during peak hours | ||
* Database on magnetic disks (HDD) | |||
* Database | * '''Solution:''' Upgrade to SSD/NVMe (10-50x improvement) | ||
== Solutions | == Solutions == | ||
=== | === Add RAM to Database Server === | ||
<syntaxhighlight lang="ini"> | |||
# /etc/mysql/my.cnf | |||
# Set to 50-70% of total RAM on dedicated DB server | |||
innodb_buffer_pool_size = 64G | |||
</syntaxhighlight> | |||
{| class="wikitable" | {| class="wikitable" | ||
|- | |- | ||
! Current RAM !! Recommended | ! Current RAM !! Recommended !! <code>innodb_buffer_pool_size</code> | ||
|- | |- | ||
| | | 32GB || 64-128GB || 32-64G | ||
|- | |- | ||
| | | 64GB || 128-256GB || 64-128G | ||
|} | |} | ||
=== Upgrade to SSD/NVMe === | |||
{| class="wikitable" | {| class="wikitable" | ||
|- | |- | ||
! Current | ! Current !! Upgrade To !! Expected Speedup | ||
|- | |- | ||
| 10K RPM | | 10K RPM HDD || NVMe SSD || 10-50x | ||
|- | |- | ||
| | | SAS HDD || Enterprise SSD || 5-20x | ||
|- | |- | ||
| Older SSD || | | Older SSD || NVMe (PCIe 4.0+) || 2-5x | ||
|} | |} | ||
=== Temporary Mitigation === | |||
If immediate hardware upgrade not possible: | |||
* '''Alerts:''' Reduce frequency or schedule during off-peak (2am-4am) | |||
* '''Reports:''' Schedule for off-peak hours | |||
* '''Dashboards:''' Simplify queries, avoid "All time" ranges | |||
=== | === Component Separation === | ||
For persistent issues, consider dedicated database server: | |||
* '''Host 1:''' Database (max RAM + SSD/NVMe) | |||
* '''Host 1:''' | |||
* '''Host 2:''' GUI web server | * '''Host 2:''' GUI web server | ||
* '''Host 3:''' Sensor(s) | * '''Host 3:''' Sensor(s) | ||
See [[Scaling#Scaling_Through_Component_Separation|Scaling - Component Separation]]. | |||
== Common Mistakes == | |||
= | {{Warning|1=These do NOT fix database bottlenecks:}} | ||
{| class="wikitable | {| class="wikitable" | ||
|- | |- | ||
! | ! Wrong Action !! Why It Fails | ||
|- | |- | ||
| | | Reducing PHP <code>memory_limit</code> || PHP waits for DB; less memory = earlier crashes | ||
| | |||
|- | |- | ||
| | | Adding more PHP-FPM workers || More workers pile up waiting for slow DB | ||
| More workers | |||
|- | |- | ||
| | | Reducing <code>innodb_buffer_pool_size</code> || Makes DB slower, increases disk I/O | ||
| | |||
|- | |- | ||
| | | Adding RAM to GUI server || Bottleneck is DB server, not GUI | ||
| | |||
|} | |} | ||
== Verification | == Verification == | ||
After implementing fix: | |||
# Monitor SQL cache during next peak period | |||
# Verify SQL cache does NOT grow uncontrollably | |||
# Confirm GUI responsiveness | |||
# Check for OOM killer events in system logs | |||
== | == See Also == | ||
* [[ | * [[Database_troubleshooting]] - SQL queue issues, CDR delays | ||
* [[Scaling | * [[Scaling]] - Performance tuning and database optimization | ||
* [[GUI_troubleshooting]] - HTTP 500, login issues, debug mode | |||
* [[ | |||
== AI Summary for RAG == | == AI Summary for RAG == | ||
'''Summary:''' | '''Summary:''' Emergency guide for diagnosing database bottlenecks affecting VoIPmonitor GUI using sensor RRD charts. Symptoms: GUI unresponsive during peak hours, OOM killer terminating PHP, slow dashboards. KEY DIAGNOSTIC: Access RRD charts (Settings → Sensors → graph icon), look for growing SQL cache during peak hours - primary indicator of DB bottleneck. Bottleneck types: CPU (mysqld at 100%), Memory (buffer pool full), Storage I/O (most common - high iowait, HDD storage). Solutions: (1) Add RAM and set innodb_buffer_pool_size to 50-70% of RAM; (2) Upgrade HDD to SSD/NVMe (10-50x speedup); (3) Schedule alerts/reports off-peak; (4) Component separation with dedicated DB server. WRONG solutions: Do NOT reduce PHP memory_limit, do NOT add PHP-FPM workers, do NOT reduce innodb_buffer_pool_size, do NOT add RAM to GUI server. | ||
'''Keywords:''' database bottleneck, RRD charts | '''Keywords:''' database bottleneck, RRD charts, SQL cache, peak hours, OOM killer, GUI slow, GUI unresponsive, innodb_buffer_pool_size, SSD upgrade, NVMe, iowait, component separation, emergency procedures, performance crisis | ||
'''Key Questions:''' | '''Key Questions:''' | ||
* How | * How to diagnose database bottlenecks in VoIPmonitor? | ||
* What | * What does growing SQL cache in RRD charts indicate? | ||
* Why is | * Why is VoIPmonitor GUI slow during peak hours? | ||
* How to fix OOM killer terminating PHP processes? | * How to fix OOM killer terminating PHP processes? | ||
* Should I upgrade RAM on GUI | * Should I upgrade RAM on GUI or database server? | ||
* What storage is recommended for VoIPmonitor database? | * What storage is recommended for VoIPmonitor database? | ||
* How to access sensor RRD charts | * How to access sensor RRD charts? | ||
* What are | * What are wrong solutions for database bottlenecks? | ||
Latest revision as of 16:48, 8 January 2026
GUI Performance Crisis: Database Bottleneck Diagnosis
When VoIPmonitor GUI becomes unresponsive or PHP processes are killed by OOM, the root cause is often a database bottleneck, not PHP configuration. This guide shows how to diagnose using sensor RRD charts.
ℹ️ Note: For general troubleshooting, see Database_troubleshooting, GUI_troubleshooting, or Sniffer_troubleshooting.
Symptoms
- GUI extremely slow or unresponsive during peak hours
- PHP processes killed by OOM killer
- Dashboard and CDR views take long to load
- Alerts/reports fail during high traffic
- System fine during off-peak, degrades during peak
Diagnostic Flowchart
Step 1: Access RRD Charts
- Navigate to Settings → Sensors
- Click the graph icon (📊) next to the sensor
- Select time range covering problematic peak hours
Step 2: Identify Growing SQL Cache
The key indicator is SQL Cache or SQL Cache Files growing during peak hours:
| Metric | What to Look For | Indicates |
|---|---|---|
| SQL Cache | Consistently increasing, never decreasing | DB cannot keep up with inserts |
| SQL Cache Files | Growing over time | Buffer pool too small or storage too slow |
| mysqld CPU | Near 100% | CPU bottleneck |
| Disk I/O (mysql) | High/saturated | Storage bottleneck (HDD vs SSD) |
⚠️ Warning: If SQL Cache is NOT growing, the problem is likely NOT the database. Check PHP/Apache configuration instead.
Step 3: Identify Bottleneck Type
CPU Bottleneck
mysqldat or near 100% CPU- Solution: More CPU cores or faster CPU
Memory Bottleneck
- SQL cache fills up and stays full
- Buffer pool too small for dataset
- Solution: Add RAM, tune
innodb_buffer_pool_size
Storage I/O Bottleneck (Most Common)
- High
iowaitduring peak hours - Database on magnetic disks (HDD)
- Solution: Upgrade to SSD/NVMe (10-50x improvement)
Solutions
Add RAM to Database Server
# /etc/mysql/my.cnf
# Set to 50-70% of total RAM on dedicated DB server
innodb_buffer_pool_size = 64G
| Current RAM | Recommended | innodb_buffer_pool_size
|
|---|---|---|
| 32GB | 64-128GB | 32-64G |
| 64GB | 128-256GB | 64-128G |
Upgrade to SSD/NVMe
| Current | Upgrade To | Expected Speedup |
|---|---|---|
| 10K RPM HDD | NVMe SSD | 10-50x |
| SAS HDD | Enterprise SSD | 5-20x |
| Older SSD | NVMe (PCIe 4.0+) | 2-5x |
Temporary Mitigation
If immediate hardware upgrade not possible:
- Alerts: Reduce frequency or schedule during off-peak (2am-4am)
- Reports: Schedule for off-peak hours
- Dashboards: Simplify queries, avoid "All time" ranges
Component Separation
For persistent issues, consider dedicated database server:
- Host 1: Database (max RAM + SSD/NVMe)
- Host 2: GUI web server
- Host 3: Sensor(s)
See Scaling - Component Separation.
Common Mistakes
⚠️ Warning: These do NOT fix database bottlenecks:
| Wrong Action | Why It Fails |
|---|---|
Reducing PHP memory_limit |
PHP waits for DB; less memory = earlier crashes |
| Adding more PHP-FPM workers | More workers pile up waiting for slow DB |
Reducing innodb_buffer_pool_size |
Makes DB slower, increases disk I/O |
| Adding RAM to GUI server | Bottleneck is DB server, not GUI |
Verification
After implementing fix:
- Monitor SQL cache during next peak period
- Verify SQL cache does NOT grow uncontrollably
- Confirm GUI responsiveness
- Check for OOM killer events in system logs
See Also
- Database_troubleshooting - SQL queue issues, CDR delays
- Scaling - Performance tuning and database optimization
- GUI_troubleshooting - HTTP 500, login issues, debug mode
AI Summary for RAG
Summary: Emergency guide for diagnosing database bottlenecks affecting VoIPmonitor GUI using sensor RRD charts. Symptoms: GUI unresponsive during peak hours, OOM killer terminating PHP, slow dashboards. KEY DIAGNOSTIC: Access RRD charts (Settings → Sensors → graph icon), look for growing SQL cache during peak hours - primary indicator of DB bottleneck. Bottleneck types: CPU (mysqld at 100%), Memory (buffer pool full), Storage I/O (most common - high iowait, HDD storage). Solutions: (1) Add RAM and set innodb_buffer_pool_size to 50-70% of RAM; (2) Upgrade HDD to SSD/NVMe (10-50x speedup); (3) Schedule alerts/reports off-peak; (4) Component separation with dedicated DB server. WRONG solutions: Do NOT reduce PHP memory_limit, do NOT add PHP-FPM workers, do NOT reduce innodb_buffer_pool_size, do NOT add RAM to GUI server.
Keywords: database bottleneck, RRD charts, SQL cache, peak hours, OOM killer, GUI slow, GUI unresponsive, innodb_buffer_pool_size, SSD upgrade, NVMe, iowait, component separation, emergency procedures, performance crisis
Key Questions:
- How to diagnose database bottlenecks in VoIPmonitor?
- What does growing SQL cache in RRD charts indicate?
- Why is VoIPmonitor GUI slow during peak hours?
- How to fix OOM killer terminating PHP processes?
- Should I upgrade RAM on GUI or database server?
- What storage is recommended for VoIPmonitor database?
- How to access sensor RRD charts?
- What are wrong solutions for database bottlenecks?