Comprehensive Guide to VoIP Voice Quality
| Quick Navigation | | |
|---|---|---|
| Degradation Factors | Measurement Methods | VoIPmonitor Features |
| Traditional Impairments, IP Network Impairments | Subjective, Objective | Core Features, Troubleshooting |
Introduction
Voice over IP quality depends on managing both traditional telephony impairments (loudness, noise, echo) and IP-network-specific issues (jitter, packet loss, delay). This guide covers:
- Voice quality degradation factors and their measurement
- Subjective and objective quality metrics (MOS, PESQ, POLQA, E-model)
- Practical monitoring with VoIPmonitor
Voice Quality Degradation Factors
Traditional Impairments
Volume and Loudness
Loudness is quantified using Loudness Ratings:
- Send Loudness Rating (SLR): Microphone/transmit gain
- Receive Loudness Rating (RLR): Earphone/receive sensitivity
- Overall Loudness Rating (OLR): OLR = SLR + RLR
| OLR Range | Impact |
|---|---|
| < 5 dB | Too loud, discomfort |
| 5-15 dB | Optimal range |
| > 15 dB | Too quiet, speech difficult to understand |
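The ranges in the table can be applied mechanically. The following is a minimal sketch that assumes the usual definition OLR = SLR + RLR and ignores any additional circuit loss; the function name `classify_olr` is hypothetical.

```python
def classify_olr(slr_db: float, rlr_db: float) -> tuple[float, str]:
    """Return (OLR in dB, assessment) using the ranges from the table above."""
    # Assumption: OLR is the plain sum of send and receive loudness ratings.
    olr = slr_db + rlr_db
    if olr < 5:
        return olr, "too loud"
    if olr <= 15:
        return olr, "optimal"
    return olr, "too quiet"


if __name__ == "__main__":
    print(classify_olr(8, 2))
```

A higher loudness rating means more attenuation, which is why large OLR values map to "too quiet".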
Noise
Circuit noise (electrical hiss/hum) and background noise (ambient sound) reduce Signal-to-Noise Ratio.
| Noise Level (dBm0p) | Impact |
|---|---|
| -80 to -70 | Negligible |
| -65 to -55 | Noticeable degradation |
| > -40 | Severe degradation |
Sidetone
Sidetone is hearing your own voice in the earpiece. Measured by Sidetone Masking Rating (STMR):
- Optimal: 8-12 dB
- Too high → "dead phone" feeling
- Too low → distracting echo
Echo
- Talker Echo: Speaker hears their own voice reflected back
- Listener Echo: Listener hears the talker's voice twice (the direct signal followed by a delayed, attenuated copy)
| Parameter | Description |
|---|---|
| Echo Return Loss (ERL) | Attenuation of echo vs original |
| Echo Delay | Round-trip time to echo source |
| TELR | Talker Echo Loudness Rating |
ℹ️ Note: Echo becomes more annoying as delay increases. At <30ms it merges with sidetone; at >100ms it's highly disruptive.
Mitigation: Echo cancellers (ITU-T G.168), proper impedance matching, acoustic echo cancellation for speakerphones.
Frequency Response
- Narrowband (G.711): 300 Hz - 3.4 kHz
- Wideband (G.722; Opus in wideband mode): 50 Hz - 7 kHz (more natural sound)
Delay (Latency)
End-to-end one-way delay includes codec, packetization, network, and jitter buffer delays.
| One-way Delay | Assessment |
|---|---|
| 0-150 ms | Acceptable (unnoticeable) |
| 150-400 ms | Acceptable but impacts conversation flow |
| > 400 ms | Generally unacceptable |
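To make the delay budget concrete, the sketch below sums the four components named above and classifies the total against the table. The function name and the sample figures are illustrative assumptions, not measured values.

```python
def assess_one_way_delay(codec_ms, packetization_ms, network_ms, jitter_buffer_ms):
    """Sum the delay components and classify the total per the table above."""
    total = codec_ms + packetization_ms + network_ms + jitter_buffer_ms
    if total <= 150:
        verdict = "acceptable (unnoticeable)"
    elif total <= 400:
        verdict = "acceptable but impacts conversation flow"
    else:
        verdict = "generally unacceptable"
    return total, verdict


if __name__ == "__main__":
    # Hypothetical budget: 5 ms codec, 20 ms packetization, 60 ms network, 40 ms buffer
    print(assess_one_way_delay(5, 20, 60, 40))
```

Note that the jitter buffer is part of the budget, so enlarging it to absorb jitter moves a call toward the next row of the table.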
Codec Impairments
Each codec has inherent quality limits expressed as Equipment Impairment Factor (Ie):
| Codec | Ie | Max MOS |
|---|---|---|
| G.711 (64k) | 0 | 4.4-4.5 |
| G.729A (8k) | 10-11 | ~4.0 |
| G.723.1 (5.3k) | 15-19 | ~3.6 |
IP Network Impairments
Jitter (Packet Delay Variation)
Jitter is the variation in packet arrival times. Receivers use jitter buffers to absorb this variation:
- Static buffer: Fixed size (simpler, higher latency)
- Adaptive buffer: Adjusts dynamically (lower latency, handles varying conditions)
Trade-off: Larger buffer = fewer late packets dropped, but more delay.
What happens to a packet at the jitter buffer:
- Packet arrives within buffer window → played in sequence
- Packet arrives after playout deadline → effectively lost
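The playout rule above can be sketched as a small simulation of a fixed jitter buffer. This is a simplified model under stated assumptions: the first packet arrives on time, packets are sent at a constant interval (`ptime_ms`), and nothing is lost in the network. The function name is hypothetical and this is not VoIPmonitor's actual simulator.

```python
def simulate_fixed_buffer(arrivals_ms, ptime_ms=20.0, buffer_ms=50.0):
    """Count packets played vs. discarded as late by a fixed jitter buffer.

    arrivals_ms[i] is the arrival time of the packet with sequence number i.
    """
    # Playout of packet 0 starts buffer_ms after it arrives;
    # each following packet is due one packet interval later.
    base = arrivals_ms[0] + buffer_ms
    played = late = 0
    for seq, arrival in enumerate(arrivals_ms):
        deadline = base + seq * ptime_ms
        if arrival <= deadline:
            played += 1
        else:
            late += 1  # arrived after its playout deadline: effectively lost
    return played, late


if __name__ == "__main__":
    arrivals = [0, 20, 40, 180, 80, 100]  # packet 3 is delayed by 120 ms
    print(simulate_fixed_buffer(arrivals, buffer_ms=50))   # one late packet
    print(simulate_fixed_buffer(arrivals, buffer_ms=200))  # none late
```

Every packet in the example arrives, yet the 50 ms buffer still discards one: the same stream is clean with a 200 ms buffer, at the cost of 150 ms more delay.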
Packet Loss
| Loss Rate | Impact |
|---|---|
| 0-1% | Negligible with good PLC |
| 1-3% | Minor degradation |
| 3-5% | Noticeable degradation |
| > 10% | Severe, often unacceptable |
Burst vs Random Loss: Burst losses (consecutive packets) are far more damaging than evenly distributed losses at the same percentage. The Burst Ratio (BurstR) quantifies this pattern.
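As an illustration, the sketch below computes a burst ratio from a received/lost sequence. It assumes the common formulation in which BurstR is the observed mean burst length divided by the mean burst length expected under purely random loss, 1 / (1 - loss rate). The function name is hypothetical.

```python
def burst_ratio(received):
    """received: sequence of booleans, True = packet received, False = lost.

    Returns the burst ratio, or None when it is undefined
    (empty input, no loss, or total loss).
    """
    total = len(received)
    lost = sum(1 for ok in received if not ok)
    if total == 0 or lost == 0 or lost == total:
        return None

    # Collect the lengths of runs of consecutive lost packets.
    bursts, run = [], 0
    for ok in received:
        if not ok:
            run += 1
        elif run:
            bursts.append(run)
            run = 0
    if run:
        bursts.append(run)

    mean_burst = sum(bursts) / len(bursts)
    loss_rate = lost / total
    expected_random = 1.0 / (1.0 - loss_rate)  # mean burst length for random loss
    return mean_burst / expected_random


if __name__ == "__main__":
    bursty = [True] * 8 + [False] * 4 + [True] * 8  # 20% loss in one burst
    isolated = ([True] * 4 + [False]) * 4           # 20% loss, evenly spread
    print(burst_ratio(bursty), burst_ratio(isolated))
```

Both sequences lose 20% of packets, but the bursty one yields a ratio well above 1 while the evenly spread one stays below 1.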
Packet Loss Concealment (PLC): Modern codecs hide isolated losses by interpolating or repeating audio frames.
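A very simple form of concealment by repetition can be sketched as follows. This is an illustrative toy, not the PLC algorithm of any particular codec: it repeats the last good frame with decaying gain and falls back to silence after a few consecutive losses. All names and parameters are assumptions.

```python
def conceal_losses(frames, frame_len, max_repeats=2, decay=0.5):
    """Replace lost frames (None) with attenuated repeats of the last good frame.

    After max_repeats consecutive losses, or before any good frame has been
    seen, silence is inserted instead.
    """
    out = []
    last_good = None
    repeats = 0
    for frame in frames:
        if frame is not None:
            last_good, repeats = frame, 0
            out.append(list(frame))
        elif last_good is not None and repeats < max_repeats:
            repeats += 1
            gain = decay ** repeats  # fade out with each repeated frame
            out.append([int(s * gain) for s in last_good])
        else:
            repeats += 1
            out.append([0] * frame_len)
    return out


if __name__ == "__main__":
    frames = [[100, -100], None, None, None, [50, 50]]
    print(conceal_losses(frames, frame_len=2))
```

The fallback to silence shows why concealment works for isolated losses but not for long bursts: there is progressively less valid audio to extrapolate from.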
Voice Quality Measurement Methods
MOS and Subjective Testing
Mean Opinion Score (MOS) is a 1-5 scale:
| Score | Quality |
|---|---|
| 5 | Excellent |
| 4 | Good |
| 3 | Fair |
| 2 | Poor |
| 1 | Bad |
Listening-Only MOS (MOS-LQ)
Listeners rate one-way speech samples. Evaluates fidelity and distortion.
Conversational MOS (MOS-CQ)
Two people rate a real conversation. Captures delay, echo, and interaction factors.
Objective Measurement
Intrusive Models (PESQ, POLQA)
Compare reference audio with degraded audio:
- PESQ (P.862): Narrowband, widely used
- POLQA (P.863): Successor, supports wideband/super-wideband
The E-Model (ITU-T G.107)
Parametric model that predicts quality from network/codec parameters:
R = R0 - Is - Id - Ie + A
Where:
- R0 (~94): Base signal-to-noise rating
- Is: Simultaneous impairments (noise, sidetone)
- Id: Delay impairment
- Ie: Equipment/codec impairment
- A: Advantage factor (usually 0)
R-Factor to MOS mapping:
| R-Factor | MOS | Quality |
|---|---|---|
| 90+ | > 4.3 | Excellent |
| 80 | ~4.0 | Good |
| 70 | ~3.6 | Acceptable |
| 60 | ~3.1 | Fair |
| < 50 | < 2.6 | Poor |
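The mapping in the table follows the standard E-model conversion from R to MOS. The sketch below implements that conversion together with a simplified R calculation, R = R0 - Is - Id - Ie + A. The default `r0=94.0` is the approximate value quoted above, the function names are hypothetical, and a full G.107 implementation derives each term from many more inputs.

```python
def r_factor(i_s=0.0, i_d=0.0, i_e=0.0, a=0.0, r0=94.0):
    """Simplified E-model rating: R = R0 - Is - Id - Ie + A."""
    return r0 - i_s - i_d - i_e + a


def r_to_mos(r):
    """Convert an R-factor to an estimated MOS (G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    mos = 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6
    # The polynomial dips marginally below 1 for very small R; clamp it.
    return max(1.0, mos)


if __name__ == "__main__":
    for r in (90, 80, 70, 60, 50):
        print(r, round(r_to_mos(r), 2))
```

Because MOS is a fixed function of R, the two values carry the same information, which is the redundancy referred to in the note below.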
ℹ️ Note: VoIPmonitor does NOT calculate R-Factor separately because it's mathematically redundant with MOS. See Glossary for details.
Monitoring Voice Quality with VoIPmonitor
VoIPmonitor passively captures SIP/RTP traffic and computes quality metrics for every call.
Triple MOS Calculation
VoIPmonitor provides three MOS values simulating different jitter buffer behaviors:
| MOS Type | Buffer Size | Use Case |
|---|---|---|
| MOS F1 | Fixed 50 ms | Simulates basic endpoints |
| MOS F2 | Fixed 200 ms | Simulates buffered endpoints |
| MOS Adapt | Adaptive up to 500 ms | Simulates modern endpoints |
Interpreting the values:
- If MOS F1 << MOS F2: Network jitter often exceeded 50ms but stayed under 200ms
- If all three are similar and high: Jitter was low
- If all three are low: Significant packet loss or extreme jitter
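These rules of thumb can be expressed as a small helper. The thresholds (`high`, `low`, `gap`) are illustrative assumptions chosen for the sketch and are not VoIPmonitor defaults; the function name is hypothetical.

```python
def interpret_triple_mos(mos_f1, mos_f2, mos_adapt, high=4.0, low=3.0, gap=0.5):
    """Rough interpretation of the three MOS values following the rules above."""
    values = (mos_f1, mos_f2, mos_adapt)
    if all(v >= high for v in values):
        return "jitter was low"
    if all(v < low for v in values):
        return "significant packet loss or extreme jitter"
    if mos_f2 - mos_f1 >= gap:
        # The 200 ms buffer coped where the 50 ms buffer did not.
        return "jitter often exceeded 50 ms but stayed under 200 ms"
    return "inconclusive"


if __name__ == "__main__":
    print(interpret_triple_mos(3.1, 4.2, 4.1))
```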
⚠️ Warning: MOS Calculation Requires RTP Flow Charts
MOS calculation REQUIRES enabling RTP flow charts:
- Set savegraph=yes in voipmonitor.conf, OR
- Set recordGRAPH=ON in capture rules
Without this, MOS scores will NOT be calculated.
Why High PDV Causes Low MOS Even With Zero Packet Loss
A call can show 0% packet loss and low average jitter but still have a low MOS. This occurs when:
- Packets arrive with high variation (e.g., some at 20ms, others at 200ms+)
- These delayed packets exceed jitter buffer capacity
- The E-model treats them as "effectively lost" for playout purposes
This is correct behavior - the audio stream cannot be reconstructed smoothly regardless of all packets eventually arriving.
Jitter Analysis
VoIPmonitor records jitter as a distribution across delay bins:
- 0-50 ms, 50-70 ms, 70-90 ms, 90-120 ms, 120-150 ms, 150-200 ms, >300 ms
Filter calls by PDV events: "find calls with at least 10 packets delayed > 120ms"
Packet Loss Analysis
- Total loss percentage
- Consecutive loss counters (burst analysis)
- Filter by patterns: "calls with > X consecutive packets lost"
MOS XR (RTCP-XR) Metrics
MOS XR values are reported by endpoints (phones, gateways) via RTCP-XR packets (RFC 3611), NOT calculated by VoIPmonitor.
| Feature | VoIPmonitor MOS | MOS XR |
|---|---|---|
| Source | VoIPmonitor sensor | Endpoint device |
| Method | E-model (parametric) | Endpoint algorithm |
| Perspective | Network path quality | User experience at device |
| Availability | All calls with RTP captured | Only if endpoint sends RTCP-XR |
Database columns: mos_xr_avg_all, mos_xr_min_all, mos_xr_avg_caller_all, mos_xr_avg_called_all, mos_xr_min_caller_all, mos_xr_min_called_all
Troubleshooting use:
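- VoIPmonitor MOS low, MOS XR high → Impairment is visible only at the monitoring point: suspect the capture path (SPAN or sensor drops, see Scenario C) or an endpoint jitter buffer that absorbed the jitter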
- VoIPmonitor MOS high, MOS XR low → Problem is at endpoint or last mile
- Both low → Network issues affecting entire path
Clipping and Silence Detection
- Clipping detection: Counts audio samples at maximum amplitude (distortion)
- Silence detection: Measures silence percentage (>95% indicates one-way audio)
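The two measurements can be illustrated on decoded 16-bit PCM samples. This is a simplified per-sample sketch, not VoIPmonitor's actual algorithm; the function name and the `silence_threshold` value are assumptions.

```python
def analyze_pcm(samples, max_amplitude=32767, silence_threshold=100):
    """Return (clipped sample count, silence percentage) for 16-bit PCM samples."""
    # A sample at or beyond full scale is counted as clipped.
    clipped = sum(1 for s in samples if abs(s) >= max_amplitude)
    # A sample at or below the threshold is counted as silent.
    silent = sum(1 for s in samples if abs(s) <= silence_threshold)
    silence_pct = 100.0 * silent / len(samples) if samples else 0.0
    return clipped, silence_pct


if __name__ == "__main__":
    samples = [0] * 96 + [32767, -32768, 5000, 200]
    print(analyze_pcm(samples))
```

A production detector would normally work on short frames (energy per frame) rather than single samples, so that zero crossings in normal speech are not counted as silence.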
For configuration details, see Silence_detection.
MOS Configuration
# /etc/voipmonitor.conf
# Jitterbuffer simulation (all enabled by default)
jitterbuffer_f1 = yes
jitterbuffer_f2 = yes
jitterbuffer_adapt = yes
# Packet Loss Concealment (keep enabled)
plcdisable = no
# G.729 specific MOS (enable if using G.729 extensively)
mos_g729 = no
| jitterbuffer_* Value | Effect |
|---|---|
| yes | MOS calculated normally |
| no | Static 4.5 pushed to GUI |
| null | NULL pushed to GUI |
Troubleshooting Quality Issues
Diagnosing Where Issues Occur
When MOS is low but users report no issues (or vice versa), determine the problem location:
| Metric | Before Probe | Between Probe & Endpoint | Sensor Overload |
|---|---|---|---|
| VoIPmonitor MOS | Low | Acceptable | Low |
| RTP Graph Loss/PDV | High | Low | High (sniffer drops) |
| RTCP Loss (endpoint) | High | High | Low |
| User Complaints | Yes | Yes | No |
| RRD Buffer Usage | Normal | Normal | 100% |
Scenario A: Issue Before Probe
Indicators: Low MOS + high RTP graph loss/PDV
Cause: Network problems in path TO monitoring point (congestion, jitter, loss upstream)
Scenario B: Issue Between Probe and Endpoint
Indicators: Acceptable VoIPmonitor MOS + high RTCP loss from endpoints
Cause: Problems in network segment FROM probe TO endpoint (last mile, endpoint network)
Scenario C: Sensor Overload (False Low MOS)
Indicators:
- Low MOS across many calls
- Users report no issues
- RTP shows loss, but RTCP reports show low/no loss
- RRD charts show buffer usage at 100%, packet drops
Cause: Sensor is CPU/IO bound, dropping packets at capture interface
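The three scenarios can be condensed into a decision helper. The sketch below takes the indicators from the table above as already-evaluated booleans; what counts as "low" or "high" is left to the caller, and the function name is hypothetical.

```python
def locate_problem(mos_low, rtp_graph_loss_high, rtcp_loss_high, sensor_buffer_full):
    """Map the indicators from the diagnosis table to scenarios A, B, or C."""
    if mos_low and rtp_graph_loss_high and not rtcp_loss_high and sensor_buffer_full:
        return "C: sensor overload (false low MOS)"
    if mos_low and rtp_graph_loss_high and rtcp_loss_high:
        return "A: issue before probe"
    if not mos_low and rtcp_loss_high:
        return "B: issue between probe and endpoint"
    return "inconclusive"


if __name__ == "__main__":
    print(locate_problem(mos_low=True, rtp_graph_loss_high=True,
                         rtcp_loss_high=False, sensor_buffer_full=True))
```

Sensor overload is checked first because it is the only scenario in which the sensor's own measurements cannot be trusted.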
Fixing Sensor Overload
# /etc/voipmonitor.conf
# Enable modern threading
threading_expanded = yes
# For high traffic: threading_expanded = high_traffic
# Increase ring buffer (default 50, use 500+ for >100 Mbit)
ringbuffer = 500
# Reduce processing load if needed
pcap_dump = no
Also check:
- Switch SPAN port for drops (interface counters)
- SPAN buffer capacity during peak traffic
Validate fix:
systemctl restart voipmonitor
Check RRD charts: buffer usage should stay well below 100%.
Missing Quality Metrics on One Leg
If one call leg shows 0 packets or missing metrics:
Asymmetric Routing
RTP from one endpoint bypasses monitoring point.
Solution: Move sensor to see all traffic, or use ERSPAN from remote switch.
NAT Mismatch
SDP contains public IPs but RTP uses private IPs.
Solution: Configure IP mapping:
# /etc/voipmonitor.conf
# Map public IP (from SDP) to private IP (actual RTP)
natalias = 1.1.1.1 10.0.0.3
Multiple natalias lines can be used for multiple mappings.
External Links
- RFC 3611 - RTCP-XR
- ITU-T G.107 - E-model
- ITU-T G.114 - Delay recommendations
- ITU-T P.800 - Subjective MOS testing
See Also
- Glossary - Definitions of VoIP quality terms
- Silence_detection - Clipping and silence detection configuration
- Sniffer_configuration - Sensor configuration reference
- Scaling - Performance tuning for high traffic
AI Summary for RAG
Summary: Comprehensive guide to VoIP voice quality covering degradation factors (loudness, noise, echo, delay, codec impairments, jitter, packet loss), measurement methods (MOS, PESQ, POLQA, E-model), and VoIPmonitor monitoring features. VoIPmonitor calculates triple MOS (F1: 50ms buffer, F2: 200ms buffer, Adapt: 500ms adaptive) using E-model to simulate different jitter buffer behaviors. CRITICAL: MOS requires savegraph=yes or recordGRAPH=ON. High PDV causes low MOS even with 0% packet loss because delayed packets are treated as effectively lost. MOS XR (RTCP-XR, RFC 3611) are endpoint-reported quality scores vs VoIPmonitor's network-side parametric MOS. Troubleshooting framework: compare VoIPmonitor MOS, RTP graph loss/PDV, and RTCP loss to diagnose if issues are before probe, between probe and endpoint, or sensor overload. Sensor overload indicators: low MOS, no user complaints, high sniffer loss but low RTCP loss, RRD buffer at 100%. Fix with threading_expanded=yes, ringbuffer=500+. Missing metrics on one leg: check asymmetric routing or NAT mismatch (use natalias).
Keywords: voice quality, MOS, Mean Opinion Score, jitter, PDV, packet loss, burst ratio, E-model, R-factor, G.711, G.729, codec impairment, echo, TELR, delay, latency, PESQ, POLQA, VoIPmonitor, triple MOS, MOS F1, MOS F2, MOS Adapt, savegraph, MOS XR, RTCP-XR, RFC 3611, jitter buffer, sensor overload, ringbuffer, threading_expanded, natalias, clipping detection, silence detection, troubleshooting
Key Questions:
- What factors affect VoIP voice quality?
- How does VoIPmonitor calculate MOS scores?
- What is the difference between MOS F1, F2, and Adapt?
- Why is MOS low even with 0% packet loss?
- How do I enable MOS calculation in VoIPmonitor?
- What is MOS XR and how does it differ from VoIPmonitor MOS?
- How do I diagnose where quality issues are occurring?
- What indicates sensor overload vs real network problems?
- How do I fix sensor overload causing false low MOS?
- Why are quality metrics missing on one call leg?
- How do I configure natalias for NAT environments?
- What is the E-model and how is R-factor converted to MOS?
- What are ITU-T G.114 delay recommendations?
- How does packet loss burst ratio affect quality?