Glossary: Difference between revisions

Latest revision as of 16:59, 8 January 2026

Quick reference for VoIP quality metrics and monitoring concepts used in VoIPmonitor.

Network Quality Metrics

Packet Loss

Lost data packets cause audible gaps, clicks, or dropouts. Common causes: congestion, faulty hardware, misconfiguration.

VoIPmonitor approach: Records the loss distribution (how often runs of consecutive lost packets of various lengths occur) rather than just an average percentage. This matters because the same average loss can sound very different: 2% loss scattered randomly through a call is far less noticeable than the same 2% lost in one consecutive burst.
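As a rough illustration of why a loss distribution carries more information than an average, the sketch below counts runs of consecutive missing RTP sequence numbers. The function name `loss_runs` and the two sample streams are hypothetical, and the sketch ignores 16-bit sequence wraparound and reordering; it is not VoIPmonitor's implementation.

```python
from collections import Counter

def loss_runs(received_seqs):
    """Map run length -> number of gaps of that length.

    received_seqs: RTP sequence numbers, assumed sorted and without
    16-bit wraparound (a simplification for this sketch).
    """
    runs = Counter()
    for prev, cur in zip(received_seqs, received_seqs[1:]):
        gap = cur - prev - 1  # packets missing between two received ones
        if gap > 0:
            runs[gap] += 1
    return runs

# Two hypothetical 100-packet streams, both with 2% average loss.
random_loss = [s for s in range(100) if s not in (20, 70)]
burst_loss = [s for s in range(100) if s not in (50, 51)]

print(dict(loss_runs(random_loss)))  # two isolated single-packet losses
print(dict(loss_runs(burst_loss)))   # one run of two consecutive losses
```

Both streams report the same loss percentage, but only the run-length view separates them.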

Packet Delay Variation (PDV) / Jitter

Variation in packet arrival times from expected intervals. High jitter = erratic packet bursts = degraded quality even without packet loss.

VoIPmonitor approach: Measures packets exceeding delay thresholds:

50–70ms, 70–90ms, 90–120ms, 120–150ms, 150–300ms, >300ms

ℹ️ Note: Constant jitter values (same value throughout call) indicate clock mismatch in device, not network issues. Zero jitter with large initial delay = one-time buffering spike, not ongoing jitter.
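The threshold intervals above can be sketched as a simple histogram. This is a minimal example under stated assumptions: the deviation is taken as the absolute difference between the observed inter-arrival time and an expected 20 ms packet interval, and a deviation exactly on a boundary is placed in the lower interval. The names `pdv_histogram`, `EDGES_MS` and `LABELS` are hypothetical.

```python
import bisect

# Lower edges of the intervals listed above, in milliseconds.
EDGES_MS = [50, 70, 90, 120, 150, 300]
LABELS = ["50-70", "70-90", "90-120", "120-150", "150-300", ">300"]

def pdv_histogram(arrival_ms, expected_interval_ms=20):
    """Count packets whose delay deviation exceeds each threshold band."""
    counts = dict.fromkeys(LABELS, 0)
    for prev, cur in zip(arrival_ms, arrival_ms[1:]):
        deviation = abs((cur - prev) - expected_interval_ms)
        # bisect_left puts a deviation equal to an edge in the lower band,
        # so only deviations strictly above 50 ms are counted at all.
        idx = bisect.bisect_left(EDGES_MS, deviation) - 1
        if idx >= 0:
            counts[LABELS[idx]] += 1
    return counts

# Hypothetical arrival times: one 80 ms gap and one 360 ms gap.
print(pdv_histogram([0, 20, 40, 120, 140, 500]))
```

Storing counts per band, rather than a single jitter figure, is what makes it possible to search for calls with a specific delay pattern.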

Post-Dial Delay (PDD)

Time from last digit dialed to first feedback (ringback/busy tone). Long PDD causes users to hang up prematurely.

Mitigation Mechanisms

Jitter Buffer

Temporary storage at receiving end that delays and reorders packets for smooth playback. Types:

Fixed: Constant buffer size
Adaptive: Dynamically adjusts based on network conditions
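A fixed jitter buffer can be sketched as a playout deadline per packet: anything arriving after its deadline is discarded, even though the network delivered it. The example below is an illustrative simplification (the first packet anchors the playout clock, and the 20 ms packet time is assumed); `fixed_buffer_playout` is a hypothetical name.

```python
def fixed_buffer_playout(packets, buffer_ms, ptime_ms=20):
    """Simulate a fixed jitter buffer.

    packets: list of (seq, arrival_ms) in arrival order.
    Returns (played, late): sequence numbers played in order, and
    sequence numbers that missed their playout deadline.
    """
    if not packets:
        return [], []
    first_seq, first_arrival = packets[0]
    played, late = [], []
    for seq, arrival in packets:
        # Each packet must arrive before its scheduled playout time.
        deadline = first_arrival + buffer_ms + (seq - first_seq) * ptime_ms
        (played if arrival <= deadline else late).append(seq)
    return sorted(played), late  # sorting models the buffer's reordering

# Hypothetical stream: packets 2 and 3 swapped, packet 4 badly delayed.
stream = [(0, 0), (1, 20), (3, 65), (2, 70), (4, 200)]
print(fixed_buffer_playout(stream, buffer_ms=50))   # ([0, 1, 2, 3], [4])
print(fixed_buffer_playout(stream, buffer_ms=200))  # ([0, 1, 2, 3, 4], [])
```

A larger buffer absorbs more delay variation at the cost of added latency; an adaptive buffer would change `buffer_ms` over time, which this sketch does not model.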

Packet Loss Concealment (PLC)

Masks lost packets since retransmission is not feasible in real-time voice:

Technique	Description
Zero Insertion	Replace with silence (crudest method)
Waveform Substitution	Repeat last known frame (common, per G.711 Appendix I)
Model-Based	Interpolate missing audio using speech models (best quality)
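The first two techniques in the table can be illustrated with a few lines of code. This is a deliberately simple sketch: "repeat" copies the last good frame verbatim, which is the most basic form of waveform substitution and not the G.711 Appendix I algorithm itself. Model-based concealment is not shown. The function name `conceal` and the frame representation are assumptions.

```python
def conceal(frames, method="repeat", frame_len=160):
    """Fill in lost frames.

    frames: list of sample lists, with None marking a lost frame.
    method: "zero" inserts silence; "repeat" replays the last good frame.
    """
    out = []
    last = [0] * frame_len  # silence until a good frame has been seen
    for frame in frames:
        if frame is None:
            frame = [0] * frame_len if method == "zero" else list(last)
        else:
            last = frame
        out.append(frame)
    return out

# Hypothetical 2-sample frames with the third frame lost.
frames = [[1, 2], [3, 4], None, [5, 6]]
print(conceal(frames, "zero", frame_len=2))    # gap becomes [0, 0]
print(conceal(frames, "repeat", frame_len=2))  # gap becomes [3, 4]
```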

Voice Quality Metrics

Mean Opinion Score (MOS)

MOS is a standardized numerical rating of perceived voice quality, ranging from 1 (bad) to 5 (excellent). Originally a subjective test where human listeners would rate call quality, it is now typically calculated objectively using algorithms like the one defined in the ITU-T P.862 (PESQ) standard.

MOS	Quality	Impairment
5	Excellent	Imperceptible
4	Good	Perceptible but not annoying
3	Fair	Slightly annoying
2	Poor	Annoying
1	Bad	Very annoying

Codec baseline scores:

Codec	Bitrate	Typical MOS
G.711	64 kbit/s	4.1
iLBC	15.2 kbit/s	4.14
AMR	12.2 kbit/s	4.14
G.729	8 kbit/s	3.92
G.723.1	6.3 kbit/s	3.9

VoIPmonitor MOS Calculation

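VoIPmonitor calculates parametric MOS from network metrics (packet loss, PDV), not from the audio signal. Because the effect of PDV on quality depends on the receiver's jitter buffer, it simulates three jitter buffer models; a packet delayed beyond the simulated buffer counts as lost: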

Score	Jitter Buffer Model	Use Case
MOS F1	Fixed 50ms	Very sensitive to jitter
MOS F2	Fixed 200ms	Moderate tolerance
MOS adapt	Adaptive up to 500ms	Real-world endpoint simulation

Default calculation uses G.711 codec with PLC for consistent cross-call comparison.

R-Factor

⚠️ Warning: VoIPmonitor does NOT calculate R-Factor. R-Factor (ITU-T G.107 E-Model, 0-100 scale) is redundant because MOS provides equivalent information with direct mathematical correlation.

Recommended approach instead:

Track MOS percentiles (%95, %99) not averages
Monitor changes over time against historical baselines
Use VoIPmonitor's aggregation by source IP/number
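For readers who receive R-Factor values from other tools, the correlation mentioned in the warning above can be sketched with the standard R-to-MOS mapping from the ITU-T G.107 E-model. This shows the general E-model conversion only; it is not taken from VoIPmonitor's code, and `r_to_mos` is a hypothetical helper name.

```python
def r_to_mos(r):
    """Convert an E-model R-Factor (0-100) to an estimated MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    # Cubic mapping defined in ITU-T G.107.
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6

for r in (50, 70, 80, 93.2):
    print(r, round(r_to_mos(r), 2))
```

Because the mapping is a fixed function, tracking either value over time gives the same picture of quality trends.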

RTCP

RTP Control Protocol provides endpoint-reported statistics: transmitted/lost packets, jitter, round-trip delay. VoIPmonitor parses RTCP reports for an alternative view of call quality.

Carrier-Grade Metrics

ASR (Answer-Seizure Ratio)

Percentage of call attempts that are answered (ITU-T E.411):

ASR = (Answered Calls / Total Seizures) × 100

Low ASR can indicate network problems but is also affected by user behavior (busy, no answer).

NER (Network Effectiveness Ratio)

Like ASR, but measures only the network's ability to deliver calls: calls that reach the destination but are rejected or go unanswered by the user (busy, no answer) count as "successful." Configure which SIP response codes count as successful in Settings.
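The difference between ASR and NER can be made concrete with a small calculation over final SIP response codes. This is a sketch under assumptions: the set of codes treated as user-caused outcomes (`USER_CODES`) is hypothetical and stands in for whatever is configured in Settings, and any 2xx response is treated as an answered call.

```python
# Hypothetical choice of SIP codes that reflect user behaviour, not network faults.
USER_CODES = frozenset({480, 486, 600, 603})

def asr(final_codes):
    """Answer-Seizure Ratio: answered calls / total attempts, in percent."""
    if not final_codes:
        raise ValueError("no call attempts")
    answered = sum(1 for c in final_codes if 200 <= c < 300)
    return 100.0 * answered / len(final_codes)

def ner(final_codes, user_codes=USER_CODES):
    """Network Effectiveness Ratio: answered or user-rejected / total, in percent."""
    if not final_codes:
        raise ValueError("no call attempts")
    ok = sum(1 for c in final_codes if 200 <= c < 300 or c in user_codes)
    return 100.0 * ok / len(final_codes)

# Ten hypothetical call attempts.
codes = [200, 200, 486, 480, 503, 200, 487, 200, 603, 500]
print(asr(codes))  # 40.0
print(ner(codes))  # 70.0
```

In this sample the low ASR is mostly explained by busy or declining users, which the higher NER makes visible.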

ACD (Average Call Duration)

Average length of answered calls. Low ACD combined with low ASR often indicates quality problems (users hanging up due to poor audio).

Statistical Concepts

Percentiles

Value below which a given percentage of observations falls.

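Example: MOS %95 = 3.2 means 95% of calls scored 3.2 or better, i.e. 5% of calls had a MOS of 3.2 or worse. Because a lower MOS is worse, the percentage here is counted from the best calls down. Percentiles are more useful than averages for identifying systemic problems, because a small share of very bad calls barely moves the average.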
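A minimal sketch of a MOS percentile, read in the same direction as the example above (the score that the given percentage of calls meet or exceed). It uses the nearest-rank method, which is one of several percentile definitions; the method VoIPmonitor uses internally is not stated here, and `mos_percentile` is a hypothetical name.

```python
import math

def mos_percentile(scores, pct):
    """Return the MOS that pct% of calls meet or exceed (nearest rank)."""
    if not scores:
        raise ValueError("no scores")
    ordered = sorted(scores, reverse=True)         # best calls first
    rank = math.ceil(pct * len(ordered) / 100)     # 1-based nearest rank
    return ordered[max(rank, 1) - 1]

# Hypothetical trunk: 90 good calls and 10 poor ones.
scores = [4.4] * 90 + [3.2] * 10
print(sum(scores) / len(scores))   # average is about 4.28 and looks healthy
print(mos_percentile(scores, 50))  # 4.4
print(mos_percentile(scores, 95))  # 3.2
```

The average stays close to the good calls, while the 95th percentile exposes the poor tail directly.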

AI Summary for RAG

Summary: Glossary of VoIP quality metrics and monitoring terms for VoIPmonitor. Covers network metrics (Packet Loss with distribution tracking, PDV/Jitter with threshold intervals, PDD), mitigation mechanisms (Jitter Buffer types, PLC techniques), voice quality measurements (MOS 1-5 scale with codec baselines, VoIPmonitor's parametric MOS calculation using F1/F2/adapt jitter buffer simulations), and carrier metrics (ASR, NER, ACD). Key clarification: VoIPmonitor does NOT calculate R-Factor - use MOS percentiles (%95, %99) and historical trend monitoring instead. RTCP provides endpoint-reported alternative statistics.

Keywords: glossary, packet loss, pdv, jitter, jitter buffer, mos, mos f1, mos f2, mos adapt, r-factor, pesq, pdd, rtcp, asr, ner, acd, plc, packet loss concealment, percentile, g.711, g.729, codec, quality, kpi, metric, voip quality, parametric mos

Key Questions:

What is Packet Delay Variation (PDV) and how does VoIPmonitor measure it?
How does VoIPmonitor calculate MOS? What are MOS F1, F2, and adapt?
Does VoIPmonitor calculate R-Factor? What should I use instead?
What is the difference between ASR and NER?
What is Packet Loss Concealment (PLC) and what techniques exist?
How do I interpret MOS percentile scores like %95?
What is a Jitter Buffer and what types exist?
What MOS score should I expect for different codecs?

@@ Line 1: / Line 1: @@
- [[Category:GUI manual]]
+[[Category:GUI manual]]
+{{DISPLAYTITLE:Glossary of VoIP & Monitoring Terms}}
-{{DISPLAYTITLE:Desired Title:test}}
+'''Quick reference for VoIP quality metrics and monitoring concepts used in VoIPmonitor.'''
-=Packet loss=
+== Network Quality Metrics ==
-Packet loss occurs when one or more packets of data travelling across a computer network fail to reach their destination. Packet loss is one of the three main error types encountered in digital communications. Packet loss can be caused by signal degradation over the network medium due to multi-path fading, packet drop because of channel congestion, corrupted packets rejected in-transit, faulty networking hardware, faulty network drivers or normal routing routines.
-==VoIPmonitor loss==
+=== Packet Loss ===
-VoIPmonitor detects packet loss and stores loss distribution to 10 loss intervals so it is able to find larger consecutive losses. This is important because between two calls with two percent package loss, one with random losses throughout will be heard much better than one with a string of consecutive losses.
+Lost data packets cause audible gaps, clicks, or dropouts. Common causes: congestion, faulty hardware, misconfiguration.
-=Packet delay variation PDV=
+'''VoIPmonitor approach:''' Records the loss distribution (how often runs of consecutive lost packets of various lengths occur) rather than just an average percentage. This matters because the same average loss can sound very different: 2% loss scattered randomly through a call is far less noticeable than the same 2% lost in one consecutive burst.
-In computer networking, packet delay variation (PDV) is the difference in end-to-end one-way delay between selected packets in a flow with any lost packets being ignored. The effect is sometimes referred to as jitter and although not in electronics, usage of the term jitter may cause confusion. In this document jitter will always mean PDV.
-The delay is from the start of the packet being transmitted at the source to the end of the packet being received at the destination. A component of the delay which does not vary from packet to packet can be ignored, hence if the packet sizes are the same and packets always take the same time to be processed at the destination then the packet arrival time at the destination could be used instead of the time the end of the packet is received. For interactive real-time applications, e.g., VoIP, PDV can be a serious issue and hence VoIP transmissions may need Quality of Service-enabled networks to provide a high-quality channel.
+=== Packet Delay Variation (PDV) / Jitter ===
+Variation in packet arrival times from expected intervals. High jitter = erratic packet bursts = degraded quality even without packet loss.
-The effects of PDV in multimedia streams can be removed by a properly sized jitter buffer at the receiver, which may only cause a detectable delay before the start of media playback.
+'''VoIPmonitor approach:''' Measures packets exceeding delay thresholds:
+* 50–70ms, 70–90ms, 90–120ms, 120–150ms, 150–300ms, >300ms
-== VoIPmonitor Packet delay variation ==
+{{Note|1='''Constant jitter values''' (same value throughout call) indicate clock mismatch in device, not network issues. '''Zero jitter with large initial delay''' = one-time buffering spike, not ongoing jitter.}}
-VoIPmonitor compares each RTP packet if the delay differs from the optimal value (for most cases the delay between two RTP packets are 20ms). If the delay is higher than 50ms it will be counted to one of PDV intervals which is stored for each RPT direction in cdr table. There are those PDV intervals: 50 – 70ms, 70 – 90ms, 90 – 120ms, 120 – 150ms, 150-200ms, > 300ms.
-The main advantage over traditional standard jitter metric value is that you can search calls for specific delays characteristics.
+=== Post-Dial Delay (PDD) ===
+Time from last digit dialed to first feedback (ringback/busy tone). Long PDD causes users to hang up prematurely.
-= Jitter buffer =
+== Mitigation Mechanisms ==
-Jitter buffers or de-jitter buffers are used to counter PDV (jitter) introduced by queuing in packet switched networks a continuous stream of audio (or video) is transmitted over the network The maximum jitter that can be countered by a de-jitter buffer is equal to the buffering delay introduced before starting the play-out of the mediastream. In the context of packet-switched networks, the term packet delay variation is often preferred over jitter.
+=== Jitter Buffer ===
-Some systems use sophisticated delay-optimal de-jitter buffers that are capable of adapting the buffering delay to changing network jitter characteristics. These are known as adaptive de-jitter buffers and the adaptation logic is based on the jitter estimates calculated from the arrival characteristics of the media packets. Adaptive de-jittering involves introducing discontinuities in the media play-out, which may be irritating to the listener or viewer. Adaptive de-jittering is usually used for audio play-outs that feature a VAD/DTX encoded audio, which  allows the lengths of the silence periods to be adjusted, thus minimizing the perceptible impact of the adaptation.
+Temporary storage at receiving end that delays and reorders packets for smooth playback. Types:
+* '''Fixed:''' Constant buffer size
+* '''Adaptive:''' Dynamically adjusts based on network conditions
-=MOS score=
+=== Packet Loss Concealment (PLC) ===
-Mean opinion score (MOS) is a test that has been used for decades in telephonnetworks to obtain the human user's view of the quality of the network. Historically, and implied by the word Opinion in its name, MOS was a subjective measurement where listeners would sit in a "quiet room" and score call quality as they perceived it; per ITU-T recommendation P.800, "The talker should be seated in a quiet room with volume between 30 and 120 m3 and a reverberation time less than 500 ms (preferably in the range 200-300 ms). The room noise level must be below 30 dBA with no dominant peaks in the spectrum." Measuring Voice over IP (VoIP) is more objective, and is instead a calculation based on performance of the IP network over which it is carried. The calculation, which is defined in the ITU-T PESQ P.862 standard. Like most standards, the implementation is somewhat open to interpretation by the equipment or software manufacturer. Moreover, due to technological progress of phone manufacturers, a calculated MOS of 3.9 in a VoIP network may actually sound better than the formerly subjective score of > 4.0.
+Masks lost packets since retransmission is not feasible in real-time voice:
-In multimedia (audio, voice telephony, or video) especially when codecs are used to compress the bandwidth requirement (for example, of a digitized voice connection from the standard 64 kilobit/second PCM modulation), the MOS provides a numerical indication of the perceived quality of received media from the users' perspective after compression and/or transmission. The MOS is expressed as a single number in the range 1 to 5, where 1 is lowest perceived audio quality, and 5 is the highest.
+{| class="wikitable"
+|-
+! Technique !! Description
+|-
+| Zero Insertion || Replace with silence (crudest method)
+|-
+| Waveform Substitution || Repeat last known frame (common, per G.711 Appendix I)
+|-
+| Model-Based || Interpolate missing audio using speech models (best quality)
+|}
+== Voice Quality Metrics ==
-MOS tests for voice are specified by ITU-T recommendation P.800
+=== Mean Opinion Score (MOS) ===
-The MOS is generated by averaging the results of a set of standard, subjective tests where a number of listeners rate the audio quality of test sentences read aloud by both male and female speakers over the communications medium being tested. A listener is required to give each sentence a rating using the following rating scheme:
+MOS is a standardized numerical rating of perceived voice quality, ranging from 1 (bad) to 5 (excellent). Originally a subjective test where human listeners would rate call quality, it is now typically calculated objectively using algorithms like the one defined in the ITU-T P.862 (PESQ) standard.
 {| class="wikitable"
@@ Line 50: / Line 64: @@
 |}
+'''Codec baseline scores:'''
-The MOS is the arithmetic mean of all the individual scores, and can range from 1 (worst) to 5
-(best).
-Compressor/decompressor (codec) systems and digital signal processing (DSP) are commonly used in voice communications, and can be configured to conserve bandwidth, but there is a trade-off between voice quality and bandwidth conservation. The best codecs provide the most bandwidth conservation while producing the least degradation of voice quality. Bandwidth can be measured quantitatively, but voice quality requires human interpretation, although estimates of voice quality can be made by automatic test systems.
-As an example, the following are mean opinion scores for one implementation of different codecs
 {| class="wikitable"
 |-
-! Codec !! Data rate [kbit/s] !! MOS
+! Codec !! Bitrate !! Typical MOS
 |-
-| G.711 (ISDN) || 64 || 4.1
+| G.711 || 64 kbit/s || 4.1
 |-
-| iLBC || 15.2 || 4.14
+| iLBC || 15.2 kbit/s || 4.14
 |-
-| AMR || 12.2 || 4.14
+| AMR || 12.2 kbit/s || 4.14
 |-
-| G.729 || 8 || 3.92
+| G.729 || 8 kbit/s || 3.92
 |-
-| G.723.1 r63 || 6.3 || 3.9
+| G.723.1 || 6.3 kbit/s || 3.9
+|}
+==== VoIPmonitor MOS Calculation ====
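+VoIPmonitor calculates '''parametric MOS''' from network metrics (packet loss, PDV), not from the audio signal. Because the effect of PDV on quality depends on the receiver's jitter buffer, it simulates three jitter buffer models; a packet delayed beyond the simulated buffer counts as lost: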
+{| class="wikitable"
 |-
-| GSM EFR || 12.2 || 3.8
+! Score !! Jitter Buffer Model !! Use Case
 |-
-| G.726 ADPCM || 32 || 3.85
+| '''MOS F1''' || Fixed 50ms || Very sensitive to jitter
 |-
-| G.729a || 8 || 3.7
+| '''MOS F2''' || Fixed 200ms || Moderate tolerance
 |-
-| GSM FR || 12.2 || 3.5
+| '''MOS adapt''' || Adaptive up to 500ms || Real-world endpoint simulation
 |}
-== VoIPmonitor MOS prediction ==
+Default calculation uses G.711 codec with PLC for consistent cross-call comparison.
-VoIPmonitor transforms [[Glossary#Packet_delay_variation_PDV|PDV]] and packet loss into MOS score according to ITU-T E‑model (please note that jitter is PDV). The voipmonitor MOS does not represent audio signal but network parameters. Because the relation between PDV and MOS scores depends on jitterbuffer implementation voipmonitor implements three jitterbuffer simulators and thus 3 MOS scores:
+[[File:mos.png|VoIPmonitor calculates MOS based on simulated packet loss and PDV, using a pre-calculated surface for G.711 with PLC.]]
-*MOS F1 – fixed jitterbuffer simulator up to 50 ms buffer. Any PDV higher than 50ms will produce packet loss even though there is no packet loss in the stream.
+=== R-Factor ===
-*MOS F2 – fixed jitterbuffer simulator up to 200 ms buffer. Any PDV higher than 200ms will produce packet loss.
+{{Warning|1='''VoIPmonitor does NOT calculate R-Factor.''' R-Factor (ITU-T G.107 E-Model, 0-100 scale) is redundant because MOS provides equivalent information with direct mathematical correlation.}}
-*MOS adapt – adaptive jitterbuffer simulator up to 500ms buffer. Any PDV higher than current buffer length which is changing adaptively will produce packet loss.
-If a call is long enough and there are only a few packet loss / PDV problems the MOS score can be averaged to good values although user remembers that the call had problems. This can happen on calls >15 minutes. We plan in future to calculate MOS scores after 20 seconds intervals and remember the worst MOS score.
+'''Recommended approach instead:'''
+* Track '''MOS percentiles''' (%95, %99) not averages
+* Monitor '''changes over time''' against historical baselines
+* Use VoIPmonitor's aggregation by source IP/number
-VoIPmonitor uses our own equations which comes from PESQ subjective simulator. We have simulated random packet loss between RTP sender and receiver on a scale from 0 - 60% using Markov model distribution. Degraded audio signal for every packet loss simulation is compared with original sound by the PESQ which produces MOS score. Resulting data is on following chart where there is three surfaces. Top surface is MOS score (which is on Z axe) for G.711 codec with [[Glossary#PLC|PLC]] implementation (asterisks internal [[Glossary#PLC|PLC]]). Middle surface is for G.729 with native [[Glossary#PLC|PLC]] and the bottom surface is for G.711 without [[Glossary#PLC|PLC]].
+=== RTCP ===
+RTP Control Protocol provides endpoint-reported statistics: transmitted/lost packets, jitter, round-trip delay. VoIPmonitor parses RTCP reports for an alternative view of call quality.
+== Carrier-Grade Metrics ==
-[[File:mos.png]]
+=== ASR (Answer-Seizure Ratio) ===
+Percentage of call attempts that are answered (ITU-T E.411):
-VoIPmonitor sniffer uses the G.711 PLC surface variant for all calls regardless on codec and this is the reason why the MOS score starts at 4.5 for every good call regardless on codec. This is our intention because MOS score is for searching for calls with bad packet loss / PDV combinations regardless on codec used or for watching sudden changes in MOS scores across whole SIP trunks. Mixing G729 and G711 MOS scores would be difficult to know if the 3.9 MOS score (which is the highest number for G.729) is because of G.729 calls or if 3.9 is due to drop in quality for G.711 calls.
+<code>ASR = (Answered Calls / Total Seizures) × 100</code>
-Ans how the MOS score is exactly calculated? Based on our simulation data we have created approximate function which transforms data based on Ppl and BurstR into MOS score. The function is hardcoded directly in the sniffer.
+Low ASR can indicate network problems but is also affected by user behavior (busy, no answer).
-= Post Dial Delay (PDD) =
+=== NER (Network Effectiveness Ratio) ===
-Post Dial Delay (PDD) is experienced by the customer originating the call from the time the final digit is dialled to the point at which they hear ring tone or other in-band information. Where the originating network is required to play an announcement before completing the call then this definition of PDD excludes the duration of such announcements.
+Like ASR, but measures only the network's ability to deliver calls: calls that reach the destination but are rejected or go unanswered by the user (busy, no answer) count as "successful." Configure which SIP response codes count as successful in Settings.
-=RTCP=
+=== ACD (Average Call Duration) ===
-The RTP Control Protocol (RTCP) is a sister protocol of the Real-time Transport Protocol (RTP). Its basic functionality and packet structure is defined in the RTP specification RFC 3550 superseding its original standardization in 1996 (RFC 1889).RTCP provides out-of-band statistics and control information for an RTP flow. It partners RTP in the delivery and packaging of multimedia data, but does not transport any media streams itself. Typically RTP will be sent on an even-numbered UDP port, with RTCP messages being sent over the next higher odd-numbered port. The primary function of RTCP is to provide feedback on the quality of service (QoS) in media distribution by periodically sending statistics information to participants in a streaming multimedia session.RTCP gathers statistics for a media connection and information such as transmitted octet and packet counts, lost packet counts, jitter, and round-trip delay time. An application may use this information to control quality of service parameters, perhaps by limiting flow, or using a different codec.VoIPmonitor (version >= 5) is able to parse and store RTCP statistics. For each call RTCP jitter, fraction loss and total loss is saved for each direction.
+Average length of answered calls. Low ACD combined with low ASR often indicates quality problems (users hanging up due to poor audio).
-= ACD =
+== Statistical Concepts ==
-The average call duration is a measurement that reflects an average length of telephone calls.
+=== Percentiles ===
+Value below which a given percentage of observations falls.
-= ASR =
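+'''Example:''' <code>MOS %95 = 3.2</code> means 95% of calls scored 3.2 or better, i.e. 5% of calls had a MOS of 3.2 or worse. Because a lower MOS is worse, the percentage here is counted from the best calls down. Percentiles are more useful than averages for identifying systemic problems, because a small share of very bad calls barely moves the average.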
-Answer Seizure Ratio
+== See Also ==
+* [[Comprehensive_Guide_to_VoIP_Voice_Quality]] - Detailed voice quality analysis
+* [[Alerts]] - Configure quality-based alerts using these metrics
+* [[Charts]] - Visualize metrics over time
+* [[Silence_detection]] - Additional audio quality analysis
-ASR is a measure of network quality defined in ITU SG2 Recommendation E.411. Its calculated by taking the number of successfully answered calls and dividing by the total number of calls attempted (seizures). Since busy signals and other rejections by the called number count as call failures, the calculated ASR value can vary depending on user behavior.
+== AI Summary for RAG ==
-= PLC =
+'''Summary:''' Glossary of VoIP quality metrics and monitoring terms for VoIPmonitor. Covers network metrics (Packet Loss with distribution tracking, PDV/Jitter with threshold intervals, PDD), mitigation mechanisms (Jitter Buffer types, PLC techniques), voice quality measurements (MOS 1-5 scale with codec baselines, VoIPmonitor's parametric MOS calculation using F1/F2/adapt jitter buffer simulations), and carrier metrics (ASR, NER, ACD). Key clarification: VoIPmonitor does NOT calculate R-Factor - use MOS percentiles (%95, %99) and historical trend monitoring instead. RTCP provides endpoint-reported alternative statistics.
-Packet loss concealment (PLC) is a technique to mask the effects of packet loss in VoIP communications. Because the voice signal is sent as packets on a VoIP network, they may travel different routes to get to destination. At the receiver a packet might arrive very late, corrupted or simply might not arrive. One of the cases in which the last situation could happen is where a packet is rejected by a server which has a full buffer and cannot accept any more data. In a VoIP connection, error-control techniques such as ARQ are not feasible and the receiver should be able to cope with packet loss.
+'''Keywords:''' glossary, packet loss, pdv, jitter, jitter buffer, mos, mos f1, mos f2, mos adapt, r-factor, pesq, pdd, rtcp, asr, ner, acd, plc, packet loss concealment, percentile, g.711, g.729, codec, quality, kpi, metric, voip quality, parametric mos
-== PLC techniques ==
+'''Key Questions:'''
-* '''Zero insertion''': the lost speech frames are replaced with zero
+* What is Packet Delay Variation (PDV) and how does VoIPmonitor measure it?
-* '''Waveform substitution''': the missing gap is reconstructed by repeating a portion of already received speech. The simplest form of this would be to repeat the last received frame. Other techniques account for fundamental frequency, gap duration etc. Waveform substitution methods are popular because of their simplicity to understand and implement. An example of such an algorithm is proposed in ITU recommendation G.711 Appendix I.
+* How does VoIPmonitor calculate MOS? What are MOS F1, F2, and adapt?
-* '''Model-based methods''': an increasing number of algorithms that take advantage of speech models of interpolating and extrapolating speech gaps are being introduced and developed.
+* Does VoIPmonitor calculate R-Factor? What should I use instead?
+* What is the difference between ASR and NER?
+* What is Packet Loss Concealment (PLC) and what techniques exist?
+* How do I interpret MOS percentile scores like %95?
+* What is a Jitter Buffer and what types exist?
+* What MOS score should I expect for different codecs?