WEB API: Difference between revisions

From VoIPmonitor.org
(Add cdrId parameter to getShareURL documentation)
(VG-3075: Document getVoipCalls response fields including new lastSIPresponseNum)
 
(25 intermediate revisions by the same user not shown)
Line 1: Line 1:
{{DISPLAYTITLE:Comprehensive Guide to VoIP Voice Quality}}
= VoIPmonitor Web API =


This page documents VoIPmonitor's Web APIs for programmatic access to CDR data, recordings, active calls, and system functions.

== Overview ==

<kroki lang="mermaid">
%%{init: {'flowchart': {'nodeSpacing': 15, 'rankSpacing': 30}}}%%
flowchart LR
    subgraph Client
        A[Application]
    end
    subgraph API["API Endpoints"]
        B["/php/api.php"]
        C["/php/model/sql.php"]
    end
    subgraph Tasks
        E[getVoipCalls]
        F[getVoiceRecording]
        G[getPCAP]
        H[listActiveCalls]
        I[getShareURL]
    end
    A -->|user/password| B
    A -->|Session cookie| C
    B --> E & F & G & H & I
</kroki>

{| class="wikitable"
|-
! API !! Endpoint !! Authentication !! Use Case
|-
| '''HTTP API 2''' || <code>/php/api.php</code> || user/password params || CDR queries, recordings, PCAP, active calls
|-
| '''CDR HTTP API''' || <code>/php/model/sql.php</code> || Session cookie || Advanced CDR filtering with GUI filters
|}

{| class="wikitable" style="width:100%; background:#f8f9fa; border:2px solid #3366cc; margin-bottom:20px;"
|-
! colspan="3" style="background:#3366cc; color:white; font-size:1.2em; padding:10px;" | Quick Navigation
|-
! style="width:33%; background:#e6f2ff; padding:8px; vertical-align:top;" | Degradation Factors
! style="width:33%; background:#e6ffe6; padding:8px; vertical-align:top;" | Measurement Methods
! style="width:33%; background:#fff2e6; padding:8px; vertical-align:top;" | Practical Monitoring
|-
| style="vertical-align:top; padding:10px;" |
'''Traditional Impairments'''
* [[#Volume (Loudness) Level|Loudness & Volume]]
* [[#Circuit Noise and Background Noise|Circuit & Background Noise]]
* [[#Sidetone (Local Loop Feedback)|Sidetone]]
* [[#Echo (Talker Echo and Listener Echo)|Echo (Talker & Listener)]]
* [[#Absolute Delay (Latency)|Absolute Delay]]
* [[#Non-Linear Distortion|Non-Linear Distortion]]
* [[#Codec Compression Impairments|Codec Impairments]]

'''IP Network Impairments'''
* [[#Delay and Delay Variation (Jitter)|Delay & Jitter]]
* [[#Packet Loss|Packet Loss]]
* [[#Packet Reordering|Packet Reordering]]
| style="vertical-align:top; padding:10px;" |
'''Subjective Methods'''
* [[#MOS and Subjective Testing|MOS Testing (P.800)]]
* [[#Listening-Only MOS (MOS-LQ)|MOS-LQ (Listening)]]
* [[#Conversational MOS (MOS-CQ)|MOS-CQ (Conversational)]]

'''Objective Methods'''
* [[#Intrusive Objective Models (PESQ, POLQA)|PESQ (P.862)]]
* [[#Intrusive Objective Models (PESQ, POLQA)|POLQA (P.863)]]
* [[#Non-Intrusive Models (P.563)|P.563 (Single-ended)]]

'''Parametric Model'''
* [[#The E-Model (ITU-T G.107)|E-Model (G.107)]]
* [[#Key E-Model Parameters|E-Model Parameters]]
* [[#From R-Factor to MOS|R-Factor → MOS]]
| style="vertical-align:top; padding:10px;" |
'''VoIPmonitor Features'''
* [[#Passive Call Capture and RTP Analysis|RTP Analysis]]
* [[#Jitter Analysis and PDV|Jitter & PDV Statistics]]
* [[#Jitter Buffer Simulation|Jitter Buffer Simulation]]
* [[#Packet Loss Monitoring and Burst Analysis|Loss & Burst Analysis]]
* [[#MOS Calculation in VoIPmonitor|MOS Calculation]]
* [[#Clipping and Silence Detection|Clipping Detection]]
* [[#Proactive Quality Management|Quality Alerts]]
|}


__TOC__
{{Warning|1='''Correct path is critical:''' Always use <code>/php/api.php</code> (not <code>/api.php</code>). Missing the <code>/php/</code> directory causes "action parameter missing" errors.}}


This comprehensive guide details voice quality degradation factors in telephony and VoIP networks, measurement methods, and practical monitoring with VoIPmonitor.
== HTTP API 2 ==


== Introduction ==
This is the preferred API for programmatic access. Send requests via HTTP POST or GET to <code>/php/api.php</code>.


Voice over IP (VoIP) has become a dominant technology for telephony, merging the world of traditional telephones with IP networks. Ensuring high voice quality in VoIP is critical for user satisfaction and successful network operation. Unlike legacy circuit-switched telephony, IP networks introduce new challenges such as variable packet delay and packet loss, which can significantly degrade call quality. Understanding all the factors that impair voice transmission – from analog acoustics to digital network issues – is essential for telecommunications professionals.
=== Authentication ===


In this comprehensive guide, we detail the myriad of voice quality degradation factors and how they map to measurable parameters. We then explore both subjective and objective metrics for quantifying voice quality, including the well-known Mean Opinion Score (MOS) and modern evaluation methods like PESQ, POLQA, and the ITU E-model. Throughout, we reference relevant ITU-T and IETF standards to ground the discussion in current international recommendations.
Include <code>user</code> and <code>password</code> parameters in every request:


Crucially, we also demonstrate how these concepts apply in practice, with a focus on '''VoIPmonitor''' – a specialized VoIP monitoring tool. VoIPmonitor offers unique capabilities for analyzing call quality: detailed jitter and Packet Delay Variation (PDV) statistics, jitter buffer behavior simulation, MOS score computation, and even detection of audio clipping and silence periods. By leveraging passive monitoring of live calls, VoIPmonitor can alert operators to quality issues in real time and provide forensic data to troubleshoot problems.
<syntaxhighlight lang="bash">
curl 'http://server/php/api.php?task=getVoipCalls&user=USER&password=PASSWORD&params=...'
</syntaxhighlight>


This guide is intended as a full-length professional resource for telecom and VoIP engineers. It assumes a technical background and aims to be as detailed and up-to-date as possible. Whether you're designing a network, operating a VoIP service, or developing monitoring solutions, this article will serve as a valuable reference on voice transmission quality in IP networks and how to measure and assure it.
{{Note|1=API authentication uses local VoIPmonitor database credentials. It does NOT use LDAP or custom_login.php.}}


== Voice Quality Degradation Factors ==
'''Rate Limiting:''' Enable in GUI > Settings > System Configuration > "API maximal concurrent connections".


In any voice communication system – traditional or VoIP – there are numerous factors that can degrade the quality of the speech heard by users. These range from acoustic and analog impairments in the endpoints to network-induced impairments in packet-switched transport. Below we examine all key degradation factors in detail, explaining their nature and how they affect perceived quality. We also note how each factor can be measured or quantified, either via objective metrics or through standards-based parameters.
=== getVoipCalls ===


[[File:VoIP_Impairments_E_Model_Parameters.png|thumb|600px|center|E-model reference connection showing all impairment parameters including SLR, RLR, TELR, delay, noise, and codec factors]]
Retrieve CDR records by search criteria.


=== Volume (Loudness) Level ===
'''Parameters:'''
{| class="wikitable"
|-
! Parameter !! Description
|-
| <code>startTime</code> || Calls starting >= this time (required)
|-
| <code>startTimeTo</code> || Calls starting <= this time
|-
| <code>callEnd</code> || Calls ending <= this time
|-
| <code>caller</code> / <code>called</code> || Caller/called number
|-
| <code>callId</code> || SIP Call-ID
|-
| <code>cdrId</code> || Database ID
|-
| <code>id_sensor</code> || Sensor number
|-
| <code>msisdn</code> || Match caller OR called (instead of AND)
|-
| <code>onlyConnected</code> || <code>1</code> = only answered calls
|-
| <code>customHeaders</code> || Comma-separated custom header names to return
|}


'''Description:''' The loudness level of the voice signal is critical for clear communication. If the volume is too low, the speech may be hard to discern; if too high, it can cause discomfort or even distortion. In telephony engineering, loudness is quantified using '''Loudness Ratings'''. The '''Send Loudness Rating (SLR)''' measures the talker's side volume level (microphone gain and telephone set output), and the '''Receive Loudness Rating (RLR)''' measures the listener's side volume (earphone sensitivity). A mismatch in loudness levels between endpoints can cause one party's voice to be too quiet or too loud for the other party.

The '''Overall Loudness Rating (OLR)''' is the sum of SLR and RLR:

:<math>OLR = SLR + RLR</math>

According to ITU-T recommendations:
* Optimal OLR range: '''5-15 dB'''
* Values below 5 dB cause discomfort due to excessive loudness
* Values above 15 dB make speech difficult to understand

'''Examples:'''

<syntaxhighlight lang="bash">
# HTTP GET - by time range and caller
# (-g/--globoff stops curl from treating the JSON braces in the URL as glob patterns)
curl -g 'http://server/php/api.php?task=getVoipCalls&user=USER&password=PASSWORD&params={"startTime":"2024-01-01","startTimeTo":"2024-01-31","caller":"123456"}'

# HTTP POST - by Call-ID
echo '{"task":"getVoipCalls","user":"USER","password":"PASSWORD","params":{"startTime":"2024-01-01","callId":"abc123"}}' | curl -X POST -d @- http://server/php/api.php

# With custom headers
curl -G 'http://server/php/api.php?task=getVoipCalls&user=USER&password=PASSWORD' \
  --data-urlencode 'params={"startTime":"2024-01-01","caller":"123","customHeaders":"X-Custom-Header"}'
</syntaxhighlight>
'''Response fields:'''
{| class="wikitable"
|-
! Field !! Description
|-
| <code>cdrId</code> || Database CDR ID
|-
| <code>id_sensor</code> || Sensor ID
|-
| <code>calldate</code> / <code>callend</code> || Call start/end time
|-
| <code>duration</code> / <code>connect_duration</code> || Total / connected duration
|-
| <code>caller</code> / <code>called</code> || Caller/called numbers
|-
| <code>sipcallerip</code> / <code>sipcalledip</code> || SIP endpoint IPs
|-
| <code>codec_a</code> / <code>codec_b</code> || Audio codecs
|-
| <code>lastSIPresponseNum</code> || '''(New in 2026.1)''' Final SIP response code (e.g., 200, 404, 503)
|-
| <code>callId</code> || SIP Call-ID
|}
=== getVoiceRecording ===


'''Impacts on Quality:''' Suboptimal loudness leads to user frustration: low volume causes the listener to strain to hear, while excessive volume can be perceived as shouting or can introduce distortion. Very high levels might also trigger echo or feedback in some systems.
Download audio recording (WAV format).


'''Measurement:''' Loudness is measured in decibels relative to a reference level. The ITU-T recommends nominal SLR and RLR values for telephone terminals to ensure comfortable listening. In objective planning, one might target an Overall Loudness Rating (OLR) within a range that yields a good loudness MOS. Modern digital VoIP phones generally adhere to these loudness standards, but misconfigurations (like incorrect gain settings) can still occur. In testing, one can use test signals and measure levels in dBm0 or use an analog loudness rating measurement per ITU-T P.79.
'''Parameters:'''
{| class="wikitable"
|-
! Parameter !! Description
|-
| <code>cdrId</code> || Database CDR ID (preferred)
|-
| <code>callId</code> || SIP Call-ID
|-
| <code>calldate</code> || Date hint (default: today)
|-
| <code>zip</code> || <code>true</code> = return ZIP archive
|-
| <code>ogg</code> || <code>true</code> = return OGG format
|-
| <code>saveaudio_afterconnect</code> || <code>"yes"</code> = audio only after call connect
|}


'''Mitigation:''' Proper calibration of device gains and using automatic level control can keep volume levels within optimum range.
'''Examples:'''


<syntaxhighlight lang="bash">
# By CDR ID
# (-g/--globoff stops curl from treating the JSON braces in the URL as glob patterns)
curl -g 'http://server/php/api.php?task=getVoiceRecording&user=USER&password=PASSWORD&params={"cdrId":12345}' > call.wav

# By Call-ID
curl -g 'http://server/php/api.php?task=getVoiceRecording&user=USER&password=PASSWORD&params={"callId":"abc123"}' > call.wav

# Multiple recordings (returns ZIP)
echo '{"task":"getVoiceRecording","user":"USER","password":"PASSWORD","params":{"cdrId":[6,7]}}' | php api.php > calls.zip
</syntaxhighlight>

[[File:VoIP_Impairments_Loudness_vs_Noise.png|thumb|500px|center|Mean Opinion Score as a function of attenuation (OLR) at different circuit noise levels. Source: ITU-T P.11]]

=== Circuit Noise and Background Noise ===

'''Description:''' Circuit noise refers to the electrical noise present in the communication path, even when no one is speaking. In traditional telephony, this could be a low-level hiss or hum on the line (often measured as circuit noise in dBrn). In VoIP, digital circuits themselves do not add audible hiss, but background noise from the environment or analog interfaces (microphone noise, room ambiance) still affects call quality. Background noise is the ambient sound at the caller or callee's location – for example, office chatter, traffic noise, or wind – that gets picked up and transmitted along with speech.


'''Impacts on Quality:''' Noise competes with speech for the listener's attention, effectively reducing the Signal-to-Noise Ratio (SNR). High background or circuit noise can mask parts of speech, making conversation tiring. If noise is too loud, important speech sounds (especially softer consonants) may be missed by the listener. In subjective terms, excessive noise contributes to listener fatigue and lower quality ratings.
=== getPCAP ===


'''Measurement:''' Noise level is typically measured in decibels. Circuit noise in telecom is often measured with a weighted filter (e.g., C-message weighting in analog lines, or psophometric weighting) and expressed in '''dBm0p''' (decibels referred to 0 dBm, psophometric). For background noise, one can measure the noise floor in the absence of speech. The E-model (ITU-T G.107) includes parameters for send-side and receive-side noise levels (denoted as <math>N_s</math> and <math>N_r</math>), which contribute to the transmission rating. Acceptable background noise for comfortable conversation is generally < 30 dBA in a quiet environment. Beyond that, MOS ratings start to drop. VoIP systems often employ noise suppression algorithms to reduce background noise transmission.
Download PCAP file. Automatically merges multiple legs and returns ZIP if needed.


'''Parameters:'''
{| class="wikitable"
|-
! Parameter !! Description
|-
| <code>cdrId</code> || Database CDR ID
|-
| <code>callId</code> || SIP Call-ID
|-
| <code>cidInterval</code> || Time window (seconds) to search for matching Call-ID
|-
| <code>cidMerge</code> || <code>true</code> = merge multiple legs
|-
| <code>zip</code> || <code>true</code> = force ZIP output
|}

{| class="wikitable"
|+ Circuit Noise Impact on Voice Quality
! Noise Level (dBmp) !! Quality Impact
|-
| -80 to -70 || Negligible impact
|-
| -70 to -65 || Minor degradation
|-
| -65 to -55 || Noticeable degradation
|-
| -55 to -40 || Significant degradation
|-
| > -40 || Severe degradation
|}


'''Mitigation:''' Use of noise-cancelling microphones, acoustic treatments in loud environments, and digital noise reduction algorithms can lower the noise that is sent across the call. On analog gateways, ensuring good shielding and proper impedance matching can reduce hum and hiss.
[[File:VoIP_Impairments_MOS_vs_Noise.png|thumb|450px|center|Mean Opinion Score as a function of circuit noise level]]

=== Sidetone (Local Loop Feedback) ===

'''Description:''' Sidetone is the phenomenon where a talker hears a small portion of their own voice in their telephone earpiece as they speak. In traditional analog phones, sidetone is deliberately introduced via the phone's hybrid circuit – it's essentially a form of immediate echo that reassures the speaker that the phone is working and allows them to modulate their speaking volume naturally. The sidetone needs to be at an appropriate level: too low and the user feels like talking into a dead phone; too high and it becomes a loud echo of one's own voice.

'''Impacts on Quality:''' Sidetone primarily affects the talker's comfort. Proper sidetone makes speaking feel more natural and prevents the talker from inadvertently shouting (if they hear no feedback, they may speak louder than necessary). Excessive sidetone, however, can be distracting or annoying (like hearing an echo of yourself immediately). While sidetone doesn't directly impair the listener's experience, it indirectly influences call quality by affecting how the talker behaves. Very poor sidetone (either none or far too much) can lead to lower conversational quality scores in subjective tests.

'''Measurement:''' Sidetone level is quantified by the '''Sidetone Masking Rating (STMR)'''. STMR is defined in ITU-T recommendations as the measure of how much the talker's own voice is reduced (masked) in the feedback they hear:

* Optimal STMR: '''8-12 dB'''
* Higher STMR (excessive loss of sidetone) means very weak sidetone
* Lower STMR means strong sidetone (potentially too loud)

Sidetone characteristics are usually engineered into telephone handsets, but VoIP softphones or headsets might not provide sidetone unless simulated in software.

'''Mitigation:''' In VoIP endpoints, if users complain of "not hearing themselves," some softphone or headset settings provide a sidetone mix. Conversely, if sidetone is too loud (which is rare in digital endpoints unless feedback loops exist), it would require adjusting device firmware.

'''Examples:'''

<syntaxhighlight lang="bash">
# By CDR ID
# (-g/--globoff stops curl from treating the JSON braces in the URL as glob patterns)
curl -g 'http://server/php/api.php?task=getPCAP&user=USER&password=PASSWORD&params={"cdrId":"12345"}' > call.pcap

# By Call-ID with merge
curl -G 'http://server/php/api.php?task=getPCAP&user=USER&password=PASSWORD' \
  --data-urlencode 'params={"callId":"abc123","cidInterval":60,"cidMerge":true}'
</syntaxhighlight>
{{Tip|Use <code>curl -G --data-urlencode</code> for complex JSON params to avoid encoding issues.}}
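
To make the encoding issue concrete, the following is a minimal sketch of what percent-encoding does to a JSON <code>params</code> value. The helper name <code>urlencode</code> is hypothetical (it is not part of curl or VoIPmonitor), and the sketch assumes ASCII-only input; in practice, letting <code>curl --data-urlencode</code> do the encoding is simpler.

```shell
# Hypothetical helper: percent-encode a string, leaving only the RFC 3986
# unreserved characters (letters, digits, ".", "_", "~", "-") untouched.
# Assumes ASCII-only input.
urlencode() {
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}            # everything after the first character
    c=${s%"$rest"}         # the first character itself
    case "$c" in
      [A-Za-z0-9._~-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;   # "'c" yields the character code
    esac
    s=$rest
  done
  printf '%s\n' "$out"
}

# Sample params taken from the getPCAP example; prints the encoded query part.
params='{"callId":"abc123","cidMerge":true}'
echo "params=$(urlencode "$params")"
```

The braces, quotes, colons, and commas are the characters that get replaced, which is why raw JSON in a URL is fragile.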


=== Frequency Response and Attenuation Distortion ===
=== listActiveCalls ===


'''Description:''' Frequency response refers to how evenly the communication channel transmits different audio frequencies. The human voice spans roughly '''300 Hz to 3400 Hz''' in traditional telephony (narrowband PSTN quality), and up to about '''7 kHz''' for wideband voice. Attenuation distortion occurs if some frequency components of speech are attenuated more than others during transmission. For example, if high frequencies are not transmitted well, speech will sound muffled; if low frequencies drop off, speech may sound thin. Classic analog lines and telephony equipment specified a frequency response tolerance (e.g., within ±x dB over the voice band).
Get currently active calls from sensor.


'''Parameters:''' <code>sensorId</code> (optional), <code>callId</code> (optional)

In digital codecs, frequency response is dictated by the codec's design and sampling rate:
* '''G.711''' (narrowband) covers up to ~3.4 kHz
* '''G.722''' (wideband) covers up to 7 kHz, giving more natural sound


'''Impacts on Quality:''' Uneven frequency response can reduce intelligibility. Critical speech components (like sibilants in the high frequencies or vowel formants in mid frequencies) might be weakened, making it harder to distinguish sounds. Muffled audio (loss of highs) often yields lower MOS because it's simply harder to understand, even if loudness is sufficient. Conversely, over-emphasis of certain frequencies can make the sound harsh or cause circuit resonance. A flat frequency response within the necessary band yields the best quality for that band limit. Wideband audio (50–7000 Hz) significantly improves perceived clarity and naturalness, which is why HD Voice services using codecs like G.722 or Opus achieve higher MOS than narrowband G.711 under the same conditions.
<syntaxhighlight lang="bash">
# -g/--globoff stops curl from treating the JSON braces in the URL as glob patterns
curl -g 'http://server/php/api.php?task=listActiveCalls&user=USER&password=PASSWORD&params={"sensorId":"1"}'
</syntaxhighlight>


'''Measurement:''' Frequency response is measured by sending test tones or speech spectra and measuring the gain vs frequency. Telephony standards like ITU-T G.122 and G.130 specified acceptable attenuation distortion (e.g., within ±1 or 2 dB across 300–3400 Hz). Modern digital systems either meet these inherently or intentionally band-limit for known reasons (like narrowband codecs). Subjectively, poor frequency response shows up in MOS tests as lower scores on "clarity" even if no noise or loss is present.
{{Note|1=IP addresses in response are integers. Convert with <code>INET_NTOA()</code> in MySQL or equivalent in your code.}}
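
For client code without MySQL at hand, the same conversion can be sketched in POSIX shell. This is a minimal illustration, assuming the value is an unsigned 32-bit IPv4 address in the byte order <code>INET_NTOA()</code> expects; the function name <code>int_to_ipv4</code> and the sample value are made up for this example.

```shell
# Hypothetical helper: convert an unsigned 32-bit integer to dotted-quad
# IPv4 notation, mirroring what MySQL's INET_NTOA() returns.
int_to_ipv4() {
  printf '%d.%d.%d.%d\n' \
    $(( ($1 >> 24) & 255 )) \
    $(( ($1 >> 16) & 255 )) \
    $(( ($1 >> 8) & 255 )) \
    $(( $1 & 255 ))
}

int_to_ipv4 3232235777   # prints 192.168.1.1
```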


'''Mitigation:''' Using wideband codecs where possible is the primary way to improve frequency-related quality. Ensure that any analog components (microphones, speakers, analog gateways) have appropriate frequency handling. Avoid using overly aggressive filters or low-bitrate codecs that truncate important frequency content unless necessary for bandwidth.
=== handleActiveCall ===


=== Group Delay Distortion ===
Pause/unpause RTP recording on active call.


'''Description:''' Group delay distortion refers to frequency-dependent delay – i.e., when some frequency components of the signal are delayed more than others. In an ideal system, all frequency components of a voice signal would experience the same propagation delay. However, certain filters or network elements might introduce varying delays across the spectrum. The result can be phase distortion of the voice signal, which particularly affects the shape of speech waveforms.
'''Parameters:'''
* <code>sensorId</code> - Sensor number
* <code>command</code> - <code>pausecall</code> or <code>unpausecall</code>
* <code>callRef</code> - Call reference from listActiveCalls


In analog long-distance systems (like multiplexed carrier systems), group delay distortion was a concern; for example, frequencies at the edge of the band might be delayed slightly more. In modern digital VoIP, group delay distortion is usually negligible inside a codec's passband, but if acoustic echoes or frequency-specific processing occur, it could contribute subtly.
<syntaxhighlight lang="bash">
# -g/--globoff stops curl from treating the JSON braces in the URL as glob patterns
curl -g 'http://server/php/api.php?task=handleActiveCall&user=USER&password=PASSWORD&params={"sensorId":"1","command":"pausecall","callRef":"0x7f0e4c3c2680"}'
</syntaxhighlight>


'''Impacts on Quality:''' Minor group delay distortion is generally not perceptible by users on its own. However, in severe cases, it can smear speech transients (making speech sound slightly "slurred" or less crisp). Historically, if group delay distortion exceeded certain thresholds, it could degrade the quality rating, especially in combination with other impairments. In most VoIP scenarios, this is not a dominant factor – codecs typically preserve phase well within the band, and network delays are frequency-independent (packets affect all frequencies equally).
=== getShareURL ===


'''Measurement:''' Group delay distortion can be measured by sending test pulses or multi-frequency signals and measuring delay as a function of frequency. It is usually expressed in milliseconds of variation across the band. For example, older ITU specs might have said something like "not more than 1 ms variation in delay from 500–2800 Hz" for toll-quality circuits. In E-model terms, group delay distortion isn't explicitly parameterized; it would typically be subsumed under general frequency response or just not considered if within nominal limits.
Generate public shareable link for a CDR.
 
'''Mitigation:''' Using modern digital equipment essentially mitigates this – good design of filters (e.g., linear phase filters) ensures minimal group delay distortion. If one were chaining multiple analog devices or filters, ensuring each is well-equalized would help. In essence, this factor is mentioned for completeness but is rarely a standalone issue in today's VoIP telephony.
 
=== Absolute Delay (Latency) ===
 
'''Description:''' Absolute delay is the end-to-end one-way transit time from the speaker's mouth to the listener's ear. In VoIP, this includes all sources of latency:
* '''Codec delay''' – encoding/decoding processing time
* '''Packetization delay''' – assembling voice samples into packets
* '''Network propagation delay''' – transmission through the IP network
* '''Jitter buffer delay''' – buffering to smooth packet arrival variations
* '''Playout delay''' – final processing and D/A conversion
 
Unlike the delay distortion above, absolute delay affects the conversational dynamic but not the audible fidelity of individual words.
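
Because these components simply add up, a one-way delay budget can be sketched as a sum. The function name <code>total_delay_ms</code> and the per-component figures below are illustrative assumptions, not measurements from any particular system.

```shell
# Hypothetical helper: sum one-way latency components given in milliseconds.
total_delay_ms() {
  total=0
  for part in "$@"; do
    total=$(( total + part ))
  done
  echo "$total"
}

# Illustrative values only: codec 5, packetization 20, network 40,
# jitter buffer 60, playout 5
total_delay_ms 5 20 40 60 5   # prints 130
```

Listing the components separately makes it easier to see which one dominates the budget; in this made-up case the jitter buffer does.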
 
'''Impacts on Quality:''' Humans are very sensitive to conversational delay. Small one-way delays (say < 100 ms) are almost unnoticeable and cause no harm to normal conversation. As delay increases into the few hundreds of milliseconds, it begins to disrupt the conversational interaction – you start hearing the other person's responses later than expected, which can lead to awkward pauses, interruptions (talker overlap), and the feeling that the call is "laggy." Very high delays (> 500 ms one-way) make natural conversation nearly impossible – parties will frequently double-talk (both speak at once without realizing) or leave long gaps waiting for responses, often leading to confusion.
 
Importantly, delay also amplifies the perception of echo. If an echo is present, a short delay echo (< 20 ms) just sounds like sidetone; but a longer delay echo (say 100+ ms) becomes distinctly audible and highly annoying. So absolute latency and echo interplay is crucial: even a low-level echo can be tolerated if it returns almost immediately, but if it comes back after a significant delay, it will disturb the talker. Thus, network planners aim to minimize one-way delay not just for conversational smoothness but also to avoid echo issues (or ensure echo cancellers have enough time span to cover the delay).


'''Parameters:'''
{| class="wikitable"
|-
! Parameter !! Description
|-
| <code>callId</code> or <code>cdrId</code> || Call identifier
|-
| <code>cidInterval</code> || Time window for Call-ID search
|-
| <code>rtp</code> || <code>true</code> = include RTP data
|-
| <code>sip_history</code> || <code>true</code> = SIP history only (no RTP)
|-
| <code>anonIps</code> || <code>true</code> = anonymize IPs
|-
| <code>validDays</code> || Link validity in days
|}

{| class="wikitable"
|+ ITU-T G.114 Delay Recommendations
! One-way Delay !! Quality Assessment
|-
| 0-150 ms || Acceptable for most applications (essentially unnoticeable or only slightly noticeable)
|-
| 150-400 ms || Acceptable but may impact conversation flow; at the limit of many echo cancellers
|-
| > 400 ms || Generally unacceptable for two-way conversation
|}
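
The ITU-T G.114 bands for one-way delay (0-150 ms, 150-400 ms, above 400 ms) can be applied to a measured or estimated value with a small helper. This is a sketch only: the function name <code>classify_one_way_delay</code> is hypothetical, the labels are shortened, and counting exactly 150 ms and 400 ms toward the lower band is an assumption, since the bands share their boundary values.

```shell
# Hypothetical helper: map a one-way delay in whole milliseconds to the
# G.114 band it falls in. Boundary values are counted in the lower band.
classify_one_way_delay() {
  ms=$1
  if [ "$ms" -le 150 ]; then
    echo "acceptable"
  elif [ "$ms" -le 400 ]; then
    echo "acceptable, may impact conversation flow"
  else
    echo "generally unacceptable"
  fi
}

classify_one_way_delay 130   # prints: acceptable
classify_one_way_delay 250   # prints: acceptable, may impact conversation flow
```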


'''Measurement:''' One-way delay can be measured with synchronized endpoints or by using protocols like RTCP that can report timing. In a live VoIP call, measuring one-way delay precisely is challenging unless both endpoints or measurement points are time-synchronized (for example, using NTP or GPS timing). Often, only round-trip delay is measured (e.g., via ping or RTCP round-trip time) and one-way delay is inferred by halving it, which is valid only if the path is symmetric. VoIPmonitor, for instance, focuses more on relative delay variation (jitter) since absolute one-way delay requires special instrumentation; however, if both call legs are captured, it can estimate differences in timing between streams.

'''Mitigation:''' To control latency, one should minimize unnecessary buffering (while still using enough jitter buffer to handle jitter), use faster networks/QoS for propagation, and avoid long serialization delays (e.g., use appropriate packetization – smaller packets reduce per-packet serialization delay at the cost of overhead). Many VoIP codecs add some algorithmic delay (lookahead, frame size); using a low-delay codec can help if latency is critical. For echo, deploy echo cancellers (per ITU-T G.168) in gateways or phones to remove line echo up to the expected tail length. Well-implemented echo cancellation is essential especially when delays are moderately high.

<syntaxhighlight lang="bash">
# -g/--globoff stops curl from treating the JSON braces in the URL as glob patterns
curl -g 'http://server/php/api.php?task=getShareURL&user=USER&password=PASSWORD&params={"callId":"abc123","rtp":true,"validDays":15}'
</syntaxhighlight>


=== Echo (Talker Echo and Listener Echo) ===
=== reportSummary ===


Generate pre-configured summary report as image or JSON.

{{Warning|1=Reports must be created in '''GUI → Reports → Daily Report''' BEFORE using the API. The <code>report_name</code> parameter must match the report's Description field.}}

'''Description:''' Echo in a call is the phenomenon of hearing one's own voice reflected back after a delay. There are two types, defined by who hears the echo:

* '''Talker Echo:''' This is when the person speaking hears their own voice back after a delay. It typically results from the far-end hybrid or acoustic coupling. For example, if Alice is talking to Bob, and Bob's phone or network reflects Alice's voice back to her, Alice experiences talker echo.
* '''Listener Echo:''' This is when the person listening hears their own voice coming back from the other side. It's essentially the same physical echo, but described from the opposite perspective.


'''Causes:''' In traditional PSTN, the main cause of echo is the '''hybrid''' – the 2-wire to 4-wire conversion in the analog network. Imperfect impedance matching in hybrids causes a portion of the transmit signal to leak into the receive path, creating an echo. In VoIP contexts, the hybrid echo can occur at analog gateways or digital/analog phone interfaces. Another common cause is '''acoustic echo''': if one party is using a speakerphone or has the handset volume very high, their microphone can pick up the audio from their speaker and send it back.
'''Parameters:'''
* <code>report_name</code> - Description field from GUI report
* <code>datetime_from</code> / <code>datetime_to</code> - Date range
* <code>json</code> - <code>true</code> = return JSON instead of image


[[File:VoIP_Impairments_Hybrid_Echo.png|thumb|500px|center|2-wire/4-wire hybrid converter – source of talker echo due to impedance mismatch]]
<syntaxhighlight lang="bash">
# -g/--globoff stops curl from treating the JSON braces in the URL as glob patterns
curl -g 'http://server/php/api.php?task=reportSummary&user=USER&password=PASSWORD&params={"report_name":"Daily Summary","datetime_from":"2024-01-01","datetime_to":"2024-01-31"}'
</syntaxhighlight>


'''Impacts on Quality:''' Echo can be extremely disruptive, particularly when the delay is noticeable. As mentioned, if the echo delay is very short (< ~30 ms), the brain may not distinguish it as a separate sound – it either merges with sidetone or just gives a sense of reverberation. But beyond that, echo is heard as a distinct repetition of your own words, which is very distracting. The loudness of the echo (usually quantified by '''Echo Return Loss, ERL''', which indicates how much the echo is attenuated) and the delay of the echo are the key factors. A loud echo with a long delay is most annoying. Even a relatively quiet echo can be troublesome if significantly delayed.
=== Other Tasks ===
 
The E-model captures echo impairment in a parameter called '''Talker Echo Loudness Rating (TELR)''' which is related to the level of echo and the delay (reflected in the <math>I_{d,te}</math> component of delay impairment).
 
[[File:VoIP_Impairments_TELR_Tolerance.png|thumb|450px|center|TELR tolerance limits as a function of round-trip delay (ITU-T G.131). The "Acceptable" and "Limiting case" curves show that as delay increases, higher TELR (more echo attenuation) is required.]]
 
'''Measurement:''' Echo is measured by sending known signals and analyzing the returned echo. Key metrics are:
* '''Echo Return Loss (ERL):''' how many dB down the echo is compared to the original voice at the point of reflection
* '''Echo Delay (<math>T_{echo}</math>):''' the round-trip delay from the talker to the echo source and back
* '''Echo Return Loss Enhancement (ERLE):''' if an echo canceller is present, how much additional attenuation it provides
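The two loss metrics above are level ratios expressed in decibels. As a rough sketch, assuming the sent signal, the raw echo, and the residual echo after cancellation are all available as sample lists (the function names and the synthetic 1 kHz tone are illustrative only), ERL and ERLE can be estimated from RMS levels:

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def attenuation_db(reference, other):
    """How many dB quieter `other` is than `reference` (positive = quieter)."""
    return 20.0 * math.log10(rms(reference) / rms(other))


# Synthetic test tone: one second of 1 kHz sampled at 8 kHz.
sent = [math.sin(2 * math.pi * 1000 * n / 8000) for n in range(8000)]
raw_echo = [0.1 * s for s in sent]        # echo path returns a tenth of the amplitude
residual_echo = [0.01 * s for s in sent]  # what remains after an echo canceller

erl = attenuation_db(sent, raw_echo)            # loss of the echo path alone
erle = attenuation_db(raw_echo, residual_echo)  # extra loss added by the canceller
print(f"ERL = {erl:.1f} dB, ERLE = {erle:.1f} dB, total = {erl + erle:.1f} dB")
```

Real measurements use standardized test signals and time alignment of the echo; this sketch only shows how the dB figures relate to signal levels.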
 
'''Mitigation:''' Echo cancellers are the primary solution. They work by modeling the echo path and subtracting the echoed signal from the incoming audio. All modern gateways and even handsets include some form of echo cancellation as per ITU-T G.168. Ensuring proper impedance matching on analog interfaces reduces initial echo. For acoustic echo, strategies include using good echo-cancelling speakerphones, or in software using acoustic echo cancellation algorithms. If users report hearing themselves, that's an indication the far-end might have echo issues; one may need to troubleshoot that user's device or environment. Also, keeping delays low helps – even if some echo leaks through, if it's under ~50 ms, many users won't notice it strongly, whereas the same level of echo at 150 ms delay would be intolerable.
 
=== Non-Linear Distortion ===
 
'''Description:''' Non-linear distortion occurs when the voice signal is altered due to non-linearities in the system. Unlike simple attenuation or filtering (linear distortions), non-linear effects create new frequency components (harmonics or intermodulation) not present in the original signal. Common causes include:
* Overloading an analog circuit (clipping the waveform)
* Non-linear amplifier behavior (e.g., too high gain causing distortion)
* Quantization effects in certain codecs beyond simple quantization noise
* In VoIP, non-linear distortion could also stem from audio coding algorithms that introduce artifacts
 
An example in analog domain: if the signal level is too high for a handset amplifier, the peaks of the waveform may saturate, resulting in a clipped, harsh sound. In digital domain, if audio samples exceed the max representable value, you get digital clipping (which sounds like crackling on peaks).
 
'''Impacts on Quality:''' Non-linear distortion generally produces a harsh, unnatural sound. Clipping, for instance, makes speech sound crackly and can reduce intelligibility (soft consonants might get lost if they fall below the clipping threshold relative to noise, while loud vowels get distorted). Users will describe it as "static" or "crackle" or "robotic" depending on the nature. Even mild non-linear distortion can significantly degrade listening MOS because our ears are sensitive to the introduction of frequencies that shouldn't be there or the flattening of peaks (which affects timbre).
 
'''Measurement:''' In lab environments, one can measure '''Total Harmonic Distortion (THD)''' by sending a pure tone and measuring the harmonic content at output. Subjective tests (P.800 listening tests) have documented the impact: for example, earlier research graphed MOS versus distortion level (like cubic or quadratic distortion added to speech). These show MOS falls as distortion increases, often rapidly once distortion goes beyond a few percent THD.
 
During live network monitoring, detecting non-linear distortion might involve analyzing audio levels versus digital sample clipping or measuring if the waveform peaks are flat-topped. '''VoIPmonitor has the capability to detect clipping events''' – essentially moments where the audio waveform reaches the maximum amplitude (0 dBFS) and flattens. Clipping detection algorithms count how many samples are at max amplitude in a row, indicating likely distortion. For example, VoIPmonitor sensors (when enabled) can report if a certain percentage of audio frames were "clipped".
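The run-counting idea described above can be sketched in a few lines of Python. This is an illustrative approach, not VoIPmonitor's actual implementation: the function name, the 16-bit full-scale value, and the minimum run length of three samples are all assumptions.

```python
def count_clipped_runs(samples, full_scale=32767, min_run=3):
    """Count runs of at least `min_run` consecutive samples at full scale.

    A run of samples pinned at the maximum amplitude suggests a flat-topped
    (clipped) waveform rather than a single legitimate peak.
    """
    runs = 0
    current = 0
    for sample in samples:
        if abs(sample) >= full_scale:
            current += 1
        else:
            if current >= min_run:
                runs += 1
            current = 0
    if current >= min_run:  # a run that reaches the end of the buffer
        runs += 1
    return runs


pcm = [0, 1000, 32767, 32767, 32767, 32767, 500, -32768, -32768, 0, 32767]
print(count_clipped_runs(pcm))
```

In the sample buffer only the four consecutive positive full-scale samples form a run long enough to count; the two-sample and one-sample excursions are ignored.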
 
'''Mitigation:''' Prevent overload at all stages: ensure levels are set such that typical speech peaks do not saturate analog inputs or DACs. Use AGC (automatic gain control) or limiter circuits carefully to avoid hard clipping (a gentle limiter is preferable if truly needed). In digital, ensure your analog gain staging is correct – e.g., a gateway's input gain too high can clip the ADC. If using codecs, choose ones with sufficient bit depth and low distortion. If clipping is detected in monitoring, adjust the gain on the offending side or advise the user to lower microphone sensitivity.
 
=== Codec Compression Impairments ===
 
'''Description:''' Unlike G.711 PCM (which is a simple waveform quantization), most VoIP codecs use lossy compression to reduce bitrate – examples include G.729, G.723.1, AMR, Opus, etc. These codecs achieve low bitrates by discarding some information and using models of human speech perception. '''Codec impairments''' refer to the distortions or losses in fidelity introduced by this compression process.
 
Each codec has a certain baseline quality in ideal conditions (no packet loss, minimal jitter):
* '''G.711''' (64 kbps PCM) is considered the reference with very high quality (MOS ~4.4–4.5 on narrowband scale)
* '''G.729''' (8 kbps) has inherently lower fidelity – even with no network issues, its best MOS might be around 3.7–4.0 (narrowband MOS) under ideal conditions
* '''G.723.1''' (5.3 kbps) has a MOS of about 3.6 under ideal conditions
* Modern wideband codecs (e.g., Opus at 16–20 kbps) can achieve very high quality (wideband MOS > 4)
 
These impairments can manifest as slight muffling, robotic or "watery" sound, loss of certain speech subtleties, etc., depending on the codec.
 
'''Impacts on Quality:''' This is a baseline quality limiter. If you choose a low-bit-rate codec, even a perfectly stable network call will have a certain quality level below that of a higher-rate codec. In MOS terms, each codec can be given an '''Equipment Impairment Factor (<math>I_e</math>)''' in the E-model, which quantifies how much it degrades R-score relative to a hypothetical perfect codec:


{| class="wikitable"
|+ Codec Equipment Impairment Factors (I<sub>e</sub>)
! Codec !! I<sub>e</sub> Value !! Approximate Max MOS
|-
| G.711 (PCM 64k) || 0 || 4.4-4.5
|-
| G.729A (8k) || 10-11 || ~4.0
|-
| G.723.1 (5.3k) || ~15-19 || ~3.6
|}

{| class="wikitable"
|-
! Task !! Description
|-
| <code>listCdrIds</code> || List CDR IDs with basic info. Params: <code>offset</code>, <code>size</code>
|-
| <code>getAudioGraph</code> || Get spectrogram/waveform image. Params: <code>cdrId</code>, <code>type</code> (S/P/ALL), <code>side</code> (L/R)
|}


'''Measurement:''' Objective voice quality scores like MOS can be predicted for each codec. The E-model uses those <math>I_e</math> values to adjust the expected R (and thus MOS) for the codec in use. Another way is using POLQA or PESQ – these algorithms, given reference and degraded audio, will output a MOS score.
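To make the link between <math>I_e</math>, R, and MOS concrete, the following sketch applies the R-to-MOS mapping from ITU-T G.107. It assumes every other E-model parameter is at its default, which gives a base R of 93.2, and simply subtracts the codec's <math>I_e</math>; the function names are illustrative, and the results are approximations rather than the values any particular tool reports.

```python
DEFAULT_R = 93.2  # G.107 rating with all parameters at their default values


def r_to_mos(r):
    """Map an E-model rating R to an estimated MOS (ITU-T G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7.0e-6


def best_case_mos(ie):
    """Best-case MOS for a codec, ignoring loss, delay and echo impairments."""
    return r_to_mos(DEFAULT_R - ie)


for codec, ie in (("G.711", 0), ("G.729A", 11), ("G.723.1 (5.3k)", 15)):
    print(f"{codec}: Ie={ie}, R={DEFAULT_R - ie:.1f}, MOS~{best_case_mos(ie):.2f}")
```

Because the mapping is non-linear, the same <math>I_e</math> increase costs more MOS in the middle of the scale than near the top.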
== CDR HTTP API ==


In monitoring, '''VoIPmonitor by default assumes calls use G.711''' (for its MOS calculation) so that it provides a uniform scale for comparison. It notes that G.711 has a maximum MOS of 4.5 (narrowband) and that if a call actually used G.729 (max ~4.0–4.1), the displayed MOS might seem "too high" for that codec. VoIPmonitor allows adjusting for the actual codec if desired, so you can get a realistic MOS reflecting codec choice.
Session-based API using GUI filter parameters. Requires login first.


'''Mitigation:''' The only "solution" here is to choose an appropriate codec for the scenario. Use a higher-fidelity codec when quality is more important than bandwidth. For HD voice, use wideband codecs (which not only avoid narrowband limitation but often also have lower compression artifacts at similar bit rates due to better algorithms).
=== Authentication ===


[[File:VoIP_Impairments_Codec_PESQ.png|thumb|500px|center|PESQ scores for different codecs (G.729A, G.711 with various PLC methods) under packet loss conditions]]
<syntaxhighlight lang="bash">
# Step 1: Login and get session
curl -X POST 'http://server/php/model/sql.php?module=bypass_login&user=USER&pass=PASSWORD'
# Returns: {"SID":"abc123...","cookie_name":"PHPSESSID","success":true}

# Step 2: Use session cookie for requests
curl -X POST --cookie "PHPSESSID=abc123..." \
  'http://server/php/model/sql.php?task=LISTING&module=CDR&fdatefrom=2024-01-01T00:00:00&fcaller=123'
</syntaxhighlight>

=== Quantization Distortion ===


'''Description:''' Quantization distortion is a specific kind of impairment resulting from the digital representation of an analog signal. In G.711 PCM, the analog waveform is sampled and each sample is quantized to an 8-bit value (μ-law or A-law companding) – this process introduces a small rounding error called '''quantization noise'''. For linear PCM of sufficient bit-depth, this noise is very low (for 8-bit logarithmic PCM, effectively ~12-bit linear equivalent in best case, giving around 38 dB signal-to-quantization-noise ratio for small signals, and ~55 dB for loud signals due to companding). In modern terms, quantization noise in G.711 is negligible in perceived quality (hence G.711 is very good).
{{Note|1=If GUI is in subdirectory (e.g., <code>/demo</code>), cookie name changes to <code>PHPSESSID-demo</code>.}}
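Because the cookie name can vary with the installation path, a client is safer reading it from the login response than hard-coding <code>PHPSESSID</code>. A minimal sketch, assuming the login response has the JSON shape shown in the example above (the helper name <code>session_cookie_header</code> is hypothetical):

```python
import json


def session_cookie_header(login_response_text):
    """Build a Cookie header from the JSON returned by the login request."""
    data = json.loads(login_response_text)
    if not data.get("success"):
        raise RuntimeError("login was not successful")
    # Use the cookie name reported by the server, e.g. PHPSESSID or PHPSESSID-demo.
    return {"Cookie": f'{data["cookie_name"]}={data["SID"]}'}


root_install = '{"SID":"abc123","cookie_name":"PHPSESSID","success":true}'
demo_install = '{"SID":"xyz789","cookie_name":"PHPSESSID-demo","success":true}'

print(session_cookie_header(root_install))
print(session_cookie_header(demo_install))
```

The returned dictionary can be passed as request headers to whichever HTTP client issues the follow-up requests.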


For historical network planning purposes, the '''Quantization Distortion Unit (QDU)''' was established, where one unit corresponds to the quantization distortion that occurs during a single A/D conversion using ITU-T G.711 codec.
'''Disable authorization:''' GUI > Settings > System Configuration > "Disable authorization for API usage"


'''Impacts on Quality:''' For G.711, the quantization noise floor is low enough that users don't perceive "hiss" from it under normal circumstances. Only if you compare to a higher fidelity system or have many tandem stages would the slight quality reduction be noted. On an MOS scale, G.711's quantization is what limits it to MOS 4.4–4.5 (since it's not absolutely transparent to analog original, but very close).
=== Filter Parameters ===


'''Measurement:''' Quantization noise can be measured by techniques like Signal-to-Quantization-Noise Ratio (SQNR) for a given input level. The E-model actually has a term for quantizing distortion unit (qdu) in analog network planning (older usage), but when using <math>I_e</math> for digital codecs, one typically doesn't separately calculate quantization impairment – it's lumped into the codec's overall impairment factor.
'''Mandatory:''' <code>task=LISTING</code>, <code>module=CDR</code>, and at least one of <code>fdatefrom</code> or <code>fdateto</code>.
 
'''Mitigation:''' Ensure using sufficient bit depth for any analog-to-digital conversion. All standard codecs and equipment already do this (8-bit companded PCM is standard; if more fidelity needed, wideband uses 16-bit linear internally before compression, etc.). Avoid multiple unnecessary A/D conversions (each one adds a bit of noise). This is why transcoding between lossy codecs is discouraged unless necessary.
 
=== Degradation Factors in IP Networks ===
 
All the factors above exist (to varying extents) in any voice system, including the traditional PSTN. However, VoIP over IP networks introduces '''additional impairment factors''' stemming from the packet-based, best-effort nature of IP transport. The key network-induced degradations are '''delay (and its variation)''', '''packet loss''', and '''packet reordering'''. These directly affect voice quality in ways distinct from analog impairments, and they require special handling (like jitter buffers and loss concealment) to mitigate.


{| class="wikitable"
|+ IP Network Parameters to Voice Quality Mapping
! Network Parameter !! Conversion !! Voice Quality Impact
|-
| IP Transfer Delay (IPTD) || IPTD + source delay + destination delay || Average end-to-end delay
|-
| IP Delay Variation (IPDV) || Combined with jitter buffer behavior || Adds to delay or causes frame loss
|-
| Packet Errors || IP + UDP + RTP header errors || Audio frame loss
|-
| Packet Reordering || May be treated as loss || Audio frame loss
|-
| Lost Packets || IP loss + all audio defects || Audio frame loss
|-
| Burst Packet Loss || Depends on burst length || Call interruption
|}

{| class="wikitable"
|-
! Parameter !! Description !! Example
|-
| <code>fdatefrom</code> / <code>fdateto</code> || Date range || <code>2024-01-01T00:00:00</code>
|-
| <code>fcaller</code> / <code>fcalled</code> || Caller/called number || <code>123456</code>
|-
| <code>fcallerd_type</code> || <code>0</code>=OR, <code>1</code>=AND || <code>1</code>
|-
| <code>fcaller_domain</code> / <code>fcalled_domain</code> || SIP domain || <code>sip.example.com</code>
|-
| <code>fsipcallerip</code> / <code>fsipcalledip</code> || SIP IP address || <code>192.168.1.1</code>
|-
| <code>fcallid</code> || SIP Call-ID || <code>abc123</code>
|-
| <code>fsipresponse</code> || SIP response code || <code>503</code>
|-
| <code>fdurationgt</code> / <code>fdurationlt</code> || Duration (seconds) || <code>10</code>
|-
| <code>fsensor_id</code> || Sensor ID || <code>1</code>
|-
| <code>fcodec</code> || Codec numbers (comma-sep) || <code>0,8</code>
|-
| <code>suppress_results</code> || <code>1</code>=count only || <code>1</code>
|}


==== Delay and Delay Variation (Jitter) ====
'''Paging:''' <code>page</code>, <code>start</code>, <code>limit</code>
 
'''Description:''' We discussed absolute one-way delay (latency) earlier as a conversational impairment. In IP networks, aside from the base latency, the '''variation in packet delay''' is a critical factor. This variation is commonly called '''jitter'''. Technically, the term "jitter" can be defined in multiple ways, but in VoIP context it refers to the variability in inter-packet arrival times relative to the original send intervals. Network congestion, queuing, and route changes cause some packets to take longer than others.
 
For clarity, the precise term defined by standards is '''Packet Delay Variation (PDV)'''. ITU-T Y.1540 and IETF RFC 3393 define PDV as the differences in one-way delay between selected packets. Many people use "jitter" informally to mean PDV. In this article, we will use jitter to mean packet delay variation, as is common in VoIP discussions.
 
The '''Inter-Packet Delay Variation (IPDV)''' is calculated as:
 
:<math>IPDV(i) = D(i) - D(i-1)</math>
 
where <math>D(i)</math> is the delay of packet <math>i</math>.
 
The '''Packet Delay Variation (PDV)''' relative to minimum delay:
 
:<math>PDV(i) = D(i) - D_{min}</math>
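The two definitions above translate directly into code. A short sketch, assuming a list of measured one-way delays in milliseconds (the sample values are made up for illustration):

```python
def ipdv(delays):
    """Inter-packet delay variation: D(i) - D(i-1) for each consecutive pair."""
    return [delays[i] - delays[i - 1] for i in range(1, len(delays))]


def pdv(delays):
    """Packet delay variation relative to the minimum observed delay."""
    d_min = min(delays)
    return [d - d_min for d in delays]


delays_ms = [20, 22, 35, 21, 20]
print(ipdv(delays_ms))  # signed: a late packet is followed by a negative value
print(pdv(delays_ms))   # never negative: distance above the fastest packet
```

IPDV is signed and highlights packet-to-packet swings, while PDV is always non-negative and shows how far each packet sits above the best-case delay, which is the quantity a jitter buffer has to absorb.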
 
Because voice is real-time, the receiver cannot just wait indefinitely for delayed packets – doing so adds unacceptable latency. Instead, receivers implement a '''jitter buffer (de-jitter buffer)''' that buffers incoming packets for a short time and plays them out in steady stream. This buffer absorbs timing variations up to a point. If a packet is delayed beyond what the buffer can absorb (i.e., arrives late after its scheduled playout time), that packet is effectively lost (discarded).
 
[[File:VoIP_Impairments_Dejitter_Buffer.png|thumb|550px|center|De-jitter buffer architecture in VoIP receive path, showing packet flow from network through RTP/UDP, de-jitter buffer, decoder with PLC, and D/A converter]]
 
'''Impacts on Quality:''' Jitter itself, if managed perfectly by the jitter buffer, only introduces additional delay (the buffering delay). Small jitter is handled by a small buffer, adding maybe 20–50 ms latency, which might be negligible. High jitter either forces a large jitter buffer (adding significant delay, which as we know harms conversational quality) or causes many late packets to be dropped (which is perceived as loss gaps or audio clipping). So the impact of jitter is indirect: it translates into either more delay or more loss, both of which degrade quality.
 
Jitter buffer types:
* '''Static (Fixed)''' – fixed buffer size (simpler, higher latency)
* '''Adaptive''' – dynamic buffer size based on network conditions (lower latency, more complex)
 
Trade-offs:
* Larger buffer → lower packet loss, higher delay
* Smaller buffer → higher packet loss, lower delay
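The delay-versus-loss trade-off can be illustrated with a deliberately simplified fixed-buffer model. The sketch assumes the playout deadline is anchored to the first packet's delay plus the buffer depth, and that any packet whose delay exceeds that deadline is discarded; real jitter buffers are more elaborate, and the delay values are invented.

```python
def late_loss_fraction(delays_ms, buffer_ms):
    """Fraction of packets that miss playout with a fixed jitter buffer.

    Simplified model: the playout schedule is set by the first packet's
    delay plus the buffer depth, so a packet is late when its own delay
    is larger than that deadline.
    """
    deadline = delays_ms[0] + buffer_ms
    late = sum(1 for d in delays_ms if d > deadline)
    return late / len(delays_ms)


delays = [40, 42, 95, 41, 60, 43, 120, 40, 44, 58]
for buffer_ms in (20, 60, 100):
    loss = late_loss_fraction(delays, buffer_ms)
    print(f"buffer {buffer_ms:3d} ms -> added delay {buffer_ms} ms, late loss {loss:.0%}")
```

Each step up in buffer depth removes late packets but adds the same amount of fixed delay to every packet in the call.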
 
[[File:VoIP_Impairments_Delay_Jitter_Graph.png|thumb|450px|center|Delay and jitter measurement showing packet delay variation over time. The graph shows delay (blue), MAPDV2 (green), and MPPDV measurements across packet sequence numbers.]]
 
'''Measurement:''' Jitter can be quantified in various ways:
* '''RTP Jitter (as per RFC 3550):''' A smoothed statistical variance measure that endpoints often report via RTCP
* '''Packet Delay Variation metrics:''' ITU Y.1541/Y.1540 define metrics like IPDV, often measured as the difference between some percentile of delay vs a lower percentile
* '''Mean Absolute Packet Delay Variation (MAPDV2):''' An average deviation measure
* '''Jitter Distribution:''' Tools like VoIPmonitor actually record jitter distribution into buckets – e.g., how many packets were delayed by 0–50 ms, 50–70 ms, 70–90 ms, etc.
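As an illustration of the first metric in the list, RFC 3550 defines the interarrival jitter as a running estimate updated with gain 1/16 from the absolute change in packet transit time. A minimal sketch, assuming transit times (arrival time minus RTP timestamp) are already available in a common unit such as milliseconds; the function name is illustrative:

```python
def rfc3550_jitter(transit_times):
    """Running interarrival jitter estimate as defined in RFC 3550.

    For each consecutive pair of packets, D is the change in transit time,
    and the estimate moves one sixteenth of the way toward |D|.
    """
    jitter = 0.0
    for previous, current in zip(transit_times, transit_times[1:]):
        d = abs(current - previous)
        jitter += (d - jitter) / 16.0
    return jitter


steady = [30, 30, 30, 30]
bursty = [30, 46, 30, 46, 30]
print(rfc3550_jitter(steady))
print(rfc3550_jitter(bursty))
```

The 1/16 gain smooths out single spikes, which is why this figure tends to understate short bursts of delay variation compared with a bucketed distribution.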
 
VoIPmonitor's approach: it logs detailed jitter statistics per call, and even color-codes a Delay column in its CDRs which shows distribution of jitter events. This allows quick visual identification of calls that had high jitter.


'''Mitigation:''' Quality of Service (QoS) mechanisms can reduce network jitter by prioritizing voice packets and preventing long queues. Over-provisioning bandwidth and avoiding slow network segments also helps. On the endpoint side, using an adaptive jitter buffer is crucial. Adaptive jitter buffers dynamically size themselves based on network conditions. They try to balance delay and loss: if jitter increases, the buffer may widen (increasing delay slightly to avoid packet drops). When jitter decreases, they can shrink to reduce latency.
'''Sorting:''' <code>sort=[{"property":"calldate","direction":"DESC"}]</code>
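A CDR listing request combines the mandatory parameters, any filters, paging, and the JSON-encoded sort specification. The sketch below assembles such a URL in Python; the helper name <code>build_cdr_listing_url</code> is hypothetical, and only parameters documented on this page are used.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs


def build_cdr_listing_url(base_url, filters, sort=None, start=None, limit=None):
    """Build a sql.php LISTING URL from filter parameters."""
    if "fdatefrom" not in filters and "fdateto" not in filters:
        raise ValueError("at least one of fdatefrom / fdateto is required")
    query = {"task": "LISTING", "module": "CDR", **filters}
    if sort is not None:
        # The sort specification is passed as a JSON array of objects.
        query["sort"] = json.dumps(sort, separators=(",", ":"))
    if start is not None:
        query["start"] = start
    if limit is not None:
        query["limit"] = limit
    return f"{base_url}/php/model/sql.php?{urlencode(query)}"


listing_url = build_cdr_listing_url(
    "http://server",
    {"fdatefrom": "2024-01-01T00:00:00", "fcaller": "123", "fsipresponse": "503"},
    sort=[{"property": "calldate", "direction": "DESC"}],
    start=0,
    limit=50,
)
print(listing_url)
parsed = parse_qs(urlsplit(listing_url).query)
print(parsed["sort"][0])
```

Every parameter ends up as a <code>name=value</code> pair in the query string, and the session cookie from the login step still has to be sent with the request.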


==== Packet Loss ====
{{Warning|1=Use <code>=</code> (equals) for parameters, not <code>:</code> (colon). Example: <code>fcallerd_type=1</code> (correct), not <code>fcallerd_type:1</code> (wrong).}}


'''Description:''' Packet loss is the failure of one or more packets to reach the receiver. In IP networks, packets can be lost due to congestion (buffers overflow and packets are dropped), routing issues, link errors, etc. In VoIP, any lost RTP packet means a chunk of audio (typically 20ms or so) is missing from the stream. Unlike circuit networks where bit errors might cause slight noise, in packet networks a lost packet results in a gap unless concealed.
=== Wildcards ===


'''Impacts on Quality:''' Packet loss is one of the most severe impairments for VoIP. Each lost voice frame results in missing audio. The distribution of losses matters greatly:

* '''Random isolated losses:''' A packet here or there is lost. Modern decoders apply '''Packet Loss Concealment (PLC)''' algorithms to hide the loss – for example, inserting a copy of the last packet's audio, or some noise, to mask the gap. Humans are quite tolerant to an occasional 20ms concealment.
* '''Burst losses:''' Multiple packets in a row lost (e.g., a burst of 100ms of audio missing) is much more damaging to speech intelligibility and quality. A 5% loss rate concentrated in bursts (for example, five consecutive packets lost out of every 100) is far worse than 5% loss spread evenly (one packet lost in every 20).

This is why two calls with the same average loss can have very different quality. One could have evenly spaced tiny gaps (somewhat tolerable), the other could have entire words or syllables dropping out at once (very annoying).

Use <code>%25</code> (URL-encoded <code>%</code>) for SQL LIKE patterns:

<syntaxhighlight lang="bash">
# Caller starting with "00"
curl -G 'http://server/php/api.php?task=getVoipCalls&user=USER&password=PASSWORD' \
  --data-urlencode 'params={"startTime":"2024-01-01","caller":"%2500%25"}'
</syntaxhighlight>
=== Codec Values ===


{| class="wikitable"
|-
! Value !! Codec !! Value !! Codec
|-
| 0 || PCMU (G.711 μ-law) || 8 || PCMA (G.711 A-law)
|-
| 3 || GSM || 9 || G.722
|-
| 4 || G.723 || 12 || QCELP
|-
| 18 || G.729 || 97 || iLBC
|-
| 98 || Speex || 301-305 || SILK variants
|-
| 306-308 || iSAC variants || 1000 || T.38 (Fax)
|}

{| class="wikitable"
|+ Packet Loss Impact on Voice Quality
! Loss Rate !! Quality Impact
|-
| 0-1% || Negligible with good PLC
|-
| 1-3% || Minor degradation noticeable
|-
| 3-5% || Noticeable degradation
|-
| 5-10% || Significant degradation
|-
| > 10% || Severe degradation, often unacceptable
|}


The '''Packet Loss Percentage (Ppl)''' is calculated as:
=== RTP Quality Filters ===
 
:<math>P_{pl} = \frac{N_{lost}}{N_{total}} \times 100</math>
 
The '''Burst Ratio (BurstR)''' characterizes loss patterns:

:<math>BurstR = \frac{MBL_B}{MBL_R}</math>

where:
* <math>MBL_B</math> = Mean Burst Length actually observed
* <math>MBL_R</math> = Mean Burst Length expected under a random loss model at the same loss rate

If BurstR > 1, the loss is burstier than random loss; BurstR = 1 corresponds to purely random (independent) loss. The E-model uses BurstR along with average loss to adjust the impairment (since bursty loss is penalized more).
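The loss percentage and burst ratio can be computed from a per-packet loss record. The sketch below takes BurstR as the observed mean burst length divided by the mean burst length expected for independent loss at the same rate, which is <math>1/(1-p)</math> for loss probability <math>p</math>. It also applies the effective equipment impairment formula from ITU-T G.107, <math>I_{e,eff} = I_e + (95 - I_e) \cdot P_{pl} / (P_{pl}/BurstR + B_{pl})</math>. The function names are illustrative, and the <code>bpl</code> value used in the demonstration is an assumed robustness factor, not a figure taken from this page.

```python
def loss_bursts(lost_flags):
    """Return the lengths of all runs of consecutive lost packets."""
    bursts, run = [], 0
    for lost in lost_flags:
        if lost:
            run += 1
        elif run:
            bursts.append(run)
            run = 0
    if run:
        bursts.append(run)
    return bursts


def ppl_percent(lost_flags):
    """Packet loss percentage: lost / total * 100."""
    return 100.0 * sum(lost_flags) / len(lost_flags)


def burst_ratio(lost_flags):
    """Observed mean burst length over the mean expected for random loss."""
    bursts = loss_bursts(lost_flags)
    if not bursts:
        return 1.0  # no loss at all: treat as non-bursty
    p = sum(lost_flags) / len(lost_flags)
    if p >= 1.0:
        return 0.0  # every packet lost: the random-model burst length is unbounded
    observed = sum(bursts) / len(bursts)
    expected_random = 1.0 / (1.0 - p)
    return observed / expected_random


def ie_eff(ie, ppl, burst_r, bpl):
    """Effective equipment impairment under packet loss (G.107 form)."""
    return ie + (95.0 - ie) * ppl / (ppl / burst_r + bpl)


even = [i % 20 == 19 for i in range(100)]  # one loss in every 20 packets
bursty = [45 <= i < 50 for i in range(100)]  # five losses in a row

for name, flags in (("even", even), ("bursty", bursty)):
    ppl, br = ppl_percent(flags), burst_ratio(flags)
    print(f"{name}: Ppl={ppl:.1f}%  BurstR={br:.2f}  "
          f"Ie_eff={ie_eff(0, ppl, br, bpl=25.1):.1f}")
```

Both sequences lose 5% of packets, but the bursty one has a much higher BurstR and therefore a larger effective impairment.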
 
[[File:VoIP_Impairments_MOS_vs_PacketLoss.png|thumb|450px|center|Mean Opinion Score degradation as a function of packet loss percentage for different distortion types]]
 
'''Measurement:''' Packet loss is measured as a percentage of packets sent that are not received. VoIPmonitor counts lost packets and crucially also records the '''distribution of consecutive losses'''. It keeps counters of how often 1 packet was lost, how often 2 in a row were lost, etc., up to 10+ in a row. This allows it to characterize burstiness.
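A histogram of consecutive losses like the one described above can be built in a single pass over the loss record. This is an illustrative sketch rather than VoIPmonitor's code: the function name is hypothetical, and the top bucket collects every burst of ten or more packets.

```python
from collections import Counter


def loss_run_histogram(lost_flags, max_bucket=10):
    """Count loss bursts by length, with the last bucket meaning 'max_bucket or more'."""
    histogram = Counter()
    run = 0
    # A trailing False flushes a burst that reaches the end of the record.
    for lost in list(lost_flags) + [False]:
        if lost:
            run += 1
        elif run:
            histogram[min(run, max_bucket)] += 1
            run = 0
    return {length: histogram.get(length, 0) for length in range(1, max_bucket + 1)}


record = [1, 0, 1, 1, 0, 0] + [1] * 12
print(loss_run_histogram(record))
```

In the sample record there is one single loss, one double loss, and one 12-packet burst that lands in the top bucket.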
 
VoIPmonitor's approach: it computes the E-model's packet-loss impairment factoring in burstiness for its MOS scores. If configured, it also can incorporate codec-specific robustness (the Bpl parameter) in this calculation.
 
[[File:VoIP_Impairments_Codec_Loss_Comparison.png|thumb|500px|center|MOS comparison of different codecs (DoD-LPC, DoD-CELP with FEC) under varying packet loss conditions]]
 
'''Mitigation:''' The primary way to combat packet loss in VoIP is ensuring network QoS and capacity to avoid congestion losses. Over-provision links or prioritize RTP so losses are minimal. On the application side, use PLC in decoders – all modern VoIP codecs have built-in PLC strategies. Some systems employ '''forward error correction (FEC)''' (e.g., redundant audio packets as per RFC 2198, or newer Opus FEC) which add a bit of overhead but can reconstruct lost frames up to a certain percentage.
 
[[File:VoIP_Impairments_MOS_3D_Ppl_BurstR.png|thumb|500px|center|3D visualization of MOS as a function of packet loss probability (Ppl) and burst ratio (BurstR), showing how bursty loss degrades quality more severely]]
 
==== Packet Reordering ====
 
'''Description:''' Packet reordering means packets arrive out of their original sent order. In IP networks, reordering can happen when parallel routes exist and packets take different paths, or due to router hardware parallelism, etc. If packets are time-stamped and sequenced (as RTP is), the receiver can detect that a packet with a higher sequence number arrived before a lower one (i.e., an earlier packet was delayed more and came later). Reordering is closely related to jitter – in fact, a severely delayed packet arriving after later packets is one way to define a reorder event.
 
'''Impacts on Quality:''' Minor reordering is usually handled by jitter buffers as well. If a packet comes late but still within the buffer's wait time, it gets inserted in the correct order for playout. The listener never knows it was out of order. If reordering is extreme (packet comes so late that by the time it arrives, its place in sequence has already been played out with a gap or concealment), then effectively that packet is treated as lost (since it missed its playout deadline). So reordering is harmful only insofar as it contributes to effective loss or additional delay.
 
'''Measurement:''' Reordering is measured by counting how many packets were received out of sequence. Tools might track metrics like the percentage of packets out of order or the extent of reordering. RFC 4737 defines some reorder metrics. In practice, VoIPmonitor looks at sequence numbers and can flag gaps that are later filled (implying reorder) or simply count any instance of sequence inversion.
 
'''Mitigation:''' Avoid network scenarios that cause reordering – e.g., sending packets over multiple WAN links without proper sequencing, or using technologies where packet ordering isn't preserved. At the receiver, a sufficiently large jitter buffer will put small out-of-order packets back in order (provided they arrive before the playout deadline).
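Counting sequence inversions has one subtlety: RTP sequence numbers are 16 bits wide and wrap from 65535 back to 0, so a plain comparison would misread a wrap as reordering. The following sketch uses modular arithmetic to handle the wrap; it is a simplified illustration (the function name is an assumption, and it is not the RFC 4737 metric set).

```python
def count_reordered(seq_numbers):
    """Count packets that arrive after a packet with a later sequence number.

    Sequence numbers are treated as 16-bit values; a forward distance of less
    than half the number space counts as an in-order advance, anything else
    as a late (reordered) packet.
    """
    reordered = 0
    highest = None
    for seq in seq_numbers:
        if highest is None:
            highest = seq
            continue
        distance = (seq - highest) & 0xFFFF
        if distance == 0:
            continue  # duplicate of the newest packet
        if distance < 0x8000:
            highest = seq  # normal forward progress, including wrap-around
        else:
            reordered += 1  # older than the newest packet seen so far
    return reordered


print(count_reordered([1, 2, 4, 3, 5]))        # packet 3 arrives after 4
print(count_reordered([65534, 65535, 0, 1]))   # clean wrap-around
print(count_reordered([65535, 1, 0, 2]))       # packet 0 arrives after 1
```

Whether a reordered packet is still usable depends on the jitter buffer depth, so this count is an upper bound on reorder-induced loss.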
 
=== Summary of Degradation Factors ===
 
Having covered the gamut of degradation factors, we see that VoIP call quality is influenced by a mixture of analog, digital, and network phenomena. In summary:
* '''Classic factors''' (loudness, noise, frequency response, sidetone, echo, etc.) determine the baseline quality and user comfort
* '''Codec choice''' sets an upper limit on fidelity
* '''Network factors''' (delay, jitter, loss) often become the make-or-break issues in VoIP, causing even a great codec to sound bad if the network is poor
 
== Voice Quality Measurement Methods ==
 
How do we quantify voice quality in a meaningful way? We have described many impairments qualitatively, but network engineers and researchers need numeric measures to evaluate and compare call quality. Broadly, quality can be assessed by '''subjective methods''' (asking people for their opinion) or '''objective methods''' (using algorithms or formulas to predict opinion). The gold standard is human opinion, but obviously one cannot have people rating every phone call! Thus, several objective models have been developed to estimate call quality.
 
The most well-known metric is the '''Mean Opinion Score (MOS)'''. MOS originated from subjective tests but is now often produced by objective estimators.
 
=== MOS and Subjective Testing ===
 
'''Mean Opinion Score (MOS)''' is a numerical measure of perceived quality, typically on a scale from 1 to 5 for voice. The MOS concept comes from having a panel of human listeners rate the quality of speech samples in formal tests. The listeners give ratings according to a scoring system:


{| class="wikitable"
|+ MOS Rating Scale
! Score !! Quality !! Description
|-
| 5 || Excellent || Imperceptible impairment
|-
| 4 || Good || Perceptible but not annoying impairment
|-
| 3 || Fair || Slightly annoying impairment
|-
| 2 || Poor || Annoying impairment
|-
| 1 || Bad || Very annoying or nearly unintelligible
|}

{| class="wikitable"
|-
! Parameter !! Description
|-
| <code>fmosf1</code>, <code>fmosf2</code>, <code>fmosadapt</code> || MOS score filters
|-
| <code>frtcp_maxjitter</code>, <code>frtcp_avgjitter</code> || RTCP jitter
|-
| <code>frtcp_maxfr</code>, <code>frtcp_avgfr</code> || RTCP frame rate
|-
| <code>floss1</code> - <code>floss10</code> || Packet loss distribution
|-
| <code>f_d50</code>, <code>f_d70</code>, <code>f_d90</code>, ... || Delay distribution (50ms, 70ms, etc.)
|}


The mean of all listeners' scores is the MOS for that sample/call. Subjective MOS testing is defined by '''ITU-T P.800''', which lays out rigorous methods to conduct listening tests (quiet room conditions, how to select listeners, material, etc.).
== Utility Endpoints ==
 
==== Listening-Only MOS (MOS-LQ) ====
 
Listening-only tests present participants with speech samples (one side of a conversation, usually just a recording of someone speaking a standardized passage) that have been degraded by the system under test. Listeners then rate the listening quality of what they heard using the MOS scale. This yields '''MOS-LQ''' (Mean Opinion Score for Listening Quality). ITU-T P.800 describes this methodology and it's very common for evaluating codecs or one-way audio processing effects.
 
MOS-LQ mainly evaluates fidelity and distortion aspects (noise, codec clarity, etc.) because the listener is not engaged in a conversation, just hearing the result.
 
==== Conversational MOS (MOS-CQ) ====
 
Conversational tests involve two people actually talking with each other (often with a prescribed conversational task) over the connection being tested. Afterward, they rate the overall quality of the conversation. This type of test captures the interactivity factors: delay, echo, how two-way impairments affect flow, etc. The output is '''MOS-CQ''' (Conversational Quality).
 
For example, if there is high latency or if echo is present, MOS-CQ will drop, even if the one-way audio fidelity was fine, because the conversation was difficult. ITU-T has recommendations (P.805) that discuss conversational test methods.
 
==== Talker MOS and Other Subjective Measures ====
 
Another category sometimes referenced is '''talker opinion''' – essentially, how the person speaking perceives the quality (from their perspective). Typically, talker-side issues involve sidetone and echo: e.g., if there's an echo, the talker's experience is bad.
 
Subjective testing is the ground truth, but it's expensive and slow. You need many people, controlled environments, and statistical analysis to get reliable MOS values for a given condition. Therefore, it's not practical for routine monitoring or equipment testing beyond design labs. This led to development of objective models that can predict MOS.
 
[[File:VoIP_Impairments_Assessment_Flow.png|thumb|450px|center|Speech quality assessment hierarchy showing subjective and objective methods, from MOS-LQS through intrusive (double-ended) and non-intrusive (single-ended) measurements]]
 
=== Objective Measurement Techniques ===
 
Objective methods aim to estimate voice quality without asking people directly, using algorithms or formulas. They fall into a few categories:
 
* '''Intrusive (reference-based) metrics:''' Require the original clean reference signal and the degraded signal for comparison. Example: PESQ (P.862) and POLQA (P.863).
* '''Non-intrusive (single-ended) metrics:''' Only need the degraded signal. Example: ITU-T P.563.
* '''Parametric (network/model-based) metrics:''' Rely on known parameters of the call (codec type, loss rate, jitter, etc.) to predict quality. The E-model (ITU-T G.107) is the prime example.
 
==== Intrusive Objective Models (PESQ, POLQA) ====
 
'''PESQ (Perceptual Evaluation of Speech Quality)''', ITU-T P.862, was a breakthrough algorithm standardized in 2001. It takes a reference audio sample (typically a known speech recording) and a degraded sample (the same recording after passing through the system under test), and it outputs a '''MOS-LQO''' (Mean Opinion Score – Listening Quality – Objective). PESQ works by aligning the signals, then comparing them using a perceptual model (basically mimicking human auditory processing to see how the degradation would be perceived). The result is a score usually mapped to the MOS scale.
 
'''POLQA (Perceptual Objective Listening Quality Analysis)''', P.863, is the successor to PESQ, standardized in 2011 and updated subsequently. POLQA extends the approach to wideband and super-wideband audio (covering HD voice) and better handles newer distortions (VoIP clock drifts, wideband artifacts, etc.).
 
[[File:VoIP_Impairments_PESQ_Architecture.png|thumb|500px|center|PESQ perceptual model architecture showing signal flow from original and degraded signals through perceptual models to MOS output]]
 
[[File:VoIP_Impairments_MOS_PESQ_Scatter.png|thumb|450px|center|Correlation between subjective MOS and PESQ scores, showing good correlation (green line) with some scatter depending on loss rate conditions]]
 
These intrusive methods are very powerful, but you need to have the ability to inject and capture reference signals. That means they are used in '''active test systems''': e.g., send a known test waveform through a VoIP call, record it at the far end, and then compute MOS via PESQ/POLQA.
 
==== Non-Intrusive Models (P.563) ====


'''ITU-T P.563''' is a recommendation for single-ended voice quality estimation (often called 3SQM – Single Sided Speech Quality Measure). The idea is to take a recording of a call (just what one person heard) and predict what the MOS would be without needing the reference. P.563 does this by extracting features from the audio: detecting if there's background noise, how much speech is clipped or distorted, etc., and uses an internal model to guess the quality.
=== Direct PCAP Download ===


[[File:VoIP_Impairments_P563_Parameters.png|thumb|500px|center|P.563 speech quality parameters extraction showing pre-processing, characteristic speech parameter calculation, and speech quality model]]
<syntaxhighlight lang="bash">
curl 'http://server/php/pcap.php?id=12345'          # Full PCAP
curl 'http://server/php/pcap.php?id=12345&disable_rtp=1'  # SIP only
</syntaxhighlight>


While conceptually extremely useful (since you can just tap a call and get a MOS estimate), P.563 had mixed success. It works reasonably for some conditions but is less accurate for others, especially when multiple impairments combine.
=== CDR URL for Browser ===


==== Parametric Models (E-model) ====
<syntaxhighlight lang="text">
http://server/admin.php?cdr_filter={fcallid:"abc123"}
</syntaxhighlight>


Parametric models use known network and codec parameters to predict voice quality without analyzing the actual signal. The prime example is the '''E-model (ITU G.107)''' which we will detail in the next section.
=== SIP History ===


VoIPmonitor uses a parametric approach to compute MOS: it essentially implements an E-model-style calculation internally. Instead of analyzing audio, it takes the measured PDV (jitter) and loss from jitter buffer simulation, and knowing the codec (assumed G.711 unless configured otherwise), it calculates MOS directly.
Requires session authentication first.


Note: Unlike the standard E-model, VoIPmonitor does '''not''' calculate the R-Factor metric. R-Factor is considered redundant to the VoIPmonitor MOS score because there is a direct mathematical correlation between the two - monitoring MOS provides equivalent information. See the [[Glossary#R-Factor|R-Factor definition]] for details.
<syntaxhighlight lang="bash">
# JSON data
curl --cookie "PHPSESSID=..." 'http://server/php/pcap2text.php?action=brief_data&id=12345'


[[File:VoIP_Impairments_Quality_Methods.png|thumb|550px|center|Voice quality measurement methodology overview showing subjective testing, objective measures (P.834, P.833), and E-model (G.107) approaches]]
# HTML table
curl --cookie "PHPSESSID=..." 'http://server/php/pcap2text.php?action=brief&id=12345'


=== The E-Model (ITU-T G.107) ===
# MSC diagram
curl --cookie "PHPSESSID=..." 'http://server/php/pcap2text.php?action=getMSC&id=12345'
</syntaxhighlight>


The E-model is a computational model that predicts voice quality (in terms of a user opinion rating) from a set of impairment factors. It was originally designed for transmission planning: to help network planners estimate if end-to-end quality would be acceptable when combining various impairments (like codec choice, echo, delay, etc.). Over time it has also been applied in on-line quality monitoring.
=== License Check ===


The E-model produces an '''R-value''' (sometimes called R-score) that ranges from 0 (extremely poor) to 100 (essentially perfect quality). The formula for the R-value as given in G.107:
<syntaxhighlight lang="bash">
# Basic license check
curl 'http://server/php/apilicensecheck.php?task=licenseCheck'


:<math>R = R_0 - I_s - I_d - I_e + A</math>
# Concurrent calls limit check
curl 'http://server/php/apilicensecheck.php?task=licenseCallsLimitCheck'
</syntaxhighlight>


Where:
== Share CDR Configuration ==
* <math>R_0</math> is the basic signal-to-noise rating (this includes the effects of noise floor, sidetone, and other transducer effects in an ideal network). <math>R_0</math> is about '''94''' for a typical modern connection with good devices and no noise
* <math>I_s</math> is the impairment factor for '''simultaneous impairments''' (like too loud sidetone, or background noise effects)
* <math>I_d</math> is the impairment factor for '''delay''' (includes pure delay effects and talker echo if not fully cancelled)
* <math>I_e</math> (or <math>I_{e,eff}</math>) is the '''equipment impairment factor''', mainly codec-related (and any packet loss effect combined)
* <math>A</math> is the '''advantage factor''', which gives a scoring advantage if the user is in a constrained use case where they expect lower quality. Typically A=0 for normal network planning
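
As a rough worked illustration of how the terms combine, assume <math>R_0 = 94</math>, no simultaneous or delay impairments (<math>I_s = I_d = 0</math>), a G.729A codec with no packet loss (<math>I_e = 11</math>), and <math>A = 0</math>. These inputs are illustrative assumptions, not measured values:

:<math>R = 94 - 0 - 0 - 11 + 0 = 83</math>

Under these assumptions the codec alone costs 11 points of R before any network impairment is counted.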


==== Key E-Model Parameters ====
=== Share Link Types ===


'''Codec impairment (<math>I_e</math>):''' Each codec has a base value. Higher means worse quality. E.g., <math>I_e</math>=0 for G.711, 11 for G.729A, 19 for G.723.1(5.3k), etc.
Use <code>/php/model/utilities.php</code> with <code>task=shareCdr</code>:
 
'''Packet loss robustness (<math>B_{pl}</math>):''' A codec parameter indicating how well quality holds up with increasing packet loss. A higher <math>B_{pl}</math> means the codec can handle loss better.
 
'''Packet loss rate (<math>P_{pl}</math>) and burstiness (<math>BurstR</math>):''' These feed into the formula for <math>I_{e,eff}</math>. For random loss (<math>BurstR = 1</math>) the standard formula reduces to:
 
:<math>I_{e,eff} = I_e + (95 - I_e) \cdot \frac{P_{pl}}{P_{pl} + B_{pl}}</math>
 
As <math>P_{pl}</math> grows, <math>I_{e,eff}</math> rises from <math>I_e</math> toward its ceiling of 95; the smaller <math>B_{pl}</math> is, the faster it climbs.
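
A worked example may help. Assume G.729A with <math>I_e = 11</math> and <math>B_{pl} = 19</math> (a robustness value commonly tabulated for this codec; treat it as an assumption here) and 2% random packet loss:

:<math>I_{e,eff} = 11 + (95 - 11) \cdot \frac{2}{2 + 19} = 11 + 84 \cdot \frac{2}{21} = 19</math>

With these inputs, 2% loss raises the impairment from 11 to 19.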
 
'''One-way delay (T) and echo parameters:''' E-model calculates <math>I_d</math> as sum of <math>I_{d,te}</math> (talker echo impairment) and <math>I_{d,d}</math> (pure delay impairment). For T <= 100ms, <math>I_{d,d}</math> = 0 (no impairment). As delay goes beyond 100ms, <math>I_{d,d}</math> increases gradually.
 
==== From R-Factor to MOS ====
 
After computing R, the E-model provides a mapping to MOS. For narrowband telephony, the mapping is:
 
:<math>MOS = \begin{cases} 1 & R < 0 \\ 1 + 0.035R + R(R-60)(100-R) \times 7 \times 10^{-6} & 0 \leq R \leq 100 \\ 4.5 & R > 100 \end{cases}</math>
 
This nonlinear polynomial mapping was determined by fitting typical subjective data. Essentially:
* R of 100 maps to MOS ~4.5 (the theoretical max for narrowband)
* R of 0 maps to MOS 1
* R of 94 (which is about max realistic) maps to MOS ~4.4
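
To make the mapping concrete, here is the polynomial evaluated by hand for <math>R = 80</math>:

:<math>MOS = 1 + 0.035 \cdot 80 + 80 \cdot (80 - 60) \cdot (100 - 80) \cdot 7 \times 10^{-6} = 1 + 2.8 + 0.224 = 4.024</math>

This rounds to a MOS of about 4.0.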
 
[[File:VoIP_Impairments_MOS_R_Factor.png|thumb|400px|center|MOS to R-factor conversion curve showing the nonlinear relationship]]


{| class="wikitable"
{| class="wikitable"
|+ R-Factor to MOS Interpretation
! R-Factor !! MOS !! Quality
|-
| 90+ || > 4.3 || Excellent
|-
|-
| 80 || ~4.0 || Good
! Type !! subType !! Description
|-
|-
| 70 || ~3.6 || Lower end of acceptable
| Local Public || <code>self_protected_link</code> || Password-protected local link
|-
|-
| 60 || ~3.1 || Fair
| Local Private || <code>self_login_link</code> || Requires GUI login
|-
|-
| 50 || ~2.6 || Poor
| voipmonitor.org || <code>share_link</code> || Public via share.voipmonitor.org
|-
| < 50 || < 2.6 || Very bad
|}
|}


The E-model is extremely useful because it treats impairments as additive. You can plug in different factors and see their combined effect. Network planners use it to budget how much loss and delay can be tolerated for a given codec.
=== Custom Branding ===


== Monitoring Voice Quality with VoIPmonitor ==
'''Product Name:''' Edit <code>config/system_configuration.php</code>:
<syntaxhighlight lang="php">
define('BRAND_NAME', 'YourCompany');
</syntaxhighlight>


'''VoIPmonitor''' is a specialized tool designed to passively monitor VoIP calls on a network and provide detailed metrics and recordings for analysis. Unlike a general packet analyzer (e.g., Wireshark) which can show RTP streams and basic stats for a single call, VoIPmonitor is built to handle high volumes of calls, systematically compute quality metrics, and store long-term data for all calls.
'''Share Domain:''' Edit <code>brand.php</code> to use your domain instead of share.voipmonitor.org:
<syntaxhighlight lang="php">
define('BRAND_SHARESITE', 'share.yourdomain.com');
define('BRAND_DOMAIN', 'yourdomain.com');
</syntaxhighlight>


=== Passive Call Capture and RTP Analysis ===
== Custom Login (LDAP/SSO) ==


VoIPmonitor works by sniffing network traffic (for example, via a SPAN port or network tap) to capture SIP signaling and RTP media packets. It reconstructs call sessions from SIP, then correlates the RTP streams for each call. This is all done non-intrusively; VoIPmonitor is observing copies of packets, not in the call path.
Custom authentication via <code>scripts/custom_login.php</code>. This applies to GUI login only, NOT to API authentication.


For each call, it can:
=== Basic Structure ===
* Save the RTP audio to disk (and even decode to audio files if needed)
* Analyze the RTP streams in real-time to gather quality metrics
* Measure exactly how many packets were lost (did not arrive in sequence)
* Calculate the arrival timing of each packet (to calculate jitter)
* Store call records (CDRs) with these metrics in a database


=== Jitter Analysis and PDV ===
<syntaxhighlight lang="php">
<?php
function custom_login($user, $password) {
    // Authenticate against external system (LDAP, etc.)
    // ...


VoIPmonitor excels at jitter analysis. Instead of giving one opaque "jitter number," it records jitter as a '''distribution of delay variation events'''. Specifically, VoIPmonitor logs how many packets fell into delay bins: e.g., 0–50 ms delay, 50–70 ms, 70–90 ms, etc., relative to nominal timing. It defines thresholds (50ms, 70ms, 90ms, 120ms, 150ms, 200ms, >300ms for example bins) and counts how many packets experienced those delays.
    return array(
        'username' => $user,
        'id' => $uniqueNumericId,  // REQUIRED: Must be unique per user!
        'is_admin' => false,
        'id_group' => 1,           // Optional: GUI group ID
        'enable_sensors' => array(1,2,3)  // Optional: restrict to sensors
    );
}
?>
</syntaxhighlight>


This granular data allows the system or the user to query calls not just by "average jitter" but by severity of jitter incidents. For example, the GUI allows filtering calls by PDV events: "find all calls with at least 10 packets delayed > 120ms".
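
The delay-bin counting described above can be sketched in a few lines of shell and awk. This is a hypothetical illustration, not VoIPmonitor's implementation: the function name <code>pdv_histogram</code> and the input format (one per-packet delay in milliseconds per line) are assumptions, and the bin edges are the example thresholds from the text.

```shell
#!/bin/sh
# Hypothetical sketch: count packets per delay bin.
# Input: one per-packet delay in milliseconds per line (assumed format).
pdv_histogram() {
  awk '
    BEGIN { n = split("50 70 90 120 150 200 300", edge, " ") }
    {
      d = $1 + 0
      placed = 0
      # Put the packet into the first bin whose upper edge it does not exceed
      for (i = 1; i <= n; i++) {
        if (d <= edge[i]) { count[i]++; placed = 1; break }
      }
      # Anything beyond the last edge goes into the overflow bin
      if (!placed) count[n + 1]++
    }
    END {
      lo = 0
      for (i = 1; i <= n; i++) {
        printf "%d-%dms: %d\n", lo, edge[i], count[i] + 0
        lo = edge[i]
      }
      printf ">%dms: %d\n", edge[n], count[n + 1] + 0
    }
  '
}

# Usage: six sample delays
printf '%s\n' 10 20 60 130 250 400 | pdv_histogram
```

A query such as "at least 10 packets delayed > 120ms" then amounts to summing the counts of the bins above 120 ms and comparing the total with 10.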
{{Warning|1='''The <code>id</code> field MUST be unique per user.''' If multiple users return the same ID, they share ALL settings (timezone, dashboard, etc.). Use LDAP <code>uidnumber</code> or <code>crc32($email)</code> for uniqueness.}}


=== Jitter Buffer Simulation ===
=== LDAP Example ===


VoIPmonitor's unique '''triple MOS output''' is directly tied to jitter buffer simulation:
See <code>scripts/ldap_custom_login_example.php</code> in GUI directory.


* '''MOS F1 (Fixed 50 ms):''' Assumes the receiver had a fixed jitter buffer of 50 ms. Any packet arriving more than 50 ms late is considered lost
'''Requirements:''' <code>php-ldap</code> package installed.
* '''MOS F2 (Fixed 200 ms):''' Same idea, but with a much larger buffer. Fewer packets would miss a 200 ms deadline, so it results in fewer losses but higher latency impairment
* '''MOS Adapt (Adaptive up to 500 ms):''' Simulates an adaptive jitter buffer that can stretch up to 500 ms if needed
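
The fixed-buffer idea can be sketched as follows. This is a simplified illustration rather than VoIPmonitor's actual algorithm: it assumes a list of per-packet delays in milliseconds and counts every packet that arrives later than the buffer depth as lost. The function name <code>effective_loss</code> is hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch, not VoIPmonitor's actual algorithm:
# treat every packet whose delay exceeds the buffer depth as lost
# and report the resulting effective loss percentage.
# $1 = buffer depth in ms; stdin = one per-packet delay (ms) per line.
effective_loss() {
  awk -v depth="$1" '
    { total++; if ($1 + 0 > depth) late++ }
    END {
      if (total == 0) { print "0.0"; exit }
      printf "%.1f\n", 100 * late / total
    }
  '
}

# Ten sample delays: two exceed 50 ms, none exceed 200 ms
delays="5 10 15 20 20 25 30 40 80 150"
printf 'fixed 50 ms:  %s%% late\n' "$(printf '%s\n' $delays | effective_loss 50)"
printf 'fixed 200 ms: %s%% late\n' "$(printf '%s\n' $delays | effective_loss 200)"
```

The same stream looks lossy to a 50 ms buffer and clean to a 200 ms buffer, which is the kind of gap the F1 and F2 scores are meant to expose.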


By comparing these MOS values, an operator can infer:
'''Debug from CLI:'''
* If MOS F1 << MOS F2: the network jitter often exceeded 50 ms but stayed under 200 ms
<syntaxhighlight lang="bash">
* If all three MOS are similar and high: jitter was low anyway
cd /var/www/html/scripts/
php custom_login.php
</syntaxhighlight>


VoIPmonitor allows configuring whether to assume G.711 for MOS (max 4.5) or adjust for actual codec's max MOS.
Enable debug by uncommenting the debug block at top of script.


==== Important: High PDV Causes Low MOS Even With Zero Packet Loss ====
=== Troubleshooting ===


It is a common misconception that calls must have packet loss or high jitter statistics to show poor MOS scores. The MOS calculation directly penalizes '''high Packet Delay Variation (PDV)''' patterns, and this is '''expected behavior'''.
'''Users share settings:''' Return unique <code>id</code> per user (see warning above).


Consider this real-world scenario:
'''LDAP users can't view CDRs:''' Ensure <code>is_admin => false</code> and correct <code>id_group</code> is returned.
* A call shows '''0% packet loss''' and '''0ms jitter''' in standard statistics
* However, packets arrive with high variation: some packets arrive at the expected 20ms interval, but many arrive at 200ms or more
* The MOS score will still be '''low''' because the jitter buffer simulation considers these delayed packets as "effectively lost" for playout purposes


This happens because:
'''Script being deleted:''' GUI antivirus deletes scripts with <code>shell_exec()</code>, <code>exec()</code>, etc.
1. When RTP packets arrive with large variations in delay (e.g., >200ms instead of the expected ~20ms), they exceed what typical dejitter buffers can tolerate
2. VoIPmonitor's MOS calculation (based on the E-model) applies a penalty proportional to this PDV pattern
3. The three MOS scores (F1, F2, Adapt) use different buffer thresholds to simulate realistic endpoint behavior


'''Why this is correct:'''
'''Solutions:'''
Even if all packets eventually arrive, extremely variable arrival times mean the receiver cannot play them out smoothly. A jitter buffer sized for 50ms cannot hold packets that arrive 200ms late without adding 200ms of latency (which would itself degrade conversational quality). The E-model accounts for this by reducing the MOS score to reflect that the audio stream cannot be reconstructed smoothly.
# Disable antivirus in <code>config/system_configuration.php</code>:
<syntaxhighlight lang="php">
define('DISABLE_ANTIVIRUS', true);
</syntaxhighlight>
# Or move shell commands to external file outside web directory and <code>require_once()</code> it.


'''Example scenario:'''
=== Azure AD / Microsoft SSO ===
* Call shows: Loss = 0%, Jitter = 0ms, but MOS = 2.5
* Investigation reveals: PDV distribution shows many packets in the 150-300ms and >300ms bins
* Explanation: Packets are arriving but with such large delays that they exceed jitter buffer capacity, causing the same degraded quality as actual packet loss


This behavior is intentional and reflects real-world voice quality. The MOS score is telling you: "Even though all bits arrived, the timing was so erratic that the audio cannot be played out smoothly."
For Microsoft Entra ID integration, see [[Microsoft_Sign_in_usage]]. For Google, see [[Google_Sign_in_usage]].
=== Packet Loss Monitoring and Burst Analysis ===


Packet loss metrics in VoIPmonitor include:
=== Return Array Parameters ===
* Total packets expected vs received (to calculate % loss)
* '''Consecutive loss events counters''' – you can see if the 2% loss was mostly single packets or chunks
* Loss rate in intervals: The GUI can show whether losses happened in isolated periods or continuously
* You can filter calls by patterns like "calls with > X consecutive packets lost" or "calls with Y% total loss"


This burst analysis is important because bursty loss has a disproportionate effect on voice quality.
Full list of available return parameters for custom_login:


=== MOS Calculation in VoIPmonitor ===
<div class="mw-collapsible mw-collapsed">
<syntaxhighlight lang="text">
username, name, id, id_group, group_name, is_admin, email
enable_sensors - array of sensor IDs user can access
can_cdr, can_write_cdr, can_play_audio, can_download_audio
can_listen_active_call, can_pcap, can_upload_pcap
can_messages, can_view_content_message, can_graphs
can_activecalls, can_register, can_sip_msg, can_livesniffer
can_capture_rules, can_audit, can_alerts_edit, can_reports_edit
can_dashboard, can_ipacc, can_mtr, can_sensors_operations
can_network, can_edit_codebooks, hide_license_information
ip, number, domain, vlan (user restrictions)
custom_headers_cdr, custom_headers_message
blocked, blocked_reason, req_2fa
</syntaxhighlight>
</div>


VoIPmonitor calculates parametric MOS values:
== See Also ==
* It defaults to assuming G.711 codec for MOS, yielding max MOS 4.5 on that scale
* If you configure the sniffer with actual codec info, it can adjust
* It uses the E-model, meaning the MOS reflects network impairments only
* It provides 3 MOS scores (F1, F2, Adapt), which is relatively unique


The manual notes that this MOS should not be read as an absolute measure; it is meant for filtering and for flagging potentially bad calls.
* [[Call_Detail_Record_-_CDR#Filter_Form_button|CDR Filter Form]] - GUI filter options
* [[Active_calls|Active Calls]] - Active calls monitoring
* [[User_Management|User Management]] - User permissions
* [[Google_Sign_in_usage|Google Sign-In]] - Google OAuth setup
* [[Microsoft_Sign_in_usage|Microsoft Sign-In]] - Azure AD setup


=== Clipping and Silence Detection ===


This is a particularly interesting feature of VoIPmonitor that goes beyond standard packet stats. VoIPmonitor can detect '''audio clipping''' and '''silence periods''' if enabled.
== AI Summary for RAG ==


'''Clipping detection:''' Clipping is typically detected by checking if the audio signal hits the maximum amplitude for an abnormal duration. VoIPmonitor counts such occurrences. In the filter options, you can search for calls with a certain number of clipped frames. This is valuable because packet stats wouldn't show that – to the network, everything was delivered fine, but the audio quality was bad due to source clipping or distortion.
'''Summary:''' VoIPmonitor Web API reference with two main APIs: (1) HTTP API 2 at <code>/php/api.php</code> using user/password authentication for tasks like getVoipCalls, getVoiceRecording, getPCAP, listActiveCalls, getShareURL, handleActiveCall, reportSummary; (2) CDR HTTP API at <code>/php/model/sql.php</code> using session cookies with full GUI filter support. Critical: correct endpoint path is <code>/php/api.php</code> (not <code>/api.php</code>). reportSummary requires pre-configured reports in GUI. Custom login via <code>scripts/custom_login.php</code> enables LDAP/SSO - requires unique numeric ID per user to avoid shared settings. Antivirus may delete scripts with shell_exec - disable with DISABLE_ANTIVIRUS or move to external file. Branding: BRAND_NAME for product name, BRAND_SHARESITE/BRAND_DOMAIN in brand.php for custom share domain.
 
'''Silence period detection:''' VoIPmonitor can also measure the percentage of the call that was silence. You can filter calls by e.g. "calls that had > 50% silence". This could identify one-way audio issues (if one side was silent most of the call, maybe that side's audio didn't get through).
 
=== Proactive Quality Management ===
 
How a telecom professional would use VoIPmonitor day-to-day:
 
* '''Automated Alerts:''' Set thresholds, e.g., alert if MOS falls below 3 for more than X calls in an hour, or if packet loss > 5% on any call
* '''Drill-down troubleshooting:''' When a complaint comes in about a specific call, find that call in VoIPmonitor and examine its detailed stats – jitter graph, loss stats, MOS values, silence percentage, clipping count
* '''Long-term reports:''' See trends like average MOS per hour or per trunk
* '''Compare jitter buffer effects:''' If many calls show MOS F1 << MOS F2, perhaps endpoints are using too small a jitter buffer
 
Finally, an advantage of VoIPmonitor is that it allows '''listening to call audio''' (it can save and even decode to WAV). Numbers and graphs are great, but listening is the ultimate troubleshooting tool: you can literally hear what the user heard.
 
== Conclusion ==
 
Delivering high voice quality over VoIP requires understanding and managing many technical factors. Traditional impairments like loudness, noise, and echo still matter in VoIP calls, while network-induced issues like jitter, packet loss, and delay play a dominant role in IP telephony. We have explored each of these factors in detail and seen how they map to measurable parameters and thresholds (often informed by ITU-T standards such as P.800, G.114, G.107, etc.).
 
To quantify call quality, metrics like MOS provide a bridge between engineering parameters and user perception. Subjective MOS gathered from real listeners is the ultimate benchmark, but objective models – from the signal-based PESQ/POLQA to the parameter-based E-model – enable ongoing assessment of calls without human intervention.
 
We then focused on VoIPmonitor, illustrating how a modern VoIP monitoring tool applies these concepts. VoIPmonitor passively captures calls and computes a wealth of quality data: jitter distributions, loss patterns, and even audio-based analyses like clipping detection. Its implementation of the E-model yields instant MOS estimates that factor in jitter buffer effects, helping operators pinpoint whether issues stem from network timing, loss, or other sources.
 
Tools like VoIPmonitor are invaluable in this landscape. They implement the theoretical knowledge (from ITU-T and other research) into practical systems that watch over the network 24/7. By using VoIPmonitor on a VoIP network, one can rapidly correlate objective data with subjective quality – essentially bridging the gap between packet-level events and the human opinion of call clarity.
 
Armed with the comprehensive coverage of concepts in this guide, technical professionals can better design, troubleshoot, and optimize their VoIP systems. From understanding why a certain call sounded bad (maybe too much jitter causing late packets) to justifying upgrades and QoS policies (by showing improved MOS scores after changes), the knowledge here and the capabilities of monitoring tools will together ensure that voice quality remains high. After all, regardless of technology, clear and reliable communication is the ultimate goal.
 
== External Links ==
 
* [https://www.itu.int/rec/T-REC-G.107 ITU-T G.107 - The E-model]
* [https://www.itu.int/rec/T-REC-G.114 ITU-T G.114 - One-way transmission time]
* [https://www.itu.int/rec/T-REC-P.800 ITU-T P.800 - Methods for subjective determination of transmission quality]
* [https://www.itu.int/rec/T-REC-P.862 ITU-T P.862 - PESQ algorithm]
* [https://www.itu.int/rec/T-REC-P.863 ITU-T P.863 - POLQA algorithm]
* [[Documentation|VoIPmonitor Documentation]]
 
== See Also ==


* [[Codec|Voice Codecs]]
'''Keywords:''' web api, http api, api.php, sql.php, getVoipCalls, getVoiceRecording, getPCAP, getShareURL, listActiveCalls, reportSummary, handleActiveCall, CDR filter, custom login, LDAP, authentication, session, curl, JSON, branding, BRAND_NAME, BRAND_SHARESITE, antivirus, DISABLE_ANTIVIRUS, unique user id, uidnumber


[[Category:Voice Quality]]
'''Key Questions:'''
[[Category:VoIP]]
* What is the correct VoIPmonitor API endpoint path?
[[Category:Technical Reference]]
* How do I retrieve CDRs via the VoIPmonitor API?
[[Category:Monitoring]]
* How to download PCAP or voice recordings via API?
* How to list active calls via API?
* How to generate reports via API?
* How to configure LDAP authentication for VoIPmonitor?
* Why do LDAP users share settings? (need unique ID)
* Why is custom_login.php being deleted? (antivirus)
* How to customize VoIPmonitor branding?
* What's the difference between api.php and sql.php?
* How to use wildcards in API filter parameters?

Latest revision as of 12:16, 19 January 2026

VoIPmonitor Web API

This page documents VoIPmonitor's Web APIs for programmatic access to CDR data, recordings, active calls, and system functions.

Overview

{| class="wikitable"
|-
! API !! Endpoint !! Authentication !! Use Case
|-
| HTTP API 2 || /php/api.php || user/password params || CDR queries, recordings, PCAP, active calls
|-
| CDR HTTP API || /php/model/sql.php || Session cookie || Advanced CDR filtering with GUI filters
|}

⚠️ Warning: Correct path is critical: Always use /php/api.php (not /api.php). Missing the /php/ directory causes "action parameter missing" errors.

HTTP API 2

Preferred API for programmatic access. Requests via HTTP POST or GET to /php/api.php.

Authentication

Include user and password parameters in every request:

curl 'http://server/php/api.php?task=getVoipCalls&user=USER&password=PASSWORD&params=...'

ℹ️ Note: API authentication uses local VoIPmonitor database credentials. It does NOT use LDAP or custom_login.php.

Rate Limiting: Enable in GUI > Settings > System Configuration > "API maximal concurrent connections".

getVoipCalls

Retrieve CDR records by search criteria.

Parameters:

{| class="wikitable"
|-
! Parameter !! Description
|-
| startTime || Calls starting >= this time (required)
|-
| startTimeTo || Calls starting <= this time
|-
| callEnd || Calls ending <= this time
|-
| caller / called || Caller/called number
|-
| callId || SIP Call-ID
|-
| cdrId || Database ID
|-
| id_sensor || Sensor number
|-
| msisdn || Match caller OR called (instead of AND)
|-
| onlyConnected || 1 = only answered calls
|-
| customHeaders || Comma-separated custom header names to return
|}

Examples:

# HTTP GET - by time range and caller
curl 'http://server/php/api.php?task=getVoipCalls&user=USER&password=PASSWORD&params={"startTime":"2024-01-01","startTimeTo":"2024-01-31","caller":"123456"}'

# HTTP POST - by Call-ID
echo '{"task":"getVoipCalls","user":"USER","password":"PASSWORD","params":{"startTime":"2024-01-01","callId":"abc123"}}' | curl -X POST -d @- http://server/php/api.php

# With custom headers
curl -G 'http://server/php/api.php?task=getVoipCalls&user=USER&password=PASSWORD' \
  --data-urlencode 'params={"startTime":"2024-01-01","caller":"123","customHeaders":"X-Custom-Header"}'

Response fields:

{| class="wikitable"
|-
! Field !! Description
|-
| cdrId || Database CDR ID
|-
| id_sensor || Sensor ID
|-
| calldate / callend || Call start/end time
|-
| duration / connect_duration || Total / connected duration
|-
| caller / called || Caller/called numbers
|-
| sipcallerip / sipcalledip || SIP endpoint IPs
|-
| codec_a / codec_b || Audio codecs
|-
| lastSIPresponseNum || (New in 2026.1) Final SIP response code (e.g., 200, 404, 503)
|-
| callId || SIP Call-ID
|}

getVoiceRecording

Download audio recording (WAV format).

Parameters:

{| class="wikitable"
|-
! Parameter !! Description
|-
| cdrId || Database CDR ID (preferred)
|-
| callId || SIP Call-ID
|-
| calldate || Date hint (default: today)
|-
| zip || true = return ZIP archive
|-
| ogg || true = return OGG format
|-
| saveaudio_afterconnect || "yes" = audio only after call connect
|}

Examples:

# By CDR ID
curl 'http://server/php/api.php?task=getVoiceRecording&user=USER&password=PASSWORD&params={"cdrId":12345}' > call.wav

# By Call-ID
curl 'http://server/php/api.php?task=getVoiceRecording&user=USER&password=PASSWORD&params={"callId":"abc123"}' > call.wav

# Multiple recordings (returns ZIP)
echo '{"task":"getVoiceRecording","user":"USER","password":"PASSWORD","params":{"cdrId":[6,7]}}' | php api.php > calls.zip

getPCAP

Download PCAP file. Automatically merges multiple legs and returns ZIP if needed.

Parameters:

{| class="wikitable"
|-
! Parameter !! Description
|-
| cdrId || Database CDR ID
|-
| callId || SIP Call-ID
|-
| cidInterval || Time window (seconds) to search for matching Call-ID
|-
| cidMerge || true = merge multiple legs
|-
| zip || true = force ZIP output
|}

Examples:

# By CDR ID
curl 'http://server/php/api.php?task=getPCAP&user=USER&password=PASSWORD&params={"cdrId":"12345"}' > call.pcap

# By Call-ID with merge
curl -G 'http://server/php/api.php?task=getPCAP&user=USER&password=PASSWORD' \
  --data-urlencode 'params={"callId":"abc123","cidInterval":60,"cidMerge":true}'

💡 Tip: Use curl -G --data-urlencode for complex JSON params to avoid encoding issues.

listActiveCalls

Get currently active calls from sensor.

Parameters: sensorId (optional), callId (optional)

curl 'http://server/php/api.php?task=listActiveCalls&user=USER&password=PASSWORD&params={"sensorId":"1"}'

ℹ️ Note: IP addresses in response are integers. Convert with INET_NTOA() in MySQL or equivalent in your code.
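
If the conversion has to happen in a shell script rather than in MySQL, a sketch like the one below does the same job with integer arithmetic. The function name int_to_ip is illustrative, and the input is assumed to be an unsigned 32-bit IPv4 value.

```shell
#!/bin/sh
# Convert an unsigned 32-bit integer to dotted-quad IPv4 notation,
# mirroring what MySQL INET_NTOA() does. The function name is illustrative.
int_to_ip() {
  n=$1
  printf '%d.%d.%d.%d\n' \
    $(( (n >> 24) & 255 )) \
    $(( (n >> 16) & 255 )) \
    $(( (n >> 8) & 255 )) \
    $(( n & 255 ))
}

int_to_ip 3232235777   # prints 192.168.1.1
```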

handleActiveCall

Pause/unpause RTP recording on active call.

Parameters:

  • sensorId - Sensor number
  • command - pausecall or unpausecall
  • callRef - Call reference from listActiveCalls

curl 'http://server/php/api.php?task=handleActiveCall&user=USER&password=PASSWORD&params={"sensorId":"1","command":"pausecall","callRef":"0x7f0e4c3c2680"}'

getShareURL

Generate public shareable link for a CDR.

Parameters:

{| class="wikitable"
|-
! Parameter !! Description
|-
| callId or cdrId || Call identifier
|-
| cidInterval || Time window for Call-ID search
|-
| rtp || true = include RTP data
|-
| sip_history || true = SIP history only (no RTP)
|-
| anonIps || true = anonymize IPs
|-
| validDays || Link validity in days
|}

curl 'http://server/php/api.php?task=getShareURL&user=USER&password=PASSWORD&params={"callId":"abc123","rtp":true,"validDays":15}'

reportSummary

Generate pre-configured summary report as image or JSON.

⚠️ Warning: Reports must be created in GUI → Reports → Daily Report BEFORE using the API. The report_name parameter must match the report's Description field.

Parameters:

  • report_name - Description field from GUI report
  • datetime_from / datetime_to - Date range
  • json - true = return JSON instead of image

curl 'http://server/php/api.php?task=reportSummary&user=USER&password=PASSWORD&params={"report_name":"Daily Summary","datetime_from":"2024-01-01","datetime_to":"2024-01-31"}'

Other Tasks

{| class="wikitable"
|-
! Task !! Description
|-
| listCdrIds || List CDR IDs with basic info. Params: offset, size
|-
| getAudioGraph || Get spectrogram/waveform image. Params: cdrId, type (S/P/ALL), side (L/R)
|}

CDR HTTP API

Session-based API using GUI filter parameters. Requires login first.

Authentication

# Step 1: Login and get session
curl -X POST 'http://server/php/model/sql.php?module=bypass_login&user=USER&pass=PASSWORD'
# Returns: {"SID":"abc123...","cookie_name":"PHPSESSID","success":true}

# Step 2: Use session cookie for requests
curl -X POST --cookie "PHPSESSID=abc123..." \
  'http://server/php/model/sql.php?task=LISTING&module=CDR&fdatefrom=2024-01-01T00:00:00&fcaller=123'

ℹ️ Note: If GUI is in subdirectory (e.g., /demo), cookie name changes to PHPSESSID-demo.
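
The two steps can be chained in a script by reading the SID and the cookie name out of the login reply. The sketch below assumes the flat JSON reply shape shown above and uses sed only to stay dependency-free; the helper name json_string_field is illustrative, and a real JSON parser is the safer choice for anything more complex.

```shell
#!/bin/sh
# Sketch: pull a string field out of the flat JSON login reply using sed.
# Assumes the reply shape shown above; the function name is illustrative.
json_string_field() {
  # $1 = field name; stdin = JSON text
  sed -n 's/.*"'"$1"'":"\([^"]*\)".*/\1/p'
}

# Sample reply; in practice this would be the output of the login request
reply='{"SID":"abc123","cookie_name":"PHPSESSID","success":true}'
sid=$(printf '%s\n' "$reply" | json_string_field SID)
name=$(printf '%s\n' "$reply" | json_string_field cookie_name)

# Value to hand to curl --cookie on subsequent requests
printf '%s=%s\n' "$name" "$sid"
```

Reading cookie_name from the reply instead of hard-coding PHPSESSID keeps the script working when the GUI is installed in a subdirectory.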

Disable authorization: GUI > Settings > System Configuration > "Disable authorization for API usage"

Filter Parameters

Mandatory: task=LISTING, module=CDR, and at least one of fdatefrom or fdateto.

{| class="wikitable"
|-
! Parameter !! Description !! Example
|-
| fdatefrom / fdateto || Date range || 2024-01-01T00:00:00
|-
| fcaller / fcalled || Caller/called number || 123456
|-
| fcallerd_type || 0=OR, 1=AND || 1
|-
| fcaller_domain / fcalled_domain || SIP domain || sip.example.com
|-
| fsipcallerip / fsipcalledip || SIP IP address || 192.168.1.1
|-
| fcallid || SIP Call-ID || abc123
|-
| fsipresponse || SIP response code || 503
|-
| fdurationgt / fdurationlt || Duration (seconds) || 10
|-
| fsensor_id || Sensor ID || 1
|-
| fcodec || Codec numbers (comma-sep) || 0,8
|-
| suppress_results || 1=count only || 1
|}

Paging: page, start, limit

Sorting: sort=[{"property":"calldate","direction":"DESC"}]

⚠️ Warning: Use = (equals) for parameters, not : (colon). Example: fcallerd_type=1 (correct), not fcallerd_type:1 (wrong).

Wildcards

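Use % as the wildcard for SQL LIKE patterns. When the value is written directly into a URL, encode it as %25; with curl --data-urlencode, pass a literal % and let curl encode it:

# Caller starting with "00"
curl -G 'http://server/php/api.php?task=getVoipCalls&user=USER&password=PASSWORD' \
  --data-urlencode 'params={"startTime":"2024-01-01","caller":"00%"}'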
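
The % wildcard has to reach the server percent-encoded. The helper below is a minimal sketch of that encoding for ASCII-only input; urlencode is an illustrative function, not part of VoIPmonitor, and it is only needed when the query string is built by hand.

```shell
#!/bin/sh
# Minimal percent-encoder for ASCII input (illustrative helper, not part of VoIPmonitor).
# Unreserved characters pass through; everything else becomes %XX.
urlencode() {
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    s=$rest
    case $c in
      [A-Za-z0-9._~-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
  done
  printf '%s\n' "$out"
}

urlencode '00%'                 # prints 00%25
urlencode '{"caller":"00%"}'
```

The first call shows a trailing LIKE wildcard turning into %25, which is the form the server has to receive in the query string.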

Codec Values

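{| class="wikitable"
|-
! Value !! Codec !! Value !! Codec
|-
| 0 || PCMU (G.711 μ-law) || 8 || PCMA (G.711 A-law)
|-
| 3 || GSM || 9 || G.722
|-
| 4 || G.723 || 12 || QCELP
|-
| 18 || G.729 || 97 || iLBC
|-
| 98 || Speex || 301-305 || SILK variants
|-
| 306-308 || iSAC variants || 1000 || T.38 (Fax)
|}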
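
When post-processing results in a script, the numeric codec values can be mapped back to names with a small lookup. The sketch below covers only the values listed above; the function name codec_name is illustrative.

```shell
#!/bin/sh
# Map a numeric codec value to its name, using only the values
# listed in the table above. The function name is illustrative.
codec_name() {
  case $1 in
    0)       echo "PCMU" ;;
    3)       echo "GSM" ;;
    4)       echo "G.723" ;;
    8)       echo "PCMA" ;;
    9)       echo "G.722" ;;
    12)      echo "QCELP" ;;
    18)      echo "G.729" ;;
    97)      echo "iLBC" ;;
    98)      echo "Speex" ;;
    30[1-5]) echo "SILK variant" ;;
    30[6-8]) echo "iSAC variant" ;;
    1000)    echo "T.38 (Fax)" ;;
    *)       echo "unknown" ;;
  esac
}

# Decode a comma-separated fcodec value such as "0,8,18"
for c in $(printf '%s' "0,8,18" | tr ',' ' '); do
  printf '%s = %s\n' "$c" "$(codec_name "$c")"
done
```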

RTP Quality Filters

{| class="wikitable"
|-
! Parameter !! Description
|-
| fmosf1, fmosf2, fmosadapt || MOS score filters
|-
| frtcp_maxjitter, frtcp_avgjitter || RTCP jitter
|-
| frtcp_maxfr, frtcp_avgfr || RTCP frame rate
|-
| floss1 - floss10 || Packet loss distribution
|-
| f_d50, f_d70, f_d90, ... || Delay distribution (50ms, 70ms, etc.)
|}

Utility Endpoints

Direct PCAP Download

curl 'http://server/php/pcap.php?id=12345'           # Full PCAP
curl 'http://server/php/pcap.php?id=12345&disable_rtp=1'  # SIP only

CDR URL for Browser

http://server/admin.php?cdr_filter={fcallid:"abc123"}

SIP History

Requires session authentication first.

# JSON data
curl --cookie "PHPSESSID=..." 'http://server/php/pcap2text.php?action=brief_data&id=12345'

# HTML table
curl --cookie "PHPSESSID=..." 'http://server/php/pcap2text.php?action=brief&id=12345'

# MSC diagram
curl --cookie "PHPSESSID=..." 'http://server/php/pcap2text.php?action=getMSC&id=12345'

License Check

# Basic license check
curl 'http://server/php/apilicensecheck.php?task=licenseCheck'

# Concurrent calls limit check
curl 'http://server/php/apilicensecheck.php?task=licenseCallsLimitCheck'

Share CDR Configuration

Share Link Types

Use /php/model/utilities.php with task=shareCdr:

{| class="wikitable"
|-
! Type !! subType !! Description
|-
| Local Public || self_protected_link || Password-protected local link
|-
| Local Private || self_login_link || Requires GUI login
|-
| voipmonitor.org || share_link || Public via share.voipmonitor.org
|}

Custom Branding

Product Name: Edit config/system_configuration.php:

define('BRAND_NAME', 'YourCompany');

Share Domain: Edit brand.php to use your domain instead of share.voipmonitor.org:

define('BRAND_SHARESITE', 'share.yourdomain.com');
define('BRAND_DOMAIN', 'yourdomain.com');

Custom Login (LDAP/SSO)

Custom authentication is implemented in scripts/custom_login.php. It applies to GUI login only, NOT to API authentication.

Basic Structure

<?php
function custom_login($user, $password) {
    // Authenticate against external system (LDAP, etc.)
    // ...

    // Placeholder (assumption): derive the ID from the login name.
    // Prefer a stable directory attribute such as the LDAP uidnumber.
    $uniqueNumericId = crc32($user);

    return array(
        'username' => $user,
        'id' => $uniqueNumericId,  // REQUIRED: Must be unique per user!
        'is_admin' => false,
        'id_group' => 1,           // Optional: GUI group ID
        'enable_sensors' => array(1,2,3)  // Optional: restrict to sensors
    );
}
?>

⚠️ Warning: The id field MUST be unique per user. If multiple users return the same ID, they share ALL settings (timezone, dashboard, etc.). Use LDAP uidnumber or crc32($email) for uniqueness.

LDAP Example

See scripts/ldap_custom_login_example.php in the GUI directory.

Requirements: the php-ldap package must be installed.

Debug from CLI:

cd /var/www/html/scripts/
php custom_login.php

Enable debug output by uncommenting the debug block at the top of the script.

Troubleshooting

Users share settings: Return unique id per user (see warning above).

LDAP users can't view CDRs: Ensure is_admin => false and correct id_group is returned.

Script being deleted: The GUI antivirus deletes scripts that contain shell_exec(), exec(), and similar calls.

Solutions:

  1. Disable the antivirus in config/system_configuration.php:
     define('DISABLE_ANTIVIRUS', true);
  2. Or move the shell commands to an external file outside the web directory and require_once() it.

Azure AD / Microsoft SSO

For Microsoft Entra ID integration, see Microsoft_Sign_in_usage. For Google, see Google_Sign_in_usage.

Return Array Parameters

Full list of available return parameters for custom_login:

username, name, id, id_group, group_name, is_admin, email
enable_sensors - array of sensor IDs user can access
can_cdr, can_write_cdr, can_play_audio, can_download_audio
can_listen_active_call, can_pcap, can_upload_pcap
can_messages, can_view_content_message, can_graphs
can_activecalls, can_register, can_sip_msg, can_livesniffer
can_capture_rules, can_audit, can_alerts_edit, can_reports_edit
can_dashboard, can_ipacc, can_mtr, can_sensors_operations
can_network, can_edit_codebooks, hide_license_information
ip, number, domain, vlan (user restrictions)
custom_headers_cdr, custom_headers_message
blocked, blocked_reason, req_2fa

See Also


AI Summary for RAG

Summary: VoIPmonitor Web API reference with two main APIs: (1) HTTP API 2 at /php/api.php using user/password authentication for tasks like getVoipCalls, getVoiceRecording, getPCAP, listActiveCalls, getShareURL, handleActiveCall, reportSummary; (2) CDR HTTP API at /php/model/sql.php using session cookies with full GUI filter support. Critical: correct endpoint path is /php/api.php (not /api.php). reportSummary requires pre-configured reports in GUI. Custom login via scripts/custom_login.php enables LDAP/SSO - requires unique numeric ID per user to avoid shared settings. Antivirus may delete scripts with shell_exec - disable with DISABLE_ANTIVIRUS or move to external file. Branding: BRAND_NAME for product name, BRAND_SHARESITE/BRAND_DOMAIN in brand.php for custom share domain.

Keywords: web api, http api, api.php, sql.php, getVoipCalls, getVoiceRecording, getPCAP, getShareURL, listActiveCalls, reportSummary, handleActiveCall, CDR filter, custom login, LDAP, authentication, session, curl, JSON, branding, BRAND_NAME, BRAND_SHARESITE, antivirus, DISABLE_ANTIVIRUS, unique user id, uidnumber

Key Questions:

  • What is the correct VoIPmonitor API endpoint path?
  • How do I retrieve CDRs via the VoIPmonitor API?
  • How to download PCAP or voice recordings via API?
  • How to list active calls via API?
  • How to generate reports via API?
  • How to configure LDAP authentication for VoIPmonitor?
  • Why do LDAP users share settings? (need unique ID)
  • Why is custom_login.php being deleted? (antivirus)
  • How to customize VoIPmonitor branding?
  • What's the difference between api.php and sql.php?
  • How to use wildcards in API filter parameters?