Whisper
Integrate OpenAI's Whisper ASR with VoIPmonitor for on-demand or automatic call transcription.
Overview
VoIPmonitor supports Whisper, a speech recognition system trained on 680,000 hours of multilingual data. Two integration modes are available:
| Mode | Location | Use Case |
|---|---|---|
| On-Demand | GUI server | User clicks "Transcribe" on individual calls |
| Automatic | Sensor | All calls transcribed automatically after ending |
Whisper Engines
| Engine | Pros | Cons | Recommended For |
|---|---|---|---|
| whisper.cpp (C++) | Fast, low resource usage, CUDA support (30x speedup) | Requires compilation | Server-side processing |
| OpenAI Whisper (Python) | Easy install (pip install) | Slower, requires ffmpeg | Quick testing |
💡 Tip: Use whisper.cpp for production deployments. It's significantly faster and supports GPU acceleration.
Quick Start: GUI On-Demand (No Compilation)
The simplest setup: download a pre-built model and start transcribing immediately.
# Download model to GUI bin directory
wget https://download.voipmonitor.org/whisper/ggml-base.bin -O /var/www/html/bin/ggml-base.bin
# Set ownership (Debian/Ubuntu)
chown www-data:www-data /var/www/html/bin/ggml-base.bin
# For RedHat/CentOS, use: chown apache:apache
The "Transcribe" button now appears on call detail pages. No configuration changes needed.
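If the button does not appear, a truncated or failed download is a common cause. A minimal sanity check is sketched below; the 1 MB threshold is an illustrative assumption (every ggml Whisper model is far larger), not a value mandated by VoIPmonitor.

```shell
# check_model: succeed only if the file exists and is larger than 1 MB.
# A tiny file usually means an HTML error page was saved in place of
# the actual model binary.
check_model() {
  [ -f "$1" ] && [ "$(stat -c%s "$1")" -gt 1048576 ]
}

# Example (path matches the quick-start download above):
check_model /var/www/html/bin/ggml-base.bin && echo "model OK" || echo "re-download"
```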
GUI On-Demand: Advanced Setup
For custom model paths or using the Python engine.
Option 1: whisper.cpp with Custom Model
# Compile whisper.cpp
git clone https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp && make -j
# Download model
./models/download-ggml-model.sh base.en
Configure /var/www/html/config/configuration.php:
define('WHISPER_NATIVE', true);
define('WHISPER_MODEL', '/path/to/whisper.cpp/models/ggml-base.en.bin');
define('WHISPER_THREADS', 4); // Optional
Option 2: OpenAI Whisper (Python)
pip install openai-whisper
apt install ffmpeg # or dnf install ffmpeg
Configure /var/www/html/config/configuration.php:
define('WHISPER_MODEL', '/opt/whisper_models/small.pt');
define('WHISPER_THREADS', 4);
Automatic Transcription (Sniffer)
Transcribe all calls automatically on the sensor after they end.
Basic Configuration
Edit /etc/voipmonitor.conf:
# Enable transcription
audio_transcribe = yes
# Using whisper.cpp (recommended)
whisper_native = yes
whisper_model = /path/to/whisper.cpp/models/ggml-small.bin
# OR using Python (slower)
# whisper_native = no
# whisper_model = small
Restart: systemctl restart voipmonitor
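Before restarting, it can be worth confirming that the required keys are actually present (and not commented out) in the config file. A small helper sketch; the key list reflects only the minimal whisper.cpp setup shown above.

```shell
# check_transcribe_conf: print any required transcription key that is
# missing from the given config file; return non-zero if any is absent.
check_transcribe_conf() {
  local conf="$1" key missing=0
  for key in audio_transcribe whisper_native whisper_model; do
    grep -q "^${key}[[:space:]]*=" "$conf" || { echo "missing: $key"; missing=1; }
  done
  return "$missing"
}

# Usage:
# check_transcribe_conf /etc/voipmonitor.conf && systemctl restart voipmonitor
```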
Configuration Parameters
| Parameter | Default | Description |
|---|---|---|
| audio_transcribe | no | Enable/disable transcription |
| audio_transcribe_connect_duration_min | 10 | Minimum call duration (seconds) to transcribe |
| audio_transcribe_threads | 2 | Concurrent transcription jobs |
| audio_transcribe_queue_length_max | 100 | Max queue size |
| whisper_native | no | Use whisper.cpp (yes) or Python (no) |
| whisper_model | small | Model name (Python) or absolute path to .bin file (whisper.cpp) |
| whisper_language | auto | Language code (en, de), auto, or by_number |
| whisper_threads | 2 | CPU threads per transcription job |
| whisper_timeout | 300 | Timeout in seconds (Python only) |
| whisper_deterministic_mode | yes | Consistent results (Python only) |
| whisper_python | - | Custom Python binary path (Python only) |
| whisper_native_lib | - | Path to libwhisper.so (advanced) |
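Putting the main parameters together, a possible voipmonitor.conf fragment for a whisper.cpp deployment could look as follows; the path and the sizing values are illustrative assumptions, not recommended defaults.

```ini
# Transcribe only calls longer than 30 seconds, four in parallel
audio_transcribe = yes
audio_transcribe_connect_duration_min = 30
audio_transcribe_threads = 4
# whisper.cpp engine with an absolute model path
whisper_native = yes
whisper_model = /opt/whisper.cpp/models/ggml-small.bin
whisper_language = auto
# CPU threads per transcription job
whisper_threads = 4
```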
Advanced: CUDA GPU Acceleration
Compile whisper.cpp with NVIDIA CUDA for up to 30x speedup.
# Install CUDA toolkit (see nvidia.com/cuda-downloads)
# Add to ~/.bashrc:
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
# Compile with CUDA
cd /path/to/whisper.cpp
make clean
WHISPER_CUDA=1 make -j
WHISPER_CUDA=1 make libwhisper.so -j
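One way to confirm that the resulting binaries were really linked against CUDA is to inspect their shared-library dependencies. A small sketch; the library-name pattern is an assumption, since exact names vary across CUDA versions.

```shell
# check_cuda_link: succeed if the given binary or library links against
# a CUDA/cuBLAS runtime library according to ldd.
check_cuda_link() {
  ldd "$1" 2>/dev/null | grep -qiE 'cuda|cublas'
}

# Usage after the build above:
# check_cuda_link ./main && echo "CUDA-enabled" || echo "CPU-only build"
```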
Advanced: Loadable Module
Build whisper.cpp as a separate shared library so it can be updated without recompiling the sniffer:
# Build libraries
cd /path/to/whisper.cpp
make libwhisper.so -j
make libwhisper.a -j
# Optional: Install system-wide
ln -s $(pwd)/whisper.h /usr/local/include/whisper.h
ln -s $(pwd)/libwhisper.so /usr/local/lib64/libwhisper.so
Configure in voipmonitor.conf:
whisper_native_lib = /path/to/whisper.cpp/libwhisper.so
Troubleshooting
Model Download Fails
Test connectivity:
curl -I https://download.voipmonitor.org/whisper/ggml-base.bin
If blocked:
- Check firewall: iptables -L -v -n or ufw status
- Check proxy: set the HTTP_PROXY/HTTPS_PROXY environment variables
- Check DNS: nslookup download.voipmonitor.org
Workaround: Download manually on another machine and copy via SCP.
Testing from CLI
/var/www/html/bin/vm --audio-transcribe='/tmp/audio.wav {}' \
--json_config='[{"whisper_native":"yes"},{"whisper_model":"/path/to/ggml-small.bin"}]' \
-v1,whisper
AI Summary for RAG
Summary: VoIPmonitor integrates Whisper ASR for call transcription via two modes: on-demand (GUI button) and automatic (sniffer background processing). Two engines available: whisper.cpp (C++, recommended, fast, CUDA support) and OpenAI Whisper (Python, easier install). Quick start: download pre-built model from https://download.voipmonitor.org/whisper/ggml-base.bin to /var/www/html/bin/, set ownership to www-data. Sniffer config: enable audio_transcribe=yes and whisper_native=yes with absolute path to model in whisper_model. Key parameters: audio_transcribe_connect_duration_min (min call length), whisper_threads (CPU threads), whisper_language (auto/code/by_number). CUDA acceleration available for whisper.cpp (30x speedup).
Keywords: whisper, transcription, asr, speech to text, openai, whisper.cpp, audio_transcribe, whisper_native, whisper_model, cuda, gpu, ggml-base.bin, libwhisper.so, automatic transcription, on-demand
Key Questions:
- How do I enable call transcription in VoIPmonitor?
- What is the quickest way to enable Whisper transcription?
- How do I download the Whisper model for the GUI?
- What is the difference between whisper.cpp and OpenAI Whisper?
- How do I configure automatic transcription on the sniffer?
- What parameters control Whisper transcription behavior?
- How do I enable GPU acceleration for Whisper?
- Why is the model download failing and how do I fix it?
- How do I test Whisper transcription from the command line?