Understanding the RTP Protocol

From VoIPmonitor.org

Revision as of 00:13, 12 December 2025

The Real-time Transport Protocol (RTP) is an Internet-standard transport protocol for real-time audio, video, and other time-sensitive data, defined in RFC 3550. This guide covers packet structure, RTCP, profiles, signaling integration, and security.

Introduction

RTP provides end-to-end delivery services for media streams, including:

  • Sequence numbering - detect packet loss and reorder packets
  • Timestamping - enable jitter calculation and synchronization
  • Payload type identification - identify media encoding
  • Monitoring - via companion RTCP protocol

RTP is typically used on top of UDP for its low overhead and latency. Unlike TCP, RTP/UDP does not guarantee delivery or ordering, nor does it provide congestion control or QoS guarantees. Instead, it tolerates some packet loss and reordering as a trade-off for timely delivery - a lost packet is preferable to a delayed one in interactive communications.

RTP Key Characteristics
Feature | Description
Transport | UDP (typically); other transports are possible
Reliability | No delivery guarantee; tolerates loss for timeliness
Companion Protocol | RTCP for control, feedback, and monitoring
Profiles | Extensible via profiles (AVP, AVPF, SAVP, etc.)
Scalability | Unicast and multicast, from one-to-one calls to large conferences
Session Setup | External (SIP/SDP, H.323, WebRTC signaling)

RTP Packet Structure

Every RTP packet consists of a header followed by a payload (the media data). The fixed RTP header is 12 bytes long and contains fields that enable proper delivery and playback of real-time media.

RTP Header Fields

RTP Header Fields Reference
Field | Bits | Description
Version (V) | 2 | RTP version number. Always 2 for RFC 3550.
Padding (P) | 1 | If set, the packet contains padding bytes at the end. The last byte gives the padding count.
Extension (X) | 1 | If set, a header extension follows the CSRC list.
CSRC Count (CC) | 4 | Number of CSRC identifiers (0-15).
Marker (M) | 1 | Profile-specific meaning. Video: last packet of frame. Audio: start of talkspurt.
Payload Type (PT) | 7 | Media format identifier. Static assignments fall in 0-34 (RFC 3551); 96-127 are dynamic.
Sequence Number | 16 | Increments by 1 per packet. Starts at a random value. Wraps to 0 after 65535.
Timestamp | 32 | Sampling instant of the first byte of the payload. Clock rate depends on the payload format.
SSRC | 32 | Synchronization source identifier. Randomly chosen, unique within a session.
CSRC List | 0-480 | Contributing source IDs (used by mixers). Up to 15 × 32-bit IDs.

Example RTP Header:

V=2, P=0, X=0, CC=0, M=1, PT=96, Seq=12345, Timestamp=0x30551980, SSRC=0x1A2B3C4D

This indicates: RTP version 2, no padding/extension, no CSRCs, marker set (end of frame), payload type 96 (dynamic - e.g., H.264), sequence 12345, with SSRC 0x1A2B3C4D.
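
To make the field layout concrete, the following is a minimal Python sketch that packs the example header above into wire format and parses it back. The names `RtpHeader` and `parse_rtp_header` are illustrative and not taken from any particular library; the sketch handles only the fixed header and CSRC list, not header extensions or padding.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int
    csrcs: Tuple[int, ...]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the fixed 12-byte RTP header plus any CSRC entries."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Network byte order: two flag bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F
    end = 12 + 4 * cc
    if len(packet) < end:
        raise ValueError("packet truncated inside the CSRC list")
    csrcs = struct.unpack(f"!{cc}I", packet[12:end])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=cc,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrcs=csrcs,
    )


# The example header from the text: V=2, M=1, PT=96, Seq=12345.
example = struct.pack("!BBHII", 0x80, 0x80 | 96, 12345, 0x30551980, 0x1A2B3C4D)
header = parse_rtp_header(example)
print(header)
```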

Header Extensions

RTP allows optional header extensions when the X bit is set. RFC 5285/8285 introduced one-byte and two-byte extension formats allowing multiple elements:

Extension Type | Description | Negotiation
One-byte header | Element IDs 1-14, each element 1-16 bytes | SDP a=extmap
Two-byte header | 8-bit ID and 8-bit length, elements of 0-255 bytes | SDP a=extmap
Common uses | Audio levels, video orientation, timing info | Signaling agreement
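
As an illustration of the one-byte format, here is a hedged Python sketch that walks an extension block. It assumes the input starts at the 16-bit 0xBEDE marker that follows the CSRC list, and that each element byte carries a 4-bit ID and a 4-bit length field equal to the data length minus one. The function name and the sample bytes are made up for the example.

```python
import struct
from typing import Dict


def parse_one_byte_extensions(ext: bytes) -> Dict[int, bytes]:
    """Return {element ID: data} for a one-byte-format header extension block."""
    if len(ext) < 4:
        raise ValueError("extension block shorter than its 4-byte header")
    profile, words = struct.unpack("!HH", ext[:4])
    if profile != 0xBEDE:
        raise ValueError("not a one-byte header extension block")
    body = ext[4:4 + 4 * words]
    if len(body) < 4 * words:
        raise ValueError("extension block truncated")
    elements: Dict[int, bytes] = {}
    i = 0
    while i < len(body):
        ext_id = body[i] >> 4
        if ext_id == 0:          # ID 0 is padding: skip a single byte
            i += 1
            continue
        if ext_id == 15:         # ID 15 is reserved: stop parsing
            break
        length = (body[i] & 0x0F) + 1   # length field stores (bytes - 1)
        elements[ext_id] = body[i + 1:i + 1 + length]
        i += 1 + length
    return elements


# Two elements (ID 1 with 1 byte, ID 3 with 2 bytes) padded to two 32-bit words.
block = struct.pack("!HH", 0xBEDE, 2) + bytes([0x10, 0x55, 0x31, 0x00, 0x01, 0, 0, 0])
parsed = parse_one_byte_extensions(block)
print(parsed)
```

Which ID maps to which extension (audio level, video orientation, and so on) is not fixed by the format; it comes from the a=extmap lines negotiated in SDP.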

RTP Control Protocol (RTCP)

RTCP is defined alongside RTP in RFC 3550 and provides:

  • Quality of Service feedback - packet loss, jitter, round-trip time
  • Inter-media synchronization - correlate audio/video timestamps via NTP
  • Participant identification - CNAME and other source descriptors
  • Session control - keep-alive, goodbye notifications

RTCP Packet Types

RTCP Packet Types (RFC 3550)
Type | Code | Name | Sender | Contents
SR | 200 | Sender Report | Active senders | NTP/RTP timestamps, packet/byte counts, reception reports
RR | 201 | Receiver Report | Non-senders | Fraction lost, cumulative loss, jitter, LSR, DLSR
SDES | 202 | Source Description | All participants | CNAME (required), NAME, EMAIL, PHONE, LOC, TOOL, NOTE
BYE | 203 | Goodbye | Leaving participant | SSRC of departing stream, optional reason
APP | 204 | Application-specific | Application-defined | Custom data for experimental features

RTCP Compound Packet Structure

Per RFC 3550, each RTCP compound packet must:

  1. Start with SR or RR
  2. Include SDES with at least CNAME
  3. Optionally include BYE, APP, or other packets
+--------+--------+--------+--------+
|   SR or RR (required first)       |
+--------+--------+--------+--------+
|   SDES with CNAME (required)      |
+--------+--------+--------+--------+
|   BYE (optional)                  |
+--------+--------+--------+--------+
|   APP (optional)                  |
+--------+--------+--------+--------+
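
The compound-packet rules can be checked mechanically. The sketch below walks the common RTCP header (version, packet type, and a length field counted in 32-bit words minus one) and verifies the ordering rule. It is a simplified check under stated assumptions: it confirms that an SDES packet is present but does not look inside it for the CNAME item, and the function names are illustrative.

```python
import struct
from typing import List

RTCP_SR, RTCP_RR, RTCP_SDES = 200, 201, 202


def rtcp_packet_types(compound: bytes) -> List[int]:
    """List the packet types found in a compound RTCP packet, in order."""
    types: List[int] = []
    offset = 0
    while offset + 4 <= len(compound):
        b0, pt, length_words = struct.unpack("!BBH", compound[offset:offset + 4])
        if b0 >> 6 != 2:
            raise ValueError("unexpected RTCP version")
        size = 4 * (length_words + 1)   # length field excludes the first word
        if offset + size > len(compound):
            raise ValueError("RTCP packet truncated")
        types.append(pt)
        offset += size
    return types


def is_valid_compound(compound: bytes) -> bool:
    """True if the packet starts with SR or RR and also carries an SDES packet."""
    types = rtcp_packet_types(compound)
    return bool(types) and types[0] in (RTCP_SR, RTCP_RR) and RTCP_SDES in types


# An empty RR followed by an SDES chunk with CNAME "a@b" (padded to 32 bits).
rr = struct.pack("!BBHI", 0x80, RTCP_RR, 1, 0x1A2B3C4D)
sdes = struct.pack("!BBHI", 0x81, RTCP_SDES, 3, 0x1A2B3C4D) + b"\x01\x03a@b\x00\x00\x00"
print(rtcp_packet_types(rr + sdes), is_valid_compound(rr + sdes))
```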

RTCP Bandwidth Control

Parameter | Value | Description
Total RTCP bandwidth | ~5% of session bandwidth | Prevents control overhead from dominating
Senders share | 25% of RTCP (1.25% total) | For SR packets
Receivers share | 75% of RTCP (3.75% total) | For RR packets
Minimum interval | 5 seconds (AVP profile) | Between RTCP reports
Scaling | Randomized, participant-based | Adapts to session size
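
The following sketch shows how these parameters combine into a reporting interval. It is a simplified reading of the interval computation in RFC 3550 (Appendix A.7), not a complete implementation: it omits the halved initial interval and the timer reconsideration logic, and the function and parameter names are assumptions made for the example.

```python
import random


def rtcp_deterministic_interval(session_bw_bps: float, members: int, senders: int,
                                we_sent: bool, avg_rtcp_size_bytes: float,
                                t_min: float = 5.0, rtcp_fraction: float = 0.05) -> float:
    """Interval in seconds before randomization."""
    rtcp_bw = session_bw_bps * rtcp_fraction / 8.0   # RTCP budget in bytes per second
    n = members
    # When senders are a small minority, they get a dedicated 25% share.
    if senders <= members * 0.25:
        if we_sent:
            rtcp_bw *= 0.25
            n = senders
        else:
            rtcp_bw *= 0.75
            n = members - senders
    return max(avg_rtcp_size_bytes * n / rtcp_bw, t_min)


def rtcp_randomized_interval(deterministic: float, rng=random) -> float:
    """Spread reports over [0.5, 1.5] x interval, divided by the e - 1.5 compensation factor."""
    return deterministic * rng.uniform(0.5, 1.5) / 1.21828


two_party = rtcp_deterministic_interval(64000, 2, 2, True, 100)
large_conf = rtcp_deterministic_interval(64000, 1000, 10, False, 100)
print(two_party, large_conf)
```

In the two-party call the 5-second minimum dominates, while in the 1000-member session a receiver reports only every few minutes, which is how RTCP traffic stays near the 5% budget regardless of session size.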

Receiver Report Fields

RTCP Receiver Report Block Fields
Field | Size | Description
SSRC of source | 32 bits | Which stream this report is about
Fraction lost | 8 bits | Packets lost / packets expected since the last report, as a fixed-point fraction (value / 256)
Cumulative lost | 24 bits | Total packets lost since the start of reception (signed)
Extended highest seq | 32 bits | Highest sequence number received, extended with the rollover count
Interarrival jitter | 32 bits | Estimate of the statistical variance of packet inter-arrival time, in RTP timestamp units
Last SR (LSR) | 32 bits | Middle 32 bits of the NTP timestamp from the last SR
Delay since last SR (DLSR) | 32 bits | Delay between receiving the last SR and sending this RR, in units of 1/65536 s
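
A short sketch of how a 24-byte report block might be decoded, and how a sender can derive round-trip time from LSR and DLSR. The class and function names are illustrative. The round-trip formula (arrival time minus LSR minus DLSR, all in 1/65536 s units) follows RFC 3550, and the sample values in the usage lines are the ones used in that RFC's worked example.

```python
import struct
from typing import NamedTuple


class ReportBlock(NamedTuple):
    ssrc: int
    fraction_lost: float     # 0.0 .. 255/256
    cumulative_lost: int     # signed 24-bit value
    highest_seq: int
    jitter: int              # RTP timestamp units
    lsr: int
    dlsr: int                # 1/65536 s units


def parse_report_block(block: bytes) -> ReportBlock:
    if len(block) != 24:
        raise ValueError("a report block is exactly 24 bytes")
    ssrc, loss_word, highest, jitter, lsr, dlsr = struct.unpack("!IIIIII", block)
    cumulative = loss_word & 0xFFFFFF
    if cumulative & 0x800000:            # sign-extend the 24-bit counter
        cumulative -= 1 << 24
    return ReportBlock(ssrc, (loss_word >> 24) / 256.0, cumulative,
                       highest, jitter, lsr, dlsr)


def round_trip_seconds(arrival_ntp_mid32: int, lsr: int, dlsr: int) -> float:
    """RTT = A - LSR - DLSR, computed modulo 2^32 in 1/65536 s units."""
    return ((arrival_ntp_mid32 - lsr - dlsr) & 0xFFFFFFFF) / 65536.0


raw = struct.pack("!IIIIII", 0x1A2B3C4D, (64 << 24) | 10, 0x00010005,
                  160, 0xB7052000, 0x00054000)
report = parse_report_block(raw)
rtt = round_trip_seconds(0xB7108000, report.lsr, report.dlsr)
print(report.fraction_lost, report.cumulative_lost, rtt)
```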

RTP Profiles and Payload Types

RTP profiles define how the protocol is used for specific applications:

Common RTP Profiles
Profile | RFC | Description
RTP/AVP | 3551 | Audio/Video Profile - standard A/V conferencing
RTP/AVPF | 4585 | AVP with Feedback - immediate RTCP feedback (PLI, NACK)
RTP/SAVP | 3711 | Secure AVP - SRTP encryption and authentication
RTP/SAVPF | 5124 | Secure AVPF - SRTP with feedback

Static Payload Types (AVP Profile)

Common Static Payload Types
PT | Encoding | Clock Rate | Type
0 | PCMU (G.711 μ-law) | 8000 Hz | Audio
3 | GSM | 8000 Hz | Audio
4 | G723 | 8000 Hz | Audio
8 | PCMA (G.711 A-law) | 8000 Hz | Audio
9 | G722 | 8000 Hz | Audio
18 | G729 | 8000 Hz | Audio
31 | H261 | 90000 Hz | Video
32 | MPV (MPEG-1/2 Video) | 90000 Hz | Video
34 | H263 | 90000 Hz | Video
96-127 | Dynamic | Negotiated | Any

Dynamic Payload Type Negotiation

Dynamic payload types (96-127) are negotiated via SDP:

m=audio 4000 RTP/AVP 96 97
a=rtpmap:96 opus/48000/2
a=rtpmap:97 telephone-event/8000
a=fmtp:97 0-16

m=video 4002 RTP/AVP 98
a=rtpmap:98 H264/90000
a=fmtp:98 profile-level-id=42e01f
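
A receiver has to turn these a=rtpmap lines into a payload-type lookup before it can interpret incoming packets. The sketch below is one minimal way to do that in Python; it only handles a=rtpmap, ignores a=fmtp and media-section boundaries, and the names `Codec` and `parse_rtpmap` are made up for the example.

```python
from typing import Dict, NamedTuple


class Codec(NamedTuple):
    name: str
    clock_rate: int
    channels: int   # only meaningful for audio; defaults to 1 when omitted


def parse_rtpmap(sdp: str) -> Dict[int, Codec]:
    """Map payload type numbers to codec descriptions from a=rtpmap lines."""
    codecs: Dict[int, Codec] = {}
    prefix = "a=rtpmap:"
    for line in sdp.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue
        pt_text, _, encoding = line[len(prefix):].partition(" ")
        parts = encoding.split("/")          # name / clock rate [/ channels]
        channels = int(parts[2]) if len(parts) > 2 else 1
        codecs[int(pt_text)] = Codec(parts[0], int(parts[1]), channels)
    return codecs


sdp_text = """\
m=audio 4000 RTP/AVP 96 97
a=rtpmap:96 opus/48000/2
a=rtpmap:97 telephone-event/8000
a=fmtp:97 0-16
m=video 4002 RTP/AVP 98
a=rtpmap:98 H264/90000
a=fmtp:98 profile-level-id=42e01f
"""
payload_types = parse_rtpmap(sdp_text)
print(payload_types)
```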

Session Setup and Signaling

RTP does not provide session establishment - this is handled by external signaling protocols.

Transport Ports

Convention | RTP Port | RTCP Port | Notes
Traditional | Even number (e.g., 4000) | RTP + 1 (e.g., 4001) | Separate ports
RTCP Mux | Same as RTP | Same as RTP | SDP: a=rtcp-mux
Demultiplexing | Payload type 0-127 | Packet type >= 200 (RTCP types) | By the second byte (RTP marker + PT vs. RTCP packet type)
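
When RTP and RTCP share a port, a receiver can tell them apart from the second byte of each packet. The sketch below uses the 192-223 range for RTCP, which is the commonly used rule derived from RFC 5761 (it corresponds to RTP payload types 64-95, which should be avoided when multiplexing). The function name is illustrative.

```python
def classify_rtp_or_rtcp(packet: bytes) -> str:
    """Classify a packet on an rtcp-mux port as 'rtp', 'rtcp', or 'unknown'."""
    if len(packet) < 2 or packet[0] >> 6 != 2:
        return "unknown"                 # too short, or version field is not 2
    # RTCP packet types 200-204 (and the rest of 192-223) never collide with
    # RTP payload types as long as dynamic types stay in 96-127.
    return "rtcp" if 192 <= packet[1] <= 223 else "rtp"


print(classify_rtp_or_rtcp(bytes([0x80, 200])))        # Sender Report
print(classify_rtp_or_rtcp(bytes([0x80, 0x80 | 96])))  # RTP, marker set, PT 96
```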

SDP Media Description

v=0
o=alice 2890844526 2890844526 IN IP4 10.1.1.5
s=VoIP Call
c=IN IP4 10.1.1.5
t=0 0
m=audio 4000 RTP/AVP 0 8 96
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:96 opus/48000/2
a=rtcp-mux
a=sendrecv

Mixers and Translators

RTP supports intermediate nodes for complex topologies:

Mixer vs Translator Comparison
Feature | Translator | Mixer
SSRC handling | Preserves original SSRCs | Creates new SSRC, lists originals in CSRC
Stream output | Separate streams forwarded | Single combined stream
Use cases | Relay, transcoding, firewall traversal | Audio conferencing, video compositing
Timing | Maintains original timing | Becomes timing master
Bandwidth | Sum of all streams | Single stream (reduced)
RTCP | Forwards or adjusts | Generates own SR, sends RR upstream

Translator Use Cases

  • Multicast-to-unicast relay
  • Encryption/decryption gateway
  • Codec transcoding
  • IPv4/IPv6 bridging
  • Firewall traversal

Mixer Use Cases

  • Audio conference bridges (mixing multiple speakers)
  • Video MCU (compositing multiple feeds)
  • Bandwidth reduction for large conferences

Security Considerations (SRTP)

SRTP (Secure RTP, RFC 3711) provides:

  • Confidentiality - AES encryption of payload
  • Integrity - HMAC-SHA1 authentication
  • Replay protection - sequence number tracking
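
Replay protection is typically implemented as a sliding window over packet indices. The sketch below shows the idea with a 64-entry bitmap. It is a simplified illustration rather than a full SRTP implementation: it assumes the caller has already derived the extended packet index (rollover counter plus sequence number) and has verified the authentication tag before recording the packet. The class name is hypothetical.

```python
class ReplayWindow:
    """Sliding-window replay check over extended packet indices."""

    def __init__(self, size: int = 64) -> None:
        self.size = size
        self.highest = -1    # highest index accepted so far
        self.bitmap = 0      # bit i set => index (highest - i) was seen

    def check_and_update(self, index: int) -> bool:
        """Return True and record the index if it is fresh; False if replayed or too old."""
        if index > self.highest:
            shift = index - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = index
            return True
        offset = self.highest - index
        if offset >= self.size:
            return False                     # fell out of the window
        if self.bitmap & (1 << offset):
            return False                     # already seen: replay
        self.bitmap |= 1 << offset           # late but fresh packet
        return True


window = ReplayWindow()
results = [window.check_and_update(i) for i in (0, 2, 1, 1, 2, 100, 30)]
print(results)
```

Reordered packets inside the window (index 1 arriving after 2) are accepted once, while duplicates and packets older than the window are rejected.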

SRTP Key Exchange Methods

Method | RFC | Description | Usage
SDES | 4568 | Keys in SDP (deprecated) | Legacy SIP
DTLS-SRTP | 5764 | In-band DTLS handshake | WebRTC (mandatory)
ZRTP | 6189 | In-call DH exchange | Opportunistic encryption
MIKEY | 3830 | Multimedia Internet KEYing | Group scenarios

DTLS-SRTP

DTLS-SRTP (used in WebRTC) provides:

  • Perfect forward secrecy (DH key exchange)
  • Certificate fingerprint verification (via SDP)
  • Multiplexing on same port as RTP/RTCP/STUN

SDP attributes for DTLS-SRTP:

a=fingerprint:sha-256 AB:CD:EF:...
a=setup:actpass
a=rtcp-mux
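
Because DTLS, STUN, and RTP/RTCP share one port, the receiver has to demultiplexes on the first byte of each datagram. The sketch below uses the first-byte ranges from RFC 7983; the function name and return labels are illustrative.

```python
def classify_first_byte(packet: bytes) -> str:
    """Demultiplex a datagram received on a shared WebRTC media port."""
    if not packet:
        return "unknown"
    b = packet[0]
    if b <= 3:
        return "stun"
    if 16 <= b <= 19:
        return "zrtp"
    if 20 <= b <= 63:
        return "dtls"
    if 64 <= b <= 79:
        return "turn-channel"
    if 128 <= b <= 191:
        return "rtp/rtcp"      # version 2 in the top two bits
    return "unknown"


print(classify_first_byte(b"\x00\x01"))      # STUN binding request
print(classify_first_byte(b"\x16\xfe\xfd"))  # DTLS handshake record
print(classify_first_byte(b"\x80\x60"))      # RTP, PT 96
```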

Troubleshooting RTP

Common RTP Issues and Diagnostics
Issue | Symptoms | RTCP Indicator | Solution
Packet loss | Choppy audio, video artifacts | High fraction lost in RR | Check network path, QoS
High jitter | Audio gaps, video stuttering | High jitter value in RR | Increase jitter buffer, check network
One-way audio | Only one party hears | No RTP received | Check NAT, firewall, SDP IPs
No media | Complete silence | No RR/SR packets | Verify signaling, check ports
Codec mismatch | Garbled audio | PT doesn't match expected | Verify SDP negotiation
Clock drift | A/V desync over time | Compare SR timestamps | Use RTCP for sync
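
When RTCP reports are missing or suspect, loss and jitter can be recomputed from a capture. The sketch below applies the RFC 3550 jitter estimator (a running average of transit-time differences with gain 1/16) and a simple loss count. The input shapes are assumptions made for the example: `packets` is a list of (arrival time in seconds, RTP timestamp) pairs, and the sequence numbers are assumed to be already extended so that wraparound does not need handling.

```python
from typing import Iterable, Sequence, Tuple


def interarrival_jitter(packets: Iterable[Tuple[float, int]], clock_rate: int) -> float:
    """Jitter estimate in RTP timestamp units (ignores timestamp wraparound)."""
    jitter = 0.0
    prev_transit = None
    for arrival_s, rtp_ts in packets:
        transit = arrival_s * clock_rate - rtp_ts
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter


def loss_stats(extended_seqs: Sequence[int]) -> Tuple[int, float]:
    """Return (packets lost, fraction lost) from extended sequence numbers."""
    expected = max(extended_seqs) - min(extended_seqs) + 1
    received = len(set(extended_seqs))
    lost = expected - received
    return lost, lost / expected


steady = interarrival_jitter([(0.00, 0), (0.02, 160), (0.04, 320)], 8000)
delayed = interarrival_jitter([(0.00, 0), (0.03, 160)], 8000)
lost, fraction = loss_stats([100, 101, 103, 104])
print(steady, delayed, lost, fraction)
```

Dividing the jitter value by the clock rate converts it to seconds, so a jitter of 5 units at 8000 Hz is 0.625 ms.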

Wireshark RTP Analysis

Key filters for RTP analysis:

rtp                          # All RTP packets
rtcp                         # All RTCP packets
rtp.ssrc == 0x1234abcd       # Specific stream
rtp.marker == 1              # Frame boundaries
rtcp.pt == 200               # Sender Reports
rtcp.pt == 201               # Receiver Reports

The Telephony > RTP > RTP Streams menu lists all detected streams with per-stream statistics.

Quick Reference Tables

RTP Header Bit Layout

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |       Sequence Number         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             SSRC                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          CSRC list                            |
|                             ....                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Clock Rates by Media Type

Media Type | Typical Clock Rate | Examples
Narrowband audio | 8000 Hz | G.711, G.729, GSM
Wideband audio | 16000 Hz | G.722.1, AMR-WB
Full-band audio | 48000 Hz | Opus
Video | 90000 Hz | H.264, VP8, H.265
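
The clock rate determines how far the RTP timestamp advances between packets. A small sketch of the arithmetic, assuming fixed packet durations (20 ms audio frames and 30 fps video are example values, not requirements of the protocol):

```python
def timestamp_increment(clock_rate_hz: int, packet_duration_ms: int) -> int:
    """Timestamp units covered by one packet of the given duration."""
    return clock_rate_hz * packet_duration_ms // 1000


def next_timestamp(current: int, increment: int) -> int:
    """Advance a 32-bit RTP timestamp, wrapping modulo 2^32."""
    return (current + increment) & 0xFFFFFFFF


g711_step = timestamp_increment(8000, 20)     # 20 ms of G.711
opus_step = timestamp_increment(48000, 20)    # 20 ms of Opus
video_step = 90000 // 30                      # one frame at 30 fps
wrapped = next_timestamp(0xFFFFFFF0, g711_step)
print(g711_step, opus_step, video_step, wrapped)
```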

Marker Bit Usage

Media Type | M=1 Meaning | Purpose
Video | Last packet of frame | Frame boundary detection
Audio (silence suppression) | First packet after silence | Talkspurt indication
RFC 4733 DTMF | First packet of a DTMF event | Event start (the end is signaled by the E bit in the payload)

References

Primary Standards

  • RFC 3550 - RTP: A Transport Protocol for Real-Time Applications
  • RFC 3551 - RTP Profile for Audio and Video Conferences (AVP)
  • RFC 3711 - The Secure Real-time Transport Protocol (SRTP)
  • RFC 4585 - Extended RTP Profile for RTCP-Based Feedback (AVPF)

Extensions and Updates

  • RFC 5761 - Multiplexing RTP Data and Control Packets on a Single Port
  • RFC 5764 - DTLS Extension to Establish Keys for SRTP
  • RFC 6051 - Rapid Synchronisation of RTP Flows
  • RFC 6222 - Guidelines for Choosing RTCP Canonical Names (CNAMEs) (obsoleted by RFC 7022)
  • RFC 8285 - A General Mechanism for RTP Header Extensions

Related Protocols

  • RFC 4566 - SDP: Session Description Protocol (obsoleted by RFC 8866)
  • RFC 3261 - SIP: Session Initiation Protocol
  • RFC 4733 - RTP Payload for DTMF Digits, Telephony Tones
