RTP, Real-time Transport Protocol

Description Glossary RFCs Publications Obsolete RFCs

Description:

Protocol suite: TCP/IP.
Protocol type: Application layer protocol.
Related protocol: RTCP, RTP Control Protocol.
Port: 5004 (UDP).
MIME subtype:
SNMP MIBs: iso.org.dod.internet.mgmt.mib-2.rtpMIB (1.3.6.1.2.1.87).
iso.org.dod.internet.mgmt.mib-2.rohcRtpMIB (1.3.6.1.2.1.114).
Working groups: avt, Audio/Video Transport.
fecframe, FEC Framework.
Links: IANA: RTP parameters.
RTP at cs.columbia.edu

RFC 3550:

RTP provides end-to-end network transport functions suitable for applications transmitting real-time data, such as audio, video or simulation data, over multicast or unicast network services. RTP does not address resource reservation and does not guarantee quality-of-service for real-time services. The data transport is augmented by a control protocol (RTCP) to allow monitoring of the data delivery in a manner scalable to large multicast networks, and to provide minimal control and identification functionality. RTP and RTCP are designed to be independent of the underlying transport and network layers. The protocol supports the use of RTP-level translators and mixers.


MAC header | IP header | UDP header | RTP message

RTP header, version 2:

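 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Ver|P|X|  CC   |M|     PT      |        Sequence Number        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             SSRC                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       CSRC [0..15] :::                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+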

Ver, Version. 2 bits.
RTP version number. Always set to 2.

P, Padding. 1 bit.
If set, this packet contains one or more additional padding bytes at the end which are not part of the payload. The last byte of the padding contains a count of how many padding bytes should be ignored. Padding may be needed by some encryption algorithms with fixed block sizes or for carrying several RTP packets in a lower-layer protocol data unit.
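
As a minimal sketch of the padding rule, the function below removes trailing padding from a packet when the P bit is set. The name strip_padding and the payload_offset parameter are assumptions for illustration; the caller is expected to know where the payload starts (12 bytes of fixed header plus any CSRC list and header extension).

```python
def strip_padding(packet: bytes, payload_offset: int = 12) -> bytes:
    """Return the payload with trailing padding removed.

    payload_offset is a hypothetical parameter: the byte index where the
    payload begins, which the caller derives from the header fields.
    """
    if len(packet) < payload_offset or len(packet) < 1:
        raise ValueError("packet shorter than its header")
    payload = packet[payload_offset:]
    # P is the third most significant bit of the first byte (mask 0x20).
    if not packet[0] & 0x20:
        return payload
    if not payload:
        raise ValueError("P bit set but no bytes follow the header")
    # The last byte counts the padding bytes to ignore, itself included.
    pad_len = payload[-1]
    if pad_len == 0 or pad_len > len(payload):
        raise ValueError("invalid padding count")
    return payload[:-pad_len]
```

A receiver would typically reject packets that fail these checks rather than guess at the payload length.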

X, Extension. 1 bit.
If set, the fixed header is followed by exactly one header extension.

CC, CSRC count. 4 bits.
The number of CSRC identifiers that follow the fixed header.

M, Marker. 1 bit.
The interpretation of the marker is defined by a profile. It is intended to allow significant events such as frame boundaries to be marked in the packet stream. A profile may define additional marker bits or specify that there is no marker bit by changing the number of bits in the payload type field.

PT, Payload Type. 7 bits.
Identifies the format of the RTP payload and determines its interpretation by the application. A profile specifies a default static mapping of payload type codes to payload formats. Additional payload type codes may be defined dynamically through non-RTP means. An RTP sender emits a single RTP payload type at any given time; this field is not intended for multiplexing separate media streams.

PT       Name       Type         Clock rate (Hz)  Audio channels  References
0        PCMU       Audio        8000             1               RFC 3551
1        1016       Audio        8000             1               RFC 3551
2        G721       Audio        8000             1               RFC 3551
3        GSM        Audio        8000             1               RFC 3551
4        G723       Audio        8000             1
5        DVI4       Audio        8000             1               RFC 3551
6        DVI4       Audio        16000            1               RFC 3551
7        LPC        Audio        8000             1               RFC 3551
8        PCMA       Audio        8000             1               RFC 3551
9        G722       Audio        8000             1               RFC 3551
10       L16        Audio        44100            2               RFC 3551
11       L16        Audio        44100            1               RFC 3551
12       QCELP      Audio        8000             1
13       CN         Audio        8000             1               RFC 3389
14       MPA        Audio        90000                            RFC 2250, RFC 3551
15       G728       Audio        8000             1               RFC 3551
16       DVI4       Audio        11025            1
17       DVI4       Audio        22050            1
18       G729       Audio        8000             1
19       reserved   Audio
20-24
25       CellB      Video        90000                            RFC 2029
26       JPEG       Video        90000                            RFC 2435
27
28       nv         Video        90000                            RFC 3551
29
30
31       H261       Video        90000                            RFC 2032
32       MPV        Video        90000                            RFC 2250
33       MP2T       Audio/Video  90000                            RFC 2250
34       H263       Video        90000
35-71
72-76    reserved                                                 RFC 3550
77-95
96-127   dynamic                                                  RFC 3551
dynamic  GSM-HR     Audio        8000             1
dynamic  GSM-EFR    Audio        8000             1
dynamic  L8         Audio        variable         variable
dynamic  RED        Audio
dynamic  VDVI       Audio        variable         1
dynamic  BT656      Video        90000
dynamic  H263-1998  Video        90000
dynamic  MP1S       Video        90000
dynamic  MP2P       Video        90000
dynamic  BMPEG      Video        90000
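
The ranges in the table above can be sketched as a small lookup. This is an illustration only: STATIC_PAYLOAD_TYPES holds just a few rows copied from the table, and the function name classify_payload_type and its return strings are assumptions, not part of any RTP library.

```python
# A subset of the static assignments listed above: PT -> (name, clock rate in Hz).
STATIC_PAYLOAD_TYPES = {
    0: ("PCMU", 8000),
    3: ("GSM", 8000),
    8: ("PCMA", 8000),
    9: ("G722", 8000),
    26: ("JPEG", 90000),
    31: ("H261", 90000),
    33: ("MP2T", 90000),
}


def classify_payload_type(pt: int) -> str:
    """Describe a 7-bit payload type using the ranges from the table."""
    if not 0 <= pt <= 127:
        raise ValueError("payload type must fit in 7 bits")
    if pt in STATIC_PAYLOAD_TYPES:
        name, rate = STATIC_PAYLOAD_TYPES[pt]
        return f"static {name}/{rate}"
    if 72 <= pt <= 76:
        return "reserved"
    if 96 <= pt <= 127:
        # The actual format is negotiated through non-RTP means.
        return "dynamic"
    return "not in this sketch's table"
```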

Sequence Number. 16 bits.
The sequence number increments by one for each RTP data packet sent, and may be used by the receiver to detect packet loss and to restore packet sequence. The initial value of the sequence number is random (unpredictable) to make known-plaintext attacks on encryption more difficult, even if the source itself does not encrypt, because the packets may flow through a translator that does.
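
Because the field is only 16 bits wide, a receiver has to compare sequence numbers modulo 65536. The sketch below shows one way to do that and to count missing packets from a list of received sequence numbers. The helper names seq_delta and count_lost are hypothetical, and the approach assumes that consecutive arrivals are never more than 32767 packets apart.

```python
def seq_delta(new: int, old: int) -> int:
    """Signed distance from old to new in 16-bit modular arithmetic."""
    return ((new - old + 0x8000) & 0xFFFF) - 0x8000


def count_lost(seqs: list) -> int:
    """Count packets missing from seqs, given in arrival order.

    Sequence numbers are extended past 16 bits so that wraparound and
    reordering do not register as loss. Duplicates are counted once.
    """
    if not seqs:
        return 0
    lowest = highest = seqs[0]
    seen = {seqs[0]}
    for s in seqs[1:]:
        # Extend s relative to the highest sequence number seen so far.
        ext = highest + seq_delta(s, highest & 0xFFFF)
        seen.add(ext)
        highest = max(highest, ext)
        lowest = min(lowest, ext)
    expected = highest - lowest + 1
    return expected - len(seen)
```

For example, the arrivals 65534, 65535, 0, 2 span a wraparound and are missing exactly one packet (sequence number 1).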

Timestamp. 32 bits.
The timestamp reflects the sampling instant of the first octet in the RTP data packet. The sampling instant must be derived from a clock that increments monotonically and linearly in time to allow synchronization and jitter calculations. The resolution of the clock must be sufficient for the desired synchronization accuracy and for measuring packet arrival jitter (one tick per video frame is typically not sufficient). The clock frequency is dependent on the format of data carried as payload and is specified statically in the profile or payload format specification that defines the format, or may be specified dynamically for payload formats defined through non-RTP means. If RTP packets are generated periodically, the nominal sampling instant as determined from the sampling clock is to be used, not a reading of the system clock. As an example, for fixed-rate audio the timestamp clock would likely increment by one for each sampling period. If an audio application reads blocks covering 160 sampling periods from the input device, the timestamp would be increased by 160 for each such block, regardless of whether the block is transmitted in a packet or dropped as silent.
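
The audio example above can be sketched as follows. The function names, the 8000 Hz clock rate and the starting values are assumptions for illustration. The point is that the timestamp advances by 160 for every block read from the device, while the sequence number advances only for blocks that are actually sent.

```python
CLOCK_RATE_HZ = 8000       # assumed clock rate for fixed-rate audio
SAMPLES_PER_BLOCK = 160    # sampling periods covered by one block


def packetize(silent_flags, first_seq, first_ts):
    """Return (sequence number, timestamp) pairs for the blocks that are sent.

    silent_flags holds one boolean per block; True means the block is
    dropped as silent and produces no packet.
    """
    packets = []
    seq, ts = first_seq, first_ts
    for silent in silent_flags:
        if not silent:
            packets.append((seq, ts))
            seq = (seq + 1) & 0xFFFF  # 16-bit sequence number wraps
        # The timestamp advances for every block, sent or not.
        ts = (ts + SAMPLES_PER_BLOCK) & 0xFFFFFFFF  # 32-bit timestamp wraps
    return packets


def timestamp_delta_seconds(newer, older, clock_rate=CLOCK_RATE_HZ):
    """Elapsed media time between two timestamps, allowing for wraparound."""
    return ((newer - older) & 0xFFFFFFFF) / clock_rate
```

With two silent blocks between two sent blocks, the sequence numbers are consecutive but the timestamps differ by 480 ticks, which is 60 ms at 8000 Hz.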

SSRC, Synchronization source. 32 bits.
Identifies the synchronization source. The value is chosen randomly, with the intent that no two synchronization sources within the same RTP session will have the same SSRC. Although the probability of multiple sources choosing the same identifier is low, all RTP implementations must be prepared to detect and resolve collisions. If a source changes its source transport address, it must also choose a new SSRC to avoid being interpreted as a looped source.
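
A minimal sketch of choosing an SSRC, assuming the application keeps a set of identifiers it has already observed in the session (the in_use parameter and the function name choose_ssrc are hypothetical). It covers only the random selection step; full collision and loop handling involves more than picking a new value.

```python
import secrets


def choose_ssrc(in_use=frozenset()):
    """Pick a random 32-bit SSRC that is not already known to be in use."""
    while True:
        candidate = secrets.randbits(32)
        if candidate not in in_use:
            return candidate
```

The same function can be called again with an updated set when a collision is detected or when the source transport address changes.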

CSRC, Contributing source. 32 bits.
An array of 0 to 15 CSRC elements identifying the contributing sources for the payload contained in this packet. The number of identifiers is given by the CC field. If there are more than 15 contributing sources, only 15 may be identified. CSRC identifiers are inserted by mixers, using the SSRC identifiers of contributing sources. For example, for audio packets the SSRC identifiers of all sources that were mixed together to create a packet are listed, allowing correct talker indication at the receiver.
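
Putting the field descriptions together, the following sketch parses the fixed header and the CSRC list from a raw packet. The RtpHeader type and parse_rtp_header function are illustrative names, not an existing library API, and the sketch does not interpret a header extension or the payload.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrc: Tuple[int, ...]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the 12-byte fixed header plus CC contributing source entries."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    # Network byte order: two bytes of flags, then 16-bit sequence number,
    # 32-bit timestamp and 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack_from("!BBHII", packet, 0)
    version = b0 >> 6
    if version != 2:
        raise ValueError("unsupported RTP version")
    cc = b0 & 0x0F
    if len(packet) < 12 + 4 * cc:
        raise ValueError("packet truncated inside the CSRC list")
    csrc = struct.unpack_from(f"!{cc}I", packet, 12) if cc else ()
    return RtpHeader(
        version=version,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrc=tuple(csrc),
    )
```

The payload (or the header extension, if X is set) starts at byte offset 12 + 4 * CC.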


Glossary:

CSRC, Contributing source.
(RFC 1889) A source of a stream of RTP packets that has contributed to the combined stream produced by an RTP mixer. The mixer inserts a list of the SSRC identifiers of the sources that contributed to the generation of a particular packet into the RTP header of that packet. This list is called the CSRC list. An example application is audio conferencing where a mixer indicates all the talkers whose speech was combined to produce the outgoing packet, allowing the receiver to indicate the current talker, even though all the audio packets contain the same SSRC identifier.

End system.
(RFC 1889) An application that generates the content to be sent in RTP packets and/or consumes the content of received RTP packets. An end system can act as one or more synchronization sources in a particular RTP session, but typically only one.

Mixer.
(RFC 3550) An intermediate system that receives RTP packets from one or more sources, possibly changes the data format, combines the packets in some manner and then forwards a new RTP packet. Since the timing among multiple input sources will not generally be synchronized, the mixer will make timing adjustments among the streams and generate its own timing for the combined stream. Thus, all data packets originating from a mixer will be identified as having the mixer as their synchronization source.

Monitor.
(RFC 1889) An application that receives RTCP packets sent by participants in an RTP session, in particular the reception reports, and estimates the current quality of service for distribution monitoring, fault diagnosis and long-term statistics. The monitor function is likely to be built into the application(s) participating in the session, but may also be a separate application that does not otherwise participate and does not send or receive the RTP data packets. These are called third party monitors.

RTP packet.
(RFC 1889) A data packet consisting of the fixed RTP header, a possibly empty list of contributing sources, and the payload data. Some underlying protocols may require an encapsulation of the RTP packet to be defined. Typically one packet of the underlying protocol contains a single RTP packet, but several RTP packets may be contained if permitted by the encapsulation method.

RTP payload.
(RFC 1889) The data transported by RTP in a packet, for example audio samples or compressed video data.

RTP session.
(RFC 1889) The association among a set of participants communicating with RTP. For each participant, the session is defined by a particular pair of destination transport addresses (one network address plus a port pair for RTP and RTCP). The destination transport address pair may be common for all participants, as in the case of IP multicast, or may be different for each, as in the case of individual unicast network addresses plus a common port pair. In a multimedia session, each medium is carried in a separate RTP session with its own RTCP packets. The multiple RTP sessions are distinguished by different port number pairs and/or different multicast addresses.

SSRC, Synchronization source.
(RFC 1889) The source of a stream of RTP packets, identified by a 32-bit numeric SSRC identifier carried in the RTP header so as not to be dependent upon the network address. All packets from a synchronization source form part of the same timing and sequence number space, so a receiver groups packets by synchronization source for playback. Examples of synchronization sources include the sender of a stream of packets derived from a signal source such as a microphone or a camera, or an RTP mixer. A synchronization source may change its data format, e.g., audio encoding, over time. The SSRC identifier is a randomly chosen value meant to be globally unique within a particular RTP session. A participant need not use the same SSRC identifier for all the RTP sessions in a multimedia session; the binding of the SSRC identifiers is provided through RTCP. If a participant generates multiple streams in one RTP session, for example from separate video cameras, each must be identified as a different SSRC.

Translator.
(RFC 1889) An intermediate system that forwards RTP packets with their synchronization source identifier intact. Examples of translators include devices that convert encodings without mixing, replicators from multicast to unicast, and application-level filters in firewalls.


RFCs:

[RFC 2029] RTP Payload Format of Sun's CellB Video Encoding.

[RFC 2032] RTP Payload Format for H.261 Video Streams.

[RFC 2190] RTP Payload Format for H.263 Video Streams.

[RFC 2198] RTP Payload for Redundant Audio Data.

[RFC 2250] RTP Payload Format for MPEG1/MPEG2 Video.

[RFC 2343] RTP Payload Format for Bundled MPEG.

[RFC 2429] RTP Payload Format for the 1998 Version of ITU-T Rec. H.263 Video (H.263+).

[RFC 2431] RTP Payload Format for BT.656 Video Encoding.

[RFC 2435] RTP Payload Format for JPEG-compressed Video.

[RFC 2508] Compressing IP/UDP/RTP Headers for Low-Speed Serial Links.

[RFC 2658] RTP Payload Format for PureVoice(tm) Audio.

[RFC 2733] An RTP Payload Format for Generic Forward Error Correction.

[RFC 2736] Guidelines for Writers of RTP Payload Format Specifications.

[RFC 2762] Sampling of the Group Membership in RTP.

[RFC 2833] RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals.

[RFC 2862] RTP Payload Format for Real-Time Pointers.

[RFC 2959] Real-Time Transport Protocol Management Information Base.

[RFC 3009] Registration of parityfec MIME types.

[RFC 3016] RTP Payload Format for MPEG-4 Audio/Visual Streams.

[RFC 3047] RTP Payload Format for ITU-T Recommendation G.722.1.

[RFC 3095] RObust Header Compression (ROHC): Framework and four profiles: RTP, UDP, ESP, and uncompressed.

[RFC 3119] A More Loss-Tolerant RTP Payload Format for MP3 Audio.

[RFC 3158] RTP Testing Strategies.

[RFC 3189] RTP Payload Format for DV (IEC 61834) Video.

[RFC 3190] RTP Payload Format for 12-bit DAT Audio and 20- and 24-bit Linear Sampled Audio.

[RFC 3267] Real-Time Transport Protocol (RTP) Payload Format and File Storage Format for the Adaptive Multi-Rate (AMR) and Adaptive Multi-Rate Wideband (AMR-WB) Audio Codecs.

[RFC 3389] Real-time Transport Protocol (RTP) Payload for Comfort Noise (CN).

[RFC 3497] RTP Payload Format for Society of Motion Picture and Television Engineers (SMPTE) 292M Video.

[RFC 3545] Enhanced Compressed RTP (CRTP) for Links with High Delay, Packet Loss and Reordering.

[RFC 3550] RTP: A Transport Protocol for Real-Time Applications.

[RFC 3551] RTP Profile for Audio and Video Conferences with Minimal Control.

[RFC 3555] MIME Type Registration of RTP Payload Formats.

[RFC 3557] RTP Payload Format for European Telecommunications Standards Institute (ETSI) European Standard ES 201 108 Distributed Speech Recognition Encoding.

[RFC 3558] RTP Payload Format for Enhanced Variable Rate Codecs (EVRC) and Selectable Mode Vocoders (SMV).

[RFC 3640] RTP Payload Format for Transport of MPEG-4 Elementary Streams.

[RFC 3711] The Secure Real-time Transport Protocol (SRTP).

[RFC 3816] Definitions of Managed Objects for RObust Header Compression (ROHC).

[RFC 3952] Real-time Transport Protocol (RTP) Payload Format for internet Low Bit Rate Codec (iLBC) Speech.

[RFC 3984] RTP Payload Format for H.264 Video.

[RFC 4040] RTP Payload Format for a 64 kbit/s Transparent Call.

[RFC 4060] RTP Payload Formats for European Telecommunications Standards Institute (ETSI) European Standard ES 202 050, ES 202 211, and ES 202 212 Distributed Speech Recognition Encoding.

[RFC 4103] RTP Payload for Text Conversation.

[RFC 4170] Tunneling Multiplexed Compressed RTP (TCRTP).

[RFC 4175] RTP Payload Format for Uncompressed Video.

[RFC 4184] RTP Payload Format for AC-3 Audio.

[RFC 4298] RTP Payload Format for BroadVoice Speech Codecs.

[RFC 4348] Real-Time Transport Protocol (RTP) Payload Format for the Variable-Rate Multimode Wideband (VMR-WB) Audio Codec.

[RFC 4351] Real-Time Transport Protocol (RTP) Payload for Text Conversation Interleaved in an Audio Stream.

[RFC 4352] RTP Payload Format for the Extended Adaptive Multi-Rate Wideband (AMR-WB+) Audio Codec.

[RFC 4396] RTP Payload Format for 3rd Generation Partnership Project (3GPP) Timed Text.

[RFC 4421] RTP Payload Format for Uncompressed Video: Additional Colour Sampling Modes.

[RFC 4424] Real-Time Transport Protocol (RTP) Payload Format for the Variable-Rate Multimode Wideband (VMR-WB) Extension Audio Codec.

[RFC 4425] RTP Payload Format for Video Codec 1 (VC-1).

[RFC 5244] Definition of Events for Channel-Oriented Telephony Signalling.

[RFC 5371] RTP Payload Format for JPEG 2000 Video Streams.

[RFC 5372] Payload Format for JPEG 2000 Video: Extensions for Scalability and Main Header Recovery.

[RFC 5391] RTP Payload Format for ITU-T Recommendation G.711.1.

[RFC 5404] RTP Payload Format for G.719.


Publications:


Obsolete RFCs:

[RFC 1889] RTP: A Transport Protocol for Real-Time Applications.

[RFC 1890] RTP Profile for Audio and Video Conferences with Minimal Control.

[RFC 2035] RTP Payload Format for JPEG-compressed Video.

[RFC 2038] RTP Payload Format for MPEG1/MPEG2 Video.

[RFC 2793] RTP Payload for Text Conversation.

