draft-ietf-payload-rtp-opus-08.txt   draft-ietf-payload-rtp-opus-09.txt 
Network Working Group J. Spittka Network Working Group J. Spittka
Internet-Draft Internet-Draft
Intended status: Standards Track K. Vos Intended status: Standards Track K. Vos
Expires: August 10, 2015 vocTone Expires: October 12, 2015 vocTone
JM. Valin JM. Valin
Mozilla Mozilla
February 6, 2015 April 10, 2015
RTP Payload Format for the Opus Speech and Audio Codec RTP Payload Format for the Opus Speech and Audio Codec
draft-ietf-payload-rtp-opus-08 draft-ietf-payload-rtp-opus-09
Abstract Abstract
This document defines the Real-time Transport Protocol (RTP) payload This document defines the Real-time Transport Protocol (RTP) payload
format for packetization of Opus encoded speech and audio data format for packetization of Opus encoded speech and audio data
necessary to integrate the codec in the most compatible way. necessary to integrate the codec in the most compatible way. It also
provides an applicability statement for the use of Opus over RTP.
Further, it describes media type registrations for the RTP payload Further, it describes media type registrations for the RTP payload
format. format.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 10, 2015. This Internet-Draft will expire on October 12, 2015.
Copyright Notice Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 30 skipping to change at page 2, line 31
4. Opus RTP Payload Format . . . . . . . . . . . . . . . . . . . 6 4. Opus RTP Payload Format . . . . . . . . . . . . . . . . . . . 6
4.1. RTP Header Usage . . . . . . . . . . . . . . . . . . . . 6 4.1. RTP Header Usage . . . . . . . . . . . . . . . . . . . . 6
4.2. Payload Structure . . . . . . . . . . . . . . . . . . . . 7 4.2. Payload Structure . . . . . . . . . . . . . . . . . . . . 7
5. Congestion Control . . . . . . . . . . . . . . . . . . . . . 8 5. Congestion Control . . . . . . . . . . . . . . . . . . . . . 8
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8
6.1. Opus Media Type Registration . . . . . . . . . . . . . . 8 6.1. Opus Media Type Registration . . . . . . . . . . . . . . 8
7. SDP Considerations . . . . . . . . . . . . . . . . . . . . . 12 7. SDP Considerations . . . . . . . . . . . . . . . . . . . . . 12
7.1. SDP Offer/Answer Considerations . . . . . . . . . . . . . 13 7.1. SDP Offer/Answer Considerations . . . . . . . . . . . . . 13
7.2. Declarative SDP Considerations for Opus . . . . . . . . . 15 7.2. Declarative SDP Considerations for Opus . . . . . . . . . 15
8. Security Considerations . . . . . . . . . . . . . . . . . . . 15 8. Security Considerations . . . . . . . . . . . . . . . . . . . 15
9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 15 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 16
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 16 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 16
10.1. Normative References . . . . . . . . . . . . . . . . . . 16 10.1. Normative References . . . . . . . . . . . . . . . . . . 16
10.2. Informative References . . . . . . . . . . . . . . . . . 17 10.2. Informative References . . . . . . . . . . . . . . . . . 17
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 17 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 18
1. Introduction 1. Introduction
Opus [RFC6716] is a speech and audio codec developed within the IETF Opus [RFC6716] is a speech and audio codec developed within the IETF
Internet Wideband Audio Codec working group. The codec has a very Internet Wideband Audio Codec working group. The codec has a very
low algorithmic delay and it is highly scalable in terms of audio low algorithmic delay and it is highly scalable in terms of audio
bandwidth, bitrate, and complexity. Further, it provides different bandwidth, bitrate, and complexity. Further, it provides different
modes to efficiently encode speech signals as well as music signals, modes to efficiently encode speech signals as well as music signals,
thus making it the codec of choice for various applications using the thus making it the codec of choice for various applications using the
Internet or similar networks. Internet or similar networks.
This document defines the Real-time Transport Protocol (RTP) This document defines the Real-time Transport Protocol (RTP)
[RFC3550] payload format for packetization of Opus encoded speech and [RFC3550] payload format for packetization of Opus encoded speech and
audio data necessary to integrate Opus in the most compatible way. audio data necessary to integrate Opus in the most compatible way.
Further, it describes media type registrations for the RTP payload It also provides an applicability statement for the use of Opus over
format. RTP. Further, it describes media type registrations for the RTP
payload format.
2. Conventions, Definitions and Acronyms used in this document 2. Conventions, Definitions and Acronyms used in this document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. document are to be interpreted as described in [RFC2119].
audio bandwidth: The range of audio frequencies being coded audio bandwidth: The range of audio frequencies being coded
CBR: Constant bitrate CBR: Constant bitrate
CPU: Central Processing Unit CPU: Central Processing Unit
skipping to change at page 4, line 6 skipping to change at page 4, line 10
different modes can be chosen, a voice mode or an audio mode, to different modes can be chosen, a voice mode or an audio mode, to
allow the most efficient coding depending on the type of the input allow the most efficient coding depending on the type of the input
signal, the sampling frequency of the input signal, and the intended signal, the sampling frequency of the input signal, and the intended
application. application.
The voice mode allows efficient encoding of voice signals at lower The voice mode allows efficient encoding of voice signals at lower
bit rates while the audio mode is optimized for general audio signals bit rates while the audio mode is optimized for general audio signals
at medium and higher bitrates. at medium and higher bitrates.
Opus is highly scalable in terms of audio bandwidth, bitrate, and Opus is highly scalable in terms of audio bandwidth, bitrate, and
complexity. Further, Opus allows transmitting stereo signals. complexity. Further, Opus allows transmitting stereo signals with
in-band signaling in the bit-stream.
3.1. Network Bandwidth 3.1. Network Bandwidth
Opus supports bitrates from 6 kb/s to 510 kb/s. The bitrate can be Opus supports bitrates from 6 kb/s to 510 kb/s. The bitrate can be
changed dynamically within that range. All other parameters being changed dynamically within that range. All other parameters being
equal, higher bitrates result in higher audio quality. equal, higher bitrates result in higher audio quality.
3.1.1. Recommended Bitrate 3.1.1. Recommended Bitrate
For a frame size of 20 ms, these are the bitrate "sweet spots" for For a frame size of 20 ms, these are the bitrate "sweet spots" for
skipping to change at page 5, line 28 skipping to change at page 5, line 32
comfort noise signal to replace the non-transmitted parts of the comfort noise signal to replace the non-transmitted parts of the
speech or audio signal. Use of [RFC3389] Comfort Noise (CN) with speech or audio signal. Use of [RFC3389] Comfort Noise (CN) with
Opus is discouraged. The transmitter MUST drop whole frames only, Opus is discouraged. The transmitter MUST drop whole frames only,
based on the size of the last transmitted frame, to ensure successive based on the size of the last transmitted frame, to ensure successive
RTP timestamps differ by a multiple of 120 and to allow the receiver RTP timestamps differ by a multiple of 120 and to allow the receiver
to use whole frames for concealment. to use whole frames for concealment.
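The whole-frame rule for DTX can be illustrated with a small check. This is only a sketch: the function names are hypothetical and not part of either draft revision. It assumes a 32-bit RTP timestamp that wraps around, and uses the fact that 2.5 ms at the 48000 Hz RTP clock equals 120 timestamp units.

```python
# Hypothetical helpers illustrating the DTX timestamp rule; not from the draft.
RTP_TS_MODULUS = 1 << 32      # RTP timestamps are 32-bit and wrap around
SAMPLES_PER_2_5_MS = 120      # 2.5 ms at the 48000 Hz RTP clock


def dtx_gap_is_whole_frames(prev_ts, next_ts):
    """Return True if successive timestamps differ by a positive multiple of 120."""
    gap = (next_ts - prev_ts) % RTP_TS_MODULUS  # handles timestamp wraparound
    return gap > 0 and gap % SAMPLES_PER_2_5_MS == 0


def timestamp_after_dtx(last_ts, frame_samples, dropped_frames):
    """Timestamp of the next transmitted frame after dropping whole frames.

    frame_samples is the size of the last transmitted frame in 48 kHz units.
    """
    return (last_ts + frame_samples * (dropped_frames + 1)) % RTP_TS_MODULUS
```

For instance, after a 20 ms frame (960 units) sent at timestamp 1000 and three dropped frames, the sketch places the next frame at 1000 + 4 * 960 = 4840.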
DTX can be used with both variable and constant bitrate. It will DTX can be used with both variable and constant bitrate. It will
have a slightly lower speech or audio quality than continuous have a slightly lower speech or audio quality than continuous
transmission. Therefore, using continuous transmission is transmission. Therefore, using continuous transmission is
RECOMMENDED unless restraints on available network bandwidth are RECOMMENDED unless constraints on available network bandwidth are
severe. severe.
3.2. Complexity 3.2. Complexity
Complexity of the encoder can be scaled to optimize for CPU resources Complexity of the encoder can be scaled to optimize for CPU resources
in real-time, mostly as a trade-off between audio quality and in real-time, mostly as a trade-off between audio quality and
bitrate. Also, different modes of Opus have different complexity. bitrate. Also, different modes of Opus have different complexity.
3.3. Forward Error Correction (FEC) 3.3. Forward Error Correction (FEC)
skipping to change at page 6, line 22 skipping to change at page 6, line 25
Any compliant Opus decoder is capable of ignoring FEC information Any compliant Opus decoder is capable of ignoring FEC information
when it is not needed, so encoding with FEC cannot cause when it is not needed, so encoding with FEC cannot cause
interoperability problems. However, if FEC cannot be used on the interoperability problems. However, if FEC cannot be used on the
receiving side, then FEC SHOULD NOT be used, as it leads to an receiving side, then FEC SHOULD NOT be used, as it leads to an
inefficient usage of network resources. Decoder support for FEC inefficient usage of network resources. Decoder support for FEC
SHOULD be indicated at the time a session is set up. SHOULD be indicated at the time a session is set up.
3.4. Stereo Operation 3.4. Stereo Operation
Opus allows for transmission of stereo audio signals. This operation Opus allows for transmission of stereo audio signals. This operation
is signaled in-band in the Opus payload and no special arrangement is is signaled in-band in the Opus bit-stream and no special arrangement
needed in the payload format. An Opus decoder is capable of handling is needed in the payload format. An Opus decoder is capable of
a stereo encoding, but an application might only be capable of handling a stereo encoding, but an application might only be capable
consuming a single audio channel. of consuming a single audio channel.
If a decoder cannot take advantage of the benefits of a stereo signal If a decoder cannot take advantage of the benefits of a stereo signal
this SHOULD be indicated at the time a session is set up. In that this SHOULD be indicated at the time a session is set up. In that
case the sending side SHOULD NOT send stereo signals as it leads to case the sending side SHOULD NOT send stereo signals as it leads to
an inefficient usage of network resources. an inefficient usage of network resources.
4. Opus RTP Payload Format 4. Opus RTP Payload Format
The payload format for Opus consists of the RTP header and Opus The payload format for Opus consists of the RTP header and Opus
payload data. payload data.
skipping to change at page 7, line 10 skipping to change at page 7, line 15
The timestamp, sequence number, and marker bit (M) of the RTP header The timestamp, sequence number, and marker bit (M) of the RTP header
are used in accordance with Section 4.1 of [RFC3551]. are used in accordance with Section 4.1 of [RFC3551].
The RTP payload type for Opus is to be assigned dynamically. The RTP payload type for Opus is to be assigned dynamically.
The receiving side MUST be prepared to receive duplicate RTP packets. The receiving side MUST be prepared to receive duplicate RTP packets.
The receiver MUST provide at most one of those payloads to the Opus The receiver MUST provide at most one of those payloads to the Opus
decoder for decoding, and MUST discard the others. decoder for decoding, and MUST discard the others.
Opus supports 5 different audio bandwidths, which can be adjusted Opus supports 5 different audio bandwidths, which can be adjusted
during a call. The RTP timestamp is incremented with a 48000 Hz during a stream. The RTP timestamp is incremented with a 48000 Hz
clock rate for all modes of Opus and all sampling rates. The unit clock rate for all modes of Opus and all sampling rates. The unit
for the timestamp is samples per single (mono) channel. The RTP for the timestamp is samples per single (mono) channel. The RTP
timestamp corresponds to the sample time of the first encoded sample timestamp corresponds to the sample time of the first encoded sample
in the encoded frame. For data encoded with sampling rates other in the encoded frame. For data encoded with sampling rates other
than 48000 Hz, the sampling rate has to be adjusted to 48000 Hz. than 48000 Hz, the sampling rate has to be adjusted to 48000 Hz.
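As an illustration of the fixed 48000 Hz RTP clock, the timestamp increment per packet depends only on the duration of audio carried, not on the coded audio bandwidth. The following sketch uses hypothetical names and the frame durations listed for the Opus encoder (2.5, 5, 10, 20, 40, and 60 ms).

```python
OPUS_RTP_CLOCK_HZ = 48000                     # clock rate used for all Opus modes
FRAME_DURATIONS_MS = (2.5, 5, 10, 20, 40, 60)


def timestamp_increment(frame_ms, frames_per_packet=1):
    """RTP timestamp units covered by one packet (hypothetical helper).

    The unit is samples per single (mono) channel at 48000 Hz, so the result
    is the same for mono and stereo and for every coded audio bandwidth.
    """
    samples = frame_ms * OPUS_RTP_CLOCK_HZ / 1000
    if samples != int(samples):
        raise ValueError("frame duration is not a whole number of samples")
    return int(samples) * frames_per_packet
```

Under these assumptions a single 20 ms frame advances the timestamp by 960, and a packet carrying three 20 ms frames advances it by 2880.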
4.2. Payload Structure 4.2. Payload Structure
The Opus encoder can output encoded frames representing 2.5, 5, 10, The Opus encoder can output encoded frames representing 2.5, 5, 10,
20, 40, or 60 ms of speech or audio data. Further, an arbitrary 20, 40, or 60 ms of speech or audio data. Further, an arbitrary
skipping to change at page 8, line 29 skipping to change at page 8, line 29
The target bitrate of Opus can be adjusted at any point in time, thus The target bitrate of Opus can be adjusted at any point in time, thus
allowing efficient congestion control. Furthermore, the amount of allowing efficient congestion control. Furthermore, the amount of
encoded speech or audio data encoded in a single packet can be used encoded speech or audio data encoded in a single packet can be used
for congestion control, since the transmission rate is inversely for congestion control, since the transmission rate is inversely
proportional to the packet duration. A lower packet transmission proportional to the packet duration. A lower packet transmission
rate reduces the amount of header overhead, but at the same time rate reduces the amount of header overhead, but at the same time
increases latency and loss sensitivity, so it ought to be used with increases latency and loss sensitivity, so it ought to be used with
care. care.
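The trade-off between packet duration and header overhead can be made concrete with a rough calculation. This sketch assumes minimal IPv4 (20 bytes), UDP (8 bytes), and RTP (12 bytes) headers with no extensions or CSRC entries; these figures and the function names are illustrative assumptions, not values taken from the draft.

```python
# Assumed minimal header sizes in bytes (no IP options, no RTP extensions).
IPV4_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12
HEADER_BYTES = IPV4_HEADER + UDP_HEADER + RTP_HEADER   # 40 bytes per packet


def packets_per_second(packet_ms):
    """Packet rate is inversely proportional to the packet duration."""
    return 1000 / packet_ms


def header_overhead_bps(packet_ms, header_bytes=HEADER_BYTES):
    """Bitrate spent on headers alone for a given packet duration."""
    return header_bytes * 8 * packets_per_second(packet_ms)
```

With these assumptions, 20 ms packets cost 16000 bit/s in headers while 40 ms packets cost 8000 bit/s, which is the saving that has to be weighed against the added latency and loss sensitivity.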
It is RECOMMENDED that senders of Opus encoded data apply congestion Since UDP does not provide congestion control, applications that use
control. RTP over UDP SHOULD implement their own congestion control above the
UDP layer. [draft-ietf-rmcat-app-interaction-01] describes the
interactions and conceptual interfaces necessary between the
application components that relate to congestion control, including
the RTP layer, the higher-level media codec control layer, and the
lower-level transport interface, as well as components dedicated to
congestion control functions.
6. IANA Considerations 6. IANA Considerations
One media subtype (audio/opus) has been defined and registered as One media subtype (audio/opus) has been defined and registered as
described in the following section. described in the following section.
6.1. Opus Media Type Registration 6.1. Opus Media Type Registration
Media type registration is done according to [RFC6838] and [RFC4855]. Media type registration is done according to [RFC6838] and [RFC4855].
skipping to change at page 15, line 31 skipping to change at page 15, line 38
provided if necessary by declaring multiple RTP payload types; provided if necessary by declaring multiple RTP payload types;
however, the number of types ought to be kept small. however, the number of types ought to be kept small.
8. Security Considerations 8. Security Considerations
All RTP packets using the payload format defined in this All RTP packets using the payload format defined in this
specification are subject to the general security considerations specification are subject to the general security considerations
discussed in the RTP specification [RFC3550] and any profile from, discussed in the RTP specification [RFC3550] and any profile from,
e.g., [RFC3711] or [RFC3551]. e.g., [RFC3711] or [RFC3551].
Use of variable bitrate (VBR) is subject to the security
considerations in [RFC6562].
RTP packets using the payload format defined in this specification
are subject to the security considerations discussed in the RTP
specification [RFC3550], and in any applicable RTP profile such as
RTP/AVP [RFC3551], RTP/AVPF [RFC4585], RTP/SAVP [RFC3711] or RTP/
SAVPF [RFC5124]. However, as "Securing the RTP Protocol Framework:
Why RTP Does Not Mandate a Single Media Security Solution" [RFC7202]
discusses, it is not an RTP payload format's responsibility to discuss
or mandate what solutions are used to meet the basic security goals
like confidentiality, integrity and source authenticity for RTP in
general. This responsibility lies with anyone using RTP in an
application. They can find guidance on available security mechanisms
and important considerations in Options for Securing RTP Sessions [I-
D.ietf-avtcore-rtp-security-options]. Applications SHOULD use one or
more appropriate strong security mechanisms.
This payload format transports Opus encoded speech or audio data. This payload format transports Opus encoded speech or audio data.
Hence, security issues include confidentiality, integrity protection, Hence, security issues include confidentiality, integrity protection,
and authentication of the speech or audio itself. Opus does not and authentication of the speech or audio itself. Opus does not
provide any confidentiality or integrity protection. Any suitable provide any confidentiality or integrity protection. Any suitable
external mechanisms, such as SRTP [RFC3711], MAY be used. external mechanisms, such as SRTP [RFC3711], MAY be used.
This payload format and the Opus encoding do not exhibit any This payload format and the Opus encoding do not exhibit any
significant non-uniformity in the receiver-end computational load and significant non-uniformity in the receiver-end computational load and
thus are unlikely to pose a denial-of-service threat due to the thus are unlikely to pose a denial-of-service threat due to the
receipt of pathological datagrams. receipt of pathological datagrams.
skipping to change at page 17, line 14 skipping to change at page 17, line 35
[RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type [RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type
Specifications and Registration Procedures", BCP 13, RFC Specifications and Registration Procedures", BCP 13, RFC
6838, January 2013. 6838, January 2013.
10.2. Informative References 10.2. Informative References
[RFC2974] Handley, M., Perkins, C., and E. Whelan, "Session [RFC2974] Handley, M., Perkins, C., and E. Whelan, "Session
Announcement Protocol", RFC 2974, October 2000. Announcement Protocol", RFC 2974, October 2000.
[RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
"Extended RTP Profile for Real-time Transport Control
Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July
2006.
[RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for
Real-time Transport Control Protocol (RTCP)-Based Feedback
(RTP/SAVPF)", RFC 5124, February 2008.
[RFC7202] Perkins, C. and M. Westerlund, "Securing the RTP
Framework: Why RTP Does Not Mandate a Single Media
Security Solution", RFC 7202, April 2014.
[draft-ietf-rmcat-app-interaction-01]
Zanaty, M., Singh, V., Nandakumar, S., and Z. Sarker, "RTP
Application Interaction with Congestion Control", draft-
ietf-rmcat-app-interaction-01 (work in progress), October
2014, <http://tools.ietf.org/html/
draft-ietf-rmcat-app-interaction-01>.
Authors' Addresses Authors' Addresses
Julian Spittka Julian Spittka
Email: jspittka@gmail.com Email: jspittka@gmail.com
Koen Vos Koen Vos
vocTone vocTone
Email: koenvos74@gmail.com Email: koenvos74@gmail.com
 End of changes. 15 change blocks. 
18 lines changed or deleted 65 lines changed or added
