draft-ietf-payload-rtp-opus-06.txt   draft-ietf-payload-rtp-opus-07.txt 
Network Working Group J. Spittka Network Working Group J. Spittka
Internet-Draft Internet-Draft
Intended status: Standards Track K. Vos Intended status: Standards Track K. Vos
Expires: July 10, 2015 vocTone Expires: July 17, 2015 vocTone
JM. Valin JM. Valin
Mozilla Mozilla
January 6, 2015 January 13, 2015
RTP Payload Format for Opus Speech and Audio Codec RTP Payload Format for Opus Speech and Audio Codec
draft-ietf-payload-rtp-opus-06 draft-ietf-payload-rtp-opus-07
Abstract Abstract
This document defines the Real-time Transport Protocol (RTP) payload This document defines the Real-time Transport Protocol (RTP) payload
format for packetization of Opus encoded speech and audio data format for packetization of Opus encoded speech and audio data
necessary to integrate the codec in the most compatible way. necessary to integrate the codec in the most compatible way.
Further, it describes media type registrations for the RTP payload Further, it describes media type registrations for the RTP payload
format. format.
Status of This Memo Status of This Memo
skipping to change at page 1, line 37 skipping to change at page 1, line 37
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on July 10, 2015. This Internet-Draft will expire on July 17, 2015.
Copyright Notice Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 28 skipping to change at page 2, line 28
3.3. Forward Error Correction (FEC) . . . . . . . . . . . . . 5 3.3. Forward Error Correction (FEC) . . . . . . . . . . . . . 5
3.4. Stereo Operation . . . . . . . . . . . . . . . . . . . . 6 3.4. Stereo Operation . . . . . . . . . . . . . . . . . . . . 6
4. Opus RTP Payload Format . . . . . . . . . . . . . . . . . . . 6 4. Opus RTP Payload Format . . . . . . . . . . . . . . . . . . . 6
4.1. RTP Header Usage . . . . . . . . . . . . . . . . . . . . 6 4.1. RTP Header Usage . . . . . . . . . . . . . . . . . . . . 6
4.2. Payload Structure . . . . . . . . . . . . . . . . . . . . 7 4.2. Payload Structure . . . . . . . . . . . . . . . . . . . . 7
5. Congestion Control . . . . . . . . . . . . . . . . . . . . . 8 5. Congestion Control . . . . . . . . . . . . . . . . . . . . . 8
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8
6.1. Opus Media Type Registration . . . . . . . . . . . . . . 8 6.1. Opus Media Type Registration . . . . . . . . . . . . . . 8
6.2. Mapping to SDP Parameters . . . . . . . . . . . . . . . . 12 6.2. Mapping to SDP Parameters . . . . . . . . . . . . . . . . 12
6.2.1. Offer-Answer Model Considerations for Opus . . . . . 13 6.2.1. Offer-Answer Model Considerations for Opus . . . . . 13
6.2.2. Declarative SDP Considerations for Opus . . . . . . . 14 6.2.2. Declarative SDP Considerations for Opus . . . . . . . 15
7. Security Considerations . . . . . . . . . . . . . . . . . . . 15 7. Security Considerations . . . . . . . . . . . . . . . . . . . 15
8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 15 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 15
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 15 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 16
9.1. Normative References . . . . . . . . . . . . . . . . . . 15 9.1. Normative References . . . . . . . . . . . . . . . . . . 16
9.2. Informative References . . . . . . . . . . . . . . . . . 16 9.2. Informative References . . . . . . . . . . . . . . . . . 17
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 16 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 17
1. Introduction 1. Introduction
The Opus codec is a speech and audio codec developed within the IETF The Opus codec is a speech and audio codec developed within the IETF
Internet Wideband Audio Codec working group. The codec has a very Internet Wideband Audio Codec working group. The codec has a very
low algorithmic delay and it is highly scalable in terms of audio low algorithmic delay and it is highly scalable in terms of audio
bandwidth, bitrate, and complexity. Further, it provides different bandwidth, bitrate, and complexity. Further, it provides different
modes to efficiently encode speech signals as well as music signals, modes to efficiently encode speech signals as well as music signals,
thus making it the codec of choice for various applications using the thus making it the codec of choice for various applications using the
Internet or similar networks. Internet or similar networks.
skipping to change at page 4, line 41 skipping to change at page 4, line 41
for choosing CBR is the potential information leak that _might_ occur for choosing CBR is the potential information leak that _might_ occur
when encrypting the compressed stream. See [RFC6562] for guidelines when encrypting the compressed stream. See [RFC6562] for guidelines
on when VBR is appropriate for encrypted audio communications. In on when VBR is appropriate for encrypted audio communications. In
the case where an existing VBR stream needs to be converted to CBR the case where an existing VBR stream needs to be converted to CBR
for security reasons, then the Opus padding mechanism described in for security reasons, then the Opus padding mechanism described in
[RFC6716] is the RECOMMENDED way to achieve padding because the RTP [RFC6716] is the RECOMMENDED way to achieve padding because the RTP
padding bit is unencrypted. padding bit is unencrypted.
The bitrate can be adjusted at any point in time. To avoid The bitrate can be adjusted at any point in time. To avoid
congestion, the average bitrate SHOULD NOT exceed the available congestion, the average bitrate SHOULD NOT exceed the available
network capacity. If no target bitrate is specified, the bitrates network bandwidth. If no target bitrate is specified, the bitrates
specified in Section 3.1.1 are RECOMMENDED. specified in Section 3.1.1 are RECOMMENDED.
3.1.3. Discontinuous Transmission (DTX) 3.1.3. Discontinuous Transmission (DTX)
The Opus codec can, as described in Section 3.1.2, be operated with a The Opus codec can, as described in Section 3.1.2, be operated with a
variable bitrate. In that case, the encoder will automatically variable bitrate. In that case, the encoder will automatically
reduce the bitrate for certain input signals, like periods of reduce the bitrate for certain input signals, like periods of
silence. When using continuous transmission, it will reduce the silence. When using continuous transmission, it will reduce the
bitrate when the characteristics of the input signal permit, but will bitrate when the characteristics of the input signal permit, but will
never interrupt the transmission to the receiver. Therefore, the never interrupt the transmission to the receiver. Therefore, the
skipping to change at page 5, line 28 skipping to change at page 5, line 28
comfort noise signal to replace the non transmitted parts of the comfort noise signal to replace the non transmitted parts of the
speech or audio signal. Use of [RFC3389] Comfort Noise (CN) with speech or audio signal. Use of [RFC3389] Comfort Noise (CN) with
Opus is discouraged. The transmitter MUST drop whole frames only, Opus is discouraged. The transmitter MUST drop whole frames only,
based on the size of the last transmitted frame, to ensure successive based on the size of the last transmitted frame, to ensure successive
RTP timestamps differ by a multiple of 120 and to allow the receiver RTP timestamps differ by a multiple of 120 and to allow the receiver
to use whole frames for concealment. to use whole frames for concealment.
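
As an illustrative sketch only (not part of either draft revision), the
whole-frame rule above can be expressed as timestamp arithmetic. The
function names are hypothetical, and the sketch assumes the standard
32-bit RTP timestamp field and that the last transmitted frame's
increment is already known in clock ticks.

```python
RTP_TS_MODULUS = 1 << 32   # RTP timestamps are 32-bit and wrap around
MIN_FRAME_TICKS = 120      # smallest increment named in the text above


def next_timestamp_after_dtx(last_ts, last_frame_ticks, dropped_frames):
    """Timestamp of the first packet sent after a DTX gap (hypothetical helper).

    The gap is counted in whole frames of the same size as the last
    transmitted frame, so the result stays on the 120-tick grid.
    """
    if last_frame_ticks <= 0 or last_frame_ticks % MIN_FRAME_TICKS != 0:
        raise ValueError("frame increment must be a positive multiple of 120")
    if dropped_frames < 0:
        raise ValueError("dropped_frames must be non-negative")
    advance = (1 + dropped_frames) * last_frame_ticks
    return (last_ts + advance) % RTP_TS_MODULUS


def gap_is_whole_frames(prev_ts, cur_ts):
    """Receiver-side check that two successive timestamps differ by a multiple of 120."""
    gap = (cur_ts - prev_ts) % RTP_TS_MODULUS
    return gap % MIN_FRAME_TICKS == 0
```

For example, after a 20 ms frame (960 ticks) followed by three dropped
frames, the next packet's timestamp advances by 4 * 960 ticks, which a
receiver can verify with gap_is_whole_frames before running concealment.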
DTX can be used with both variable and constant bitrate. It will DTX can be used with both variable and constant bitrate. It will
have a slightly lower speech or audio quality than continuous have a slightly lower speech or audio quality than continuous
transmission. Therefore, using continuous transmission is transmission. Therefore, using continuous transmission is
RECOMMENDED unless restraints on network capacity are severe. RECOMMENDED unless restraints on available network bandwidth are
severe.
3.2. Complexity 3.2. Complexity
Complexity of the encoder can be scaled to optimize for CPU resources Complexity of the encoder can be scaled to optimize for CPU resources
in real-time, mostly as a trade-off between audio quality and in real-time, mostly as a trade-off between audio quality and
bitrate. Also, different modes of Opus have different complexity. bitrate. Also, different modes of Opus have different complexity.
3.3. Forward Error Correction (FEC) 3.3. Forward Error Correction (FEC)
The voice mode of Opus allows for embedding "in-band" forward error The voice mode of Opus allows for embedding "in-band" forward error
skipping to change at page 8, line 10 skipping to change at page 8, line 10
rates (fs) of Opus and shows how the timestamp is incremented for rates (fs) of Opus and shows how the timestamp is incremented for
packetization (ts incr). If the Opus encoder outputs multiple packetization (ts incr). If the Opus encoder outputs multiple
encoded frames into a single packet, the timestamp increment is the encoded frames into a single packet, the timestamp increment is the
sum of the increments for the individual frames. sum of the increments for the individual frames.
+---------+-----------------+-----+-----+-----+-----+------+------+ +---------+-----------------+-----+-----+-----+-----+------+------+
| Mode | fs | 2.5 | 5 | 10 | 20 | 40 | 60 | | Mode | fs | 2.5 | 5 | 10 | 20 | 40 | 60 |
+---------+-----------------+-----+-----+-----+-----+------+------+ +---------+-----------------+-----+-----+-----+-----+------+------+
| ts incr | all | 120 | 240 | 480 | 960 | 1920 | 2880 | | ts incr | all | 120 | 240 | 480 | 960 | 1920 | 2880 |
| | | | | | | | | | | | | | | | | |
| voice | NB/MB/WB/SWB/FB | | | x | x | x | x | | voice | NB/MB/WB/SWB/FB | x | x | o | o | o | o |
| | | | | | | | | | | | | | | | | |
| audio | NB/WB/SWB/FB | x | x | x | x | | | | audio | NB/WB/SWB/FB | o | o | o | o | x | x |
+---------+-----------------+-----+-----+-----+-----+------+------+ +---------+-----------------+-----+-----+-----+-----+------+------+
Table 2: Supported Opus frame sizes and timestamp increments Table 2: Supported Opus frame sizes and timestamp increments marked
with an o. Unsupported marked with an x.
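
The following is a minimal sketch, not taken from either draft revision,
of how the increments in Table 2 could be computed. It assumes a 48 kHz
RTP clock, which is what 120 ticks per 2.5 ms implies, and uses the
supported/unsupported marks from the -07 column. The names SUPPORTED_MS,
ts_increment, and packet_ts_increment are hypothetical.

```python
from fractions import Fraction

RTP_CLOCK_HZ = 48000  # assumption: 120 ticks per 2.5 ms implies 48 kHz

# Frame durations (ms) marked "o" in the -07 column of Table 2.
SUPPORTED_MS = {
    "voice": {Fraction(10), Fraction(20), Fraction(40), Fraction(60)},
    "audio": {Fraction(5, 2), Fraction(5), Fraction(10), Fraction(20)},
}


def ts_increment(frame_ms):
    """RTP timestamp increment for one frame of the given duration in ms."""
    ticks = Fraction(frame_ms) * RTP_CLOCK_HZ / 1000
    if ticks.denominator != 1:
        raise ValueError("duration does not map to a whole number of ticks")
    return int(ticks)


def packet_ts_increment(mode, frame_durations_ms):
    """Increment for a packet: the sum of the increments of its frames."""
    total = 0
    for duration in frame_durations_ms:
        if Fraction(duration) not in SUPPORTED_MS[mode]:
            raise ValueError(f"{duration} ms frames are not supported in {mode} mode")
        total += ts_increment(duration)
    return total
```

Durations are passed as integers or decimal strings (for example "2.5")
so that the arithmetic stays exact. A packet carrying three 20 ms voice
frames yields packet_ts_increment("voice", [20, 20, 20]), the same
increment as a single 60 ms frame.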
5. Congestion Control 5. Congestion Control
The target bitrate of Opus can be adjusted at any point in time, thus The target bitrate of Opus can be adjusted at any point in time, thus
allowing efficient congestion control. Furthermore, the amount of allowing efficient congestion control. Furthermore, the amount of
encoded speech or audio data encoded in a single packet can be used encoded speech or audio data encoded in a single packet can be used
for congestion control, since the transmission rate is inversely for congestion control, since the transmission rate is inversely
proportional to the packet duration. A lower packet transmission proportional to the packet duration. A lower packet transmission
rate reduces the amount of header overhead, but at the same time rate reduces the amount of header overhead, but at the same time
increases latency and loss sensitivity, so it ought to be used with increases latency and loss sensitivity, so it ought to be used with
skipping to change at page 11, line 19 skipping to change at page 11, line 21
The Opus media type is framed and consists of binary data The Opus media type is framed and consists of binary data
according to Section 4.8 in [RFC6838]. according to Section 4.8 in [RFC6838].
Security considerations: Security considerations:
See Section 7 of this document. See Section 7 of this document.
Interoperability considerations: none Interoperability considerations: none
Published specification: none Published specification: RFC [XXXX]
Note to the RFC Editor: Replace [XXXX] with the number of the
published RFC.
Applications that use this media type: Applications that use this media type:
Any application that requires the transport of speech or audio Any application that requires the transport of speech or audio
data can use this media type. Some examples are, but not limited data can use this media type. Some examples are, but not limited
to, audio and video conferencing, Voice over IP, media streaming. to, audio and video conferencing, Voice over IP, media streaming.
Fragment identifier considerations: N/A
Person & email address to contact for further information: Person & email address to contact for further information:
SILK Support silksupport@skype.net SILK Support silksupport@skype.net
Jean-Marc Valin jmvalin@jmvalin.ca Jean-Marc Valin jmvalin@jmvalin.ca
Intended usage: COMMON Intended usage: COMMON
Restrictions on usage: Restrictions on usage:
For transfer over RTP, the RTP payload format (Section 4 of this For transfer over RTP, the RTP payload format (Section 4 of this
document) SHALL be used. document) SHALL be used.
Author: Author:
Julian Spittka jspittka@gmail.com Julian Spittka jspittka@gmail.com
Koen Vos koenvos74@gmail.com Koen Vos koenvos74@gmail.com
Jean-Marc Valin jmvalin@jmvalin.ca Jean-Marc Valin jmvalin@jmvalin.ca
Change controller: TBD Change controller: IETF Payload Working Group delegated from the IESG
6.2. Mapping to SDP Parameters 6.2. Mapping to SDP Parameters
The information described in the media type specification has a The information described in the media type specification has a
specific mapping to fields in the Session Description Protocol (SDP) specific mapping to fields in the Session Description Protocol (SDP)
[RFC4566], which is commonly used to describe RTP sessions. When SDP [RFC4566], which is commonly used to describe RTP sessions. When SDP
is used to specify sessions employing the Opus codec, the mapping is is used to specify sessions employing the Opus codec, the mapping is
as follows: as follows:
o The media type ("audio") goes in SDP "m=" as the media name. o The media type ("audio") goes in SDP "m=" as the media name.
 End of changes. 14 change blocks. 
16 lines changed or deleted 23 lines changed or added
