aboutsummaryrefslogtreecommitdiff
path: root/doc/vorbis.html
blob: 66df2f466933a24e8213b06e555e3fb1ba83339d (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-15"/>
<title>Ogg Vorbis Documentation</title>

<style type="text/css">
body {
  margin: 0 18px 0 18px;
  padding-bottom: 30px;
  font-family: Verdana, Arial, Helvetica, sans-serif;
  color: #333333;
  font-size: .8em;
}

a {
  color: #3366cc;
}

img {
  border: 0;
}

#xiphlogo {
  margin: 30px 0 16px 0;
}

#content p {
  line-height: 1.4;
}

h1, h1 a, h2, h2 a, h3, h3 a {
  font-weight: bold;
  color: #ff9900;
  margin: 1.3em 0 8px 0;
}

h1 {
  font-size: 1.3em;
}

h2 {
  font-size: 1.2em;
}

h3 {
  font-size: 1.1em;
}

li {
  line-height: 1.4;
}

#copyright {
  margin-top: 30px;
  line-height: 1.5em;
  text-align: center;
  font-size: .8em;
  color: #888888;
  clear: both;
}
</style>

</head>

<body>

<div id="xiphlogo">
  <a href="http://www.xiph.org/"><img src="fish_xiph_org.png" alt="Fish Logo and Xiph.org"/></a>
</div>

<h1>Ogg Vorbis encoding format documentation</h1>

<p><img src="wait.png" alt="wait"/>As of writing, not all the below document
links are live. They will be populated as we complete the documents.</p>

<h2>Documents</h2>

<ul>
<li><a href="packet.html">Vorbis packet structure</a></li>
<li><a href="envelope.html">Temporal envelope shaping and blocksize</a></li>
<li><a href="mdct.html">Time domain segmentation and MDCT transform</a></li>
<li><a href="resolution.html">The resolution floor</a></li>
<li><a href="residuals.html">MDCT-domain fine structure</a></li>
</ul>

<ul>
<li><a href="probmodel.html">The Vorbis probability model</a></li>
<li><a href="bitpack.html">The Vorbis bitpacker</a></li>
</ul>

<ul>
<li><a href="oggstream.html">Ogg bitstream overview</a></li>
<li><a href="framing.html">Ogg logical bitstream and framing spec</a></li>
<li><a href="vorbis-stream.html">Vorbis packet->Ogg bitstream mapping</a></li>
</ul>

<ul>
<li><a href="programming.html">Programming with libvorbis</a></li>
</ul>

<h2>Description</h2>

<p>Ogg Vorbis is a general purpose compressed audio format
for high quality (44.1-48.0kHz, 16+ bit, polyphonic) audio and music
at moderate fixed and variable bitrates (40-80 kb/s/channel). This
places Vorbis in the same class as audio representations including
MPEG-1 audio layer 3, MPEG-4 audio (AAC and TwinVQ), and PAC.</p>

<p>Vorbis is the first of a planned family of Ogg multimedia coding
formats being developed as part of the Xiph.org Foundation's Ogg multimedia
project. See <a href="http://www.xiph.org/">http://www.xiph.org/</a>
for more information.</p>

<h2>Vorbis technical documents</h2>

<p>A Vorbis encoder takes in overlapping (but contiguous) short-time
segments of audio data. The encoder analyzes the content of the audio
to determine an optimal compact representation; this phase of encoding
is known as <em>analysis</em>. For each short-time block of sound,
the encoder then packs an efficient representation of the signal, as
determined by analysis, into a raw packet much smaller than the size
required by the original signal; this phase is <em>coding</em>.
Lastly, in a streaming environment, the raw packets are then
structured into a continuous stream of octets; this last phase is
<em>streaming</em>. Note that the stream of octets is referred to both
as a 'byte-' and 'bit-'stream; the latter usage is acceptible as the
stream of octets is a physical representation of a true logical
bit-by-bit stream.</p>

<p>A Vorbis decoder performs a mirror image process of extracting the
original sequence of raw packets from an Ogg stream (<em>stream
decomposition</em>), reconstructing the signal representation from the
raw data in the packet (<em>decoding</em>) and them reconstituting an
audio signal from the decoded representation (<em>synthesis</em>).</p>

<p>The <a href="programming.html">Programming with libvorbis</a>
documents discuss use of the reference Vorbis codec library
(libvorbis) produced by the Xiph.org Foundation.</p>

<p>The data representations and algorithms necessary at each step to
encode and decode Ogg Vorbis bitstreams are described by the below
documents in sufficient detail to construct a complete Vorbis codec.
Note that at the time of writing, Vorbis is still in a 'Request For
Comments' stage of development; despite being in advanced stages of
development, input from the multimedia community is welcome.</p>

<h3>Vorbis analysis and synthesis</h3>

<p>Analysis begins by seperating an input audio stream into individual,
overlapping short-time segments of audio data. These segments are
then transformed into an alternate representation, seeking to
represent the original signal in a more efficient form that codes into
a smaller number of bytes. The analysis and transformation stage is
the most complex element of producing a Vorbis bitstream.</p>

<p>The corresponding synthesis step in the decoder is simpler; there is
no analysis to perform, merely a mechanical, deterministic
reconstruction of the original audio data from the transform-domain
representation.</p>

<ul>
<li><a href="packet.html">Vorbis packet structure</a>:
Describes the basic analysis components necessary to produce Vorbis
packets and the structure of the packet itself.</li>
<li><a href="envelope.html">Temporal envelope shaping and blocksize</a>:
Use of temporal envelope shaping and variable blocksize to minimize
time-domain energy leakage during wide dynamic range and spectral energy
swings. Also discusses time-related principles of psychoacoustics.</li>
<li><a href="mdct.html">Time domain segmentation and MDCT transform</a>:
Division of time domain data into individual overlapped, windowed
short-time vectors and transformation using the MDCT</li>
<li><a href="resolution.html">The resolution floor</a>: Use of frequency
doamin psychoacoustics, and the MDCT-domain noise, masking and resolution
floors</li>
<li><a href="residuals.html">MDCT-domain fine structure</a>: Production,
quantization and massaging of MDCT-spectrum fine structure</li>
</ul>

<h3>Vorbis coding and decoding</h3>

<p>Coding and decoding converts the transform-domain representation of
the original audio produced by analysis to and from a bitwise packed
raw data packet. Coding and decoding consist of two logically
orthogonal concepts, <em>back-end coding</em> and <em>bitpacking</em>.</p>

<p><em>Back-end coding</em> uses a probability model to represent the raw numbers
of the audio representation in as few physical bits as possible;
familiar examples of back-end coding include Huffman coding and Vector
Quantization.</p>

<p><em>Bitpacking</em> arranges the variable sized words of the back-end
coding into a vector of octets without wasting space. The octets
produced by coding a single short-time audio segment is one raw Vorbis
packet.</p>

<ul>
<li><a href="probmodel.html">The Vorbis probability model</a></li>
<li><a href="bitpack.html">The Vorbis bitpacker</a>: Arrangement of 
variable bit-length words into an octet-aligned packet.</li>
</ul>

<h3>Vorbis streaming and stream decomposition</h3>

<p>Vorbis packets contain the raw, bitwise-compressed representation of a
snippet of audio. These packets contain no structure and cannot be
strung together directly into a stream; for streamed transmission and
storage, Vorbis packets are encoded into an Ogg bitstream.</p>

<ul>
<li><a href="oggstream.html">Ogg bitstream overview</a>: High-level
description of Ogg logical bitstreams, how logical bitstreams
(of mixed media types) can be combined into physical bitstreams, and
restrictions on logical-to-physical mapping. Note that this document is
not specific only to Ogg Vorbis.</li>
<li><a href="framing.html">Ogg logical bitstream and framing
spec</a>: Low level, complete specification of Ogg logical
bitstream pages. Note that this document is not specific only to Ogg
Vorbis.</li>
<li><a href="vorbis-stream.html">Vorbis bitstream mapping</a>:
Specifically describes mapping Vorbis data into an
Ogg physical bitstream.</li>
</ul>

<div id="copyright">
  The Xiph Fish Logo is a
  trademark (&trade;) of Xiph.Org.<br/>

  These pages &copy; 1994 - 2005 Xiph.Org. All rights reserved.
</div>

</body>
</html>