CELT is a very low delay audio codec designed for high-quality communications.

Traditional full-bandwidth codecs such as Vorbis and AAC can offer high
quality but they require codec delays of hundreds of milliseconds, which
makes them unsuitable for real-time interactive applications like
teleconferencing. Speech-targeted codecs, such as Speex or G.722, have lower
20-40ms delays but their speech focus and limited sampling rates
restrict their quality, especially for music.

Additionally, the other mandatory components of a full network audio system
(audio interfaces, routers, jitter buffers) each add their own delay. For
lower speed networks, serializing a packet onto the network cable takes
considerable time, and over long distances the speed of light imposes a
significant delay.

In teleconferencing, it is important to keep delay low so that the participants
can communicate fluidly without talking on top of each other and so that their
own voices don't return after a round trip as an annoying echo.

For network music performance, research has shown that the total one-way delay
must be kept under 25ms to avoid degrading the musicians' performance.

Since many of the sources of delay in a complete system are outside of the
user's control (such as the speed of light) it is often only possible to
reduce the total delay by reducing the codec delay.

Low delay has traditionally been considered a challenging area in audio codec
design, because as a codec is forced to work on the smaller chunks of audio
required for low delay it has access to less redundancy and less perceptual
information which it can use to reduce the size of the transmitted audio.

CELT is designed to bridge the gap between "music" and "speech" codecs,
permitting new very high quality teleconferencing applications, and to go
further, permitting latencies much lower than speech codecs normally provide
to enable applications such as remote musical collaboration even over long
distances.

In keeping with the Xiph.Org mission, CELT is also designed to accomplish
this without copyright or patent encumbrance. Only by keeping the formats
that drive our Internet communication free and unencumbered can we maximize
innovation, collaboration, and interoperability. Fortunately, CELT is ahead
of the adoption curve in its target application space, so there should be
no reason for someone who needs what CELT provides to go with a proprietary
codec.

CELT has been tested on x86, x86_64, ARM, and the TI C55x DSPs, and should
be portable to any platform with a working C compiler and on the order of
100 MIPS of processing power.

The code is still at an early stage, so it may be broken from time to time,
and the bit-stream is not frozen yet, so it changes from one version to
another. Oh, and don't complain if it sets your house on fire.

Complaints and accolades can be directed to the CELT mailing list:
http://lists.xiph.org/mailman/listinfo/celt-dev/

To compile:
% ./configure
% make

For platforms without fast floating point support (such as ARM) use the
--enable-fixed argument to configure to build a fixed-point version of CELT.

There are Ogg-based encode/decode tools in tools/. These are quite similar to
the speexenc/speexdec tools. Use the --help option for details.

There is also a basic tool for testing the encoder and decoder called
"testcelt" located in libcelt/:

% testcelt <rate> <channels> <frame size> <bytes per packet> input.sw output.sw

where input.sw is a 16-bit (machine endian) audio file sampled at 32000 Hz to
96000 Hz. The output file is already decompressed.

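If no suitable input file is at hand, one can be generated. The following is
a minimal Python sketch, not part of CELT, that writes a mono sine tone as
16-bit samples in the machine's native byte order. It assumes the file is
expected to be headerless; the function name, tone frequency, amplitude, and
duration are illustrative choices, not anything CELT defines.

```python
import array
import math


def write_sine_sw(path, rate=44100, seconds=1.0, freq=440.0, amplitude=0.5):
    """Write a mono sine tone as raw 16-bit PCM in native byte order.

    Hypothetical helper for producing test input; assumes a headerless file.
    Returns the number of samples written.
    """
    count = int(rate * seconds)
    samples = array.array(
        "h",  # signed 16-bit integers, stored in the machine's own endianness
        (int(amplitude * 32767 * math.sin(2 * math.pi * freq * i / rate))
         for i in range(count)),
    )
    with open(path, "wb") as f:
        samples.tofile(f)
    return count


if __name__ == "__main__":
    n = write_sine_sw("input.sw")
    print(f"wrote {n} samples to input.sw")
```

The rate passed here should match the <rate> argument later given to
testcelt, since a raw file carries no record of its own sampling rate.
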
For example, for a 44.1 kHz mono stream at ~64kbit/sec and with 256 sample
frames:

% testcelt 44100 1 256 46 input.sw output.sw

Since 44100/256*46*8 = 63393.75 bits/sec.

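The same arithmetic can be used in reverse to pick a <bytes per packet>
value for a target bitrate. This is a small Python sketch of that
calculation, assuming one frame per packet as in the example above; the
function names are made up for illustration.

```python
def bitrate_bps(rate, frame_size, bytes_per_packet):
    """Bits per second when each packet carries exactly one frame."""
    return rate / frame_size * bytes_per_packet * 8


def bytes_for_bitrate(rate, frame_size, target_bps):
    """Largest whole packet size (in bytes) that stays at or below target_bps."""
    return int(target_bps * frame_size / (rate * 8))


if __name__ == "__main__":
    print(bitrate_bps(44100, 256, 46))           # 63393.75
    print(bytes_for_bitrate(44100, 256, 64000))  # 46
```

Rounding the packet size down keeps the stream at or under the target, which
is why 46 bytes gives slightly less than 64 kbit/sec.
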
All even frame sizes from 64 to 512 are currently supported, although
power-of-two sizes are recommended and most CELT development is done
using a size of 256. The delay imposed by CELT is 1.25x - 1.5x the
frame duration depending on the frame size and some details of CELT's
internal operation. For 256 sample frames the delay is 1.5x or 384
samples, so the total codec delay in the above example is 8.71ms
(1000/(44100/384)).

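The delay figure can be checked for other rates with a few lines of Python.
This sketch takes the delay multiplier as a parameter, because only the 1.5x
value for 256 sample frames is stated above; the multiplier for any other
frame size is an assumption the caller has to supply.

```python
def codec_delay_ms(rate, frame_size, delay_factor):
    """Codec delay in milliseconds.

    delay_factor is the multiple of the frame duration that the codec
    imposes (between 1.25 and 1.5 according to the text above).
    """
    delay_samples = frame_size * delay_factor
    return 1000.0 * delay_samples / rate


if __name__ == "__main__":
    # 256 sample frames at 44.1 kHz with the stated 1.5x multiplier
    print(round(codec_delay_ms(44100, 256, 1.5), 2))  # 8.71
```

At 48000 Hz the same 384 sample delay works out to exactly 8 ms.
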