UT Arlington EE 5359 - Audio coding


AUDIO CODING
EBU TECHNICAL REVIEW – July 2002
M. Dietz and S. Meltzer

CT-aacPlus – a state-of-the-art audio coding scheme

Martin Dietz and Stefan Meltzer
Coding Technologies, Germany

CT-aacPlus is a combination of Spectral Band Replication (SBR) technology – a bandwidth-extension tool developed by Coding Technologies (CT) in Germany – with the MPEG Advanced Audio Coding (AAC) technology which, to date, has been one of the most efficient traditional perceptual audio-coding schemes.

CT-aacPlus is able to deliver high-quality audio signals at bit-rates down to 24 kbit/s for mono and 48 kbit/s for stereo signals. The forthcoming Digital Radio Mondiale (DRM) broadcasting system, among others, will use CT-aacPlus for its audio-coding scheme. CT-aacPlus will enable DRM to deliver an audio quality, in the frequency range below 30 MHz, that is equivalent to – or even better than – that offered by today’s analogue FM services.

This article describes the principles of traditional audio coders – and their limitations when used for low bit-rate applications. The second part describes the basic idea of SBR technology and demonstrates the improvements achieved through the combination of SBR technology with traditional audio coders such as AAC and MP3.

Advanced Audio Coding (AAC) has so far been one of the most efficient traditional perceptual audio-coding algorithms. In combination with the bandwidth-extension technology, Spectral Band Replication (SBR), the coding efficiency of AAC can be even further improved by at least 30%, thus providing the same audio quality at a 30% lower bit-rate. The combination of AAC and SBR – referred to as CT-aacPlus – will be used by the Digital Radio Mondiale transmission system [1] in the frequency bands below 30 MHz and will provide near-FM sound quality at bit-rates of around 20 kbit/s per audio channel.

SBR technology is the latest development in audio-coding research and can be combined with nearly every traditional audio-coding scheme. An average improvement in bit-rate efficiency of at least 30% can be achieved. The combination of coding schemes can be done in a compatible way, thus allowing the upgrade of existing systems based on traditional audio coders. Appropriate transition scenarios will allow a nearly seamless introduction of the new technology which, in the end, will allow us to harness the full advantages of the increased bit-rate efficiency offered by SBR technology.

Traditional perceptual audio coding

Research on perceptual audio coders started about 20 years ago. Research on the human auditory system revealed that hearing is mainly based on a short-term spectral analysis of the audio signal. The so-called masking effect was observed: the human auditory system is not able to perceive distortions that are masked by a stronger signal in the spectral neighbourhood. Thus, when looking at the short-term spectrum, a so-called masking threshold can be calculated for this spectrum. Distortions below this threshold are – in the ideal case – inaudible. Research then started on how to calculate the masking threshold (“psycho-acoustic model”) and on how to process the audio signal in such a way that only audible information resides in the signal. Ideally, an audio codec applies compression such that the distortion introduced is exactly below the masking threshold. Fig. 1 illustrates the quantization noise that an ideal perceptual coder would produce.

[Figure 1: Ideal perceptual audio coding. Axes: energy against frequency. Labels: masking threshold, perceptual coder.]

This research led to today’s well-known traditional perceptual audio codecs, based on waveform codecs; for example, MPEG Layer 2, Dolby AC-3, MP3, Sony Atrac, Lucent PAC and MPEG AAC. All these codecs are based on the same principle, as shown in Fig. 2.

The audio signal is transformed into the frequency domain by means of a filter bank or transform, on a block-by-block basis. The resulting short-time spectrum is quantized in such a way that the masking threshold calculated by the psycho-acoustic model is not violated. The quantized spectrum gets coded and packed into a bitstream. The decoder performs the reverse signal-processing steps, but does not generally contain a psycho-acoustic model.

[Figure 2: Traditional perceptual waveform encoder. Blocks: PCM audio data, time/frequency mapping, psychoacoustic model, quantizing and coding, frame packing, bitstream.]

Although the established perceptual waveform codecs already achieve significant compression, the efficiency is still not high enough to fulfil the needs of systems based on analogue/digital telephone lines or wireless systems and broadcasting systems. Fig. 3 illustrates what happens if the compression rate is further increased in such a codec: the distortion introduced by the codec violates the masking threshold and produces audible artefacts.

[Figure 3: Waveform coding beyond its limits. Axes: energy against frequency. Label: severe violation of masking threshold.]

The main method of overcoming this problem in traditional perceptual waveform codecs is to limit the audio bandwidth. As a consequence, more information is available for the remainder of the spectrum, resulting in a clean but hollow-sounding signal. Another method, called intensity stereo, can only be used for stereo signals. In intensity stereo, only one channel and some panning information is transmitted, instead of a left and a right channel. However, this is only of limited use in increasing the compression efficiency, as in many cases the stereo image of the audio signal gets destroyed.

SBR – the next step in audio coding

SBR – the Spectral Band Replication technology developed by Coding Technologies – is a novel technology that significantly increases the efficiency of audio coding. SBR is the result of the latest achievements in audio coding research, which revealed that the high frequencies of an audio signal can be represented much more efficiently than before. The main effect used is the high correlation between the low- and high-frequency content in an audio signal.

In an SBR-based coding system, waveform audio coding is only used to code the lower frequencies of an audio signal. This low frequency content is used to recreate the high frequency content at the decoding side (Fig. 4). This is done by state-of-the-art transposition methods. The recreated high-frequency content undergoes some frequency and time domain

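The quantization rule of the traditional encoder (Fig. 2) and its failure at higher compression (Fig. 3) can be illustrated with a short Python sketch. It is only a sketch under stated assumptions, not the method of any particular codec: the uniform-error noise model (noise power = step²/12), all function names and the example coefficients and thresholds are hypothetical and do not come from the article.

```python
import math

def modelled_noise_power(step):
    """Noise power of a uniform quantizer under the uniform-error model (an assumption)."""
    return step * step / 12.0

def step_for_threshold(mask_power):
    """Coarsest step whose modelled noise power just reaches the masking threshold."""
    return math.sqrt(12.0 * mask_power)

def violates_threshold(step, mask_power, tol=1e-9):
    """True if the modelled quantization noise rises above the masking threshold."""
    return modelled_noise_power(step) > mask_power * (1.0 + tol)

def quantize(coeffs, step):
    """Encoder side: map spectral coefficients of one band to integer indices."""
    return [round(c / step) for c in coeffs]

def dequantize(indices, step):
    """Decoder side: rebuild coefficients; no psycho-acoustic model is needed here."""
    return [i * step for i in indices]

if __name__ == "__main__":
    # Hypothetical spectral coefficients and masking thresholds (power) for three bands.
    bands = [[10.3, -7.9, 3.2], [2.4, -1.1, 0.6], [0.9, -0.4, 0.2]]
    masks = [4.0, 1.0, 0.25]
    for coeffs, mask in zip(bands, masks):
        step = step_for_threshold(mask)
        indices = quantize(coeffs, step)
        # Doubling the step saves bits but quadruples the modelled noise power.
        print(indices, violates_threshold(step, mask), violates_threshold(2 * step, mask))
```

The step size is the only quantity the decoder needs per band, which is consistent with the remark that the decoder does not generally contain a psycho-acoustic model. The noise model holds on average only; for an individual block the actual error can be above or below the modelled value.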

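The intensity stereo idea described above (one channel plus panning information instead of left and right) can be sketched in the same way. The version below is a deliberately simplified assumption: it works on a whole block of time-domain samples with one gain per channel, whereas practical codecs apply the technique per frequency band. The function names and test signals are hypothetical.

```python
import math

def energy(samples):
    """Sum of squared sample values."""
    return sum(s * s for s in samples)

def intensity_encode(left, right):
    """Reduce a stereo block to one channel plus two gains (the panning information)."""
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]
    e_mono = energy(mono)
    if e_mono == 0.0:
        return mono, 0.0, 0.0  # the downmix cancelled completely; nothing to scale
    gain_left = math.sqrt(energy(left) / e_mono)
    gain_right = math.sqrt(energy(right) / e_mono)
    return mono, gain_left, gain_right

def intensity_decode(mono, gain_left, gain_right):
    """Rebuild two channels; both are scaled copies of the same transmitted channel."""
    return [gain_left * m for m in mono], [gain_right * m for m in mono]

if __name__ == "__main__":
    # Case 1: right is a quieter copy of left, so pure panning describes it well.
    print(intensity_decode(*intensity_encode([1.0, 2.0, 3.0], [0.5, 1.0, 1.5])))
    # Case 2: the channels carry different content; the decoded pair becomes identical.
    print(intensity_decode(*intensity_encode([1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0])))
```

In the second case the channel energies are preserved, but the two decoded channels are identical, which is one way to see why the stereo image gets destroyed when left and right differ by more than level.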

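The SBR principle stated above, coding only the low frequencies and recreating the high frequencies from them at the decoder, can be sketched roughly as follows. Several things here are assumptions made for illustration: the spectrum is a plain list of magnitude values, transposition is a simple copy of the low band into the high band, and the copied band is rescaled to a coarse energy envelope sent as side information. The text above does not specify the adjustment step, and a real system operates on filter-bank signals, so this is not the actual SBR algorithm. All names are hypothetical.

```python
import math

def rms(values):
    """Root-mean-square level of a list of magnitudes."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def sbr_encode(spectrum, crossover, env_bands):
    """Keep the low band as is; reduce the high band to one RMS value per envelope band."""
    low = list(spectrum[:crossover])
    high = spectrum[crossover:]
    if crossover <= 0 or env_bands <= 0 or not high or len(high) % env_bands:
        raise ValueError("need a non-empty low band and a high band divisible into env_bands")
    size = len(high) // env_bands
    envelope = [rms(high[b * size:(b + 1) * size]) for b in range(env_bands)]
    return low, envelope

def sbr_decode(low, envelope, high_len):
    """Recreate the high band by copying the low band upward, then match the envelope."""
    patch = [low[i % len(low)] for i in range(high_len)]  # simple copy-up transposition
    size = high_len // len(envelope)
    high = []
    for b, target in enumerate(envelope):
        segment = patch[b * size:(b + 1) * size]
        current = rms(segment)
        gain = target / current if current > 0.0 else 0.0
        high.extend(gain * v for v in segment)
    return list(low) + high

if __name__ == "__main__":
    # Hypothetical 16-bin magnitude spectrum whose high band resembles the low band.
    spectrum = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0,
                0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
    low, envelope = sbr_encode(spectrum, crossover=8, env_bands=2)
    print(len(low) + len(envelope), "values sent instead of", len(spectrum))
    print(sbr_decode(low, envelope, high_len=8))
```

The sketch relies on the correlation between low- and high-frequency content that the article names as the main effect: when the high band has a shape similar to the low band, a handful of envelope values is enough to restore its level.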