AUDIO CODING

CT-aacPlus: a state-of-the-art audio coding scheme

Martin Dietz and Stefan Meltzer
Coding Technologies, Germany

EBU TECHNICAL REVIEW, July 2002

CT-aacPlus is a combination of Spectral Band Replication (SBR) technology, a bandwidth-extension tool developed by Coding Technologies (CT) in Germany, with the MPEG Advanced Audio Coding (AAC) technology which, to date, has been one of the most efficient traditional perceptual audio coding schemes. CT-aacPlus is able to deliver high-quality audio signals at bit rates down to 24 kbit/s for mono and 48 kbit/s for stereo signals.

The forthcoming Digital Radio Mondiale (DRM) broadcasting system, among others, will use CT-aacPlus for its audio coding scheme. CT-aacPlus will enable DRM to deliver an audio quality in the frequency range below 30 MHz that is equivalent to, or even better than, that offered by today's analogue FM services.

This article describes the principles of traditional audio coders and their limitations when used for low bit-rate applications. The second part describes the basic idea of SBR technology and demonstrates the improvements achieved through the combination of SBR technology with traditional audio coders such as AAC and MP3.

Advanced Audio Coding (AAC) has so far been one of the most efficient traditional perceptual audio coding algorithms. In combination with the bandwidth-extension technology Spectral Band Replication (SBR), the coding efficiency of AAC can be improved even further, by at least 30 %, thus providing the same audio quality at a 30 % lower bit rate. The combination of AAC and SBR, referred to as CT-aacPlus, will be used by the Digital Radio Mondiale transmission system [1] in the frequency bands below 30 MHz and will provide near-FM sound quality at bit rates of around 20 kbit/s per audio channel.

SBR technology is the latest development in audio coding research and can be combined with nearly every traditional audio coding scheme. An average improvement in bit-rate efficiency of at least 30 % can be achieved. The combination of coding schemes can be done in a compatible way, thus allowing the upgrade of existing systems based on traditional audio coders. Appropriate transition scenarios will allow a nearly seamless introduction of the new technology which, in the end, will allow us to harness the full advantages of the increased bit-rate efficiency offered by SBR technology.

Traditional perceptual audio coding

Research on perceptual audio coders started about 20 years ago. Research on the human auditory system revealed that hearing is mainly based on a short-term spectral analysis of the audio signal. The so-called masking effect was observed: the human auditory system is not able to perceive distortions that are masked by a stronger signal in the spectral neighbourhood. Thus, when looking at the short-term spectrum, a so-called masking threshold can be calculated for this spectrum. Distortions below this threshold are, in the ideal case, inaudible.

Research then started on how to calculate the masking threshold (psycho-acoustic model) and on how to process the audio signal in such a way that only audible information resides in the signal. Ideally, an audio codec applies compression such that the distortion introduced is just below the masking threshold. Fig. 1 illustrates the quantization noise that an ideal perceptual coder would produce.

[Figure 1: Ideal perceptual audio coding. Axes: energy versus frequency. Labels: masking threshold, perceptual coder.]

This research led to today's well-known traditional perceptual audio codecs based on waveform codecs, for example MPEG Layer-2, Dolby AC-3, MP3, Sony ATRAC, Lucent PAC and MPEG AAC. All these codecs are based on the same principle, as shown in Fig. 2.

[Figure 2: Traditional perceptual waveform encoder. Blocks: PCM audio data, time/frequency mapping, quantizing and coding, frame packing, psycho-acoustic model. Output: bitstream.]

[Figure 3: Waveform coding beyond its limits. Axes: energy versus frequency. Annotation: severe violation of masking threshold.]
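The idea of keeping quantization noise under the masking threshold can be illustrated with a minimal Python sketch. It is not taken from any of the codecs named above. It assumes a plain uniform quantizer and the common approximation that such a quantizer produces an average noise power of step² / 12. The function names, the band values and the thresholds are all hypothetical and chosen only for illustration.

```python
import math


def noise_power(step: float) -> float:
    """Modelled average noise power of a uniform quantizer (step**2 / 12)."""
    return step * step / 12.0


def max_step_size(threshold_power: float) -> float:
    """Largest step whose modelled noise power does not exceed the threshold."""
    return math.sqrt(12.0 * threshold_power)


def quantize(values, step):
    """Map each spectral value to the nearest integer multiple of the step."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Reconstruct spectral values from the integer indices."""
    return [i * step for i in indices]


# Hypothetical data: spectral values of two bands, each with a made-up
# masking threshold (noise power that is assumed to stay inaudible).
bands = {
    "low": ([0.80, -0.35, 0.52, 0.10], 1e-4),
    "high": ([0.05, -0.02, 0.03, 0.01], 1e-2),
}

for name, (spectrum, threshold) in bands.items():
    step = max_step_size(threshold)
    indices = quantize(spectrum, step)
    print(name, "step =", round(step, 4), "indices =", indices)
```

In this sketch a band with a higher masking threshold is given a coarser step, so its values map to fewer and smaller indices and need fewer bits. A real psycho-acoustic model and bit allocation are considerably more involved.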
The audio signal is transformed into the frequency domain by means of a filter bank or transform, on a block-by-block basis. The resulting short-time spectrum is quantized in such a way that the masking threshold calculated by the psycho-acoustic model is not violated. The quantized spectrum gets coded and packed into a bitstream. The decoder performs the reverse signal-processing steps but does not generally contain a psycho-acoustic model.

Although the established perceptual waveform codecs already achieve significant compression, the efficiency is still not high enough to fulfil the needs of systems based on analogue/digital telephone lines or wireless systems and broadcasting systems. Fig. 3 illustrates what happens if the compression rate is further increased in such a codec: the distortion introduced by the codec violates the masking threshold and produces audible artefacts.

The main method of overcoming this problem in traditional perceptual waveform codecs is to limit the audio bandwidth. As a consequence, more information is available for the remainder of the spectrum, resulting in a clean but hollow-sounding signal. Another method, called intensity stereo, can only be used for stereo signals. In intensity stereo, only one channel and some panning information is transmitted, instead of a left and a right channel. However, this is only of limited use in increasing the compression efficiency as, in many cases, the stereo image of the audio signal gets destroyed.

Abbreviations
AAC: MPEG-2/4 Advanced Audio Coding
AES: Audio Engineering Society
AM: Amplitude Modulation
CT-aacPlus: Coding Technologies advanced audio coding Plus
DRM: Digital Radio Mondiale
DVB: Digital Video Broadcasting
FM: Frequency Modulation
HVXC: MPEG Harmonic Vector Excitation Coding
MPEG: Moving Picture Experts Group
MUSHRA: EBU MUlti Stimulus test with Hidden Reference and Anchors
QAM: Quadrature Amplitude Modulation
RF: Radio Frequency
SBR: Spectral Band Replication
WMA: Microsoft Windows MediA

SBR: the next step in audio coding

SBR, the Spectral Band Replication technology developed by Coding Technologies, is a novel technology that significantly increases the efficiency of audio coding. SBR is the result of the latest achievements in audio coding research, which revealed that the high frequencies of an audio signal can be represented much more efficiently than before. The main effect used is the high correlation between the low- and high-frequency content in an audio signal. In an SBR-based coding