UT Arlington EE 5351 - Audio Coding Technology of ExAC

Audio Coding Technology of ExAC

A. Ehret1, X.D. Pan2, M. Schug1, H. Hoerich1, W.M. Ren3, X.M. Zhu3 and F. Henn4

1: Coding Technologies GmbH, Nuremberg, Germany; 2: Beijing Media Works Co., Ltd, Beijing, China; 3: Beijing Eworld Tech. Co., Ltd, Beijing, China; 4: Coding Technologies AB, Stockholm, Sweden

ABSTRACT

A new low bitrate audio coding technology (hereafter denoted “ExAC”) based on Enhanced Audio Coding (EAC) and Spectral Band Replication (SBR) is introduced. The major building blocks of the coding scheme are explained: EAC works as the core coder and SBR works as a powerful bandwidth extension module. The new coding technology provides a high quality audio compression scheme for a broad range of applications, including the high-density laser video diskette, HDTV, and very low bitrate applications such as AM audio broadcasting and streaming.

1. INTRODUCTION

In recent years, digital audio has become an important source of information in the modern world of information systems. In linear (uncompressed) representation, digital audio requires a large amount of memory for storage or bandwidth for transmission. Many research efforts have been devoted to the problem of audio compression in the last two decades. Two compression categories have been of particular interest: high performance and low bitrate audio coding. High performance audio coding aims to achieve the highest possible audio quality at a given bitrate. Applications requiring this type of compression include the high-density laser video diskette and HDTV. Conversely, for applications such as streaming or audio broadcasting, coding at the lowest possible bitrate while maintaining reasonable audio quality is of primary interest.

In this paper, a new audio codec named ExAC is introduced. ExAC is based on the existing audio coding technologies EAC and SBR. In addition, new tools are applied to guarantee the highest possible audio quality, even at very low bitrates and/or very low sampling rates.
ExAC is being developed to fulfill the requirements of different applications, combining the characteristics of high performance and low bitrate audio coding. The codec has been proposed to the China Audio and Video Standard (AVS) Working Group and for the Enhanced Video Diskette (EVD) standard, and it has been selected as the audio coding standard of EVD. The following sections explain the basic blocks of ExAC, and some test results are presented.

2. OVERVIEW OF ExAC

Fig. 1 illustrates the general structure of the ExAC encoder, which includes several components: Downsampler, EAC core encoder, SBR encoder and Multiplexer.

[Fig. 1. Diagram of the ExAC Encoder. Blocks: Downsampler, EAC Core Encoder, SBR Encoder, Multiplexer]

The ExAC decoder is depicted in Fig. 2. It includes several components: deMultiplexer, EAC core decoder and SBR decoder (with implicit upsampling).

[Fig. 2. Diagram of the ExAC Decoder. Blocks: deMultiplexer, EAC Core Decoder, SBR Decoder]

3. EAC CORE

EAC is an audio coding technology developed by Beijing E-World Ltd. Co. It uses a multi-resolution tiling of the time/frequency (T/F) plane, a quantization algorithm that minimizes the global perceptual distortion, and entropy coding to compress the audio signal by exploiting both its redundancy and its irrelevancy. The EAC codec supports mono, stereo and 5.1 surround encoding and decoding modes. EAC has already been accepted as the audio codec for the EVD (Enhanced Video Diskette) system.

EAC works as the core coder of ExAC at half of the nominal sampling rate. It is composed of a 2:1 Downsampler, a Time to Frequency Mapping module and a Quantizer, as well as a Psychoacoustic Model. In the ExAC encoder, EAC encodes the low frequency components of the audio signal, and the result is transmitted to the Multiplexer.

A typical EAC encoder is illustrated in Fig. 3. The Signal-type Detection module analyzes the input audio signal, and the audio frames are classified and labeled as either stationary-like or transient-like.
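
The text does not specify how Signal-type Detection reaches its decision. Purely as an illustration, the Python sketch below classifies a frame by comparing the energy of consecutive sub-blocks, a common attack-detection heuristic. The function name `classify_frame`, the number of sub-blocks and the threshold are assumptions made for this example and are not taken from EAC.

```python
import math


def classify_frame(frame, num_blocks=8, ratio_threshold=10.0, floor=1e-9):
    """Label a frame "transient" or "stationary" (hypothetical heuristic).

    The frame is split into equal sub-blocks; a sharp rise in energy from
    one sub-block to the next is treated as an attack. This is NOT the EAC
    detector, only an assumed stand-in.
    """
    block_len = len(frame) // num_blocks
    if block_len == 0:
        raise ValueError("frame is too short for the requested number of sub-blocks")

    # Short-time energy per sub-block; the floor avoids division by zero.
    energies = [
        sum(x * x for x in frame[i * block_len:(i + 1) * block_len]) + floor
        for i in range(num_blocks)
    ]

    # Compare each sub-block with its predecessor.
    for previous, current in zip(energies, energies[1:]):
        if current / previous > ratio_threshold:
            return "transient"
    return "stationary"


# Example inputs: a steady tone, and silence followed by a sudden tone onset.
steady = [math.sin(2 * math.pi * n / 32) for n in range(1024)]
attack = [0.0] * 512 + steady[:512]
```

With these inputs, `classify_frame(steady)` labels the tone as stationary and `classify_frame(attack)` labels the onset frame as transient. A practical detector would likely work on perceptually weighted or band-limited energies; the plain energy ratio is used here only to keep the sketch short.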
When a transient-like frame is encoded, the codec should be adjusted to avoid perceptible pre-echoes. In the current EAC core, FLPVQ (Frequency domain Linear Prediction Vector Quantization) and MR (Multi-Resolution Analysis) help mitigate pre-echoes. The basic idea behind FLPVQ is that linear prediction of the spectrum can efficiently improve the time resolution for transient-like signals. In addition, the EAC core tunes the time-frequency resolution of the encoded signal by employing multi-resolution analysis on the frequency coefficients. The significant vectors in the time-frequency plane are quantized and coded with a Vector Quantizer to improve the coding efficiency.

[Page footer: Proceedings of 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, October 20-22, 2004, Hong Kong, p. 290]

[Fig. 3. Diagram of the EAC encoder. Blocks: Signal type detection, Time-frequency mapping, Multi-resolution analysis + VQ, Frequency domain LPVQ, M/S, Quantizing & Entropy coding, Psychoacoustic analysis; input: audio signal; output: bitstream]

In order to reduce the redundancy of the multi-channel audio signal, M/S coding is implemented, in which the sum and difference of highly correlated channels are coded rather than the original channels.

In the Quantizing and Entropy coding module, the coefficients are divided into a set of scale factor bands, and a non-uniform scalar quantizer then quantizes the coefficients of each band. To improve the coding gain, Huffman coding is implemented in this module. A bit allocation loop distributes the budgeted bits among the scale factor bands when quantizing the coefficients.

For a more elaborate description of EAC, please refer to the technical proposal to the China Audio and Video Standard working group [1].

4. SBR MODULE

A. Principle

The principle of SBR is based on the fact that the high frequencies of an audio signal can be extrapolated from the low frequencies; reconstruction by means of transposition therefore codes the high frequency portion with very low overhead. Apart from the pure transposition (see Fig. 4a), the reconstruction of the highband is further improved by transmitting guiding information such as the spectral envelope of the original input signal or additional information to compensate for potentially missing high frequency components (see Fig. 4b). This guiding information is hereafter referred to as SBR data. Of course, means must be taken to code the SBR data
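
The two ideas described under Principle, transposition of the lowband and correction with a transmitted spectral envelope, can be illustrated with a small Python sketch. It works on a plain list of spectral magnitudes and assumes that the SBR data consists of one energy value per highband band. The function names, the band layout and the cyclic copy-up patching are assumptions made for this example; the actual SBR patching rules and envelope format are not given in this excerpt.

```python
import math


def band_energies(spectrum, band_edges):
    """Encoder side (assumed): energy of each (start, stop) band of a spectrum."""
    return [sum(v * v for v in spectrum[a:b]) for a, b in band_edges]


def sbr_reconstruct(lowband, num_high_bins, band_edges, target_energies, eps=1e-12):
    """Decoder side (assumed): rebuild a highband from the decoded lowband.

    Step 1 copies lowband magnitudes upward (pure transposition).
    Step 2 scales each band so its energy matches the transmitted envelope.
    """
    if not lowband:
        raise ValueError("lowband must not be empty")

    # Step 1: transposition, reusing lowband bins cyclically if needed.
    highband = [lowband[k % len(lowband)] for k in range(num_high_bins)]

    # Step 2: envelope adjustment, one gain per band.
    for (start, stop), target in zip(band_edges, target_energies):
        energy = sum(v * v for v in highband[start:stop])
        gain = math.sqrt(target / (energy + eps))
        for k in range(start, stop):
            highband[k] *= gain
    return highband


# Toy example: 4 lowband bins, 4 highband bins, 2 envelope bands.
original = [8.0, 6.0, 4.0, 2.0, 1.0, 0.8, 0.4, 0.2]
low, high = original[:4], original[4:]
edges = [(0, 2), (2, 4)]
sbr_data = band_energies(high, edges)            # guiding information
rebuilt = sbr_reconstruct(low, len(high), edges, sbr_data)
```

In this toy case the rebuilt highband has the same per-band energies as the original highband, although its fine structure comes from the lowband. Only the two envelope values need to be transmitted for the highband, which is the sense in which the overhead is low.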

