EE5359 MULTIMEDIA PROCESSING REPORT

Project Proposal

Title: Study and comparison of AC-3, AAC and HE-AAC audio codecs

Abstract

Spectral band replication (SBR) is an advancement in low-bit-rate audio coding that enhances the performance of traditional audio coders. Coding Technologies, an international company in the audio coding field, developed and marketed SBR. MPEG AAC, which belongs to the ISO/MPEG standard, has shown a tremendous improvement with SBR [1]. The coding efficiency of traditional audio coders increases by at least 30% when SBR is added [7]. SBR is a bandwidth extension technique that exploits the strong correlation between the low- and high-frequency content of an audio signal. In this project, a performance analysis of the MPEG AAC audio coder and the advanced audio coding (AAC) audio coder with SBR will be carried out, including a comparison of coding efficiency.

Student: Dhatchaini Rajendran
Student ID: 1000636681
Email: dhatchaini.rajendran@mavs.uta.edu
Date: September 28, 2010

ACRONYMS AND ABBREVIATIONS

AAC: Advanced audio coding
AC-3: Audio codec 3
AES: Audio Engineering Society
ATSC: Advanced Television Systems Committee
HE-AAC: High efficiency advanced audio coding
IMDCT: Inverse modified discrete cosine transform
ISO: International Organization for Standardization
LC: Low complexity
LFE: Low frequency enhancement
LTP: Long term prediction
MDCT: Modified discrete cosine transform
MPEG: Moving Picture Experts Group
PCM: Pulse code modulation
SBR: Spectral band replication
SRS: Sample rate scalable
TNS: Temporal noise shaping

An Overview of Perceptual Audio Coding

Audio coding algorithms aim to represent the audio signal with the minimum number of bits while reproducing the signal with minimum error. Perceptual audio coding algorithms exploit facts such as the insensitivity of the human ear to frequencies above about 20 kHz and the redundancy in audio signals to achieve maximum compression of the audio signal. The
irrelevant information in the signal is identified using several psychoacoustic principles, such as absolute hearing thresholds, simultaneous masking, critical band frequency analysis, temporal masking and the spread of masking along the basilar membrane.

Figure 1: Block diagram of perceptual encoding/decoding scheme [1]. Blocks shown: digital audio input, analysis filter bank, perceptual model, quantization and coding, encoding of bitstream.

The blocks in Fig. 1 are explained below.

The filter bank decomposes the digital input signal into its subsampled spectral components (time/frequency domain).

The perceptual model uses the time-domain input signal or, more commonly, the output of the analysis filter bank, together with rules from psychoacoustics, to calculate the actual masking threshold. This block is the perceptual model of the perceptual encoding system.

The spectral components are quantized and coded so that the noise introduced by quantization is kept below the masking threshold. There are several ways of accomplishing this step, from simple block companding to analysis-by-synthesis systems using additional noiseless compression.

A bitstream formatter assembles the bitstream, which consists of the quantized and coded spectral coefficients and some side information, such as bit allocation information.

An Overview of AC-3 Audio Codec

AC-3 is an audio codec developed by Dolby Laboratories. The Dolby AC-3 audio compression algorithm is an Advanced Television Systems Committee (ATSC) standard for digital audio compression [2]. It is a lossy audio compression format that supports multichannel audio and is used in a variety of applications, including digital television and DVD. There are five full-range channels (3 Hz to 20,000 Hz): three in the front (left, right and centre) and two surround channels. The sixth channel covers 3 Hz to 120 Hz and is known as the low frequency enhancement (LFE) channel. This set of channels is known as 5.1 channels.

Figure 2: Block diagram of AC-3
encoder [2]

The operation of the AC-3 encoder blocks in Fig. 2 is explained here [2]. The first step in the encoding process is to transform the representation of audio from a sequence of PCM time samples into a sequence of blocks of frequency coefficients. This is done in the analysis filter bank. Overlapping blocks of 512 time samples are multiplied by a time window and transformed into the frequency domain. Because the blocks overlap, each PCM input sample is represented in two sequential transformed blocks. The frequency-domain representation can therefore be decimated by a factor of two, so that each block contains 256 frequency coefficients. Each frequency coefficient is represented by a binary exponent and a mantissa. The set of exponents is encoded into a coarse representation of the signal spectrum, referred to as the spectral envelope. The core bit allocation routine uses this spectral envelope to determine the number of bits used to encode each individual mantissa. The mantissas are then quantized according to the bit allocation information. The spectral envelope and the coarsely quantized mantissas for 6 audio blocks (1536 audio samples) are formatted into an AC-3 frame. The AC-3 bit stream, at 32 to 640 kbps, is a sequence of AC-3 frames. The AC-3 decoder essentially performs the inverse of the encoder operations.

An Overview of MPEG Advanced Audio Coding

The advanced audio coding scheme was developed jointly by Dolby, Fraunhofer, AT&T, Sony and Nokia [9]. It is a digital audio compression scheme for medium to high bit rates and is not backward compatible with earlier Moving Picture Experts Group (MPEG) audio standards. AAC encoding follows a modular approach, and the standard defines four profiles, which can be chosen based on factors such as the complexity of the bitstream to be encoded and the desired performance and output:

Low complexity (LC)
Main profile (MAIN)
Sample rate scalable (SRS)
Long term prediction (LTP)

AAC provides excellent audio quality and is suitable for low-bit-rate, high-quality audio applications. The MPEG AAC audio coder
uses the AAC scheme. HE-AAC, also known as aacPlus, is a low-bit-rate audio coder: an AAC-LC audio coder enhanced with SBR technology. A generic block diagram of an AAC encoder is shown in Fig. 3 [3].

AAC is a second-generation coding scheme used for stereo and multichannel signals. Compared with earlier perceptual coders, AAC provides more flexibility and uses more coding tools [3]. The following tools enhance the coding efficiency and help attain higher quality at lower bit rates [3]: