MIT HST 723 - Fundamentals of Perceptual Audio Coding

Fundamentals of Perceptual Audio Encoding
Craig Lewiston
HST.723 Lab II
3/23/06

Outline: Goals of Lab; Digital Audio; Quantization; Compression; MPEG; Overview of Perceptual Encoding; Masking; Quantization Noise; Sub-band Coding; Masking/Bit Allocation; Example: MPEG-1 Psychoacoustic Model I; Lab Experiments; Method of Adjustment; Exp 1: Masking Pattern; Exp 2: Masking Thresholds; Asymmetry of Simultaneous Masking; Lab Write-up

Goals of Lab
• Introduction to the fundamental principles of digital audio & perceptual audio encoding.
• Learn the basics of the psychoacoustic models used in perceptual audio encoding.
• Run 2 experiments exploring some fundamental principles behind the psychoacoustic models of perceptual audio encoding.

Digital Audio: Quantization
N bits => 2^N levels

Bits    Levels
3       8
4       16
5       32
8       256
16      65536

Quantization noise is the difference between the analog signal and its digital representation; it arises from the error in quantizing the analog signal.
With each increase in the bit level, the digital representation of the analog signal increases in fidelity, and the quantization noise becomes smaller.

Digital Audio: CD Audio
• 16 bit encoding
• 2 channels (stereo)
• 44.1 kHz sampling rate
2 * 44.1 kHz * 16 bits = 1.41 Mb/s
+ Overhead (synchronization, error correction, etc.)
CD Audio = 4.32 Mb/s

Compression
• High data rates, such as CD audio (4.32 Mb/s), are incompatible with internet & wireless applications.
• Audio data must be compressed to a smaller size (fewer bits) without affecting signal quality (minimizing quantization noise).
• Perceptual Audio Encoding is the encoding of audio signals, incorporating psychoacoustic knowledge of the auditory system, in order to reduce the number of bits necessary to faithfully reproduce the signal.
  – MPEG-1 Layer III (aka mp3)
  – MPEG-2 Advanced Audio Coding (AAC)

MPEG
MPEG = Moving Picture Experts Group
MPEG is a family of encoding standards for digital multimedia information.
• MPEG-1: a standard for storage and retrieval of moving pictures and audio on storage media (e.g., CD-ROM).
  – Layer I
  – Layer II
  – Layer III (aka MP3)
• MPEG-2: a standard for digital television, including high-definition television (HDTV), and for addressing multimedia applications.
  – Advanced Audio Coding (AAC)
• MPEG-4: a standard for multimedia applications, with very low bit-rate audio-visual compression for channels with very limited bandwidth (e.g., wireless channels).
• MPEG-7: a content representation standard for information search.

Overview of Perceptual Encoding
General Perceptual Audio Encoder (Painter & Spanias, 2000):
• Psychoacoustic analysis => masking thresholds
• Basic principle of a perceptual audio encoder: use the masking pattern of the stimulus to determine the least number of bits necessary for each frequency sub-band, so as to prevent the quantization noise from becoming audible.

Masking

Quantization Noise
[Figure: three waveform panels, Amplitude (Quantization Levels) vs. Time (ms, 0 to 1), with amplitude axes spanning about ±3, ±15, and ±250 levels; one spectrum panel, Level (dB, −30 to 30) vs. Frequency (Hz, 10^2 to 10^4)]

Sub-band Coding
[Figure: three spectrum panels, Level (dB, −30 to 30) vs. Frequency (Hz, 10^2 to 10^4)]

Sub-band Coding
[Figure: adjacent sub-bands labeled m-1, m, m+1]

Masking/Bit Allocation
The number of bits used to encode each frequency sub-band is the smallest number of bits whose quantization noise falls below the minimum masking threshold for that sub-band.

Example: MPEG-1 Psychoacoustic Model I
1. Spectral Analysis and SPL Normalization
2. Identification of Tonal Maskers & calculation of individual masking thresholds
3. Identification of Noise Maskers & calculation of individual masking thresholds
4. Calculation of Global Masking Thresholds
   A - Some portions of the input spectrum require SNRs > 20 dB
   B - Other portions require less than 3 dB SNR
   C - Some high-frequency portions are masked by the signal itself
   D - Very high-frequency portions fall below the absolute threshold of hearing
5. Sub-band Bit Allocation

Lab Experiments
Exp 1: Masking Pattern
• Measure absolute hearing thresholds in quiet
• Measure absolute hearing thresholds in the presence of a narrowband noise masker
Exp 2: Masking Threshold
• Measure the masking threshold of a 1 kHz tone in the presence of four different maskers:
  – Tone
  – Gaussian noise
  – Multiplied noise
  – Low-noise noise

Method of Adjustment
Georg von Békésy
Method of Adjustment (aka Békésy tracking method): the target tone is swept through the frequency range, and the subject must adjust the intensity of the target tone so that it is just barely detectable.

Exp 1: Masking Pattern
[Figure labels: Masker; Masked sounds; Threshold in quiet; Masked threshold]

Exp 2: Masking Thresholds
Calculation of tonal & noise masking thresholds:
Tonal & noise maskers have different masking effects…

Asymmetry of Simultaneous Masking
• Tone masker: SNR ~ 24 dB
• Noise masker: SNR ~ 4 dB
Why do tones and noises have different masking effects?
Signal = $A(t)\,e^{j[\omega t + \phi(t)]}$
For narrowband Gaussian noise, $e^{j\omega t}$ is approximately the same as a tone centered at the same frequency.
The asymmetry effect is due either to the amplitude term $A(t)$, to the phase term $\phi(t)$, or to a combination of both.

Asymmetry of Simultaneous Masking
Measure masking effects of "modified" noises:
• Multiplied Noise: generated by multiplying a sinusoid at 1 kHz with a low-pass Gaussian noise.
  – Amplitude => Gaussian noise
  – Phase => Pure tone
• Low-Noise Noise: Gaussian noise with a temporal envelope that has been smoothed.
  – Amplitude => Pure tone
  – Phase => Gaussian noise

Target (Quantization noise)    Masker (Desired signal)
Gaussian noise                 Tone
Gaussian noise                 Gaussian noise
Gaussian noise                 Multiplied noise
Gaussian noise                 Low-noise noise
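
The Quantization and CD Audio slides make two claims that are easy to check numerically: N bits give 2^N levels with quantization noise shrinking as bits are added, and stereo 16-bit audio at 44.1 kHz needs about 1.41 Mb/s before overhead. The following is a minimal sketch, not taken from the lab materials. The uniform quantizer, the 997 Hz test tone, the 48 kHz rate, and the names `quantize`, `quantization_snr_db`, and `pcm_bit_rate` are all assumptions chosen for illustration.

```python
import math


def quantize(x, n_bits):
    """Uniform quantizer for x in [-1, 1) using 2**n_bits levels (assumed design)."""
    levels = 2 ** n_bits
    step = 2.0 / levels
    idx = round(x / step)
    # Clamp to the signed integer range of an n_bits code word.
    idx = max(-(levels // 2), min(levels // 2 - 1, idx))
    return idx * step


def quantization_snr_db(n_bits, n_samples=48000, freq_hz=997.0, rate_hz=48000.0):
    """Measure signal-to-quantization-noise ratio (dB) for a sine test tone."""
    # Scale the tone so its peak sits on the highest positive level (no clipping).
    amp = 1.0 - 2.0 / 2 ** n_bits
    sig_power = 0.0
    noise_power = 0.0
    for n in range(n_samples):
        x = amp * math.sin(2.0 * math.pi * freq_hz * n / rate_hz)
        err = x - quantize(x, n_bits)  # quantization noise sample
        sig_power += x * x
        noise_power += err * err
    return 10.0 * math.log10(sig_power / noise_power)


def pcm_bit_rate(channels, rate_hz, bits):
    """Raw PCM bit rate in bits per second, before any overhead."""
    return channels * rate_hz * bits


if __name__ == "__main__":
    for bits in (3, 4, 5, 8, 16):
        print(bits, 2 ** bits, round(quantization_snr_db(bits), 1))
    print(pcm_bit_rate(2, 44100, 16))  # 1411200 bits/s, i.e. about 1.41 Mb/s
```

For a near-full-scale sine, the measured SNR should grow by roughly 6 dB per added bit, in line with the textbook approximation SNR ≈ 6.02N + 1.76 dB.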

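
The rule on the Masking/Bit Allocation slide (use the fewest bits whose quantization noise stays below the sub-band's minimum masking threshold) can be sketched as follows. This is an illustrative simplification, not the MPEG-1 allocation procedure: it assumes each added bit lowers the quantization noise by about 6.02 dB relative to the sub-band signal level, and the names `bits_for_subband`, `allocate_bits`, and the `max_bits` cap are hypothetical.

```python
import math

DB_PER_BIT = 6.02  # assumed noise reduction per added bit (rule of thumb)


def bits_for_subband(signal_db, mask_db, max_bits=15):
    """Smallest bit count whose quantization noise falls below the mask.

    signal_db: sub-band signal level in dB
    mask_db:   minimum masking threshold for that sub-band in dB
    """
    required_snr_db = signal_db - mask_db  # signal-to-mask ratio
    if required_snr_db <= 0:
        # Assumption: a sub-band lying under its own threshold is inaudible,
        # so it gets no bits at all.
        return 0
    return min(max_bits, math.ceil(required_snr_db / DB_PER_BIT))


def allocate_bits(signal_levels_db, mask_thresholds_db):
    """Apply the per-sub-band rule independently to every sub-band."""
    return [bits_for_subband(s, m)
            for s, m in zip(signal_levels_db, mask_thresholds_db)]


if __name__ == "__main__":
    # Three hypothetical sub-bands at 60 dB needing 24 dB, 4 dB, and no SNR.
    print(allocate_bits([60.0, 60.0, 60.0], [36.0, 56.0, 70.0]))  # [4, 1, 0]
```

The 24 dB and 4 dB cases mirror the tone-masker and noise-masker SNRs quoted on the asymmetry slide: under the stated assumption they cost 4 bits and 1 bit respectively, which is why the type of masker matters so much for the bit budget.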

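
The multiplied-noise stimulus is described as a 1 kHz sinusoid multiplied by a low-pass Gaussian noise. A minimal sketch of that construction is given below. The one-pole low-pass filter, the 50 Hz cutoff, the 48 kHz sample rate, and the names `lowpass_gaussian_noise` and `multiplied_noise` are assumptions made for illustration; the lab's actual filter and parameters are not stated in the slides.

```python
import math
import random


def lowpass_gaussian_noise(n, cutoff_hz, rate_hz, seed=0):
    """White Gaussian noise smoothed by a one-pole low-pass filter (assumed)."""
    rng = random.Random(seed)
    a = math.exp(-2.0 * math.pi * cutoff_hz / rate_hz)  # pole location
    y = 0.0
    out = []
    for _ in range(n):
        y = a * y + (1.0 - a) * rng.gauss(0.0, 1.0)
        out.append(y)
    return out


def multiplied_noise(n, carrier_hz=1000.0, cutoff_hz=50.0,
                     rate_hz=48000.0, seed=0):
    """Sinusoidal carrier multiplied sample by sample with low-pass noise."""
    modulator = lowpass_gaussian_noise(n, cutoff_hz, rate_hz, seed)
    return [m * math.sin(2.0 * math.pi * carrier_hz * k / rate_hz)
            for k, m in enumerate(modulator)]


if __name__ == "__main__":
    x = multiplied_noise(4800)  # 100 ms at the assumed 48 kHz rate
    print(len(x), max(abs(v) for v in x))
```

Because the noise only scales the carrier, the product is zero wherever the 1 kHz sinusoid is zero, so the fine structure stays locked to the carrier (apart from sign flips of the modulator) while the amplitude fluctuates like noise. This matches the slide's summary: amplitude from Gaussian noise, phase from a pure tone.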