ROCHESTER PHY 103 - Lecture Notes - Psycho-acoustics and MP3 Audio Encoding

This preview shows page 1-2-16-17-18-33-34 out of 34 pages.

Slide titles:
• Psycho-acoustics and MP3 audio encoding
• MP3
• Auditory Coding
• Encoding and Decoding
• Lossy vs Lossless compression
• Adding noise
• Easy chops:
• Bad ways to compress an audio file
• Masking
• Definition of masking
• Slide 11
• Critical band width as a function of frequency
• Critical band concept
• The nature of the auditory filter
• Physiological reasons for the masking
• Pitch perception
• Pitch perception vs masking
• Temporal effects - non-simultaneous masking
• Physiological explanations for temporal masking
• Comodulation masking release
• Perception of loudness Just noticeable difference
• JND as a function of loudness
• Loudness and the Critical Band
• Outside the critical band
• Pitch information area for complex tones
• Pitch depends on partial pitches
• Timbre depends on frequency
• MP3 schematic
• Slide 29
• 13 dB miracle
• Slide 31
• Pushing MP3 to its limits
• Limits of MP3
• Auditory illusions

Psycho-acoustics and MP3 audio encoding
Physics of Music PHY103

MP3
• MPEG is the Moving Picture Experts Group, set up by ISO (the International Organization for Standardization). Every few years it issues a standard: MPEG-1 (1992), MPEG-2 (1994), ...
• MP3 stands for MPEG Audio Layer III.
• Longer history: the age of photo and video compression started in part with audio compression experiments in the late ’80s.

Auditory Coding
1) Time-frequency decomposition: divide the signal into pieces and obtain the spectrum of each piece.
2) Use a psycho-acoustic masking model to determine what information to keep.
3) Store the information in the most compact way possible: minimize the bitrate and maximize the audible content.
4) System of synchronization.

Encoding and Decoding
Encoding:
• The audio signal (from a recording) is coded into an mp3 file containing carefully stored spectral information.
Decoding:
• The mp3 file is turned back into an audio signal that can be output to your speakers.
Streaming:
• This can be done in real time even if you don’t have the entire file.

Lossy vs Lossless compression
• Compression: store in a very compact format, more compact than the original audio file.
• Lossless compression means no information is removed.
• MP3 is a lossy type of compression: information is lost during compression. Only inaudible information should be removed. Whether expert listeners can hear differences, and how much precision is enough, is a topic of current research.
• MP3 achieves a 10:1 compression ratio!
• This enables bit-streaming and makes stored audio very compact.

Adding noise
• Rather than removing information, MP3 adds noise. This is done by describing the signal with degraded digital precision.
• If you fail to digitize something sufficiently accurately, this is equivalent to adding noise.
• The added noise should be inaudible: it is below the masking threshold.

Easy chops:
• Don’t bother storing information outside the range of hearing (taken here as 40 Hz to 15 kHz; the nominal limits of human hearing are about 20 Hz to 20 kHz).
• Stereo information is not stored for low frequencies.

Bad ways to compress an audio file
• Reduce the total number of bits per sample (e.g. 32 bit to 16, or 16 to 8 bit). This gives you a factor of 2 in compression. However, you get a noisier signal.
• Reduce the sampling rate (44 kHz to 22 kHz, or 22 kHz to 11 kHz). Total loss of all high-frequency information. Again only a gain of a factor of 2 in size. Equivalent to a low-pass filter.
• A factor of 10:1 in compression cannot be achieved using linear compression schemes.

Masking
If a dominant tone is present, then noise can be added at frequencies next to it and this noise will not be heard. Less precision is required to store nearby frequencies.
[Figure labels: 13 dB, critical band]

Definition of masking
• The process by which the threshold of audibility for one sound is raised by the presence of another (masking) sound.
• The amount by which the threshold is raised by the masker (in dB).

A sine wave (the signal) is presented in noise whose band of frequencies is centered on the signal. The wider the noise bandwidth, the more the signal is masked. Past a particular bandwidth, the masking doesn’t increase; that bandwidth is the critical band.
[Figure labels: critical band]

Critical band width as a function of frequency
The size of the critical band is typically one tenth of the frequency.

Critical band concept
• Only a narrow band of frequencies surrounding the tone (those within the critical band) contributes to masking of the tone.
• When the noise just masks the tone, the power of the tone divided by the power of the noise inside the band is a constant.

The nature of the auditory filter
• The auditory filter is not necessarily square; it is actually closer to a triangle in shape.
• Critical band width is sometimes referred to as ERB (equivalent rectangular bandwidth).
• The shape is difficult to measure in psychoacoustic experiments because of side-band listening effects. Some innovative experiments (notched filtered noise + signal) were designed to measure the actual shape of the filter.

Physiological reasons for the masking
• Basilar membrane? The critical bandwidths at different frequencies correspond to fixed distances along the basilar membrane.
• However, the masking could instead be a result of feedback in the neuron firing: negative reinforcement or suppression of signals, or swamping of signals.

Pitch perception
Ability to discriminate a change in frequency as a function of pulse duration.
DLF (Difference Limen for Frequency) given in % of the central frequency.

Pitch perception vs masking
• Note that our ability to detect pitch changes is at the level of 0.25%, well below the width of the critical band.
• This precision requires active hair cell/basilar membrane interactions in the cochlea.

Temporal effects - non-simultaneous masking
• The peak ratio of the masker is important, that is, its variations in volume as a function of time compared to its rms value. Short loud peaks don’t necessarily contribute to the masking as much as a continuous noise.
• Both forward and backward masking occur: masking can occur even if a loud masker is played just after the signal!
• Masking decays to 0 after 100-200 ms.

Physiological explanations for temporal masking
• The basilar membrane is ringing, preventing detection in that region for a particular time.
• Neurons take a while to recover (neural fatigue).

Comodulation masking release
• A masked signal, if comodulated with frequencies outside the critical band, can be detected below the masking threshold.
• This works in the same way that the overtones/spectrum are used to identify a sound. Sounds outside the critical band, since they are modulated the same as the signal, are used to pull it out (detect it) from more than one critical band region.

Perception of loudness
Just
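
The compression factors quoted in the "Lossy vs Lossless compression" and "Bad ways to compress an audio file" slides can be checked with a little arithmetic. The sketch below is only illustrative: the reference format (44.1 kHz, 16 bit, stereo PCM) and the function name are assumptions, not something the slides specify.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0


# Assumed reference format (not stated in the slides): CD-style stereo PCM.
cd = pcm_bitrate_kbps(44_100, 16, 2)

# The two "bad" schemes: each only halves the amount of data.
half_bits = pcm_bitrate_kbps(44_100, 8, 2)    # 16 bit -> 8 bit
half_rate = pcm_bitrate_kbps(22_050, 16, 2)   # 44.1 kHz -> 22.05 kHz

# Bitrate implied by the 10:1 ratio quoted for MP3.
ten_to_one = cd / 10.0

print(f"uncompressed PCM : {cd:8.1f} kbps")
print(f"half the bits    : {half_bits:8.1f} kbps (factor {cd / half_bits:.0f})")
print(f"half the rate    : {half_rate:8.1f} kbps (factor {cd / half_rate:.0f})")
print(f"10:1 compression : {ten_to_one:8.1f} kbps")
```

Under these assumptions, each naive scheme saves only a factor of 2, while a 10:1 ratio brings the stream down to roughly 140 kbps.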

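
The "Adding noise" slide says that digitizing with too little precision is equivalent to adding noise. A minimal sketch of that idea follows, assuming a simple uniform rounding quantizer applied to a full-scale sine wave. This is not how an MP3 encoder allocates bits; the helper names, test tone and sample count are all hypothetical choices for illustration.

```python
import math


def quantize(x, bits):
    """Round x (in [-1, 1]) to a uniform grid with step 2 / 2**bits."""
    step = 2.0 / (2 ** bits)
    return round(x / step) * step


def snr_db(bits, n_samples=10_000, cycles=37.3):
    """Signal-to-noise ratio, in dB, of a sine wave quantized to `bits` bits."""
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n_samples):
        x = math.sin(2.0 * math.pi * cycles * i / n_samples)
        error = quantize(x, bits) - x   # the "added noise"
        signal_power += x * x
        noise_power += error * error
    return 10.0 * math.log10(signal_power / noise_power)


for bits in (16, 12, 8):
    print(f"{bits:2d} bits: SNR is about {snr_db(bits):.1f} dB")
```

Each bit removed raises the noise floor by roughly 6 dB, which is why halving the word length gives a noticeably noisier signal for only a factor of 2 in size. The masking model is what lets an encoder place this kind of noise where it will not be heard.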

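
The rule of thumb from the "Critical band width as a function of frequency" slide (about one tenth of the frequency) and the 0.25% pitch resolution from "Pitch perception vs masking" can be put side by side numerically. For comparison, the sketch also evaluates the Glasberg and Moore ERB approximation, ERB = 24.7 (4.37 f / 1000 + 1) Hz. That formula does not appear in the slides and is brought in here as an outside assumption, and the function names are hypothetical.

```python
def critical_band_rule_hz(f_hz):
    """Slide rule of thumb: critical band is about one tenth of the frequency."""
    return 0.1 * f_hz


def erb_hz(f_hz):
    """Glasberg and Moore ERB approximation (an assumption, not from the slides)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)


def pitch_jnd_hz(f_hz):
    """Smallest detectable frequency change, using the 0.25% figure from the slides."""
    return 0.0025 * f_hz


for f in (250.0, 1000.0, 4000.0):
    print(
        f"{f:6.0f} Hz: rule {critical_band_rule_hz(f):6.1f} Hz, "
        f"ERB {erb_hz(f):6.1f} Hz, pitch JND {pitch_jnd_hz(f):5.2f} Hz"
    )
```

At every frequency the detectable pitch change is far smaller than either bandwidth estimate, which is the contrast the "Pitch perception vs masking" slide draws. The two bandwidth estimates are of the same order at mid and high frequencies and drift apart at low frequencies, so the one-tenth rule is best read as a rough guide.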