Proceedings of the 14th International Conference on Auditory Display, Paris, France, June 24-27, 2008

SPHERICAL MICROPHONE ARRAY BASED IMMERSIVE AUDIO SCENE RENDERING

Adam M. O'Donovan, Dmitry N. Zotkin, Ramani Duraiswami
Perceptual Interfaces and Reality Laboratory, Computer Science & UMIACS, University of Maryland, College Park

ABSTRACT

In many applications, such as entertainment, education, military training, remote telepresence, and surveillance, it is necessary to capture an acoustic field and present it to listeners with the goal of creating the same acoustic perception for them as if they were actually present at the scene. Recently, there has been much interest in the use of spherical microphone arrays for acoustic scene capture and reproduction. We describe a 32-microphone spherical-array-based system implemented for spatial audio capture and reproduction. Our array embeds hardware that is traditionally external, such as preamplifiers, filters, analog-to-digital converters, and a USB adaptor, resulting in a portable, lightweight solution requiring no hardware on the PC side other than a high-speed USB port. We provide a capability analysis of the array and describe the software suite developed for the application.

1. INTRODUCTION

An important problem related to spatial audio is the capture and reproduction of arbitrary acoustic fields. When a human listens to an audio scene, much information is extracted by the brain from the audio streams, including the number of competing foreground sources, their directions, environmental characteristics, the presence of background sources, etc. It would be beneficial for many applications if such an arbitrary acoustic scene could be captured and reproduced with perceptual accuracy. Since the audio signals received at the ears change with listener motion, the same effect should be present in the rendered scene; this can be achieved either with a loudspeaker array that attempts to recreate the whole scene in a region, or with a head-tracked headphone setup that does so for an individual listener. We focus on headphone presentation.

The key property required from the acoustic scene capture algorithm is the ability to preserve the directionality of the field so that those directional components can be rendered properly later. While the recording of an acoustic field with a single microphone faithfully preserves the variations in acoustic pressure at the point where the recording was made (assuming an omnidirectional microphone), it is impossible to infer the directional structure of the field from that recording.

A microphone array can be used to infer directionality from sampled spatial variations of the acoustic field. One of the earlier attempts to do that was the use of the Ambisonics technique and the Soundfield microphone [1] to capture the acoustic field and its three first-order derivatives along the coordinate axes. While a certain sense of directionality can be achieved with Ambisonics reproduction, the reproduced sound field is only a rough approximation of the original one: Ambisonics reproduction includes only the first-order spherical harmonics, while accurate reproduction would require an order of about 10 for frequencies up to 8-10 kHz.
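To make the order requirement concrete, the sketch below evaluates the common rule of thumb N ≈ ⌈ka⌉, where k = 2πf/c is the wavenumber and a is the radius of the region over which the field is reconstructed. The 6.5 cm radius is an assumed, roughly head-sized value chosen for illustration; it is not a parameter taken from the paper.

```python
import math

def required_sh_order(freq_hz: float, radius_m: float, c: float = 343.0) -> int:
    """Rule-of-thumb spherical-harmonic truncation order N ~ ceil(k * a),
    with wavenumber k = 2*pi*f/c and reconstruction-region radius a."""
    k = 2.0 * math.pi * freq_hz / c
    return math.ceil(k * radius_m)

# Head-sized region (radius 6.5 cm -- an assumed value for illustration):
for f in (1_000, 4_000, 8_000, 10_000):
    print(f"{f:>6} Hz -> order {required_sh_order(f, 0.065)}")
```

Under these assumptions the rule yields order 10 at 8 kHz and order 12 at 10 kHz, consistent with the figure quoted above, while first-order (Ambisonics) capture is accurate only below roughly 900 Hz.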
Recently, researchers have turned to using spherical microphone arrays [2, 3] for spatial-structure-preserving acoustic scene capture. These arrays exhibit a number of properties making them especially suitable for this application, including omnidirectionality, a beamforming pattern independent of the steering direction, an elegant mathematical framework for digital beam steering, and the ability to utilize wave scattering off the spherical support to improve directionality. Once the directional components of the field are found, they can be used to present the acoustic field to the listener by rendering those components so that they appear to arrive from the appropriate directions. Such rendering can be done using traditional virtual audio methods, i.e., filtering with the head-related transfer function (HRTF) [21]. For perceptual accuracy, the HRTF of the listener must be used.
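The rendering step can be illustrated as follows: each beamformed directional stream is filtered with the HRTF pair (as head-related impulse responses, HRIRs) for its direction and the results are summed at the two ears. The data layout and names below are hypothetical, not the paper's actual data structures; only standard NumPy/SciPy calls are used.

```python
import numpy as np
from scipy.signal import fftconvolve

def render_binaural(components, hrir_left, hrir_right):
    """Render beamformed directional components over headphones.

    components       : dict mapping direction index -> mono signal (1-D array)
    hrir_left/right  : dicts mapping the same indices -> HRIR (1-D array)
    Returns an (n_samples, 2) stereo buffer.
    """
    max_h = max(max(len(h) for h in hrir_left.values()),
                max(len(h) for h in hrir_right.values()))
    n = max(len(s) for s in components.values()) + max_h - 1
    out = np.zeros((n, 2))
    for d, sig in components.items():
        # Filter each directional stream with the listener's HRIR pair
        # for that direction, then accumulate at the two ears.
        l = fftconvolve(sig, hrir_left[d])
        r = fftconvolve(sig, hrir_right[d])
        out[:len(l), 0] += l
        out[:len(r), 1] += r
    return out
```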
There exist other recently published methods for capturing and reproducing spatial audio scenes. One of them is Motion-Tracked Binaural Sound (MTB) [4], where a number of microphones are mounted on the equator of an approximately head-sized sphere, and the left and right channels of the headphones worn by the user are "connected" to the microphone signals, interpolating between adjacent positions as necessary based on the current head-tracking data. The MTB system successfully creates the impression of presence and responds properly to user motion. However, individual HRTFs are not incorporated, and the sounds rendered are limited to the equatorial plane only.
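The interpolation can be sketched as a crossfade between the two microphones nearest each ear position as the head turns. The uniform equatorial layout and the linear weighting below are illustrative assumptions; [4] describes the actual scheme.

```python
import numpy as np

def mtb_ear_signal(mic_signals: np.ndarray, head_azimuth_deg: float,
                   ear_offset_deg: float) -> np.ndarray:
    """Interpolate the signal at one ear position on the MTB sphere.

    mic_signals      : (M, n_samples) array; microphone m sits at
                       azimuth 360*m/M degrees on the equator.
    head_azimuth_deg : current head-tracker azimuth.
    ear_offset_deg   : +90.0 for the left ear, -90.0 for the right.
    """
    M = mic_signals.shape[0]
    pos = ((head_azimuth_deg + ear_offset_deg) % 360.0) / (360.0 / M)
    i = int(pos) % M       # nearest microphone below the ear position
    j = (i + 1) % M        # its neighbor
    w = pos - int(pos)     # linear crossfade weight
    return (1.0 - w) * mic_signals[i] + w * mic_signals[j]

# left  = mtb_ear_signal(mics, azimuth, +90.0)
# right = mtb_ear_signal(mics, azimuth, -90.0)
```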
Another capture and reproduction approach is Wave Field Synthesis (WFS) [5, 6]. In WFS, a sound field incident to a "transmitting" area is captured at the boundary of that area and is fed to an array of loudspeakers arranged similarly on the boundary of a "receiving" area, creating a field in the "receiving" area equivalent to that in the "transmitting" area. This technique is very powerful, primarily because it can reproduce the field over a large area, enabling the user to wander off the reproduction "sweet spot"; however, proper field sampling requires an extremely large number of microphones/speakers, and most implementations focus on sources that lie approximately in a horizontal plane.

We present the results of a recent research project for portable auditory scene capture and reproduction, in which a compact 32-channel microphone array with a direct digital interface to the computer via a standard USB 2.0 port was developed. We have also developed a software package to support data capture from the array and scene reproduction with individualized HRTFs and head tracking. The developed system is omnidirectional and supports arbitrary wave-field reproduction (e.g., with elevated or overhead sources). We describe the theory and the algorithms behind the developed hardware and software, the design of the array, the experimental results obtained, and the capabilities and limitations of the array.

2. BACKGROUND

In this section, we describe the basic theory and introduce the notation used in the rest of the paper.

2.1. Acoustic field representation

Any regular acoustic field in a volume is subject to the wave equation, which for time-harmonic fields reduces to the Helmholtz equation.
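As context for this section, here is a minimal statement of the standard spherical representation that such treatments develop (textbook material consistent with [2, 3]; the symbols C_n^m, j_n, and Y_n^m are standard notation assumed here, not recovered from the preview):

```latex
% Time-harmonic acoustic potential \psi at wavenumber k = 2\pi f / c
% satisfies the Helmholtz equation
\nabla^2 \psi + k^2 \psi = 0,
% and a field regular inside a source-free region admits the expansion
\psi(k, r, \theta, \varphi)
  = \sum_{n=0}^{\infty} \sum_{m=-n}^{n}
    C_n^m(k)\, j_n(kr)\, Y_n^m(\theta, \varphi),
% where j_n are spherical Bessel functions and Y_n^m spherical harmonics;
% in practice the sum is truncated at order N \approx \lceil k a \rceil
% for a region of radius a, matching the rule of thumb used earlier.
```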