MIT HST 722 - Cross-Modal Integration of Speech

Behavioral/Systems/Cognitive

Perceptual Fusion and Stimulus Coincidence in the Cross-Modal Integration of Speech

Lee M. Miller (1) and Mark D'Esposito (2)

(1) Section of Neurobiology, Physiology, and Behavior and Center for Mind and Brain, University of California, Davis, California 95616, and (2) Helen Wills Neuroscience Institute, University of California, Berkeley, California 94720

Human speech perception is profoundly influenced by vision. Watching a speaker's mouth movements significantly improves comprehension, both for normal listeners in noisy environments and especially for the hearing impaired. A number of brain regions have been implicated in audiovisual speech tasks, but little evidence distinguishes them functionally. In an event-related functional magnetic resonance imaging study, we differentiate neural systems that evaluate cross-modal coincidence of the physical stimuli from those that mediate perceptual binding. Regions consistently involved in perceptual fusion per se included Heschl's gyrus, superior temporal sulcus, middle intraparietal sulcus, and inferior frontal gyrus. Successful fusion elicited activity biased toward the left hemisphere, whereas failed cross-modal binding recruited regions in both hemispheres. A broad network of other areas, including the superior colliculus, anterior insula, and anterior intraparietal sulcus, was more involved with evaluating the spatiotemporal correspondence of speech stimuli, regardless of a subject's perception. All of these showed greater activity to temporally offset stimuli than to audiovisually synchronous stimuli. Our results demonstrate how elements of the cross-modal speech integration network differ in their sensitivity to physical reality versus perceptual experience.

Key words: cross modal; audiovisual; multisensory; speech; binding; fMRI

Introduction

Merging information from different senses confers distinct behavioral advantages, enabling faster and more accurate discrimination than with unimodal stimuli (Hershenson, 1962; Morrell, 1968; Stein et al., 1989; Perrott et al., 1990; Hughes et al., 1994; Frens et al., 1995), especially when the signals are degraded (Sumby and Pollack, 1954; MacLeod and Summerfield, 1987; Perrott et al., 1991; Benoit et al., 1994). To realize these advantages, the brain continually coordinates sensory inputs across the audiovisual (Calvert et al., 2000; Grant and Seitz, 2000; Shams et al., 2002; Callan et al., 2003), visual–tactile (Banati et al., 2000; Macaluso et al., 2000; Stein et al., 2001), and audiosomatic (Schulz et al., 2003) domains and combines them into coherent perceptions. With speech, an instance of paramount behavioral importance, vision strongly influences auditory perception even at the basic level of the phoneme (McGurk and MacDonald, 1976). Watching a speaker's mouth movements improves comprehension, especially for normal listeners in noisy environments and for the hearing impaired (Sumby and Pollack, 1954; Grant et al., 1998; Sekiyama et al., 2003).

Although the psychophysics of cross-modal speech has a long history, relatively few studies address the neural substrates of combining auditory and visual speech information (for review, see Calvert, 2001). Nonetheless, among human imaging studies, a number of brain regions have repeatedly been implicated in cross-modal integration, particularly of speech and other audiovisual stimuli.
These include high-level associative or integrative cortices such as the superior temporal sulcus (STS), intraparietal sulcus (IPS), inferior frontal gyrus (IFG), and insula, as well as subcortical or traditionally unimodal regions such as the superior colliculus (SC), the MT/V5 complex, and Heschl's gyrus (Calvert et al., 1999, 2000, 2001; Callan et al., 2001, 2003, 2004; Olson et al., 2002; Möttönen et al., 2004; Beauchamp et al., 2004; Pekkola et al., 2005). Given their repeated identification across multiple, well-controlled studies, these brain areas almost certainly play some integral part in processing cross-modal speech, although their functional roles in this complex task are essentially unknown.

In this study, we identify the large-scale functional networks devoted to two separable processes during cross-modal speech integration: the sensory comparison of auditory and visual stimulus attributes and the actual perception of a unified cross-modal event. We hypothesize that distinct networks of brain regions are preferentially sensitive to each process. Event-related functional magnetic resonance imaging (fMRI) allows us to explore this basic distinction between physical and experiential aspects of a task, or between sensory correspondence and perceptual fusion. Subjects were presented with audiovisual speech in which the auditory and visual signals occurred either synchronously or offset in time, approximating real-life noisy and reverberant conditions. For each utterance, the subject indicated whether the audio and video were fused as a single perceptual event or experienced as successive in time. Stimulus properties were dissociated from perceptual experience by adjusting the audiovisual temporal offset. In this way, we could statistically assess brain activity related to the evaluation of spatiotemporal correspondence independently from brain activity related to perceptual binding.

Received Dec. 9, 2004; revised April 18, 2005; accepted May 16, 2005.
This work was supported by a grant from the National Institutes of Health–National Institute on Deafness and Other Communication Disorders. We thank Ben Inglis for technical assistance with the scanner.
Correspondence should be addressed to Dr. Lee M. Miller, Section of Neurobiology, Physiology, and Behavior, University of California, One Shields Avenue, Davis, CA 95616. E-mail: [email protected]
DOI:10.1523/JNEUROSCI.0896-05.2005
Copyright © 2005 Society for Neuroscience
The Journal of Neuroscience, June 22, 2005, 25(25):5884–5893

Materials and Methods

Subjects. Seventeen healthy subjects (11 females; age, 18–33 years) gave written consent according to procedures approved by the University of California. All were right-handed, were native English speakers with self-reported normal hearing, had normal or corrected vision, and had at least 12 years of education. None of the participants had a history of neurological or psychiatric disease, nor were they using any medications during the 3 months before the experiment. As described below, all results reported are for the 11
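The analytic step described at the end of the Introduction, assessing stimulus coincidence independently of perceptual binding, can be made concrete with a small modeling sketch. The Python code below is not the authors' analysis; it is a minimal, hypothetical illustration of how an event-related GLM might code each trial both by its physical audio-video offset and by the subject's fusion report, so that the two effects can be contrasted separately. The repetition time, run length, trial timings, and hemodynamic response function are all assumed values.

```python
# Minimal sketch (not the authors' code): separate HRF-convolved regressors for each
# cell of a (stimulus offset) x (perceptual fusion) event-related design, so that
# stimulus-coincidence and perceptual-fusion contrasts can be estimated independently.
# TR, run length, trial timings, and HRF parameters are assumed for illustration.

import numpy as np
from scipy.stats import gamma

TR = 2.0                       # assumed repetition time (s)
N_SCANS = 120                  # assumed volumes per run
frame_times = np.arange(N_SCANS) * TR

# Simple double-gamma hemodynamic response function (illustrative parameters).
t = np.arange(0.0, 32.0, TR)
hrf = gamma.pdf(t, 6.0) - (1.0 / 6.0) * gamma.pdf(t, 16.0)

def regressor(onsets_s):
    """Delta train at the given onsets, convolved with the HRF, one value per scan."""
    deltas = np.zeros(N_SCANS)
    deltas[np.searchsorted(frame_times, onsets_s)] = 1.0
    return np.convolve(deltas, hrf)[:N_SCANS]

# Hypothetical trials: (onset in s, stimulus physically offset?, reported as fused?)
trials = [(10.0, False, True), (28.0, True, True), (46.0, True, False),
          (64.0, False, True), (82.0, True, False), (100.0, False, False),
          (118.0, True, True), (136.0, True, False)]

onsets = np.array([on for on, _, _ in trials])
offset = np.array([off for _, off, _ in trials])
fused = np.array([fu for _, _, fu in trials])

# One regressor per cell of the design, plus a constant baseline.
X = np.column_stack([
    regressor(onsets[offset & fused]),      # offset stimulus, fused percept
    regressor(onsets[offset & ~fused]),     # offset stimulus, unfused percept
    regressor(onsets[~offset & fused]),     # synchronous stimulus, fused percept
    regressor(onsets[~offset & ~fused]),    # synchronous stimulus, unfused percept
    np.ones(N_SCANS),                       # baseline
])

# Fit one voxel's time series (random noise here as a stand-in for real data).
y = np.random.randn(N_SCANS)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Coincidence contrast: offset vs synchronous stimuli, averaged over the percept.
coincidence_effect = (beta[0] + beta[1]) / 2 - (beta[2] + beta[3]) / 2
# Fusion contrast: fused vs unfused percepts, averaged over the physical stimulus.
fusion_effect = (beta[0] + beta[2]) / 2 - (beta[1] + beta[3]) / 2
print(f"coincidence effect: {coincidence_effect:+.3f}, fusion effect: {fusion_effect:+.3f}")
```

In practice, the leverage for such a dissociation comes from trials in which the same physical offset is sometimes perceived as fused and sometimes as successive; the sketch only shows the general regression logic, not the study's actual model.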

