New version page

Discovering a Semantic Basis of Neural Activity Using Simultaneous Sparse Approximation

This preview shows page 1 out of 3 pages.

View Full Document
View Full Document

End of preview. Want to read all 3 pages?

Upload your study docs or become a GradeBuddy member to access this document.

View Full Document
Unformatted text preview:

Discovering a Semantic Basis of Neural Activity Using SimultaneousSparse ApproximationKeywords: sparsity, simultaneous sparse approximation, multi-task feature learning, simultaneous variableselection, functional magnetic resonance imaging, fMRIMark Palatucci [email protected] Mitchell [email protected] Liu [email protected] Mellon University, Pittsburgh, PA 15213 USAAbstractWe consider the problem of predicting brainactivation in response to arbitrary words inEnglish. Whereas previous computationalmodels have encoded words using prede-fined sets of features, we formulate a modelthat can automatically learn features directlyfrom data. We show that our model re-duces to a simultaneous sparse approxima-tion problem and show two examples wherelearned features give insight about how thebrain represents meanings of words.1. IntroductionOver the last several years, researchers have designedalgorithms to learn the complex patterns of brain (neu-ral) activity from data generated by functional mag-netic resonance imaging (fMRI). These algorithms areoften called cognitive state classifiers and can be usedto discriminate between different mental states. Forexample, one study has shown it is possible to distin-guish between categories of objects a person is thinkingabout simply by observing an image of his/her neuralactivity (Mitchell et al., 2004). Others have shownit is possible to determine lies from truth (Davatzikoset al., 2005) and whether someone is a Democrat orRepublican (Kaplan et al., 2007).While a large literature has developed around cogni-tive state classification (which maps neural activityto cognitive states), little attention has been givento the inverse problem: is it possible to predict neu-Preliminary work. Under review by the International Con-ference on Machine Learning (ICML). Do not distribute.ral activity for a novel state? One recent study fromKay (2008) predicts the neural activity in the visualcortex in response to viewing a novel scene.Another study from Mitchell (2008) predicts neuralactivity in response to thinking about an arbitraryword in English. In this work, the semantic mean-ing of a word is encoded by co-occurrence statisticswith other words in a very large text corpus. Using asmall number of training words, a generative model islearned that maps these co-occurrence statistics to im-ages of neural activity recorded while thinking aboutthose words. Their model can then predict images fornew words that were not included in the training set.The model shows predicted images that are similar toobserved images for those words.In their initial model each word is encoded by a vec-tor of co-occur rences with 25 sensory-action verbs(e.g. eat, ride, wear). For example, words related tofoods such as “apples” and “oranges” would have fre-quent co-occurrences with the word “eat” but few co-occurrences with the word “wear”. Conversely, wordsrelated to clothes such as “shirt” or “dress” would co-occur frequently with the word “wear” but not theword “eat”. Thus “eat” and “wear” are example basiswords used to encode relationships of a br oad set ofother words.These 25 sensory-action verbs were chosen based ondomain knowledge from the cognitive neuroscience lit-erature and are considered a semantic basis of latentword meaning. A natural question is:What is the optimal basis of words to represent seman-tic meaning across many concepts?Rather than relying on models that require a predeter-mined set of words, our research tries to build modelsDiscovering a Semantic Basis of Neural Activity Using Simultaneous Sparse Approximationthat will perform automatic variable selection to learna semantic basis of word meaning. We want to learnmodels that not only predict neural activity well, butalso give insights into how the brain represents themeaning of different concepts.1.1. Related WorkRegression models such as the L1regularized Lasso(Tibshirani, 1996) have been used successfully to per-form variable selection. The typical model usually in-volves a large number of explanatory variables (fea-tures) and a single response variable. The Lasso willyield the best prediction of the response using only asmall number of variables.Recently, some attempts have been made to performmultiple response variable selection which will selecta small number of variables that can explain multi-ple responses well. In statistics this is known as thesimultaneous lasso (Turlach et al., 2005). The prob-lem has also been addressed in machine learning asmulti-task feature learning (Argyriou et al., 2007) andin signal processing as simultaneous sparse approxima-tion (Tropp, 2006).Several methods have formulated and solved the prob-lem using convex programming but the current ap-proaches seem very limited in scale. To our knowl-edge, there is no formulation that can solve problemswith thousands of responses and thousands of explana-tory variables. An alternative approach that is moretractable and does not involve convex programmingis the greedy pursuit method simultaneous orthogonalmatching pursuit (SOMP) (Tropp et al., 2006). Neuralnetworks also offer a convenient formulation for mul-tiple outputs and could be used with regularizationconstraints.2. Problem FormulationWe can formulate the multiple response variable selec-tion problem as a convex program. Let N be the num-ber of examples, T be the number of responses, d bethe number of explanatory variables. Let Y ∈ ℜNxTbe the matrix of response variables and X ∈ ℜNxdbeour design matrix of explanatory variables. Our ob-jective then is to find a sparse matrix of coefficientsB ∈ ℜdxT. Let λ be a regularization parameter thatcontrols the row sparsity of the matrix. Formally,bB = argminB||Y − XB||2F+ λdXimaxj∈[1...T ]|Bij| (1)Our problem of discovering a semantic basis of neu-ral activation can be formulated using Equation (1).Output weights of 3 distinct hidden units Figure 1. Weights on the output units for 3 hidden unitsin a 2-layer neural network. Regions with high weights aremarked in red.In our problem, we have a design matrix X whichis the co-occurrences of N = 60 training words withd = 50, 000 other English words. Each column of theresponse matrix Y contains the neural activations for asingle voxel (volume-element) across the N fMRI im-ages. There are T = 20, 000 voxels. Our goal is tofind a small number of words in X that can accuratelypredict the neural activity across multiple


Loading Unlocking...
Login

Join to view Discovering a Semantic Basis of Neural Activity Using Simultaneous Sparse Approximation and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Discovering a Semantic Basis of Neural Activity Using Simultaneous Sparse Approximation and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?