Machine Learning 10-701
Tom M. Mitchell, Machine Learning Department, Carnegie Mellon University
March 31, 2011

Today: Learning representations III
Readings:
- Deep Belief Networks
- ICA, CCA
- Neuroscience example
- Latent Dirichlet Allocation

Deep Belief Networks [Hinton & Salakhutdinov, Science, 2006]

Problem: training networks with many hidden layers doesn't work very well
- local minima
- very slow training if initialized with zero weights

Deep belief networks: autoencoder networks to learn low-dimensional encodings, but with more layers, to learn better encodings.

[Figure: an original image, reconstructed from a 2000-1000-500-30 DBN and from 30-dimensional linear PCA; logistic transformations versus linear transformations.]

Encoding of digit images in two dimensions [Hinton & Salakhutdinov, 2006]

[Figure: a 784-to-2 linear encoding (PCA) versus a 784-1000-500-250-2 DBNet.]
(A scaled-down version of the PCA encoding is sketched at the end of these notes.)

Restricted Boltzmann Machine
- Bipartite graph, logistic activation.
- Inference: fill in any nodes, estimate the other nodes.
- Here, consider the v_i and h_j to be boolean variables.

[Figure: hidden units h1, h2, h3 connected to visible units v1, v2, ..., vn.]

Deep Belief Network training [Hinton & Salakhutdinov, 2006]

[Figure-only slide showing the training procedure; a toy contrastive-divergence sketch for a single RBM appears at the end of these notes.]

Independent Components Analysis (ICA)
- PCA seeks orthogonal directions Y1, ..., YM in feature space X that minimize reconstruction error.
- ICA seeks directions Y1, ..., YM that are most statistically independent, i.e., that minimize I(Y), the mutual information between the Yj.
(A toy PCA-versus-ICA comparison is sketched at the end of these notes.)

Dimensionality reduction across multiple datasets
- Given data sets A and B, find linear projections of each into a common lower-dimensional space.
- Generalized SVD: minimize the squared reconstruction errors of both.
- Canonical correlation analysis: maximize the correlation of A and B in the projected space (a learned, shared representation).
[slide courtesy of Indra Rustandi]

An example use of CCA

arbitrary word -> generative theory of word representation -> predicted brain activity

fMRI activation for "bottle" [Mitchell et al., Science, 2008]

[Figure: fMRI activation for "bottle"; the mean activation averaged over 60 different stimuli; "bottle" minus the mean activation. Color scale: high / average / below average.]

Idea: predict neural activity from corpus statistics of the stimulus word.

Generative theory: predicted activity for "telephone"
- statistical features from a trillion-word text corpus
- mapping learned from fMRI data

Semantic feature values (top verb co-occurrence features per word)

    celery:               airplane:
    eat     0.8368        ride        0.8673
    taste   0.3461        see         0.2891
    fill    0.3153        say         0.2851
    see     0.2430        near        0.1689
    clean   0.1145        open        0.1228
    open    0.0600        hear        0.0883
    smell   0.0586        run         0.0771
    touch   0.0286        lift        0.0749
    drive   0.0000        smell       0.0049
    wear    0.0000        wear        0.0010
    lift    0.0000        taste       0.0000
    break   0.0000        rub         0.0000
    ride    0.0000        manipulate  0.0000

Predicted activation is a sum of feature contributions

predicted image for "celery" = 0.84 x (learned "eat" image) + 0.35 x (learned "taste" image) + 0.32 x (learned "fill" image) + ...

That is, the predicted activation at voxel v for word w is sum_i f_i(w) * c_{v,i}: each feature value f_i(w) comes from corpus statistics, and each coefficient c_{v,i} (e.g., c_{14382,eat}) is learned from the fMRI data; about 500,000 learned parameters in all.

[Figure: predicted and observed fMRI images for "celery" and "airplane" after training on the 58 other words.]

Evaluating the computational model
- Train it using 58 of the 60 word stimuli.
- Apply it to predict fMRI images for the other 2 words.
- Test: show it the observed images for the 2 held-out words and make it predict which is which.
- 1770 test pairs in leave-2-out.
- Random guessing: 0.50 accuracy; accuracy above 0.61 is significant at p < 0.05.
(This pairwise test is sketched below.)
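To make the test concrete, here is a minimal numpy sketch of the leave-2-out evaluation. This is a sketch under stated assumptions, not the study's code: random synthetic data stand in for the corpus features and fMRI images, the voxel count is invented, and cosine similarity is assumed as the image-matching score (the slide does not name the metric). Only the 60 words, 25 features, the 58/2 split, and the 1770 pairs come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# 60 stimulus words and 25 verb features (from the slides); the voxel count
# and all data here are synthetic stand-ins for corpus statistics and fMRI.
n_words, n_feats, n_voxels = 60, 25, 500
F = rng.random((n_words, n_feats))                  # feature values f_i(w)
C_true = rng.standard_normal((n_feats, n_voxels))
Y = F @ C_true + 0.5 * rng.standard_normal((n_words, n_voxels))  # "observed" images

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

pairs = [(i, j) for i in range(n_words) for j in range(i + 1, n_words)]
correct = 0
for i, j in pairs:
    train = [k for k in range(n_words) if k not in (i, j)]   # the other 58 words
    # least-squares fit of the voxel coefficients c_{v,i} on the training words
    C_hat, *_ = np.linalg.lstsq(F[train], Y[train], rcond=None)
    pred_i, pred_j = F[i] @ C_hat, F[j] @ C_hat
    # decide which held-out observed image is which by cosine matching
    correct += (cosine(pred_i, Y[i]) + cosine(pred_j, Y[j])
                > cosine(pred_i, Y[j]) + cosine(pred_j, Y[i]))
print(f"leave-2-out accuracy: {correct / len(pairs):.2f} (chance 0.50)")
```

With 60 stimuli this enumerates all 60 * 59 / 2 = 1770 pairs, as on the slide; the 0.61 significance threshold applies to the real experiment, not to this toy data.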
Q4: What are the actual semantic primitives from which neural encodings are composed?

word -> 25 verb co-occurrence features -> predicted neural representation

Alternative semantic feature sets

    Predefined corpus features                                Mean acc. (%)
    25 verb co-occurrences                                    79
    486 verb co-occurrences                                   79
    50,000 word co-occurrences                                76
    300 Latent Semantic Analysis features                     73
    50 corpus features from Collobert & Weston (ICML 2008)    78
    218 features collected using Mechanical Turk              83   (developed by Dean Pomerleau)
    20 features discovered from the data                      87   (developed by Indra Rustandi)

Discovering a shared semantic basis [Rustandi et al., 2009]
- Goal: separate what is specific to a study and subject from what is independent of study and subject.
- word w -> 218 base features -> 20 learned latent features -> a separate predicted representation per subject: subjects 1-9 saw word + picture stimuli, subjects 10-20 saw word-only stimuli.
- The intermediate semantic features are trained using Canonical Correlation Analysis: a multi-study (word+picture, word-only), multi-subject (9 + 11) CCA.
(A toy CCA example is sketched below.)

CCA top stimulus words

    component   most active stimuli                                annotation
    1           apartment, church, closet, house, barn             "shelter"
    2           screwdriver, pliers, refrigerator, knife, hammer   "manipulation"
    3           telephone, butterfly, bicycle, beetle, dog         (not in preview)
    4           pants, dress, glass, coat, chair                   "things that touch me"

[Figure: Subject 1, word + picture stimuli; multi-study (WP, WO), multi-subject (9 + 11) CCA, component 1.]
[Figure: Subject 1, word-only stimuli; multi-study (WP, WO), multi-subject (9 + 11) CCA, component 1.]
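Both the shared-representation slide and the multi-study model above rest on CCA, so here is a minimal sketch using scikit-learn's CCA. Two synthetic datasets are generated from a common latent representation (standing in, loosely, for two subjects' images of the same stimuli) and projected into a shared low-dimensional space where corresponding components are maximally correlated. All sizes and the generating process are invented for illustration.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)

# Two views of the same 60 stimuli through different linear "sensors",
# sharing a 20-dimensional latent representation (all sizes illustrative).
Z = rng.standard_normal((60, 20))
A = Z @ rng.standard_normal((20, 40)) + 0.1 * rng.standard_normal((60, 40))
B = Z @ rng.standard_normal((20, 30)) + 0.1 * rng.standard_normal((60, 30))

# CCA finds linear projections of A and B into a common space that
# maximize the correlation between corresponding components.
A_c, B_c = CCA(n_components=4).fit_transform(A, B)

for k in range(4):
    r = np.corrcoef(A_c[:, k], B_c[:, k])[0, 1]
    print(f"component {k + 1}: correlation {r:.3f}")
```

The generalized-SVD alternative mentioned on the slide would instead minimize the two reconstruction errors; CCA cares only about correlation in the projected space.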

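Returning to the restricted Boltzmann machine slide: the training figure is not in this preview, but Hinton and Salakhutdinov train each RBM layer with contrastive divergence before stacking the layers into a deep autoencoder. Below is a toy numpy sketch of a single RBM trained with one-step contrastive divergence (CD-1), keeping the slide's assumptions of boolean units and logistic activations; the layer sizes, data, and learning rate are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Tiny RBM: a bipartite graph between visible units v and hidden units h.
n_vis, n_hid, lr = 6, 3, 0.1
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b_v, b_h = np.zeros(n_vis), np.zeros(n_hid)

# Toy training data: two binary patterns.
data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1]], dtype=float)

for epoch in range(500):
    for v0 in data:
        # Because the graph is bipartite, inference is symmetric: clamp v and
        # each h_j is logistic in its inputs, or clamp h and fill in the v_i.
        p_h0 = sigmoid(v0 @ W + b_h)
        h0 = (rng.random(n_hid) < p_h0).astype(float)
        p_v1 = sigmoid(W @ h0 + b_v)                 # one-step reconstruction
        v1 = (rng.random(n_vis) < p_v1).astype(float)
        p_h1 = sigmoid(v1 @ W + b_h)
        # CD-1 update: data statistics minus reconstruction statistics.
        W += lr * (np.outer(v0, p_h0) - np.outer(v1, p_h1))
        b_v += lr * (v0 - v1)
        b_h += lr * (p_h0 - p_h1)

print(np.round(sigmoid(data @ W + b_h), 2))          # hidden codes per pattern
```

After training, each pattern maps to a distinct hidden code; in a DBN those codes become the visible data for the next RBM layer.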

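The digit-encoding slide's linear baseline is easy to reproduce. Here is a sketch with scikit-learn: PCA maps each digit image to two dimensions, as in the slide's 784-to-2 linear encoding, except that sklearn's bundled digits are 8x8 (64-dimensional) rather than 28x28 MNIST images, so this is a scaled-down illustration of the figure, not its exact setup.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Linear 2-D encoding of digit images: project each 64-dimensional image
# onto its top two principal components (the slide's PCA baseline).
X, y = load_digits(return_X_y=True)
Z = PCA(n_components=2).fit_transform(X)

plt.scatter(Z[:, 0], Z[:, 1], c=y, cmap="tab10", s=8)
plt.colorbar(label="digit class")
plt.title("PCA: linear 2-D encoding of digit images")
plt.show()
```

The classes overlap heavily in the linear projection; the point of the slide's comparison is that the deep nonlinear encoder separates them far better in the same two dimensions.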
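Finally, the ICA slide's contrast with PCA shows up clearly on a classic toy problem: recover two statistically independent, non-Gaussian sources from linear mixtures of them. FastICA is used here as a standard estimator of the independence objective (it maximizes non-Gaussianity, a practical surrogate for minimizing the mutual information I(Y)); the sources and mixing matrix are invented for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

# Two statistically independent, non-Gaussian sources.
t = np.linspace(0, 8, 2000)
S = np.c_[np.sign(np.sin(3 * t)),      # square wave
          ((2 * t) % 2) - 1]           # sawtooth wave
X = S @ np.array([[1.0, 0.5],
                  [0.5, 1.0]])         # observed linear mixtures

# PCA: orthogonal directions of maximum variance (minimum reconstruction error).
# ICA: directions whose recovered components are statistically independent.
Y_pca = PCA(n_components=2).fit_transform(X)
Y_ica = FastICA(n_components=2, random_state=0).fit_transform(X)

def abs_corr(Y):
    # |correlation| of each recovered component with each true source
    return np.abs(np.corrcoef(Y.T, S.T)[:2, 2:]).round(2)

print("PCA vs sources:\n", abs_corr(Y_pca))   # components still mixed
print("ICA vs sources:\n", abs_corr(Y_ica))   # near a permutation matrix
```

PCA's directions stay mixed because variance says nothing about independence; ICA recovers the sources up to sign and ordering.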