Machine Learning 10-701/15-781, Fall 2006
Dimensionality Reduction II: Factor Analysis and Metric Learning
Eric Xing, Lecture 20, November 22, 2006
Reading: C. B. (Bishop) book

Outline
- Probabilistic PCA (brief)
- Factor analysis (in some detail)
- ICA (will skip)
- Distance metric learning from very little side information (a very cool method)

Recap of PCA
- A popular dimensionality reduction technique.
- Project the data onto the directions of greatest variation:

  $u^* = \arg\max_u \frac{1}{m} \sum_{i=1}^m (u^T y_i)^2 = \arg\max_u\; u^T \Big( \frac{1}{m} \sum_{i=1}^m y_i y_i^T \Big) u = \arg\max_u\; u^T \mathrm{Cov}(y)\, u$

  $x_i = \big( u_1^T y_i,\; u_2^T y_i,\; \dots,\; u_q^T y_i \big)^T = U_q^T y_i \in \mathbb{R}^q, \qquad y_i \approx U x_i$

- Consequence: the $x_i$ are uncorrelated, i.e. the covariance matrix $\frac{1}{m} \sum_i x_i x_i^T$ is diagonal.
- Truncation: writing $\Sigma_y = \sum_{k=1}^{K} \lambda_k u_k u_k^T$ and keeping only the top $q$ terms $\sum_{k=1}^{q} \lambda_k u_k u_k^T$ incurs an error governed by the discarded eigenvalues $\lambda_{q+1}, \dots, \lambda_K$.

Recap of PCA (cont.)
- A useful tool for visualising patterns and clusters within the data set, but:
  - it needs centering, and
  - it does not explicitly model data noise.

Probabilistic interpretation
[Figure: graphical models with a continuous node X and a continuous node Y; the continuous X -> continuous Y model corresponds to regression.]

Probabilistic PCA
- PCA can be cast as a probabilistic model:

  $y_n = \mu + \Lambda x_n + \epsilon_n, \qquad \epsilon_n \sim N(0, \sigma^2 I),$

  with $q$-dimensional latent variables $x_n \sim N(0, I)$.
- The resulting data distribution is

  $y_n \sim N(\mu,\; \Lambda \Lambda^T + \sigma^2 I).$

- The maximum-likelihood solution is equivalent to PCA:

  $\mu_{ML} = \frac{1}{N} \sum_n y_n, \qquad \Lambda_{ML} = U_q (\Lambda_q - \sigma^2 I)^{1/2},$

  where the diagonal matrix $\Lambda_q$ contains the top $q$ sample-covariance eigenvalues, $U_q$ contains the associated eigenvectors, and $\sigma^2_{ML}$ is the average of the discarded eigenvalues. (Tipping and Bishop, J. Royal Stat. Soc. B 61:611, 1999.)

Factor analysis
- An unsupervised linear regression model:

  $p(x) = N(x;\, 0, I), \qquad p(y \mid x) = N(y;\, \mu + \Lambda x,\, \Psi),$

  where $\Lambda$ is called the factor loading matrix and $\Psi$ is diagonal.
- Geometric interpretation: to generate data, first generate a point within the manifold, then add noise. The coordinates of the point are the components of the latent variable.

Relationship between PCA and FA
- Probabilistic PCA is equivalent to factor analysis with equal noise in every dimension, i.e. $\epsilon_n$ an isotropic Gaussian $N(0, \sigma^2 I)$.
- In factor analysis, $\epsilon_n \sim N(0, \Psi)$ for a diagonal covariance matrix $\Psi$.
- An iterative algorithm (e.g. EM) is required to find the parameters if the precisions are not known in advance.

Marginal data distribution
- A marginal Gaussian (e.g. $p(x)$) times a conditional Gaussian (e.g. $p(y \mid x)$) is a joint Gaussian.
- Any marginal (e.g. $p(y)$) of a joint Gaussian (e.g. $p(x, y)$) is also a Gaussian.
- Since the marginal is Gaussian, we can determine it by just computing its mean and variance. Assuming the noise $W \sim N(0, \Psi)$ is uncorrelated with the data:

  $E[Y] = E[\mu + \Lambda X + W] = \mu + \Lambda E[X] + E[W] = \mu + 0 + 0 = \mu,$

  $\mathrm{Var}[Y] = E[(Y - \mu)(Y - \mu)^T] = E[(\Lambda X + W)(\Lambda X + W)^T] = \Lambda E[X X^T] \Lambda^T + E[W W^T] = \Lambda \Lambda^T + \Psi.$

FA = constrained-covariance Gaussian
- The marginal density for factor analysis ($y$ is $p$-dimensional, $x$ is $k$-dimensional):

  $p(y) = N(y;\, \mu,\; \Lambda \Lambda^T + \Psi).$

- So the effective covariance is the low-rank outer product of two long skinny matrices plus a diagonal matrix.
- In other words, factor analysis is just a constrained Gaussian model. If $\Psi$ were not diagonal, then we could model any Gaussian and the model would be pointless.
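This constrained-covariance structure is easy to check numerically: sample from the FA generative model $p(x) = N(0, I)$, $p(y \mid x) = N(\mu + \Lambda x, \Psi)$ and compare the empirical covariance of $y$ with $\Lambda \Lambda^T + \Psi$. A minimal NumPy sketch; the dimensions, parameter values, and variable names are illustrative choices, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)
p, k, n = 5, 2, 100_000                       # observed dim, latent dim, sample count

mu = np.zeros(p)                              # data mean
Lam = rng.normal(size=(p, k))                 # factor loading matrix Lambda
Psi = np.diag(rng.uniform(0.1, 0.5, size=p))  # diagonal noise covariance

# Generative model: x ~ N(0, I), then y = mu + Lam x + w with w ~ N(0, Psi)
x = rng.normal(size=(n, k))
noise = rng.normal(size=(n, p)) @ np.sqrt(Psi)
y = mu + x @ Lam.T + noise

# The marginal covariance of y should be the low-rank-plus-diagonal matrix
emp_cov = np.cov(y, rowvar=False)
print(np.max(np.abs(emp_cov - (Lam @ Lam.T + Psi))))  # small; shrinks as n grows
```

As $n$ grows, the maximum entry-wise deviation shrinks, confirming that the marginal of $y$ is the constrained Gaussian derived above.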
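Likewise, the Tipping-and-Bishop ML solution for probabilistic PCA quoted earlier can be computed in closed form from the eigendecomposition of the sample covariance. In their result $\sigma^2_{ML}$ is the average of the discarded eigenvalues; the function below is a sketch under that assumption, with names of my choosing:

```python
import numpy as np

def ppca_ml(Y, q):
    """Closed-form ML fit of probabilistic PCA (Tipping & Bishop, 1999).

    Y: (N, p) data matrix; q: latent dimension (q < p).
    Returns (mu, W, sigma2) for the model y ~ N(mu, W W^T + sigma2 * I).
    """
    mu = Y.mean(axis=0)                         # mu_ML = (1/N) sum_n y_n
    S = np.cov(Y, rowvar=False, bias=True)      # sample covariance
    evals, evecs = np.linalg.eigh(S)            # eigenvalues in ascending order
    evals, evecs = evals[::-1], evecs[:, ::-1]  # re-sort descending
    sigma2 = evals[q:].mean()                   # ML noise: mean of discarded eigenvalues
    # W_ML = U_q (Lambda_q - sigma2 I)^{1/2}; each column j of U_q is
    # scaled by sqrt(lambda_j - sigma2), which is nonnegative for the top q
    W = evecs[:, :q] * np.sqrt(evals[:q] - sigma2)
    return mu, W, sigma2
```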
Review: a primer on the multivariate Gaussian
- Multivariate Gaussian density:

  $p(x;\, \mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\Big( -\tfrac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \Big).$

- A joint Gaussian:

  $p\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = N\left( \begin{pmatrix} x_1 \\ x_2 \end{pmatrix};\; \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix},\; \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \right).$

- How do we write down $p(x_1)$, $p(x_1 \mid x_2)$, or $p(x_2 \mid x_1)$ using the block elements of $\mu$ and $\Sigma$?
- Formulas to remember:

  $p(x_2) = N(x_2;\, m_2^m, V_2^m), \qquad m_2^m = \mu_2, \qquad V_2^m = \Sigma_{22},$

  $p(x_1 \mid x_2) = N(x_1;\, m_{1|2}, V_{1|2}), \qquad m_{1|2} = \mu_1 + \Sigma_{12} \Sigma_{22}^{-1} (x_2 - \mu_2), \qquad V_{1|2} = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}.$

Review: some matrix algebra
- Trace: $\mathrm{tr}[A] = \sum_i a_{ii}$.
- Cyclical permutations: $\mathrm{tr}[ABC] = \mathrm{tr}[CAB] = \mathrm{tr}[BCA]$.
- Derivatives:

  $\frac{\partial}{\partial A} \mathrm{tr}[BA] = B^T, \qquad \mathrm{tr}[x^T A x] = \mathrm{tr}[x x^T A], \qquad \frac{\partial}{\partial A} \mathrm{tr}[x x^T A] = x x^T.$

- Determinants: $\frac{\partial}{\partial A} \log |A| = A^{-T}.$

FA joint distribution
- Model: $p(x) = N(x;\, 0, I)$, $p(y \mid x) = N(y;\, \mu + \Lambda x, \Psi)$.
- Covariance between $x$ and $y$ (assuming the noise $W$ is uncorrelated with the data and the latent variables):

  $\mathrm{Cov}[X, Y] = E[(X - 0)(Y - \mu)^T] = E[X (\Lambda X + W)^T] = E[X X^T] \Lambda^T + E[X W^T] = \Lambda^T.$

- Hence the joint distribution of $x$ and $y$ is

  $p\begin{pmatrix} x \\ y \end{pmatrix} = N\left( \begin{pmatrix} x \\ y \end{pmatrix};\; \begin{pmatrix} 0 \\ \mu \end{pmatrix},\; \begin{pmatrix} I & \Lambda^T \\ \Lambda & \Lambda \Lambda^T + \Psi \end{pmatrix} \right).$

Inference in factor analysis
- Apply the Gaussian conditioning formulas to the joint distribution derived above, with

  $\Sigma_{11} = I, \qquad \Sigma_{12} = \Lambda^T, \qquad \Sigma_{21} = \Lambda, \qquad \Sigma_{22} = \Lambda \Lambda^T + \Psi.$

- We can now derive the posterior of the latent variable $x$ given an observation $y$: $p(x \mid y) = N(x;\, m_{1|2}, V_{1|2})$, where

  $m_{1|2} = \mu_1 + \Sigma_{12} \Sigma_{22}^{-1} (y - \mu_2) = \Lambda^T (\Lambda \Lambda^T + \Psi)^{-1} (y - \mu),$

  $V_{1|2} = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} = I - \Lambda^T (\Lambda \Lambda^T + \Psi)^{-1} \Lambda.$

- Applying the matrix inversion lemma:

  $V_{1|2} = (I + \Lambda^T \Psi^{-1} \Lambda)^{-1}, \qquad m_{1|2} = V_{1|2} \Lambda^T \Psi^{-1} (y - \mu).$

  Here we only need to invert a matrix of size $|x| \times |x|$ instead of $|y| \times |y|$.

Geometric interpretation: inference is a linear projection
- The posterior is $p(x \mid y) = N(x;\, m_{1|2}, V_{1|2})$ with $V_{1|2} = (I + \Lambda^T \Psi^{-1} \Lambda)^{-1}$ and $m_{1|2} = V_{1|2} \Lambda^T \Psi^{-1} (y - \mu)$.
- The posterior covariance does not depend on the observed data $y$.
- Computing the posterior mean is just a linear operation.

EM for factor analysis
- Incomplete-data log-likelihood (the marginal density of $y$):

  $\ell(\theta; D) = -\frac{N}{2} \log |\Lambda \Lambda^T + \Psi| - \frac{1}{2} \sum_{n=1}^{N} (y_n - \mu)^T (\Lambda \Lambda^T + \Psi)^{-1} (y_n - \mu) = -\frac{N}{2} \log |\Lambda \Lambda^T + \Psi| - \frac{N}{2} \mathrm{tr}\big[ (\Lambda \Lambda^T + \Psi)^{-1} S \big],$

  where $S = \frac{1}{N} \sum_n (y_n - \mu)(y_n - \mu)^T$.

- Estimating $\mu$ is trivial: $\mu_{ML} = \frac{1}{N} \sum_n y_n$.
- The parameters $\Lambda$ and $\Psi$ are coupled nonlinearly in the log-likelihood.
- Complete-data log-likelihood:

  $\ell_c(\theta; D) = \sum_n \log p(x_n, y_n) = \sum_n \big( \log p(x_n) + \log p(y_n \mid x_n) \big) = -\frac{N}{2} \log |I| - \frac{1}{2} \sum_n x_n^T x_n - \frac{N}{2} \log |\Psi| - \frac{1}{2} \sum_n (y_n - \mu - \Lambda x_n)^T \Psi^{-1} (y_n - \mu - \Lambda x_n) \dots$
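The preview cuts off mid-derivation, but the pieces above already determine the E-step: it is exactly the linear-projection posterior $V_{1|2} = (I + \Lambda^T \Psi^{-1} \Lambda)^{-1}$, $m_{1|2} = V_{1|2} \Lambda^T \Psi^{-1} (y - \mu)$. The sketch below pairs that E-step with the standard FA M-step updates; the M-step formulas are not shown in this excerpt, so treat this as a conventional reference sketch rather than the lecture's own derivation:

```python
import numpy as np

def fa_em_step(Y, mu, Lam, psi):
    """One EM iteration for factor analysis.

    Y: (N, p) data; mu: (p,) mean; Lam: (p, k) loadings; psi: (p,) diagonal of Psi.
    Returns updated (Lam, psi); mu's ML estimate is just Y.mean(axis=0).
    """
    N, p = Y.shape
    k = Lam.shape[1]
    Yc = Y - mu                                # center the data

    # E-step: p(x|y) = N(m, V) via the matrix inversion lemma, so only a
    # k x k matrix is inverted instead of p x p:
    #   V = (I + Lam^T Psi^{-1} Lam)^{-1},  m_n = V Lam^T Psi^{-1} (y_n - mu)
    LtPi = Lam.T / psi                         # Lam^T Psi^{-1}, shape (k, p)
    V = np.linalg.inv(np.eye(k) + LtPi @ Lam)  # posterior covariance, same for every n
    M = Yc @ LtPi.T @ V                        # (N, k): row n is the posterior mean m_n

    # Expected sufficient statistics:
    #   E[x_n] = m_n,  sum_n E[x_n x_n^T] = N V + M^T M
    Exx = N * V + M.T @ M

    # M-step (standard FA updates, not from the excerpt):
    Lam_new = (Yc.T @ M) @ np.linalg.inv(Exx)
    psi_new = np.mean(Yc**2, axis=0) - np.einsum('ij,nj,ni->i', Lam_new, M, Yc) / N
    return Lam_new, psi_new
```

Note how the E-step inverts only a $k \times k$ matrix, which is exactly the computational point made after applying the matrix inversion lemma above.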

