Review
• Parallel importance sampling
  – bias due to 1/normalizer
  – particle filter = recursive parallel IS
• MCMC
  – randomized search for high P(x)
  – burn-in, mixing
  – approximately i.i.d.: { X_t, X_{t+k}, X_{t+2k}, X_{t+3k}, … }
  – use to construct an estimator of E_P(g(X))

Review
• Metropolis–Hastings
  – a way to design a chain with stationary distribution P(X)
  – proposal distribution Q(X′ | X)
  – e.g., random walk N(X′ | X, σ²I)
  – accept w.p. min(1, [P(X′) Q(X | X′)] / [P(X) Q(X′ | X)])
  – tension between long moves and a high acceptance rate

MH algorithm
• Initialize X_1 arbitrarily
• For t = 1, 2, …:
  – Sample X′ ~ Q(X′ | X_t)
  – Compute p = [P(X′) Q(X_t | X′)] / [P(X_t) Q(X′ | X_t)]
  – With probability min(1, p), set X_{t+1} := X′; else X_{t+1} := X_t
• Note: the sequence X_1, X_2, … will usually contain duplicates

MH example
[figure]

MH example
[figure]

In example
• g(x) = x²
• True E(g(X)) = 0.28…
• Proposal: Q(x′ | x) = N(x′ | x, 0.25²I)
• Acceptance rate 55–60%
• After 1000 samples, minus burn-in of 100:
  final estimate 0.282361
  final estimate 0.271167
  final estimate 0.322270
  final estimate 0.306541
  final estimate 0.308716

Gibbs sampler
• Special case of MH
• Divide X into blocks of r.v.s B(1), B(2), …
• Proposal Q:
  – pick a block i uniformly
  – sample X_{B(i)} ~ P(X_{B(i)} | X_{¬B(i)})
• Useful property: acceptance rate p = 1

Gibbs example
[figure]

Gibbs example
[figure]

Gibbs failure example
[figure]

Relational learning
• Linear regression, logistic regression: attribute-value learning
  – a set of i.i.d. samples from P(X, Y)
• Not all data is like this
  – an attribute is a property of a single entity
  – what about properties of sets of entities?
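The random-walk Metropolis–Hastings loop reviewed above is short enough to sketch end to end. The snippet below is a minimal pure-Python version, not the code behind the slides' numbers: the slides do not specify their target density, so a standard normal (for which E[x²] = 1) stands in, with a symmetric random-walk proposal so the Q terms cancel in the acceptance ratio.

```python
import math
import random

random.seed(0)

def log_p(x):
    # Unnormalized log-density of the target.
    # Assumed stand-in: standard normal (the slides' target is unspecified).
    return -0.5 * x * x

def metropolis_hastings(n_samples, step=1.0, x0=0.0):
    """Random-walk MH: propose x' ~ N(x, step^2), accept w.p. min(1, p).

    Q is symmetric, so p reduces to P(x') / P(x).
    """
    x = x0
    samples = []
    accepted = 0
    for _ in range(n_samples):
        x_prop = random.gauss(x, step)
        p = math.exp(min(0.0, log_p(x_prop) - log_p(x)))
        if random.random() < p:
            x = x_prop
            accepted += 1
        samples.append(x)  # duplicates appear whenever the proposal is rejected
    return samples, accepted / n_samples

samples, rate = metropolis_hastings(50000)
burned = samples[1000:]  # discard burn-in, as on the slides
est = sum(x * x for x in burned) / len(burned)  # estimator of E[g(X)], g(x) = x^2
```

With this step size the chain accepts roughly two-thirds of its moves, and the post-burn-in average of x² lands near the true value of 1, illustrating the same estimator construction as the slides' 0.28… example.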
Application: document clustering
[figure]

Application: recommendations
[figure]

Latent-variable models
[figure]

Best-known LVM: PCA
• Suppose X_ij, U_ik, V_jk are all ~ Gaussian
  – yields principal components analysis
  – or probabilistic PCA
  – or Bayesian PCA

PCA: the picture
[figure]

PCA: cartoon example

          Movie
          1  2  3  4  5  6
  User A  1  1  0  0  1  0
       B  0  1  1  0  0  0
       C  1  1  0  1  1  0
       D  1  0  0  1  1  0
       E  0  1  0  1  0  0
       F  0  1  1  1  0  1

PCA: cartoon example
• Data matrix X: rows x_1, x_2, x_3, …, x_n
• Compressed matrix U: rows u_1, u_2, u_3, …, u_n
• Basis matrix Vᵀ: rows v_1, …, v_k
• rows of Vᵀ span the low-rank space

Interpreting PCA
• rows of U correspond to users; rows of Vᵀ correspond to movies
• U holds the basis weights; Vᵀ holds the basis vectors
• Basis vectors represent movies that vary together
• Weights say how much each user cares about each type of movie

Mean subtraction
• U_ik ~ N(0, τ²)
• V_jk ~ N(0, τ²)
• X_ij ~ N(U_i · V_j, σ²)

>> mu = mean(X(:));
>> colmu = mean(X - mu);
>> rowmu = mean(X' - mu)';
>> X = X - mu - repmat(colmu, size(X,1), 1) - repmat(rowmu, 1, size(X,2));

Data weights
• Let W_ij = …
• Likelihood × prior = …
• More generally, W_ij ≥ 0

Another use of PCA
[figure]
face images from Groundhog Day, extracted by the Cambridge face DB project

Image matrix
• rows x_1, x_2, x_3, …, x_n are images; columns are pixels

Result of factoring
• U (rows u_1, …, u_n): basis weights over images; Vᵀ (rows v_1, …, v_k): basis vectors over pixels
• Basis vectors are often called “eigenfaces”

Eigenfaces
[figure]
image credit: AT&T Labs Cambridge

PCA: finding the MLE
• PCA:
  – U_ik ~ N(0, τ²)
  – V_jk ~ N(0, τ²)
  – X_ij ~ N(U_i · V_j, σ²)
  – σ/τ → 0

PCA & SVD
• The singular value decomposition is
  – X = R Λ Sᵀ
  – R, S orthonormal; Λ ≥ 0 diagonal
  – All matrices can be expressed this way
  – See svd, svds in Matlab
• So, PCA is U = RΛ, V = S
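The PCA–SVD connection above can be checked numerically. The sketch below is a toy pure-Python stand-in for the `svd`/`svds` calls the slides mention: it recovers the leading singular triple (σ, u, v) of a small matrix by power iteration on XᵀX, then verifies that the rank-1 factorization σ·u·vᵀ reconstructs a rank-1 X exactly, mirroring X ≈ U Vᵀ with U = σu and V = v.

```python
import math

def top_singular_triple(X, iters=200):
    """Power iteration on X^T X: returns (sigma, u, v) with X ~ sigma * u v^T
    at the leading singular value. A toy stand-in for svd/svds."""
    n, m = len(X), len(X[0])
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iters):
        # One step: v <- normalize(X^T (X v))
        Xv = [sum(X[i][j] * v[j] for j in range(m)) for i in range(n)]
        w = [sum(X[i][j] * Xv[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(t * t for t in w))
        v = [t / norm for t in w]
    # sigma = |X v|, u = X v / sigma
    Xv = [sum(X[i][j] * v[j] for j in range(m)) for i in range(n)]
    sigma = math.sqrt(sum(t * t for t in Xv))
    u = [t / sigma for t in Xv]
    return sigma, u, v

# Rank-1 test matrix X = a b^T, so the rank-1 reconstruction should be exact
# (sigma = |a| |b| = sqrt(70) here).
a = [1.0, 2.0, 3.0]
b = [2.0, 1.0]
X = [[ai * bj for bj in b] for ai in a]
sigma, u, v = top_singular_triple(X)
err = max(abs(X[i][j] - sigma * u[i] * v[j]) for i in range(3) for j in range(2))
```

In practice one would keep the top k singular values rather than just the first; the example uses k = 1 only to keep the check self-contained.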