CMU CS 10708 - MN-crf
Parameter Learning in MN
10-708 – Carlos Guestrin, 2006-2008

Outline
• CRF
• Learning CRF for 2-d image segmentation
• IPF parameter sharing revisited

Log-linear Markov network (most common representation)
• A feature is some function \phi[D] for some subset of variables D
  – e.g., an indicator function
• Log-linear model over a Markov network H:
  – a set of features \phi_1[D_1], \ldots, \phi_k[D_k]
    • each D_i is a subset of a clique in H
    • two \phi's can be over the same variables
  – a set of weights w_1, \ldots, w_k
    • usually learned from data
  – P(X) = \frac{1}{Z} \exp\Big( \sum_i w_i \, \phi_i[D_i] \Big)

Generative v. Discriminative classifiers – A review
• Want to learn h : X \to Y
  – X – features
  – Y – target classes
• Bayes optimal classifier – P(Y|X)
• Generative classifier, e.g., Naïve Bayes:
  – Assume some functional form for P(X|Y), P(Y)
  – Estimate the parameters of P(X|Y), P(Y) directly from training data
  – Use Bayes rule to calculate P(Y|X = x)
  – This is a 'generative' model
    • Indirect computation of P(Y|X) through Bayes rule
    • But it can generate a sample of the data: P(X) = \sum_y P(y) P(X|y)
• Discriminative classifiers, e.g., Logistic Regression:
  – Assume some functional form for P(Y|X)
  – Estimate the parameters of P(Y|X) directly from training data
  – This is the 'discriminative' model
    • Directly learn P(Y|X)
    • But it cannot generate a sample of the data, because P(X) is not available

Log-linear CRFs (most common representation)
• Graph H: only over hidden vars Y_1, \ldots, Y_P
  – No assumptions about dependency on the observed vars X
  – You must always observe all of X
• A feature is some function \phi[D] for some subset of variables D
  – e.g., an indicator function
• Log-linear model over a CRF H:
  – a set of features \phi_1[D_1], \ldots, \phi_k[D_k]
    • each D_i is a subset of a clique in H
    • two \phi's can be over the same variables
  – a set of weights w_1, \ldots, w_k
    • usually learned from data
  – P(Y|X) = \frac{1}{Z(X)} \exp\Big( \sum_i w_i \, \phi_i(D_i, X) \Big)

Example: Image Segmentation
(Figure: a 3x3 grid of segmentation variables y_1, \ldots, y_9, each labeled f (foreground) or b (background), with observed pixel values x_i.)
We will define the features as follows:
• \phi(x_i, y_i) measures the compatibility of a node's color and its segmentation label:
  \phi(x_i, y_i) = \begin{cases} \log P(x_i \mid GMM_b) & y_i = b \\ \log P(x_i \mid GMM_f) & y_i = f \end{cases}
• A set of indicator features, one triggered for each edge labeling pair {ff, bb, fb, bf}:
  \phi_{ff}(y_i, y_j) = 1 if y_i = f, y_j = f; 0 otherwise
  \phi_{fb}(y_i, y_j) = 1 if y_i = f, y_j = b; 0 otherwise
  \phi_{bf}(y_i, y_j) = 1 if y_i = b, y_j = f; 0 otherwise
  \phi_{bb}(y_i, y_j) = 1 if y_i = b, y_j = b; 0 otherwise
• This is allowed, since we can define many features over the same subset of variables.

The model is then

P(Y|X) \propto \exp\Big( \sum_{i \in V} \phi(x_i, y_i) + \sum_{(i,j) \in E} \sum_{m \in \{ff, fb, bf, bb\}} w_m \, \phi_m(y_i, y_j) \Big)

Now we just need to sum these features. Defining the counts

C_m = \sum_{(i,j) \in E} \phi_m(y_i, y_j) = \sum_{(i,j) \in E} I(y_i y_j = m),

the model becomes

P(Y|X) \propto \exp\Big( \sum_{i \in V} \phi(x_i, y_i) + \sum_{m \in \{ff, fb, bf, bb\}} w_m \, C_m \Big)

We need to learn the parameters w_m.
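To make the model concrete, here is a minimal sketch (not from the slides) of the unnormalized log-score \sum_i \phi(x_i, y_i) + \sum_m w_m C_m on the 3x3 grid. The array node_logliks standing in for the precomputed GMM log-likelihoods, the edge-weight dictionary, and all function names are hypothetical choices of this sketch:

```python
import numpy as np
from itertools import product

# Labels: 0 = background (b), 1 = foreground (f).
H, W = 3, 3

def grid_edges(h, w):
    """All (i, j) pairs of 4-connected neighbors in an h-by-w grid (flattened indices)."""
    edges = []
    for r, c in product(range(h), range(w)):
        if c + 1 < w:
            edges.append((r * w + c, r * w + c + 1))    # right neighbor
        if r + 1 < h:
            edges.append((r * w + c, (r + 1) * w + c))  # down neighbor
    return edges

def log_score(y, node_logliks, edge_weights, edges):
    """
    Unnormalized log P(y | x) = sum_i phi(x_i, y_i) + sum_m w_m * C_m, where
    node_logliks[i, l] is a hypothetical precomputed array standing in for
    log P(x_i | GMM_l), and edge_weights[(l_i, l_j)] is w_m for edge labeling m.
    """
    node_term = sum(node_logliks[i, y[i]] for i in range(len(y)))
    # C_m: count of edges with each labeling pair m in {bb, bf, fb, ff}.
    C = {}
    for i, j in edges:
        m = (int(y[i]), int(y[j]))
        C[m] = C.get(m, 0) + 1
    edge_term = sum(edge_weights[m] * c for m, c in C.items())
    return node_term + edge_term

# Example usage with made-up numbers.
rng = np.random.default_rng(0)
node_logliks = rng.normal(size=(H * W, 2))                   # stand-in for log P(x_i | GMM_l)
w = {(0, 0): 1.0, (1, 1): 1.0, (0, 1): -1.0, (1, 0): -1.0}   # favor agreeing neighbors
y = np.array([0, 0, 1, 0, 1, 1, 1, 1, 1])                    # a candidate segmentation
print(log_score(y, node_logliks, w, grid_edges(H, W)))
```

Because the four indicator weights are shared across all edges, only the counts C_m matter, which is exactly why the model collapses to the \sum_m w_m C_m form above.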
Given N data points (images and their segmentations), we learn the shared weights w_m by maximizing the conditional log-likelihood. The gradient with respect to w_m compares the count of feature m in each data point n against its expected count under the current model:

\frac{\partial}{\partial w_m} \ell(w : \mathcal{D}) = \sum_{n=1}^{N} \Big( C_m[n] - E_w\big[ C_m \mid X[n] \big] \Big)

where C_m[n] is the count of feature m in data point n. Evaluating E_w[C_m \mid X[n]] requires inference using the current parameter estimates.

Example: Inference for Learning
How do we compute E_w[C_{fb} \mid X[n]]? Since C_{fb} = \sum_{(i,j) \in E} I(y_i = f, y_j = b), linearity of expectation gives

E_w[C_{fb} \mid X[n]] = E_w\Big[ \sum_{(i,j) \in E} I(y_i = f, y_j = b) \;\Big|\; X[n] \Big] = \sum_{(i,j) \in E} E_w\big[ I(y_i = f, y_j = b) \mid X[n] \big] = \sum_{(i,j) \in E} P_w(y_i = f, y_j = b \mid X[n])

so the expected count is just a sum of pairwise edge marginals, which we obtain by running inference in the grid model under the current weights.
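A minimal sketch of this learning update, assuming some inference routine (e.g., belief propagation on the grid, not shown here) supplies the pairwise marginals P_w(y_i, y_j \mid X[n]); the function names and the `infer` callback are mine, not the slides':

```python
LABEL_PAIRS = [(0, 0), (0, 1), (1, 0), (1, 1)]  # bb, bf, fb, ff

def empirical_counts(y, edges):
    """C_m[n]: how often each edge labeling m appears in the observed segmentation y."""
    C = {m: 0.0 for m in LABEL_PAIRS}
    for i, j in edges:
        C[(y[i], y[j])] += 1.0
    return C

def expected_counts(edge_marginals):
    """
    E_w[C_m | X[n]] = sum over edges of P_w(y_i = a, y_j = b | X[n]).
    `edge_marginals` maps each edge (i, j) to a 2x2 table of pairwise marginals.
    """
    E = {m: 0.0 for m in LABEL_PAIRS}
    for marg in edge_marginals.values():
        for a, b in LABEL_PAIRS:
            E[(a, b)] += marg[a][b]
    return E

def gradient_step(w, data, infer, edges, lr=0.1):
    """One gradient-ascent step on the conditional log-likelihood.
    `data` is a list of (x, y) pairs; `infer(x, w)` must return the edge marginals."""
    grad = {m: 0.0 for m in LABEL_PAIRS}
    for x, y in data:
        C = empirical_counts(y, edges)
        E = expected_counts(infer(x, w))
        for m in LABEL_PAIRS:
            grad[m] += C[m] - E[m]
    return {m: w[m] + lr * grad[m] for m in LABEL_PAIRS}
```

Each step thus interleaves inference (to get the expected counts) with a weight update, which is why learning in MNs and CRFs is so much more expensive than in BNs.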
Representation Equivalence
The log-linear representation is equivalent to the tabular MN representation from HW4:

P(Y|X) \propto \prod_{i \in V} \pi(x_i, y_i) \prod_{(i,j) \in E} \pi(y_i, y_j)

For the node potentials, with \phi(x_i, y_i) as defined above,

\prod_{i \in V} \pi(x_i, y_i) = \prod_{i \in V} P(x_i \mid GMM_{y_i}) = \exp\Big( \sum_{i \in V} \log P(x_i \mid GMM_{y_i}) \Big) = \exp\Big( \sum_{i \in V} \phi(x_i, y_i) \Big)

Now do it over the edge potential:

\pi(y_i, y_j) = \pi(ff)^{I(y_i y_j = ff)} \, \pi(fb)^{I(y_i y_j = fb)} \, \pi(bf)^{I(y_i y_j = bf)} \, \pi(bb)^{I(y_i y_j = bb)} = \prod_{m \in \{ff, fb, bf, bb\}} \pi(m)^{I(y_i y_j = m)}

This is correct, as for every assignment to y_i y_j we select exactly one value from the table. Taking logs, setting w_m = \log \pi(m) recovers the shared log-linear edge features, so the two representations define the same distribution.
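A quick numerical check of this equivalence for a single edge, with made-up table values; the variable names are mine:

```python
import numpy as np

# Tabular edge potential pi(y_i, y_j); rows/cols indexed by labels 0 = b, 1 = f.
pi = np.array([[1.5, 0.4],    # pi(bb), pi(bf)   (made-up values)
               [0.4, 1.5]])   # pi(fb), pi(ff)

w = np.log(pi)                # log-linear weights: w_m = log pi(m)

for yi in (0, 1):
    for yj in (0, 1):
        tabular = pi[yi, yj]
        # Log-linear form: exp(sum_m w_m * I(y_i y_j = m)); exactly one indicator fires.
        loglinear = np.exp(sum(w[a, b] * ((yi, yj) == (a, b))
                               for a in (0, 1) for b in (0, 1)))
        assert np.isclose(tabular, loglinear)
print("Tabular edge potential and log-linear features agree on all 4 labelings.")
```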
