Review
Rong Jin

Contents: Comparison of Different Classification Models; K Nearest Neighbor (kNN) Approach; Weighted K Nearest Neighbor; Gaussian Generative Model; Logistic Regression Model; Non-linear Logistic Regression Model; Localized Logistic Regression Model; Conditional Exponential Model; Maximum Entropy Model; Support Vector Machine; Logistic Regression Model vs. Support Vector Machine; Kernel Tricks; Fisher Kernel; Kernel Methods in Generative Model; Multi-class SVM; Error Correcting Output Code (ECOC); Ordinal Regression; Decision Tree; Generalized Decision Tree

Comparison of Different Classification Models

The goal of all classifiers:
- Predict the class label y for an input x
- Estimate p(y|x)

K Nearest Neighbor (kNN) Approach

[Figures: nearest-neighbor classification examples with k = 1 and k = 4.]

Probability interpretation: estimate p(y|x) as

$p(y|x) = \frac{\left|\{(x_i, y_i) \mid y_i = y,\ x_i \in N(x)\}\right|}{|N(x)|}$

where $N(x)$ is the neighborhood around $x$. (A code sketch follows this section.)

K Nearest Neighbor Approach (kNN)

- What is the appropriate size for the neighborhood N(x)? A leave-one-out approach.
- Weighted K nearest neighbor: the neighborhood is defined through a weight function

  $w(x_i) = \exp\left(-\frac{\|x - x_i\|^2}{2\sigma^2}\right)$

  and p(y|x) is estimated as

  $p(y|x) = \frac{\sum_i w(x_i)\,\delta(y, y_i)}{\sum_i w(x_i)}$

- How to estimate the appropriate value for $\sigma^2$?

Weighted K Nearest Neighbor

Leave-one-out + maximum likelihood:
- Estimate the leave-one-out probability. Writing $w_i(x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right)$, so that $w_i(x_i) = 1$:

  $\tilde{p}(y_i|x_i) = \frac{\sum_{j \neq i} w_i(x_j)\,\delta(y_j, y_i)}{\sum_{j \neq i} w_i(x_j)} = \frac{-1 + \sum_j w_i(x_j)\,\delta(y_j, y_i)}{-1 + \sum_j w_i(x_j)}$

- Leave-one-out likelihood of the training data:

  $l_{\mathrm{LOO}} = \sum_{i=1}^n \log \tilde{p}(y_i|x_i) = \sum_{i=1}^n \log \frac{-1 + \sum_j w_i(x_j)\,\delta(y_j, y_i)}{-1 + \sum_j w_i(x_j)}$

- Search for the optimal $\sigma^2$ by maximizing the leave-one-out likelihood (see the sketch after this section).
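A minimal sketch of the basic kNN estimate of p(y|x) above (assuming NumPy, Euclidean distance, and integer class labels; the function name is illustrative):

```python
import numpy as np

def knn_posterior(X_train, y_train, x, k, n_classes):
    """Estimate p(y|x) as the fraction of the k nearest neighbors of x with label y."""
    dists = np.linalg.norm(X_train - x, axis=1)    # distance to every training point
    neighbors = y_train[np.argsort(dists)[:k]]     # labels of the points in N(x), |N(x)| = k
    return np.bincount(neighbors, minlength=n_classes) / k
```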
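And a sketch of choosing the bandwidth by maximizing the leave-one-out log-likelihood $l_{\mathrm{LOO}}$ (again assuming NumPy; the toy two-class dataset and the candidate grid of sigma values are illustrative, not from the slides):

```python
import numpy as np

def loo_log_likelihood(X, y, sigma, eps=1e-12):
    """l_LOO for the weighted kNN model with w_i(x_j) = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))       # W[i, j] = w_i(x_j), so W[i, i] = 1
    same = (y[:, None] == y[None, :]).astype(float)  # delta(y_j, y_i)
    num = (W * same).sum(axis=1) - 1.0               # subtract the i = j self term
    den = W.sum(axis=1) - 1.0
    return np.sum(np.log(np.maximum(num, eps)) - np.log(np.maximum(den, eps)))

# Illustrative usage: grid-search sigma on a toy two-class dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) + np.repeat([[0.0, 0.0], [2.0, 2.0]], 50, axis=0)
y = np.repeat([0, 1], 50)
sigmas = np.logspace(-1, 1, 20)
best_sigma = max(sigmas, key=lambda s: loo_log_likelihood(X, y, s))
```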
Gaussian Generative Model

- $p(y|x) \propto p(x|y)\,p(y)$: posterior = likelihood × prior
- Estimate p(x|y) and p(y)
- Allocate a separate set of parameters for each class $\{\theta_1, \theta_2, \ldots, \theta_c\}$: $p(x|y;\theta) \to p(x;\theta_y)$
- Maximum likelihood estimation (a code sketch follows this section):

  $p(x|y) = \frac{1}{\sqrt{2\pi}\,\sigma_y}\exp\left(-\frac{(x-\mu_y)^2}{2\sigma_y^2}\right)$

  $l = \sum_{i=1}^N \log p(x_i|y_i)\,p_{y_i} = \sum_{i=1}^N \left(-\frac{(x_i-\mu_{y_i})^2}{2\sigma_{y_i}^2} - \frac{1}{2}\log\left(2\pi\sigma_{y_i}^2\right) + \log p_{y_i}\right)$

Gaussian Generative Model

- Difficult to estimate p(x|y) if x is of high dimensionality
- Naïve Bayes: $p(\vec{x}|y;\theta) \approx p(x^1|y;\theta)\,p(x^2|y;\theta)\cdots p(x^d|y;\theta)$, essentially a linear model
- How to make a Gaussian generative model discriminative? The $(\mu, \sigma)$ of each class are estimated only from the data belonging to that class, hence the lack of discriminative power.

Gaussian Generative Model

Maximum likelihood estimation of the conditional model:

$p(y|x) = \frac{p(x|y)\,p(y)}{\sum_{y'=1}^c p(x|y')\,p(y')} = \frac{p_y \frac{1}{\sqrt{2\pi}\,\sigma_y}\exp\left(-\frac{(x-\mu_y)^2}{2\sigma_y^2}\right)}{\sum_{y'=1}^c p_{y'}\frac{1}{\sqrt{2\pi}\,\sigma_{y'}}\exp\left(-\frac{(x-\mu_{y'})^2}{2\sigma_{y'}^2}\right)}$

$l = \sum_{i=1}^N \log p(y_i|x_i) = \sum_{i=1}^N \left[\log p_{y_i} - \frac{(x_i-\mu_{y_i})^2}{2\sigma_{y_i}^2} - \frac{1}{2}\log\left(2\pi\sigma_{y_i}^2\right) - \log \sum_{y'=1}^c \frac{p_{y'}}{\sqrt{2\pi}\,\sigma_{y'}}\exp\left(-\frac{(x_i-\mu_{y'})^2}{2\sigma_{y'}^2}\right)\right]$

How to optimize this objective function?

Gaussian Generative Model

Bound optimization algorithm. Let
- $\theta = \{p_1, \mu_1, \sigma_1, \ldots, p_c, \mu_c, \sigma_c\}$ be the parameters of the current iteration,
- $\theta' = \{p'_1, \mu'_1, \sigma'_1, \ldots, p'_c, \mu'_c, \sigma'_c\}$ be the parameters of the last iteration.

Then

$l(\theta) - l(\theta') = \sum_{i=1}^N \left\{ \log\frac{p_{y_i}}{p'_{y_i}} + \frac{1}{2}\log\frac{\sigma'^2_{y_i}}{\sigma^2_{y_i}} + \frac{(x_i-\mu'_{y_i})^2}{2\sigma'^2_{y_i}} - \frac{(x_i-\mu_{y_i})^2}{2\sigma^2_{y_i}} - \log\frac{\sum_{y'=1}^c \frac{p_{y'}}{\sqrt{2\pi}\,\sigma_{y'}}\exp\left(-\frac{(x_i-\mu_{y'})^2}{2\sigma^2_{y'}}\right)}{\sum_{y'=1}^c \frac{p'_{y'}}{\sqrt{2\pi}\,\sigma'_{y'}}\exp\left(-\frac{(x_i-\mu'_{y'})^2}{2\sigma'^2_{y'}}\right)} \right\}$

Using $\log x \leq x - 1$:

$l(\theta) - l(\theta') \geq \sum_{i=1}^N \left\{ \log\frac{p_{y_i}}{p'_{y_i}} + \frac{1}{2}\log\frac{\sigma'^2_{y_i}}{\sigma^2_{y_i}} + \frac{(x_i-\mu'_{y_i})^2}{2\sigma'^2_{y_i}} - \frac{(x_i-\mu_{y_i})^2}{2\sigma^2_{y_i}} + 1 - \frac{\sum_{y'=1}^c \frac{p_{y'}}{\sqrt{2\pi}\,\sigma_{y'}}\exp\left(-\frac{(x_i-\mu_{y'})^2}{2\sigma^2_{y'}}\right)}{\sum_{y'=1}^c \frac{p'_{y'}}{\sqrt{2\pi}\,\sigma'_{y'}}\exp\left(-\frac{(x_i-\mu'_{y'})^2}{2\sigma'^2_{y'}}\right)} \right\}$

We have decomposed the interaction of parameters between different classes. Question: how to handle x with multiple features? (See the Naïve Bayes sketch after this section.)
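As a concrete illustration of the maximum likelihood fit above, here is a minimal sketch for one-dimensional x (assuming NumPy and integer class labels; function names are illustrative, not from the slides):

```python
import numpy as np

def fit_gaussian_generative(x, y, n_classes):
    """ML estimates of class priors p_y, means mu_y, and std devs sigma_y,
    each computed only from the data belonging to that class."""
    p = np.array([np.mean(y == c) for c in range(n_classes)])
    mu = np.array([x[y == c].mean() for c in range(n_classes)])
    sigma = np.array([x[y == c].std() for c in range(n_classes)])  # ddof=0 is the MLE
    return p, mu, sigma

def gaussian_posterior(x0, p, mu, sigma):
    """p(y|x) = p_y N(x; mu_y, sigma_y^2) / sum_{y'} p_{y'} N(x; mu_{y'}, sigma_{y'}^2)."""
    lik = p / (np.sqrt(2 * np.pi) * sigma) * np.exp(-(x0 - mu) ** 2 / (2 * sigma ** 2))
    return lik / lik.sum()
```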
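For x with multiple features, the Naïve Bayes factorization above can be sketched as follows (again assuming NumPy; computing in log space is a standard numerical-stability choice, not something the slides prescribe):

```python
import numpy as np

def fit_naive_bayes(X, y, n_classes):
    """Class priors plus one Gaussian per class and per feature:
    p(x|y) ~ prod_d N(x^d; mu_{y,d}, var_{y,d})."""
    p = np.array([np.mean(y == c) for c in range(n_classes)])
    mu = np.array([X[y == c].mean(axis=0) for c in range(n_classes)])
    var = np.array([X[y == c].var(axis=0) for c in range(n_classes)])
    return p, mu, var

def nb_posterior(x, p, mu, var):
    """p(y|x) via Bayes' rule, evaluated in log space for stability."""
    log_post = np.log(p) - 0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var, axis=1)
    log_post -= log_post.max()      # shift before exponentiating to avoid underflow
    post = np.exp(log_post)
    return post / post.sum()
```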
Logistic Regression Model

- A linear decision boundary: $\vec{w}\cdot\vec{x} + b$

  $\vec{w}\cdot\vec{x} + b > 0 \Rightarrow \text{positive}, \qquad \vec{w}\cdot\vec{x} + b < 0 \Rightarrow \text{negative}$

- A probabilistic model p(y|x):

  $p(y = \pm 1\,|\,\vec{x}) = \frac{1}{1 + \exp\left(-y\,(\vec{w}\cdot\vec{x} + b)\right)}$

- Maximum likelihood approach for estimating the weights $\vec{w}$ and threshold $b$ (a gradient-ascent sketch follows):

  $l(D_{\mathrm{train}}) = \sum_{i=1}^{N_+} \log p(+|\vec{x}_i^{\,+}) + \sum_{i=1}^{N_-} \log p(-|\vec{x}_i^{\,-}) = \sum_{i=1}^{N_+} \log\frac{1}{1+\exp\left(-\vec{w}\cdot\vec{x}_i^{\,+} - b\right)} + \sum_{i=1}^{N_-} \log\frac{1}{1+\exp\left(\vec{w}\cdot\vec{x}_i^{\,-} + b\right)}$
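A minimal gradient-ascent sketch of this maximum likelihood estimation (assuming NumPy and labels $y_i \in \{-1, +1\}$; the learning rate and iteration count are arbitrary illustrative choices, and the slides do not prescribe a particular optimizer):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iter=1000):
    """Maximize sum_i log p(y_i|x_i) with p(y|x) = sigmoid(y (w.x + b))
    by plain gradient ascent."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_iter):
        margin = y * (X @ w + b)
        # Gradient of the log-likelihood: sum_i y_i x_i (1 - p(y_i|x_i)).
        coeff = y * (1.0 - sigmoid(margin))
        w += lr * (X.T @ coeff) / n
        b += lr * coeff.mean()
    return w, b
```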