Project
- Now it is time to think about the project
- It is teamwork
  - Each team will consist of 2 people
- It is better to propose a project of your own
  - Otherwise, I will assign you a “difficult” project.
- Important dates
  - 03/11: project proposal due
  - 04/01: project progress report due
  - 04/22 and 04/24: final presentations
  - 05/03: final report due

Project Proposal
- What do I expect?
  - Introduction: describe the research problem that you are trying to solve
  - Related work: describe the existing approaches and their deficiencies
  - Proposed approach: describe your approach and why it has the potential to alleviate the deficiencies of existing approaches
  - Plan: what do you plan to do in this project?
- Format
  - It should look like a research paper
  - The required format (both Microsoft Word and LaTeX) can be downloaded from www.cse.msu.edu/~cse847/assignments/format.zip

Project Progress Report
- Introduction: give an overview of the problem that you are trying to solve and the solutions that you presented in the proposal
- Progress
  - Algorithm description in more detail
  - Related data collection and cleanup
  - Preliminary results
- Format should be the same as the project report

Project Final Report
- It should look like a research paper that is ready for
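submission to research conferences
- What do I expect?
  - Introduction
  - Algorithm description and discussion
  - Empirical studies
  - I expect a careful analysis of the results, whether the approach is a success or a complete failure
- Presentation
  - 25-minute presentation
  - 5-minute discussion

Exponential Model and Maximum Entropy Model
Rong Jin

Recap: Logistic Regression Model
- Assume the inputs and outputs are related by a log-linear function

$$p(y \mid \vec{x}; \theta) = \frac{1}{1 + \exp\left[-y(\vec{x} \cdot \vec{w} + c)\right]}, \qquad \theta = \{w_1, w_2, \ldots, w_m, c\}$$

- Estimate the weights: MLE approach (with regularization weight $s$)

$$\vec{w}^* = \arg\max_{\vec{w}} l_{reg}(D_{train}) = \arg\max_{\vec{w}} \sum_{i=1}^{n} \log p(y_i \mid \vec{x}_i; \theta) - s \sum_{j=1}^{m} w_j^2 = \arg\max_{\vec{w}} \sum_{i=1}^{n} \log \frac{1}{1 + \exp\left[-y_i(\vec{x}_i \cdot \vec{w} + c)\right]} - s \sum_{j=1}^{m} w_j^2$$

How to Extend Logistic Regression Model to Multiple Classes?
- $y \in \{+1, -1\} \rightarrow \{1, 2, \ldots, C\}$?
- Binary model and its log-odds:

$$p(y \mid \vec{x}; \theta) = \frac{1}{1 + \exp\left[-y(\vec{x} \cdot \vec{w} + c)\right]}, \qquad \theta = \{w_1, w_2, \ldots, w_m, c\}$$

$$\log \frac{p(y = 1 \mid \vec{x})}{p(y = -1 \mid \vec{x})} = \vec{x} \cdot \vec{w} + c$$

Conditional Exponential Model
- Introduce a different set of parameters for each class

$$p(y \mid \vec{x}; \theta) \propto \exp(c_y + \vec{x} \cdot \vec{w}_y), \qquad \theta_y = \{c_y, \vec{w}_y\}$$

- Ensure that the probabilities sum to 1

$$p(y \mid \vec{x}; \theta) = \frac{1}{Z(\vec{x})} \exp(c_y + \vec{x} \cdot \vec{w}_y), \qquad Z(\vec{x}) = \sum_{y} \exp(c_y + \vec{x} \cdot \vec{w}_y)$$

Conditional Exponential Model
- Prediction probability

$$p(y \mid \vec{x}; \theta) = \frac{\exp(c_y + \vec{x} \cdot \vec{w}_y)}{\sum_{y'} \exp(c_{y'} + \vec{x} \cdot \vec{w}_{y'})}, \qquad y \in \{1, 2, \ldots, C\}$$

- Model parameters: for each class $y$, we have weights $\vec{w}_y$ and threshold $c_y$
- Maximum likelihood estimation

$$l(D_{train}) = \sum_{i=1}^{N} \log p(y_i \mid \vec{x}_i) = \sum_{i=1}^{N} \log \frac{\exp(c_{y_i} + \vec{x}_i \cdot \vec{w}_{y_i})}{\sum_{y} \exp(c_y + \vec{x}_i \cdot \vec{w}_y)}$$

- Any problems?

Conditional Exponential Model
- Adding the same constant vector $\vec{w}_0$ to every weight vector, and the same constant $c_0$ to every threshold, leaves the log-likelihood function unchanged

$$\vec{w}_y \rightarrow \vec{w}_y + \vec{w}_0, \qquad c_y \rightarrow c_y + c_0$$

$$l(D_{train}) = \sum_{i=1}^{N} \log \frac{\exp\left(c_{y_i} + c_0 + \vec{x}_i \cdot (\vec{w}_{y_i} + \vec{w}_0)\right)}{\sum_{y} \exp\left(c_y + c_0 + \vec{x}_i \cdot (\vec{w}_y + \vec{w}_0)\right)} = \sum_{i=1}^{N} \log \frac{\exp(c_{y_i} + \vec{x}_i \cdot \vec{w}_{y_i})}{\sum_{y} \exp(c_y + \vec{x}_i \cdot \vec{w}_y)}$$

- The optimal solution is not unique!
- How to resolve this problem?
- Solution: set $\vec{w}_1$ to be a zero vector and $c_1$ to be zero

Modified Conditional Exponential Model
- Prediction probability
- Model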
parameters: for each class $y > 1$, we have weights $\vec{w}_y$ and threshold $c_y$
- Maximum likelihood estimation

The prediction probability is

$$p(y \mid \vec{x}; \theta) = \begin{cases} \dfrac{\exp(c_y + \vec{x} \cdot \vec{w}_y)}{1 + \sum_{y' > 1} \exp(c_{y'} + \vec{x} \cdot \vec{w}_{y'})} & y \in \{2, \ldots, C\} \\[2ex] \dfrac{1}{1 + \sum_{y' > 1} \exp(c_{y'} + \vec{x} \cdot \vec{w}_{y'})} & y = 1 \end{cases}$$

and the training log-likelihood is

$$l(D_{train}) = \sum_{i=1}^{N} \log p(y_i \mid \vec{x}_i) = \sum_{\{i \mid y_i > 1\}} \log \frac{\exp(c_{y_i} + \vec{x}_i \cdot \vec{w}_{y_i})}{1 + \sum_{y > 1} \exp(c_y + \vec{x}_i \cdot \vec{w}_y)} + \sum_{\{i \mid y_i = 1\}} \log \frac{1}{1 + \sum_{y > 1} \exp(c_y + \vec{x}_i \cdot \vec{w}_y)}$$

Maximum Entropy Model: Motivation
- Consider a translation example
  - English ‘in’ → French {dans, en, à, au-cours-de, pendant}
  - Goal: p(dans), p(en), p(à), p(au-cours-de), p(pendant)
- Case 1: no prior knowledge of the translation
  - What is your guess of the probabilities?
  - p(dans) = p(en) = p(à) = p(au-cours-de) = p(pendant) = 1/5
- Case 2: 30% of the time either dans or en is used
  - What is your guess of the probabilities?
  - p(dans) = p(en) = 3/20, p(à) = p(au-cours-de) = p(pendant) = 7/30
- The uniform distribution is favored

Maximum Entropy Model: Motivation
- Case 3: 30% of the time dans or en is used, and 50% of the time dans or à is used
  - What is your guess of the probabilities?
- A good probability distribution should
  - Satisfy the constraints
  - Be close to the uniform distribution, but how?
- Measure uniformity using the Kullback-Leibler distance!

Maximum Entropy Principle (MaxEnt)
- The uniformity of a distribution is measured by the entropy
of the distribution
- Solution: p(dans) = 0.2, p(à) = 0.3, p(en) = 0.1, p(au-cours-de) =
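
The guesses in the translation example can be checked numerically. The sketch below is not from the slides: the names `entropy`, `CASE2`, `case3_distribution`, and `solve_case3` are hypothetical, and it relies on two assumptions. First, entropy is taken as $H(p) = -\sum_k p_k \log p_k$. Second, for Case 3 the two unconstrained words (au-cours-de, pendant) share their mass equally, which follows from symmetry, so only p(dans) remains free and a simple ternary search over it is enough. This is a stand-in for a general MaxEnt solver, not the method the lecture goes on to present.

```python
import math

WORDS = ["dans", "en", "à", "au-cours-de", "pendant"]

def entropy(p):
    """Shannon entropy in nats; zero-probability terms contribute 0."""
    return -sum(q * math.log(q) for q in p if q > 0)

# Case 2: p(dans) + p(en) = 0.3, mass spread evenly inside each group.
CASE2 = [0.15, 0.15, 0.7 / 3, 0.7 / 3, 0.7 / 3]

def case3_distribution(d):
    """Distribution for Case 3 once p(dans) = d is fixed.

    Constraints: p(dans) + p(en) = 0.3 and p(dans) + p(à) = 0.5.
    Assumption: the remaining mass is split evenly between the
    two unconstrained words.
    """
    rest = (0.2 + d) / 2
    return [d, 0.3 - d, 0.5 - d, rest, rest]

def solve_case3(tol=1e-12):
    """Ternary search for the d in [0, 0.3] that maximizes entropy.

    The entropy is concave in d, so ternary search converges to the maximizer.
    """
    lo, hi = 0.0, 0.3
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if entropy(case3_distribution(m1)) < entropy(case3_distribution(m2)):
            lo = m1
        else:
            hi = m2
    return case3_distribution((lo + hi) / 2)

for word, prob in zip(WORDS, solve_case3()):
    print(f"p({word}) = {prob:.3f}")
```

Under these assumptions the search gives p(dans) of roughly 0.186, p(en) of roughly 0.114, and p(à) of roughly 0.314, which is close to the rounded values 0.2, 0.1, and 0.3 quoted in the solution line (which appears to correspond to Case 3).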
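
Returning to the conditional exponential model recapped earlier: the claim that adding a constant vector to every weight vector leaves the likelihood unchanged, and that fixing class 1's parameters to zero removes this ambiguity, can be illustrated with a small numeric check. This is a minimal sketch, not code from the lecture; the function names `predict_proba`, `shift`, and `anchor_first_class` and the example numbers are hypothetical.

```python
import math

def predict_proba(x, weights, thresholds):
    """p(y | x) proportional to exp(c_y + x . w_y), normalized over classes."""
    scores = [c + sum(xi * wi for xi, wi in zip(x, w))
              for w, c in zip(weights, thresholds)]
    top = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def shift(weights, thresholds, w0, c0):
    """Add the same vector w0 to every w_y and the same constant c0 to every c_y."""
    new_w = [[wi + w0i for wi, w0i in zip(w, w0)] for w in weights]
    new_c = [c + c0 for c in thresholds]
    return new_w, new_c

def anchor_first_class(weights, thresholds):
    """Shift all parameters so that w_1 is the zero vector and c_1 is zero."""
    return shift(weights, thresholds, [-wi for wi in weights[0]], -thresholds[0])

# Hypothetical three-class example with two features.
w = [[0.5, 1.0], [-1.0, 0.3], [2.0, -0.7]]
c = [0.1, -0.2, 0.4]
x = [1.0, -2.0]

w_shifted, c_shifted = shift(w, c, [3.0, -1.5], 0.7)
w_anchored, c_anchored = anchor_first_class(w, c)

print(predict_proba(x, w, c))
print(predict_proba(x, w_shifted, c_shifted))    # same probabilities
print(predict_proba(x, w_anchored, c_anchored))  # same probabilities, w_1 = 0, c_1 = 0
```

All three parameter sets produce the same predictions, so the likelihood cannot distinguish them; anchoring class 1 simply picks one representative from each family of equivalent solutions.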