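Discussion on Logistic Regression and Naïve Bayes
Jingrui He
09/27/2007

Review of Logistic Regression
Discriminative classifier
Function form for P(Y | X):
\[ P(Y = 1 \mid X, w) = \frac{\exp\left(w_0 + \sum_i w_i X_i\right)}{1 + \exp\left(w_0 + \sum_i w_i X_i\right)} \]
Can NOT obtain a sample of the data, because P(X) is not available

Parameter Estimation
Gradient ascent:
\[ w_0^{t+1} \leftarrow w_0^{t} + \eta \sum_j \left[ Y^j - \hat{P}(Y^j = 1 \mid X^j, w^t) \right] \]
\[ w_i^{t+1} \leftarrow w_i^{t} + \eta \sum_j X_i^j \left[ Y^j - \hat{P}(Y^j = 1 \mid X^j, w^t) \right] \]
Upon convergence:
\[ \frac{\partial l(w)}{\partial w_0} = \sum_j \left[ Y^j - P(Y^j = 1 \mid X^j, w) \right] = 0 \]
\[ \frac{\partial l(w)}{\partial w_i} = \sum_j X_i^j \left[ Y^j - P(Y^j = 1 \mid X^j, w) \right] = 0 \]

Linear Separable
[Figure: positive (+) and negative (_) examples with weight vector w]
What's the value of w? INFINITY!
Why? Maximum likelihood
\[ P(Y = 1 \mid X, w) = \frac{\exp\left(w_0 + \sum_i w_i X_i\right)}{1 + \exp\left(w_0 + \sum_i w_i X_i\right)} \]

More Training Examples
[Figure: positive (+) and negative (_) examples with weight vector w]
No change in w. Why?
\[ w_0^{t+1} \leftarrow w_0^{t} + \eta \sum_j \left[ Y^j - \hat{P}(Y^j = 1 \mid X^j, w^t) \right] \]
\[ w_i^{t+1} \leftarrow w_i^{t} + \eta \sum_j X_i^j \left[ Y^j - \hat{P}(Y^j = 1 \mid X^j, w^t) \right] \]

Non-Linear Separable

More Training Examples

Still More Training Examples

Why?
Originally, upon convergence:
\[ \frac{\partial l(w)}{\partial w_0} = \sum_j \left[ Y^j - P(Y^j = 1 \mid X^j, w) \right] = 0 \]
With 3 more points:
\[ \frac{\partial l(w)}{\partial w_0} > 0 \]
To let the derivative be 0 again: increase P(Y^j = 1 | X^j, w)

Multiple Classes
R-1 sets of weights:
\[ P(Y = j \mid X, w) \propto \exp\left(w_{j0} + \sum_i w_{ji} X_i\right), \quad j = 1, \ldots, R-1 \]
\[ P(Y = R \mid X, w) = \frac{1}{1 + \sum_{j=1}^{R-1} \exp\left(w_{j0} + \sum_i w_{ji} X_i\right)} \]
Classification:
  Comparing \( \exp\left(w_{j0} + \sum_i w_{ji} X_i\right) \) and 1
  Comparing \( w_{j0} + \sum_i w_{ji} X_i \) and 0

4 Classes in 2d Space
[Figure: two panels, each showing regions labeled Class 1, Class 2, Class 3, Class 4]

LR vs. NB
Loss functions
  LR: maximum conditional data likelihood
  \[ \sum_j \ln P(Y^j \mid X^j, w) \]
  NB: maximum data likelihood
  \[ \sum_j \ln P(X^j, Y^j \mid w) \]
Different solutions!

LR vs. NB
In NB, assume class independent variance
\[ P(Y = 1 \mid X, w) = \frac{1}{1 + \exp\left(w_0 + \sum_i w_i x_i\right)} \]
\[ w_i = \frac{\mu_{i0} - \mu_{i1}}{\sigma_i^2} \]
\[ w_0 = \ln\frac{1 - \theta}{\theta} + \sum_i \frac{\mu_{i1}^2 - \mu_{i0}^2}{2\sigma_i^2} \]

LR vs.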
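
As a rough illustration of the "Linear Separable" and "Why?" slides, the sketch below runs the batch gradient ascent updates from "Parameter Estimation" on two small one-dimensional data sets. The data, the step size `eta`, the step counts, and the helper names `sigmoid` and `gradient_ascent` are assumptions made for this example and do not come from the slides. On the separable set the weight keeps growing as more steps are taken, while on the overlapping set it settles at a finite value.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function exp(z) / (1 + exp(z)).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def gradient_ascent(X, Y, eta=0.1, steps=1000):
    """Batch gradient ascent on the conditional log-likelihood.

    Applies, for `steps` iterations, the updates
      w0  <- w0  + eta * sum_j (Y^j - P(Y^j = 1 | X^j, w))
      w_i <- w_i + eta * sum_j X_i^j (Y^j - P(Y^j = 1 | X^j, w))
    and returns (w0, w).
    """
    w0, w = 0.0, [0.0] * len(X[0])
    for _ in range(steps):
        # Residuals Y^j - P(Y^j = 1 | X^j, w) under the current weights.
        residuals = [
            y - sigmoid(w0 + sum(wi * xi for wi, xi in zip(w, x)))
            for x, y in zip(X, Y)
        ]
        w0 += eta * sum(residuals)
        w = [
            wi + eta * sum(r * x[i] for r, x in zip(residuals, X))
            for i, wi in enumerate(w)
        ]
    return w0, w


# Hypothetical toy data: negatives on the left, positives on the right.
X_sep = [[-2.0], [-1.0], [1.0], [2.0]]
Y_sep = [0, 0, 1, 1]

# Same points plus two examples on the "wrong" side, so the classes overlap.
X_mix = X_sep + [[-1.5], [1.5]]
Y_mix = Y_sep + [1, 0]

w0_short, w_short = gradient_ascent(X_sep, Y_sep, steps=100)
w0_long, w_long = gradient_ascent(X_sep, Y_sep, steps=10000)
v0_short, v_short = gradient_ascent(X_mix, Y_mix, steps=1000)
v0_long, v_long = gradient_ascent(X_mix, Y_mix, steps=10000)

print("separable:   w after 100 steps =", w_short[0],
      " after 10000 steps =", w_long[0])
print("overlapping: w after 1000 steps =", v_short[0],
      " after 10000 steps =", v_long[0])
```

On the separable set every residual has the sign that pushes w further in the same direction, so no finite w makes the derivative zero. On the overlapping set the two misplaced points contribute residuals of the opposite sign, and the derivative can reach zero at a finite w.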