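Discussion on Logistic Regression and Naïve Bayes
Jingrui He
09/27/2007

Review of Logistic Regression
- Discriminative classifier
- Function form for P(Y|X):

$$P(Y = 1 \mid X, w) = \frac{\exp\left(w_0 + \sum_i w_i X_i\right)}{1 + \exp\left(w_0 + \sum_i w_i X_i\right)}$$

- Can NOT obtain a sample of the data, because P(X) is not available

Parameter Estimation
- Gradient ascent, with step size $\eta$:

$$w_0^{(t+1)} \leftarrow w_0^{(t)} + \eta \sum_j \left[ Y^j - P(Y^j = 1 \mid X^j, w^{(t)}) \right]$$

$$w_i^{(t+1)} \leftarrow w_i^{(t)} + \eta \sum_j X_i^j \left[ Y^j - P(Y^j = 1 \mid X^j, w^{(t)}) \right]$$

- Upon convergence:

$$\frac{\partial l(w)}{\partial w_0} = \sum_j \left[ Y^j - P(Y^j = 1 \mid X^j, w) \right] = 0$$

$$\frac{\partial l(w)}{\partial w_i} = \sum_j X_i^j \left[ Y^j - P(Y^j = 1 \mid X^j, w) \right] = 0$$

Linearly Separable
- What's the value of w? INFINITY
- Why? Maximum likelihood:

$$P(Y = 1 \mid X, w) = \frac{\exp\left(w_0 + \sum_i w_i X_i\right)}{1 + \exp\left(w_0 + \sum_i w_i X_i\right)}$$

More Training Examples
- No change in w
- Why?

$$w_0^{(t+1)} \leftarrow w_0^{(t)} + \eta \sum_j \left[ Y^j - P(Y^j = 1 \mid X^j, w^{(t)}) \right]$$

$$w_i^{(t+1)} \leftarrow w_i^{(t)} + \eta \sum_j X_i^j \left[ Y^j - P(Y^j = 1 \mid X^j, w^{(t)}) \right]$$

Non-Linearly Separable

More Training Examples

Still More Training Examples
- Why?
- Originally, upon convergence:

$$\frac{\partial l(w)}{\partial w_0} = \sum_j \left[ Y^j - P(Y^j = 1 \mid X^j, w) \right] = 0$$

- With 3 more points:

$$\frac{\partial l(w)}{\partial w_0} > 0$$

- To let the derivative be 0 again: increase $P(Y^j = 1 \mid X^j, w)$

Multiple Classes
- R − 1 sets of weights:

$$P(Y = j \mid X, w) = \frac{\exp\left(w_{j0} + \sum_i w_{ji} X_i\right)}{1 + \sum_{k=1}^{R-1} \exp\left(w_{k0} + \sum_i w_{ki} X_i\right)}, \quad j = 1, \dots, R-1$$

$$P(Y = R \mid X, w) = \frac{1}{1 + \sum_{k=1}^{R-1} \exp\left(w_{k0} + \sum_i w_{ki} X_i\right)}$$

- Classification:
  - Comparing $\exp\left(w_{j0} + \sum_i w_{ji} X_i\right)$ and 1
  - Comparing $w_{j0} + \sum_i w_{ji} X_i$ and 0

4 Classes in 2-d Space
[Figure: regions of a 2-d space labeled Class 1, Class 2, Class 3, Class 4]

LR vs. NB
- Loss functions
  - LR: maximum conditional data likelihood, $\sum_j \ln P(Y^j \mid X^j, w)$
  - NB: maximum data likelihood, $\sum_j \ln P(X^j, Y^j \mid w)$
- Different solutions

LR vs. NB
- In NB, assume class-independent variance:

$$P(Y = 1 \mid X, w) = \frac{1}{1 + \exp\left(w_0 + \sum_i w_i X_i\right)}$$

$$w_0 = \ln\frac{1 - \pi}{\pi} + \sum_i \frac{\mu_{i1}^2 - \mu_{i0}^2}{2\sigma_i^2}, \qquad w_i = \frac{\mu_{i0} - \mu_{i1}}{\sigma_i^2}$$

  where $\pi = P(Y = 1)$, $\mu_{ik}$ is the mean of $X_i$ in class $k$, and $\sigma_i^2$ is the shared variance of $X_i$. In this form the exponential appears only in the denominator, so these weights have the opposite sign to those in the review form above.

LR vs. NB
[Slide with labels: LR, NB]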
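The gradient ascent updates and the separable versus non-separable behaviour discussed above can be tried on toy data. The following is a minimal sketch in plain Python, not the lecture's own code: the 1-d data points, the step size `eta`, the iteration counts, and the helper names (`sigmoid`, `predict`, `gradient`, `fit`) are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    # exp(z) / (1 + exp(z)), written to avoid overflow for large |z|
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w0, w, x):
    # P(Y = 1 | X = x, w) in the exp / (1 + exp) form
    return sigmoid(w0 + sum(wi * xi for wi, xi in zip(w, x)))

def gradient(w0, w, X, Y):
    # Gradient of the conditional log-likelihood:
    # dl/dw0 = sum_j [Y^j - P(Y^j = 1 | X^j, w)]
    # dl/dwi = sum_j X_i^j [Y^j - P(Y^j = 1 | X^j, w)]
    g0, g = 0.0, [0.0] * len(w)
    for x, y in zip(X, Y):
        err = y - predict(w0, w, x)
        g0 += err
        for i, xi in enumerate(x):
            g[i] += xi * err
    return g0, g

def fit(X, Y, eta=0.1, steps=2000):
    # Plain batch gradient ascent from w = 0 (eta and steps are arbitrary choices)
    w0, w = 0.0, [0.0] * len(X[0])
    for _ in range(steps):
        g0, g = gradient(w0, w, X, Y)
        w0 += eta * g0
        w = [wi + eta * gi for wi, gi in zip(w, g)]
    return w0, w

# Hypothetical linearly separable data: the weight keeps growing with more steps.
X_sep, Y_sep = [[-2.0], [-1.0], [1.0], [2.0]], [0, 0, 1, 1]
_, w_short = fit(X_sep, Y_sep, steps=500)
_, w_long = fit(X_sep, Y_sep, steps=5000)

# Hypothetical non-separable data: gradient ascent settles at a finite w.
X_non = [[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]]
Y_non = [0, 0, 1, 0, 1, 1]
w0_old, w_old = fit(X_non, Y_non)
g0_old, g_old = gradient(w0_old, w_old, X_non, Y_non)

# Add 3 more positive points: at the old solution dl/dw0 becomes positive,
# so refitting has to raise P(Y = 1 | X, w), here by increasing w0.
X_more = X_non + [[-0.2], [0.0], [0.2]]
Y_more = Y_non + [1, 1, 1]
g0_more, _ = gradient(w0_old, w_old, X_more, Y_more)
w0_new, w_new = fit(X_more, Y_more)

if __name__ == "__main__":
    print("separable: w after 500 / 5000 steps:", w_short[0], w_long[0])
    print("non-separable: w0, w, dl/dw0:", w0_old, w_old[0], g0_old)
    print("3 more positives: dl/dw0 at old w:", g0_more, "refit w0:", w0_new)
```

On the separable set every term of the gradient with respect to the weight stays positive, so the weight never stops increasing, which is the finite-step picture of w going to infinity. On the non-separable set the derivatives shrink to zero at a finite w, and the three extra positive points push the derivative with respect to w0 above zero until the refit raises the predicted probabilities.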