Middle Term Exam
• 02/28 (Thursday), take home, turn in at noon of 02/29 (Friday)

Project
• 03/14 (Phase 1): 10% of training data is available for algorithm development
• 04/04 (Phase 2): full training data and test examples are available
• 04/17 (submission): submit your prediction before 11:59pm Apr. 20 (Wednesday)
• 04/23 and 04/25: project presentations; announce the competition results
• 04/28: project report is due

Logistic Regression
Rong Jin

Logistic Regression
• Generative models often lead to a linear decision boundary
• Linear discriminative model: directly model the linear decision boundary
• w is the parameter to be learned

Logistic Regression
• Model p(y | x; w) = 1 / (1 + exp(-y w·x)) for labels y ∈ {-1, +1}
• Learn the parameter w by Maximum Likelihood Estimation (MLE)
• Given training data {(x_i, y_i)}, maximize the log-likelihood of the labels

Logistic Regression
• Convex objective function, so the optimum found is global
• Solve by gradient descent

Illustration of Gradient Descent
• (figure illustrating gradient-descent steps)

Example: Heart Disease
• Input feature x: age-group id
• Output y: whether the subject has heart disease
  • y = +1: has heart disease
  • y = -1: no heart disease
• Age groups: 1: 25-29, 2: 30-34, 3: 35-39, 4: 40-44, 5: 45-49, 6: 50-54, 7: 55-59, 8: 60-64

Example: Text Categorization
• Learn to classify text into two categories
• Input d: a document, represented by a word histogram
• Output y: +1 for a political document, -1 for a non-political document

Example: Text Categorization
• Training data

Example 2: Text Classification
• Dataset: Reuters-21578
• Classification accuracy:
  • Naïve Bayes: 77%
  • Logistic regression: 88%

Logistic Regression vs. Naïve Bayes
• Both give linear decision boundaries
  • Naïve Bayes: weights follow from the estimated class-conditional probabilities
  • Logistic regression: weights learned by MLE
• Both can be viewed as modeling p(d | y)
  • Naïve Bayes: independence assumption
  • Logistic regression: assumes an exponential-family distribution for p(d | y) (a broad assumption)

Discriminative vs. Generative
Discriminative models: model P(y | x)
• Pros: usually good performance
• Cons: slow convergence; expensive computation; sensitive to noisy data
Generative models: model P(x | y)
• Pros: usually fast convergence; cheap computation; robust to noisy data
• Cons: usually worse performance

Overfitting Problem
• Consider text categorization: what is the weight for a word j that appears in only one training document d_k?
• (figure: objective value by iteration, with and without regularization)

Overfitting Problem
• Overfitting causes a decrease in the classification accuracy on test data

Solution: Regularization
• Regularized log-likelihood: subtract a penalty on the norm of w from the log-likelihood
• Effects of the regularizer:
  • Favors small weights
  • Guarantees a bounded norm of w
  • Guarantees a unique solution

Regularized Logistic Regression
• (figure: classification performance by iteration, with and without regularization)

Regularization as Robust Optimization
• Assume each data point is unknown but bounded in a sphere of a given radius centered at x_i

Sparse Solution by Lasso Regularization
• RCV1 collection: 800K documents, 47K unique words

Sparse Solution by Lasso Regularization
• How to solve the optimization problem?
  • Subgradient descent
  • Minimax

Bayesian Treatment
• Compute the posterior distribution of w
• Laplacian approximation

Bayesian Treatment
• Laplacian approximation (continued)

Multi-class Logistic Regression
• How to extend the logistic regression model to multi-class classification?

Conditional Exponential Model
• Need to learn a weight vector w_s for each class s
• Normalization factor (partition function) makes the class probabilities sum to one

Conditional Exponential Model
• Learn the weights w_s by maximum likelihood estimation
• Any problem?

Modified Conditional Exponential Model
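The slides' equations were not recovered in this extraction. As a rough, self-contained sketch of what the bullets describe — maximum-likelihood logistic regression with labels y ∈ {-1, +1}, fit by gradient ascent on the (optionally L2-regularized) log-likelihood — one might write the following; all function and variable names are illustrative, not from the slides:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.5, lam=0.0, n_iters=5000):
    """Gradient ascent on the log-likelihood of p(y|x; w) = sigmoid(y * w.x),
    with labels y in {-1, +1}. lam > 0 adds an L2 regularizer (sketch of the
    'Solution: Regularization' slide: it shrinks w toward small weights)."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iters):
        margins = y * (X @ w)                       # y_i * (w . x_i)
        # gradient of log-likelihood: sum_i y_i x_i sigmoid(-margin_i),
        # minus lam * w for the regularizer
        grad = X.T @ (y * sigmoid(-margins)) - lam * w
        w += lr * grad / n
    return w

def predict(X, w):
    return np.where(X @ w >= 0, 1, -1)

# Toy 1-D data in the spirit of the heart-disease slide: the feature is an
# age-group id (1..8) plus a bias term; labels flip between groups 4 and 5.
X = np.array([[1.0, g] for g in range(1, 9)])
y = np.array([-1, -1, -1, -1, 1, 1, 1, 1])
w = fit_logistic(X, y)
print(predict(X, w))
```

Because the toy data are separable and the objective is concave, gradient ascent recovers a threshold between groups 4 and 5 and classifies all training points correctly.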
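The conditional exponential model of the multi-class slides, p(s | x) ∝ exp(w_s · x) with a partition function Z(x) as the normalizer, is what is now usually called softmax regression. A minimal sketch of the probability computation (names and numbers are illustrative):

```python
import numpy as np

def softmax_probs(W, x):
    """p(s | x) = exp(w_s . x) / Z(x), where Z(x) sums exp(w_s . x) over
    all classes s (the partition function of the slides)."""
    scores = W @ x                 # one score w_s . x per class s
    scores -= scores.max()         # stabilize before exponentiating
    expw = np.exp(scores)
    return expw / expw.sum()       # divide by Z(x)

# Three classes, two features: one weight vector w_s per row of W.
W = np.array([[ 1.0,  0.0],
              [ 0.0,  1.0],
              [-1.0, -1.0]])
p = softmax_probs(W, np.array([2.0, 0.5]))
print(p.argmax())  # class with the largest score w_s . x  → 0
```

One issue with this parameterization — plausibly what the slide's "Any problem?" points at, and why a "modified" model follows — is that it is overdetermined: adding the same constant vector to every w_s leaves p(s | x) unchanged, so the maximum-likelihood weights are not unique without a constraint or regularizer.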