CMU CS 10601 - Gaussian Naive Bayes and Logistic Regression

10-601 Recitation #4: Gaussian Naive Bayes and Logistic Regression
October 5th, 2011
Shing-hon Lau (office hours: Friday 3-4 PM)

Agenda
- HW #2 is due tomorrow at 5 PM: submit a written copy and post your code to Blackboard.
- Gaussian Naive Bayes
- Logistic Regression
- Gradient Descent
- Discriminative vs. Generative classifiers
- Bias/Variance Tradeoff

Naive Bayes
- Assumes features are conditionally independent given the class.
- So far we assumed all variables were binary (or discrete).
- What if we have continuous features?

Gaussian Naive Bayes
- Still assume features are conditionally independent given the class.
- Generally assume Gaussian features (why?), though other distributions can be used as well.
- A small code sketch at the end of these notes illustrates the per-class Gaussian fit.

Gaussian Distribution
- Also called the normal distribution: p(x) = (1 / sqrt(2*pi*sigma^2)) * exp(-(x - mu)^2 / (2*sigma^2)).
- You will compute the MLE and MAP estimates in HW2.

Why Gaussian?
- Many natural phenomena are normally distributed, e.g., biological measurements (height, weight, etc.).
- The Central Limit Theorem implies that sample means tend toward a normal distribution.
- It is mathematically easy to work with.

Working with Continuous Variables
- Discrete variables: estimate P(X_i | Y) with a table of probabilities learned from counts.
- Continuous variables: estimate P(X_i | Y) with a density, e.g., a Gaussian with a per-class mean and variance.

Generative vs. Discriminative Classifiers
- Generative classifiers learn P(X | Y) and use Bayes' rule to calculate P(Y | X).
- Discriminative classifiers learn P(Y | X) directly.
- Which type is Naive Bayes?

Generative vs. Discriminative Classifiers
- Discriminative classifiers are only good for classification.
- Generative classifiers enable other tasks as well (e.g., data generation).
- Generally speaking, generative classifiers are more accurate with less data, discriminative classifiers with more data.

Logistic Regression
- An example of a discriminative classifier; it can handle arbitrarily many classes.
- Binary model: P(Y = 1 | X, w) = 1 / (1 + exp(-(w_0 + sum_i w_i X_i))).

Finding the Weights
- For the special Gaussian case we could derive an analytic expression for the weights.
- How can we find the weights in the general case?

Good Weights and the Maximum Conditional Likelihood Estimate (MCLE)
- Good weights make the observed labels likely given the inputs: choose w to maximize the conditional log-likelihood
  l(w) = sum_j [ y^j * ln P(Y = 1 | x^j, w) + (1 - y^j) * ln P(Y = 0 | x^j, w) ].

Maximizing l(w)
- There is no closed form for the maximum, so how do we find the MCLE?

Gradient Descent
- An iterative optimization algorithm. Basic idea: find the direction of greatest decrease, take a small step in that direction, and repeat until we are satisfied (usually, stop when the change is very small).
- Guaranteed to find the optimal point in some cases.

1-D Gradient Descent
- In one dimension the gradient is just the derivative.
- Example: start at a random point, say (1, 1); take a step in the negative gradient direction; repeat the process; eventually converge to the optimum point.

Problems with Gradient Descent
- If the step size is too big, we can end up bouncing back and forth between two values.
- If the function is not convex/concave, we may end up in a local optimum.

Bad Step-Size
- Example: start at (1, 1), step to (-1, 1), then step back to (1, 1), and so on forever.

Non-convexity
- (Figure slides illustrating a non-convex objective.)

Gradient Descent
- Generally you have to shrink your step size as you continue to iterate.
- Try to stick to convex/concave objective functions.
- Do random restarts if you must use a non-convex/non-concave objective.

n-dimensional Gradient Descent
- Use partial derivatives to compute the gradient; the gradient is an n-dimensional vector pointing in the direction of greatest increase.
- Gradient descent is a first-order (i.e., linear) approximation; second-order Newton methods are also available.

Logistic Regression and Gradient Descent
- The objective function l(w) is concave! A gradient-based sketch follows below.
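Since l(w) is concave, plain gradient ascent on the conditional log-likelihood reaches the global maximum. Below is a minimal sketch in Python/NumPy, not the recitation's code: the names fit_logistic_gd, step, n_iters, and tol are illustrative, and a bias term is assumed to be handled by appending a column of ones to X.

import numpy as np

def sigmoid(z):
    # P(Y = 1 | x, w) for the logistic model is sigmoid(w . x).
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_gd(X, y, step=0.1, n_iters=5000, tol=1e-6):
    # Gradient ascent on the conditional log-likelihood l(w).
    # X: (n, d) feature matrix; y: (n,) labels in {0, 1}.
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iters):
        p = sigmoid(X @ w)            # current P(Y = 1 | x^j, w) for every example
        grad = X.T @ (y - p) / n      # gradient of l(w); l is concave, so we ascend
        w_new = w + step * grad
        if np.max(np.abs(w_new - w)) < tol:   # stop when the change is very small
            return w_new
        w = w_new
    return w

With a fixed step size the iterates can oscillate exactly as in the bad step-size example above; shrinking the step over time is the usual fix, and a MAP/Ridge-regularized version simply subtracts lambda * w from the gradient.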
MAP and Logistic Regression
- Use a MAP estimate to avoid overfitting (just like in Naive Bayes).
- What can happen to the weights without a regularization term?
- What does the regularization term help do?

Other Forms of Regularization
- A Lasso or Ridge penalty can be applied to the weights.
- Lasso drives many weights to exactly zero; Ridge shrinks all of the weights.

Bias/Variance Tradeoff
- Simpler models are more biased because they make more assumptions.
- More complex models have higher variance, since they depend on the particulars of the data provided.
- We have to trade the two off to get the best classifier possible.

Extra Slides

Correlated Features
- Worst-case scenario: duplicated features.
- What will Naive Bayes do with duplicated features? What will Logistic Regression do?
- What if the features are merely correlated rather than duplicated?

Estimating Some Features Jointly
- What if we are not willing to assume that all features are conditionally independent?
- How can we still do Naive Bayes, and what price do we pay for dropping the conditional-independence assumption?
- Graphical models are a formalization of this idea, e.g., Tree-Augmented Naive Bayes (TAN).
- They provide a more general framework for an arbitrary set of conditional-independence assumptions.

Neural Network Preview
- One sigmoid function is good (Logistic Regression), so more must be better.
- Chain them so that the outputs of one become the inputs of the next.
- Such systems "mimic" the brain (kind of), so they are termed Neural Networks.

Numerical Gradient Descent
- What if we can only evaluate the function but cannot evaluate its derivative?
- We can take tiny steps in each direction to estimate the gradient.
- Generally much more expensive because of all the extra function evaluations.
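The "tiny steps in each direction" idea from the last slide is just a finite-difference approximation of the gradient. Here is a minimal sketch (the helper name numerical_gradient and the eps default are illustrative assumptions); one call costs two function evaluations per dimension, which is the expense the slide warns about.

import numpy as np

def numerical_gradient(f, w, eps=1e-5):
    # Central-difference estimate of the gradient of f at w.
    # Each call evaluates f twice per dimension, so it is far more
    # expensive than an analytic gradient when the dimension is large.
    w = np.asarray(w, dtype=float)
    grad = np.zeros_like(w)
    for i in range(w.size):
        step = np.zeros_like(w)
        step[i] = eps
        grad[i] = (f(w + step) - f(w - step)) / (2 * eps)
    return grad

# Example: the gradient of f(w) = w0^2 + 3*w1 at (1, 2) is approximately (2, 3).
print(numerical_gradient(lambda w: w[0]**2 + 3*w[1], [1.0, 2.0]))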


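As mentioned in the Gaussian Naive Bayes bullets above, here is a small companion sketch of the per-class Gaussian fit and the Bayes-rule prediction. It is only an illustration under simple assumptions (MLE mean/variance per class and feature, a tiny variance floor for stability); the function names fit_gnb and predict_gnb are made up for this example.

import numpy as np

def fit_gnb(X, y):
    # X: (n, d) continuous features; y: (n,) integer class labels.
    # For each class, store its prior and per-feature Gaussian MLE parameters.
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (
            len(Xc) / len(X),        # prior P(Y = c)
            Xc.mean(axis=0),         # per-feature mean mu_{c,i}
            Xc.var(axis=0) + 1e-9,   # per-feature variance sigma^2_{c,i} (small floor for stability)
        )
    return params

def predict_gnb(params, x):
    # x: a single (d,) example. Conditional independence given the class means the
    # joint log-likelihood is a sum of per-feature Gaussian log-densities.
    def log_posterior(prior, mu, var):
        return np.log(prior) - 0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
    return max(params, key=lambda c: log_posterior(*params[c]))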