Artificial Intelligence: Representation and Problem Solving
15-381, April 17, 2007

Probabilistic Learning
Michael S. Lewicki, Carnegie Mellon

Reminder

• No class on Thursday - spring carnival.

Recall the basic algorithm for learning decision trees

1. Start with the whole training data.
2. Select the attribute or value along a dimension that gives the "best" split, using information gain or another criterion.
3. Create child nodes based on the split.
4. Recurse on each child using the child's data, until a stopping criterion is reached:
   • all examples have the same class
   • the amount of data is too small
   • the tree is too large

• Does this capture probabilistic relationships?

Quantifying the certainty of decisions

Predicting credit risk:

  < 2 years at current job?   missed payments?   defaulted?
  N                           N                  N
  Y                           N                  Y
  N                           N                  N
  N                           N                  N
  N                           Y                  Y
  Y                           N                  N
  N                           Y                  N
  N                           Y                  Y
  Y                           N                  N
  Y                           N                  N
  ...                         ...                ...

• Suppose that instead of a yes or no answer, we want some estimate of how strongly we believe a loan applicant is a credit risk.
• This might be useful if we want some flexibility in adjusting our decision criteria.
  - E.g., suppose we're willing to take more risk if times are good.
  - Or, we want to examine the cases we believe are higher risks more carefully.

The mushroom data

• Or suppose we wanted to know how likely it is that a mushroom is safe to eat?
• Do decision trees give us that information?

   #   EDIBLE?     CAP-SHAPE   CAP-SURFACE   ...
   1   edible      flat        fibrous       ...
   2   poisonous   convex      smooth        ...
   3   edible      flat        fibrous       ...
   4   edible      convex      scaly         ...
   5   poisonous   convex      smooth        ...
   6   edible      convex      fibrous       ...
   7   poisonous   flat        scaly         ...
   8   poisonous   flat        scaly         ...
   9   poisonous   convex      fibrous       ...
  10   poisonous   convex      fibrous       ...
  11   poisonous   flat        smooth        ...
  12   edible      convex      smooth        ...
  13   poisonous   knobbed     scaly         ...
  14   poisonous   flat        smooth        ...
  15   poisonous   flat        fibrous       ...
  ...

Fisher's iris data

[Figure: scatter plot of petal width (cm) vs. petal length (cm) for the classes Iris setosa, Iris versicolor, and Iris virginica.]

In which example would you be more confident about the class?
Decision trees provide a classification but not uncertainty.

The general classification problem

Given data, we want to learn a model that can correctly classify novel observations.

• input: data D = \{x_1, \ldots, x_T\}, a set of T observations, each an N-dimensional vector x_i = (x_1, \ldots, x_N)_i with binary, discrete, or continuous components
• desired output: a binary classification vector y = (y_1, \ldots, y_K), where

  y_i = \begin{cases} 1 & \text{if } x \in C_i \equiv \text{class } i \\ 0 & \text{otherwise} \end{cases}

• model: \theta = \{\theta_1, \ldots, \theta_M\}; the model (e.g. a decision tree) is defined by M parameters

How do we approach this probabilistically?

The answer to all questions of uncertainty

• Let's apply Bayes' rule to infer the most probable class given the observation:

  p(C_k \mid x) = \frac{p(x \mid C_k)\, p(C_k)}{p(x)} = \frac{p(x \mid C_k)\, p(C_k)}{\sum_{k'} p(x \mid C_{k'})\, p(C_{k'})}

• This is the answer, but what does it mean? How do we specify the terms?
  - p(C_k) is the prior probability of the different classes.
  - p(x \mid C_k) is the data likelihood, i.e. the probability of x given class C_k.
• How should we define the likelihood?
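As a concrete illustration of the computation above, here is a minimal Python sketch (not from the lecture): a posterior() helper that applies Bayes' rule to a vector of likelihood values and priors. The two-class numbers are made up purely to show the mechanics.

```python
import numpy as np

def posterior(likelihoods, priors):
    """Bayes' rule: p(C_k|x) = p(x|C_k) p(C_k) / sum_k' p(x|C_k') p(C_k').

    likelihoods: array of p(x|C_k) evaluated at one observation x
    priors:      array of prior probabilities p(C_k)
    """
    joint = likelihoods * priors   # p(x|C_k) p(C_k) for each class
    return joint / joint.sum()     # normalize by p(x), the sum over classes

# Hypothetical two-class example: p(x|C1)=0.6, p(x|C2)=0.1, equal priors.
post = posterior(np.array([0.6, 0.1]), np.array([0.5, 0.5]))
print(post)            # -> [0.857..., 0.142...]
print(post.argmax())   # most probable class: index 0, i.e. C1
```

Note that the unnormalized products already determine the most probable class; the division by p(x) only matters when we need calibrated probabilities.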
What classifier would give "optimal" performance?

• Consider the iris data again. How would we minimize the number of future misclassifications?
• We would need to know the true distribution of the classes; assume they follow a Gaussian distribution.
• The number of samples in each class is the same (50), so assume p(C_k) is equal for all classes.
• Because p(x) is the same for all classes, we have

  p(C_k \mid x) = \frac{p(x \mid C_k)\, p(C_k)}{p(x)} \propto p(x \mid C_k)\, p(C_k)

[Figure: the two Gaussian class-conditional densities p(petal length | C_2) and p(petal length | C_3) plotted against petal length.]

Where do we put the boundary?

• A decision boundary divides the axis into two decision regions: in R_{32}, class C_3 is misclassified as C_2; in R_{23}, class C_2 is misclassified as C_3.
• Shifting the boundary trades off the two errors.
• The misclassification error is defined by

  p(\text{error}) = \int_{R_{32}} p(C_3 \mid x)\, dx + \int_{R_{23}} p(C_2 \mid x)\, dx,

  which in our case is proportional to the data likelihood.
• If the boundary is placed badly, there is a region where p(C_3 \mid x) > p(C_2 \mid x) but we are still classifying that region as C_2!

[Figure: the same densities with a candidate boundary, marking the error regions R_{32} and R_{23}.]

The optimal decision boundary

• The misclassification error is minimal at the point where

  p(C_3 \mid x) = p(C_2 \mid x)
  \;\Rightarrow\; p(x \mid C_3)\, p(C_3) / p(x) = p(x \mid C_2)\, p(C_2) / p(x)
  \;\Rightarrow\; p(x \mid C_3) = p(x \mid C_2),

  where the last step uses the equal class priors. Note: this assumes we have only two classes.

[Figure: the posteriors p(C_2 | petal length) and p(C_3 | petal length), with the optimal decision boundary at their crossing point.]

Bayesian classification for more complex models

• Recall the class-conditional probability p(C_k \mid x).
• How do we define the data likelihood p(x \mid C_k), i.e. the probability of x given class C_k?
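To make the boundary argument concrete, here is a small Python sketch, assuming (as above) one-dimensional Gaussian class-conditional densities and equal priors. The means and standard deviations are illustrative stand-ins, not values fitted to the iris data; the sketch uses SciPy's norm for the densities and brentq to find the crossing point.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

# Hypothetical 1-D class-conditional densities p(x|C2), p(x|C3)
# (illustrative parameters, not fitted to the iris data).
p2 = norm(loc=4.3, scale=0.5)   # "C2" (versicolor-like petal length)
p3 = norm(loc=5.6, scale=0.6)   # "C3" (virginica-like petal length)

# With equal priors, the optimal boundary is where the likelihoods cross:
# p(x|C3) = p(x|C2). Solve f(x) = p3.pdf(x) - p2.pdf(x) = 0 between the means.
boundary = brentq(lambda x: p3.pdf(x) - p2.pdf(x), 4.3, 5.6)

# Misclassification probability with equal priors p(C2) = p(C3) = 1/2:
# p(error) = 1/2 * [ P(x > boundary | C2) + P(x < boundary | C3) ]
p_error = 0.5 * (p2.sf(boundary) + p3.cdf(boundary))

print(f"optimal boundary: x = {boundary:.3f}")
print(f"p(error) at the optimum: {p_error:.3f}")

# Shifting the boundary trades off the two error regions; the total
# error can only increase away from the crossing point.
for b in (boundary - 0.3, boundary, boundary + 0.3):
    err = 0.5 * (p2.sf(b) + p3.cdf(b))
    print(f"boundary {b:.2f} -> p(error) {err:.3f}")
```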

