Artificial Intelligence: Representation and Problem Solving
15-381, May 8, 2007
Review Session for Final
Michael S. Lewicki, Carnegie Mellon

Essential things to know (from the 2nd half)
• Probability and Uncertainty
• Bayes nets
• Decision Trees
• Probabilistic Learning
• Cross Validation
• Clustering and k-Nearest Neighbors
• Neural Nets
• MDPs
• HMMs
• Reinforcement Learning

Probability and Uncertainty
• sum rule (marginalization): $P(X = x_i) = \sum_{j=1}^{N} p(X = x_i, Y = y_j)$; in shorthand notation, $p(x) = \sum_y p(x, y)$, or $p(x) = \int dy\, p(x, y)$ for continuous variables
• product rule: $p(x, y) = p(x|y)\,p(y)$
• Bayes rule: $p(y|x) = \dfrac{p(x|y)\,p(y)}{p(x)} = \dfrac{p(x|y)\,p(y)}{\sum_y p(x|y)\,p(y)}$
• independence: $p(x, y) = p(x)\,p(y)$ iff $x$ and $y$ are independent; for mutually independent variables, $p(x_1, x_2, \ldots, x_N) = \prod_{n=1}^{N} p(x_n)$

Bayes Nets
• A Bayes net is a directed graph of variables with the property that a variable is conditionally independent of its non-descendants given its parents: $P(X \mid \text{Parents}(X), Y, Z, W) = P(X \mid \text{Parents}(X))$.
• Associated with each variable is the conditional probability table (CPT) of $P(X \mid \text{Parents}(X))$.
• Any entry of the joint probability distribution can be computed from the Bayes net by looking only at the CPTs:
$$P(x_1, \ldots, x_n) \equiv P(X_1 = x_1 \wedge \ldots \wedge X_n = x_n) = \prod_{i=1}^{n} P(x_i \mid \text{parents}(X_i))$$
• Inference: given observed values for a subset of variables (the "evidence" $E_2$), compute the probability distribution of another set of variables $E_1$ (the "query"), denoted $P(E_1 \mid E_2)$.
• In principle, any inference can be computed from a Bayes net by summing up the appropriate entries of the joint distribution.
• Inference in general requires time exponential in the number of variables, because of the number of terms in the sums.
• The sums can be simplified by cleverly grouping the terms that depend on only one variable (or a small number of variables).
• This can be implemented in polynomial time if the graph is a "polytree" (i.e., the undirected version of the graph is a tree).
• Expected:
-Understand how probabilities can be expressed as sums of products
-Know how to group terms advantageously, and do so on simple examples

Statistical dependences in belief nets
[Figure: a belief network over nodes a, b, c, d, e, f, used in the examples below.]
• Is an independence statement, e.g. $a \perp b \mid c$, implied by a graph?
• If two nodes are conditionally independent given another variable, e.g. $a \perp b \mid c$, then $p(a|b,c) = p(a|c)$.
• Note: there is a technique called "d-separation" for determining whether two nodes (or sets of nodes) are independent. It was covered in previous years and appears in previous exams, but the book does not cover it and we did not cover it in lecture, so you are not responsible for this material.
• What you should understand: how information can propagate in belief networks to affect the beliefs about certain nodes.
• Example: Is a independent of b given c? No. The inferred value of f depends on b, e is also a factor in determining d, and d in turn influences a. So the value of b can influence the value of a, and they are not independent.
• Example: Is a independent of b given e? Yes. Fixing e completely determines the probability of b via $p(b|e)$.
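
The "sums of products" view of inference above is easy to make concrete. Below is a minimal inference-by-enumeration sketch in Python; the network topology and CPT values are the standard cloudy/sprinkler/rain toy example, not a network from the lecture, and all names are my own:

```python
from itertools import product

# Toy network (hypothetical): Cloudy -> Sprinkler, Cloudy -> Rain,
# {Sprinkler, Rain} -> WetGrass. Each CPT maps parent values -> P(var = True).
cpts = {
    "Cloudy":    {(): 0.5},
    "Sprinkler": {(False,): 0.5, (True,): 0.1},            # parent: Cloudy
    "Rain":      {(False,): 0.2, (True,): 0.8},            # parent: Cloudy
    "WetGrass":  {(False, False): 0.0, (False, True): 0.9,
                  (True, False): 0.9, (True, True): 0.99}, # parents: Sprinkler, Rain
}
parents = {"Cloudy": (), "Sprinkler": ("Cloudy",), "Rain": ("Cloudy",),
           "WetGrass": ("Sprinkler", "Rain")}
order = ["Cloudy", "Sprinkler", "Rain", "WetGrass"]  # topological order

def joint(assign):
    """P(x1,...,xn) = prod_i P(xi | parents(Xi)), read directly off the CPTs."""
    p = 1.0
    for var in order:
        p_true = cpts[var][tuple(assign[pa] for pa in parents[var])]
        p *= p_true if assign[var] else 1.0 - p_true
    return p

def query(var, evidence):
    """P(var | evidence), by summing joint entries over all hidden variables."""
    dist = {True: 0.0, False: 0.0}
    hidden = [v for v in order if v != var and v not in evidence]
    for val in (True, False):
        for combo in product([True, False], repeat=len(hidden)):
            assign = {var: val, **evidence, **dict(zip(hidden, combo))}
            dist[val] += joint(assign)
    z = dist[True] + dist[False]  # normalize (denominator of Bayes rule)
    return {v: p / z for v, p in dist.items()}

print(query("Rain", {"WetGrass": True}))  # e.g. P(Rain | WetGrass = true)
```

Variable elimination improves on this brute-force sum by grouping terms, pushing each sum inward past the factors that do not mention its variable; on a polytree this gives the polynomial-time behavior noted above.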

Types of learning
[Figure: three diagrams of a model with parameters $\{\theta_1, \ldots, \theta_n\}$ fit to the world (or data). Supervised: the model is trained against desired outputs $\{y_1, \ldots, y_n\}$. Unsupervised: the model receives only the data. Reinforcement: the model's output receives a reinforcement signal.]

Polynomial curve fitting
$$y(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \cdots + w_M x^M = \sum_{j=0}^{M} w_j x^j$$
$$E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \left[ y(x_n, \mathbf{w}) - t_n \right]^2$$
[Figure: fits of polynomials of increasing order to the same data, illustrating "overfitting"; example from Bishop (2006), Pattern Recognition and Machine Learning.]

General principle
• As its complexity increases, the model is able to better classify the training data.
• Performance on the test data initially increases, but then falls as the model overfits, i.e., becomes specialized for classifying the noise in the training data.
• The complexity of a decision tree is its number of free parameters, i.e., the number of nodes.
[Figure: % correct classification vs. complexity of model (e.g., size of tree). Performance on the training data keeps rising with complexity; performance on the test data rises, then falls in the region of overfitting the training data.]

Decision trees: classifying from a set of attributes
Predicting credit risk:

  <2 years at current job?   missed payments?   defaulted?
  N                          N                  N
  Y                          N                  Y
  N                          N                  N
  N                          N                  N
  N                          Y                  Y
  Y                          N                  N
  N                          Y                  N
  N                          Y                  Y
  Y                          N                  N
  Y                          N                  N

[Decision tree: the root (bad: 3, good: 7) splits on "missed payments?"; the Y branch is (bad: 2, good: 1), and the N branch (bad: 1, good: 6) splits further on "<2 years at current job?" into Y (bad: 1, good: 3) and N (bad: 0, good: 3).]
• Each level of the tree splits the data according to a different attribute.
• Goal: achieve perfect classification with a minimal number of decisions.
-Not always possible, due to noise or inconsistencies in the data.

Entropy
• How many bits does it take to specify the attribute 'defaulted'?
-P(defaulted = Y) = 3/10
-P(defaulted = N) = 7/10
$$H(Y) = -\sum_{i = Y, N} P(Y = y_i) \log_2 P(Y = y_i) = -0.3 \log_2 0.3 - 0.7 \log_2 0.7 = 0.8813$$
• How much can we reduce the entropy (or uncertainty) of 'defaulted' by knowing the other attributes?
• Ideally, we could reduce it to zero, in which case we would classify perfectly.

Conditional entropy
• $H(Y|X)$ is the remaining entropy of Y given X, i.e., the expected (or average) entropy of $P(y|x)$.
• $H(Y|X = x)$ is the specific conditional entropy, i.e., the entropy of Y knowing the value of a specific attribute x.
$$H(Y|X) \equiv -\sum_x P(x) \sum_y P(y|x) \log_2 P(y|x) = \sum_x P(x)\, H(Y|X = x)$$

Back to the credit risk example
• Applying the definition to the 'Predicting credit risk' table above:
$$H(Y|X) \equiv -\sum_x P(x) \sum_y P(y|x) \log_2 P(y|x) = \ldots$$
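
The text breaks off in the middle of this calculation. As a sanity check against the table, here is a minimal sketch of the entropy and conditional-entropy computation in Python; the data encoding and helper names are my own, not from the slides:

```python
from math import log2

# Credit-risk table from the slides:
# (<2 years at current job, missed payments, defaulted)
data = [
    ("N", "N", "N"), ("Y", "N", "Y"), ("N", "N", "N"), ("N", "N", "N"),
    ("N", "Y", "Y"), ("Y", "N", "N"), ("N", "Y", "N"), ("N", "Y", "Y"),
    ("Y", "N", "N"), ("Y", "N", "N"),
]

def entropy(labels):
    """H(Y) = -sum_i P(y_i) log2 P(y_i)."""
    n = len(labels)
    return -sum((labels.count(v) / n) * log2(labels.count(v) / n)
                for v in set(labels))

def conditional_entropy(rows, attr, label_col=2):
    """H(Y|X) = sum_x P(X = x) H(Y | X = x)."""
    n = len(rows)
    h = 0.0
    for x in set(r[attr] for r in rows):
        subset = [r[label_col] for r in rows if r[attr] == x]
        h += (len(subset) / n) * entropy(subset)
    return h

y = [r[2] for r in data]
print(f"H(defaulted) = {entropy(y):.4f}")  # 0.8813, as on the slide
for attr, name in [(0, "<2 years"), (1, "missed payments")]:
    h = conditional_entropy(data, attr)
    print(f"H(defaulted | {name}) = {h:.4f}, gain = {entropy(y) - h:.4f}")
```

The first line prints H(defaulted) = 0.8813, matching the slide; choosing the attribute with the larger information gain H(Y) - H(Y|X) as the first split ("missed payments?") reproduces the tree sketched above.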

