CS 188: Artificial Intelligence – Fall 2008
Lecture 17: Bayes Nets IV
10/28/2008
Dan Klein – UC Berkeley

Inference
- Inference: calculating some statistic from a joint probability distribution
- Examples:
  - Posterior probability: P(Q | E1 = e1, ..., Ek = ek)
  - Most likely explanation: argmax_q P(Q = q | E1 = e1, ..., Ek = ek)

Mini-Examples: VE
- The mini-alarm network:
  - B: there is a burglary
  - A: the alarm sounds
  - C: the neighbor calls
- CPTs (structure B -> A -> C):

  P(B)
  B    P
  b    0.1
  ¬b   0.9

  P(A | B)
  B    A    P
  b    a    0.8
  b    ¬a   0.2
  ¬b   a    0.1
  ¬b   ¬a   0.9

  P(C | A)
  A    C    P
  a    c    0.7
  a    ¬c   0.3
  ¬a   c    0.5
  ¬a   ¬c   0.5

Basic Operation: Join
- First basic operation: joining factors
- Combining two factors:
  - Just like a database join
  - Build a factor over the union of the variables involved
- Computation for each entry: pointwise products
- Example: joining P(B) and P(A | B) produces a factor over A and B

Basic Operation: Eliminate
- Second basic operation: marginalization
- Take a factor and sum out a variable
  - Shrinks a factor to a smaller one
  - A projection operation
- Example: summing B out of the factor over A and B gives a factor over A alone

Example: P(A)
- Start with P(B) and P(A | B); join on B, then sum out B:

  Join on B: factor over A, B
  A    B    P
  a    b    0.08
  a    ¬b   0.09
  ¬a   b    0.02
  ¬a   ¬b   0.81

  Sum out B: P(A)
  A    P
  a    0.17
  ¬a   0.83

Example: Multiple Joins
- Join on B: P(B) × P(A | B) gives a factor over B, A; P(C | A) is carried along unchanged:

  B    A    P
  b    a    0.08
  b    ¬a   0.02
  ¬b   a    0.09
  ¬b   ¬a   0.81

- Join on A: the factor over B, A × P(C | A) gives a factor over B, A, C:

  B    A    C    P
  b    a    c    0.056
  b    a    ¬c   0.024
  b    ¬a   c    0.010
  b    ¬a   ¬c   0.010
  ¬b   a    c    0.063
  ¬b   a    ¬c   0.027
  ¬b   ¬a   c    0.405
  ¬b   ¬a   ¬c   0.405

Query: P(C)
- Sum out B:

  A    C    P
  a    c    0.119
  a    ¬c   0.051
  ¬a   c    0.415
  ¬a   ¬c   0.415

- Sum out A:

  C    P
  c    0.534
  ¬c   0.466

Variable Elimination
- Why is inference by enumeration so slow?
  - You join up the whole joint distribution before you sum out the hidden variables
  - You do lots of computations irrelevant to the evidence
  - You end up repeating a lot of work
- Idea: interleave joining and marginalizing!
  - Marginalize variables as soon as possible
  - Called "variable elimination"
  - Still NP-hard, but usually much faster than inference by enumeration

P(C): Marginalizing Early
- Sum out B immediately after the first join, so the large factor over B, A, C is never built; P(C | A) is unchanged:

  Sum out B: P(A)
  A    P
  a    0.17
  ¬a   0.83

Marginalizing Early (continued)
- Join on A: P(A) × P(C | A):

  A    C    P
  a    c    0.119
  a    ¬c   0.051
  ¬a   c    0.415
  ¬a   ¬c   0.415

- Sum out A:

  C    P
  c    0.534
  ¬c   0.466

General Variable Elimination
- Query: P(Q | E1 = e1, ..., Ek = ek)
- Start with the initial factors: the local CPTs, instantiated by the evidence
- While there are still hidden variables (not Q and not evidence):
  - Pick a hidden variable H
  - Join all factors mentioning H
  - Sum out H
- Join all remaining factors and normalize

Example: P(B | a)
- Start / select: P(B) and P(a | B); join on B, then normalize:

  Join on B: factor over a, B
  A    B    P
  a    b    0.08
  a    ¬b   0.09

  Normalize: P(B | a)
  A    B    P
  a    b    8/17
  a    ¬b   9/17

Example: P(B | ¬c)
- Start: P(B), P(A | B), P(¬c | A)
- Join on A: P(A | B) × P(¬c | A):

  B    A    C    P
  b    a    ¬c   0.24
  b    ¬a   ¬c   0.10
  ¬b   a    ¬c   0.03
  ¬b   ¬a   ¬c   0.45

- Sum out A:

  B    C    P
  b    ¬c   0.34
  ¬b   ¬c   0.48

- Join on B with P(B):

  B    C    P
  b    ¬c   0.034
  ¬b   ¬c   0.432

- Normalize: P(B | ¬c)

  B    C    P
  b    ¬c   0.073
  ¬b   ¬c   0.927
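The whole pipeline above fits in a few dozen lines. Below is a minimal Python sketch (not part of the lecture) of the two basic operations and the elimination loop, run on the mini-alarm network for the P(B | ¬c) query; the dictionary-based factor representation and the names join, sum_out, and variable_elimination are illustrative choices, not course code.

    from itertools import product

    # A factor is a pair (variables, table): `variables` is a tuple of names and
    # `table` maps value tuples (aligned with `variables`) to numbers. This
    # representation and the function names are illustrative, not course code.

    def join(f, g):
        """Pointwise product over the union of the two factors' variables."""
        (fv, ft), (gv, gt) = f, g
        vars_ = tuple(dict.fromkeys(fv + gv))            # union, order preserved
        table = {}
        for vals in product([True, False], repeat=len(vars_)):
            row = dict(zip(vars_, vals))
            table[vals] = ft[tuple(row[v] for v in fv)] * gt[tuple(row[v] for v in gv)]
        return vars_, table

    def sum_out(var, f):
        """Marginalize (project) a variable out of a factor."""
        fv, ft = f
        keep = tuple(v for v in fv if v != var)
        table = {}
        for vals, p in ft.items():
            key = tuple(val for name, val in zip(fv, vals) if name != var)
            table[key] = table.get(key, 0.0) + p
        return keep, table

    def variable_elimination(factors, hidden):
        """Join and sum out each hidden variable, then join what's left and normalize."""
        for h in hidden:
            mentioning = [f for f in factors if h in f[0]]
            factors = [f for f in factors if h not in f[0]]
            joined = mentioning[0]
            for f in mentioning[1:]:
                joined = join(joined, f)
            factors.append(sum_out(h, joined))
        result = factors[0]
        for f in factors[1:]:
            result = join(result, f)
        z = sum(result[1].values())
        return result[0], {k: v / z for k, v in result[1].items()}

    # Mini-alarm network for the query P(B | ¬c): the evidence ¬c is already
    # instantiated into the P(C | A) CPT, so A is the only hidden variable.
    P_B = (("B",), {(True,): 0.1, (False,): 0.9})
    P_A_given_B = (("B", "A"), {(True, True): 0.8, (True, False): 0.2,
                                (False, True): 0.1, (False, False): 0.9})
    P_notc_given_A = (("A",), {(True,): 0.3, (False,): 0.5})    # P(¬c | A)

    print(variable_elimination([P_B, P_A_given_B, P_notc_given_A], hidden=["A"]))
    # matches the worked example above: P(b | ¬c) ≈ 0.073, P(¬b | ¬c) ≈ 0.927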
Variable Elimination: What You Need to Know
- You should be able to run it on small examples and understand the factor creation / reduction flow
- Better than enumeration: VE caches intermediate computations
  - Saves time by marginalizing variables as soon as possible rather than at the end
- Polynomial time for tree-structured graphs – sound familiar?
- We will see special cases of VE later
  - You'll have to implement the special cases

Approximations
- Exact inference is slow, especially with a lot of hidden nodes
- Approximate methods give you a (close, wrong?) answer, faster

Sampling
- Basic idea:
  - Draw N samples from a sampling distribution S
  - Compute an approximate posterior probability
  - Show this converges to the true probability P
- Outline:
  - Sampling from an empty network
  - Rejection sampling: reject samples disagreeing with evidence
  - Likelihood weighting: use evidence to weight samples
- (Code sketches for these samplers appear at the end of these notes.)

Prior Sampling
- (Diagram: the Cloudy / Sprinkler / Rain / WetGrass network, sampling each variable in topological order given its parents.)
- This process generates samples with probability
    S_PS(x1, ..., xn) = ∏_i P(xi | Parents(Xi)) = P(x1, ..., xn),
  i.e. the BN's joint probability
- Let the number of samples of an event be N_PS(x1, ..., xn). Then
    lim_{N→∞} N_PS(x1, ..., xn) / N = S_PS(x1, ..., xn) = P(x1, ..., xn)
- I.e., the sampling procedure is consistent

Example
- We'll get a bunch of samples from the BN:
    c, ¬s, r, w
    c, s, r, w
    ¬c, s, r, ¬w
    c, ¬s, r, w
    ¬c, s, ¬r, w
- If we want to know P(W):
  - We have counts <w: 4, ¬w: 1>
  - Normalize to get P(W) = <w: 0.8, ¬w: 0.2>
  - This will get closer to the true distribution with more samples
  - Can estimate anything else, too
  - What about P(C | ¬r)? P(C | ¬r, ¬w)?

Rejection Sampling
- Let's say we want P(C):
  - No point keeping all samples around
  - Just tally counts of C outcomes
- Let's say we want P(C | s):
  - Same thing: tally C outcomes, but ignore (reject) samples which don't have S = s
  - This is rejection sampling
  - It is also consistent for conditional probabilities (i.e., correct in the limit)

Likelihood Weighting
- Problem with rejection sampling:
  - If the evidence is unlikely, you reject a lot of samples
  - You don't exploit your evidence as you sample
  - Consider P(B | a) in a Burglary → Alarm network
- Idea: fix the evidence variables and sample the rest
  - Problem: the sample distribution is not consistent!
  - Solution: weight each sample by the probability of the evidence given its parents

Likelihood Sampling
- (Diagram: the same Cloudy / Sprinkler / Rain / WetGrass network, with the evidence variables fixed and only the remaining variables sampled.)

Likelihood Weighting (continued)
- Sampling distribution if z is sampled and e is the fixed evidence:
    S_WS(z, e) = ∏_i P(zi | Parents(Zi))
- Now, samples have weights:
    w(z, e) = ∏_i P(ei | Parents(Ei))
- Together, the weighted sampling distribution is consistent:
    S_WS(z, e) · w(z, e) = ∏_i P(zi | Parents(Zi)) · ∏_i P(ei | Parents(Ei)) = P(z, e)
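A minimal Python sketch of prior sampling and rejection sampling on the Cloudy / Sprinkler / Rain / WetGrass network follows. The CPT numbers are placeholders (the slides' tables are not included in this preview), and the helper names prior_sample and rejection_sample are assumptions for illustration.

    import random

    # Cloudy / Sprinkler / Rain / WetGrass network. The CPT numbers below are
    # placeholders for illustration; the lecture's diagram (not in this preview)
    # defines the actual tables. Variables are sampled in topological order.

    def bernoulli(p):
        return random.random() < p

    def prior_sample():
        """One sample from the full joint: sample each node given its sampled parents."""
        c = bernoulli(0.5)                               # P(c)        -- placeholder
        s = bernoulli(0.1 if c else 0.5)                 # P(s | C)    -- placeholder
        r = bernoulli(0.8 if c else 0.2)                 # P(r | C)    -- placeholder
        w = bernoulli({(True, True): 0.99, (True, False): 0.90,
                       (False, True): 0.90, (False, False): 0.01}[(s, r)])   # P(w | S, R)
        return {"C": c, "S": s, "R": r, "W": w}

    def rejection_sample(query_var, evidence, n):
        """Estimate P(query | evidence) by tallying only samples that agree with the evidence."""
        counts = {True: 0, False: 0}
        for _ in range(n):
            sample = prior_sample()
            if all(sample[var] == val for var, val in evidence.items()):
                counts[sample[query_var]] += 1           # tally the query outcome
            # samples disagreeing with the evidence are simply thrown away
        total = counts[True] + counts[False]
        return {value: count / total for value, count in counts.items()} if total else None

    # e.g. estimate P(C | s); if the evidence is unlikely, most samples are rejected,
    # which is exactly the weakness likelihood weighting addresses (next sketch).
    print(rejection_sample("C", {"S": True}, n=10000))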

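A matching sketch of likelihood weighting on the same network, under the same placeholder CPTs: evidence variables are clamped rather than sampled, and each sample carries a weight equal to the probability of its evidence values given their sampled parents. The name likelihood_weighting and the CPT dictionary are again illustrative assumptions.

    import random

    # Same Cloudy / Sprinkler / Rain / WetGrass structure; each entry gives
    # P(node = True | parent values). The numbers are placeholders, as above.
    CPT = {
        "C": lambda v: 0.5,
        "S": lambda v: 0.1 if v["C"] else 0.5,
        "R": lambda v: 0.8 if v["C"] else 0.2,
        "W": lambda v: {(True, True): 0.99, (True, False): 0.90,
                        (False, True): 0.90, (False, False): 0.01}[(v["S"], v["R"])],
    }
    ORDER = ["C", "S", "R", "W"]                 # topological order

    def likelihood_weighting(query_var, evidence, n):
        """Fix evidence variables, sample the rest, and weight by P(evidence | parents)."""
        totals = {True: 0.0, False: 0.0}
        for _ in range(n):
            values, weight = {}, 1.0
            for var in ORDER:
                p_true = CPT[var](values)
                if var in evidence:
                    values[var] = evidence[var]
                    # weight by the probability of the observed value given its parents
                    weight *= p_true if evidence[var] else (1.0 - p_true)
                else:
                    values[var] = random.random() < p_true
            totals[values[query_var]] += weight  # weighted tally instead of a plain count
        z = totals[True] + totals[False]
        return {value: w / z for value, w in totals.items()} if z else None

    # e.g. estimate P(C | s, w) without rejecting any samples
    print(likelihood_weighting("C", {"S": True, "W": True}, n=10000))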
