CS 188: Artificial Intelligence
Fall 2008
Lecture 16: Bayes Nets III
10/23/2008
Dan Klein – UC Berkeley

Announcements
- Midterms graded, up on glookup, back Tuesday
- W4 also graded, back in sections / box
- Past homeworks in return box in 2nd floor lab

Causality?
- When Bayes' nets reflect the true causal patterns:
  - Often simpler (nodes have fewer parents)
  - Often easier to think about
  - Often easier to elicit from experts
- BNs need not actually be causal
  - Sometimes no causal net exists over the domain
  - E.g. consider the variables Traffic and Drips
  - End up with arrows that reflect correlation, not causation
- What do the arrows really mean?
  - Topology may happen to encode causal structure
  - Topology only guaranteed to encode conditional independencies

Example: Traffic
- Basic traffic net: R → T
- Let's multiply out the joint

    P(R):       r 1/4     ¬r 3/4
    P(T | r):   t 3/4     ¬t 1/4
    P(T | ¬r):  t 1/2     ¬t 1/2

    Joint:  r,t 3/16   r,¬t 1/16   ¬r,t 6/16   ¬r,¬t 6/16

Example: Reverse Traffic
- Reverse causality? T → R

    P(T):       t 9/16    ¬t 7/16
    P(R | t):   r 1/3     ¬r 2/3
    P(R | ¬t):  r 1/7     ¬r 6/7

    Joint:  r,t 3/16   r,¬t 1/16   ¬r,t 6/16   ¬r,¬t 6/16  (same as before)

Topology Limits Distributions
[Figure: the set of all distributions over X and Y, and the subsets expressible by each topology over X and Y]

Non-Guaranteed Independence
- Adding an arc doesn't guarantee dependence, it just makes it possible
- Example: two coin flips X1, X2 with an arc X1 → X2

    P(X1):      h 0.5    t 0.5
    P(X2 | h):  h 0.5    t 0.5
    P(X2 | t):  h 0.5    t 0.5

  The arc is present, yet X2 is still independent of X1.

Alternate BNs
[Figure: alternate network structures]

Summary
- Bayes nets compactly encode joint distributions
- Guaranteed independencies of distributions can be deduced from BN graph structure
- A Bayes' net may have other independencies that are not detectable until you inspect its specific distribution
- The Bayes' ball algorithm (a.k.a. d-separation) tells us when an observation of one variable can change belief about another variable

Inference
- Inference: calculating some statistic from a joint probability distribution
- Examples:
  - Posterior probability
  - Most likely explanation

Reminder: Alarm Network
[Figure: the alarm network]

Inference by Enumeration
- Given unlimited time, inference in BNs is easy
- Recipe:
  - State the marginal probabilities you need
  - Figure out ALL the atomic probabilities you need
  - Calculate and combine them
- Example:

Example
- Where did we use the BN structure? We didn't!

Example
- In this simple method, we only need the BN to synthesize the joint entries

Normalization Trick
- Compute the unnormalized posterior entries, then normalize so they sum to 1

Inference by Enumeration?
[Figure: the full enumeration computation]

Variable Elimination
- Why is inference by enumeration so slow?
  - You join up the whole joint distribution before you sum out the hidden variables
  - You end up repeating a lot of work!
- Idea: interleave joining and marginalizing!
  - Called "Variable Elimination"
  - Still NP-hard, but usually much faster than inference by enumeration
- We'll need some new notation to define VE

Factor Zoo I
- Joint distribution: P(X,Y)
  - Entries P(x,y) for all x, y
  - Sums to 1

    T     W     P
    hot   sun   0.4
    hot   rain  0.1
    cold  sun   0.2
    cold  rain  0.3

- Selected joint: P(x,Y)
  - A slice of the joint distribution
  - Entries P(x,y) for fixed x, all y
  - Sums to P(x)

    T     W     P
    cold  sun   0.2
    cold  rain  0.3

Factor Zoo II
- Family of conditionals: P(X | Y)
  - Multiple conditionals
  - Entries P(x | y) for all x, y
  - Sums to |Y|

    T     W     P
    hot   sun   0.8
    hot   rain  0.2
    cold  sun   0.4
    cold  rain  0.6

- Single conditional: P(Y | x)
  - Entries P(y | x) for fixed x, all y
  - Sums to 1

    T     W     P
    cold  sun   0.4
    cold  rain  0.6

Factor Zoo III
- Specified family: P(y | X)
  - Entries P(y | x) for fixed y, all x
  - Sums to … who knows!

    T     W     P
    hot   rain  0.2
    cold  rain  0.6

- In general, when we write P(Y1 … YN | X1 … XM)
  - It is a "factor," a multi-dimensional array
  - Its values are all P(y1 … yN | x1 … xM)
  - Any unassigned X or Y is a dimension missing (selected) from the array

Basic Objects
- Track objects called factors
- Initial factors are local CPTs
  - One per node in the BN
- Any known values are specified
  - E.g. if we know J = j and E = ¬e, the initial factors are the CPTs with those values instantiated
- VE: Alternately join and marginalize factors
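The inference-by-enumeration recipe and normalization trick above can be sketched on the lecture's two-node traffic net (R → T). This is an illustrative sketch, not course code; the function names are my own.

```python
# Inference by enumeration on the lecture's traffic net (R -> T):
# state the query, enumerate every atomic joint probability needed,
# then combine and normalize.

P_R = {True: 0.25, False: 0.75}                  # P(R): r 1/4, ¬r 3/4
P_T_given_R = {True: {True: 0.75, False: 0.25},  # P(T | r)
               False: {True: 0.5, False: 0.5}}   # P(T | ¬r)

def joint(r, t):
    """Atomic joint entry P(r, t), synthesized from the BN's CPTs."""
    return P_R[r] * P_T_given_R[r][t]

def posterior_R_given_t(t=True):
    """P(R | T=t): enumerate the needed joint entries, then normalize."""
    unnorm = {r: joint(r, t) for r in (True, False)}
    z = sum(unnorm.values())          # the normalization trick
    return {r: p / z for r, p in unnorm.items()}

post = posterior_R_given_t(True)
print(post)  # P(r | t) = (3/16) / (9/16) = 1/3
```

Note that the BN structure is used only to synthesize the joint entries, exactly as the slides point out.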
Basic Operation: Join
- First basic operation: join factors
- Combining two factors:
  - Just like a database join
  - Build a factor over the union of the variables involved
- Example:
- Computation for each entry: pointwise products

Basic Operation: Join
- In general, we join on a variable
  - Take all factors mentioning that variable
  - Join them all together
- Example: join on A — pick up all factors mentioning A, and join them to form a single factor

Basic Operation: Eliminate
- Second basic operation: marginalization
  - Take a factor and sum out a variable
  - Shrinks a factor to a smaller one
  - A projection operation
- Example:
- Definition: the new factor's entry for each remaining assignment is the sum of the old factor's entries over all values of the eliminated variable

General Variable Elimination
- Query: P(Q | E1 = e1, …, Ek = ek)
- Start with initial factors:
  - Local CPTs (but instantiated by evidence)
- While there are still hidden variables (not Q or evidence):
  - Pick a hidden variable H
  - Join all factors mentioning H
  - Project out H
- Join all remaining factors and normalize

Example
- Choose A

Example
- Choose E
- Finish with B
- Normalize

Variable Elimination
- What you need to know:
  - Should be able to run it on small examples, understand the factor creation / reduction flow
  - Better than enumeration: VE caches intermediate computations
    - Saves time by marginalizing variables as soon as possible rather than at the end
  - Polynomial time for tree-structured graphs – sound familiar?
- We will see special cases of VE later
  - You'll have to implement the special cases

Approximations
- Exact inference is slow, especially with a lot of hidden nodes
- Approximate methods give you a (close, wrong?) answer, faster
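The two VE primitives above, join and eliminate, can be sketched on the traffic net's factors. Factors here are dicts mapping assignment tuples to probabilities; this representation is illustrative, not from the lecture.

```python
# Minimal sketch of the two variable-elimination primitives:
# "join" builds a factor over the union of the variables (pointwise products),
# "eliminate" sums a variable out (the projection operation).

def join(f1, vars1, f2, vars2):
    """Join two factors into one over the union of their variables."""
    out_vars = vars1 + [v for v in vars2 if v not in vars1]
    out = {}
    for a1, p1 in f1.items():
        for a2, p2 in f2.items():
            asg = dict(zip(vars1, a1))
            asg2 = dict(zip(vars2, a2))
            # keep only rows that agree on shared variables
            if all(asg.get(v, asg2[v]) == asg2[v] for v in vars2):
                asg.update(asg2)
                out[tuple(asg[v] for v in out_vars)] = p1 * p2
    return out, out_vars

def eliminate(f, fvars, var):
    """Sum out `var`, shrinking the factor to the remaining variables."""
    keep = [v for v in fvars if v != var]
    out = {}
    for asg, p in f.items():
        key = tuple(a for v, a in zip(fvars, asg) if v != var)
        out[key] = out.get(key, 0.0) + p
    return out, keep

# P(R) and P(T | R) from the slides
P_R = {('r',): 0.25, ('-r',): 0.75}
P_T_given_R = {('r', 't'): 0.75, ('r', '-t'): 0.25,
               ('-r', 't'): 0.5, ('-r', '-t'): 0.5}

joint, jvars = join(P_R, ['R'], P_T_given_R, ['R', 'T'])  # P(R, T)
P_T, _ = eliminate(joint, jvars, 'R')                     # sum out R
print(P_T)  # {('t',): 9/16, ('-t',): 7/16}
```

Joining P(R) with P(T | R) and then eliminating R reproduces the marginal P(T) = (9/16, 7/16) computed on the reverse-traffic slide.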
Sampling
- Basic idea:
  - Draw N samples from a sampling distribution S
  - Compute an approximate posterior probability
  - Show this converges to the true probability P
- Outline:
  - Sampling from an empty network
  - Rejection sampling: reject samples disagreeing with evidence
  - Likelihood weighting: use evidence to weight samples

Prior Sampling
[Figure: the Cloudy → Sprinkler, Rain → WetGrass network, sampled one variable at a time]

Prior Sampling
- This process generates samples with probability ∏i P(xi | Parents(Xi)), i.e. the BN's joint probability
- Let the number of samples of an event be N(x1, …, xn); then the fraction N(x1, …, xn)/N converges to P(x1, …, xn)
- I.e., the sampling procedure is consistent

Example
- We'll get a …
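The prior-sampling process can be sketched as follows: sample each variable in topological order from its CPT, conditioned on the already-sampled parents. The CPT numbers below are the standard textbook (AIMA) values for this network, filled in here as an assumption for illustration.

```python
import random

# Prior sampling on the Cloudy / Sprinkler / Rain / WetGrass network.
# CPT values are the standard textbook (AIMA) numbers, assumed here.

def bernoulli(p):
    return random.random() < p

def prior_sample():
    c = bernoulli(0.5)                # P(+c) = 0.5
    s = bernoulli(0.1 if c else 0.5)  # P(+s | C)
    r = bernoulli(0.8 if c else 0.2)  # P(+r | C)
    w = bernoulli({(True, True): 0.99, (True, False): 0.90,
                   (False, True): 0.90, (False, False): 0.0}[(s, r)])
    return c, s, r, w

# Consistency: the empirical frequency of an event approaches its joint
# probability as N grows, e.g. P(+c, +r) = 0.5 * 0.8 = 0.4.
random.seed(0)
N = 100_000
samples = [prior_sample() for _ in range(N)]
frac = sum(1 for c, s, r, w in samples if c and r) / N
print(frac)  # ≈ 0.4
```

Rejection sampling and likelihood weighting, covered next in the outline, are small modifications of this loop: discard or down-weight samples according to the evidence.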