Berkeley COMPSCI 188 - Bayes nets III

CS 188: Artificial Intelligence, Fall 2011
Lecture 15: Bayes' Nets III: Inference
10/13/2011
Dan Klein – UC Berkeley
Many slides over this course adapted from Stuart Russell and Andrew Moore.

Bayes' Net Semantics
- A set of nodes, one per variable X
- A directed, acyclic graph
- A conditional distribution for each node:
  - A collection of distributions over X, one for each combination of parents' values
  - CPT: conditional probability table
  - Description of a noisy "causal" process
- (Figure: parents A_1, ..., A_n with arcs into X.)
- A Bayes net = Topology (graph) + Local Conditional Probabilities

Probabilities in BNs
- For all joint distributions, we have (chain rule):
    P(x_1, ..., x_n) = ∏_i P(x_i | x_1, ..., x_{i-1})
- Bayes' nets implicitly encode joint distributions:
  - As a product of local conditional distributions
  - To see what probability a BN gives to a full assignment, multiply all the relevant conditionals together:
    P(x_1, ..., x_n) = ∏_i P(x_i | parents(X_i))
- This lets us reconstruct any entry of the full joint
- Not every BN can represent every joint distribution
  - The topology enforces certain conditional independencies

All Conditional Independences
- Given a Bayes net structure, we can run d-separation to build a complete list of conditional independences that are necessarily true, of the form X_i ⊥ X_j | {X_{k_1}, ..., X_{k_n}}
- This list determines the set of probability distributions that can be represented

Same Assumptions, Different Graphs?
- Can you have two different graphs that encode the same assumptions? Yes! (The slide's examples are given as graph figures.)

Example: Independence
- For this graph (two coin flips X1 and X2 with no arc between them), you can fiddle with θ (the CPTs) all you want, but you won't be able to represent any distribution in which the flips are dependent!

    X1:  h 0.5, t 0.5
    X2:  h 0.5, t 0.5

  (Figure: the distributions this graph can represent, as a subset of all distributions.)

Topology Limits Distributions
- Given some graph topology G, only certain joint distributions can be encoded
- The graph structure guarantees certain (conditional) independences (there might be more independence)
- Adding arcs increases the set of distributions that can be represented, but has several costs
- Full conditioning can encode any distribution
- (Figure: the possible graph structures over three variables X, Y, Z.)
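
To make the product-of-conditionals formula above concrete, here is a minimal Python sketch (not from the slides; the dictionary encoding of CPTs is just one possible representation). It scores a full assignment for the R → T → L traffic network used later in this lecture:

    # A tiny Bayes net as {variable: (parents, CPT)}; each CPT maps a
    # (value, parent values...) tuple to a probability. The network and
    # numbers are the R -> T -> L traffic example from later slides.
    net = {
        'R': ((), {('+r',): 0.1, ('-r',): 0.9}),
        'T': (('R',), {('+t', '+r'): 0.8, ('-t', '+r'): 0.2,
                       ('+t', '-r'): 0.1, ('-t', '-r'): 0.9}),
        'L': (('T',), {('+l', '+t'): 0.3, ('-l', '+t'): 0.7,
                       ('+l', '-t'): 0.1, ('-l', '-t'): 0.9}),
    }

    def joint_probability(net, assignment):
        """P(x_1, ..., x_n) = product over i of P(x_i | parents(X_i))."""
        p = 1.0
        for var, (parents, cpt) in net.items():
            key = (assignment[var],) + tuple(assignment[q] for q in parents)
            p *= cpt[key]  # one local conditional per node
        return p

    # Any entry of the full joint is recoverable from the local CPTs:
    print(joint_probability(net, {'R': '+r', 'T': '+t', 'L': '+l'}))
    # 0.1 * 0.8 * 0.3 = 0.024

No joint table is ever materialized: one CPT lookup per node suffices.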
Causality?
- When Bayes' nets reflect the true causal patterns:
  - Often simpler (nodes have fewer parents)
  - Often easier to think about
  - Often easier to elicit from experts
- BNs need not actually be causal
  - Sometimes no causal net exists over the domain (e.g., consider the variables Traffic and Drips)
  - You end up with arrows that reflect correlation, not causation
- What do the arrows really mean?
  - Topology may happen to encode causal structure
  - Topology is only guaranteed to encode conditional independence
- More about causality: [Causality – Judea Pearl]

Bayes Nets Representation Summary
- Bayes nets compactly encode joint distributions
- Guaranteed independencies of distributions can be deduced from the BN graph structure
- D-separation gives precise conditional independence guarantees from the graph alone
- A Bayes' net's joint distribution may have further (conditional) independence that is not detectable until you inspect its specific distribution

Inference
- Inference: calculating some useful quantity from a joint probability distribution
- Examples:
  - Posterior probability: P(Q | E_1 = e_1, ..., E_k = e_k)
  - Most likely explanation: argmax_q P(Q = q | E_1 = e_1, ..., E_k = e_k)
- (Figure: the burglary network, B and E with arcs into A, A with arcs into J and M.)

Inference by Enumeration
- Given unlimited time, inference in BNs is easy
- Recipe:
  - State the marginal probabilities you need
  - Figure out ALL the atomic probabilities you need
  - Calculate and combine them

Example: Enumeration
- In this simple method, we only need the BN to synthesize the joint entries

Variable Elimination
- Why is inference by enumeration so slow?
  - You join up the whole joint distribution before you sum out the hidden variables
  - You end up repeating a lot of work!
- Idea: interleave joining and marginalizing!
  - This is called "Variable Elimination"
  - Still NP-hard, but usually much faster than inference by enumeration
- We'll need some new notation to define VE

Factor Zoo I
- Joint distribution: P(X, Y)
  - Entries P(x, y) for all x, y
  - Sums to 1
- Selected joint: P(x, Y)
  - A slice of the joint distribution
  - Entries P(x, y) for fixed x, all y
  - Sums to P(x)

    P(T, W):
      T     W      P
      hot   sun    0.4
      hot   rain   0.1
      cold  sun    0.2
      cold  rain   0.3

    P(cold, W):
      T     W      P
      cold  sun    0.2
      cold  rain   0.3

Factor Zoo II
- Family of conditionals: P(X | Y)
  - Multiple conditionals
  - Entries P(x | y) for all x, y
  - Sums to |Y|
- Single conditional: P(Y | x)
  - Entries P(y | x) for fixed x, all y
  - Sums to 1

    P(W | T):
      T     W      P
      hot   sun    0.8
      hot   rain   0.2
      cold  sun    0.4
      cold  rain   0.6

    P(W | cold):
      T     W      P
      cold  sun    0.4
      cold  rain   0.6

Factor Zoo III
- Specified family: P(y | X)
  - Entries P(y | x) for fixed y, but for all x
  - Sums to ... who knows!

    P(rain | T):
      T     W      P
      hot   rain   0.2
      cold  rain   0.6

- In general, when we write P(Y_1 ... Y_N | X_1 ... X_M):
  - It is a "factor," a multi-dimensional array
  - Its values are all P(y_1 ... y_N | x_1 ... x_M)
  - Any assigned X or Y is a dimension missing (selected) from the array

Example: Traffic Domain
- Random variables:
  - R: Raining
  - T: Traffic
  - L: Late for class!
- First query: P(L)
- (Network: R → T → L)

    P(R):      +r 0.1, -r 0.9
    P(T | R):  +r +t 0.8, +r -t 0.2, -r +t 0.1, -r -t 0.9
    P(L | T):  +t +l 0.3, +t -l 0.7, -t +l 0.1, -t -l 0.9
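
Before introducing the VE notation, here is a sketch of the enumeration recipe applied to the first query P(L). The function name and looping style are my own choices, but the CPT numbers are exactly the tables above:

    from itertools import product

    # CPTs from the traffic-domain tables above (R -> T -> L).
    P_R = {'+r': 0.1, '-r': 0.9}
    P_T_given_R = {('+t', '+r'): 0.8, ('-t', '+r'): 0.2,
                   ('+t', '-r'): 0.1, ('-t', '-r'): 0.9}
    P_L_given_T = {('+l', '+t'): 0.3, ('-l', '+t'): 0.7,
                   ('+l', '-t'): 0.1, ('-l', '-t'): 0.9}

    def query_L():
        """Enumerate every joint entry, summing out the hidden R and T."""
        answer = {}
        for l in ('+l', '-l'):
            total = 0.0
            for r, t in product(('+r', '-r'), ('+t', '-t')):
                total += P_R[r] * P_T_given_R[(t, r)] * P_L_given_T[(l, t)]
            answer[l] = total
        return answer

    print(query_L())  # {'+l': 0.134, '-l': 0.866}

Note the inefficiency the Variable Elimination slide complains about: all eight joint entries are built before anything is summed out.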
Variable Elimination Outline
- Track objects called factors
- Initial factors are the local CPTs (one per node):

    P(R):      +r 0.1, -r 0.9
    P(T | R):  +r +t 0.8, +r -t 0.2, -r +t 0.1, -r -t 0.9
    P(L | T):  +t +l 0.3, +t -l 0.7, -t +l 0.1, -t -l 0.9

- Any known values are selected. E.g., if we know L = +l, the initial factors are:

    P(R):       +r 0.1, -r 0.9
    P(T | R):   +r +t 0.8, +r -t 0.2, -r +t 0.1, -r -t 0.9
    P(+l | T):  +t +l 0.3, -t +l 0.1

- VE: alternately join factors and eliminate variables

Operation 1: Join Factors
- First basic operation: joining factors
- Combining factors is just like a database join:
  - Get all factors over the joining variable
  - Build a new factor over the union of the variables involved
- Computation for each entry: pointwise products
- Example: join on R, combining P(R) and P(T | R) into P(R, T):

    P(R, T):  +r +t 0.08, +r -t 0.02, -r +t 0.09, -r -t 0.81

Example: Multiple Joins
- Join on R: P(R), P(T | R), P(L | T) become P(R, T), P(L | T)

    P(R, T):   +r +t 0.08, +r -t 0.02, -r +t 0.09, -r -t 0.81
    P(L | T):  +t +l 0.3, +t -l 0.7, -t +l 0.1, -t -l 0.9

- Join on T: P(R, T), P(L | T) become P(R, T, L)

    P(R, T, L):
      +r +t +l  0.024
      +r +t -l  0.056
      +r -t +l  0.002
      +r -t -l  0.018
      -r +t +l  0.027
      -r +t -l  0.063
      -r -t +l  0.081
      -r -t -l  0.729

Operation 2: Eliminate
- Second basic operation: marginalization
- Take a factor and sum out a variable:
  - Shrinks a factor to a smaller one
  - A projection operation
- Example: summing R out of P(R, T):

    P(R, T):  +r +t 0.08, +r -t 0.02, -r +t 0.09, -r -t 0.81
    P(T):     +t 0.17, -t 0.83
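
Finally, a minimal sketch of the two VE operations, assuming a factor is represented as a (variables, table) pair; the representation and helper names are my own, but the printed numbers reproduce the join-on-R and sum-out-R examples above:

    # A factor is (variables, table): a tuple of variable names plus a
    # dict mapping one tuple of values per variable to a number.

    def join(f1, f2):
        """Operation 1: pointwise product over the union of the variables."""
        (vars1, t1), (vars2, t2) = f1, f2
        out_vars = vars1 + tuple(v for v in vars2 if v not in vars1)
        out = {}
        for a1, p1 in t1.items():
            env = dict(zip(vars1, a1))
            for a2, p2 in t2.items():
                env2 = dict(zip(vars2, a2))
                # keep only row pairs that agree on the shared variables
                if all(env.get(v, env2[v]) == env2[v] for v in vars2):
                    full = {**env, **env2}
                    out[tuple(full[v] for v in out_vars)] = p1 * p2
        return (out_vars, out)

    def eliminate(f, var):
        """Operation 2: sum out var, projecting to a smaller factor."""
        vars_, table = f
        i = vars_.index(var)
        out_vars = vars_[:i] + vars_[i + 1:]
        out = {}
        for a, p in table.items():
            key = a[:i] + a[i + 1:]
            out[key] = out.get(key, 0.0) + p
        return (out_vars, out)

    P_R = (('R',), {('+r',): 0.1, ('-r',): 0.9})
    P_T_given_R = (('T', 'R'), {('+t', '+r'): 0.8, ('-t', '+r'): 0.2,
                                ('+t', '-r'): 0.1, ('-t', '-r'): 0.9})

    P_RT = join(P_R, P_T_given_R)   # {('+r','+t'): 0.08, ..., ('-r','-t'): 0.81}
    print(eliminate(P_RT, 'R')[1])  # {('+t',): 0.17, ('-t',): 0.83}, up to rounding

Interleaving these two operations per hidden variable, instead of joining everything first, is the whole of variable elimination.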

