Berkeley COMPSCI 188 - Bayes nets III

CS 188: Artificial Intelligence, Fall 2011
Lecture 15: Bayes' Nets III: Inference
10/13/2011
Dan Klein – UC Berkeley
Many slides over this course adapted from Stuart Russell and Andrew Moore.

Bayes' Net Semantics
- A set of nodes, one per variable X
- A directed, acyclic graph
- A conditional distribution for each node:
  - A collection of distributions over X, one for each combination of parents' values
  - CPT: conditional probability table
  - Description of a noisy "causal" process
- (Figure: parents A_1, ..., A_n with arcs into X.)
- A Bayes net = Topology (graph) + Local Conditional Probabilities

Probabilities in BNs
- For all joint distributions, we have (chain rule):
    P(x_1, ..., x_n) = ∏_i P(x_i | x_1, ..., x_{i-1})
- Bayes' nets implicitly encode joint distributions:
  - As a product of local conditional distributions
  - To see what probability a BN gives to a full assignment, multiply all the relevant conditionals together:
    P(x_1, ..., x_n) = ∏_i P(x_i | parents(X_i))
- This lets us reconstruct any entry of the full joint
- Not every BN can represent every joint distribution
  - The topology enforces certain conditional independencies

All Conditional Independences
- Given a Bayes net structure, we can run d-separation to build a complete list of conditional independences that are necessarily true, of the form X_i ⊥ X_j | {X_{k_1}, ..., X_{k_n}}
- This list determines the set of probability distributions that can be represented

Same Assumptions, Different Graphs?
- Can you have two different graphs that encode the same assumptions? Yes! (The slide's examples are given as graph figures.)

Example: Independence
- For this graph (two coin flips X1 and X2 with no arc between them), you can fiddle with θ (the CPTs) all you want, but you won't be able to represent any distribution in which the flips are dependent!

    X1:  h 0.5, t 0.5
    X2:  h 0.5, t 0.5

  (Figure: the distributions this graph can represent, as a subset of all distributions.)

Topology Limits Distributions
- Given some graph topology G, only certain joint distributions can be encoded
- The graph structure guarantees certain (conditional) independences (there might be more independence)
- Adding arcs increases the set of distributions that can be represented, but has several costs
- Full conditioning can encode any distribution
- (Figure: the possible graph structures over three variables X, Y, Z.)
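
To make the product-of-conditionals formula above concrete, here is a minimal Python sketch (not from the slides; the dictionary encoding of CPTs is just one possible representation). It scores a full assignment for the R → T → L traffic network used later in this lecture:

    # A tiny Bayes net as {variable: (parents, CPT)}; each CPT maps a
    # (value, parent values...) tuple to a probability. The network and
    # numbers are the R -> T -> L traffic example from later slides.
    net = {
        'R': ((), {('+r',): 0.1, ('-r',): 0.9}),
        'T': (('R',), {('+t', '+r'): 0.8, ('-t', '+r'): 0.2,
                       ('+t', '-r'): 0.1, ('-t', '-r'): 0.9}),
        'L': (('T',), {('+l', '+t'): 0.3, ('-l', '+t'): 0.7,
                       ('+l', '-t'): 0.1, ('-l', '-t'): 0.9}),
    }

    def joint_probability(net, assignment):
        """P(x_1, ..., x_n) = product over i of P(x_i | parents(X_i))."""
        p = 1.0
        for var, (parents, cpt) in net.items():
            key = (assignment[var],) + tuple(assignment[q] for q in parents)
            p *= cpt[key]  # one local conditional per node
        return p

    # Any entry of the full joint is recoverable from the local CPTs:
    print(joint_probability(net, {'R': '+r', 'T': '+t', 'L': '+l'}))
    # 0.1 * 0.8 * 0.3 = 0.024

No joint table is ever materialized: one CPT lookup per node suffices.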
Causality?
- When Bayes' nets reflect the true causal patterns:
  - Often simpler (nodes have fewer parents)
  - Often easier to think about
  - Often easier to elicit from experts
- BNs need not actually be causal
  - Sometimes no causal net exists over the domain (e.g., consider the variables Traffic and Drips)
  - You end up with arrows that reflect correlation, not causation
- What do the arrows really mean?
  - Topology may happen to encode causal structure
  - Topology is only guaranteed to encode conditional independence
- More about causality: [Causality – Judea Pearl]

Bayes Nets Representation Summary
- Bayes nets compactly encode joint distributions
- Guaranteed independencies of distributions can be deduced from the BN graph structure
- D-separation gives precise conditional independence guarantees from the graph alone
- A Bayes' net's joint distribution may have further (conditional) independence that is not detectable until you inspect its specific distribution

Inference
- Inference: calculating some useful quantity from a joint probability distribution
- Examples:
  - Posterior probability: P(Q | E_1 = e_1, ..., E_k = e_k)
  - Most likely explanation: argmax_q P(Q = q | E_1 = e_1, ..., E_k = e_k)
- (Figure: the burglary network, B and E with arcs into A, A with arcs into J and M.)

Inference by Enumeration
- Given unlimited time, inference in BNs is easy
- Recipe:
  - State the marginal probabilities you need
  - Figure out ALL the atomic probabilities you need
  - Calculate and combine them

Example: Enumeration
- In this simple method, we only need the BN to synthesize the joint entries

Variable Elimination
- Why is inference by enumeration so slow?
  - You join up the whole joint distribution before you sum out the hidden variables
  - You end up repeating a lot of work!
- Idea: interleave joining and marginalizing!
  - This is called "Variable Elimination"
  - Still NP-hard, but usually much faster than inference by enumeration
- We'll need some new notation to define VE

Factor Zoo I
- Joint distribution: P(X, Y)
  - Entries P(x, y) for all x, y
  - Sums to 1
- Selected joint: P(x, Y)
  - A slice of the joint distribution
  - Entries P(x, y) for fixed x, all y
  - Sums to P(x)

    P(T, W):
      T     W      P
      hot   sun    0.4
      hot   rain   0.1
      cold  sun    0.2
      cold  rain   0.3

    P(cold, W):
      T     W      P
      cold  sun    0.2
      cold  rain   0.3

Factor Zoo II
- Family of conditionals: P(X | Y)
  - Multiple conditionals
  - Entries P(x | y) for all x, y
  - Sums to |Y|
- Single conditional: P(Y | x)
  - Entries P(y | x) for fixed x, all y
  - Sums to 1

    P(W | T):
      T     W      P
      hot   sun    0.8
      hot   rain   0.2
      cold  sun    0.4
      cold  rain   0.6

    P(W | cold):
      T     W      P
      cold  sun    0.4
      cold  rain   0.6

Factor Zoo III
- Specified family: P(y | X)
  - Entries P(y | x) for fixed y, but for all x
  - Sums to ... who knows!

    P(rain | T):
      T     W      P
      hot   rain   0.2
      cold  rain   0.6

- In general, when we write P(Y_1 ... Y_N | X_1 ... X_M):
  - It is a "factor," a multi-dimensional array
  - Its values are all P(y_1 ... y_N | x_1 ... x_M)
  - Any assigned X or Y is a dimension missing (selected) from the array

Example: Traffic Domain
- Random variables:
  - R: Raining
  - T: Traffic
  - L: Late for class!
- First query: P(L)
- (Network: R → T → L)

    P(R):      +r 0.1, -r 0.9
    P(T | R):  +r +t 0.8, +r -t 0.2, -r +t 0.1, -r -t 0.9
    P(L | T):  +t +l 0.3, +t -l 0.7, -t +l 0.1, -t -l 0.9
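
Before introducing the VE notation, here is a sketch of the enumeration recipe applied to the first query P(L). The function name and looping style are my own choices, but the CPT numbers are exactly the tables above:

    from itertools import product

    # CPTs from the traffic-domain tables above (R -> T -> L).
    P_R = {'+r': 0.1, '-r': 0.9}
    P_T_given_R = {('+t', '+r'): 0.8, ('-t', '+r'): 0.2,
                   ('+t', '-r'): 0.1, ('-t', '-r'): 0.9}
    P_L_given_T = {('+l', '+t'): 0.3, ('-l', '+t'): 0.7,
                   ('+l', '-t'): 0.1, ('-l', '-t'): 0.9}

    def query_L():
        """Enumerate every joint entry, summing out the hidden R and T."""
        answer = {}
        for l in ('+l', '-l'):
            total = 0.0
            for r, t in product(('+r', '-r'), ('+t', '-t')):
                total += P_R[r] * P_T_given_R[(t, r)] * P_L_given_T[(l, t)]
            answer[l] = total
        return answer

    print(query_L())  # {'+l': 0.134, '-l': 0.866}

Note the inefficiency the Variable Elimination slide complains about: all eight joint entries are built before anything is summed out.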
Variable Elimination Outline
- Track objects called factors
- Initial factors are the local CPTs (one per node):

    P(R):      +r 0.1, -r 0.9
    P(T | R):  +r +t 0.8, +r -t 0.2, -r +t 0.1, -r -t 0.9
    P(L | T):  +t +l 0.3, +t -l 0.7, -t +l 0.1, -t -l 0.9

- Any known values are selected. E.g., if we know L = +l, the initial factors are:

    P(R):       +r 0.1, -r 0.9
    P(T | R):   +r +t 0.8, +r -t 0.2, -r +t 0.1, -r -t 0.9
    P(+l | T):  +t +l 0.3, -t +l 0.1

- VE: alternately join factors and eliminate variables

Operation 1: Join Factors
- First basic operation: joining factors
- Combining factors is just like a database join:
  - Get all factors over the joining variable
  - Build a new factor over the union of the variables involved
- Computation for each entry: pointwise products
- Example: join on R, combining P(R) and P(T | R) into P(R, T):

    P(R, T):  +r +t 0.08, +r -t 0.02, -r +t 0.09, -r -t 0.81

Example: Multiple Joins
- Join on R: P(R), P(T | R), P(L | T) become P(R, T), P(L | T)

    P(R, T):   +r +t 0.08, +r -t 0.02, -r +t 0.09, -r -t 0.81
    P(L | T):  +t +l 0.3, +t -l 0.7, -t +l 0.1, -t -l 0.9

- Join on T: P(R, T), P(L | T) become P(R, T, L)

    P(R, T, L):
      +r +t +l  0.024
      +r +t -l  0.056
      +r -t +l  0.002
      +r -t -l  0.018
      -r +t +l  0.027
      -r +t -l  0.063
      -r -t +l  0.081
      -r -t -l  0.729

Operation 2: Eliminate
- Second basic operation: marginalization
- Take a factor and sum out a variable:
  - Shrinks a factor to a smaller one
  - A projection operation
- Example: summing R out of P(R, T):

    P(R, T):  +r +t 0.08, +r -t 0.02, -r +t 0.09, -r -t 0.81
    P(T):     +t 0.17, -t 0.83
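
Finally, a minimal sketch of the two VE operations, assuming a factor is represented as a (variables, table) pair; the representation and helper names are my own, but the printed numbers reproduce the join-on-R and sum-out-R examples above:

    # A factor is (variables, table): a tuple of variable names plus a
    # dict mapping one tuple of values per variable to a number.

    def join(f1, f2):
        """Operation 1: pointwise product over the union of the variables."""
        (vars1, t1), (vars2, t2) = f1, f2
        out_vars = vars1 + tuple(v for v in vars2 if v not in vars1)
        out = {}
        for a1, p1 in t1.items():
            env = dict(zip(vars1, a1))
            for a2, p2 in t2.items():
                env2 = dict(zip(vars2, a2))
                # keep only row pairs that agree on the shared variables
                if all(env.get(v, env2[v]) == env2[v] for v in vars2):
                    full = {**env, **env2}
                    out[tuple(full[v] for v in out_vars)] = p1 * p2
        return (out_vars, out)

    def eliminate(f, var):
        """Operation 2: sum out var, projecting to a smaller factor."""
        vars_, table = f
        i = vars_.index(var)
        out_vars = vars_[:i] + vars_[i + 1:]
        out = {}
        for a, p in table.items():
            key = a[:i] + a[i + 1:]
            out[key] = out.get(key, 0.0) + p
        return (out_vars, out)

    P_R = (('R',), {('+r',): 0.1, ('-r',): 0.9})
    P_T_given_R = (('T', 'R'), {('+t', '+r'): 0.8, ('-t', '+r'): 0.2,
                                ('+t', '-r'): 0.1, ('-t', '-r'): 0.9})

    P_RT = join(P_R, P_T_given_R)   # {('+r','+t'): 0.08, ..., ('-r','-t'): 0.81}
    print(eliminate(P_RT, 'R')[1])  # {('+t',): 0.17, ('-t',): 0.83}, up to rounding

Interleaving these two operations per hidden variable, instead of joining everything first, is the whole of variable elimination.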

