Bayesian Networks – Representation (cont.), Inference
Machine Learning – 10-701/15-781
Carlos Guestrin
Carnegie Mellon University
March 22nd, 2006

Required readings from Koller & Friedman:
• Representation: 2.1, 2.2
• Inference: 5.1, 6.1, 6.2, 6.7.1
• Optional: 2.3, 5.2, 5.3, 6.3, 6.7.2

Announcements
• One-page project proposal due now
• We'll go over the midterm in this week's recitation
• Homework 4 out later today, due April 5th (two weeks from today)

Handwriting recognition
• Character recognition, e.g., kernel SVMs
[Figure: handwritten character images with candidate labels]

Handwriting recognition 2
[Figure]

Car starts BN
• 18 binary attributes
• Inference: P(BatteryAge | Starts = f)
• 2^18 terms, why so fast?
• Not impressed? The HailFinder BN has more than 3^54 = 58,149,737,003,040,059,690,390,169 terms

Factored joint distribution – Preview
[Figure: BN with edges Flu → Sinus, Allergy → Sinus, Sinus → Headache, Sinus → Nose]

The independence assumption
• Local Markov Assumption: a variable X is independent of its non-descendants given its parents
[Figure: same Flu/Allergy/Sinus/Headache/Nose BN]

Explaining away
[Figure: same BN; Local Markov Assumption repeated]

Chain rule & Joint distribution
• Chain rule: P(X1,…,Xn) = ∏i P(Xi | X1,…,Xi-1)
• With the local Markov assumption, the BN factors the joint as P(X1,…,Xn) = ∏i P(Xi | Pa(Xi))
• For this example: P(F, A, S, H, N) = P(F) P(A) P(S | F, A) P(H | S) P(N | S)

Two (trivial) special cases
• Edgeless graph: no parents, so P(X1,…,Xn) = ∏i P(Xi), i.e., all variables independent
• Fully-connected graph: no independence assumptions at all; the factorization is just the chain rule

The Representation Theorem – Joint Distribution to BN
• Start from a joint probability distribution P; obtain a BN that encodes independence assumptions
• If the conditional independencies in the BN are a subset of the conditional independencies in P, then P can be written as the product of the BN's CPTs

Real Bayesian networks applications
• Diagnosis of lymph node disease
• Speech recognition
• Microsoft Office and Windows (http://www.research.microsoft.com/research/dtg/)
• Study of the human genome
• Robot mapping
• Robots to identify meteorites to study
• Modeling fMRI data
• Anomaly detection
• Fault diagnosis
• Modeling sensor network data

A general Bayes net
• Set of random variables
• Directed acyclic graph
• Encodes independence assumptions
• CPTs
• Joint distribution: P(X1,…,Xn) = ∏i P(Xi | Pa(Xi))

Another example
• Variables: B – Burglar, E – Earthquake, A – Burglar alarm, N – Neighbor calls, R – Radio report
• Both burglars and earthquakes can set off the alarm
• If the alarm sounds, a neighbor may call
• An earthquake may be announced on the radio

Another example – Building the BN
• B – Burglar, E – Earthquake, A – Burglar alarm, N – Neighbor calls, R – Radio report
• The statements above give the structure B → A ← E, A → N, E → R

Independencies encoded in BN
• We said: all you need is the local Markov assumption (Xi ⊥ NonDescendants(Xi) | Pa(Xi))
• But then we talked about other (in)dependencies, e.g., explaining away
• What are the independencies encoded by a BN?
  – The only assumption is local Markov, but many others can be derived using the algebra of conditional independencies!

Understanding independencies in BNs – BNs with 3 nodes
• Local Markov Assumption: a variable X is independent of its non-descendants given its parents
• Indirect causal effect: X → Z → Y
• Indirect evidential effect: X ← Z ← Y
• Common cause: X ← Z → Y
• Common effect: X → Z ← Y

Understanding independencies in BNs – Some examples
[Figure: example BN over nodes A, B, C, D, E, F, G, H, I, J, K]

An active trail – Example
[Figure: example BN over nodes A–H, with additional nodes F′ and F″]
• When are A and H independent?

Active trails formalized
• A path X1 – X2 – ⋯ – Xk is an active trail when variables O ⊆ {X1,…,Xn} are observed if for each consecutive triplet in the trail:
  – Xi-1 → Xi → Xi+1, and Xi is not observed (Xi ∉ O)
  – Xi-1 ← Xi ← Xi+1, and Xi is not observed (Xi ∉ O)
  – Xi-1 ← Xi → Xi+1, and Xi is not observed (Xi ∉ O)
  – Xi-1 → Xi ← Xi+1, and Xi is observed (Xi ∈ O), or one of its descendants is
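These four triplet rules translate directly into code. Below is a minimal sketch (not from the lecture) of an active-trail test in Python; the encoding of a DAG as a dict of node → children lists and all function names are illustrative assumptions.

```python
# Minimal sketch of the active-trail test above (illustrative, not from
# the lecture). A DAG is a dict mapping each node to its list of children.

def descendants(graph, node):
    """All descendants of `node` in the DAG."""
    seen, stack = set(), list(graph.get(node, []))
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(graph.get(n, []))
    return seen

def is_active_trail(graph, trail, observed):
    """Check whether `trail` (a list of nodes forming a path, edge
    directions ignored) is active given the `observed` variables,
    applying the four triplet rules from the slide."""
    observed = set(observed)
    for i in range(1, len(trail) - 1):
        prev, mid, nxt = trail[i - 1], trail[i], trail[i + 1]
        into_mid_from_prev = mid in graph.get(prev, [])  # edge prev -> mid?
        into_mid_from_nxt = mid in graph.get(nxt, [])    # edge nxt -> mid?
        if into_mid_from_prev and into_mid_from_nxt:
            # v-structure prev -> mid <- nxt: active only if mid or one
            # of its descendants is observed
            if mid not in observed and not (descendants(graph, mid) & observed):
                return False
        elif mid in observed:
            # causal chain or common cause: blocked if mid is observed
            return False
    return True

# On the Flu/Allergy/Sinus network this reproduces explaining away:
g = {"Flu": ["Sinus"], "Allergy": ["Sinus"], "Sinus": ["Headache", "Nose"]}
print(is_active_trail(g, ["Flu", "Sinus", "Allergy"], observed=[]))            # False
print(is_active_trail(g, ["Flu", "Sinus", "Allergy"], observed=["Headache"]))  # True
```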
Active trails and independence?
• Theorem: Variables Xi and Xj are independent given Z ⊆ {X1,…,Xn} if there is no active trail between Xi and Xj when the variables in Z are observed
[Figure: the example BN over nodes A–K]

The BN Representation Theorem
• If the joint probability distribution factors according to the BN, i.e., P(X1,…,Xn) = ∏i P(Xi | Pa(Xi)), then the conditional independencies in the BN are a subset of the conditional independencies in P
  – Important because: we can read independencies of P from the BN structure G
• If the conditional independencies in the BN are a subset of the conditional independencies in P, then the joint distribution can be obtained as the product of the CPTs
  – Important because: every P has at least one BN structure G

"Simpler" BNs
• A distribution can be represented by many BNs
• A simpler BN requires fewer parameters

Learning Bayes nets
• Two axes: known vs. unknown structure, and fully observable vs. missing data
• Data: x(1), …, x(m)
• From data, learn the structure and the parameters: the CPTs P(Xi | Pa(Xi))

Learning the CPTs
• Data: x(1), …, x(m)
• For each discrete variable Xi, estimate P(Xi = x | Pa(Xi) = u) from counts: Count(x, u) / Count(u) (a counting sketch appears at the end of this section)

What you need to know
• Bayesian networks
  – A compact representation for large probability distributions
  – Not an algorithm
• Semantics of a BN: conditional independence assumptions
• Representation: variables, graph, CPTs
• Why BNs are useful
• Learning CPTs from fully observable data
• Play with the applet! ☺

General probabilistic inference
• Query: P(X | e)
• Using Bayes rule: P(X | e) = P(X, e) / P(e)
• Normalization: P(X | e) ∝ P(X, e)
[Figure: the Flu/Allergy/Sinus/Headache/Nose BN]

Marginalization
• Example with Flu, Sinus, and evidence Allergy = t (sum out Flu)

Probabilistic inference example
[Figure: the same BN with evidence Nose = t]
• Inference seems exponential in the number of variables!

Inference is NP-hard
• (Actually #P-complete)
• Reduction from 3-SAT: e.g., (X1 ∨ X2 ∨ X3) ∧ (X2 ∨ X3 ∨ X4) ∧ …
• Inference unlikely to be efficient in general, but…

Fast probabilistic inference example – Variable elimination
[Figure: the same BN with evidence Nose = t]
• (Potential for) exponential reduction in computation!

Understanding variable elimination – Exploiting distributivity
• Chain Flu → Sinus → Nose = t: push sums inside products, e.g.
  Σs Σf P(f) P(s | f) P(n = t | s) = Σs P(n = t | s) Σf P(f) P(s | f)

Understanding variable elimination – Order can make a HUGE difference
[Figure: the same BN with evidence Nose = t]

Understanding variable elimination – Intermediate results
[Figure: the same BN with evidence Nose = t]
• Intermediate results are probability distributions

Understanding variable elimination – Another example
[Figure: BN with nodes Pharmacy, Sinus, Headache, Nose = t]

Pruning irrelevant variables
[Figure: the same BN with evidence Nose = t]
• Prune all non-ancestors of the query variables

Variable elimination algorithm
• Given a BN and a query P(X | e) ∝ P(X, e)   (IMPORTANT!!!)
• Instantiate evidence e
• Prune non-ancestors of {X, e}
• Choose an ordering on the variables, e.g., X1, …, Xn
• For i = 1 to n: if Xi ∉ {X, e}
  – Collect factors f1, …, fk that include Xi
  – Generate a new factor by eliminating Xi from these factors
  – Variable Xi has been eliminated!
• Normalize P(X, e) to obtain P(X | e)
(A code sketch of this algorithm appears at the end of this section.)

Complexity of variable elimination – (Poly)-tree graphs
• Variable elimination order: start from the "leaves" up – find a topological order, eliminate variables in reverse order
• Linear in the number of variables! (versus exponential)

Complexity of variable elimination – Graphs with
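Referring back to the "Variable elimination algorithm" slide, here is a minimal runnable sketch in Python. The Factor class, the function names, and all CPT numbers are illustrative assumptions, not the lecture's; the elimination loop follows the slide's steps (instantiate evidence, eliminate each non-query variable by multiplying and summing out, normalize).

```python
from itertools import product

class Factor:
    """A factor over binary variables: an ordered list of variable names
    plus a table mapping each 0/1 assignment tuple to a number."""
    def __init__(self, variables, table):
        self.variables = list(variables)
        self.table = dict(table)

def multiply(f, g):
    # Pointwise product over the union of the two factors' variables.
    variables = f.variables + [v for v in g.variables if v not in f.variables]
    table = {}
    for assign in product((0, 1), repeat=len(variables)):
        a = dict(zip(variables, assign))
        table[assign] = (f.table[tuple(a[v] for v in f.variables)] *
                         g.table[tuple(a[v] for v in g.variables)])
    return Factor(variables, table)

def sum_out(f, var):
    # Eliminate var from f by summing over its two values.
    i = f.variables.index(var)
    table = {}
    for assign, val in f.table.items():
        key = assign[:i] + assign[i + 1:]
        table[key] = table.get(key, 0.0) + val
    return Factor(f.variables[:i] + f.variables[i + 1:], table)

def restrict(f, var, value):
    # Instantiate evidence var = value (drop the variable from the factor).
    if var not in f.variables:
        return f
    i = f.variables.index(var)
    table = {a[:i] + a[i + 1:]: v for a, v in f.table.items() if a[i] == value}
    return Factor(f.variables[:i] + f.variables[i + 1:], table)

def variable_elimination(factors, query, evidence, order):
    """P(query | evidence). `order` must list each hidden (non-query,
    non-evidence) variable exactly once."""
    for var, val in evidence.items():
        factors = [restrict(f, var, val) for f in factors]
    for var in order:
        involved = [f for f in factors if var in f.variables]
        rest = [f for f in factors if var not in f.variables]
        prod = involved[0]
        for f in involved[1:]:
            prod = multiply(prod, f)
        factors = rest + [sum_out(prod, var)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)  # final factor is over `query` only
    z = sum(result.table.values())    # normalize P(X, e) into P(X | e)
    return {a[0]: v / z for a, v in result.table.items()}

# The Flu/Allergy/Sinus/Headache/Nose network with made-up CPT numbers.
p_flu = Factor(["Flu"], {(0,): 0.9, (1,): 0.1})
p_allergy = Factor(["Allergy"], {(0,): 0.8, (1,): 0.2})
p_sinus = Factor(["Sinus", "Flu", "Allergy"], {
    (1, 1, 1): 0.9, (1, 1, 0): 0.7, (1, 0, 1): 0.6, (1, 0, 0): 0.05,
    (0, 1, 1): 0.1, (0, 1, 0): 0.3, (0, 0, 1): 0.4, (0, 0, 0): 0.95})
p_headache = Factor(["Headache", "Sinus"], {(1, 1): 0.8, (0, 1): 0.2,
                                            (1, 0): 0.1, (0, 0): 0.9})
p_nose = Factor(["Nose", "Sinus"], {(1, 1): 0.7, (0, 1): 0.3,
                                    (1, 0): 0.05, (0, 0): 0.95})

# P(Flu | Nose = t). Headache is a non-ancestor of {Flu, Nose} and could be
# pruned; left in, it simply sums out to 1.
posterior = variable_elimination(
    [p_flu, p_allergy, p_sinus, p_headache, p_nose],
    query="Flu", evidence={"Nose": 1}, order=["Headache", "Allergy", "Sinus"])
print(posterior)  # {0: P(Flu=f | Nose=t), 1: P(Flu=t | Nose=t)}
```

With this ordering no intermediate factor involves more than three variables; a bad ordering can create exponentially larger intermediate factors, which is the point of the "order can make a HUGE difference" slide.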
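Returning to the "Learning the CPTs" slide: with fully observed data, the maximum-likelihood estimate of each CPT entry is a ratio of counts, P(Xi = x | Pa(Xi) = u) = Count(x, u) / Count(u). A minimal sketch, assuming data arrives as a list of dicts (an illustrative format, not the lecture's):

```python
from collections import Counter

def learn_cpt(data, var, parents):
    """Estimate P(var | parents) by counting in fully observed data."""
    joint = Counter((tuple(x[p] for p in parents), x[var]) for x in data)
    parent_counts = Counter(tuple(x[p] for p in parents) for x in data)
    return {(u, x): c / parent_counts[u] for (u, x), c in joint.items()}

# Example: estimate P(Sinus | Flu, Allergy) from four made-up records.
data = [
    {"Flu": 1, "Allergy": 0, "Sinus": 1},
    {"Flu": 1, "Allergy": 0, "Sinus": 0},
    {"Flu": 0, "Allergy": 1, "Sinus": 1},
    {"Flu": 0, "Allergy": 0, "Sinus": 0},
]
cpt = learn_cpt(data, "Sinus", ["Flu", "Allergy"])
# cpt[((1, 0), 1)] == 0.5, i.e., P(Sinus=t | Flu=t, Allergy=f) = 1/2
```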