Bayesian Networks – Representation (cont.) / Inference

Outline:
- Announcements
- Handwriting recognition
- Handwriting recognition 2
- Car starts BN
- Factored joint distribution – Preview
- The independence assumption
- Explaining away
- Chain rule & joint distribution
- Two (trivial) special cases
- The Representation Theorem – Joint Distribution to BN
- Real Bayesian networks applications
- A general Bayes net
- Another example
- Another example – Building the BN
- Independencies encoded in BN
- Understanding independencies in BNs – BNs with 3 nodes
- Understanding independencies in BNs – Some examples
- An active trail – Example
- Active trails formalized
- Active trails and independence?
- The BN Representation Theorem
- "Simpler" BNs
- Learning Bayes nets
- Learning the CPTs
- What you need to know
- General probabilistic inference
- Marginalization
- Probabilistic inference example
- Inference is NP-hard (actually #P-complete)
- Fast probabilistic inference example – Variable elimination
- Understanding variable elimination – Exploiting distributivity
- Understanding variable elimination – Order can make a HUGE difference
- Understanding variable elimination – Intermediate results
- Understanding variable elimination – Another example
- Pruning irrelevant variables
- Variable elimination algorithm
- Complexity of variable elimination – (Poly)-tree graphs
- Complexity of variable elimination – Graphs with loops
- Complexity of variable elimination – Tree-width
- Example: large tree-width with small number of parents
- Choosing an elimination order
- Most likely explanation (MLE)
- Max-marginalization
- Example of variable elimination for MLE – Forward pass
- Example of variable elimination for MLE – Backward pass
- MLE variable elimination algorithm – Forward pass
- MLE variable elimination algorithm – Backward pass
- What you need to know
- Acknowledgements

Slide 1: Required readings from Koller & Friedman
- Representation: 2.1, 2.2
- Inference: 5.1, 6.1, 6.2, 6.7.1
- Optional: 2.3, 5.2, 5.3, 6.3, 6.7.2

Title slide: Bayesian Networks – Representation (cont.) / Inference
Machine Learning – 10701/15781
Carlos Guestrin, Carnegie Mellon University
March
22nd, 2006

Slide 2: Announcements
- One-page project proposal due now
- We'll go over the midterm in this week's recitation
- Homework 4 out later today, due April 5th (two weeks from today)

Slide 3: Handwriting recognition
- Character recognition, e.g., kernel SVMs
- (Figure: example handwritten characters and labels – z, c, b, c, a, c, r, ...)

Slide 4: Handwriting recognition 2
- (Figure only)

Slide 5: Car starts BN
- 18 binary attributes
- Inference: P(BatteryAge | Starts = f)
- 2^18 terms – why so fast?
- Not impressed? The HailFinder BN has more than 3^54 = 58149737003040059690390169 terms

Slide 6: Factored joint distribution – Preview
- (Figure: BN over Flu, Allergy, Sinus, Headache, Nose)

Slide 7: The independence assumption
- Local Markov Assumption: a variable X is independent of its non-descendants given its parents
- (Same Flu/Allergy/Sinus/Headache/Nose network)

Slide 8: Explaining away
- (Same network and Local Markov Assumption)

Slide 9: Chain rule & joint distribution
- (Same network and Local Markov Assumption)

Slide 10: Two (trivial) special cases
- Edgeless graph
- Fully-connected graph

Slide 11: The Representation Theorem – Joint Distribution to BN
- BN: encodes independence assumptions
- If the conditional independencies encoded in the BN are a subset of the conditional independencies in P, then we can obtain P as the BN's factored joint probability distribution

Slide 12: Real Bayesian networks applications
- Diagnosis of lymph node disease
- Speech recognition
- Microsoft Office and Windows: http://www.research.microsoft.com/research/dtg/
- Study of the human genome
- Robot mapping
- Robots to identify meteorites to study
- Modeling fMRI data
- Anomaly detection
- Fault diagnosis
- Modeling sensor network data

Slide 13: A general Bayes net
- Set of random variables
- Directed acyclic graph – encodes independence assumptions
- CPTs
- Joint distribution: P(X1, ..., Xn) = ∏i P(Xi | PaXi)

Slide 14: Another example
- Variables: B – Burglar, E – Earthquake, A – Burglar alarm, N – Neighbor calls, R – Radio report
- Both burglars and earthquakes can set off the alarm
- If the alarm sounds, a neighbor may call
- An earthquake may be announced on the radio

Slide 15: Another example – Building the BN
- B – Burglar, E – Earthquake, A – Burglar alarm, N – Neighbor calls, R – Radio report
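The factored joint distribution of slides 6–13 can be sketched in a few lines of code. The network below is the slides' Flu/Allergy/Sinus/Headache/Nose example (Flu → Sinus ← Allergy, Sinus → Headache, Sinus → Nose), but every probability value is invented purely for illustration:

```python
from itertools import product

# Hypothetical CPTs; the numbers are made up for illustration.
p_flu = {True: 0.1, False: 0.9}
p_allergy = {True: 0.2, False: 0.8}
p_sinus = {  # P(Sinus = true | Flu, Allergy)
    (True, True): 0.9, (True, False): 0.6,
    (False, True): 0.5, (False, False): 0.05,
}
p_headache = {True: 0.8, False: 0.1}  # P(Headache = true | Sinus)
p_nose = {True: 0.7, False: 0.2}      # P(Nose = true | Sinus)

def joint(f, a, s, h, n):
    """Chain rule plus the local Markov assumption:
    P(F, A, S, H, N) = P(F) P(A) P(S | F, A) P(H | S) P(N | S)."""
    ps = p_sinus[(f, a)] if s else 1 - p_sinus[(f, a)]
    ph = p_headache[s] if h else 1 - p_headache[s]
    pn = p_nose[s] if n else 1 - p_nose[s]
    return p_flu[f] * p_allergy[a] * ps * ph * pn

# 1 + 1 + 4 + 2 + 2 = 10 parameters instead of 2^5 - 1 = 31 for the full
# table, and the factored distribution still sums to 1 over all 32 assignments.
total = sum(joint(*bits) for bits in product([True, False], repeat=5))
print(round(total, 10))  # -> 1.0
```

The parameter count is the point of the factorization: each CPT only needs one number per parent configuration, so the savings grow exponentially with the number of variables.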
Radio report

Slide 16: Independencies encoded in BN
- We said: all you need is the local Markov assumption (Xi ⊥ NonDescendantsXi | PaXi)
- But then we talked about other (in)dependencies, e.g., explaining away
- What are the independencies encoded by a BN?
- The only assumption is local Markov
- But many others can be derived using the algebra of conditional independencies!

Slide 17: Understanding independencies in BNs – BNs with 3 nodes
- Local Markov Assumption: a variable X is independent of its non-descendants given its parents
- Indirect causal effect: X → Z → Y
- Indirect evidential effect: X ← Z ← Y
- Common cause: X ← Z → Y
- Common effect: X → Z ← Y

Slide 18: Understanding independencies in BNs – Some examples
- (Figure: example BN over nodes A through K)

Slide 19: An active trail – Example
- (Figure: example BN over nodes A through H, with observed nodes F' and F'')
- When are A and H independent?

Slide 20: Active trails formalized
A path X1 – X2 – ··· – Xk is an active trail when variables O ⊆ {X1, ..., Xn} are observed if, for each consecutive triplet in the trail, one of the following holds:
- Xi-1 → Xi → Xi+1, and Xi is not observed (Xi ∉ O)
- Xi-1 ← Xi ← Xi+1, and Xi is not observed (Xi ∉ O)
- Xi-1 ← Xi → Xi+1, and Xi is not observed (Xi ∉ O)
- Xi-1 → Xi ← Xi+1, and Xi is observed (Xi ∈ O), or one of its descendants is

Slide 21: Active trails and independence?
- Theorem: variables Xi and Xj are independent given Z ⊆ {X1, ..., Xn} if there is no active trail between Xi and Xj when the variables in Z are observed

Slide 22: The BN Representation Theorem
- If the conditional independencies encoded in the BN are a subset of the conditional independencies in P, then we can obtain P as the BN's factored joint probability distribution. Important because: every P has at least one BN structure G
- Conversely, if P factors as the BN's joint probability distribution, then the conditional independencies encoded in the BN are a subset of the conditional independencies in P. Important because: we can read independencies of P from the BN structure G

Slide 23: "Simpler" BNs
- A distribution can be represented by many BNs
- A simpler BN requires fewer parameters

Slide 24: Learning Bayes nets
- Four cases: known vs. unknown structure, fully observable vs. missing data
- From data x(1), ..., x(m), learn the structure and the parameters (the CPTs P(Xi | PaXi))

Slide 25: Learning the CPTs
- From data x(1), ..., x(m), for each discrete variable Xi, estimate P(Xi | PaXi)
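The triplet rules on slide 20 translate almost line-for-line into code. The sketch below enumerates simple undirected paths and checks each triplet; the graph encoding (a set of directed edge pairs) and the node names are illustrative choices, not from the slides' figures:

```python
def descendants(edges, v):
    """All nodes reachable from v along directed edges (u, w)."""
    seen, stack = set(), [v]
    while stack:
        u = stack.pop()
        for a, b in edges:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def _simple_paths(edges, path, goal):
    """Yield all simple undirected paths extending `path` to `goal`."""
    u = path[-1]
    if u == goal:
        yield list(path)
        return
    neighbors = {b for a, b in edges if a == u} | {a for a, b in edges if b == u}
    for v in neighbors - set(path):
        yield from _simple_paths(edges, path + [v], goal)

def _triplet_active(edges, a, b, c, observed):
    """Slide 20's case analysis for the consecutive triplet a - b - c."""
    if (a, b) in edges and (c, b) in edges:
        # Common effect a -> b <- c: active iff b or a descendant is observed.
        return b in observed or bool(observed & descendants(edges, b))
    # Causal chain, evidential chain, common cause: active iff b is unobserved.
    return b not in observed

def has_active_trail(edges, x, y, observed):
    """True iff some path from x to y is active given the observed set."""
    return any(
        all(_triplet_active(edges, p[i - 1], p[i], p[i + 1], observed)
            for i in range(1, len(p) - 1))
        for p in _simple_paths(edges, [x], y)
    )

# The common-effect structure X -> Z <- Y from slide 17 ("explaining away"):
edges = {("X", "Z"), ("Y", "Z")}
print(has_active_trail(edges, "X", "Y", set()))  # -> False: marginally independent
print(has_active_trail(edges, "X", "Y", {"Z"}))  # -> True: observing Z couples X and Y
```

Enumerating all simple paths is exponential in general; it is fine for classroom-sized graphs, but practical implementations use a linear-time reachability variant (the "Bayes ball" algorithm) instead.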
Slide 26: What you need to know
- Bayesian networks
  - A compact representation for large probability distributions
  - Not an algorithm
- Semantics of a BN: conditional independence assumptions
- Representation: variables, graph, CPTs
- Why BNs
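With fully observed data, the CPT-learning step on slide 25 reduces to counting. A minimal sketch for one variable with a single parent, estimating P(Sinus | Flu); the dataset is made up for illustration:

```python
from collections import Counter

# Maximum-likelihood CPT estimation from fully observed data:
#   P_hat(Xi = a | PaXi = b) = Count(Xi = a, PaXi = b) / Count(PaXi = b).
# Hypothetical samples x(1), ..., x(m) of (Flu, Sinus):
data = [
    (True, True), (True, True), (True, False),
    (False, False), (False, False), (False, True), (False, False),
]

def learn_cpt(samples):
    """Estimate P(Sinus | Flu) by counting co-occurrences."""
    joint, parent = Counter(), Counter()
    for flu, sinus in samples:
        joint[(flu, sinus)] += 1
        parent[flu] += 1
    return {(f, s): joint[(f, s)] / parent[f] for (f, s) in joint}

cpt = learn_cpt(data)
print(cpt[(True, True)])  # Count(S=t, F=t) / Count(F=t) = 2/3
```

For a variable with several parents the parent counter is simply keyed on the tuple of parent values; the missing-data and unknown-structure cases on slide 24 require more than counting (e.g., EM and structure search).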