BN Semantics 1
Let’s start on BNs…
What if variables are independent?
Conditional parameterization – two nodes
Conditional parameterization – three nodes
The naïve Bayes model – Your first real Bayes Net
What you need to know (From last class)
Announcements
This class
Causal structure
Possible queries
Car starts BN
Factored joint distribution - Preview
Number of parameters
Key: Independence assumptions
(Marginal) Independence
Conditional independence
The independence assumption
Explaining away
What about probabilities? Conditional probability tables (CPTs)
Joint distribution
A general Bayes net
Questions????
Today: The Representation Theorem – Joint Distribution to BN
Today: The Representation Theorem – BN to Joint Distribution
Let’s start proving it for naïve Bayes – From joint distribution to BN
Let’s start proving it for naïve Bayes – From BN to joint distribution 1
Let’s start proving it for naïve Bayes – From BN to joint distribution 2
Today: The Representation Theorem
Local Markov assumption & I-maps
Factorized distributions
BN Representation Theorem – I-map to factorization
BN Representation Theorem – I-map to factorization: Proof
Defining a BN
BN Representation Theorem – Factorization to I-map
BN Representation Theorem – Factorization to I-map: Proof
The BN Representation Theorem
Independencies encoded in BN
Understanding independencies in BNs – BNs with 3 nodes
Understanding independencies in BNs – Some examples
Understanding independencies in BNs – Some more examples
An active trail – Example
Active trails formalized
Active trails and independence?
More generally: Soundness of d-separation
Adding edges doesn’t hurt
Existence of dependency when not d-separated
More generally: Completeness of d-separation
Interpretation of completeness
What you need to know
Acknowledgements

BN Semantics 1
Graphical Models – 10708
Carlos Guestrin
Carnegie Mellon University
September 15th, 2006
Readings: K&F: 3.1, 3.2, 3.3

10-708 – Carlos Guestrin 2006

Let’s start on BNs…
  Consider $P(X_i)$
    Assign probability to each $x_i \in \mathrm{Val}(X_i)$
    Independent parameters
  Consider $P(X_1, \ldots, X_n)$
    How many independent parameters if $|\mathrm{Val}(X_i)| = k$?

What if variables are independent?
  $(X_i \perp X_j), \ \forall i, j$
  Not enough!!! (See homework 1)
  Must assume that $(X \perp Y), \ \forall X, Y$ subsets of $\{X_1, \ldots, X_n\}$
  Can write $P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i)$
  How many independent parameters now?

Conditional parameterization – two nodes
  Grade is determined by Intelligence

Conditional parameterization – three nodes
  Grade and SAT score are determined by Intelligence
  $(G \perp S \mid I)$

The naïve Bayes model – Your first real Bayes Net
  Class variable: $C$
  Evidence variables: $X_1, \ldots, X_n$
  Assume that $(X \perp Y \mid C), \ \forall X, Y$ subsets of $\{X_1, \ldots, X_n\}$

What you need to know (From last class)
  Basic definitions of probabilities
  Independence
  Conditional independence
  The chain rule
  Bayes rule
  Naïve Bayes

Announcements
  Homework 1:
    Out yesterday
    Due September 27th – beginning of class!
    It’s hard – start early, ask questions
  Collaboration policy
    OK to discuss in groups
    Tell us on your paper who you talked with
    Each person must write their own unique paper
    No searching the web, papers, etc. for answers, we trust you want to learn
  Upcoming recitation
    Monday 5:30-7pm in Wean 4615A – Matlab Tutorial
  Don’t forget to register to the mailing list at:
    https://mailman.srv.cs.cmu.edu/mailman/listinfo/10708-announce

This class
  We’ve heard of Bayes nets, we’ve played with Bayes nets, we’ve even used them in your research
  This class, we’ll learn the semantics of BNs, relate them to independence assumptions encoded by the graph

Causal structure
  Suppose we know the following:
    The flu causes sinus inflammation
    Allergies cause sinus inflammation
    Sinus inflammation causes a runny nose
    Sinus inflammation causes headaches
  How are these connected?

Possible queries
  [Figure: Bayes net over Flu, Allergy, Sinus, Headache, Nose]
  Inference
  Most probable explanation
  Active data collection

Car starts BN
  18 binary attributes
  Inference: $P(\text{BatteryAge} \mid \text{Starts} = f)$
  $2^{18}$ terms, why so fast?
  Not impressed?
    HailFinder BN – more than $3^{54} = 58149737003040059690390169$ terms

Factored joint distribution - Preview
  [Figure: Bayes net over Flu, Allergy, Sinus, Headache, Nose]

Number of parameters
  [Figure: Bayes net over Flu, Allergy, Sinus, Headache, Nose]

Key: Independence assumptions
  [Figure: Bayes net over Flu, Allergy, Sinus, Headache, Nose]
  Knowing sinus separates the variables from each other

(Marginal) Independence
  Flu and Allergy are (marginally) independent
  More Generally:
  [Tables with entries for Flu = t, Flu = f and Allergy = t, Allergy = f; values not preserved]

Conditional independence
  Flu and Headache are not (marginally) independent
  Flu and Headache are independent given Sinus infection
  More Generally:

The independence assumption
  [Figure: Bayes net over Flu, Allergy, Sinus, Headache, Nose]
  Local Markov Assumption: A variable X is independent of its non-descendants given its parents
  $(X_i \perp \mathrm{NonDescendants}_{X_i} \mid \mathrm{Pa}_{X_i})$

Explaining away
  [Figure: Bayes net over Flu, Allergy, Sinus, Headache, Nose]
  Local Markov Assumption: A variable X is independent of its non-descendants given its parents
  $(X_i \perp \mathrm{NonDescendants}_{X_i} \mid \mathrm{Pa}_{X_i})$

What about probabilities? Conditional probability tables (CPTs)
  [Figure: Bayes net over Flu, Allergy, Sinus, Headache, Nose]

Joint distribution
  [Figure: Bayes net over Flu, Allergy, Sinus, Headache, Nose]
  Why can we decompose? Markov Assumption!

A general Bayes net
  Set of random variables
  Directed acyclic graph
  CPTs
  Joint distribution:
  Local Markov Assumption: A variable X is independent of its non-descendants given its parents
  $(X_i \perp \mathrm{NonDescendants}_{X_i} \mid \mathrm{Pa}_{X_i})$

Questions????
  What distributions can be represented by a BN?
  What BNs can represent a distribution?
  What are the independence assumptions encoded in a BN?
    in addition to the local Markov
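
The ideas in the slides above (CPTs, the factored joint distribution, the local Markov assumption, and explaining away) can be tried out on the Flu, Allergy, Sinus, Headache, Nose network with a small brute-force sketch. It assumes the edges listed on the "Causal structure" slide (Flu and Allergy are parents of Sinus; Sinus is the parent of Headache and Nose) and the standard factorization in which the joint is the product of each variable's CPT given its parents. All CPT numbers and all names (`P_FLU`, `joint`, `prob`, `cond`, and so on) are hypothetical, chosen only for illustration, and do not come from the slides.

```python
from itertools import product

# Hypothetical CPT values, made up for illustration (not from the slides).
P_FLU = {True: 0.1, False: 0.9}            # P(Flu)
P_ALLERGY = {True: 0.2, False: 0.8}        # P(Allergy)
P_SINUS_T = {                              # P(Sinus = t | Flu, Allergy)
    (True, True): 0.9, (True, False): 0.7,
    (False, True): 0.6, (False, False): 0.05,
}
P_HEADACHE_T = {True: 0.8, False: 0.1}     # P(Headache = t | Sinus)
P_NOSE_T = {True: 0.7, False: 0.05}        # P(Nose = t | Sinus)

VARS = ("F", "A", "S", "H", "N")


def bern(p_true, value):
    """Probability of a binary value given P(value = True)."""
    return p_true if value else 1.0 - p_true


def joint(f, a, s, h, n):
    """Factored joint: P(F) P(A) P(S | F, A) P(H | S) P(N | S)."""
    return (P_FLU[f] * P_ALLERGY[a]
            * bern(P_SINUS_T[(f, a)], s)
            * bern(P_HEADACHE_T[s], h)
            * bern(P_NOSE_T[s], n))


def prob(**fixed):
    """Probability of a partial assignment, summing the joint over the rest."""
    total = 0.0
    for values in product([True, False], repeat=len(VARS)):
        assignment = dict(zip(VARS, values))
        if all(assignment[name] == val for name, val in fixed.items()):
            total += joint(*values)
    return total


def cond(query, given):
    """Conditional probability P(query | given) by brute-force enumeration."""
    return prob(**query, **given) / prob(**given)


# The factored joint is a proper distribution.
total = prob()

# Local Markov assumption: Headache is independent of Flu given Sinus.
h_given_s_flu = cond({"H": True}, {"S": True, "F": True})
h_given_s_noflu = cond({"H": True}, {"S": True, "F": False})

# Explaining away: learning Allergy = t lowers the belief in Flu given Sinus = t.
flu_given_s = cond({"F": True}, {"S": True})
flu_given_s_allergy = cond({"F": True}, {"S": True, "A": True})

# Independent parameters: full joint table vs. the CPTs of this network.
full_joint_params = 2 ** len(VARS) - 1     # one free number per row, minus one
bn_params = 1 + 1 + 4 + 2 + 2              # Flu, Allergy, Sinus, Headache, Nose

print(total, h_given_s_flu, h_given_s_noflu)
print(flu_given_s, flu_given_s_allergy)
print(full_joint_params, bn_params)
```

Enumerating all 2^5 assignments is only practical for a toy network like this one; it is used here because it makes the definitions explicit, not because it is how inference would be done on something the size of the car-starts network.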