11BN Semantics 1Graphical Models – 10708Carlos GuestrinCarnegie Mellon UniversitySeptember 15th, 2006Readings:K&F: 3.1, 3.2, 3.310-708 –Carlos Guestrin 20062Let’s start on BNs… Consider P(Xi) Assign probability to each xi∈ Val(Xi) Independent parameters Consider P(X1,…,Xn) How many independent parameters if |Val(Xi)|=k?210-708 –Carlos Guestrin 20063What if variables are independent? What if variables are independent? (Xi⊥ Xj), ∀ i,j Not enough!!! (See homework 1 ☺) Must assume that (X ⊥ Y), ∀ X,Y subsets of {X1,…,Xn} Can write P(X1,…,Xn) = ∏i=1…nP(Xi) How many independent parameters now?10-708 –Carlos Guestrin 20064Conditional parameterization –two nodes Grade is determined by Intelligence310-708 –Carlos Guestrin 20065Conditional parameterization –three nodes Grade and SAT score are determined by Intelligence (G ⊥ S | I)10-708 –Carlos Guestrin 20066The naïve Bayes model –Your first real Bayes Net Class variable: C Evidence variables: X1,…,Xn assume that (X ⊥ Y | C), ∀ X,Y subsets of {X1,…,Xn}410-708 –Carlos Guestrin 20067What you need to know (From last class) Basic definitions of probabilities Independence Conditional independence The chain rule Bayes rule Naïve Bayes10-708 –Carlos Guestrin 20068Announcements Homework 1: Out yesterday Due September 27th – beginning of class! It’s hard – start early, ask questions Collaboration policy OK to discuss in groups Tell us on your paper who you talked with Each person must write their own unique paper No searching the web, papers, etc. for answers, we trust you want to learn Upcoming recitation Monday 5:30-7pm in Wean 4615A – Matlab Tutorial Don’t forget to register to the mailing list at: https://mailman.srv.cs.cmu.edu/mailman/listinfo/10708-announce510-708 –Carlos Guestrin 20069This class We’ve heard of Bayes nets, we’ve played with Bayes nets, we’ve even used them in your research This class, we’ll learn the semantics of BNs, relate them to independence assumptions encoded by the graph10-708 –Carlos Guestrin 200610Causal structure Suppose we know the following: The flu causes sinus inflammation Allergies cause sinus inflammation Sinus inflammation causes a runny nose Sinus inflammation causes headaches How are these connected?610-708 –Carlos Guestrin 200611Possible queriesFluAllergySinusHeadacheNose Inference Most probable explanation Active data collection10-708 –Carlos Guestrin 200612Car starts BN 18 binary attributes Inference P(BatteryAge|Starts=f) 218terms, why so fast? Not impressed? HailFinder BN – more than 354= 58149737003040059690390169 terms710-708 –Carlos Guestrin 200613Factored joint distribution -PreviewFluAllergySinusHeadacheNose10-708 –Carlos Guestrin 200614Number of parametersFluAllergySinusHeadacheNose810-708 –Carlos Guestrin 200615Key: Independence assumptionsFluAllergySinusHeadacheNoseKnowing sinus separates the variables from each other10-708 –Carlos Guestrin 200616(Marginal) Independence Flu and Allergy are (marginally) independent More Generally:Allergy = fAllergy = tFlu = fFlu = tAllergy = fAllergy = tFlu = fFlu = t910-708 –Carlos Guestrin 200617Conditional independence Flu and Headache are not (marginally) independent Flu and Headache are independent given Sinus infection More Generally:10-708 –Carlos Guestrin 200618The independence assumption FluAllergySinusHeadacheNoseLocal Markov Assumption:A variable X is independentof its non-descendants given its parents (Xi⊥⊥⊥⊥ NonDescendantsXi| PaXi)1010-708 –Carlos Guestrin 200619Explaining awayFluAllergySinusHeadacheNoseLocal Markov Assumption:A variable X is independentof its non-descendants given its parents (Xi ⊥⊥⊥⊥ NonDescendantsXi| PaXi)10-708 –Carlos Guestrin 200620What about probabilities?Conditional probability tables (CPTs)FluAllergySinusHeadacheNose1110-708 –Carlos Guestrin 200621Joint distributionFluAllergySinusHeadacheNoseWhy can we decompose? Markov Assumption!10-708 –Carlos Guestrin 200622A general Bayes net Set of random variables Directed acyclic graph CPTs Joint distribution: Local Markov Assumption: A variable X is independent of its non-descendants given its parents – (Xi ⊥⊥⊥⊥ NonDescendantsXi | PaXi)1210-708 –Carlos Guestrin 200623Questions???? What distributions can be represented by a BN? What BNs can represent a distribution? What are the independence assumptions encoded in a BN? in addition to the local Markov assumption10-708 –Carlos Guestrin 200624Today: The Representation Theorem –Joint Distribution to BNJoint probabilitydistribution:ObtainBN:Encodes independenceassumptionsIf conditionalindependenciesin BN are subset of conditional independencies in P1310-708 –Carlos Guestrin 200625Today: The Representation Theorem –BN to Joint DistributionIf joint probabilitydistribution:BN:Encodes independenceassumptionsObtainThen conditionalindependenciesin BN are subset of conditional independencies in P10-708 –Carlos Guestrin 200626Let’s start proving it for naïve Bayes –From joint distribution to BN Independence assumptions: Xiindependent given C Let’s assume that P satisfies independencies must prove that P factorizes according to BN: P(C,X1,…,Xn) = P(C) ∏iP(Xi|C) Use chain rule!1410-708 –Carlos Guestrin 200627Let’s start proving it for naïve Bayes –From BN to joint distribution 1 Let’s assume that P factorizes according to the BN: P(C,X1,…,Xn) = P(C) ∏iP(Xi|C) Prove the independence assumptions: Xiindependent given C Actually, (X ⊥ Y | C), ∀ X,Y subsets of {X1,…,Xn}10-708 –Carlos Guestrin 200628Let’s start proving it for naïve Bayes –From BN to joint distribution 2 Let’s consider a simpler case Grade and SAT score are determined by Intelligence P(I,G,S) = P(I)P(G|I)P(S|I) Prove that P(G,S|I) = P(G|I) P(S|I)1510-708 –Carlos Guestrin 200629Today: The Representation TheoremBN:Encodes independenceassumptionsJoint probabilitydistribution:ObtainIf conditionalindependenciesin BN are subset of conditional independencies in PIf joint probabilitydistribution:ObtainThen conditionalindependenciesin BN are subset of conditional independencies in
View Full Document