DOC PREVIEW
CALTECH CS 155 - Probabilistic Graphical Models

This preview shows page 1-2-16-17-18-34-35 out of 35 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 35 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 35 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 35 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 35 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 35 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 35 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 35 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 35 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

ProbabilisticGraphical ModelsLecture 3 – Bayesian Networks SemanticsCS/CNS/EE 155Andreas Krause2Bayesian networksCompact representation of distributions over large number of variables(Often) allows efficient exact inference (computing marginals, etc.)HailFinder56 vars~ 3 states each~1026terms> 10.000 yearson Top supercomputersJavaBayes applet3Causal parametrizationGraph with directed edges from (immediate) causes to (immediate) effects Earthquake BurglaryAlarmJohnCallsMaryCalls4Bayesian networksA Bayesian network structure is a directed, acyclic graph G, where each vertex s of G is interpreted as a random variable Xs (with unspecified distribution)A Bayesian network (G,P) consists of A BN structure G and ....a set of conditional probability distributions (CPDs)P(Xs| PaXs), where PaXsare the parents of node Xssuch that(G,P) defines joint distribution5Representing the world using BNsWant to make sure that I(P) ⊆ I(P’)Need to understand CI properties of BN (G,P) s1s2s3s4s5s7s6s11s12s9s10s8s1s3s12s9True distribution P’with cond. ind. I(P’)Bayes net (G,P)with I(P)represent6Local Markov AssumptionEach BN Structure G is associated with the following conditional independence assumptionsX ⊥ NonDescendentsX| PaXWe write Iloc(G) for these conditional independencesSuppose (G,P) is a Bayesian network representing PDoes it hold that Iloc(G) ⊆ I(P)?If this holds, we say G is an I-map for P.7Factorization Theorems1s2s3s4s5s7s6s11s12s9s10s8s1s3s12s9Iloc(G) ⊆ I(P)True distribution Pcan be represented exactly asBayesian network (G,P)G is an I-map of P(independence map)8Additional conditional independenciesBN specifies joint distribution through conditional parameterization that satisfies Local Markov PropertyIloc(G) = {(Xi⊥ NondescendantsXi| PaXi)}But we also talked about additional properties of CIWeak Union, Intersection, Contraction, …Which additional CI does a particular BN specify?All CI that can be derived through algebraic operations proving CI is very cumbersome!!Is there an easy way to find all independences of a BN just by looking at its graph??9BNs with 3 nodesX Y ZX Y ZXYZXYZLocal Markov Property:X ⊥ NonDesc(X) | Pa(X)10V-structuresKnow E ⊥ BSuppose we know A. Does E ⊥ B | A hold?Earthquake BurglaryAlarm11BNs with 3 nodesX Y ZX Y ZXYZXYZLocal Markov Property:X ⊥ NonDesc(X) | Pa(X)Indirect causal effectIndirect evidential effectCommon causeCommon effectX ⊥ Z | Y¬ (X ⊥ Z)X ⊥ Z¬ (X ⊥ Z | Y)12ExamplesABCDEFGIHJ13More examplesABCDEFGIHJ14Active trailsWhen are A and I independent?ABCDGHEFI15Active trailsAn undirected path in BN structure G is called active trail for observed variables O ⊆ {X1,…,Xn}, if for every consecutive triple of vars X,Y,Z on the pathX  Y  Z and Y is unobserved (Y ∉O)X  Y  Z and Y is unobserved (Y ∉O)X  Y  Z and Y is unobserved (Y ∉O)X  Y  Z and Y or any of Y’s descendants is observedAny variables Xiand Xjfor which ∄ active trail for observations O are called d-separated by OWe write d-sep(Xi;Xj| O)Sets A and B are d-separated given O if d-sep(X,Y |O) for all X∈A, Y∈B. Write d-sep(A; B | O)16d-separation and independenceProof uses algebraic properties of conditional independenceABCDEFGIHITheorem:d-sep(X;Y | Z)  X ⊥ Y | Z i.e., X cond. ind. Y given Zif there does not existany active trailbetween X and Yfor observations Z17Soundness of d-separationHave seen: P factorizes according to G  Iloc(G)⊆ I(P)Define I(G) = {(X ⊥ Y | Z): d-sepG(X;Y |Z)}Theorem: Soundness of d-separationP factorizes over G  I(G) ⊆ I(P)Hence, d-separation captures only true independencesHow about I(G) = I(P)?18Does the converse hold?Suppose P factorizes over G.Does it hold that I(P) ⊆ I(G)?19Existence of dependences for non-d-separated variablesTheorem: If X and Y are not d-separated given Z, then there exists some distribution P factorizing over G in which X and Y are dependent given ZProof sketch:20Completeness of d-separationTheorem: For “almost all” distributions P that factorize over G it holds that I(G) = I(P)“almost all”: except for a set of distributions with measure 0, assuming only that no finite set of distributions has measure > 021Algorithm for d-separationHow can we check if X ⊥ Y | Z?Idea: Check every possible path connecting X and Y and verify conditionsExponentially many paths!!! Linear time algorithm:Find all nodes reachable from X1. Mark Z and its ancestors2. Do breadth-first search startingfrom X; stop if path is blockedHave to be careful with implementation details (see reading)ABCDEFGIHI22Representing the world using BNsWant to make sure that I(P) ⊆ I(P’)Ideally: I(P) = I(P’)Want BN that exactly captures independencies in P’!s1s2s3s4s5s7s6s11s12s9s10s8s1s3s12s9True distribution P’with cond. ind. I(P’)Bayes net (G,P)with I(P)represent23Minimal I-mapsLemma: Suppose G’ is derived from G by adding edgesThen I(G’) ⊆ I(G) Proof:Thus, want to find graph G with I(G) ⊆ I(P) such that when we remove any single edge, for the resulting graph G’ it holds that I(G’)  I(P)Such a graph G is called minimal I-map24Existence of Minimal I-MapsDoes every distribution have a minimal I-Map?25Algorithm for finding minimal I-mapGiven random variables and known conditional independencesPick ordering X1,…,Xnof the variablesFor each XiFind minimal subset A ⊆{X1,…,Xi-1} such that P(Xi| X1,…,Xi-1) = P(Xi| A)Specify / learn CPD P(Xi| A)Will produce minimal I-map!26Uniqueness of Minimal I-mapsIs the minimal I-Map unique?E BAJ ME BAJ MJ MAE B27Perfect mapsMinimal I-maps are easy to find, but can contain many unnecessary dependencies.A BN structure G is called P-map (perfect map) for distribution P if I(G) = I(P)Does every distribution P have a P-map?28Existence of perfect maps29Existence of perfect maps30Uniqueness of perfect maps31I-EquivalenceTwo graphs G, G’ are called I-equivalent if I(G) = I(G’)I-equivalence partitions graphs into equivalence classes32Skeletons of BNsI-equivalent BNs must have same skeletonABCDEFGIHJABCDEFGIHJ33Importance of V-structuresTheorem: If G, G’ have same skeleton and same V-structure, then I(G) = I(G’)Does the converse hold?34Immoralities and I-equivalenceA V-structure X  Y  Z is called immoral if there is no edge between X and Z (“unmarried parents”)Theorem: I(G) = I(G’)  G and G’ have the same skeleton and the same immoralities.35TasksSubscribe to Mailing


View Full Document

CALTECH CS 155 - Probabilistic Graphical Models

Download Probabilistic Graphical Models
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Probabilistic Graphical Models and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Probabilistic Graphical Models 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?