DOC PREVIEW
CMU CS 10701 - An introduction to graphical models

This preview shows page 1-2-3-4-5-6 out of 19 pages.

Save
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

An introduction to graphical models Kevin P Murphy 10 May 2001 1 Introduction The following quotation from the Preface of Jor99 provides a very concise introduction to graphical models Graphical models are a marriage between probability theory and graph theory They provide a natural tool for dealing with two problems that occur throughout applied mathematics and engineering uncertainty and complexity and in particular they are playing an increasingly important role in the design and analysis of machine learning algorithms Fundamental to the idea of a graphical model is the notion of modularity a complex system is built by combining simpler parts Probability theory provides the glue whereby the parts are combined ensuring that the system as a whole is consistent and providing ways to interface models to data The graph theoretic side of graphical models provides both an intuitively appealing interface by which humans can model highly interacting sets of variables as well as a data structure that lends itself naturally to the design of efficient general purpose algorithms Many of the classical multivariate probabalistic systems studied in fields such as statistics systems engineering information theory pattern recognition and statistical mechanics are special cases of the general graphical model formalism examples include mixture models factor analysis hidden Markov models Kalman filters and Ising models The graphical model framework provides a way to view all of these systems as instances of a common underlying formalism This view has many advantages in particular specialized techniques that have been developed in one field can be transferred between research communities and exploited more widely Moreover the graphical model formalism provides a natural framework for the design of new systems In this paper we will flesh out this remark by discussing the following topics Representation how can a graphical model compactly represent a joint probability distribution Inference how can we efficiently infer the hidden states of a system given partial and possibly noisy observations Learning how do we estimate the parameters and structure of the model Decision theory what happens when it is time to convert beliefs into actions Applications what has this machinery been used for 1 P C F P C T 0 5 0 5 Cloudy C P S F P S T F 0 5 T 0 9 Sprinkler Rain 0 5 C F 0 1 WetGrass T P R F P R T 0 8 0 2 0 2 0 8 S R P W F P W T F F 1 0 0 0 T F 0 1 0 9 F T 0 1 0 9 T T 0 01 0 99 Figure 1 A simple Bayesian network adapted from RN95 2 Representation Probabilistic graphical models are graphs in which nodes represent random variables and the lack of arcs represent conditional independence assumptions Hence they provide a compact representation of joint probability distributions as we will see below For example if we have N binary random variables an atomic representation of the joint P X1 XN needs O 2N parameters whereas a graphical model may need exponentially fewer depending on which conditional assumptions we make This can help both inference and learning as we explain below There are two main kinds of graphical models undirected and directed Undirected graphical models also known as Markov networks or Markov random fields MRFs are more popular with the physics and vision communities Log linear models are a special case of undirected graphical models and are popular in statistics Directed graphical models also known as Bayesian networks BNs belief networks generative models causal models etc are more popular with the AI and machine learning communities Note that despite the name Bayesian networks do not necessarily imply a commitment to Bayesian methods rather they are so called because they use Bayes rule for inference as we will see below It is also possible to have a model with both directed and undirected arcs which is called a chain graph 2 1 Directed graphical models In a directed graphical model i e a Bayesian network an arc from A to B can be informally interpreted as indicating that A causes B 1 Hence directed cycles are disallowed Consider the example in Figure 1 Here nodes represent binary random variables We see that the event grass is wet W true has two possible causes either the water sprinker is on S true or it is raining R true The strength of this relationship is shown in the table below W this is called W s conditional 1 See the following books for a more detailed discussion of the use of Bayes nets for causal modelling Pea00 SGS00 GC99 Shi00 2 X Y X1 Y1 Y2 a Xn Ym b Figure 2 Factor analysis represented as a Bayesian network probability table CPT For example we see that P W true S true R f alse 0 9 second entry of second row and hence P W f alse S true R f alse 1 0 9 0 1 since each row must sum to one Since the C node has no parents its CPT specifies the prior probability that it is cloudy in this case 0 5 The simplest statement of the conditional independence relationships encoded in a Bayesian network can be stated as follows a node is independent of its ancestors given its parents where the ancestor parent relationship is with respect to some fixed topological ordering of the nodes Let us see how we can use this fact to specify the joint distribution more compactly By the chain rule of probability the joint probability of all the nodes in Figure 1 is P C S R W P C P S C P R C S P W C S R By using conditional independence relationships we can rewrite this as P C S R W P C P S C P R C P W S R where we were allowed to simplify the third term because R is independent of S given its parent C written R S C and the last term because W C S R We can see that the conditional independence relationships allow us to represent the joint more compactly Here the savings are minimal but in general if we had n binary nodes the full joint would require O 2 n parameters but the factored form would only require O n2k parameters where k is the maximum fan in of a node We will now show how many popular statistical models can be represented as directed graphical models 2 1 1 PCA ICA and all that In the above example the nodes represented discrete random variables It is also possible that they represent continuous random variables In this case we must specify the conditional probability distribution CPD of a node given its parents in parametric form A common example is a Gaussian Consider the example in Figure 2 a this represents the model P X Y P X P Y X Y is observed hence it is shown shaded but X is hidden


View Full Document

CMU CS 10701 - An introduction to graphical models

Documents in this Course
lecture

lecture

12 pages

lecture

lecture

17 pages

HMMs

HMMs

40 pages

lecture

lecture

15 pages

lecture

lecture

20 pages

Notes

Notes

10 pages

Notes

Notes

15 pages

Lecture

Lecture

22 pages

Lecture

Lecture

13 pages

Lecture

Lecture

24 pages

Lecture9

Lecture9

38 pages

lecture

lecture

26 pages

lecture

lecture

13 pages

Lecture

Lecture

5 pages

lecture

lecture

18 pages

lecture

lecture

22 pages

Boosting

Boosting

11 pages

lecture

lecture

16 pages

lecture

lecture

20 pages

Lecture

Lecture

20 pages

Lecture

Lecture

39 pages

Lecture

Lecture

14 pages

Lecture

Lecture

18 pages

Lecture

Lecture

13 pages

Exam

Exam

10 pages

Lecture

Lecture

27 pages

Lecture

Lecture

15 pages

Lecture

Lecture

24 pages

Lecture

Lecture

16 pages

Lecture

Lecture

23 pages

Lecture6

Lecture6

28 pages

Notes

Notes

34 pages

lecture

lecture

15 pages

Midterm

Midterm

11 pages

lecture

lecture

11 pages

lecture

lecture

23 pages

Boosting

Boosting

35 pages

Lecture

Lecture

49 pages

Lecture

Lecture

22 pages

Lecture

Lecture

16 pages

Lecture

Lecture

18 pages

Lecture

Lecture

35 pages

lecture

lecture

22 pages

lecture

lecture

24 pages

Midterm

Midterm

17 pages

exam

exam

15 pages

Lecture12

Lecture12

32 pages

lecture

lecture

19 pages

Lecture

Lecture

32 pages

boosting

boosting

11 pages

pca-mdps

pca-mdps

56 pages

bns

bns

45 pages

mdps

mdps

42 pages

svms

svms

10 pages

Notes

Notes

12 pages

lecture

lecture

42 pages

lecture

lecture

29 pages

lecture

lecture

15 pages

Lecture

Lecture

12 pages

Lecture

Lecture

24 pages

Lecture

Lecture

22 pages

Midterm

Midterm

5 pages

mdps-rl

mdps-rl

26 pages

Load more
Download An introduction to graphical models
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view An introduction to graphical models and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view An introduction to graphical models and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?