MIT 6.867 - Machine Learning: Lecture 21

Machine learning: lecture 21
Tommi S. Jaakkola, MIT CSAIL, tommi@csail.mit.edu

Outline
- Bayesian networks cont'd: graphs and consistency
- Undirected graphical models (Markov random fields): graphs, independence, consistency; the associated distribution; Bayesian networks as undirected models
- Quantitative probabilistic inference: a medical diagnosis example; basic algorithms and problems

Bayesian networks: review
- Graph: d-separation identifies independence properties; these conditional independence properties provide the basis for qualitative inferences.
- Graph and associated probability distribution: for the example network over the variables N, L, S, T, C (figure omitted), the distribution factors as P(N) P(L) P(S | N, L) P(T | L) P(C | S, T). Any distribution that factors in this manner is consistent with all the independence properties implied by the graph.

Graphs, probabilities, and consistency
Suppose x1, x2, and x3 represent three independent coin tosses, so that the probability distribution can be written as a product P(x1) P(x2) P(x3). This distribution is consistent with all four graphs (1)-(4) over x1, x2, x3 shown on the slide (figure omitted), in the sense that all the independence properties we can infer from the graphs also hold for this distribution. Moreover, graphs (1) and (2) are consistent with any distribution over x1, x2, and x3.

Undirected graphical models
For example, a simple lattice model with binary variables x_i ∈ {-1, 1} ("spins") and pairwise interactions along the edges E:

    P(x_1, ..., x_n) = (1/Z) exp( Σ_{(i,j) ∈ E} J_ij x_i x_j )

where J_ij specifies the interaction strength between the neighboring variables x_i and x_j, and Z normalizes the distribution.

Undirected graphical models: graph semantics
The graph semantics of undirected graphical models comes from simple graph separation. In the 2x2 lattice over x1, ..., x4 (figure omitted): x1 and x4 are independent given x2 and x3; x1 and x4 are not independent given x3 alone.

Graph semantics: comparison
Directed and undirected graphs are complementary.
- The two independence properties of the 2x2 lattice above cannot be captured simultaneously with a Bayesian network.
- Marginal but not conditional independence, as in the v-structure x1 -> x3 <- x2, cannot be captured with an undirected graph.

Undirected graphs: associated distribution
The simple graph-separation properties again impose independence (Markov) properties on the associated distribution.

Theorem (Hammersley-Clifford). Any distribution consistent with an undirected graph has to factor according to the maximal cliques in the graph:

    P(x) = (1/Z) Π_{c ∈ C} ψ_c(x_c)

where C is the set of maximal cliques and x_c denotes the variables in clique c.

Graph transformations
We can transform directed graphical models (Bayesian networks) into undirected graphical models simply via "moralization": connect the parents of each node and drop the edge directions. For the example network with distribution P(x1) P(x2) P(x3 | x1, x2) P(x4 | x2) (figures omitted), only the graph representation changes, not the distribution; the resulting undirected graph will be consistent with the distribution associated with the original directed graph.

Example setting: medical diagnosis
The QMR-DT model (Shwe et al., 1991) is a bipartite network with disease variables d connected to finding variables f (figure omitted):
- Diseases: about 600 binary (0/1) disease variables, representing diseases that are present or absent.
- Findings: about 4000 associated binary (0/1) findings; findings may be either positive or negative.

Assumptions in detail
The model is based on a number of simplifying assumptions:
- Diseases are marginally independent (e.g., d1 = Hodgkin's disease, d2 = plasma cell myeloma).
- Findings are conditionally independent given the diseases (e.g., f1 = bone X-ray fracture).
Assumptions explicit in the graph: the choice of relevant variables, the marginal independence of diseases, and the conditional independence of findings. There are further assumptions about the probability distribution itself: "causal independence".

Assumptions in detail: causal independence (noisy-OR)
We have to specify how n (potentially 100 or more) underlying diseases conspire to influence any one finding; the size of a full conditional probability table for P(f_i | d1, d2, d3, ...) would increase exponentially with the number of associated diseases. Instead, we assume that each finding is negative if all the associated diseases (if present), together with an "other" background cause, independently fail to produce a positive outcome. With

    P(f_i = 1 | other) = q_i0    and    P(f_i = 1 | d_j = 1) = q_ij,

the causal independence (noisy-OR) assumption gives

    P(f_i = 0 | d_pa(i)) = (1 - q_i0) Π_{j ∈ pa(i)} P(f_i = 0 | d_j) = (1 - q_i0) Π_{j ∈ pa(i)} (1 - q_ij)^{d_j}

and P(f_i = 1 | d_pa(i)) = 1 - P(f_i = 0 | d_pa(i)).

Joint distribution
After all these assumptions, we can write down the following joint distribution over the n diseases and m findings:

    P(f, d) = [ Π_{i=1}^{m} P(f_i | d_pa(i)) ] [ Π_{j=1}^{n} P(d_j) ]

where P(f_i = 0 | d_pa(i)) = (1 - q_i0) Π_{j ∈ pa(i)} (1 - q_ij)^{d_j}. The only adjustable parameters in this model are the q_ij and the priors P(d_j).

Three inference problems
Given a set of observed findings f = {f1, ..., fk}, we wish to infer what the underlying diseases are:
1. What are the marginal posterior probabilities over the diseases?
2. What is the most likely setting of all the underlying disease variables?
3. Which test should we carry out next in order to get the most information about the diseases?

Inference problem cont'd
For the purposes of inferring the presence or absence of the underlying diseases, we can ignore any findings that remain unobserved, as if they were not in the model to begin with.

First inference problem: posterior marginals
Given the observations, we already have all the information, but only implicitly. What messages, if any, do the disease variables have to share for them to be able to compute the posterior marginals locally? (Figure omitted: disease priors P(d1), P(d2), P(d3) and observed findings f1, f2 with, e.g., P(f1 | d1, d2).)

Inference: graph transformation
[the preview ends here]
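The lattice (spin) model above can be checked numerically on a tiny graph, where the partition function Z is computable by brute-force enumeration. A minimal sketch, assuming an illustrative 2x2 lattice (a 4-cycle) and uniform couplings J_ij = 0.5, neither of which comes from the lecture:

```python
from itertools import product
import math

# Illustrative 2x2 lattice (a 4-cycle): nodes 0..3, pairwise couplings J_ij.
edges = [(0, 1), (1, 3), (3, 2), (2, 0)]
J = {e: 0.5 for e in edges}  # assumed interaction strengths

def score(x):
    # sum_{(i,j) in E} J_ij * x_i * x_j for a spin configuration x
    return sum(J[e] * x[e[0]] * x[e[1]] for e in edges)

# Partition function Z sums exp(score) over all 2^4 spin configurations.
configs = list(product([-1, 1], repeat=4))
Z = sum(math.exp(score(x)) for x in configs)

def prob(x):
    # P(x) = (1/Z) exp( sum J_ij x_i x_j )
    return math.exp(score(x)) / Z

# Probabilities sum to one; with J_ij > 0, aligned configurations are most likely.
print(sum(prob(x) for x in configs), prob((1, 1, 1, 1)))
```

For positive couplings the two fully aligned configurations (all +1 or all -1) tie for the highest probability, which is the usual "ferromagnetic" behavior of such models.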
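Moralization, as described above, connects the parents of each node and drops the edge directions. A minimal sketch with plain dicts; the example DAG is assumed from the slide's factorization P(x1) P(x2) P(x3|x1,x2) P(x4|x2):

```python
from itertools import combinations

def moralize(parents):
    """Moral graph of a DAG given as {node: set of parents}.
    Returns an undirected edge set (frozensets of endpoints)."""
    edges = set()
    for child, pa in parents.items():
        # drop directions: child -- parent
        for p in pa:
            edges.add(frozenset((child, p)))
        # "marry" the parents: connect every pair of co-parents
        for p, q in combinations(sorted(pa), 2):
            edges.add(frozenset((p, q)))
    return edges

# Network with factorization P(x1) P(x2) P(x3|x1,x2) P(x4|x2)
parents = {"x1": set(), "x2": set(), "x3": {"x1", "x2"}, "x4": {"x2"}}
moral = moralize(parents)
# x1 and x2 get married because they share the child x3
print(frozenset(("x1", "x2")) in moral)  # prints True
```

The added x1-x2 edge is exactly why the moral graph stays consistent with the directed distribution: the clique {x1, x2, x3} can absorb the factor P(x3 | x1, x2).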
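The noisy-OR parameterization above is straightforward to compute directly. A sketch where q0 plays the role of the leak ("other" cause) probability q_i0 and q holds the per-disease activation probabilities q_ij; the numeric values are illustrative assumptions:

```python
def p_finding_negative(d, q0, q):
    """P(f=0 | d) = (1 - q0) * prod_j (1 - q_j)^(d_j)  (noisy-OR)."""
    p = 1.0 - q0
    for dj, qj in zip(d, q):
        p *= (1.0 - qj) ** dj
    return p

def p_finding_positive(d, q0, q):
    return 1.0 - p_finding_negative(d, q0, q)

q0 = 0.01        # leak: P(f=1) when no parent disease is present (assumed)
q = [0.8, 0.3]   # q_ij for the finding's two parent diseases (assumed)

# With no diseases present, only the leak can trigger the finding.
print(p_finding_positive([0, 0], q0, q))
print(p_finding_positive([1, 1], q0, q))
```

Note how the table size stays linear in the number of parent diseases: each disease contributes one multiplicative factor (1 - q_ij)^(d_j), instead of one column in an exponentially large CPT.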
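For a toy version of inference problem 1, the posterior marginals P(d_j = 1 | f) can be computed by brute-force enumeration over disease configurations. That is feasible only for a handful of diseases, which is precisely why QMR-DT needs better algorithms; all priors, parent sets, and q parameters below are illustrative assumptions:

```python
from itertools import product

# Toy QMR-style model: 3 diseases, 2 observed findings (all numbers assumed).
prior = [0.1, 0.05, 0.2]  # P(d_j = 1)
# Each finding: ((parent indices, leak q0, per-parent q_ij), observed value)
findings = [
    (((0, 1), 0.01, (0.8, 0.6)), 1),  # f1 = 1, parents d0, d1
    (((1, 2), 0.02, (0.5, 0.7)), 0),  # f2 = 0, parents d1, d2
]

def p_f0(d, parents, q0, qs):
    # noisy-OR: P(f=0 | d) = (1 - q0) * prod_j (1 - q_ij)^(d_j)
    p = 1.0 - q0
    for j, qj in zip(parents, qs):
        p *= (1.0 - qj) ** d[j]
    return p

def joint(d):
    # P(d) * P(observed findings | d), using the factored model
    p = 1.0
    for j, dj in enumerate(d):
        p *= prior[j] if dj else 1.0 - prior[j]
    for (parents, q0, qs), obs in findings:
        p0 = p_f0(d, parents, q0, qs)
        p *= p0 if obs == 0 else 1.0 - p0
    return p

# Normalize over all 2^3 disease configurations and read off marginals.
configs = list(product([0, 1], repeat=3))
Z = sum(joint(d) for d in configs)
marginals = [sum(joint(d) for d in configs if d[j]) / Z for j in range(3)]
print([round(m, 3) for m in marginals])
```

Observing f1 = 1 should raise the posterior of its parent d0 above its prior, while the negative finding f2 = 0 should lower the posterior of d2; the enumeration makes those qualitative effects visible directly.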

