MIT 6.867 - Machine Learning: Lecture 21

Machine learning: lecture 21
Tommi S. Jaakkola, MIT CSAIL, tommi@csail.mit.edu

Outline
- Bayesian networks cont'd: graphs and consistency
- Undirected graphical models (Markov random fields): graphs, independence, consistency; the associated distribution; Bayesian networks as undirected models
- Quantitative probabilistic inference: a medical diagnosis example; basic algorithms and problems

Bayesian networks: review
- Graph: d-separation identifies independence properties; these conditional independence properties provide the basis for qualitative inferences.
- Graph and associated probability distribution: for the example network over the variables N, L, S, T, C (figure omitted), the distribution factors as P(N) P(L) P(S | N, L) P(T | L) P(C | S, T). Any distribution that factors in this manner is consistent with all the independence properties implied by the graph.

Graphs, probabilities, and consistency
Suppose x1, x2, and x3 represent three independent coin tosses, so that the probability distribution can be written as a product P(x1) P(x2) P(x3). This distribution is consistent with all four graphs (1)-(4) over x1, x2, x3 shown on the slide (figure omitted), in the sense that all the independence properties we can infer from the graphs also hold for this distribution. Moreover, graphs (1) and (2) are consistent with any distribution over x1, x2, and x3.

Undirected graphical models
For example, a simple lattice model with binary variables x_i ∈ {-1, 1} ("spins") and pairwise interactions along the edges E:

    P(x_1, ..., x_n) = (1/Z) exp( Σ_{(i,j) ∈ E} J_ij x_i x_j )

where J_ij specifies the interaction strength between the neighboring variables x_i and x_j, and Z normalizes the distribution.

Undirected graphical models: graph semantics
The graph semantics of undirected graphical models comes from simple graph separation. In the 2x2 lattice over x1, ..., x4 (figure omitted): x1 and x4 are independent given x2 and x3; x1 and x4 are not independent given x3 alone.

Graph semantics: comparison
Directed and undirected graphs are complementary.
- The two independence properties of the 2x2 lattice above cannot be captured simultaneously with a Bayesian network.
- Marginal but not conditional independence, as in the v-structure x1 -> x3 <- x2, cannot be captured with an undirected graph.

Undirected graphs: associated distribution
The simple graph-separation properties again impose independence (Markov) properties on the associated distribution.

Theorem (Hammersley-Clifford). Any distribution consistent with an undirected graph has to factor according to the maximal cliques in the graph:

    P(x) = (1/Z) Π_{c ∈ C} ψ_c(x_c)

where C is the set of maximal cliques and x_c denotes the variables in clique c.

Graph transformations
We can transform directed graphical models (Bayesian networks) into undirected graphical models simply via "moralization": connect the parents of each node and drop the edge directions. For the example network with distribution P(x1) P(x2) P(x3 | x1, x2) P(x4 | x2) (figures omitted), only the graph representation changes, not the distribution; the resulting undirected graph will be consistent with the distribution associated with the original directed graph.

Example setting: medical diagnosis
The QMR-DT model (Shwe et al., 1991) is a bipartite network with disease variables d connected to finding variables f (figure omitted):
- Diseases: about 600 binary (0/1) disease variables, representing diseases that are present or absent.
- Findings: about 4000 associated binary (0/1) findings; findings may be either positive or negative.

Assumptions in detail
The model is based on a number of simplifying assumptions:
- Diseases are marginally independent (e.g., d1 = Hodgkin's disease, d2 = plasma cell myeloma).
- Findings are conditionally independent given the diseases (e.g., f1 = bone X-ray fracture).
Assumptions explicit in the graph: the choice of relevant variables, the marginal independence of diseases, and the conditional independence of findings. There are further assumptions about the probability distribution itself: "causal independence".

Assumptions in detail: causal independence (noisy-OR)
We have to specify how n (potentially 100 or more) underlying diseases conspire to influence any one finding; the size of a full conditional probability table for P(f_i | d1, d2, d3, ...) would increase exponentially with the number of associated diseases. Instead, we assume that each finding is negative if all the associated diseases (if present), together with an "other" background cause, independently fail to produce a positive outcome. With

    P(f_i = 1 | other) = q_i0    and    P(f_i = 1 | d_j = 1) = q_ij,

the causal independence (noisy-OR) assumption gives

    P(f_i = 0 | d_pa(i)) = (1 - q_i0) Π_{j ∈ pa(i)} P(f_i = 0 | d_j) = (1 - q_i0) Π_{j ∈ pa(i)} (1 - q_ij)^{d_j}

and P(f_i = 1 | d_pa(i)) = 1 - P(f_i = 0 | d_pa(i)).

Joint distribution
After all these assumptions, we can write down the following joint distribution over the n diseases and m findings:

    P(f, d) = [ Π_{i=1}^{m} P(f_i | d_pa(i)) ] [ Π_{j=1}^{n} P(d_j) ]

where P(f_i = 0 | d_pa(i)) = (1 - q_i0) Π_{j ∈ pa(i)} (1 - q_ij)^{d_j}. The only adjustable parameters in this model are the q_ij and the priors P(d_j).

Three inference problems
Given a set of observed findings f = {f1, ..., fk}, we wish to infer what the underlying diseases are:
1. What are the marginal posterior probabilities over the diseases?
2. What is the most likely setting of all the underlying disease variables?
3. Which test should we carry out next in order to get the most information about the diseases?

Inference problem cont'd
For the purposes of inferring the presence or absence of the underlying diseases, we can ignore any findings that remain unobserved, as if they were not in the model to begin with.

First inference problem: posterior marginals
Given the observations, we already have all the information, but only implicitly. What messages, if any, do the disease variables have to share for them to be able to compute the posterior marginals locally? (Figure omitted: disease priors P(d1), P(d2), P(d3) and observed findings f1, f2 with, e.g., P(f1 | d1, d2).)

Inference: graph transformation
[the preview ends here]
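The lattice (spin) model above can be checked numerically on a tiny graph, where the partition function Z is computable by brute-force enumeration. A minimal sketch, assuming an illustrative 2x2 lattice (a 4-cycle) and uniform couplings J_ij = 0.5, neither of which comes from the lecture:

```python
from itertools import product
import math

# Illustrative 2x2 lattice (a 4-cycle): nodes 0..3, pairwise couplings J_ij.
edges = [(0, 1), (1, 3), (3, 2), (2, 0)]
J = {e: 0.5 for e in edges}  # assumed interaction strengths

def score(x):
    # sum_{(i,j) in E} J_ij * x_i * x_j for a spin configuration x
    return sum(J[e] * x[e[0]] * x[e[1]] for e in edges)

# Partition function Z sums exp(score) over all 2^4 spin configurations.
configs = list(product([-1, 1], repeat=4))
Z = sum(math.exp(score(x)) for x in configs)

def prob(x):
    # P(x) = (1/Z) exp( sum J_ij x_i x_j )
    return math.exp(score(x)) / Z

# Probabilities sum to one; with J_ij > 0, aligned configurations are most likely.
print(sum(prob(x) for x in configs), prob((1, 1, 1, 1)))
```

For positive couplings the two fully aligned configurations (all +1 or all -1) tie for the highest probability, which is the usual "ferromagnetic" behavior of such models.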
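Moralization, as described above, connects the parents of each node and drops the edge directions. A minimal sketch with plain dicts; the example DAG is assumed from the slide's factorization P(x1) P(x2) P(x3|x1,x2) P(x4|x2):

```python
from itertools import combinations

def moralize(parents):
    """Moral graph of a DAG given as {node: set of parents}.
    Returns an undirected edge set (frozensets of endpoints)."""
    edges = set()
    for child, pa in parents.items():
        # drop directions: child -- parent
        for p in pa:
            edges.add(frozenset((child, p)))
        # "marry" the parents: connect every pair of co-parents
        for p, q in combinations(sorted(pa), 2):
            edges.add(frozenset((p, q)))
    return edges

# Network with factorization P(x1) P(x2) P(x3|x1,x2) P(x4|x2)
parents = {"x1": set(), "x2": set(), "x3": {"x1", "x2"}, "x4": {"x2"}}
moral = moralize(parents)
# x1 and x2 get married because they share the child x3
print(frozenset(("x1", "x2")) in moral)  # prints True
```

The added x1-x2 edge is exactly why the moral graph stays consistent with the directed distribution: the clique {x1, x2, x3} can absorb the factor P(x3 | x1, x2).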
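The noisy-OR parameterization above is straightforward to compute directly. A sketch where q0 plays the role of the leak ("other" cause) probability q_i0 and q holds the per-disease activation probabilities q_ij; the numeric values are illustrative assumptions:

```python
def p_finding_negative(d, q0, q):
    """P(f=0 | d) = (1 - q0) * prod_j (1 - q_j)^(d_j)  (noisy-OR)."""
    p = 1.0 - q0
    for dj, qj in zip(d, q):
        p *= (1.0 - qj) ** dj
    return p

def p_finding_positive(d, q0, q):
    return 1.0 - p_finding_negative(d, q0, q)

q0 = 0.01        # leak: P(f=1) when no parent disease is present (assumed)
q = [0.8, 0.3]   # q_ij for the finding's two parent diseases (assumed)

# With no diseases present, only the leak can trigger the finding.
print(p_finding_positive([0, 0], q0, q))
print(p_finding_positive([1, 1], q0, q))
```

Note how the table size stays linear in the number of parent diseases: each disease contributes one multiplicative factor (1 - q_ij)^(d_j), instead of one column in an exponentially large CPT.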
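For a toy version of inference problem 1, the posterior marginals P(d_j = 1 | f) can be computed by brute-force enumeration over disease configurations. That is feasible only for a handful of diseases, which is precisely why QMR-DT needs better algorithms; all priors, parent sets, and q parameters below are illustrative assumptions:

```python
from itertools import product

# Toy QMR-style model: 3 diseases, 2 observed findings (all numbers assumed).
prior = [0.1, 0.05, 0.2]  # P(d_j = 1)
# Each finding: ((parent indices, leak q0, per-parent q_ij), observed value)
findings = [
    (((0, 1), 0.01, (0.8, 0.6)), 1),  # f1 = 1, parents d0, d1
    (((1, 2), 0.02, (0.5, 0.7)), 0),  # f2 = 0, parents d1, d2
]

def p_f0(d, parents, q0, qs):
    # noisy-OR: P(f=0 | d) = (1 - q0) * prod_j (1 - q_ij)^(d_j)
    p = 1.0 - q0
    for j, qj in zip(parents, qs):
        p *= (1.0 - qj) ** d[j]
    return p

def joint(d):
    # P(d) * P(observed findings | d), using the factored model
    p = 1.0
    for j, dj in enumerate(d):
        p *= prior[j] if dj else 1.0 - prior[j]
    for (parents, q0, qs), obs in findings:
        p0 = p_f0(d, parents, q0, qs)
        p *= p0 if obs == 0 else 1.0 - p0
    return p

# Normalize over all 2^3 disease configurations and read off marginals.
configs = list(product([0, 1], repeat=3))
Z = sum(joint(d) for d in configs)
marginals = [sum(joint(d) for d in configs if d[j]) / Z for j in range(3)]
print([round(m, 3) for m in marginals])
```

Observing f1 = 1 should raise the posterior of its parent d0 above its prior, while the negative finding f2 = 0 should lower the posterior of d2; the enumeration makes those qualitative effects visible directly.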

