CMU CS 10701 - Graphical Models

Machine Learning 10-701
Tom M. Mitchell
Machine Learning Department
Carnegie Mellon University
February 8, 2011

Today:
• Graphical models
• Bayes Nets:
  • Representing distributions
  • Conditional independencies
  • Simple inference
  • Simple learning

Readings:
Required:
• Bishop chapter 8, through 8.2

Graphical Models
• Key Idea:
  – Conditional independence assumptions useful
  – but Naïve Bayes is extreme!
  – Graphical models express sets of conditional independence assumptions via graph structure
  – Graph structure plus associated parameters define joint probability distribution over set of variables/nodes
• Two types of graphical models:
  – Directed graphs (aka Bayesian Networks) (today)
  – Undirected graphs (aka Markov Random Fields)

Graphical Models – Why Care?
• Among most important ML developments of the decade
• Graphical models allow combining:
  – Prior knowledge in form of dependencies/independencies
  – Observed data to estimate parameters
• Principled and ~general methods for
  – Probabilistic inference
  – Learning
• Useful in practice
  – Diagnosis, help systems, text analysis, time series models, ...

Conditional Independence
Definition: X is conditionally independent of Y given Z, if the probability distribution governing X is independent of the value of Y, given the value of Z:
  (∀ i, j, k) P(X = xi | Y = yj, Z = zk) = P(X = xi | Z = zk)
Which we often write
  P(X | Y, Z) = P(X | Z)
E.g.,
  P(Thunder | Rain, Lightning) = P(Thunder | Lightning)

Marginal Independence
Definition: X is marginally independent of Y if
  (∀ i, j) P(X = xi, Y = yj) = P(X = xi) P(Y = yj)
Equivalently, if
  (∀ i, j) P(X = xi | Y = yj) = P(X = xi)
Equivalently, if
  (∀ i, j) P(Y = yj | X = xi) = P(Y = yj)

Represent Joint Probability Distribution over Variables

Describe network of dependencies

Bayesian Networks define Joint Distribution in terms of this graph, plus parameters

Bayesian Network
[Figure: network over the nodes StormClouds, Lightning, Rain, Thunder, WindSurf]

Bayes network: a directed acyclic graph defining a joint probability distribution over a set of variables
• Each node denotes a random variable
• A conditional probability distribution (CPD) is associated with each node N, defining P(N | Parents(N))
• The joint distribution over all variables in the network is defined in terms of these CPDs, plus the graph

CPD for WindSurf (W):

Parents    P(W|Pa)    P(¬W|Pa)
L, R       0          1.0
L, ¬R      0          1.0
¬L, R      0.2        0.8
¬L, ¬R     0.9        0.1

What can we say about conditional independencies in a Bayes Net?
One thing is this: Each node is conditionally independent of its non-descendants, given only its immediate parents.

Bayesian Networks Definition
A Bayes network represents the joint probability distribution over a collection of random variables.
A Bayes network is a directed acyclic graph and a set of CPDs
• Each node denotes a random variable
• Edges denote dependencies
• CPD for each node Xi defines P(Xi | Pa(Xi))
• The joint distribution over all variables is defined as
  P(X1, ..., Xn) = ∏i P(Xi | Pa(Xi))
  where Pa(X) = immediate parents of X in the graph

Some helpful terminology
• Parents = Pa(X) = immediate parents
• Antecedents = parents, parents of parents, ...
• Children = immediate children
• Descendants = children, children of children, ...

Bayesian Networks
• CPD for each node Xi describes P(Xi | Pa(Xi))
Chain rule of probability:
  P(X1, ..., Xn) = ∏i P(Xi | X1, ..., Xi-1)
But in a Bayes net:
  P(X1, ..., Xn) = ∏i P(Xi | Pa(Xi))

How Many Parameters?
(for the StormClouds, Lightning, Rain, Thunder, WindSurf network)
• In full joint distribution?
• Given this Bayes Net?

Bayes Net
Inference: P(BattPower=t | Radio=t, Starts=f)
Most probable explanation: What is most likely value of Leak, BatteryPower given Starts=f?
Active data collection: What is most useful variable to observe next, to improve our knowledge of node X?

Algorithm for Constructing Bayes Network
• Choose an ordering over variables, e.g., X1, X2, ... Xn
• For i=1 to n
  – Add Xi to the network
  – Select parents Pa(Xi) as minimal subset of X1 ... Xi-1 such that
    P(Xi | Pa(Xi)) = P(Xi | X1, ..., Xi-1)
Notice this choice of parents assures
  P(X1, ..., Xn) = ∏i P(Xi | X1, ..., Xi-1)    (by chain rule)
                 = ∏i P(Xi | Pa(Xi))           (by construction)

Example
• Bird flu and Allergies both cause Nasal problems
• Nasal problems cause Sneezes and Headaches

What is the Bayes Network for X1, ..., Xn with NO assumed conditional independencies?

What is the Bayes Network for Naïve Bayes?

What do we do if variables are mix of discrete and real valued?

Bayes Network for a Hidden Markov Model
Assume the future is conditionally independent of the past, given the present
Unobserved state: St-2, St-1, St, St+1, St+2
Observed output: Ot-2, Ot-1, Ot, Ot+1, Ot+2

How Can We Train a Bayes Net
1. when graph is given, and each training example gives value of every RV?
   Easy: use data to obtain MLE or MAP estimates of θ for each CPD P(Xi | Pa(Xi); θ)
   e.g. like training the CPDs of a naïve Bayes classifier
2. when graph unknown or some RVs unobserved?
   this is more difficult…
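
The factorization, the parameter-counting question, and the fully observed training case above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: all variables are Boolean, and the edge set in PARENTS is assumed (the text confirms only that WindSurf has two parents, written L and R, which are taken here to be Lightning and Rain). The function names and the toy data set are hypothetical, and only MLE by counting is shown, not MAP.

```python
from collections import Counter
from itertools import product

# ASSUMED graph structure: only WindSurf's parent set is stated in the notes.
PARENTS = {
    "StormClouds": [],
    "Lightning": ["StormClouds"],
    "Rain": ["StormClouds"],
    "Thunder": ["Lightning"],
    "WindSurf": ["Lightning", "Rain"],
}

def count_parameters(parents):
    """Free parameters for Boolean nodes: one per setting of each node's parents."""
    return sum(2 ** len(pa) for pa in parents.values())

def full_joint_parameters(n_vars):
    """Free parameters of an unrestricted joint over n Boolean variables."""
    return 2 ** n_vars - 1

def mle_cpd(data, node, parents):
    """MLE of P(node=True | Pa(node)) by counting fully observed examples.

    Returns a dict keyed by the tuple of parent values. Parent settings
    that never occur in the data get no entry (the MLE is undefined there).
    """
    pa = parents[node]
    totals, trues = Counter(), Counter()
    for row in data:
        key = tuple(row[p] for p in pa)
        totals[key] += 1
        if row[node]:
            trues[key] += 1
    return {key: trues[key] / totals[key] for key in totals}

def joint_probability(assignment, cpds, parents):
    """P(X1, ..., Xn) as the product over nodes of P(Xi | Pa(Xi))."""
    prob = 1.0
    for node, pa in parents.items():
        p_true = cpds[node][tuple(assignment[p] for p in pa)]
        prob *= p_true if assignment[node] else 1.0 - p_true
    return prob

# Hypothetical toy data: every one of the 32 joint assignments exactly once.
data = [dict(zip(PARENTS, values))
        for values in product([False, True], repeat=len(PARENTS))]
cpds = {node: mle_cpd(data, node, PARENTS) for node in PARENTS}

print(count_parameters(PARENTS))                  # 11
print(full_joint_parameters(len(PARENTS)))        # 31
print(joint_probability(data[0], cpds, PARENTS))  # 0.03125
```

With the assumed structure, the network needs 11 parameters against 31 for the unrestricted joint. Because the toy data is uniform, every estimated CPD entry is 0.5 and each joint assignment gets probability 1/32; real data would give non-uniform estimates, and a MAP estimate would add prior pseudo-counts to the two counters.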

