Graphical Models
Aarti Singh
Slides courtesy: Carlos Guestrin
Machine Learning 10-701/15-781
Nov 10, 2010

Recitation
• HMMs & Graphical Models
• Strongly recommended!!
• Place: NSH 1507 (Note)
• Time: 5-6 pm

From i.i.d. to dependent data
• HMM: sequential dependence
• Graphical models: general dependence

Applications
• Character recognition, e.g., kernel SVMs
  [Figure: handwritten characters "c" and "r"]
• Webpage classification (Sports / Science / News)
• Speech recognition
• Diagnosis of diseases
• Studying the human genome
• Robot mapping
• Modeling fMRI data
• Fault diagnosis
• Modeling sensor network data
• Modeling protein-protein interactions
• Weather prediction
• Computer vision
• Statistical physics
• Many, many more …

Graphical Models
• Key idea:
  – Conditional independence assumptions are useful
  – but Naïve Bayes is extreme!
  – Graphical models express sets of conditional independence assumptions via graph structure
  – Graph structure plus associated parameters define a joint probability distribution over the set of variables/nodes
• Two types of graphical models:
  – Directed graphs (aka Bayesian networks)
  – Undirected graphs (aka Markov random fields)

Topics in Graphical Models
• Representation
  – Which joint probability distributions does a graphical model represent?
• Inference
  – How to answer questions about the joint probability distribution?
    • Marginal distribution of a node variable
    • Most likely assignment of node variables
• Learning
  – How to learn the parameters and structure of a graphical model?

Conditional Independence
• X is conditionally independent of Y given Z (written X ⊥ Y | Z): the probability distribution governing X is independent of the value of Y, given the value of Z:
  P(X = x | Y = y, Z = z) = P(X = x | Z = z)  for all x, y, z
• Equivalent to: P(X, Y | Z) = P(X | Z) P(Y | Z)
• Also to: P(Y | X, Z) = P(Y | Z)

Directed Graphs – Bayesian Networks
• Representation: which joint probability distributions does a graphical model represent?
• Any arbitrary distribution can be written with the chain rule:
  P(X1, X2, X3) = P(X1) P(X2 | X1) P(X3 | X1, X2)
  More generally:
  P(X1, …, Xn) = P(X1) P(X2 | X1) … P(Xn | X1, …, Xn-1)
  This corresponds to a fully connected directed graph between X1, …, Xn.
• The absence of edges in a graphical model conveys useful information.
• A Bayesian network (BN) is a directed acyclic graph (DAG) that provides a compact representation of a joint distribution.
• Local Markov Assumption: a variable X is independent of its non-descendants given its parents (and only its parents).

Bayesian Networks Example
• Suppose we know the following:
  – The flu causes sinus inflammation
  – Allergies cause sinus inflammation
  – Sinus inflammation causes a runny nose
  – Sinus inflammation causes headaches
• Causal network: Flu → Sinus ← Allergy, Sinus → Headache, Sinus → Nose
• Local Markov Assumption at work: if you have no sinus inflammation, then flu has no influence on headache (flu causes headache, but only through sinus).

Markov independence assumption
Local Markov Assumption: a variable X is independent of its non-descendants given its parents (and only its parents).

Variable | Parents | Non-descendants | Assumption
S        | F, A    | –               | –
H        | S       | F, A, N         | H ⊥ {F, A, N} | S
N        | S       | F, A, H         | N ⊥ {F, A, H} | S
F        | –       | A               | F ⊥ A
A        | –       | F               | A ⊥ F

Joint distribution:
Chain rule:
  P(F, A, S, H, N) = P(F) P(A | F) P(S | F, A) P(H | S, F, A) P(N | S, F, A, H)
Applying the Markov assumptions F ⊥ A, H ⊥ {F, A} | S, N ⊥ {F, A, H} | S:
  P(F, A, S, H, N) = P(F) P(A) P(S | F, A) P(H | S) P(N | S)
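To make the factorized joint concrete, here is a minimal Python sketch (not part of the lecture) that multiplies out P(F) P(A) P(S|F,A) P(H|S) P(N|S) and checks that it sums to 1. Only the CPT P(S | F, A) is taken from the parameter-counting table in the next section; the priors p_F, p_A and the tables p_H_t, p_N_t are made-up illustrative values, and all names are mine.

```python
from itertools import product

# Priors and CPTs for the flu network.  P(S=t | F, A) is the CPT from the
# parameter-counting example below; every other number is an assumed
# illustrative value, not from the lecture.
p_F = {True: 0.4, False: 0.6}                      # assumed prior P(F)
p_A = {True: 0.3, False: 0.7}                      # assumed prior P(A)
p_S_t = {(False, False): 0.9, (True, False): 0.8,  # P(S=t | F, A), keyed (F, A)
         (False, True): 0.7, (True, True): 0.3}
p_H_t = {True: 0.8, False: 0.1}                    # assumed P(H=t | S)
p_N_t = {True: 0.7, False: 0.05}                   # assumed P(N=t | S)

def bern(p_true, value):
    """P(X = value) for a binary X with P(X = True) = p_true."""
    return p_true if value else 1.0 - p_true

def joint(f, a, s, h, n):
    # P(F,A,S,H,N) = P(F) P(A) P(S|F,A) P(H|S) P(N|S)
    return (p_F[f] * p_A[a] * bern(p_S_t[(f, a)], s)
            * bern(p_H_t[s], h) * bern(p_N_t[s], n))

# 10 free parameters (1 + 1 + 4 + 2 + 2) instead of 2^5 - 1 = 31 for the
# unfactorized joint.  Sanity check: the factorized joint sums to 1.
total = sum(joint(*v) for v in product([True, False], repeat=5))
print(f"sum of joint over all 32 assignments = {total:.6f}")  # -> 1.000000
```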
How many parameters in a BN?
• Discrete variables X1, …, Xn
• Directed acyclic graph (DAG)
  – defines the parents of Xi, Pa(Xi)
• CPTs (conditional probability tables)
  – P(Xi | Pa(Xi))

E.g., Xi = S, Pa(Xi) = {F, A}:

        F=f, A=f   F=t, A=f   F=f, A=t   F=t, A=t
S=t     0.9        0.8        0.7        0.3
S=f     0.1        0.2        0.3        0.7

With n variables, K values each, and at most d parents per node: O(nK × K^d) parameters (each CPT has at most K · K^d entries).

Two (trivial) special cases
• Fully disconnected graph:
  – parents of Xi: none; non-descendants of Xi: X1, …, Xi-1, Xi+1, …, Xn
  – so Xi ⊥ {X1, …, Xi-1, Xi+1, …, Xn}: all variables are independent.
• Fully connected graph:
  – parents of Xi: X1, …, Xi-1; non-descendants of Xi: X1, …, Xi-1 (the remaining nodes Xi+1, …, Xn are descendants)
  – no independence assumption.

Bayesian Networks Examples
• Naïve Bayes: Xi ⊥ {X1, …, Xi-1, Xi+1, …, Xn} | Y
  P(X1, …, Xn, Y) = P(Y) P(X1 | Y) … P(Xn | Y)
• HMM: hidden states S1 → S2 → … → ST-1 → ST, with observations O1, O2, …, OT-1, OT

Explaining Away
(Flu → Sinus ← Allergy, Sinus → Headache, Sinus → Nose)
• F ⊥ A: P(F | A = t) = P(F)
• F ⊥ A | S? Is P(F | A = t, S = t) = P(F | S = t)? No!
  P(F = t | S = t) is high, but P(F = t | A = t, S = t) is not as high, since A = t explains away S = t.
  In fact, P(F = t | A = t, S = t) < P(F = t | S = t).
• F ⊥ A | N? No!

D-separation
• a is D-separated from b by c ≡ a ⊥ b | c
• Three important configurations:
  – Causal direction (chain): a → c → b
  – Common cause: a ← c → b
  – V-structure (explaining away): a → c ← b

D-separation (general)
• A, B, C: non-intersecting sets of nodes
• A is D-separated from B by C ≡ A ⊥ B | C if all paths between nodes in A and B are "blocked", i.e., every path contains a node z such that either
  – z is head-to-tail (→ z →) or tail-to-tail (← z →) on the path, and z is in C, OR
  – z is head-to-head (→ z ←) on the path, and neither z nor any of its descendants is in C.

D-separation Example
[Figure: example graph with edges a → e, f → e, e → c, f → b]
• a ⊥ b | f? Yes: consider z = f (tail-to-tail, in C) or z = e (head-to-head, with neither e nor its descendant c in C).
• a ⊥ b | c? No: consider z = e (head-to-head, but its descendant c is in C, so the path is not blocked).

Representation Theorem
• F: the set of distributions that factorize according to the graph
• I: the set of distributions that respect the conditional independencies implied by the d-separation properties of the graph
• I ⊆ F: important because, given the independencies of P, we can get a BN structure G.
• F ⊆ I: important because we can read the independencies of P from the BN structure G.

Markov Blanket
• The Markov blanket of node i is the set of its parents, children, and co-parents (other parents of its children).
• Conditioning on its Markov blanket, node i is independent of all other nodes: the only factors of the joint that remain are the ones that involve i.

Undirected Graphs – Markov Random Fields
• Popular in the statistical physics and computer vision communities
• Example: image denoising
  – xi: true value at pixel i
  – yi: observed noisy value
• Conditional independence properties
  – No directed edges …
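As a closing worked example (not from the lecture), the explaining-away claim above can be checked numerically by brute-force enumeration on the flu network. P(S | F, A) is the lecture's CPT; the priors p_F and p_A are the same assumed illustrative values as in the earlier sketch, and H, N marginalize out of these queries.

```python
from itertools import product

# P(S=t | F, A) is the lecture's CPT; p_F and p_A are assumed priors.
# H and N marginalize out, so the relevant joint is
# P(F, A, S) = P(F) P(A) P(S | F, A).
p_F = {True: 0.4, False: 0.6}
p_A = {True: 0.3, False: 0.7}
p_S_t = {(False, False): 0.9, (True, False): 0.8,
         (False, True): 0.7, (True, True): 0.3}

def p_fas(f, a, s):
    p = p_S_t[(f, a)]
    return p_F[f] * p_A[a] * (p if s else 1.0 - p)

def prob_F_true(evidence):
    """P(F = t | evidence) by enumeration; evidence maps 'A'/'S' to bools."""
    def ok(a, s):
        vals = {'A': a, 'S': s}
        return all(vals[k] == v for k, v in evidence.items())
    num = sum(p_fas(True, a, s)
              for a, s in product([True, False], repeat=2) if ok(a, s))
    den = num + sum(p_fas(False, a, s)
                    for a, s in product([True, False], repeat=2) if ok(a, s))
    return num / den

print(f"P(F=t)            = {prob_F_true({}):.3f}")                      # 0.400
print(f"P(F=t | A=t)      = {prob_F_true({'A': True}):.3f}")             # 0.400: F independent of A
print(f"P(F=t | S=t)      = {prob_F_true({'S': True}):.3f}")             # ~0.340
print(f"P(F=t | A=t, S=t) = {prob_F_true({'A': True, 'S': True}):.3f}")  # ~0.222
# Observing A=t explains away S=t: P(F=t | A=t, S=t) < P(F=t | S=t),
# even though marginally F and A are independent.
```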

