CMU CS 10708 - Structure Learning 2: the good, the bad, the ugly


Structure Learning 2: the good, the bad, the ugly
Graphical Models 10-708, Carlos Guestrin, Carnegie Mellon University
October 26th, 2005
Reading: Koller & Friedman, Chapter 13

Outline
- Consistency of BIC and Bayesian scores
- Priors for general graphs; the BDe prior
- Score equivalence
- Chow-Liu for the Bayesian score
- Structure learning for general graphs: fixed variable order, local search
- Order search versus graph search
- Bayesian model averaging
- What you need to know about learning BN structures

Consistency of BIC and Bayesian scores
- Consistency describes limiting behavior; it says nothing about finite sample sizes!
- A scoring function is consistent if, for the true model G*, as the number of samples M → ∞, with probability 1: G* maximizes the score, and all structures not I-equivalent to G* have strictly lower score.
- Theorem: the BIC score is consistent.
- Corollary: the Bayesian score is consistent.
- What about maximum likelihood? It is not consistent: adding edges never decreases the likelihood, so structures denser than G* score at least as well.

Priors for general graphs
- For finite datasets, the prior is important!
- Prior over structures: satisfies structure modularity.
- What about the prior over parameters; how do we represent it?
- K2 prior: fix an α and set P(θ_{Xi|PaXi}) = Dirichlet(α, …, α).
- K2 is "inconsistent": it does not assign the same score to I-equivalent structures.

BDe prior
- Remember that Dirichlet hyperparameters act like "fictitious samples".
- Pick a fictitious (equivalent) sample size M'.
- For each possible family, define a prior distribution P(Xi, PaXi); represent it with a BN, usually as a product of marginals (independent).
- BDe prior: α(xi, pa_{Xi}) = M' · P(xi, pa_{Xi}).
- Has the "consistency property": I-equivalent structures receive the same score.

Score equivalence
- If G and G' are I-equivalent, then they have the same score.
- Theorem: maximum likelihood and BIC scores satisfy score equivalence.
- Theorem: if P(G) assigns the same prior to I-equivalent structures (e.g., by edge counting) and the parameter prior is Dirichlet, then the Bayesian score satisfies score equivalence if and only if the prior over parameters can be represented as a BDe prior!

Chow-Liu for the Bayesian score
- Edge weight w_{Xj→Xi} is the advantage of adding Xj as a parent of Xi.
- We now have a directed graph, so we need a directed spanning forest.
- Note that adding an edge can hurt the Bayesian score, so choose a forest, not a tree.
- But if the score satisfies score equivalence, then w_{Xj→Xi} = w_{Xi→Xj}!
- A simple maximum spanning forest algorithm works.

Structure learning for general graphs
- In a tree, a node has only one parent.
- Theorem: the problem of learning a BN structure with at most d parents is NP-hard for any (fixed) d ≥ 2.
- Most structure learning approaches use heuristics that exploit score decomposition.
- (Quickly) describe two heuristics that exploit decomposition in different ways.

Understanding score decomposition
(Slide figure: the student example network over Coherence, Difficulty, Intelligence, Grade, SAT, Letter, Happy, Job.)

Fixed variable order 1
- Pick a variable order ≺, e.g., X1, …, Xn.
- Xi can only pick parents in {X1, …, X_{i-1}} (any subset).
- Acyclicity is guaranteed!
- Total score = sum of each node's family score.

Fixed variable order 2
- Fix the maximum number of parents to k.
- For each i in order ≺: pick PaXi ⊆ {X1, …, X_{i-1}} by exhaustively searching through all possible subsets; PaXi is the maximizer over U ⊆ {X1, …, X_{i-1}} of FamScore(Xi | U : D).
- This gives the optimal BN for each order!
- Greedy search through the space of orders: e.g., try switching pairs of variables in the order. If neighboring variables in the order are switched, only the scores for that pair need to be recomputed — an O(n) speedup per iteration.
- Local moves may be worse.

Learn BN structure using local search
- Start from some structure, e.g., the Chow-Liu tree.
- Local search; possible moves: add edge, delete edge, invert edge — only if the result is acyclic!
- Select moves using your favorite score.

Exploit score decomposition in local search
- Add edge and delete edge: only one family needs rescoring!
- Reverse edge: rescore only two families.
(Slide figure: the same student example network.)

Order search versus graph search
- Order search advantages: for a fixed order we get the optimal BN — a more "global" optimization; and the space of orders is much smaller than the space of graphs.
- Graph search advantages: not restricted to k parents, especially if exploiting CPD structure such as CSI; cheaper per iteration; finer moves within a graph.

Bayesian model averaging
- So far, we have selected a single structure.
- But if you are really Bayesian, you must average over structures, similar to averaging over parameters.
- Inference for structure averaging is very hard! Clever tricks in the reading.

What you need to know about learning BN structures
- Decomposable scores: maximum likelihood (and its information-theoretic interpretation); Bayesian; BIC approximation.
- Priors: structure and parameter assumptions; BDe if and only if score equivalence.
- Best tree (Chow-Liu); best TAN (tree-augmented naive Bayes); nearly best k-treewidth (in O(N^(k+1))).
- Search techniques: search through orders; search through structures; Bayesian model averaging.
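The decomposable scores above can be made concrete in a few lines. This is a minimal sketch (not the lecture's code) of the BIC score for a discrete Bayesian network with fully observed data; the helper names and toy data are illustrative assumptions:

```python
# Sketch: decomposable BIC score, computed family-by-family.
import math
from collections import Counter

def family_bic(data, child, parents, arity):
    """BIC contribution of one family: max log-likelihood
    minus (log M / 2) * (number of independent parameters)."""
    M = len(data)
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    pa_counts = Counter(tuple(row[p] for p in parents) for row in data)
    loglik = sum(n * math.log(n / pa_counts[pa]) for (pa, _), n in joint.items())
    n_params = (arity[child] - 1) * math.prod(arity[p] for p in parents)
    return loglik - 0.5 * math.log(M) * n_params

def bic_score(data, structure, arity):
    """Score decomposition: total BIC is a sum of per-family scores."""
    return sum(family_bic(data, x, pa, arity) for x, pa in structure.items())

# Toy data: Y mostly copies X, so the edge X -> Y should beat the penalty.
data = ([{"X": 0, "Y": 0}] * 40 + [{"X": 1, "Y": 1}] * 40
        + [{"X": 0, "Y": 1}] * 10 + [{"X": 1, "Y": 0}] * 10)
arity = {"X": 2, "Y": 2}
with_edge = bic_score(data, {"X": (), "Y": ("X",)}, arity)
no_edge = bic_score(data, {"X": (), "Y": ()}, arity)
```

Because the score decomposes, a search procedure only needs to recompute `family_bic` for the families a move touches, which is exactly what the local-search slides exploit.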

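The "fixed variable order" heuristic can be sketched the same way. Under a fixed order, each node independently picks its best-scoring parent set among its predecessors (up to k parents), and acyclicity comes for free; the function names and toy chain data below are assumptions for illustration:

```python
# Sketch: optimal BN for a fixed variable order, with at most k parents.
import math
from collections import Counter
from itertools import combinations

def family_bic(data, child, parents, arity):
    """Per-family BIC score for discrete, fully observed data."""
    M = len(data)
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in data)
    pa_counts = Counter(tuple(r[p] for p in parents) for r in data)
    loglik = sum(n * math.log(n / pa_counts[pa]) for (pa, _), n in joint.items())
    n_params = (arity[child] - 1) * math.prod(arity[p] for p in parents)
    return loglik - 0.5 * math.log(M) * n_params

def best_bn_for_order(data, order, arity, k=2):
    """Each node exhaustively searches subsets of its predecessors:
    parents point backward in the order, so the result is always acyclic."""
    structure = {}
    for i, x in enumerate(order):
        preds = order[:i]
        candidates = [c for s in range(min(k, len(preds)) + 1)
                      for c in combinations(preds, s)]
        structure[x] = max(candidates,
                           key=lambda pa: family_bic(data, x, pa, arity))
    return structure

# Toy chain A -> B -> C (C deterministically copies B).
data = ([{"A": 0, "B": 0, "C": 0}] * 45 + [{"A": 1, "B": 1, "C": 1}] * 45
        + [{"A": 0, "B": 1, "C": 1}] * 5 + [{"A": 1, "B": 0, "C": 0}] * 5)
arity = {"A": 2, "B": 2, "C": 2}
bn = best_bn_for_order(data, ["A", "B", "C"], arity)
```

A greedy search over orders would wrap this routine and, as the slides note, only rescore the affected pair when two neighboring variables in the order are swapped.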

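The local-search slides (add / delete / reverse edge, only if acyclic) can also be sketched. The toy score below is purely illustrative; a real implementation would plug in a decomposable score and rescore only the one or two touched families:

```python
# Sketch: greedy hill-climbing over DAG structures with three edge moves.
def has_cycle(parents):
    """DFS cycle check on a {node: set_of_parents} representation."""
    state = {}
    def visit(x):
        if state.get(x) == "active":
            return True
        if state.get(x) == "done":
            return False
        state[x] = "active"
        if any(visit(p) for p in parents[x]):
            return True
        state[x] = "done"
        return False
    return any(visit(x) for x in parents)

def neighbors(parents):
    """All acyclic structures one edge move away."""
    nodes = list(parents)
    out = []
    for u in nodes:
        for v in nodes:
            if u == v:
                continue
            if u in parents[v]:
                delete = {x: set(ps) for x, ps in parents.items()}
                delete[v].discard(u)               # delete u -> v (always acyclic)
                out.append(delete)
                reverse = {x: set(ps) for x, ps in parents.items()}
                reverse[v].discard(u)
                reverse[u].add(v)                  # reverse u -> v
                if not has_cycle(reverse):
                    out.append(reverse)
            else:
                add = {x: set(ps) for x, ps in parents.items()}
                add[v].add(u)                      # add u -> v
                if not has_cycle(add):
                    out.append(add)
    return out

def hill_climb(parents, score):
    """Greedy ascent: move to the best neighbor until no move improves."""
    current, cur_score = parents, score(parents)
    while True:
        best = max(neighbors(current), key=score)
        if score(best) <= cur_score:
            return current
        current, cur_score = best, score(best)

# Toy score: reward matching a target edge set (illustration only).
target = {("A", "B"), ("B", "C")}
def toy_score(parents):
    edges = {(p, x) for x, ps in parents.items() for p in ps}
    return -len(edges ^ target)

learned = hill_climb({"A": set(), "B": set(), "C": set()}, toy_score)
```

Starting from the Chow-Liu tree, as the slides suggest, typically gives such a search a much better initial score than the empty graph.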