UW-Madison CS 731 - Likelihood Scores - D2474740

School name University of Wisconsin, Madison

Course CS 731 - Advanced Artificial Intelligence

Pages 9

This preview shows pages 1-3 of 9.

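Likelihood Scores
Lecture XX

Reminder from Information Theory
• Mutual Information: $I_P(X;Y) = \sum_{x,y} P(x,y) \log \frac{P(x,y)}{P(x)\,P(y)}$
• Conditional Mutual Information: $I_P(X;Y \mid Z) = \sum_{x,y,z} P(x,y,z) \log \frac{P(x,y \mid z)}{P(x \mid z)\,P(y \mid z)}$
• Entropy: $H_P(X) = -\sum_{x} P(x) \log P(x)$

Scoring Maximum Likelihood Function
• When the scoring function is the maximum likelihood, the model makes the data as probable as possible: we choose the graph structure that achieves the highest score under the maximum likelihood estimate (MLE) of its parameters. We define: $\mathrm{score}_L(G : D) = \ell(\hat\theta_G : D)$, where $\ell$ is the log-likelihood of the data $D$ and $\hat\theta_G$ are the MLE parameters for structure $G$.

Two Graph Structures
• Consider two simple graph structures: $G_\emptyset$, in which X and Y have no arc between them, and $G_{X \to Y}$, which has the arc X → Y.
• The difference is: $\mathrm{score}_L(G_{X \to Y} : D) - \mathrm{score}_L(G_\emptyset : D) = \sum_{m} \left( \log \hat\theta_{y[m] \mid x[m]} - \log \hat\theta_{y[m]} \right)$

Ex Continued
• By counting how many times each conditional probability parameter appears in this term: $\sum_{x,y} M[x,y] \log \hat\theta_{y \mid x} - \sum_{y} M[y] \log \hat\theta_{y}$
• When $\hat P$ is the empirical distribution observed in the data, this equals $M \sum_{x,y} \hat P(x,y) \log \frac{\hat P(y \mid x)}{\hat P(y)} = M \cdot I_{\hat P}(X;Y)$, where $M$ is the number of data instances and $I_{\hat P}(X;Y)$ is the mutual information between X and Y in the distribution $\hat P$.
• The goal is to maximize the mutual information.

Likelihood Score: General Networks
• Proposition: $\mathrm{score}_L(G : D) = M \sum_{i} I_{\hat P}(X_i ; \mathrm{Pa}_{X_i}^{G}) - M \sum_{i} H_{\hat P}(X_i)$
• Proof: $\ell(\hat\theta_G : D) = \sum_{i} \sum_{u_i} \sum_{x_i} M[x_i, u_i] \log \hat\theta_{x_i \mid u_i}$. The outer sum loops over all variables, the middle sum loops over all rows of the CPT (the parent assignments $u_i$), and the inner sum covers all settings of this variable.
• Entropy: the term $M \sum_{i} H_{\hat P}(X_i)$ does not depend on the graph structure, so it does not penalize adding arcs. Because adding a parent never decreases the mutual information term, the likelihood score alone favors densely connected graphs, and a separate penalty is needed to prevent overfitting.

Proof Cont.
• Substituting $M[x_i, u_i] = M \hat P(x_i, u_i)$ and $\hat\theta_{x_i \mid u_i} = \hat P(x_i \mid u_i)$, and writing $\hat P(x_i \mid u_i) = \frac{\hat P(x_i, u_i)}{\hat P(x_i)\,\hat P(u_i)} \hat P(x_i)$, the term for each variable becomes $M\, I_{\hat P}(X_i ; U_i) - M\, H_{\hat P}(X_i)$. Summing over $i$ gives the proposition.

Tree-Augmented Naïve Bayes (TAN) Model
• Bayesian network in which one node is distinguished as the Class node
• Arc from Class to every other node (feature node), as in naïve Bayes
• Remaining arcs form a directed tree among the feature nodes

TAN Learning Algorithm (guarantees maximum likelihood TAN model)
• Compute (based on the data set) the conditional mutual information between each pair of features, conditional on Class
• Compute the maximum weight spanning tree of the complete graph over features, with each edge weighted by the conditional mutual information computed above
• Choose any feature as root and direct arcs away from the root, to get a directed tree over features
• Add the Class variable with arcs to all feature nodes, to get the final network structure
• Learn parameters from data as for any other Bayes net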
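
The structure-learning steps of the TAN algorithm above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the lecture's own code: the data is assumed to be a list of equal-length tuples of discrete values, columns are referred to by integer index, and the names `cond_mutual_info` and `learn_tan_structure` are hypothetical. The slides do not say which spanning-tree method to use; Prim's algorithm is chosen here because growing the tree from the root also orients every arc away from it. Parameter learning (the last step) is left out.

```python
import math
from collections import Counter
from itertools import combinations


def cond_mutual_info(rows, i, j, c):
    """Empirical I(X_i; X_j | C) in nats; rows is a list of tuples (assumed layout)."""
    m = len(rows)
    n_ijc = Counter((r[i], r[j], r[c]) for r in rows)
    n_ic = Counter((r[i], r[c]) for r in rows)
    n_jc = Counter((r[j], r[c]) for r in rows)
    n_c = Counter(r[c] for r in rows)
    total = 0.0
    for (xi, xj, cl), n in n_ijc.items():
        # P(xi,xj|c) / (P(xi|c) P(xj|c)) = n(xi,xj,c) n(c) / (n(xi,c) n(xj,c))
        ratio = n * n_c[cl] / (n_ic[(xi, cl)] * n_jc[(xj, cl)])
        total += (n / m) * math.log(ratio)
    return total


def learn_tan_structure(rows, features, class_col, root=None):
    """Return TAN arcs as (parent, child) column-index pairs (hypothetical helper)."""
    root = features[0] if root is None else root

    # Step 1: weight every feature pair by conditional mutual information given Class.
    weight = {}
    for i, j in combinations(features, 2):
        w = cond_mutual_info(rows, i, j, class_col)
        weight[(i, j)] = weight[(j, i)] = w

    # Steps 2-3: Prim's algorithm for a maximum-weight spanning tree, grown from
    # the root, so each new edge points from a node already in the tree to a new one.
    in_tree = [root]
    remaining = set(features) - {root}
    arcs = []
    while remaining:
        parent, child = max(
            ((p, q) for p in in_tree for q in sorted(remaining)),
            key=lambda e: weight[e],
        )
        arcs.append((parent, child))
        in_tree.append(child)
        remaining.remove(child)

    # Step 4: the Class node points to every feature, as in naive Bayes.
    arcs += [(class_col, f) for f in features]
    return arcs
```

For example, with features in columns 0 to 2 and the class in column 3, `learn_tan_structure(rows, [0, 1, 2], 3)` returns two tree arcs among the features followed by three arcs from the class column. Ties between equally weighted edges are broken by iteration order, so several equally good trees may exist.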
