Berkeley COMPSCI 188 - Post-midterm course review


CS 188: Artificial Intelligence, Spring 2007
Lecture 29: Post-midterm course review
5/8/2007
Srini Narayanan – ICSI and UC Berkeley

Slide index (full deck): Final Exam; Utility-Based Agents; Today; Questions; Markov Models; Conditional Independence; Example; Hidden Markov Models; Forward Algorithm; Viterbi Algorithm; Parameter estimation; General Naïve Bayes; Estimation: Smoothing; Types of Supervised classifiers; The Binary Perceptron; The Multiclass Perceptron; Linear Separators; Feature design; MDP and Reinforcement Learning; Markov Decision Processes; Bellman's Equation for Selecting actions; Elements of RL; MDPs; Reinforcement Learning; Model-Free Learning; Problems with TD Value Learning; Q-Learning; Applications to NLP; NLP applications of Bayes Rules!; Ambiguities; Thanks!; Phase II: Update Means

Final Exam
- 8:10 to 11 AM on 5/15/2007 at 50 Birge
- Final prep page is up; it includes all topics (see the page), weighted toward post-midterm topics
- 2 double-sided cheat sheets are allowed, as is a calculator
- Final exam review: Thursday 4 PM, Soda 306

Utility-Based Agents
- (Diagram slide)

Today
- Review of post-midterm topics relevant for the final:
- Reasoning about time: Markov models, the HMM forward algorithm, the Viterbi algorithm
- Classification: Naïve Bayes, perceptron
- Reinforcement learning: MDPs, value iteration, policy iteration, TD value learning, Q-learning, advanced topics
- Applications to NLP

Questions
- What is the basic conditional independence assertion for Markov models?
- What is a problem with Markov models for prediction into the future?
- What are the basic CI assertions for HMMs?
- How do inference algorithms exploit the CI assertions? (Forward algorithm, Viterbi algorithm)

Markov Models
- A Markov model is a chain-structured BN
- Each node is identically distributed (stationarity)
- The value of X at a given time is called the state
- As a BN: X1 → X2 → X3 → X4 → ...
- The parameters, called transition probabilities or dynamics, specify how the state evolves over time (also, the initial probabilities)

Conditional Independence
- Basic conditional independence: the past and the future are independent given the present
- Each time step depends only on the previous one
- This is called the (first-order) Markov property
- Note that the chain is just a (growing) BN; we can always use generic BN reasoning on it (if we truncate the chain)

Example
- (Figure: the distributions P(X1), P(X2), P(X3), ... computed from an initial observation of sun and from an initial observation of rain)

Hidden Markov Models
- Markov chains are not so useful for most agents: eventually you don't know anything anymore; you need observations to update your beliefs
- Hidden Markov models (HMMs): an underlying Markov chain over states S; you observe outputs (effects) at each time step
- As a Bayes' net: X1 → X2 → ... → X5, with an emission Et hanging off each Xt (E1, ..., E5)

Example
- An HMM is specified by an initial distribution P(X1), transitions P(Xt | Xt-1), and emissions P(Et | Xt)

Conditional Independence
- HMMs have two important independence properties:
- Markov hidden process: the future depends on the past via the present
- The current observation is independent of everything else given the current state
- Quiz: does this mean that observations are independent given no evidence? [No, they are correlated by the hidden state]

Forward Algorithm
- We can ask the same questions for HMMs as for Markov chains
- Given the current belief state, how do we update it with evidence? This is called monitoring or filtering
- Formally, we want the belief B_t(X) = P(X_t | e_1, ..., e_t)

Viterbi Algorithm
- Question: what is the most likely state sequence given the observations?
- Slow answer: enumerate all possibilities
- Better answer: a cached, incremental version
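The slides give only the recipes for filtering and Viterbi decoding, so here is a minimal Python sketch of both on a toy two-state weather HMM. The state names, observation alphabet, and all probability tables are invented for illustration and are not from the lecture.

```python
# Toy HMM: hidden weather states with noisy umbrella observations.
# All numbers below are illustrative, not taken from the lecture.
states = ["rain", "sun"]
init = {"rain": 0.5, "sun": 0.5}                      # P(X1)
trans = {"rain": {"rain": 0.7, "sun": 0.3},           # P(X_t | X_{t-1})
         "sun":  {"rain": 0.3, "sun": 0.7}}
emit = {"rain": {"umbrella": 0.9, "none": 0.1},       # P(E_t | X_t)
        "sun":  {"umbrella": 0.2, "none": 0.8}}

def forward(observations):
    """Filtering: return B_t(X) = P(X_t | e_1..t) after each observation."""
    beliefs, belief = [], None
    for t, e in enumerate(observations):
        if t == 0:
            predicted = dict(init)                    # prior over X1
        else:
            # Time update: sum over the previous state.
            predicted = {x: sum(belief[xp] * trans[xp][x] for xp in states)
                         for x in states}
        # Observation update, then normalize.
        unnorm = {x: predicted[x] * emit[x][e] for x in states}
        z = sum(unnorm.values())
        belief = {x: p / z for x, p in unnorm.items()}
        beliefs.append(belief)
    return beliefs

def viterbi(observations):
    """Most likely state sequence: the same recursion with max in place of sum."""
    score = {x: init[x] * emit[x][observations[0]] for x in states}
    backptrs = []
    for e in observations[1:]:
        prev, new_score = {}, {}
        for x in states:
            best_xp = max(states, key=lambda xp: score[xp] * trans[xp][x])
            prev[x] = best_xp
            new_score[x] = score[best_xp] * trans[best_xp][x] * emit[x][e]
        backptrs.append(prev)
        score = new_score
    last = max(states, key=lambda x: score[x])        # best final state
    path = [last]
    for prev in reversed(backptrs):                   # trace back pointers
        path.append(prev[path[-1]])
    return list(reversed(path))

obs = ["umbrella", "umbrella", "none"]
print(forward(obs)[-1])   # filtered belief P(X_3 | e_1:3)
print(viterbi(obs))       # most likely hidden state sequence
```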
BayesDiscriminative ModelsPerceptronUnsupervised ModelsK-meansAgglomerative ClusterParameter estimationWhat are the parameters for Naïve Bayes?What is Maximum Likelihood estimation for NB?What are the problems with ML estimates?General Naïve BayesA general naive Bayes model:We only specify how each feature depends on the classTotal number of parameters is linear in nCE1EnE2|C| parametersn x |E| x |C| parameters|C| x |E|n parametersEstimation: SmoothingProblems with maximum likelihood (relative frequency) estimates:If I flip a coin once, and it’s heads, what’s the estimate for P(heads)?What if I flip 10 times with 8 heads?What if I flip 10M times with 8M heads?Basic idea:We have some prior expectation about parameters (here, the probability of heads)Given little evidence, we should skew towards our priorGiven a lot of evidence, we should listen to the dataEstimation: Laplace SmoothingLaplace’s estimate (extended):Pretend you saw every outcome k extra timesWhat’s Laplace with k = 0?k is the strength of the priorLaplace for conditionals:Smooth each condition independently:H H TTypes of Supervised classifiersGenerative ModelsNaïve BayesDiscriminative ModelsPerceptronQuestionsWhat is a binary threshold perceptron?How can we make a multi-class perceptron?What sorts of patterns can perceptrons classify correctlyThe Binary PerceptronInputs are featuresEach feature has a weightSum is the activationIf the activation is:Positive, output 1Negative, output 0f1f2f3w1w2w3>0?The Multiclass PerceptronIf we have more than two classes:Have a weight vector for each classCalculate an activation for each classHighest activation winsLinear SeparatorsBinary classification can be viewed as the task of separating classes in feature space:w . x = 0w . x < 0w . x > 0Feature designCan we design features f1 and f2 to use a perceptron to separate the the two classes?MDP and Reinforcement LearningWhat is an MDP (Basics) ?What is Bellman’s equation and how is it used in value iteration?What is reinforcement learningTD-value learningQ learningExploration vs. exploitationMarkov Decision ProcessesMarkov decision processes (MDPs)A set of states s  SA model T(s,a,s’) = P(s’ | s,a)Probability that action a in state s leads to s’A reward function R(s, a, s’) (sometimes just R(s) for leaving a state or R(s’) for entering one)A start state (or distribution)Maybe a terminal stateMDPs are the simplest case of reinforcement learningIn general reinforcement learning, we don’t know the model or the reward functionBellman’s Equation for Selecting actions Definition of utility leads to a simple


Types of Supervised classifiers
- Generative models: Naïve Bayes
- Discriminative models: perceptron

Questions
- What is a binary threshold perceptron?
- How can we make a multi-class perceptron?
- What sorts of patterns can perceptrons classify correctly?

The Binary Perceptron
- Inputs are features f1, f2, f3, ...; each feature has a weight w1, w2, w3, ...
- The sum is the activation: activation(x) = Σi wi fi(x) = w · f(x)
- If the activation is positive, output 1; if it is negative, output 0

The Multiclass Perceptron
- If we have more than two classes:
  - Keep a weight vector wy for each class y
  - Calculate an activation for each class: wy · f(x)
  - The highest activation wins: y* = argmax_y wy · f(x)

Linear Separators
- Binary classification can be viewed as the task of separating the classes in feature space
- (Figure: decision boundary w · x = 0, with w · x > 0 on one side and w · x < 0 on the other)

Feature design
- Can we design features f1 and f2 so that a perceptron can separate the two classes?
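A minimal sketch of the multiclass perceptron: prediction is the highest activation wy · f(x) as on the slide, and training uses the standard mistake-driven update (add f(x) to the correct class's weights and subtract it from the wrongly predicted class's). The sparse feature dictionaries and toy labels below are invented for illustration.

```python
from collections import defaultdict

def dot(w, f):
    """Dot product of a weight dict and a sparse feature dict."""
    return sum(w[k] * v for k, v in f.items())

def train_perceptron(examples, classes, epochs=10):
    """Multiclass perceptron: one weight vector per class, mistake-driven updates."""
    weights = {y: defaultdict(float) for y in classes}
    for _ in range(epochs):
        for f, y in examples:
            # Predict the class with the highest activation w_y . f(x).
            y_hat = max(classes, key=lambda c: dot(weights[c], f))
            if y_hat != y:
                # Mistake: move weights toward the true class, away from the guess.
                for k, v in f.items():
                    weights[y][k] += v
                    weights[y_hat][k] -= v
    return weights

# Invented toy data: sparse feature dicts with labels.
examples = [({"f1": 1.0, "f2": 0.0}, "A"),
            ({"f1": 0.9, "f2": 0.2}, "A"),
            ({"f1": 0.1, "f2": 1.0}, "B"),
            ({"f1": 0.0, "f2": 0.8}, "B")]
weights = train_perceptron(examples, classes=["A", "B"])
print(max(["A", "B"], key=lambda c: dot(weights[c], {"f1": 1.0, "f2": 0.1})))  # expected: "A"
```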

MDP and Reinforcement Learning
- What is an MDP (the basics)?
- What is Bellman's equation, and how is it used in value iteration?
- What is reinforcement learning?
  - TD value learning
  - Q-learning
  - Exploration vs. exploitation

Markov Decision Processes
- A Markov decision process (MDP) has:
  - A set of states s ∈ S
  - A model T(s, a, s') = P(s' | s, a): the probability that action a in state s leads to s'
  - A reward function R(s, a, s') (sometimes just R(s) for leaving a state, or R(s') for entering one)
  - A start state (or distribution)
  - Maybe a terminal state
- MDPs are the simplest case of reinforcement learning
- In general reinforcement learning, we don't know the model or the reward function

Bellman's Equation for Selecting actions
- The definition of utility leads to a simple one-step lookahead relationship among optimal utility values
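Bellman's equation, V*(s) = max_a Σ_s' T(s, a, s') [R(s, a, s') + γ V*(s')], turns directly into the value iteration update. Below is a minimal sketch on an invented three-state chain MDP; the states, transition probabilities, rewards, and discount are illustrative only.

```python
# Invented toy MDP: states, transition model T(s, a, s'), and rewards R(s, a, s').
states = ["s0", "s1", "exit"]
actions = ["stay", "go"]

def T(s, a, s2):
    """Transition probabilities for the toy chain; 'exit' is terminal."""
    if s == "exit":
        return 0.0
    if a == "stay":
        return 1.0 if s2 == s else 0.0
    # 'go' moves one step right with prob 0.8, stays put with prob 0.2.
    nxt = {"s0": "s1", "s1": "exit"}[s]
    return {nxt: 0.8, s: 0.2}.get(s2, 0.0)

def R(s, a, s2):
    """Reward for entering the terminal state, small step cost otherwise."""
    return 10.0 if s2 == "exit" and s != "exit" else -0.1

def value_iteration(gamma=0.9, iterations=100):
    """Repeatedly apply the Bellman update V(s) <- max_a sum_s' T [R + gamma V(s')]."""
    V = {s: 0.0 for s in states}
    for _ in range(iterations):
        V = {s: 0.0 if s == "exit" else
                max(sum(T(s, a, s2) * (R(s, a, s2) + gamma * V[s2]) for s2 in states)
                    for a in actions)
             for s in V}
    return V

V = value_iteration()
# Greedy (one-step lookahead) policy extracted from the converged values.
policy = {s: max(actions, key=lambda a: sum(T(s, a, s2) * (R(s, a, s2) + 0.9 * V[s2])
                                            for s2 in states))
          for s in states if s != "exit"}
print(V, policy)
```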
