CS 188: Artificial Intelligence, Fall 2009
Lecture 18: Decision Diagrams
10/29/2009
Dan Klein – UC Berkeley

Announcements

- Mid-semester evaluations: link is in your email
- Assignments: W3 out later this week, P4 out next week
- Contest!

Decision Networks

- MEU: choose the action which maximizes the expected utility given the evidence
- Can directly operationalize this with decision networks: Bayes nets with nodes for utility and actions
- Lets us calculate the expected utility for each action
- New node types:
  - Chance nodes (just like BNs)
  - Action nodes (rectangles, cannot have parents, act as observed evidence)
  - Utility node (diamond, depends on action and chance nodes)

[Diagram: decision network with chance nodes Weather and Forecast, action node Umbrella, and utility node U]

[DEMO: Ghostbusters]

Decision Networks: Action Selection

- Instantiate all evidence
- Set action node(s) each possible way
- Calculate the posterior for all parents of the utility node, given the evidence
- Calculate the expected utility for each action
- Choose the maximizing action

Example: Decision Networks

W     P(W)
sun   0.7
rain  0.3

A      W     U(A,W)
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

EU(Umbrella = leave) = 0.7 · 100 + 0.3 · 0 = 70
EU(Umbrella = take)  = 0.7 · 20 + 0.3 · 70 = 35
Optimal decision = leave

Decisions as Outcome Trees

- Almost exactly like expectimax / MDPs
- What's changed?

[Tree: the root {} branches on the action (take / leave); each action branches on Weather (sun / rain), ending in the utilities U(t,s), U(t,r), U(l,s), U(l,r)]

Evidence in Decision Networks

- Find P(W | F=bad)
- Select for evidence
- First we join P(W) and P(bad | W), then we normalize

W     P(W)
sun   0.7
rain  0.3

F     P(F|sun)
good  0.8
bad   0.2

F     P(F|rain)
good  0.1
bad   0.9

Select the F=bad entries:

W     P(F=bad|W)
sun   0.2
rain  0.9

Join with P(W):

W     P(W, F=bad)
sun   0.14
rain  0.27

Normalize (divide by 0.41):

W     P(W | F=bad)
sun   0.34
rain  0.66

Example: Decision Networks (evidence: Forecast = bad)

W     P(W | F=bad)
sun   0.34
rain  0.66

A      W     U(A,W)
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

EU(Umbrella = leave | F=bad) = 0.34 · 100 + 0.66 · 0 = 34
EU(Umbrella = take | F=bad)  = 0.34 · 20 + 0.66 · 70 = 53
Optimal decision = take

Decisions as Outcome Trees (with evidence)

[Tree: same shape as before, but rooted at the evidence {b}; each Weather branch is now weighted by P(W | {b})]

Value of Information

- Idea: compute the value of acquiring evidence; can be done directly from the decision network
- Example: buying oil drilling rights
  - Two blocks A and B, exactly one has oil, worth k
  - You can drill in one location
  - Prior probabilities 0.5 each, and mutually exclusive
  - Drilling in either A or B has MEU = k/2
- Question: what's the value of information, i.e., of knowing which of A or B has oil?
  - Value is the expected gain in MEU from the new info
  - The survey may say "oil in a" or "oil in b", prob 0.5 each
  - If we know OilLoc, MEU is k (either way)
  - Gain in MEU from knowing OilLoc: VPI(OilLoc) = k − k/2 = k/2
  - Fair price of the information: k/2

D  O  U(D,O)
a  a  k
a  b  0
b  a  0
b  b  k

O  P(O)
a  1/2
b  1/2

[Diagram: decision network with chance node OilLoc, action node DrillLoc, and utility node U]
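The action-selection loop and the value-of-information idea above are easy to run end to end. Below is a minimal Python sketch for the umbrella network, using the tables shown earlier; it is not from the lecture, and all names (P_W, P_F_given_W, U, meu, vpi_forecast) are illustrative.

```python
# Minimal sketch (not from the lecture) of decision-network action
# selection and VPI for the umbrella example. All names are illustrative.

P_W = {'sun': 0.7, 'rain': 0.3}                    # prior over Weather
P_F_given_W = {'sun': {'good': 0.8, 'bad': 0.2},   # Forecast CPT P(F | W)
               'rain': {'good': 0.1, 'bad': 0.9}}
U = {('leave', 'sun'): 100, ('leave', 'rain'): 0,  # utility table U(A, W)
     ('take', 'sun'): 20, ('take', 'rain'): 70}
ACTIONS = ('leave', 'take')

def posterior(f):
    """P(W | F=f): join P(W) with P(f | W), then normalize."""
    joint = {w: P_W[w] * P_F_given_W[w][f] for w in P_W}
    z = sum(joint.values())
    return {w: p / z for w, p in joint.items()}

def meu(P):
    """Best action and its expected utility under a belief P over Weather."""
    eu = {a: sum(P[w] * U[a, w] for w in P) for a in ACTIONS}
    best = max(eu, key=eu.get)
    return best, eu[best]

def vpi_forecast():
    """Expected gain in MEU from observing the Forecast before acting."""
    p_f = {f: sum(P_W[w] * P_F_given_W[w][f] for w in P_W)
           for f in ('good', 'bad')}               # P(F): good 0.59, bad 0.41
    expected_meu = sum(p_f[f] * meu(posterior(f))[1] for f in p_f)
    return expected_meu - meu(P_W)[1]

print(meu(P_W))               # ('leave', 70.0): best action with no evidence
print(meu(posterior('bad')))  # ('take', ~52.9): the decision flips on F=bad
print(vpi_forecast())         # ~7.7: fair price of seeing the forecast first
```

The three printed quantities are exactly the ones the next two slides ask for: acting with no evidence is worth 70, a bad forecast drops the MEU to about 53 (and flips the decision to take), and seeing the forecast before acting is worth about 7.7 in expectation.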
Value of Information

- Assume we have evidence E = e. Value if we act now:

    MEU(e) = max_a Σ_s P(s | e) U(s, a)

- Assume we see that E' = e'. Value if we act then:

    MEU(e, e') = max_a Σ_s P(s | e, e') U(s, a)

- BUT E' is a random variable whose value is unknown, so we don't know what e' will be
- Expected value if E' is revealed and then we act:

    Σ_{e'} P(e' | e) MEU(e, e')

- Value of information: how much MEU goes up by revealing E' first:

    VPI(E' | e) = ( Σ_{e'} P(e' | e) MEU(e, e') ) − MEU(e)

VPI Example: Weather

A      W     U(A,W)
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

Forecast distribution:

F     P(F)
good  0.59
bad   0.41

- MEU with no evidence: 70
- MEU if forecast is bad: ≈ 52.9
- MEU if forecast is good: ≈ 94.9
- VPI(Forecast) = 0.59 · 94.9 + 0.41 · 52.9 − 70 ≈ 7.7

[Demo]

VPI Properties

- Nonnegative: VPI(E' | e) ≥ 0 for all E' and e
- Nonadditive: in general VPI(E_j, E_k | e) ≠ VPI(E_j | e) + VPI(E_k | e); consider, e.g., obtaining E_j twice (the second copy adds nothing new)
- Order-independent: observing E_j then E_k has the same total value as observing E_k then E_j

Quick VPI Questions

- The soup of the day is either clam chowder or split pea, but you wouldn't order either one. What's the value of knowing which it is?
- There are two kinds of plastic forks at a picnic. It must be that one is slightly better. What's the value of knowing which?
- You have $10 to bet double-or-nothing and there is a 75% chance that Berkeley will beat Stanford. What's the value of knowing the outcome in advance?
- You must bet on Cal, either way. What's the value now?

Reasoning over Time

- Often, we want to reason about a sequence of observations:
  - Speech recognition
  - Robot localization
  - User attention
  - Medical monitoring
- Need to introduce time into our models
- Basic approach: hidden Markov models (HMMs)
- More general: dynamic Bayes' nets

Markov Models

- A Markov model is a chain-structured BN: X1 → X2 → X3 → X4 → ...
- Each node is identically distributed (stationarity)
- The value of X at a given time is called the state
- Parameters: called transition probabilities or dynamics; they specify how the state evolves over time (also, the initial probabilities)

[DEMO: Ghostbusters]

Conditional Independence

- Basic conditional independence: past and future are independent given the present
- Each time step only depends on the previous
- This is called the (first-order) Markov property
- Note that the chain is just a (growing) BN; we can always use generic BN reasoning on it if we truncate the chain at a fixed length

Example: Markov Chain

- Weather states: X ∈ {rain, sun}
- Transitions (this is a CPT, not a BN!):

X_{t-1}  X_t   P(X_t | X_{t-1})
sun      sun   0.9
sun      rain  0.1
rain     sun   0.1
rain     rain  0.9

- Initial distribution: 1.0 sun
- What's the probability distribution after one step? P(X2 = sun) = 0.9, P(X2 = rain) = 0.1

Mini-Forward Algorithm

- Question: what is the probability of being in state x at time t?
- Slow answer: enumerate all sequences of length t which end in x, and add up their probabilities
- Better way: cached incremental belief updates, an instance of variable elimination!

    P(x_t) = Σ_{x_{t-1}} P(x_t | x_{t-1}) P(x_{t-1})

- Forward simulation

[Trellis diagram: sun/rain states unrolled over time, with the update applied step by step]

Example

- From an initial observation of sun: P(X1), P(X2), P(X3), ..., P(X∞)
- From an initial observation of rain: P(X1), P(X2), P(X3), ..., P(X∞)

[Bar charts: the beliefs converge to the same distribution from both starting points]

Stationary Distributions

- If we simulate the chain long enough, what happens? Uncertainty accumulates; eventually, we have no idea what the state is!
- Stationary distributions: for most chains, the distribution we end up in is independent of the initial distribution; it is called the stationary distribution of the chain
- Usually, we can only predict a short time out

[DEMO: Ghostbusters]

Web Link Analysis

- PageRank over a web graph
- Each web page is a state
- Initial distribution: uniform over pages
- Transitions:
  - With prob. c, uniform jump to a random page (dotted lines)
  - With prob. 1 − c, follow a random outlink
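To tie the last few slides together, here is a minimal Python sketch of the mini-forward update on the rain/sun chain, using the transition table above; it is not from the lecture, and the names (T, forward_step, belief) are illustrative.

```python
# Minimal sketch (not from the lecture) of the mini-forward algorithm on
# the rain/sun chain above: stay with prob 0.9, switch with prob 0.1.

T = {'sun': {'sun': 0.9, 'rain': 0.1},    # transition model P(X_t | X_{t-1})
     'rain': {'sun': 0.1, 'rain': 0.9}}

def forward_step(belief):
    """One cached update: P(x_t) = sum over x_{t-1} of P(x_t | x_{t-1}) P(x_{t-1})."""
    return {x: sum(T[prev][x] * belief[prev] for prev in belief) for x in T}

belief = {'sun': 1.0, 'rain': 0.0}   # initial distribution: 1.0 sun
belief = forward_step(belief)
print(belief)                        # after one step: {'sun': 0.9, 'rain': 0.1}

for _ in range(100):                 # simulate the chain long enough...
    belief = forward_step(belief)
print(belief)                        # ...and the belief approaches the
                                     # stationary distribution
                                     # {'sun': 0.5, 'rain': 0.5}, regardless
                                     # of the initial distribution
```

PageRank is this same computation run on the web graph's random-surfer transition model: the ranking it produces is the stationary distribution of that chain.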