CS 188: Artificial Intelligence
Fall 2009
Lecture 18: Decision Diagrams
10/29/2009
Dan Klein – UC Berkeley

Announcements
- Mid-semester evaluations: link is in your email
- Assignments: W3 out later this week; P4 out next week
- Contest!

Decision Networks
- MEU: choose the action which maximizes the expected utility given the evidence
- Can directly operationalize this with decision networks
  - Bayes nets with nodes for utility and actions
  - Lets us calculate the expected utility for each action
- New node types:
  - Chance nodes (just like BNs)
  - Action nodes (rectangles, cannot have parents, act as observed evidence)
  - Utility node (diamond, depends on action and chance nodes)
- Example network: Weather and Forecast are chance nodes, Umbrella is the action node, U is the utility node
[DEMO: Ghostbusters]

Decision Networks: Action Selection
- Instantiate all evidence
- Set the action node(s) each possible way
- Calculate the posterior for all parents of the utility node, given the evidence
- Calculate the expected utility for each action
- Choose the maximizing action

Example: Decision Networks

  W     P(W)        A      W     U(A,W)
  sun   0.7         leave  sun   100
  rain  0.3         leave  rain    0
                    take   sun    20
                    take   rain   70

- EU(Umbrella = leave) = 0.7 * 100 + 0.3 * 0  = 70
- EU(Umbrella = take)  = 0.7 * 20  + 0.3 * 70 = 35
- Optimal decision = leave

Decisions as Outcome Trees
- Almost exactly like expectimax / MDPs
- What's changed? The root {} is a choice between actions (take, leave); each action leads to a chance node over Weather (sun, rain), with leaves U(t,s), U(t,r), U(l,s), U(l,r)

Evidence in Decision Networks
- Find P(W | F=bad)
- Select for evidence: first we join P(W) and P(bad | W), then we normalize

  W     P(W)    F     P(F|sun)    F     P(F|rain)
  sun   0.7     good  0.8         good  0.1
  rain  0.3     bad   0.2         bad   0.9

  W     P(W, F=bad)    W     P(W | F=bad)
  sun   0.14           sun   0.34
  rain  0.27           rain  0.66
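The join-normalize-maximize procedure above can be mechanized in a few lines. This is a minimal sketch using the tables from these slides; the function names (`posterior_weather`, `best_action`) are illustrative, not from the course code.

```python
# Action selection in the umbrella decision network, using the
# numbers from the slides.  Function names are illustrative only.

P_W = {"sun": 0.7, "rain": 0.3}                    # prior on Weather
P_F_given_W = {"sun": {"good": 0.8, "bad": 0.2},   # Forecast CPT
               "rain": {"good": 0.1, "bad": 0.9}}
U = {("leave", "sun"): 100, ("leave", "rain"): 0,  # utility table
     ("take", "sun"): 20, ("take", "rain"): 70}

def posterior_weather(forecast=None):
    """P(W | F=forecast); with no evidence, just the prior."""
    if forecast is None:
        return dict(P_W)
    joint = {w: P_W[w] * P_F_given_W[w][forecast] for w in P_W}  # join
    z = sum(joint.values())
    return {w: joint[w] / z for w in joint}                      # normalize

def best_action(forecast=None):
    """Expected utility of each action under the posterior; pick the max."""
    post = posterior_weather(forecast)
    eu = {a: sum(post[w] * U[(a, w)] for w in post)
          for a in ("leave", "take")}
    return max(eu, key=eu.get), eu

print(best_action())       # no evidence: leave, EU 70 vs 35
print(best_action("bad"))  # bad forecast: take becomes optimal
```

Note how the posterior P(W | F=bad) ≈ (0.34, 0.66) flips the decision from leave to take, exactly as in the next slide.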
Example: Decision Networks (Forecast = bad)

  W     P(W|F=bad)    A      W     U(A,W)
  sun   0.34          leave  sun   100
  rain  0.66          leave  rain    0
                      take   sun    20
                      take   rain   70

- EU(Umbrella = leave) = 0.34 * 100 + 0.66 * 0  = 34
- EU(Umbrella = take)  = 0.34 * 20  + 0.66 * 70 = 53
- Optimal decision = take

Decisions as Outcome Trees
- Same tree as before, but the chance nodes are now conditioned on the evidence {b}: each action branches on W | {b}, with leaves U(t,s), U(t,r), U(l,s), U(l,r)

Value of Information
- Idea: compute the value of acquiring evidence
  - Can be done directly from the decision network
- Example: buying oil drilling rights
  - Two blocks A and B, exactly one has oil, worth k
  - You can drill in one location
  - Prior probabilities 0.5 each, and mutually exclusive
  - Drilling in either A or B has MEU = k/2
- Question: what's the value of information?
  - Value of knowing which of A or B has oil
  - Value is the expected gain in MEU from the new info
  - Survey may say "oil in a" or "oil in b", prob 0.5 each
  - If we know OilLoc, MEU is k (either way)
  - Gain in MEU from knowing OilLoc: VPI(OilLoc) = k/2
  - Fair price of the information: k/2

  O    P(O)     D   O   U
  a    1/2      a   a   k
  b    1/2      a   b   0
                b   a   0
                b   b   k

Value of Information
- Assume we have evidence E=e. Value if we act now:
    MEU(e) = max_a Σ_s P(s | e) U(s, a)
- Assume we see that E' = e'. Value if we act then:
    MEU(e, e') = max_a Σ_s P(s | e, e') U(s, a)
- BUT E' is a random variable whose value is unknown, so we don't know what e' will be
- Expected value if E' is revealed and then we act:
    MEU(e | E') = Σ_{e'} P(e' | e) MEU(e, e')
- Value of information: how much MEU goes up by revealing E' first:
    VPI(E' | e) = MEU(e | E') - MEU(e)

VPI Example: Weather

  A      W     U(A,W)     F     P(F)
  leave  sun   100        good  0.59
  leave  rain    0        bad   0.41
  take   sun    20
  take   rain   70

- MEU with no evidence
- MEU if forecast is bad
- MEU if forecast is good
- Forecast distribution: P(F=good) = 0.59, P(F=bad) = 0.41
[Demo]

VPI Properties
- Nonnegative: VPI(E' | e) ≥ 0
- Nonadditive: consider, e.g., obtaining E_j twice
- Order-independent

Quick VPI Questions
- The soup of the day is either clam chowder or split pea, but you wouldn't order either one. What's the value of knowing which it is?
- There are two kinds of plastic forks at a picnic. It must be that one is slightly better. What's the value of knowing which?
- You have $10 to bet double-or-nothing and there is a 75% chance that Berkeley will beat Stanford.
  What's the value of knowing the outcome in advance?
- You must bet on Cal, either way. What's the value now?

Reasoning over Time
- Often, we want to reason about a sequence of observations
  - Speech recognition
  - Robot localization
  - User attention
  - Medical monitoring
- Need to introduce time into our models
- Basic approach: hidden Markov models (HMMs)
- More general: dynamic Bayes' nets

Markov Models
- A Markov model is a chain-structured BN: X1 → X2 → X3 → X4 → ...
- Each node is identically distributed (stationarity)
- Value of X at a given time is called the state
- Parameters: called transition probabilities or dynamics, specify how the state evolves over time (also, initial probs)
[DEMO: Ghostbusters]

Conditional Independence
- Basic conditional independence:
  - Past and future are independent given the present
  - Each time step only depends on the previous
  - This is called the (first order) Markov property
- Note that the chain is just a (growing) BN
  - We can always use generic BN reasoning on it if we truncate the chain at a fixed length

Example: Markov Chain
- Weather: states X = {rain, sun}
- Initial distribution: 1.0 sun
- Transitions (this is a CPT, not a BN!):

  X_{t-1}  X_t   P(X_t | X_{t-1})
  sun      sun   0.9
  sun      rain  0.1
  rain     sun   0.1
  rain     rain  0.9

- What's the probability distribution after one step? P(X2 = sun) = 0.9, P(X2 = rain) = 0.1

Mini-Forward Algorithm
- Question: probability of being in state x at time t?
- Slow answer: enumerate all sequences of length t which end in x, and add up their probabilities:
    P(x_t) = Σ_{x_1, ..., x_{t-1}} P(x_1, ..., x_{t-1}, x_t)

Mini-Forward Algorithm
- Better way: cached incremental belief updates
- An instance of variable elimination!
    P(x_t) = Σ_{x_{t-1}} P(x_t | x_{t-1}) P(x_{t-1})
- Forward simulation

Example
- From initial observation of sun:
    P(X1) = (1.0 sun, 0.0 rain), P(X2) = (0.9, 0.1), P(X3) = (0.82, 0.18), ..., P(X∞) = (0.5, 0.5)
- From initial observation of rain:
    P(X1) = (0.0 sun, 1.0 rain), P(X2) = (0.1, 0.9), P(X3) = (0.18, 0.82), ..., P(X∞) = (0.5, 0.5)

Stationary Distributions
- If we simulate the chain long enough:
  - What happens? Uncertainty accumulates
  - Eventually, we have no idea what the state is!
- Stationary distributions:
  - For most chains, the distribution we end up in is independent of the initial distribution
  - Called the stationary distribution of the chain
  - Usually, we can only predict a short time out
[DEMO: Ghostbusters]
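The forward updates and the convergence above can be reproduced in a few lines. This is a minimal sketch of the mini-forward update, assuming the 0.9 / 0.1 transition model from the Markov-chain slide; the helper names are my own.

```python
# Mini-forward algorithm: P(x_t) = sum_{x_{t-1}} P(x_t | x_{t-1}) P(x_{t-1}).
# Transition model from the slide: stay with prob 0.9, switch with prob 0.1.

T = {"sun": {"sun": 0.9, "rain": 0.1},
     "rain": {"sun": 0.1, "rain": 0.9}}

def step(belief):
    """One cached belief update (sums out the previous state)."""
    return {x2: sum(T[x1][x2] * belief[x1] for x1 in belief) for x2 in T}

def forward(belief, t):
    """Push an initial belief through t transitions."""
    for _ in range(t):
        belief = step(belief)
    return belief

print(forward({"sun": 1.0, "rain": 0.0}, 1))    # ≈ {'sun': 0.9, 'rain': 0.1}
print(forward({"sun": 1.0, "rain": 0.0}, 2))    # ≈ {'sun': 0.82, 'rain': 0.18}

# Uncertainty accumulates: both starting points approach the
# stationary distribution (0.5, 0.5) of this symmetric chain.
print(forward({"sun": 1.0, "rain": 0.0}, 100))
print(forward({"sun": 0.0, "rain": 1.0}, 100))
```

Running the update from both initial observations shows the stationary-distribution claim directly: after enough steps the two beliefs are indistinguishable.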
Web Link Analysis
- PageRank over a web graph
  - Each web page is a state
  - Initial distribution: uniform over pages
  - Transitions:
    - With prob. c, uniform jump to a random page (dotted lines)
    - With prob. 1 - c, follow a random outgoing link from the current page
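The random-surfer chain just described can be simulated with the same forward updates. This is a sketch only: the three-page graph and the value c = 0.15 are made-up illustrations, not from the slides.

```python
# PageRank as the stationary distribution of the random-surfer chain:
# with prob c, jump to a uniformly random page; with prob 1 - c, follow
# a uniformly random outgoing link.  Graph and c are made-up examples.

links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}  # page -> outgoing links
c = 0.15
n = len(links)

rank = {p: 1.0 / n for p in links}                 # uniform initial distribution
for _ in range(100):                               # iterate the belief update
    new = {p: c / n for p in links}                # random-jump mass
    for p, out in links.items():
        for q in out:
            new[q] += (1 - c) * rank[p] / len(out) # follow-a-link mass
    rank = new

print({p: round(r, 3) for p, r in rank.items()})
```

Page C, which has the most incoming mass in this toy graph, ends up with the highest rank; the random jump guarantees every page keeps some probability.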