CS 188: Artificial Intelligence
Fall 2007
Lecture 19: Decision Diagrams
11/01/2007
Dan Klein – UC Berkeley

Recap: Inference Example
- Find P(W | F=bad)
- Restrict all factors
- No hidden vars to eliminate (this time!)
- Just join and normalize
- P(W): sun 0.7, rain 0.3
- P(F | W=rain): good 0.1, bad 0.9
- P(F | W=sun): good 0.8, bad 0.2
- P(F=bad | W): sun 0.2, rain 0.9
- P(W, F=bad): sun 0.14, rain 0.27
- P(W | F=bad): sun 0.34, rain 0.66

Decision Networks
- MEU: choose the action which maximizes the expected utility given the evidence
- Can directly operationalize this with decision diagrams
  - Bayes nets with nodes for utility and actions
  - Lets us calculate the expected utility for each action
- New node types:
  - Chance nodes (just like BNs)
  - Actions (rectangles, must be parents, act as observed evidence)
  - Utilities (depend on action and chance nodes)
- [Diagram: Weather → Forecast; action node Umbrella; utility U with parents Umbrella and Weather]

Decision Networks
- Action selection:
  - Instantiate all evidence
  - Calculate posterior joint over parents of utility node
  - Set action node each possible way
  - Calculate expected utility for each action
  - Choose maximizing action

Example: Decision Networks
- P(W): sun 0.7, rain 0.3
- U(A, W): take/rain 70, take/sun 20, leave/rain 0, leave/sun 100
- Compare Umbrella = leave vs. Umbrella = take
- Optimal decision = leave
- With evidence Forecast = bad: P(W | F=bad): sun 0.34, rain 0.66
- Again compare Umbrella = leave vs. Umbrella = take
- Optimal decision = take

Value of Information
- Idea: compute value of acquiring each possible piece of evidence
  - Can be done directly from decision network
- Example: buying oil drilling rights
  - Two blocks A and B, exactly one has oil, worth k
  - Prior probabilities 0.5 each, mutually exclusive
  - Current price of each block is k/2
  - Probe gives accurate survey of A. Fair price?
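The action-selection procedure and the value-of-information idea can both be sketched for the umbrella network. This is an illustrative sketch, not lecture code; all probabilities and utilities are the slide numbers, and the function names are made up for the example.

```python
# Illustrative sketch: action selection and value of information for the
# umbrella decision network, using the numbers from the slides.

ACTIONS = ("take", "leave")
WEATHER = ("sun", "rain")
PRIOR = {"sun": 0.7, "rain": 0.3}                 # P(W)
P_F = {"sun": {"good": 0.8, "bad": 0.2},          # P(F | W)
       "rain": {"good": 0.1, "bad": 0.9}}
U = {("take", "sun"): 20, ("take", "rain"): 70,   # U(A, W)
     ("leave", "sun"): 100, ("leave", "rain"): 0}

def posterior(forecast):
    """Join P(W) with P(F=f | W), then normalize -> P(W | F=f)."""
    joint = {w: PRIOR[w] * P_F[w][forecast] for w in WEATHER}
    z = sum(joint.values())
    return {w: p / z for w, p in joint.items()}

def expected_utility(action, belief):
    return sum(belief[w] * U[(action, w)] for w in WEATHER)

def best_action(belief):
    """Set the action node each possible way, pick the maximizing action."""
    return max(ACTIONS, key=lambda a: expected_utility(a, belief))

def meu(belief):
    return max(expected_utility(a, belief) for a in ACTIONS)

print(best_action(PRIOR))             # no evidence -> "leave"
print(best_action(posterior("bad")))  # forecast is bad -> "take"

# Value of (perfect) information about the forecast:
# VPI(F) = sum_f P(f) * MEU(F=f)  -  MEU(no evidence)
p_f = {f: sum(PRIOR[w] * P_F[w][f] for w in WEATHER) for f in ("good", "bad")}
vpi = sum(p_f[f] * meu(posterior(f)) for f in p_f) - meu(PRIOR)
print(round(vpi, 2))                  # what observing the forecast is worth
```

Note how the posterior P(W | F=bad) reproduces the 0.34/0.66 join-and-normalize result from the recap slide, and how observing a bad forecast flips the optimal action from leave to take.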
- Solution: compute value of information
  = expected value of best action given the information
    minus expected value of best action without information
- Survey may say "oil in A" or "no oil in A", prob 0.5 each
  = [0.5 × value of "buy A" given "oil in A"] + [0.5 × value of "buy B" given "no oil in A"] − 0
  = [0.5 × k/2] + [0.5 × k/2] − 0 = k/2

Value of Information
- Current evidence E=e; utility depends on S=s:
  MEU(e) = max_a Σ_s P(s | e) U(s, a)
- Potential new evidence E′: suppose we knew E′ = e′:
  MEU(e, e′) = max_a Σ_s P(s | e, e′) U(s, a)
- BUT E′ is a random variable whose value is currently unknown, so:
- Must compute expected gain over all possible values:
  VPI(E′ | e) = ( Σ_e′ P(e′ | e) MEU(e, e′) ) − MEU(e)
  (VPI = value of perfect information)

VPI Example
- Umbrella network; U(A, W): take/rain 70, take/sun 20, leave/rain 0, leave/sun 100
- Compute: MEU with no evidence, MEU if forecast is bad, MEU if forecast is good
- Forecast distribution P(F): good 0.59, bad 0.41

VPI Properties
- Nonnegative in expectation
- Nonadditive (consider, e.g., obtaining E_j twice)
- Order-independent

VPI Scenarios
- Imagine actions 1 and 2, for which U_1 > U_2
- How much will information about E_j be worth?
  - Little: we're sure action 1 is better
  - Little: info likely to change our action but not our utility
  - A lot: either action could be much better

Reasoning over Time
- Often, we want to reason about a sequence of observations
  - Speech recognition
  - Robot localization
  - User attention
  - Medical monitoring
- Need to introduce time into our models
- Basic approach: hidden Markov models (HMMs)
- More general: dynamic Bayes' nets

Markov Models
- A Markov model is a chain-structured BN
  - Each node is identically distributed (stationarity)
  - Value of X at a given time is called the state
  - As a BN: X1 → X2 → X3 → X4 → …
- Parameters: called transition probabilities or dynamics; they specify how the state evolves over time (also, initial probs)

Conditional Independence
- Basic conditional independence:
  - Past and future independent given the present
  - Each time step only depends on the previous
  - This is called the (first order) Markov property
- Note that the chain is just a (growing) BN
  - We can always use generic BN reasoning on it (if we truncate the chain)

Example: Markov Chain
- Weather:
  - States: X = {rain, sun}
  - Transitions: P(sun | sun) = 0.9, P(rain | sun) = 0.1, P(rain | rain) = 0.9, P(sun | rain) = 0.1
    (this is a CPT, not a BN!)
  - Initial distribution: 1.0 sun
- What's the probability distribution after one step?

Mini-Forward Algorithm
- Question: probability of being in state x at time t?
- Slow answer:
  - Enumerate all sequences of length t which end in x
  - Add up their probabilities
- Better way: cached incremental belief updates (forward simulation):
  P(x_t) = Σ_{x_{t−1}} P(x_t | x_{t−1}) P(x_{t−1})

Example
- From initial observation of sun: track P(X1), P(X2), P(X3), …, P(X∞)
- From initial observation of rain: track P(X1), P(X2), P(X3), …, P(X∞)

Stationary Distributions
- If we simulate the chain long enough:
  - What happens? Uncertainty accumulates
  - Eventually, we have no idea what the state is!
- Stationary distributions:
  - For most chains, the distribution we end up in is independent of the initial distribution
  - Called the stationary distribution of the chain
- Usually, can only predict a short time out

Web Link Analysis
- PageRank over a web graph
  - Each web page is a state
  - Initial distribution: uniform over pages
  - Transitions:
    - With prob. c, uniform jump to a random page (dotted lines)
    - With prob. 1−c, follow a random outlink (solid lines)
- Stationary distribution
  - Will spend more time on highly reachable pages
  - E.g. many ways to get to the Acrobat Reader download page!
  - Somewhat robust to link spam
  - Google 1.0 returned the set of pages containing all your keywords in decreasing rank; now all search engines use link analysis along with many other factors

Most Likely Explanation
- Question: most likely sequence ending in x at time t?
- E.g. if sun on day 4, what's the most likely sequence?
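Before turning to most-likely sequences, the mini-forward update and the drift toward the stationary distribution can be sketched in a few lines. An illustrative sketch (not lecture code) using the symmetric 0.9/0.1 weather chain from the slides:

```python
# Mini-forward algorithm on the weather chain:
# new P(x) = sum over x' of P(x | x') * old P(x').

T = {"sun": {"sun": 0.9, "rain": 0.1},   # T[prev][next] = P(next | prev)
     "rain": {"rain": 0.9, "sun": 0.1}}

def forward_step(belief):
    """One cached incremental belief update."""
    return {x: sum(T[xp][x] * belief[xp] for xp in belief)
            for x in ("sun", "rain")}

belief = {"sun": 1.0, "rain": 0.0}       # initial distribution: 1.0 sun
for t in range(1, 101):
    belief = forward_step(belief)
    if t in (1, 2, 100):
        print(t, {x: round(p, 3) for x, p in belief.items()})
# After one step: sun 0.9, rain 0.1; by t = 100 the belief has converged
# to (0.5, 0.5): the stationary distribution of this chain.
```

Starting from rain instead yields the same long-run belief, which is why a Markov chain on its own can only predict a short time out.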
- Intuitively: probably sun all four days
- Slow answer: enumerate all sequences and score them

Mini-Viterbi Algorithm
- Better answer: cached incremental updates
- Define m_t[x] = probability of the best sequence ending in state x at time t:
  m_1[x] = P(x_1 = x)
  m_t[x] = max_{x_{t−1}} P(x_t = x | x_{t−1}) m_{t−1}[x_{t−1}]
  a_t[x] = the maximizing x_{t−1} (a backpointer)
- Read best sequence off of the m and a vectors

[Worked example slides: Viterbi trellis over {sun, rain}]

Hidden Markov Models
- Markov chains not so useful for most agents
  - Eventually you don't know anything anymore
  - Need observations to update your beliefs
- Hidden Markov models (HMMs)
  - Underlying Markov chain over states S
  - You observe outputs (effects) at each time step
  - As a Bayes' net: X1 → X2 → … → X5, with each E_t depending only on X_t

Example
- An HMM is defined by:
  - Initial distribution: P(X_1)
  - Transitions: P(X_t | X_{t−1})
  - Emissions: P(E_t | X_t)

Conditional Independence
- HMMs have two important independence properties:
  - Markov hidden process: future depends on the past only via the present
  - Current observation independent of all else given the current state
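The Mini-Viterbi recurrence above can be sketched directly. An illustrative sketch (the lecture gives no code), using the same symmetric 0.9/0.1 weather chain with initial distribution 1.0 sun; it confirms the intuition that "sun on day 4" is best explained by sun on all four days:

```python
# Mini-Viterbi on the weather chain: m_t[x] = max_{x'} P(x | x') * m_{t-1}[x'],
# with backpointers a_t[x] used to read off the best sequence.

T = {"sun": {"sun": 0.9, "rain": 0.1},
     "rain": {"rain": 0.9, "sun": 0.1}}
STATES = ("sun", "rain")

def most_likely_sequence(initial, t_final, end_state):
    """Most likely state sequence of length t_final ending in end_state."""
    m = dict(initial)                  # m_1[x] = P(X_1 = x)
    backptr = []                       # backptr[t][x] = best predecessor of x
    for _ in range(t_final - 1):
        a = {x: max(STATES, key=lambda xp: T[xp][x] * m[xp]) for x in STATES}
        m = {x: T[a[x]][x] * m[a[x]] for x in STATES}
        backptr.append(a)
    seq = [end_state]                  # walk the backpointers in reverse
    for a in reversed(backptr):
        seq.append(a[seq[-1]])
    seq.reverse()
    return seq, m[end_state]

path, prob = most_likely_sequence({"sun": 1.0, "rain": 0.0}, 4, "sun")
print(path, prob)   # best path and its probability (= 0.9 ** 3)
```

Extending this to a full HMM only requires multiplying each m_t[x] by the emission probability P(e_t | x) for the observation at that step.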