CS 188: Artificial Intelligence
Fall 2006
Lecture 18: Decision Diagrams
10/31/2006
Dan Klein – UC Berkeley

Announcements

- Optional midterm
  - On Tuesday 11/21, in class, unless there are conflicts
  - We will count the midterms and final as 1 / 1 / 2 and drop the lowest (or halve the final's weight)
  - We will also run grades as if this midterm did not happen
  - You will get the better of the two grades
- Projects
  - 3.2 up today; grades on previous projects ASAP
  - Total of 3 checkpoints left (including 3.2)

Recap: Sampling

- Exact inference can be slow
- Basic idea: sampling
  - Draw N samples from a sampling distribution S
  - Compute an approximate posterior probability
  - Show this converges to the true probability P
- Benefits:
  - Easy to implement
  - You can get an (approximate) answer fast
- Last time:
  - Rejection sampling: reject samples that disagree with the evidence
  - Likelihood weighting: use the evidence to weight samples

Likelihood Weighting

- Problem with rejection sampling:
  - If the evidence is unlikely, you reject a lot of samples
  - You don't exploit your evidence as you sample
  - Consider estimating P(B | A=true) in a two-node Burglary -> Alarm network: nearly all samples have A=false and get thrown away
- Idea: fix the evidence variables and sample the rest
- Problem: the sample distribution is then not consistent!
- Solution: weight each sample by the probability of the evidence given its parents

Likelihood Sampling

(Diagrams: two copies of the Cloudy -> {Sprinkler, Rain} -> WetGrass network, illustrating sampling with the evidence variables fixed.)

Likelihood Weighting

- Sampling distribution, if z is sampled and e is fixed evidence:
  S_WE(z, e) = Π_i P(z_i | Parents(Z_i))
- Now, samples have weights:
  w(z, e) = Π_j P(e_j | Parents(E_j))
- Together, the weighted sampling distribution is consistent:
  S_WE(z, e) · w(z, e) = Π_i P(z_i | Parents(Z_i)) · Π_j P(e_j | Parents(E_j)) = P(z, e)

Likelihood Weighting

- Note that likelihood weighting doesn't solve all our problems
  - Rare evidence is taken into account for downstream variables, but not upstream ones
- A better (and more advanced) solution is Markov chain Monte Carlo (MCMC)
- We'll return to sampling for robot localization and tracking in dynamic BNs
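To make the weighting rule concrete, here is a minimal Python sketch of likelihood weighting on the Cloudy/Sprinkler/Rain/WetGrass network shown above. The slides give only the network structure, so the CPT values below are the standard textbook numbers for this example, used purely for illustration.

```python
import random

# Structure from the slides; these CPT values are the usual textbook numbers
# for this network, used here only for illustration.
P_C = 0.5                                        # P(Cloudy=true)
P_S = {True: 0.1, False: 0.5}                    # P(Sprinkler=true | Cloudy)
P_R = {True: 0.8, False: 0.2}                    # P(Rain=true | Cloudy)
P_W = {(True, True): 0.99, (True, False): 0.90,  # P(WetGrass=true | S, R)
       (False, True): 0.90, (False, False): 0.0}

def set_var(sample, evidence, name, p_true):
    """Sample the variable if free; if it is evidence, fix it and return the
    likelihood of the observed value given the already-sampled parents."""
    if name in evidence:
        sample[name] = evidence[name]
        return p_true if evidence[name] else 1.0 - p_true
    sample[name] = random.random() < p_true
    return 1.0

def weighted_sample(evidence):
    """One pass through the network in topological order."""
    sample, w = {}, 1.0
    w *= set_var(sample, evidence, 'Cloudy', P_C)
    w *= set_var(sample, evidence, 'Sprinkler', P_S[sample['Cloudy']])
    w *= set_var(sample, evidence, 'Rain', P_R[sample['Cloudy']])
    w *= set_var(sample, evidence, 'WetGrass',
                 P_W[(sample['Sprinkler'], sample['Rain'])])
    return sample, w

def likelihood_weighting(query, evidence, n=100000):
    """Estimate P(query=true | evidence) as a weighted sample average."""
    hit = total = 0.0
    for _ in range(n):
        sample, w = weighted_sample(evidence)
        total += w
        if sample[query]:
            hit += w
    return hit / total

# With the CPTs above, P(Rain=true | WetGrass=true) should come out near 0.7.
print(likelihood_weighting('Rain', {'WetGrass': True}))
```

This is exactly the fix described on the slides: evidence variables are never sampled; instead, each sample carries the likelihood of the evidence given its parents as a weight.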
Decision Networks

- MEU: choose the action which maximizes the expected utility given the evidence
- We can directly operationalize this with decision diagrams
  - Bayes nets with nodes for utility and actions
  - Lets us calculate the expected utility for each action
- New node types:
  - Chance nodes (just like BNs)
  - Actions (rectangles; must be parents, and act as observed evidence)
  - Utilities (depend on action and chance nodes)

(Diagram: Weather -> Report, with action node Umbrella and chance node Weather feeding the utility node U.)

Decision Networks

- Action selection:
  - Instantiate all evidence
  - Calculate the posterior over the parents of the utility node
  - Set the action node each possible way
  - Calculate the expected utility for each action
  - Choose the maximizing action

Example: Decision Networks

- Weather and Umbrella only (no report yet):

  W     P(W)
  sun   0.7
  rain  0.3

  A      W     U(A, W)
  leave  sun   100
  leave  rain  0
  take   sun   20
  take   rain  70

Example: Decision Networks

- The same network, with a Report observed before acting; P(W) and U(A, W) are as above:

  R       P(R | sun)   P(R | rain)
  clear   0.5          0.2
  cloudy  0.5          0.8

Value of Information

- Idea: compute the value of acquiring each possible piece of evidence
  - Can be done directly from the decision network
- Example: buying oil drilling rights
  - Two blocks, A and B; exactly one has oil, worth k
  - Prior probabilities: 0.5 each, mutually exclusive
  - Current price of each block is k/2
  - A probe gives an accurate survey of A. Fair price for the probe?
- Solution: compute the value of information
  = expected value of the best action given the information, minus the expected value of the best action without the information
  - Without the information, buying either block nets 0.5 · k - k/2 = 0 in expectation
  - The survey may say "oil in A" or "no oil in A", with probability 0.5 each
  - VPI = [0.5 · value of "buy A" given "oil in A"] + [0.5 · value of "buy B" given "no oil in A"] - 0
  -     = [0.5 · k/2] + [0.5 · k/2] - 0 = k/2

(Diagram: chance node OilLoc and action node DrillLoc feeding U.)

General Formula

- Current evidence E = e, possible utility inputs s
- Potential new evidence E': suppose we knew E' = e'; the best attainable utility would be
  MEU(e, e') = max_a Σ_s P(s | e, e') U(s, a)
- BUT E' is a random variable whose value is currently unknown, so we must compute the expected gain over all its possible values:
  VPI_e(E') = ( Σ_{e'} P(e' | e) · MEU(e, e') ) - MEU(e)
- (VPI = value of perfect information)

VPI Properties

- Nonnegative in expectation: VPI_e(E') >= 0 for any E'
- Nonadditive: consider, e.g., obtaining E_j twice; the second observation adds nothing, so VPI_e(E_j, E_j) = VPI_e(E_j)
- Order-independent: VPI_e(E_j, E_k) = VPI_e(E_j) + VPI_{e, e_j}(E_k) = VPI_e(E_k) + VPI_{e, e_k}(E_j)

VPI Example

(Diagram: the umbrella network with Weather, Report, Umbrella, and U; the quantity of interest is VPI(Report). A worked computation appears after the last slide below.)

VPI Scenarios

- Imagine two actions, 1 and 2, with expected utilities U1 > U2
- How much will information about E_j be worth?
  - Little: we're already sure that action 1 is better
  - Little: the info is likely to change our action, but not our utility by much
  - A lot: either action could turn out to be much better than the other

Reasoning over Time

- Often, we want to reason about a sequence of observations
  - Speech recognition
  - Robot localization
  - User attention
- Need to introduce time into our models
- Basic approach: hidden Markov models (HMMs)
- More general: dynamic Bayes nets

Markov Models

- A Markov model is a chain-structured BN: X1 -> X2 -> X3 -> X4 -> ...
  - Each node is identically distributed (stationarity)
  - The value of X at a given time is called the state
- Parameters: the transition probabilities, which specify how the state evolves over time (plus the initial-state probabilities)

Conditional Independence

- Basic conditional independence:
  - Past and future are independent given the present
  - Each time step only depends on the previous one
  - This is called the (first-order) Markov property

Example

- Weather:
  - States: X = {rain, sun}
  - Transitions (this is a CPT, not a BN!):

    X_t    P(X_t+1 = sun | X_t)   P(X_t+1 = rain | X_t)
    sun    0.9                    0.1
    rain   0.1                    0.9

  - Initial distribution: 1.0 sun
  - What's the probability distribution after one step? P(X_2) = <0.9 sun, 0.1 rain>

Mini-Forward Algorithm

- Question: what is the probability of being in state x at time t?
- Slow answer:
  - Enumerate all sequences of length t which end in x
  - Add up their probabilities
- Better answer: cached incremental belief update, i.e. forward simulation:
  P(x_t) = Σ_{x_{t-1}} P(x_t | x_{t-1}) · P(x_{t-1})

Example

(Charts: the beliefs P(X_1), P(X_2), P(X_3), ..., P(X_inf), once starting from an initial observation of sun and once from an initial observation of rain.)

Stationary Distributions

- If we simulate the chain long enough:
  - What happens? Uncertainty accumulates
  - Eventually, we have no idea what the state is!
- Stationary distributions:
  - For most chains, the distribution we end up in is independent of the initial distribution
  - It is called the stationary distribution of the chain
- Usually, we can only predict a short time out

Web Link Analysis

- PageRank over a web graph
  - Each web page is a state
  - Initial distribution: uniform over pages
  - Transitions: with probability c, jump to a uniformly random page; otherwise, follow a random outlink from the current page
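The remaining sketches work the examples above in Python. First, MEU action selection for the umbrella network, following the five-step recipe on the action-selection slide; the tables are exactly those from the example slide, while the helper names (expected_utility, best_action) are ours.

```python
# Umbrella decision network from the example slide: chance node Weather,
# action node Umbrella, utility U(A, W).  Tables are from the slide.
P_W = {'sun': 0.7, 'rain': 0.3}
U = {('leave', 'sun'): 100, ('leave', 'rain'): 0,
     ('take',  'sun'): 20,  ('take',  'rain'): 70}
ACTIONS = ['leave', 'take']

def expected_utility(action, p_weather):
    """EU(a) = sum_w P(w) * U(a, w) under the given belief about the weather."""
    return sum(p * U[(action, w)] for w, p in p_weather.items())

def best_action(p_weather):
    """Set the action node each way, compute EU, keep the maximizer."""
    return max(ACTIONS, key=lambda a: expected_utility(a, p_weather))

for a in ACTIONS:
    print(a, expected_utility(a, P_W))    # leave -> 70, take -> 35
print('MEU action:', best_action(P_W))    # leave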
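Next, the VPI formula from the General Formula slide, applied to the umbrella network with the report CPTs from the example slide; this is the computation the VPI Example slide sets up. Note that with these particular tables, leave stays optimal under either report value, so VPI(Report) comes out to 0: the first of the VPI Scenarios, where the information cannot change our action.

```python
# VPI of the weather report.  All numbers are from the example slide.
P_W = {'sun': 0.7, 'rain': 0.3}
P_R = {'sun':  {'clear': 0.5, 'cloudy': 0.5},    # P(R | Weather)
       'rain': {'clear': 0.2, 'cloudy': 0.8}}
U = {('leave', 'sun'): 100, ('leave', 'rain'): 0,
     ('take',  'sun'): 20,  ('take',  'rain'): 70}

def meu(p_weather):
    """MEU(belief) = max_a sum_w P(w) * U(a, w)."""
    return max(sum(p * U[(a, w)] for w, p in p_weather.items())
               for a in ['leave', 'take'])

def posterior(report):
    """Bayes' rule: returns (P(W | R=report), P(R=report))."""
    joint = {w: P_W[w] * P_R[w][report] for w in P_W}
    z = sum(joint.values())
    return {w: joint[w] / z for w in joint}, z

# VPI_e(R) = ( sum_r P(r) * MEU(given r) ) - MEU(no evidence)
vpi = sum(p_r * meu(post)
          for post, p_r in map(posterior, ['clear', 'cloudy'])) - meu(P_W)
print(round(vpi, 6))    # 0.0: the report never flips the decision here
```

The oil example works the same way: there the survey does flip the choice of block, and the sum gives 0.5 · k/2 + 0.5 · k/2 - 0 = k/2, matching the slide.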
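A sketch of the mini-forward update on the weather chain, using the transition CPT from the Example slide (stay with probability 0.9, switch with probability 0.1).

```python
# Mini-forward algorithm: cached incremental belief update
#   P(x_t) = sum_{x_{t-1}} P(x_t | x_{t-1}) * P(x_{t-1})
T = {'sun':  {'sun': 0.9, 'rain': 0.1},     # T[prev][next], from the slide
     'rain': {'sun': 0.1, 'rain': 0.9}}

def forward_step(belief):
    """Push the belief one time step through the transition model."""
    return {x: sum(belief[prev] * T[prev][x] for prev in belief)
            for x in belief}

belief = {'sun': 1.0, 'rain': 0.0}          # initial distribution: 1.0 sun
for t in range(1, 51):
    belief = forward_step(belief)
    if t in (1, 2, 3, 50):
        print(t, belief)
# t=1 gives {'sun': 0.9, 'rain': 0.1}, answering the one-step question on the
# Example slide; by t=50 the belief is essentially (0.5, 0.5) no matter which
# state we started in: the stationary distribution of this symmetric chain.
```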
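Finally, a power-iteration sketch of the random-surfer chain from the last slide; PageRank is its stationary distribution. The four-page link graph and the jump probability c = 0.15 are illustrative assumptions, not values from the lecture, and every page is assumed to have at least one outlink.

```python
# PageRank by forward simulation of the random-surfer Markov chain:
# with probability c the surfer jumps to a uniformly random page,
# otherwise she follows a uniformly random outlink of the current page.
links = {'a': ['b', 'c'], 'b': ['c'], 'c': ['a'], 'd': ['c']}   # toy graph
pages = sorted(links)
c = 0.15                                      # jump probability (assumed)

rank = {p: 1.0 / len(pages) for p in pages}   # initial distribution: uniform
for _ in range(100):                          # enough iterations to converge
    new = {p: c / len(pages) for p in pages}  # mass from random jumps
    for p in pages:
        for q in links[p]:                    # mass from following outlinks
            new[q] += (1 - c) * rank[p] / len(links[p])
    rank = new
print(rank)   # the stationary distribution, i.e. each page's PageRank
```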