CS 188: Artificial Intelligence
Fall 2007
Lecture 18: Bayes Nets III
10/30/2007
Dan Klein – UC Berkeley

Announcements
- Project shift: Project 4 moved back a little
- Instead, a mega-mini-homework, worth 3x, graded
- The contest is live

Inference
- Inference: calculating some statistic from a joint probability distribution
- Examples:
  - Posterior probability: P(Q | E1 = e1, …, Ek = ek)
  - Most likely explanation: argmax_q P(Q = q | e1, …, ek)
- [Figure: example network over variables R, T, B, D, L, T']

Reminder: Alarm Network
- [Figure: the burglary/earthquake/alarm network with its CPTs]

Normalization Trick
- Select the joint entries consistent with the evidence, then normalize (divide by their sum) to obtain the conditional distribution

Inference by Enumeration?
- [Worked example: summing joint entries directly]

Nesting Sums
- Atomic inference is extremely slow!
- Slightly clever way to save work: move the sums as far right as possible
- Example:
  P(b | j, m) ∝ Σ_e Σ_a P(b) P(e) P(a | b, e) P(j | a) P(m | a)
             = P(b) Σ_e P(e) Σ_a P(a | b, e) P(j | a) P(m | a)

Evaluation Tree
- View the nested sums as a computation tree
- Still repeated work: we calculate P(m | a) P(j | a) twice, etc.

Variable Elimination: Idea
- Lots of redundant work in the computation tree
- We can save time if we cache all partial results
- Join on one hidden variable at a time
- Project out that variable immediately
- This is the basic idea behind variable elimination

Basic Objects
- Track objects called factors
- The initial factors are the local CPTs
- During elimination, we create new factors
- Anatomy of a factor:
  - Variables introduced
  - Variables summed out
  - Argument variables, always non-evidence variables
  - e.g., a factor over D and E holds 4 numbers, one for each combination of values of D and E

Basic Operations
- First basic operation: joining factors
- Combining two factors:
  - Just like a database join
  - Build a factor over the union of the domains
- Example: [figure]

Basic Operations
- Second basic operation: marginalization
- Take a factor and sum out a variable
- This shrinks a factor to a smaller one
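The two basic operations just described, join and marginalize (sum out), can be sketched on explicit factor tables. This is a minimal illustration, not the course's implementation: a factor is assumed to be a (variable list, table) pair, and all names here are my own.

```python
from itertools import product

# A factor is (variables, table), where table maps a tuple of values
# (one per variable, in the listed order) to a numeric entry.

def join(f1, f2):
    """Join two factors: build a factor over the union of their variables."""
    vars1, t1 = f1
    vars2, t2 = f2
    joined_vars = vars1 + [v for v in vars2 if v not in vars1]
    # Recover each variable's domain from the values seen in the tables.
    domains = {}
    for vs, t in ((vars1, t1), (vars2, t2)):
        for row in t:
            for v, val in zip(vs, row):
                domains.setdefault(v, set()).add(val)
    table = {}
    for assignment in product(*(sorted(domains[v]) for v in joined_vars)):
        a = dict(zip(joined_vars, assignment))
        key1 = tuple(a[v] for v in vars1)
        key2 = tuple(a[v] for v in vars2)
        if key1 in t1 and key2 in t2:
            table[assignment] = t1[key1] * t2[key2]  # database-style join, multiplied
    return joined_vars, table

def marginalize(f, var):
    """Sum out one variable, shrinking the factor to a smaller one."""
    vars_, t = f
    i = vars_.index(var)
    new_vars = vars_[:i] + vars_[i + 1:]
    table = {}
    for row, p in t.items():
        key = row[:i] + row[i + 1:]
        table[key] = table.get(key, 0.0) + p  # projection: add entries that agree
    return new_vars, table
```

For instance, joining a factor P(D) with a factor P(E | D) gives a factor over D and E (4 numbers, as in the anatomy slide), and summing out D then leaves a factor over E alone.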
- Marginalization is a projection operation
- Example: [figure]

Example
- [Worked elimination example]

Example
- [Worked elimination example, continued]

General Variable Elimination
- Query: P(Q | e1, …, ek)
- Start with the initial factors:
  - The local CPTs (but instantiated by the evidence)
- While there are still hidden variables (not Q or evidence):
  - Pick a hidden variable H
  - Join all factors mentioning H
  - Project out H
- Join all remaining factors and normalize

Example
- Choose A

Example
- Choose E
- Finish
- Normalize

Variable Elimination
- What you need to know:
  - VE caches intermediate computations
  - Polynomial time for tree-structured graphs!
  - Saves time by marginalizing variables as soon as possible rather than at the end
- We will see special cases of VE later
  - You'll have to implement the special cases
- Approximations
  - Exact inference is slow, especially when you have a lot of hidden nodes
  - Approximate methods give you a (close) answer, faster

Sampling
- Basic idea:
  - Draw N samples from a sampling distribution S
  - Compute an approximate posterior probability
  - Show this converges to the true probability P
- Outline:
  - Sampling from an empty network
  - Rejection sampling: reject samples disagreeing with the evidence
  - Likelihood weighting: use the evidence to weight samples

Prior Sampling
- [Figure: a sampling pass through the Cloudy/Sprinkler/Rain/WetGrass network]

Prior Sampling
- This process generates samples with probability
  S_PS(x1, …, xn) = Π_i P(xi | Parents(Xi)) = P(x1, …, xn)
  i.e., the BN's joint probability
- Let the number of samples of an event be N_PS(x1, …, xn); then
  lim_{N→∞} N_PS(x1, …, xn) / N = P(x1, …, xn)
- I.e., the sampling procedure is consistent

Example
- We'll get a bunch of samples from the BN, e.g. five samples over (C, S, R, W)
- If we want to know P(W):
  - We have counts ⟨w: 4, ¬w: 1⟩
  - Normalize to get P(W) ≈ ⟨w: 0.8, ¬w: 0.2⟩
- This will get closer to the true distribution with more samples
- Can estimate anything else, too
- What about P(C | r)?
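The prior-sampling procedure can be sketched for the Cloudy/Sprinkler/Rain/WetGrass network. The CPT numbers below are illustrative assumptions, not taken from the slides, and the function names are my own.

```python
import random

# Illustrative CPTs (assumed numbers) for the C/S/R/W network:
# C has no parents; S and R depend on C; W depends on S and R.
P_C = 0.5
P_S = {True: 0.1, False: 0.5}            # P(S=true | C)
P_R = {True: 0.8, False: 0.2}            # P(R=true | C)
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.01}  # P(W=true | S, R)

def prior_sample(rng=random):
    """Sample each variable in topological order, given its sampled parents."""
    c = rng.random() < P_C
    s = rng.random() < P_S[c]
    r = rng.random() < P_R[c]
    w = rng.random() < P_W[(s, r)]
    return c, s, r, w

def estimate(event, n=100_000):
    """Estimate the probability of an event by counting over n prior samples."""
    hits = sum(event(prior_sample()) for _ in range(n))
    return hits / n
```

For example, `estimate(lambda s: s[3])` approximates P(w) by counting, and the estimate tightens as n grows, which is the consistency property stated above.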
- What about P(C | r, w)? (Same approach.)
- [Figure: the C/S/R/W network and a batch of samples]

Rejection Sampling
- Let's say we want P(C)
  - No point keeping all the samples around
  - Just tally counts of the C outcomes
- Let's say we want P(C | s)
  - Same thing: tally C outcomes, but ignore (reject) samples which don't have S = s
  - This is rejection sampling
  - It is also consistent (correct in the limit)

Likelihood Weighting
- Problem with rejection sampling:
  - If the evidence is unlikely, you reject a lot of samples
  - You don't exploit your evidence as you sample
  - Consider P(B | a)
- Idea: fix the evidence variables and sample the rest
- Problem: the sample distribution is not consistent!
- Solution: weight each sample by the probability of the evidence given its parents
- [Figure: Burglary → Alarm]

Likelihood Sampling
- [Figure: a weighted sampling pass through the C/S/R/W network]

Likelihood Weighting
- Sampling distribution if z is sampled and e is fixed evidence:
  S_WS(z, e) = Π_i P(zi | Parents(Zi))
- Now, samples have weights:
  w(z, e) = Π_i P(ei | Parents(Ei))
- Together, the weighted sampling distribution is consistent:
  S_WS(z, e) · w(z, e) = P(z, e)

Likelihood Weighting
- Note that likelihood weighting doesn't solve all our problems
  - Rare evidence is taken into account for downstream variables, but not upstream ones
- A better solution is Markov-chain Monte Carlo (MCMC), more advanced
- We'll return to sampling for robot localization and tracking in dynamic BNs

Decision Networks
- MEU: choose the action which maximizes the expected utility given the evidence
- We can directly operationalize this with decision diagrams:
  - Bayes nets with nodes for utility and actions
  - Lets us calculate the expected utility for each action
- New node types:
  - Chance nodes (just like BNs)
  - Actions (rectangles, must be parents, act as observed evidence)
  - Utilities (depend on action and chance nodes)
- [Figure: Weather → Report; Umbrella and Weather → U]

Decision Networks
- Action selection:
  - Instantiate all evidence
  - Calculate the posterior over the parents of the utility node
  - Set the action node each possible way
  - Calculate the expected utility for each action
  - Choose the maximizing action
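The action-selection loop above can be sketched concretely with the umbrella example's prior and utilities (the numbers come from this lecture's tables; the function names are my own):

```python
# Umbrella example: weather prior and utility table from the lecture.
P_W = {'sun': 0.7, 'rain': 0.3}
U = {('leave', 'sun'): 100, ('leave', 'rain'): 0,
     ('take', 'sun'): 20, ('take', 'rain'): 70}

def expected_utility(action, p_w=P_W):
    """EU(a) = sum over weather outcomes of P(w) * U(a, w)."""
    return sum(p * U[(action, w)] for w, p in p_w.items())

def best_action(p_w=P_W):
    """MEU action selection: try each action, keep the maximizer."""
    return max(('leave', 'take'), key=lambda a: expected_utility(a, p_w))
```

With evidence, `p_w` would be replaced by the posterior over Weather before taking the max, exactly as in the action-selection recipe above.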
Example: Decision Networks
- Network: Weather and Umbrella are parents of U

  W     P(W)
  sun   0.7
  rain  0.3

  A      W     U(A, W)
  leave  sun   100
  leave  rain  0
  take   sun   20
  take   rain  70

Example: Decision Networks
- Network: Weather → Report; Umbrella and Weather are parents of U
- Same U(A, W) and P(W) tables as above, plus the report model:

  R       P(R | sun)
  clear   0.5
  cloudy  0.5

  R       P(R | rain)
  clear   0.2
  cloudy  0.8

Value of Information
- Idea: compute the value of acquiring each possible piece of evidence
- Can be done directly from the decision network
- Example: buying oil drilling rights
  - Two blocks A and B, exactly one has oil, worth k
  - Prior …
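The value-of-information idea can be made concrete with the umbrella example's report model from the tables above, using the standard VPI formula VPI(E) = Σ_e P(e)·MEU(e) − MEU(∅). This is a sketch with my own function names; the numbers are the ones given in this lecture.

```python
# Weather prior, report model, and utilities from the lecture's tables.
P_W = {'sun': 0.7, 'rain': 0.3}
P_R_given_W = {'sun': {'clear': 0.5, 'cloudy': 0.5},
               'rain': {'clear': 0.2, 'cloudy': 0.8}}
U = {('leave', 'sun'): 100, ('leave', 'rain'): 0,
     ('take', 'sun'): 20, ('take', 'rain'): 70}

def meu(p_w):
    """Maximum expected utility under a given weather distribution."""
    return max(sum(p * U[(a, w)] for w, p in p_w.items())
               for a in ('leave', 'take'))

def vpi_report():
    """VPI(Report) = sum_r P(r) * MEU(r) - MEU(no evidence)."""
    vpi = -meu(P_W)
    for r in ('clear', 'cloudy'):
        p_r = sum(P_R_given_W[w][r] * P_W[w] for w in P_W)          # P(R = r)
        post = {w: P_R_given_W[w][r] * P_W[w] / p_r for w in P_W}   # P(W | r)
        vpi += p_r * meu(post)
    return vpi
```

With these particular numbers, "leave" maximizes expected utility under either report outcome, so the report can never change the decision and its VPI works out to exactly 0: information only has value when it could alter the optimal action.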