New version page

# Berkeley CS 188 - Exam Prep 8 Solutions

Pages: 5
Documents in this Course

## This preview shows page 1-2 out of 5 pages.

View Full Document

End of preview. Want to read all 5 pages?

View Full Document
Unformatted text preview:

CS 188Fall 2019 Exam Prep 8 SolutionsQ1. Argg! Sampling for the Legendary TreasureLittle did you know that Michael and John are actually infamous pirates. One day, they go treasure hunting inthe Ocean of Bayes, where rumor says a great treasure lies in wait for explorers who dare navigate in the roughwaters. After navigating about the ocean, they are within grasp of the treasure. Their current configuration isrepresented by the boat in the figure below. They can only make one move, and must choose from the actions:(North, South, East, West). Stopping is not allowed. They will land in either a whirlpool (W), an island witha small treasure (S), or an island with the legendary treasure (T). The utilities of the three types of locationsare shown below:State U(State)T (Legendary Treasure) 100S (Small Treasure) 25W (Whirlpool) -50The success of their action depends on the random variable Movement (M), which takes on one of two values:(+m, -m). The Movement random variable has many relationships with other variables: Presence of EnemyPirates (E), Rain (R), Strong Waves (W), and Presence of Fishermen (F). The Bayes’ net graph that representsthese relationships is shown below:RE WM FR P(R)+r 0.4-r 0.6E R P(E | R)+e +r 0.3-e +r 0.7+e -r 0.6-e -r 0.4W R P(W | R)+w +r 0.9-w +r 0.1+w -r 0.2-w -r 0.8F W P(F | W)+f +w 0.15-f +w 0.85+f -w 0.75-f -w 0.25M E W P(M | E,W)+m +e +w 0.1-m +e +w 0.9+m +e -w 0.45-m +e -w 0.55+m -e +w 0.35-m -e +w 0.65+m -e -w 0.9-m -e -w 0.1In the following questions we will follow a two-step process:– (1) Michael and John observed the random variables R = −r and F = +f . We then determine the distributionfor P (M|− r, +f) via sampling.– (2) Based on the estimate for P (M| − r, +f), after committing to an action, landing in the intended lo-cation of an action successfully occurs with probability P (M = +m|−r, +f). The other three possible landingpositions occur with probabilityP (M=−m|−r,+f)3each. Use this transition distribution to calculate the optimal1action(s) to take and the expected utility of those actions.2(a) (i) Rejection Sampling: You want to estimate P (M = +m| − r, +f) by rejection sampling. Below isa list of samples that were generated using prior sampling. Cross out those that would be rejectedby rejection sampling.+r + e + w − m − f−r − e + w − m − f−r + e − w − m + f+r − e − w + m − f−r − e − w − m + f−r + e − w − m + f−r − e + w − m + f+r − e + w + m − f−r − e − w + m + f+r − e − w + m + f−r + e + w − m + f−r + e − w − m + fAll samples without the conditioning −r, +f are rejected.(ii) What is the approximation for P (M = +m| − r, +f) using the remaining samples?17, the fraction of accepted samples with +m instantiated.(iii) What are the optimal action(s) for Michael and John based on this estimate of P (M = +m|−r, +f )?South, West. As p(+m| − r, +f) =17, p(−m| − r, +f) =67. Michael and John will succeed in theselected action17of the time, or take one of the other 3 actions with equal probability of27. Inthis case, p(+m| − r, +f ) is so low that deciding to head in the direction of the whirlpool actuallydecreases the chances of landing in it.(iv) What is the expected utility for the optimal action(s) based on this estimate of P(M = +m|−r, +f )?17∗ (−50) +27∗ (−50) +27∗ (25) +27∗ (100) =1007, the weighted sum of all four outcomes.(b) (i) Likelihood Weighting: Suppose instead that you perform likelihood weighting on the followingsamples to get the estimate for P (M = +m| − r, +f ). You receive 4 samples consistent with theevidence.Sample Weight−r − e + w + m + f P (−r)P (+f| + w) = 0.6 ∗ 0.15 = 0.09−r − e − w + m + f P (−r)P (+f| − w) = 0.6 ∗ 0.75 = 0.45−r − e + w − m + f P (−r)P (+f| + w) = 0.6 ∗ 0.15 = 0.09−r + e − w − m + f P (−r)P (+f| − w) = 0.6 ∗ 0.75 = 0.45(ii) What is the approximation for P (M = +m| − r, +f) using the samples above?0.09+0.450.09+0.45+0.09+0.45=12(iii) What are the optimal action(s) for Michael and John based on this estimate of P (M = +m|−r, +f )?East(iv) What is the expected utility for the optimal action(s) based on this estimate of P (M = +m|−r, +f)?16∗ (−50) +16∗ (−50) +16∗ (25) +12∗ (100) =7523Here is a copy of the Bayes’ Net, repeated for your convenience.RE WM FR P(R)+r 0.4-r 0.6E R P(E | R)+e +r 0.3-e +r 0.7+e -r 0.6-e -r 0.4W R P(W | R)+w +r 0.9-w +r 0.1+w -r 0.2-w -r 0.8F W P(F | W)+f +w 0.15-f +w 0.85+f -w 0.75-f -w 0.25M E W P(M | E,W)+m +e +w 0.1-m +e +w 0.9+m +e -w 0.45-m +e -w 0.55+m -e +w 0.35-m -e +w 0.65+m -e -w 0.9-m -e -w 0.1(c) (i) Gibbs Sampling. Now, we tackle the same problem, this time using Gibbs’ sampling. We startout with initializing our evidence: R = −r , F = +f. Furthermore, we start with this random sample:−r + e −w + m + f.We select variable E to resample. Calculate the numerical value for:P (E = +e|R = −r, W = −w, M = +m, F = +f ).P (E = +e|R = −r, W = −w, M = +m, F = +f ) =P (+e|−r)P (+m|+e,−w)P (+e|−r)P (+m|+e,−w)+P (−e|−r)P (+m|−e,−w)=0.6∗0.450.6∗0.45+0.4∗0.9=37We resample for a long time until we end up with the sample:−r −e + w + m + f.Michael and John are happy for fixing this one sample, but they do not have enough time leftto compute another sample before making a move. They will let this one sample approximate thedistribution: P (M = +m| − r, +f).(ii) What is the approximation for P (M = +m| − r, +f), using this one sample?1(iii) What are the optimal action(s) for Michael and John based on this estimate of P (M = +m|−r, +f )?East(iv) What is the expected utility for the optimal action(s) based on this estimate of P (M = +m|−r, +f)?1004Q2. UtilitiesDavis is on his way to a final exam planning meeting. He is already running late (the meeting is starting now)and he’s trying to determine whether he should wait for the bus or just walk.It takes 20 minutes to get to Cory Hall by walking, and only 5 minutes to get to there by bus. The bus willeither come in 10, 20, or 30 minutes, each with probability 1/3.(a) Davis hates being late; his utility for being late as a function of t, the number of minutes late he is, isUD(t) =0 : t ≤ 0−2t/5: t > 0What is the expected utility of each action? Should he wait for the bus or …

View Full Document Unlocking...