**Unformatted text preview:**

CS 188Fall 2019 Section Handout 9 SolutionsSamplingSuppose we want to evaluate P (Q|E) where Q are the query variables and E are the evidence variables.Prior Sampling: Draw samples from the Bayes net by sampling the parents and then sampling the childrengiven the parents. P (Q|E) ≈count(QandE)count(E).Rejection Sampling: Like prior sampling, but ignore all samples that are inconsistent with the evidence.Likelihood Weighting: Fix the evidence variables, and weight each sample by the probability of the evidencevariables given their parents.Gibbs Sampling:1. Fix evidence.2. Initialize other variables randomly3. Repeat:(a) Choose non-evidence variable X.(b) Resample X from P (X|markovblanket(X))Decision Networks• Chance nodes - Chance nodes in a decision network behave identically to Bayes’ nets. Each outcomein a chance node has an associated probability, which can be determined by running inference on theunderlying Bayes’ net it belongs to. We’ll represent these with ovals.• Action nodes - Action nodes are nodes that we have complete control over; they’re nodes representinga choice between any of a number of actions which we have the power to choose from. We’ll representaction nodes with rectangles.• Utility nodes - Utility nodes are children of some combination of action and chance nodes. They outputa utility based on the values taken on by their parents, and are represented as diamonds in our decisionnetworks.The expected utility of taking an action A = a given evidence E = e and n chance nodes is computed withthe following formula:EU(A = a|E = e) =Xx1,...,xnP (X1= x1, ..., Xn= xn|E = e)U(A = a, X1= x1, ..., Xn= xn)where each xirepresents a value that the ithchance node can take on. The maximum expected utility isthe expected utility of the action that has the highest expected utility:MEU(E = e) = maxaEU(A = a|E = e).1Value of Perfect InformationValue of perfect information (VPI) quantifies the amount an agent’s maximum expected utility is expectedto increase if it were to observe some new evidence. Usually observing new evidence comes at a cost. If weobserved some new evidence E0= e0before acting, the maximum expected utility of our action at that pointwould becomeMEU(E = e, E0= e0) = maxaXxP (X = x|E = e, E0= e0)U(X = x, A = a).However, note that we don’t know what new evidence we’ll get. Because we don’t know what what new evidencee0we’ll get, we must represent it as a random variable E0. We will compute the expected value of the maximumexpected utility:MEU(E = e, E0) =Xe0P (E0= e0|E = e)MEU(E = e, E0= e0).Observing a new evidence variable yields a different MEU with probabilities corresponding to the probabilitiesof observing each value for the evidence variable, and so by computing MEU (E = e, E0) as above, we computewhat we expect our new MEU will be if we choose to observe new evidence. The VPI is the expected maximumexpected utility if we were to observe the new evidence, minus the maximum expected utility if we were not toobserve the new evidence:V P I(E0|E = e) = MEU(E = e, E0) − MEU(E = e).Properties of VPIThe value of perfect information has several very important properties, namely:• Nonnegativity. ∀E0, e V P I(E0|E = e) ≥ 0Observing new information always allows you to make a more informed decision, and so your maximumexpected utility can only increase (or stay the same if the information is irrelevant for the decision youmust make).• Nonadditivity. V P I(Ej, Ek|E = e) 6= V P I(Ej|E = e) + V P I(Ek|E = e) in general.This is probably the trickiest of the three properties to understand intuitively. It’s true because generallyobserving some new evidence Ejmight change how much we care about Ek; therefore we can’t simply addthe VPI of observing Ejto the VPI of observing Ekto get the VPI of observing both of them. Rather,the VPI of observing two new evidence variables is equivalent to observing one, incorporating it into ourcurrent evidence, then observing the other. This is encapsulated by the order-independence property ofVPI, described more below.• Order-independence. V P I(Ej, Ek|E = e) = V P I(Ej|E = e) + V P I(Ek|E = e, Ej) = V P I(Ek|E =e) + V P I(Ej|E = e, Ek)Observing multiple new evidences yields the same gain in maximum expected utility regardless of theorder of observation. This should be a fairly straightforward assumption - because we don’t actually takeany action until after observing any new evidence variables, it doesn’t actually matter whether we observethe new evidence variables together or in some arbitrary sequential order.2Q1. Bayes’ Nets SamplingAssume the following Bayes’ net, and the corresponding distributions over the variables in the Bayes’ net:BAC DP (B|A)a b 2/3a +b 1/3+a b 4/5+a +b 1/5P (A)a 3/4+a 1/4P (C|B)b c 1/4b +c 3/4+b c 1/2+b +c 1/2P (D|C)c d 1/8c +d 7/8+c d 5/6+c +d 1/6(a) You are given the following samples:+a + b − c − d+a − b + c − d−a + b + c − d−a − b + c − d+a − b − c + d+a + b + c − d−a + b − c + d−a − b + c − d(i) Assume that these samples came from performing Prior Sampling, and calculate the sample estimateof P (+c).5/8(ii) Now we will estimate P (+c | +a, −d). Above, clearly cross out the samples that would not be usedwhen doing Rejection Sampling for this task, and write down the sample estimate of P (+c | +a, −d)below.2/3(b) Using Likelihood Weighting Sampling to estimate P (−a | +b, −d), the following samples were obtained.Fill in the weight of each sample in the corresponding row.Sample Weight−a + b + c − d P (+b | −a)P (−d | +c) = 1/3 ∗ 5/6 = 5/18 = 0.277+a + b + c − d P (+b | +a)P (−d | +c) = 1/5 ∗ 5/6 = 5/30 = 1/6 = 0.17+a + b − c − d P (+b | +a)P (−d | −c) = 1/5 ∗ 1/8 = 1/40 = 0.025−a + b − c − d P (+b | −a)P (−d | −c) = 1/3 ∗ 1/8 = 1/24 = 0.042(c) From the weighted samples in the previous question, estimate P (−a | +b, −d).5/18+1/245/18+5/30+1/40+1/24= 0.625(d) Which query is better suited for likelihood weighting, P (D | A) or P (A | D)? Justify your answer in onesentence.P (D | A) is better suited for likelihood weighting sampling, because likelihood weighting conditions onlyon upstream evidence.3(e) Recall that during Gibbs Sampling, samples are generated through an iterative process.Assume that the only evidence that is available is A = +a. Clearly fill in the circle(s) of the sequence(s)below that could have been generated by Gibbs

View Full Document