Sequential Importance Sampling Resampling
Arnaud Doucet
Departments of Statistics & Computer Science, University of British Columbia

Sequential Importance Sampling

We use a structured IS distribution
$$q_n(x_{1:n}) = q_{n-1}(x_{1:n-1})\, q_n(x_n \mid x_{1:n-1}) = q_1(x_1)\, q_2(x_2 \mid x_1) \cdots q_n(x_n \mid x_{1:n-1}),$$
so if $X^{(i)}_{1:n-1} \sim q_{n-1}(x_{1:n-1})$ then we only need to sample
$$X^{(i)}_n \mid X^{(i)}_{1:n-1} \sim q_n\bigl(x_n \mid X^{(i)}_{1:n-1}\bigr)$$
to obtain $X^{(i)}_{1:n} \sim q_n(x_{1:n})$.

The importance weights are updated according to
$$w_n(x_{1:n}) = \frac{\gamma_n(x_{1:n})}{q_n(x_{1:n})} = w_{n-1}(x_{1:n-1}) \underbrace{\frac{\gamma_n(x_{1:n})}{\gamma_{n-1}(x_{1:n-1})\, q_n(x_n \mid x_{1:n-1})}}_{\alpha_n(x_{1:n})}.$$

Sequential Importance Sampling

At time $n = 1$, sample $X^{(i)}_1 \sim q_1(\cdot)$ and set $w_1\bigl(X^{(i)}_1\bigr) = \gamma_1\bigl(X^{(i)}_1\bigr) \big/ q_1\bigl(X^{(i)}_1\bigr)$.

At time $n \geq 2$, sample $X^{(i)}_n \sim q_n\bigl(\cdot \mid X^{(i)}_{1:n-1}\bigr)$ and compute $w_n\bigl(X^{(i)}_{1:n}\bigr) = w_{n-1}\bigl(X^{(i)}_{1:n-1}\bigr)\, \alpha_n\bigl(X^{(i)}_{1:n}\bigr)$.

It follows that
$$\widehat{\pi}_n(dx_{1:n}) = \sum_{i=1}^N W^{(i)}_n\, \delta_{X^{(i)}_{1:n}}(dx_{1:n}), \qquad \widehat{Z}_n = \frac{1}{N} \sum_{i=1}^N w_n\bigl(X^{(i)}_{1:n}\bigr).$$
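The SIS recursion above can be sketched in a few lines of NumPy. This is our own illustrative implementation, not code from the slides: the user supplies the initial proposal/weight and the incremental proposal and weight $\alpha_n$, and the routine propagates particles and log-weights. The toy example at the bottom takes $\gamma_n$ equal to a Gaussian random walk and uses that same kernel as proposal, so every $\alpha_n = 1$ and $\widehat{Z}_n = 1$ exactly.

```python
import numpy as np

def sis(N, T, sample_q1, logw1, sample_qn, log_alpha, rng=None):
    """Sequential Importance Sampling (no resampling).

    sample_q1(rng, N)         -> (N,) initial particles X_1^(i) ~ q_1
    logw1(x1)                 -> (N,) log w_1 = log gamma_1(x1) - log q_1(x1)
    sample_qn(rng, xprev, n)  -> (N,) X_n^(i) ~ q_n(. | x_{1:n-1})
    log_alpha(xprev, xnew, n) -> (N,) log incremental weight alpha_n

    Returns particle paths (N, T) and unnormalized log-weights (N,).
    """
    rng = rng or np.random.default_rng(0)
    x = np.empty((N, T))
    x[:, 0] = sample_q1(rng, N)
    logw = logw1(x[:, 0])
    for n in range(1, T):
        x[:, n] = sample_qn(rng, x[:, n - 1], n)
        # w_n = w_{n-1} * alpha_n, accumulated in log space for stability
        logw = logw + log_alpha(x[:, n - 1], x[:, n], n)
    return x, logw

# Toy check: gamma_n(x_{1:n}) = prod_k N(x_k; x_{k-1}, 1) with q_n equal to the
# same Gaussian random-walk kernel, so alpha_n = 1 and Z_n = 1 exactly.
sample_q1 = lambda rng, N: rng.normal(size=N)
logw1 = lambda x1: np.zeros_like(x1)
sample_qn = lambda rng, xp, n: xp + rng.normal(size=xp.shape)
log_alpha = lambda xp, xn, n: np.zeros_like(xn)
x, logw = sis(1000, 10, sample_q1, logw1, sample_qn, log_alpha)
Zhat = np.exp(logw).mean()   # estimate of Z_n; equals 1 in this toy case
```

Note that only the most recent state is needed by `sample_qn` here; storing full paths matters only when the proposal or weight genuinely depends on the whole history $x_{1:n-1}$.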
Sequential Importance Sampling for State-Space Models

State-space models:
Hidden Markov process: $X_1 \sim \mu$, $X_k \mid (X_{k-1} = x_{k-1}) \sim f(\cdot \mid x_{k-1})$.
Observation process: $Y_k \mid (X_k = x_k) \sim g(\cdot \mid x_k)$.

Assume we have received $y_{1:n}$; we are interested in sampling from
$$\pi_n(x_{1:n}) = p(x_{1:n} \mid y_{1:n}) = \frac{p(x_{1:n}, y_{1:n})}{p(y_{1:n})}$$
and estimating $p(y_{1:n})$, where
$$\gamma_n(x_{1:n}) = p(x_{1:n}, y_{1:n}) = \mu(x_1) \prod_{k=2}^n f(x_k \mid x_{k-1}) \prod_{k=1}^n g(y_k \mid x_k),$$
$$Z_n = p(y_{1:n}) = \int \mu(x_1) \prod_{k=2}^n f(x_k \mid x_{k-1}) \prod_{k=1}^n g(y_k \mid x_k)\, dx_{1:n}.$$

Locally Optimal Importance Distribution

The optimal IS distribution $q_n(x_n \mid x_{1:n-1})$ at time $n$ minimizing the variance of $w_n(x_{1:n})$ is given by
$$q^{\mathrm{opt}}_n(x_n \mid x_{1:n-1}) = \pi_n(x_n \mid x_{1:n-1})$$
and yields an incremental importance weight of the form
$$\alpha_n(x_{1:n}) = \frac{\gamma_n(x_{1:n-1})}{\gamma_{n-1}(x_{1:n-1})}.$$
For state-space models, we have
$$q^{\mathrm{opt}}_n(x_n \mid x_{1:n-1}) = p(x_n \mid y_n, x_{n-1}) = \frac{g(y_n \mid x_n)\, f(x_n \mid x_{n-1})}{p(y_n \mid x_{n-1})}, \qquad \alpha_n(x_{1:n}) = p(y_n \mid x_{n-1}).$$
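As a concrete sketch of the locally optimal proposal, consider a linear-Gaussian model (our illustrative choice; the slides do not fix a particular model): $X_1 \sim \mathcal{N}(0, \sigma_v^2)$, $X_k = a X_{k-1} + V_k$ with $V_k \sim \mathcal{N}(0, \sigma_v^2)$, and $Y_k = X_k + W_k$ with $W_k \sim \mathcal{N}(0, \sigma_w^2)$. Here $p(x_n \mid y_n, x_{n-1})$ is Gaussian with variance $(1/\sigma_v^2 + 1/\sigma_w^2)^{-1}$, and the incremental weight $\alpha_n = p(y_n \mid x_{n-1}) = \mathcal{N}(y_n;\, a x_{n-1},\, \sigma_v^2 + \sigma_w^2)$ does not depend on the sampled $x_n$. All function and parameter names below are ours.

```python
import numpy as np

def log_normal_pdf(x, mean, var):
    """Log density of N(mean, var) evaluated at x (scalar or array)."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def sis_ssm_optimal(y, N, a=0.9, sv2=1.0, sw2=1.0, rng=None):
    """SIS with the locally optimal proposal p(x_n | y_n, x_{n-1}) for the
    linear-Gaussian model X_k = a X_{k-1} + V_k, Y_k = X_k + W_k.

    Returns the final particles, the unnormalized log-weights, and the
    log-evidence estimate log p_hat(y_{1:T}).
    """
    rng = rng or np.random.default_rng(1)
    T = len(y)
    var = 1.0 / (1.0 / sv2 + 1.0 / sw2)   # posterior variance of x_n | y_n, x_{n-1}
    # Time 1: q_1 = p(x_1 | y_1) under the prior X_1 ~ N(0, sv2),
    # so w_1 = gamma_1 / q_1 = p(y_1) = N(y_1; 0, sv2 + sw2).
    x = rng.normal(var * y[0] / sw2, np.sqrt(var), size=N)
    logw = log_normal_pdf(y[0], 0.0, sv2 + sw2) * np.ones(N)
    for n in range(1, T):
        # alpha_n = p(y_n | x_{n-1}): depends on x_{n-1} only, not on x_n
        logw = logw + log_normal_pdf(y[n], a * x, sv2 + sw2)
        mean = var * (a * x / sv2 + y[n] / sw2)
        x = rng.normal(mean, np.sqrt(var))  # x_n ~ p(x_n | y_n, x_{n-1})
    # log Z_hat = log( (1/N) sum_i w_i ), computed via log-sum-exp
    m = logw.max()
    log_Z = m + np.log(np.exp(logw - m).mean())
    return x, logw, log_Z
```

Because $\alpha_n$ here is a function of $x_{n-1}$ alone, the weight at time $n$ can be computed before $x_n$ is even drawn; this is exactly what makes the optimal proposal attractive when $p(y_n \mid x_{n-1})$ is tractable.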
Summary

Sequential Importance Sampling is a special case of Importance Sampling.
Importance Sampling only works decently for moderate size problems.
Today, we discuss how to partially fix this problem.

Resampling

Intuitive KEY idea: as the time index $n$ increases, the variance of the unnormalized weights $\bigl\{ w_n\bigl(X^{(i)}_{1:n}\bigr) \bigr\}$ tends to increase and all the mass is concentrated on a few random samples/particles. We propose to reset the approximation by getting rid, in a principled way, of the particles with low weights $W^{(i)}_n$ (relative to $1/N$) and multiplying the particles with high weights $W^{(i)}_n$ (relative to $1/N$).

The main reason is that if a particle at time $n$ has a low weight then typically it will still have a low weight at time $n + 1$ (though I can easily give you a counterexample).

You want to focus your computational efforts on the "promising" parts of the space.
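The weight-degeneracy diagnosis and the resampling reset above can be sketched as follows. The effective sample size (ESS) is a standard diagnostic (not defined on these slides): it is $1 / \sum_i (W^{(i)}_n)^2$, equal to $N$ when weights are uniform and to 1 when all mass sits on one particle. Multinomial resampling then draws $N$ ancestors in proportion to the $W^{(i)}_n$, so low-weight particles tend to die and high-weight particles are duplicated.

```python
import numpy as np

def ess(logw):
    """Effective sample size of the normalized weights: N when the weights
    are uniform, 1 when all mass is on a single particle."""
    w = np.exp(logw - logw.max())
    w /= w.sum()
    return 1.0 / np.sum(w ** 2)

def multinomial_resample(x, logw, rng=None):
    """Draw N ancestor indices with probabilities W^(i); particles with low
    weight are likely discarded, those with high weight are multiplied.
    After resampling, all particles carry equal (log-)weight."""
    rng = rng or np.random.default_rng(2)
    N = len(logw)
    w = np.exp(logw - logw.max())
    w /= w.sum()
    idx = rng.choice(N, size=N, p=w)
    return x[idx], np.zeros(N)
```

In practice one resamples only when the ESS drops below a threshold such as $N/2$, since resampling adds variance when the weights are still well balanced; lower-variance schemes (systematic, stratified, residual) are common drop-in replacements for the multinomial draw.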