The Cross-Entropy Method
6.454 Seminar, 17 September 2003
Guy Weichenberg

Outline
• Introduction.
• CE method for rare-event simulation (RES).
  – Example.
• CE method for combinatorial optimization problems (COPs).
  – Example.
• Enhancements.
• Convergence.
• Conclusions.

Introduction
• The CE method has broadened from a rare-event simulation technique into a generic tool for solving NP-hard problems.
• The CE method is a global iterative search algorithm comprising the following two steps:
1. Generate random data samples using a set of dynamic parameters.
2. Update the parameters governing the random data generation using the data samples themselves, with the objective of improving future data samples.

CE Method for Rare-Event Simulation

Preliminaries
• Let $X = (X_1, \ldots, X_n)$ be a random vector taking values in some space $\mathcal{X}$.
• Let $S$ be a real-valued function on $\mathcal{X}$.
• Let $f(\cdot; u)$ be a probability density function on $\mathcal{X}$ parameterized by $u$.

Suppose we are interested in the probability of occurrence of a rare event. Specifically, we are interested in the very small probability $\ell$ that $S(X)$ is greater than or equal to a real number $\gamma$ under $f(\cdot; u)$:
$$\ell = P_u(S(X) \ge \gamma) = E_u I_{\{S(X) \ge \gamma\}}.$$

Crude Monte Carlo Simulation
• In the CMC approach to estimating $\ell$, we simply draw a random sample $X_1, \ldots, X_N$ from $f(\cdot; u)$ and compute
$$\frac{1}{N} \sum_{i=1}^{N} I_{\{S(X_i) \ge \gamma\}}$$
to arrive at an unbiased estimate of $\ell$.
• For rare events, most terms in the above summation will be zero, thus requiring a very large value of $N$ to obtain a meaningful estimate of $\ell$.

Importance Sampling
• Using an importance sampling density $g$, we represent $\ell$ as
$$\ell = \int I_{\{S(x) \ge \gamma\}} \frac{f(x; u)}{g(x)}\, g(x)\, dx = E_g\!\left[ I_{\{S(X) \ge \gamma\}} \frac{f(X; u)}{g(X)} \right].$$
• Hence, an unbiased estimator of $\ell$ is
$$\hat{\ell} = \frac{1}{N} \sum_{i=1}^{N} I_{\{S(X_i) \ge \gamma\}} W(X_i),$$
where $W(x) = f(x; u)/g(x)$ is the likelihood ratio (LR), and $X_1, \ldots, X_N$ are i.i.d. vectors drawn from $g$.

Importance Sampling (continued)
• The optimal importance sampling density is
$$g^*(x) = \frac{I_{\{S(x) \ge \gamma\}} f(x; u)}{\ell},$$
for which the estimate has zero variance and only one sample is required.
• The problem with the IS approach is obtaining $g^*$, as it depends on $\ell$, which is what we are attempting to estimate. In addition, it is often convenient to choose an importance sampling density of the same form as $f(\cdot; u)$.

Importance Sampling (continued)
• When restricting the importance sampling density to the family $f(\cdot; v)$, we must find the density $f(\cdot; v)$ that is "closest" to $g^*$. A convenient measure of "closeness" is the Kullback-Leibler (K-L) distance:
$$D(g, h) = E_g \ln \frac{g(X)}{h(X)} = \int g(x) \ln g(x)\, dx - \int g(x) \ln h(x)\, dx.$$
• Minimizing the K-L distance between $g^*$ and $f(\cdot; v)$ is equivalent to the following maximization problem:
$$\max_v E_u I_{\{S(X) \ge \gamma\}} \ln f(X; v) = \max_v E_w I_{\{S(X) \ge \gamma\}} W(X; u, w) \ln f(X; v),$$
where $W(x; u, w) = f(x; u)/f(x; w)$ is the LR between $f(\cdot; u)$ and $f(\cdot; w)$ at $x$.

Importance Sampling (continued)
• The maximizer $v^*$ can be estimated by solving the stochastic counterpart
$$\hat{v}^* = \arg\max_v \frac{1}{N} \sum_{i=1}^{N} I_{\{S(X_i) \ge \gamma\}} W(X_i; u, w) \ln f(X_i; v), \quad (1)$$
where $X_1, \ldots, X_N$ are i.i.d. vectors drawn from $f(\cdot; w)$.
• When the function to be maximized in (1) is differentiable w.r.t. $v$, the solution may be obtained analytically by solving the following system of equations:
$$\frac{1}{N} \sum_{i=1}^{N} I_{\{S(X_i) \ge \gamma\}} W(X_i; u, w) \nabla \ln f(X_i; v) = 0. \quad (2)$$

Importance Sampling (continued)
• The approaches implied by (1) and (2) to estimating $v^*$ only yield meaningful estimates when not too many of the $I_{\{S(X_i) \ge \gamma\}}$ terms are zero (i.e., when the event of interest is not rare).

CE Algorithm for Rare-Event Simulation
For rare events, we estimate $v^*$ using the following iterative algorithm, in which the $\hat{v}_t$ and $\hat{\gamma}_t$ associated with each iteration progressively approach $v^*$ and $\gamma$, respectively:
1. Define $\hat{v}_0 = u$. Set $t = 1$.
2. Generate samples $X_1, \ldots, X_N$ from the density $f(\cdot; \hat{v}_{t-1})$ and compute the sample $(1 - \rho)$-quantile $\hat{\gamma}_t$ of the performances (i.e., $\hat{\gamma}_t = S_{(\lceil (1-\rho)N \rceil)}$, the corresponding order statistic), provided $\hat{\gamma}_t$ is less than $\gamma$. Otherwise, set $\hat{\gamma}_t = \gamma$.
3. Use the same sample $X_1, \ldots, X_N$ to solve the stochastic program
$$\hat{v}_t = \arg\max_v \frac{1}{N} \sum_{i=1}^{N} I_{\{S(X_i) \ge \hat{\gamma}_t\}} W(X_i; u, \hat{v}_{t-1}) \ln f(X_i; v).$$
4. If $\hat{\gamma}_t < \gamma$, set $t = t + 1$ and reiterate from Step 2. Otherwise, proceed with Step 5.
5. Estimate the rare-event probability $\ell$ using the LR estimate
$$\hat{\ell} = \frac{1}{N_1} \sum_{i=1}^{N_1} I_{\{S(X_i) \ge \gamma\}} W(X_i; u, \hat{v}_T),$$
where $T$ denotes the final number of iterations.

Shortest Path Example
Suppose the edge weights $X = (X_1, \ldots, X_5)$ are independent and exponentially distributed with means $u = (u_1, \ldots, u_5)$.

[Figure: a small network from node A to node B whose five edges have weights $X_1, \ldots, X_5$.]

Let $S(X)$ be the shortest path length from A to B. We wish to estimate
$$\ell = P_u(S(X) \ge \gamma) = E_u I_{\{S(X) \ge \gamma\}}.$$

Example (continued)
Step 1: Set $t = 1$, and the initial density to
$$f(x; \hat{v}_0) = \exp\!\left( -\sum_{j=1}^{5} \frac{x_j}{u_j} \right) \prod_{j=1}^{5} \frac{1}{u_j}.$$
Step 2: Generate samples $X_1, \ldots, X_N$ from the density
$$f(x; \hat{v}_{t-1}) = \exp\!\left( -\sum_{j=1}^{5} \frac{x_j}{\hat{v}_{t-1,j}} \right) \prod_{j=1}^{5} \frac{1}{\hat{v}_{t-1,j}}.$$
Step 3: Solve the stochastic program
$$\hat{v}_t = \arg\max_v \frac{1}{N} \sum_{i=1}^{N} I_{\{S(X_i) \ge \hat{\gamma}_t\}} W(X_i; u, \hat{v}_{t-1}) \ln f(X_i; v),$$
where
$$W(X_i; u, \hat{v}_{t-1}) = \exp\!\left( -\sum_{j=1}^{5} X_{ij} \left( \frac{1}{u_j} - \frac{1}{\hat{v}_{t-1,j}} \right) \right) \prod_{j=1}^{5} \frac{\hat{v}_{t-1,j}}{u_j},$$
by solving the following set of equations (obtained from (2)):
$$\sum_{i=1}^{N} I_{\{S(X_i) \ge \hat{\gamma}_t\}} W(X_i; u, \hat{v}_{t-1}) \left( \frac{X_{ij}}{v_j^2} - \frac{1}{v_j} \right) = 0, \quad j = 1, \ldots, 5.$$
Thus,
$$\hat{v}_{t,j} = \frac{\sum_{i=1}^{N} I_{\{S(X_i) \ge \hat{\gamma}_t\}} W(X_i; u, \hat{v}_{t-1})\, X_{ij}}{\sum_{i=1}^{N} I_{\{S(X_i) \ge \hat{\gamma}_t\}} W(X_i; u, \hat{v}_{t-1})}.$$

Example (continued)
Performance comparison:
• CMC: Using $10^8$ samples, we obtain $\hat{\ell} = 1.30 \times 10^{-5}$ with a relative error ($\mathrm{Var}(\hat{\ell})^{1/2}/\hat{\ell}$) of 0.03 in 6350 seconds.
• CE: Using $N = 1000$, $N_1 = 10^5$ and $\rho = 0.1$, we obtain $\hat{\ell} = 1.34 \times 10^{-5}$ with a relative error of 0.03 in 3 seconds (5 iterations).
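As a concrete illustration, the following Python sketch runs the rare-event algorithm above end to end on a five-edge network. The preview omits the figure, so the path structure in PATHS (a classic four-path bridge layout) and the numerical values of u and gamma are assumptions for illustration only; the quantile step, the exponential likelihood ratio, and the closed-form update $\hat{v}_{t,j}$ follow the formulas just derived.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed bridge topology (the figure is not in the preview): each tuple
# lists the edge indices used by one A-to-B path.
PATHS = [(0, 3), (0, 2, 4), (1, 4), (1, 2, 3)]

def S(x):
    """Shortest A-to-B path length for one sample of edge weights x."""
    return min(x[list(p)].sum() for p in PATHS)

def likelihood_ratio(X, u, v):
    """W(X_i; u, v) = f(X_i; u) / f(X_i; v) for independent exponentials."""
    return np.exp(-X @ (1.0 / u - 1.0 / v)) * np.prod(v / u)

def ce_rare_event(u, gamma, N=1000, N1=100_000, rho=0.1):
    u = np.asarray(u, dtype=float)
    v = u.copy()                                    # Step 1: v_0 = u
    for _ in range(100):                            # capped for safety
        X = rng.exponential(v, size=(N, len(u)))    # Step 2: sample f(.; v_{t-1})
        perf = np.array([S(x) for x in X])
        g = min(np.quantile(perf, 1.0 - rho), gamma)
        wI = (perf >= g) * likelihood_ratio(X, u, v)
        v = (wI @ X) / wI.sum()                     # Step 3: closed-form update
        if g >= gamma:                              # Step 4: quantile reached gamma
            break
    X = rng.exponential(v, size=(N1, len(u)))       # Step 5: final LR estimate
    perf = np.array([S(x) for x in X])
    return np.mean((perf >= gamma) * likelihood_ratio(X, u, v))

# Illustrative parameters only; the slides' values are not shown in the preview.
print(ce_rare_event(u=[0.25, 0.4, 0.1, 0.3, 0.2], gamma=2.0))
```

Note that nearly all the work happens in the handful of CE iterations; the final estimate simply reuses the LR estimator of Step 5 with $N_1$ fresh samples drawn from $f(\cdot; \hat{v}_T)$.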
CE Method for Combinatorial Optimization

Preliminaries
• Let $\mathcal{X}$ be a finite set of states.
• Let $S$ be a real-valued performance function we wish to maximize over $\mathcal{X}$.
• Let $\gamma^*$ be the maximum value of $S$.
• Let $x^*$ be the state at which this maximum occurs.
• Let $f(\cdot; v)$ be a probability mass function on $\mathcal{X}$, parameterized by a vector $v$.

Recasting the COP
Main idea: Recast the deterministic COP into a probabilistic framework, where the RES technique can be used.
• For a certain parameter vector $u$ and threshold $\gamma$, define $\ell(\gamma)$ as the probability that the performance function $S$ exceeds $\gamma$:
$$\ell(\gamma) = P_u(S(X) \ge \gamma) = \sum_x I_{\{S(x) \ge \gamma\}} f(x; u) = E_u I_{\{S(X) \ge \gamma\}}.$$
• In typical COPs, $|\mathcal{X}|$ is very large and $\ell(\gamma^*) = f(x^*; u) \approx 1/|\mathcal{X}|$ is consequently very small, making $\{S(X) \ge \gamma^*\}$ a rare event.

CE Algorithm for Combinatorial Optimization
1. Define $\hat{v}_0 = u$. Set $t = 1$.
2. Generate samples $X_1, \ldots, X_N$ from the density $f(\cdot; \hat{v}_{t-1})$ and compute the sample $(1 - \rho)$-quantile $\hat{\gamma}_t$ of the performances (i.e., $\hat{\gamma}_t = S_{(\lceil (1-\rho)N \rceil)}$).
3. Use the same sample $X_1, \ldots, X_N$ to solve the stochastic program
$$\hat{v}_t = \arg\max_v \frac{1}{N} \sum_{i=1}^{N} I_{\{S(X_i) \ge \hat{\gamma}_t\}} \ln f(X_i; v). \quad (3)$$
4. If for some $t \ge d$: $\hat{\gamma}_t = \hat{\gamma}_{t-1} = \cdots = \hat{\gamma}_{t-d}$, then stop; otherwise, set $t = t + 1$ and reiterate from Step 2.
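To make Steps 1-4 concrete, here is a minimal sketch of the optimization variant on a toy problem: recovering a hidden binary vector under an independent-Bernoulli pmf, for which program (3) has the closed-form update $\hat{p}_{t,j}$ = fraction of ones in coordinate $j$ among the elite samples. The problem, the names (ce_optimize, y), and all parameter values are illustrative assumptions, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy COP (illustrative): recover a hidden binary vector y. S(x) counts
# matching bits, so gamma* = n and x* = y.
n = 20
y = rng.integers(0, 2, n)

def S(x):
    return int((x == y).sum())

def ce_optimize(N=200, rho=0.1, d=3, max_iter=200):
    p = np.full(n, 0.5)                            # Step 1: uniform pmf on {0,1}^n
    quantiles = []
    for _ in range(max_iter):
        X = (rng.random((N, n)) < p).astype(int)   # Step 2: sample f(.; p_{t-1})
        perf = np.array([S(x) for x in X])
        g = np.quantile(perf, 1.0 - rho)           # sample (1-rho)-quantile
        # Step 3: program (3) for independent Bernoulli components reduces
        # to the elite-sample mean in each coordinate.
        p = X[perf >= g].mean(axis=0)
        quantiles.append(g)
        # Step 4: stop once the quantile has been constant for d iterations.
        if len(quantiles) > d and len(set(quantiles[-(d + 1):])) == 1:
            break
    return (p > 0.5).astype(int), quantiles[-1]

x_best, g_final = ce_optimize()
print(g_final, bool((x_best == y).all()))
```

In practice the raw update is usually smoothed, $p_t \leftarrow \alpha \tilde{p}_t + (1 - \alpha) p_{t-1}$, to keep components from locking prematurely at 0 or 1; this is one of the standard refinements in the CE literature and presumably among the enhancements the outline alludes to.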

