Simulated Annealing

Simulated annealing improves the greedy local search in the following way. Assume the task is to minimize a function f(x).

• At each iteration we choose a point randomly from the neighborhood of the current point. Let x denote the current point and let y be the randomly chosen neighbor.

• If f(y) < f(x), that is, a better point is found, then we accept it and y will be the next position.

• If y is not better, or is even worse than x, we still accept it with a certain probability. This probability is computed as follows. Define a quantity ∆E by

      ∆E = f(y) − f(x).

  This shows by how much y is worse than x. The notation is based on a physical analogy in which the function value represents the energy of a state in the state space, so ∆E is the energy difference. With this, the acceptance probability is defined as

      Prob_accept = e^(−∆E/(kT))

  where k is a constant and T is a parameter called the temperature (these, too, come from the physical analogy, where k is called the Boltzmann constant; it plays no role in the optimization, so here we can simply take k = 1). Thus, the new point y is accepted with probability Prob_accept, or we stay in x with probability 1 − Prob_accept.

• As the iteration proceeds, the temperature is gradually decreased. Thus, instead of a constant T, we use a so-called cooling schedule: a sequence T_n such that in the n-th iteration T_n is used as the temperature parameter. The cooling schedule is chosen such that T_n → 0 holds. A typical choice is the logarithmic cooling schedule

      T_n = a / (b + log n)

  where a, b are positive constants.

• As a stopping rule we can again say that the algorithm stops if no improvement was obtained over a certain number of iterations. Alternatively, we can continue the iterations until the temperature drops below a small ε > 0, that is, until the system “freezes”.

Interpretation of the Algorithm

An essential feature of simulated annealing is that it can climb out of a local minimum, since it can accept worse neighbors as the next step.
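As a concrete illustration, the whole procedure can be sketched in a few lines of Python. This is a minimal sketch, not a definitive implementation: the objective function, the neighborhood, and the constants a, b, and the iteration count are arbitrary choices for demonstration.

```python
import math
import random

def simulated_annealing(f, x0, neighbor, a=10.0, b=1.0, n_iters=20000, seed=0):
    """Minimize f starting from x0, using the logarithmic cooling
    schedule T_n = a / (b + log n). Returns the best point found."""
    rng = random.Random(seed)
    x = x0
    best, fbest = x, f(x)
    for n in range(1, n_iters + 1):
        T = a / (b + math.log(n))       # logarithmic cooling schedule
        y = neighbor(x, rng)            # random point from the neighborhood of x
        dE = f(y) - f(x)                # "energy" difference, with k = 1
        # Better points are always accepted; worse points with prob e^(-dE/T).
        if dE < 0 or rng.random() < math.exp(-dE / T):
            x = y
            if f(x) < fbest:
                best, fbest = x, f(x)
    return best, fbest

# Illustrative one-dimensional objective with two local minima.
f = lambda x: (x * x - 4) ** 2 + x      # minima near x = -2 and x = 2
step = lambda x, rng: x + rng.uniform(-0.5, 0.5)

best_x, best_f = simulated_annealing(f, x0=5.0, neighbor=step)
print(best_x, best_f)
```

Note how the worse neighbor (dE > 0) is not rejected outright: it is accepted with probability e^(−dE/T), which shrinks both as dE grows and as the temperature decreases.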
Such an acceptance happens with a probability that is smaller if the neighbor's quality is worse; that is, with larger ∆E the acceptance probability gets smaller. At the same time, with decreasing temperature all acceptance probabilities get smaller. This means that initially the system is “hot”: it makes more random movements and can relatively easily jump into worse states, too. Over time it “cools down”, so that worsening moves are accepted less and less frequently. Finally, the system gets “frozen”.

This process can be interpreted as follows: initially, when the achieved quality is not yet very good, we are willing to accept worse positions in order to be able to climb out of local minima, but later, as the quality improves, we tend to reject the worsening moves more and more frequently. These tendencies can be balanced and controlled by the choice of the cooling schedule. If it is chosen carefully, then convergence to a global optimum can be guaranteed (proven mathematically).

Advantages of Simulated Annealing

• Improves greedy local search by avoiding getting trapped in a local optimum.

• With an appropriately chosen cooling schedule the algorithm converges to the global optimum.

Disadvantages of Simulated Annealing

• Convergence to the optimum can be slow.

• There is no general way of estimating the number of iterations needed for a given problem, so we cannot easily guarantee that the result is within a certain error bound of the global optimum.

Exercise

Assume that in a simulated annealing algorithm we define the neighborhood of an n-dimensional binary vector x as follows. Let w(x) denote the number of 1 bits in x. The binary vector y is a neighbor of x if and only if |w(x) − w(y)| = α min{w(x), w(y)} holds, where α is a constant that we can choose. We want to choose the value of α such that the state space becomes connected, that is, any vector can be reached from any other one through some sequence of moves, going from neighbor to neighbor. Which of the following is correct?

1. If α = 1, then the state space will be connected.

2. If α ≥ 2, then the state space will be connected.

3. The state space will be connected for any α > 0.

4. No matter how α is chosen, this state space will never be connected.

5. None of the above.
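For small n, the neighborhood condition can be explored by brute force: build the neighbor relation over all of {0,1}^n and check reachability with a breadth-first search. This is a throwaway sketch for experimentation (the function names are ours, and it is only feasible for small n); it does not by itself constitute a proof of any of the options above.

```python
from collections import deque
from itertools import product

def neighbors(x, alpha):
    """All binary vectors y != x (as tuples) that are neighbors of x
    under the condition |w(x) - w(y)| = alpha * min(w(x), w(y))."""
    wx = sum(x)
    result = []
    for y in product((0, 1), repeat=len(x)):
        if y != x and abs(wx - sum(y)) == alpha * min(wx, sum(y)):
            result.append(y)
    return result

def is_connected(n, alpha):
    """BFS over {0,1}^n from one vector; the state space is connected
    iff every vector is reached (a graph is connected iff BFS from
    any single vertex visits all vertices)."""
    start = tuple([0] * n)
    seen = {start}
    queue = deque([start])
    while queue:
        x = queue.popleft()
        for y in neighbors(x, alpha):
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return len(seen) == 2 ** n

for alpha in (0.5, 1, 2, 3):
    print(alpha, is_connected(4, alpha))
```

A useful hint for the pencil-and-paper argument: check what the condition says when one of the two vectors is the all-zeros vector, so that min{w(x), w(y)} = 0.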