Simulated annealing (SA) has been a very useful stochastic method for solving multidimensional global optimization problems that ensures convergence to a global optimum. This paper proposes a variable cooling factor (VCF) model for the simulated annealing schedule as a new cooling scheme and uses it to build an annealing algorithm called the Powell-simulated annealing (PSA) algorithm. The PSA algorithm aims to speed up the annealing process and to find the global minima of test functions of several variables without calculating their derivatives. It has been applied and compared with the SA algorithm and the Nelder and Mead Simplex (NMS) method on Rosenbrock valleys in 2 dimensions and multi-minima functions in 3, 4 and 8 dimensions. The PSA algorithm proves to be more reliable, always finding the optimum or a point very close to it with a minimal number of iterations and minimal computational time. The VCF compares favourably with the Lundy and Mees, linear, exponential and geometric cooling schemes based on their relative cooling rates. The PSA algorithm has also been programmed to run on android smartphone systems (ASS), which facilitates the computation of combinatorial optimization problems.

Simulated annealing (SA) is a random search optimization technique that is very useful for solving global optimization problems involving several variables and ensures convergence to a global optimum [

This paper seeks to propose a variable cooling factor model for SA schedule to speed up the annealing process. It further formulates the Powell-simulated annealing (PSA) algorithm [

SA, as noted in the previous section, is a powerful search algorithm for solving global optimization problems. The rate at which the SA reaches its global minimum is determined by the cooling schedule parameters, which include the starting temperature, cooling factor, number of transitions at each temperature and other termination conditions [

$$T_{k+1} = T_k - \Delta T \quad (1)$$

where $T_{k+1}$ is the new temperature value obtained from the old value $T_k$, and $\Delta T$ is the amount of temperature reduction, kept constant and chosen from the interval [0.1, 0.2], while the initial temperature depends strongly on the problem being considered. van Laarhoven and Aarts [

$$T_{k+1} = \beta T_k \quad (2)$$

where $T_{k+1}$ and $T_k$ denote the new and old temperature values, respectively, and $\beta$ is the cooling factor, assumed to be a fixed value in the interval [0.8, 0.99] [

Lundy and Mees [

$$T_{k+1} = T_k (1 + \beta T_k)^{-1} \quad (3)$$

where β is a very small constant. Aarts and van Laarhoven [

In this paper, we argue that the cooling factor should be a variable rather than a fixed value as proposed by the authors above. A variable cooling factor (VCF) is a factor that decreases the temperature at each transition state during the cooling process. Starting with a low value, the VCF increases gradually with the number of transition states until the frozen point is reached. This section develops a variable cooling factor model that relates the cooling rate to the transition states a substance undergoes during the cooling process and to the size of the cost function. First, we consider the following theorems:

・ Theorem A:

If $x \in \mathbb{R}$, the set of real numbers, where $|x|$ denotes its modulus, a non-negative real number, then $\left|\sum_{k=1}^{n} x_k\right| \le \sum_{k=1}^{n} |x_k|$, and it follows that $x \le |x|$: the modulus of a sum of the $x_k$ is not greater than the sum of the separate moduli [

・ Theorem B:

Suppose $Z_1$ and $Z_2$ are continuous random variables with joint probability density function $f(z_1, z_2)$ and let $Y = Z_1 / Z_2$ (the ratio of the two random variables). Then $Y$ has the probability density function [

$$f_Y(y) = \int_{-\infty}^{\infty} |z|\, f(yz, z)\, \mathrm{d}z \quad (4)$$

However, if Z 1 and Z 2 are independent, we obtain Equation (5):

$$f_Y(y) = \int_{-\infty}^{\infty} |z|\, f_{Z_1}(yz)\, f_{Z_2}(z)\, \mathrm{d}z \quad (5)$$

Suppose $Q_1, Q_2, \cdots, Q_N$ are independent and identically distributed (iid) random particles of a substance that vary with temperature, with standard deviation $\sigma$ and mean number of particles $\mu$. For simplicity, we let the random variables $Q_i$, for $i = 1, 2, \cdots, N$, be independent standard normal random variables, with the marginal probability density function for each $i$ given by Equation (6):

$$f_i(Q_i) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} Q_i^2}, \quad -\infty < Q_i < \infty \quad (6)$$

Then the joint probability density function of the two independent random variables $Q_1$ and $Q_2$ is given by Equation (7):

$$f_{12}(Q_1, Q_2) = \frac{1}{2\pi}\, e^{-\frac{1}{2}(Q_1^2 + Q_2^2)}, \quad -\infty < Q_1 < \infty;\ -\infty < Q_2 < \infty \quad (7)$$

Let us consider the 2 × 2 transformation in Equation (8):

$$u = g_1(Q_1, Q_2) = Q_1 / Q_2, \quad v = g_2(Q_1, Q_2) = Q_2 \quad (8)$$

which is a one-to-one transformation from $\{(Q_1, Q_2) \mid -\infty < Q_1 < \infty;\ -\infty < Q_2 < \infty\}$ onto $\{(u, v) \mid -\infty < u < \infty;\ -\infty < v < \infty\}$ with inverses $Q_1 = g_1^{-1}(u, v) = uv$ and $Q_2 = g_2^{-1}(u, v) = v$, and Jacobian

$$J = \frac{\partial(Q_1, Q_2)}{\partial(u, v)} = \begin{vmatrix} \dfrac{\partial Q_1}{\partial u} & \dfrac{\partial Q_1}{\partial v} \\[4pt] \dfrac{\partial Q_2}{\partial u} & \dfrac{\partial Q_2}{\partial v} \end{vmatrix} = \begin{vmatrix} v & u \\ 0 & 1 \end{vmatrix} = v \quad (9)$$

Therefore, using the transformation technique (Theorem B), the joint probability density function of $U$ and $V$ is:

$$f_{UV}(u, v) = f_{12}\big(g_1^{-1}(u, v),\, g_2^{-1}(u, v)\big)\, |J|, \quad -\infty < u < \infty;\ -\infty < v < \infty \quad (10)$$

Integrating out $v$ (by substitution), the probability density function of $U$ is given by Equation (11):

$$f_U(u) = \int_{-\infty}^{\infty} f_{UV}(u, v)\, \mathrm{d}v = \frac{1}{2\pi} \int_{-\infty}^{\infty} |v|\, e^{-\frac{1}{2} v^2 (u^2 + 1)}\, \mathrm{d}v = \frac{1}{\pi (u^2 + 1)}, \quad -\infty < u < \infty \quad (11)$$

which is the probability density function of the ratio of random particles of the substance that vary with temperature. This distribution behaves as a Cauchy distribution, which has flatter tails, making it easier to escape from local minima. Now, if we let $Q_i$ assume a Cauchy distribution, $\mu$ be any real number and $\sigma > 0$, then $Q = \mu + \sigma u$ [
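The ratio construction above can be checked numerically. The sketch below, an illustrative aside rather than part of the paper's code, draws pairs of independent standard normals and compares the empirical distribution of their ratio with the standard Cauchy CDF, $F(u) = 1/2 + \arctan(u)/\pi$, which integrates the density in Equation (11):

```python
import math
import random

def ratio_of_normals(n, seed=42):
    """Sample n realizations of U = Q1/Q2 with Q1, Q2 iid standard normal."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) / rng.gauss(0.0, 1.0) for _ in range(n)]

def cauchy_cdf(u):
    """CDF of the standard Cauchy distribution whose pdf is Equation (11)."""
    return 0.5 + math.atan(u) / math.pi

samples = ratio_of_normals(20000)
# The empirical CDF at u = 1 should be close to F(1) = 0.75
frac_below_1 = sum(s < 1.0 for s in samples) / len(samples)
```

With 20,000 samples the empirical fraction below 1 lands within about one percentage point of the theoretical 0.75, illustrating the heavy-tailed Cauchy behaviour claimed above.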

$$f(Q; \mu, \sigma) = \frac{1}{\pi} \left( \frac{\sigma^2}{(Q - \mu)^2 + \sigma^2} \right), \quad -\infty < Q < \infty \quad (12)$$

where σ and μ are the standard deviation and mean, respectively. From Equation (12), we let

$$\frac{1}{\pi} \left( \frac{\sigma^2}{(Q - \mu)^2 + \sigma^2} \right) = \frac{1}{\pi} \left( 1 + \frac{(Q - \mu)^2}{\sigma^2} \right)^{-1} \quad (13)$$

Then by using Equation (13) and Theorem A, we have Equation (14):

$$\frac{1}{\pi} \left( 1 + \frac{(Q - \mu)^2}{\sigma^2} \right)^{-1} < \left( 1 + \frac{(Q - \mu)^2}{\sigma^2} \right)^{-1} \quad (14)$$

Let the number of variables (the size) of the cost function be $\nu$, where $\nu \ge 2$. Then, putting

$$\frac{(Q - \mu)^2}{\sigma^2} = \frac{1}{\sqrt{k(\nu + 1) + \nu}}, \quad (15)$$

where $k(\nu + 1)$ is the total number of iterations in the system, we obtain Equation (16):

$$(Q - \mu)^2 = \frac{1}{\sqrt{k(\nu + 1) + \nu}} \quad (16)$$

for $\sigma = 1$; $k = 1, 2, \cdots, N$, the number of cycle counts in the system; and $(\nu + 1)$, the fixed number of iterations in each transition state before the temperature is updated. That is, for $\sigma = 1$, the squared deviation is inversely proportional to the square root of the sum of the number of variables in the cost function and the number of iterations at each transition [

$$\Phi_k = \left( 1 + \frac{1}{\sqrt{k(\nu + 1) + \nu}} \right)^{-1}, \quad (17)$$

which is the rule by which the random particles of a substance lose energy to the surroundings. Hence, the new cooling scheme for the annealing process is given by Equation (17) and the temperature-update formula by Equation (18):

$$T_{k+1} = \Phi_k T_k \quad (18)$$

where $\Phi_k \in [0.60, 0.99]$ is the variable cooling factor, which depends on the number of transition states $k$.
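For concreteness, the schedule of Equations (17)-(18) can be sketched in a few lines of Python. This is an illustrative reading, not the authors' code, and it assumes the square-root form of Equation (17) described in the text:

```python
import math

def vcf(k, v):
    """Variable cooling factor Phi_k of Eq. (17):
    Phi_k = (1 + 1/sqrt(k*(v+1) + v))**-1,
    with k the transition-state index and v the cost-function size."""
    return 1.0 / (1.0 + 1.0 / math.sqrt(k * (v + 1) + v))

def vcf_schedule(t0, v, n_states):
    """Apply the temperature-update rule T_{k+1} = Phi_k * T_k of Eq. (18)."""
    temps = [t0]
    for k in range(1, n_states + 1):
        temps.append(vcf(k, v) * temps[-1])
    return temps
```

With $\nu = 24$ the first factor is $(1 + 1/\sqrt{49})^{-1} = 7/8 = 0.875$, matching the starting VCF value of 0.875 reported later, and the factor grows toward 1 as $k$ increases, so cooling slows as the system approaches the frozen point.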

A simulated annealing cooling scheme that is relatively slow is able to avoid premature convergence to local minima and thus provides a good solution to the optimization problem. The VCF model developed in Equation (17) will be tested for this property against various cooling schemes known in the literature. Three of these, namely the Lundy and Mees (L&M), geometric and linear (or arithmetic) schemes, represented by Equations (19), (20) and (21), respectively, were chosen and their relative cooling rates compared with that of the VCF in Equation (17):

$$T_{j+1} = \frac{T_j}{1 + \beta T_j}, \quad \text{where } \beta = \frac{T_{\max} - T_{\min}}{(n - 1)\, T_{\max} T_{\min}}, \quad n > 1 \quad (19)$$

$$T_{j+1} = \alpha T_j, \quad \text{where } \alpha = \left( \frac{T_{\min}}{T_{\max}} \right)^{\frac{1}{n - 1}}, \quad n > 1 \quad (20)$$

$$T_{j+1} = T_j - \gamma, \quad \text{where } \gamma = \frac{T_{\max} - T_{\min}}{n - 1}, \quad n > 1 \quad (21)$$

where the parameter $n$ is the number of iterations in each transition state $k = 1, 2, \cdots, N$. In each of the three cooling schemes in Equations (19)-(21), the temperature is updated at each iteration. Hence, for each temperature $T_j$, $j = 1, 2, \cdots, n$, with $T_1 = T_{\max}$ and $T_n = T_{\min}$, the parameters $\beta$, $\alpha$ and $\gamma$ are the fixed cooling factors associated with the Lundy and Mees, geometric and linear cooling schemes, respectively. Three main settings were made by putting $T_{\max} = 1000$, $T_{\min} = 25$ and number of iterations $= 25$ ($\nu = 24$) and applied to Equations (17) and (19)-(21). The temperature declination of each cooling scheme is obtained as illustrated in

From these settings, the cooling factors selected for the L&M, geometric and linear cooling schemes were $\beta = 0.001$, $\alpha = 0.833$ and $\gamma = 39.58$, respectively, and kept constant, whilst that for the VCF was varied gradually from 0.875 to 0.890 during the cooling process.
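The three fixed-factor schedules of Equations (19)-(21) can be generated directly from $T_{\max}$, $T_{\min}$ and $n$. The sketch below is illustrative only; each schedule ends exactly at $T_{\min}$ after $n$ temperatures. Note that with $T_{\max} = 1000$, $T_{\min} = 25$ and $n = 25$, the formulas give $\beta \approx 0.0016$, $\alpha \approx 0.857$ and $\gamma \approx 40.6$, in the neighbourhood of the values quoted above:

```python
def lundy_mees(tmax, tmin, n):
    """Eq. (19): T_{j+1} = T_j / (1 + beta*T_j)."""
    beta = (tmax - tmin) / ((n - 1) * tmax * tmin)
    t, out = tmax, [tmax]
    for _ in range(n - 1):
        t = t / (1.0 + beta * t)
        out.append(t)
    return out

def geometric(tmax, tmin, n):
    """Eq. (20): T_{j+1} = alpha * T_j."""
    alpha = (tmin / tmax) ** (1.0 / (n - 1))
    t, out = tmax, [tmax]
    for _ in range(n - 1):
        t = alpha * t
        out.append(t)
    return out

def linear(tmax, tmin, n):
    """Eq. (21): T_{j+1} = T_j - gamma."""
    gamma = (tmax - tmin) / (n - 1)
    t, out = tmax, [tmax]
    for _ in range(n - 1):
        t = t - gamma
        out.append(t)
    return out
```

Plotting the three lists against the VCF schedule reproduces the kind of temperature-declination comparison described in this section.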

As can be seen from

from the initial temperature to the minimum temperature. Its cooling rate is preferable to those of the other three cooling schedules, which suggests that the VCF will fit well into the SA algorithm.

As indicated in Section 1, SA is a random search optimization technique that is very useful for solving global optimization problems involving several variables without calculating derivatives and also ensures convergence to a global optimum [

In this study, some steps of Powell's method are modified and combined with the simulated annealing algorithm to produce a new algorithm, presented in the next section. It is expected that the new algorithm can be used to search for global solutions of combinatorial optimization problems in good time.

The SA algorithm and Powell’s method are summarized and presented in

| SA ALGORITHM | POWELL'S METHOD |
|---|---|
| 1) Select an initial solution S. 2) Select an initial temperature … | 1) Select a starting point p_{0} (in N dimensions) and set … |

Some parameters of Powell's algorithm were modified and incorporated into the annealing algorithm to form a new SA algorithm called the Powell-simulated annealing (PSA) algorithm. These parameters include the direction vectors, initial guess, and cycle count techniques. The proposed PSA algorithm template for finding the global minimum of a function of several variables is presented below.

PSA ALGORITHM:

Step 1:

a) Select a starting point $Q_0$ (in N dimensions) and calculate $f_0 = f(Q_0)$ (subject to the stopping criteria).

Set i = 0 .

b) Choose a random point on the surface of a unit N-dimensional hypersphere to establish a search direction, S.

c) Select an initial temperature T > 0

Step 2:

While T > lower bound, do the following:

a) Minimize $f$ along each of the N normalized search directions; the point obtained after the j-th search is $Q_{ij}$. Note: $Q_{i+1} = Q_{iN}$.

b) Compute $\Delta f = f(Q_{i+1}) - f(Q_i)$;

c) If $\Delta f \le 0$ (downhill move), accept the new point $Q_{i+1}$;

d) If $\Delta f > 0$ (uphill move), accept $Q_{i+1}$ with probability $\exp(-\Delta f / T)$.

Step 3:

a) Compute the average direction vector $Q_{i+1} - Q_i$ and minimize $f$ in the direction

$$\frac{Q_{i+1} - Q_i}{\| Q_{i+1} - Q_i \|}$$

Set the current solution to the minimizer found along this direction.

b) Set $T_{k+1} = \Phi_k T_k$ (reduce the temperature by the cooling factor $\Phi_k$) and go to Step 4.

Step 4:

Repeat steps 2 and 3 until the algorithm converges.

Steps 3 and 4 of Powell's method were not included in the PSA algorithm. A detailed description of the PSA algorithm is presented under the implementation in Section 4.2.

The move generation model for minimizing functions of a single variable is presented next. Let the new solution $Q_i \in N(Q_0)$ and $\alpha \in [0, 1]$; then compute the α-value using the formula in Equation (22):

$$\alpha = \left( 1 + \frac{1}{k(j + 1)} \right)^{-1} \quad (22)$$

where $Q_0$ is the initial solution; $N(Q_0)$ is the neighbourhood of the current solution $Q_k$; $k = 1, 2, \cdots, n$ are the transition states; $j = 2, 3, \cdots, (n + 1)$; and the current solution is $Q_k = \arg\min\{ f(\phi), f(\omega) \}$, where $\omega = b\alpha + (1 - \alpha) Q_0$, $\phi = a\alpha + (1 - \alpha) Q_0$, and $a$ and $b$ are the lower and upper bounds of the domain, respectively. Thus, in our SA algorithm, instead of selecting a random α-value from tables, we estimate it using Equation (22). The random move generation strategy in the SA algorithm is replaced by this α-model in the proposed PSA algorithm.
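A minimal sketch of this α-model move generator, written for a one-dimensional $Q_0$ under the notation above (an illustration, not the authors' implementation):

```python
def alpha_value(k, j):
    """Eq. (22): alpha = (1 + 1/(k*(j+1)))**-1."""
    return 1.0 / (1.0 + 1.0 / (k * (j + 1)))

def alpha_move(f, q0, a, b, k, j):
    """Alpha-model move generation: build the two trial points phi and
    omega from the bounds a, b and current solution q0, and return the
    one with the smaller cost."""
    alpha = alpha_value(k, j)
    phi = a * alpha + (1.0 - alpha) * q0
    omega = b * alpha + (1.0 - alpha) * q0
    return phi if f(phi) <= f(omega) else omega
```

For example, with $f(x) = x^2$, $Q_0 = 2$, bounds $[-1, 3]$, $k = 1$ and $j = 2$, we get $\alpha = 0.75$, trial points $\phi = -0.25$ and $\omega = 2.75$, and the move returns $\phi$, the better of the two candidates.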

The PSA algorithm is aimed at searching for the global minimum of multidimensional functions in combinatorial optimization problems. The proposed PSA algorithm is implemented via the VCF model to speed up the annealing process and to find the global minima of given cost functions of several variables. The cost function $f$ is defined over an N-dimensional continuous variable space, where

$$f(Q) = f(Q_1, Q_2, \cdots, Q_N) \quad (23)$$

The problem is how to use the PSA to find the global minimizer $Q_{opt}$ satisfying the condition in Equation (24), called the global minimum:

$$f(Q_{opt}) = \min\{ f(Q) \mid Q \in \mathbb{R}^n \text{ on } [a, b] \} \quad (24)$$

A detailed implementation of the PSA algorithm for finding the global minimizer $Q_{opt}$ proceeds by the following steps:

Step 1:

The initial data (or parameters) of the function and the PSA are stated, together with the floating-point tolerance $fl = 1.0\mathrm{E}{-}05$. A random starting point $Q_0 \in \Omega^n$ (the domain) is selected and $f_0 = f(Q_0)$ computed. The domain is defined by $\Omega^n = \{ Q \in \mathbb{R}^n : a \le Q_i \le b,\ i = 1, \cdots, n;\ a, b \in \mathbb{R} \}$; it is a rectangular subdomain of $\mathbb{R}^n$, centered at the origin, whose width along each coordinate direction is determined by the corresponding bounds. The value of $f(Q_0)$ is checked for feasibility within the tolerance level of $1.0\mathrm{E}{-}05$ (the stopping criterion). This is repeated several times until a feasible value of $f(Q_0)$ is obtained. Set $S = \{ s_1, s_2, \cdots, s_n \}$ to be linearly independent coordinate directions (in n dimensions) and assume that the directions are normalized to unit length so that $\| s_i \| = 1$, $i = 1, 2, \cdots, n$, where $\| \cdot \|$ denotes the Euclidean norm. The initial temperature $T_0 = 0.6L$, together with the temperature-decrement formula $T_{k+1} = \Phi_k T_k$, is stated, where $L = b - a$.

Step 2:

In each transition state $N_k$, where $k = 1, 2, \cdots, n$, starting from the point $Q_0$, a random point $Q_i$ is generated along given n-normalized direction vectors $u_i$ from $S$ such that

$$Q_i = Q_0 + \gamma_i u_i \quad (25)$$

where $\gamma_i$ is the component of the step vector along the coordinate direction. Equation (25) is then used to transform, or reduce, $f(Q)$ to a single-variable problem:

$$f(\gamma_i) = f(Q_0 + \gamma_i u_i) \quad (26)$$

Equation (22), used in the manner of the golden-section search method, can be applied to calculate the value of $\gamma_i$ that minimizes Equation (26) along the n unidirectional search directions. That is, at each iteration, the factor $\gamma_i u_i$ is used to perturb the present configuration $Q_i$ into a new configuration $Q_{i+1}$ such that:

$$Q_{i+1} = Q_i + \gamma_i u_i \quad (27)$$

The probability that the current solution Q i + 1 will be accepted or rejected is given by the Metropolis criterion:

$$M_p = P(Q_{i+1} \mid Q_i) = \min\{ 1, \exp[ -( f(Q_{i+1}) - f(Q_i) ) / T_{k+1} ] \} \quad (28)$$

where $\Delta f = f(Q_{i+1}) - f(Q_i)$. If $\Delta f \le 0$, then $M_p = 1$ and $Q_{i+1}$ is accepted; go to Step 3.

Otherwise, compute $M_p = \exp(-\Delta f / T_{k+1})$ and generate a random number $\rho \in [0, 1]$. If $\rho < M_p$, the iteration step is accepted and the next direction vector in the cycle is selected; otherwise, no change is made to the vector. Go to Step 1.
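The acceptance test of Equation (28) amounts to a few lines of code. The helper below is a hedged sketch; the parameter names are ours, not the paper's:

```python
import math
import random

def metropolis_accept(f_old, f_new, temperature, rng=None):
    """Metropolis criterion: always accept downhill moves; accept an
    uphill move with probability exp(-(f_new - f_old)/temperature)."""
    rng = rng or random
    df = f_new - f_old
    if df <= 0:
        return True
    return rng.random() < math.exp(-df / temperature)
```

Downhill moves ($\Delta f \le 0$) are always accepted; at high temperature nearly all uphill moves pass, and as the VCF drives the temperature down, the uphill acceptance probability shrinks toward zero.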

Step 3:

This step completes the iterations in a transition by computing the average direction vector $u_d = Q_{i+1} - Q_i$, normalized to $u = u_d / \| u_d \|$, with $\lambda_{i+1}$ determined as before to minimize $f(Q_i + \lambda_{i+1} u)$. The current solution is set to $Q_{i+1} = Q_i + \lambda_{i+1} u$ and the temperature is updated using $T_{k+1} = \Phi_k T_k$. This completes the iterations in the first transition state $k = 1$, and the number of iterations at this state is $I = \nu + 1$.

Step 4:

Steps 2 and 3 are repeated until the algorithm converges to the global solution (minimum).

Unlike in Powell's iterations, the direction vectors are not replaced one after the other in this case. Worse solutions, or vectors, are accepted or rejected based on the Metropolis criterion (Step 2). The normalized average direction vector is computed to preserve the linear independence of the direction vectors at each iteration and at the termination of a transition.
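Putting Steps 1-4 together, a compact Python sketch of the PSA loop is given below. The golden-section line search, the step-length bounds and the sphere test function are our own stand-ins for exposition, not the authors' implementation, and the square-root reading of the VCF in Equation (17) is assumed:

```python
import math
import random

def vcf(k, v):
    """Variable cooling factor of Eq. (17) (square-root reading)."""
    return 1.0 / (1.0 + 1.0 / math.sqrt(k * (v + 1) + v))

def move(q, step, u):
    """Point q + step*u."""
    return [qi + step * ui for qi, ui in zip(q, u)]

def line_min(f, q, u, lo=-5.0, hi=5.0, iters=60):
    """Golden-section search for the step length minimizing f(q + step*u)."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    for _ in range(iters):
        if f(move(q, c, u)) < f(move(q, d, u)):
            b, d = d, c
            c = b - g * (b - a)
        else:
            a, c = c, d
            d = a + g * (b - a)
    return move(q, (a + b) / 2.0, u)

def rand_dir(n, rng):
    """Random point on the surface of a unit n-dimensional hypersphere."""
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in u))
    return [x / norm for x in u]

def psa(f, q0, t0=1.0, n_states=30, t_stop=1e-6, seed=0):
    """Simplified PSA loop: directional line searches, Metropolis
    acceptance, and VCF cooling via T_{k+1} = Phi_k * T_k."""
    rng = random.Random(seed)
    n = len(q0)
    q, t = list(q0), t0
    for k in range(1, n_states + 1):
        if t <= t_stop:
            break
        for _ in range(n):                  # Step 2: n directional searches
            u = rand_dir(n, rng)
            cand = line_min(f, q, u)
            df = f(cand) - f(q)
            # Metropolis: accept downhill always, uphill with prob exp(-df/t)
            if df <= 0 or rng.random() < math.exp(-df / t):
                q = cand
        t = vcf(k, n) * t                   # Step 3: cool via Eq. (18)
    return q

sphere = lambda x: sum(xi * xi for xi in x)  # simple convex test function
```

On the convex sphere function every line minimization is a downhill move, so the sketch converges rapidly to the origin; on multi-minima functions the Metropolis step allows occasional uphill moves while the temperature is still high.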

To examine the performance of the PSA algorithm, it was tested against the NMS method and the SA algorithm with the geometric cooling scheme. The test functions included the Rosenbrock function in 2 dimensions in Equation (29), the Powell quartic function in 4 dimensions in Equation (30) and other multi-minima functions of N dimensions in Equation (31):

$$f(x_1, x_2) = 100 (x_2 - x_1^2)^2 + (1 - x_1)^2 \quad (29)$$

$$f(x_1, x_2, x_3, x_4) = (x_1 + 10 x_2)^2 + 5 (x_3 - x_4)^2 + (x_2 - 2 x_3)^4 + 10 (x_1 - x_4)^4 \quad (30)$$

$$f(x_1, x_2, \cdots, x_N) = \sum_{i=1}^{N} \left\{ \sum_{j=1}^{N} k_i \left( A_i + C_j x_j^{m_j} \right) \right\}^{n_i} \quad (31)$$
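The first two test functions translate directly into code. The sketch below uses the standard form of the Powell quartic, whose first term is $(x_1 + 10x_2)^2$; both functions attain a global minimum of 0, at $(1, 1)$ for Rosenbrock and at the origin for the PQF. Equation (31) is omitted since its constants $k_i$, $A_i$, $C_j$, $m_j$ and $n_i$ are problem-specific:

```python
def rosenbrock(x1, x2):
    """Rosenbrock valley in 2 dimensions, Eq. (29)."""
    return 100.0 * (x2 - x1 ** 2) ** 2 + (1.0 - x1) ** 2

def powell_quartic(x1, x2, x3, x4):
    """Powell quartic function in 4 dimensions (standard form of Eq. (30))."""
    return ((x1 + 10.0 * x2) ** 2 + 5.0 * (x3 - x4) ** 2
            + (x2 - 2.0 * x3) ** 4 + 10.0 * (x1 - x4) ** 4)
```

The Rosenbrock valley is a standard stress test because its minimum lies at the bottom of a long, narrow, curved valley that defeats naive descent methods.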

These methods have been reported to be simple, reliable, and efficient global optimizers [

The SA algorithm is, so far, commonly available in FORTRAN, PASCAL, MATLAB, C and C++ programs. This makes the SA algorithm inaccessible on most mobile phones due to incompatible operating systems. One achievement of this work is the development of an optimal SA cooling schedule model which can be portably programmed, making it possible to run the SA algorithm on android smartphone systems. The running proceeds as follows:

・ Accessing PSA on Android Platform and Internet:

There are two options for accessing the App: the mobile application and Web access. The mobile option downloads and installs the application (Scudd PSA) from the Google Play Store, while Web access follows the link: http://www.scudd-psa.byethost7.com.

・ Running PSA Pseudo Code on ASS:

Given the optimization problem in Equation (32):

$$\text{Minimize } F = \min \sum_{i=1}^{N} \left( \sum_{j=1}^{N} k_i \left( A_i + C_j x_j^{m_j} \right) \right)^{n_i} \quad (32)$$

(i) Input the parameters: $k_i$, $A_i$, $C_j$, $x_j$, $m_j$ and $n_i$.

(ii) Press the run key for the output (results).

The first test was performed using six different cooling schemes, one after the other, on the Rosenbrock function in 2 dimensions (Equation (29)) to find the computational time of the PSA algorithm. The standard cooling factors of the five fixed-factor schemes were selected for the test, while the VCF was made to vary gradually from 0.875 to 0.890. The schemes were all started from the same point (24, 20), randomly chosen from the domain. Each test was run 25 times over the interval [−200, 200]. The mean optimal results obtained for the six test schemes are reported in the

In 2 dimensions with a tolerance level of $10^{-5}$, the geometric, exponential and VCF schemes slightly deviated from the global minimum at the origin (see

| Cooling Scheme | No. of transition states | Min value of cost function | CPU |
|---|---|---|---|
| Lundy & Mees | 25 | 9.60383 | 1.976 |
| Geometric | 20 | 1.000E−05 | 1.300 |
| Linear | 10 | 2.250E−04 | 1.410 |
| Exponential | 5 | 1.330E−05 | 0.910 |
| VCF | 3 | 1.000E−05 | 0.600 |
| Aarts et al. | 21 | 2.103E−03 | 1.830 |

Thus, on the Rosenbrock function in 2 dimensions, the PSA algorithm performed better than the geometric and exponential schemes. The speeds of the Lundy & Mees, linear and Aarts et al. schemes were relatively slow and found to be less than ideal for use in temperature-dependent perturbation schemes.

A second test was performed on the Powell quartic function (PQF) in 4 dimensions (Equation (30)). The PSA, SA and NMS algorithms were started from the same points, randomly chosen from the domain. Each test was run 20 times, starting from points located at a distance from the origin. The mean optimal results on the test function for each algorithm are reported in

The results in 4 dimensions show that the SA algorithm and the NMS method never fail to find the global minimum within the tolerance level of 1.00E−05, and the cost of running the PSA algorithm is the cheapest, with just 22 iterations compared with 2781 and 185 iterations for the SA and NMS algorithms, respectively (see

Further tests were made using the parabolic multi-minima functions in 3, 4 and 8 dimensions in Equation (31). In this case, the PSA web program, using a pseudo code on ASS, was again used to search for the global minimum of the selected functions as a solution to the problem in Equation (32), and the best results are presented in

From

| Mean Parameter | SA | NMS | PSA |
|---|---|---|---|
| Number of Iterations | 2781 | 185 | 22 |
| Optimal functional value | 2.325E−05 | 1.391E−06 | 6.837E−05 |
| Total running time (CPU) | 1.3572 | 1.0163 | 0.1672 |
| Distance from optimal value | 2.324E−05 | 1.390E−06 | 6.837E−05 |

| Mean Parameter | Multi 3-D | Multi 4-D | Multi 8-D |
|---|---|---|---|
| Number of Transitions | 3 | 27 | 25 |
| Number of Iterations | 14 | 135 | 219 |
| Final temperature | 0.25540 | 1.73859 | 3.88080 |
| Optimal functional value | 1.0323E−04 | 1.420E−03 | 2.746E−05 |
| Total running time (CPU) | 0.0210 | 0.5802 | 0.8325 |

| Algorithm | N | T_{max} | Mean (T_{min}) | SD (T_{min}) | Fval. | Mean (CPU) |
|---|---|---|---|---|---|---|
| Multi 4-D: SA | 288 | 24 | 2.65050 | 1.2536 | 2.16E−05 | 3.2117 |
| PSA | 185 | 24 | 1.25434 | 1.0122 | 1.10E−05 | 2.0424 |
| Multi 8-D: SA | 6202 | 54 | 8.1052 | 2.6234 | 0.0494 | 5.3053 |
| PSA | 218 | 54 | 1.5461 | 1.2245 | 0.0021 | 4.9305 |

Definitions of parameters: N denotes the number of iterations; T_{max} denotes the initial temperature; T_{min} is the final temperature; Fval denotes the optimal value of the objective function; CPU is the total execution time.

In Section 5.1, it was observed that the cooling rate of the VCF compared more favourably with that of the geometric scheme than with the Lundy and Mees scheme. To ascertain this observation further, the geometric cooling scheme was incorporated into the PSA and SA algorithms, and each was tested 20 times on the multi-minima functions using different initial temperatures [

In this paper, we have proposed a variable cooling factor (VCF) model for the simulated annealing schedule to speed up the annealing process, and determined its effectiveness by comparison with five other cooling schemes, namely Lundy and Mees (L&M), geometric, linear, exponential, and Aarts et al. The geometric and exponential cooling schemes produced faster cooling rates (super-cooling) than the VCF scheme, which showed a reasonably slower cooling rate, suitable for the annealing process.

For the first time in the literature, Powell's method has been merged with the SA process to form a new SA algorithm, called the Powell-simulated annealing (PSA) algorithm, via the variable cooling factor scheme. The PSA has successfully been used to find the global minima of test functions of several variables without calculating their derivatives. It was tested against the geometric SA algorithm and the Nelder and Mead Simplex (NMS) method on Rosenbrock valleys in 2 dimensions and multi-minima functions in 3, 4 and 8 dimensions. The PSA algorithm proves to be more reliable, always finding the optimum solution or a point very close to it with a minimal number of iterations and minimal computational time.

The PSA algorithm has been programmed to run on android smartphone systems and on the Web to facilitate the computation of combinatorial optimization problems faced by computer and electrical engineering practitioners. The scheme has only been tested on multi-minima function problems. Therefore, further work is required to optimize the SA schedule parameters to make them more suitable for optimization problems which cannot be solved by conventional algorithms.

The authors are grateful to all editors and anonymous reviewers for their criticisms and useful comments, which helped to improve this work.

Peprah, A.K., Appiah, S.K. and Amponsah, S.K. (2017) An Optimal Cooling Schedule Using a Simulated Annealing Based Approach. Applied Mathematics, 8, 1195-1210. https://doi.org/10.4236/am.2017.88090