A Retrospective Filter Trust Region Algorithm For Unconstrained Optimization

In this paper, we propose a retrospective filter trust region algorithm for unconstrained optimization, which is based on the framework of the retrospective trust region method and associated with the technique of the multi-dimensional filter. The new algorithm gives a good estimate of the trust region radius and relaxes the condition for accepting a trial step compared with the usual trust region methods. Under reasonable assumptions, we analyze the global convergence of the new method and report the preliminary results of numerical tests. We compare the results with those of the basic trust region algorithm, the filter trust region algorithm and the retrospective trust region algorithm, which shows the effectiveness of the new algorithm.


Introduction
Consider the following unconstrained optimization problem
$$\min_{x \in \mathbb{R}^n} f(x), \qquad (1)$$
where $f : \mathbb{R}^n \to \mathbb{R}$ is a twice continuously differentiable function.
The trust region method for unconstrained optimization was first presented by Powell [1]. It is, in some sense, equivalent to the Levenberg-Marquardt method for least-squares problems given by Levenberg [2] and Marquardt [3]. The basic idea of trust region methods works as follows.
In a neighborhood of the current iterate (which is called the trust region), we define a model function that approximates the objective function and compute a trial step within the trust region that yields a sufficient model decrease. Then we compare the achieved reduction in f(x) to the reduction predicted by the model for the trial step. If the ratio of achieved versus predicted reduction is sufficiently positive, we define our next iterate to be the trial point; if this ratio is not sufficiently positive, we decrease the trust region radius in order to make the trust region smaller; otherwise, we may increase it or possibly keep it unchanged.
Because of its naturalness, strong convergence properties and robustness, the trust region method has attracted the attention of many researchers, such as Powell [1,4,5], Schultz et al. [6], Sorensen [7], Moré [8], Yuan [9] and so on. In recent years, the trust region method has been applied to optimization problems with equality constraints [10], simple bound constraints [11], convex constraints [12] and so on. Many convergence results have been obtained, which can be seen in [13].
In Fletcher and Leyffer [14], a new technique for globalizing methods for nonlinear programming (NLP) is presented. The idea, referred to as an NLP filter, is motivated by the aim of avoiding the need to choose penalty parameters, and considers the relationship between the objective function and the constraint violation from the viewpoint of multi-objective optimization. They take the values of the objective function and the constraint violation as a pair (the list of such pairs is called the filter), construct a sophisticated filter mechanism by comparing the pairs, and use it to drive the algorithm to a stationary point of the problem. The results of numerical tests show that filter methods are very effective. Fletcher et al. [14,15], Toint et al. [16], Ulbrich et al. [17] and Wächter et al. [18,19] have combined the idea with the SQP method, the trust region method, interior-point methods and line search methods, respectively, and obtained some interesting results about the filter method.
Fletcher, Leyffer and Toint [20] review the ideas above and mention the application of the filter method in practice. In [14], they study the problem of the following form:
$$\min f(x) \quad \text{subject to} \quad c(x) \le 0,$$
where $c$ is a continuously differentiable function. Define the measure of the constraint violation as $h(x) = \|c(x)^{+}\|$, where $c(x)^{+} = \max(0, c(x))$ componentwise. Now, we give the following definitions about the filter methods.
Definition 1.1 A pair $(f(x_k), h(x_k))$ is said to dominate another pair $(f(x_l), h(x_l))$ if and only if $f(x_k) \le f(x_l)$ and $h(x_k) \le h(x_l)$.

Definition 1.2 A filter is a list of pairs $(f_l, h_l)$ such that no pair dominates any other. We use $F_k$ to denote the set of iteration indices of the pairs in the current filter.

Definition 1.3 A pair $(f(x_k), h(x_k))$ is said to be acceptable for inclusion in the filter if it is not dominated by any pair in the filter, that is, for every $l \in F_k$,
$$f(x_k) < f_l \quad \text{or} \quad h(x_k) < h_l. \qquad (2)$$
In order to obtain the global convergence of the algorithm, we should make $f$, $h$ satisfy a sufficient reduction condition, so we strengthen the acceptance rule (2) to
$$f(x_k) \le f_l - \gamma h_l \quad \text{or} \quad h(x_k) \le \beta h_l \qquad (3)$$
for every $l \in F_k$, where $\gamma \in (0,1)$ is very small and $\beta \in (0,1)$ is close to 1; there is a negligible difference in practice between (2) and (3).

When a pair $(f(x_k), h(x_k))$ is added to the list of pairs in the filter, any pairs in the filter that are dominated by the new pair are removed, that is, we remove every pair $(f_l, h_l)$ with $f(x_k) \le f_l$ and $h(x_k) \le h_l$.
This is called the modification of the filter. Gould et al. [21] and Miao et al. [22] apply the filter technique to unconstrained optimization; its characteristic is to relax the condition of accepting a trial step for the usual trust region method, which improves the effectiveness of the algorithm in some sense. Nonmonotone algorithms also have this characteristic [23,24].
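As an illustration (not the paper's code), the pair-based filter mechanism above can be sketched as follows; the constants $\beta$ and $\gamma$ below are example values for the sufficient-reduction rule (3).

```python
# Illustrative sketch of the Fletcher-Leyffer filter for pairs (f, h):
# acceptance test (3) and filter update with removal of dominated pairs.
# BETA and GAMMA are hypothetical example values in (0, 1).

BETA, GAMMA = 0.99, 1e-5

def acceptable(pair, filter_pairs):
    """A pair (f, h) is acceptable if, against every filter entry,
    it sufficiently reduces either f or h."""
    f, h = pair
    return all(f <= fl - GAMMA * hl or h <= BETA * hl
               for fl, hl in filter_pairs)

def add_to_filter(pair, filter_pairs):
    """Add an acceptable pair and remove the entries it dominates."""
    f, h = pair
    kept = [(fl, hl) for fl, hl in filter_pairs
            if not (f <= fl and h <= hl)]  # drop dominated pairs
    kept.append(pair)
    return kept
```

For example, against the single entry `(1.0, 0.5)`, the pair `(0.9, 0.6)` is acceptable (its objective value is sufficiently smaller) even though its constraint violation is larger.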
Recently, Bastin et al. [25] presented a retrospective trust region method for unconstrained optimization. Comparing their algorithm with the basic trust region algorithm, the way of updating the trust region radius is different, and the retrospective ratio plays the following two roles: 1) it determines whether the trial step is accepted by the algorithm; 2) it adjusts the trust region radius correspondingly.
In the retrospective trust region method, these two roles are played by the ratio $\rho_k$ and the retrospective ratio $\tilde\rho_k$, respectively. In the basic trust region algorithm, the determination of the trust region radius is an important and difficult problem. Sartenaer [26] and Zhang et al. [27] present self-adaptive trust region methods and give some discussions about the determination of the trust region radius. The retrospective ratio of Bastin et al. [25] uses the information at the current iterate and the last iterate, which can give a more effective estimate of the trust region radius. Hence, the number of trust region subproblems to be solved may be decreased, which improves the effectiveness of the method.
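For concreteness, the difference between the two ratios can be sketched in one dimension; the objective $f(x) = x^4$, the step, and the use of exact second-order Taylor models are our illustrative assumptions, not the paper's setting.

```python
# One-dimensional illustration of the standard ratio rho_k and the
# retrospective ratio of Bastin et al.; the models are exact
# second-order Taylor models of the example objective.

def f(x):
    return x ** 4  # example objective

def model(center, x):
    """Second-order Taylor model of f built at `center`."""
    g = 4.0 * center ** 3       # f'(center)
    H = 12.0 * center ** 2      # f''(center)
    s = x - center
    return f(center) + g * s + 0.5 * H * s * s

xk, sk = 1.0, -0.4
xk1 = xk + sk

# rho_k: achieved reduction versus the reduction predicted by the
# CURRENT model m_k built at x_k.
rho = (f(xk) - f(xk1)) / (model(xk, xk) - model(xk, xk1))

# Retrospective ratio: the same achieved reduction, measured against
# the NEW model m_{k+1} built at x_{k+1}; it tells how well the new
# model would have predicted the step just taken.
rho_retro = (f(xk) - f(xk1)) / (model(xk1, xk) - model(xk1, xk1))
```

The retrospective ratio is then used to size the radius for the next subproblem, since it reflects the quality of the model that will actually be minimized next.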
In this paper we present a new algorithm for unconstrained optimization, which is based on the framework of the retrospective trust region method [25] and associated with the technique of the multi-dimensional filter [21,22]. Under reasonable assumptions, we analyze the global convergence of the new method and report the preliminary results of numerical tests. We compare the results with those of the basic trust region algorithm, the filter trust region algorithm and the retrospective trust region algorithm, which shows the effectiveness of the new algorithm. This paper is organized as follows. The new algorithm is described in Section 2. Basic assumptions and some lemmas are given in Section 3. The analysis of first order and second order convergence is given in Sections 4 and 5, respectively. Section 6 reports the numerical results. Finally, we give some final remarks on this approach.

Algorithm
In this paper, we define $f_k = f(x_k)$. At the current iterate $x_k$, $g_k = g(x_k) = \nabla f(x_k)$, and $g_k^{(i)}$ denotes the $i$th component of the vector $g_k$. Throughout this paper, $\|\cdot\|$ denotes the Euclidean norm. Now, we review the basic trust region algorithm as follows.

Algorithm BTR (Basic Trust Region Algorithm)

Step 0. (Initialization) Given an initial point $x_0 \in \mathbb{R}^n$, an initial trust-region radius $\Delta_0 > 0$, and constants $0 < \eta_1 \le \eta_2 < 1$ and $0 < \gamma_1 \le \gamma_2 < 1$. Set $k := 0$.

Step 1. (Model definition) Define a model function $m_k$ that approximates $f$ in the trust region $\{x : \|x - x_k\| \le \Delta_k\}$.

Step 2. (Step calculation) Compute a trial step $s_k$ that sufficiently reduces the model by approximately solving the trust region subproblem
$$\min_{\|s\| \le \Delta_k} m_k(x_k + s). \qquad (4)$$

Step 3. (Updating iterate point) Compute
$$\rho_k = \frac{f(x_k) - f(x_k + s_k)}{m_k(x_k) - m_k(x_k + s_k)};$$
if $\rho_k \ge \eta_1$, set $x_{k+1} = x_k + s_k$, otherwise set $x_{k+1} = x_k$. Update the trust-region radius accordingly, set $k := k + 1$, and go to Step 1.

In the algorithm BTR, we do not give a formal stopping criterion. In practice, the stopping criterion can be installed in Step 1, such as $k \ge k_{\max}$ or $\|g_k\| \le eps$, where $eps$ denotes the precision and $k_{\max}$ denotes the maximal number of iterations.
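The algorithm BTR can be sketched in Python as follows; this is an illustration (not the paper's Matlab code) that uses the Cauchy step as the approximate subproblem solution, and the test function and all constants are example choices.

```python
# Illustrative sketch of the basic trust region algorithm (BTR) with
# a Cauchy-point step; the objective and constants are example choices.
import math

def f(x):
    # example objective: convex, non-quadratic, minimizer at the origin
    return x[0] ** 2 + 5.0 * x[1] ** 2 + x[0] ** 4

def grad(x):
    return [2.0 * x[0] + 4.0 * x[0] ** 3, 10.0 * x[1]]

def hess(x):
    return [[2.0 + 12.0 * x[0] ** 2, 0.0], [0.0, 10.0]]

def cauchy_step(g, H, delta):
    """Minimize the quadratic model along -g inside the trust region."""
    gnorm = math.hypot(g[0], g[1])
    Hg = [H[0][0] * g[0] + H[0][1] * g[1],
          H[1][0] * g[0] + H[1][1] * g[1]]
    gHg = g[0] * Hg[0] + g[1] * Hg[1]
    t = delta / gnorm                       # step to the boundary
    if gHg > 0.0:
        t = min(t, gnorm ** 2 / gHg)        # interior minimizer
    return [-t * g[0], -t * g[1]]

def btr(x, delta, eta1=0.01, eta2=0.9, gamma1=0.5, gamma2=2.0,
        eps=1e-6, kmax=20000):
    for k in range(kmax):
        g, H = grad(x), hess(x)
        if math.hypot(g[0], g[1]) <= eps:   # stopping test in Step 1
            return x, k
        s = cauchy_step(g, H, delta)        # Step 2
        Hs = [H[0][0] * s[0] + H[0][1] * s[1],
              H[1][0] * s[0] + H[1][1] * s[1]]
        pred = -(g[0] * s[0] + g[1] * s[1]) \
               - 0.5 * (s[0] * Hs[0] + s[1] * Hs[1])
        xt = [x[0] + s[0], x[1] + s[1]]
        rho = (f(x) - f(xt)) / pred         # Step 3: achieved/predicted
        if rho >= eta1:
            x = xt                          # accept the trial point
        if rho >= eta2:
            delta *= gamma2                 # very successful: enlarge
        elif rho < eta1:
            delta *= gamma1                 # unsuccessful: shrink
    return x, kmax
```

For example, `btr([2.0, 1.0], 1.0)` drives the gradient norm below `1e-6` quickly on this well-conditioned problem.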
If $x^*$ is a local minimizer of the problem (1), then $g(x^*) = 0$. Motivated by the filter method, we set $g(x)$ to be the measure of the iterate. Now we introduce some definitions about the multi-dimensional filter.

Definition 2.1 We say that a point $x_k$ dominates another point $x_l$ if
$$|g_k^{(i)}| \le |g_l^{(i)}| \quad \text{for all } i = 1, 2, \dots, n.$$

Definition 2.2 A multi-dimensional filter $F$ is a list of gradient vectors $g_l$ such that no entry dominates any other.

Definition 2.3 The iterate point $x_k$ is acceptable for the filter $F$ if and only if for all $g_l \in F$ there exists $i \in \{1, \dots, n\}$ such that
$$|g_k^{(i)}| < |g_l^{(i)}| - \gamma_g \|g_l\|,$$
where $\gamma_g \in (0, 1/\sqrt{n})$. If $x_k$ is acceptable, then it is added to the filter and any iterates in the filter that are dominated by the new iterate are removed, which is called the modification of the filter.
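A minimal sketch of this multi-dimensional filter mechanism follows; the margin constant below is an assumed example value, standing in for the $\gamma_g \in (0, 1/\sqrt{n})$ of [21].

```python
# Sketch of the multi-dimensional filter for unconstrained problems:
# entries are gradient vectors, dominance is component-wise in
# absolute value.  GAMMA_G is an assumed example margin.
GAMMA_G = 0.001

def norm(v):
    return sum(x * x for x in v) ** 0.5

def dominates(ga, gb):
    """ga dominates gb if |ga_i| <= |gb_i| for every component i."""
    return all(abs(a) <= abs(b) for a, b in zip(ga, gb))

def acceptable_for_filter(g, entries):
    """Acceptable iff, against every entry, some component of g is
    smaller by a margin proportional to the entry's norm."""
    return all(any(abs(g[i]) < abs(gl[i]) - GAMMA_G * norm(gl)
                   for i in range(len(g)))
               for gl in entries)

def update_filter(g, entries):
    """Add g and remove the entries it dominates."""
    kept = [gl for gl in entries if not dominates(g, gl)]
    kept.append(g)
    return kept
```

Note that a point may be accepted even if its gradient norm is larger than that of every filter entry, as long as one gradient component is sufficiently smaller; this is exactly how the filter relaxes the usual acceptance test.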
Combining the filter technique with the retrospective idea, we describe our algorithm as follows.

Algorithm RFTR (Retrospective Filter Trust-Region Algorithm)

Step 0. (Initialization) An initial point $x_0 \in \mathbb{R}^n$ and an initial trust-region radius $\Delta_0 > 0$ are given, together with constants $0 < \eta_1 \le \eta_2 < 1$ and $0 < \gamma_1 \le \gamma_2 < 1$. Set the initial filter $F$ to be the empty set and set $k := 0$.

Step 1. (Model definition) Define a model function $m_k$ that approximates $f$ in the trust region.

Step 2. (Retrospective radius update) For $k \ge 1$, compute the retrospective ratio
$$\tilde\rho_k = \frac{f(x_{k-1}) - f(x_k)}{m_k(x_{k-1}) - m_k(x_k)}$$
and update the trust-region radius $\Delta_k$ according to $\tilde\rho_k$.

Step 3. (Step calculation) Compute a trial step $s_k$ for solving the trust region subproblem (4).

Step 4. (Acceptance test) Compute
$$\rho_k = \frac{f(x_k) - f(x_k + s_k)}{m_k(x_k) - m_k(x_k + s_k)}.$$

Step 5. (Updating iterate point)
Case 1. If $\rho_k \ge \eta_1$, set $x_{k+1} = x_k + s_k$.
Case 2. If $\rho_k < \eta_1$ and $x_k + s_k$ is accepted by the filter $F$, set $x_{k+1} = x_k + s_k$ and add the new iterate to the filter.
Case 3. If $\rho_k < \eta_1$ and $x_k + s_k$ is not accepted by the filter $F$, set $x_{k+1} = x_k$.
Set $k := k + 1$ and go to Step 1.

In the algorithm RFTR, the retrospective idea and the filter technique are the two important characteristics. The retrospective ratio uses the information at the current iterate and the last iterate to adjust the trust-region radius, which can give a more effective estimate of the trust region radius. The filter technique relaxes the condition of accepting a trial step compared with the usual trust region method, which improves the effectiveness of the algorithm in some sense. From the algorithm RFTR, if the trial point is not accepted (Case 3 in Step 5 occurs), then the algorithm behaves like the basic trust-region algorithm, the only difference being that we use the retrospective idea in the algorithm RFTR. However, if the trial point is accepted by the algorithm (Case 1 or Case 2 in Step 5 occurs), the retrospective idea and the filter technique both play their roles.
At the iterate $x_k$, if the trial point is accepted, i.e., $x_{k+1} = x_k + s_k$, then the iterate is called a successful iterate and the iteration index $k$ is called a successful iteration.
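The case analysis in Step 5 and a retrospective radius update can be sketched as follows; the thresholds and update factors are illustrative values, and the paper's exact rules may differ.

```python
# Hedged sketch of the acceptance logic of Step 5 in RFTR and of a
# retrospective radius update; all constants are illustrative.
ETA1 = 0.01

def step5(rho, trial_acceptable_to_filter):
    """Return (accept_trial, add_trial_to_filter)."""
    if rho >= ETA1:
        return True, False    # Case 1: sufficient model agreement
    if trial_acceptable_to_filter:
        return True, True     # Case 2: rescued by the filter
    return False, False       # Case 3: rejected

def retrospective_radius(delta, rho_tilde, eta1=0.05, eta2=0.9,
                         gamma1=0.5, gamma2=2.0):
    """Update the radius from the retrospective ratio."""
    if rho_tilde >= eta2:     # new model explains the past step well
        return gamma2 * delta
    if rho_tilde >= eta1:
        return delta
    return gamma1 * delta     # poor retrospective agreement: shrink
```

The point of the split is that a step rejected by the usual ratio test can still be kept when the filter certifies progress toward criticality, while the radius is governed by how well the new model explains the last step.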

Basic Assumptions and Lemmas
In this section, we present the global convergence analysis of the algorithm RFTR. We make the following assumptions.
A1 All iterates $x_k$ remain in a closed and bounded convex set $\Omega$.

A2 The function $f$ is twice continuously differentiable on $\Omega$.

A3 The model function $m_k$ is first-order coherent with the function $f$ at the iterate $x_k$, i.e., their values and gradients are equal at $x_k$ for all $k$:
$$m_k(x_k) = f(x_k), \qquad \nabla m_k(x_k) = g_k.$$

A4 The Hessian matrix of the model function, $\nabla_{xx} m_k$, is uniformly bounded, i.e., there exists a constant $\kappa_{umh} \ge 1$ such that
$$\|\nabla_{xx} m_k(x)\| \le \kappa_{umh} - 1$$
holds for all $x \in \mathbb{R}^n$ and all $k$.

Generally speaking, we do not need the global solution of the trust region subproblem; we only expect to decrease the model at least as much as at the Cauchy point. Therefore, we make the following assumption on the solution $s_k$ of the trust region subproblem (4).

A5 There exists a constant $\kappa_{mdc} \in (0,1)$ such that, for all $k$,
$$m_k(x_k) - m_k(x_k + s_k) \ge \kappa_{mdc} \|g_k\| \min\left( \frac{\|g_k\|}{1 + \|\nabla_{xx} m_k(x_k)\|}, \Delta_k \right).$$

By the assumptions A1 and A2, the Hessian matrix of $f$ is uniformly bounded on $\Omega$, i.e., there exists a positive constant $\kappa_{ufh}$ such that $\|\nabla_{xx} f(x)\| \le \kappa_{ufh}$ for all $x \in \Omega$.
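Assumption A5 is the classical Cauchy-decrease condition. The sketch below checks numerically the classical bound $\tfrac{1}{2}\|g\|\min(\|g\|/\|H\|, \Delta)$ (which implies a bound of the A5 type with $\kappa_{mdc} = 1/2$) for the exact Cauchy step of a small quadratic model; the particular $g$, $H$ and $\Delta$ are arbitrary example data.

```python
# Numerical check (illustrative example data) of the Cauchy-decrease
# bound for the exact Cauchy step of a 2-D quadratic model
# m(s) = g's + s'Hs/2.
import math

def model_decrease(g, H, s):
    Hs = [H[0][0] * s[0] + H[0][1] * s[1],
          H[1][0] * s[0] + H[1][1] * s[1]]
    return -(g[0] * s[0] + g[1] * s[1]) - 0.5 * (s[0] * Hs[0] + s[1] * Hs[1])

def cauchy_step(g, H, delta):
    """Minimize the quadratic model along -g inside the trust region."""
    gnorm = math.hypot(g[0], g[1])
    Hg = [H[0][0] * g[0] + H[0][1] * g[1],
          H[1][0] * g[0] + H[1][1] * g[1]]
    gHg = g[0] * Hg[0] + g[1] * Hg[1]
    t = delta / gnorm
    if gHg > 0.0:
        t = min(t, gnorm ** 2 / gHg)
    return [-t * g[0], -t * g[1]]

g, H, delta = [3.0, -1.0], [[4.0, 1.0], [1.0, 2.0]], 0.8
gnorm = math.hypot(g[0], g[1])
H_bound = 5.0  # any valid upper bound on ||H|| works (here ||H||_2 < 5)

lhs = model_decrease(g, H, cauchy_step(g, H, delta))
rhs = 0.5 * gnorm * min(gnorm / H_bound, delta)
assert lhs >= rhs > 0.0  # Cauchy decrease holds with kappa_mdc = 1/2
```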
Now we study the global convergence of the algorithm RFTR. First we give a bound on the difference between the objective function $f$ and the model function $m_k$ at the iterates $x_{k+1}$ and $x_k$. The proof of the following result is similar to Theorem 3.1 in [25].
Lemma 3.1 [25] Suppose A1-A4 hold. Then there exists a positive constant $\kappa_{ubh}$ such that
$$|f(x_{k+1}) - m_k(x_{k+1})| \le \kappa_{ubh} \Delta_k^2.$$
Since the retrospective ratio is defined through the new model $m_{k+1}$, we also need to compare $f$ and $m_{k+1}$ at the previous iterate; this is provided by the next lemma.
Lemma 3.2 [25] Suppose A1-A4 hold. Then for every successful iteration $k$,
$$|f(x_k) - m_{k+1}(x_k)| \le \kappa_{ubh} \Delta_k^2.$$
We conclude from this result that the denominators in the expressions of $\tilde\rho_k$ and $\rho_k$ are of the same order as the error between the objective function and the model function. Similar to Theorem 6.4.2 in [13], we obtain the next result.

Lemma 3.3 Suppose A1-A5 hold, $g_k \ne 0$, and the trust-region radius $\Delta_k$ is sufficiently small relative to $\|g_k\|$. Then iteration $k+1$ is successful and the trust-region radius is not decreased.

Proof. The claim follows by combining the assumptions A3 and A5 with the conditions (5)-(10), the definition of $\rho_k$, and Lemma 3.2; the argument is the same as that of Theorem 6.4.2 in [13].

As a consequence of this property, we may now prove (Lemma 3.4) that the trust region radius cannot become too small as long as a first-order critical point is not approached. The technique of the proof is similar to Theorem 3.4 in [25] and Theorem 6.4.3 in [13].

First Order Convergence
Assume that   k x is an infinite sequence generated by Algorithm RFTR.Under the assumptions (A1)-(A5), we discuss the first order convergence of the sequence   k x .At first, we define the following sets.
The set of successful iteration index The set of the iteration index which is added to the filter or the iterate is added to the filter .
The set of the iteration index which satisfies sufficient descent condition S denotes the cardinal number of the set S. We now establish the criticality of the limit point of the sequence of the iterates when there are only finitely many successful iterations.
Theorem 4.1 Suppose A1-A5 hold and $|S| < \infty$. Then there exists an index $k_0$ such that $x_k = x_{k_0} = x^*$ for all $k \ge k_0$, and $x^*$ is a first-order critical point.
Proof. From Step 2 of the algorithm RFTR, we have that the trust-region radius tends to zero after the last successful iteration. It then follows from Lemma 3.4 that $g_{k_0} = 0$.

Next, we consider the case in which there are infinitely many successful iterations. From the algorithm RFTR, we know that $S = A \cup D$. Therefore we consider the following two cases.
First, we have the following result.

Theorem 4.2 Suppose A1-A5 hold and $|A| = \infty$. Then $\liminf_{k \to \infty} \|g_k\| = 0$.
Proof. Suppose, by contradiction, that the result is not true. Then there exists a positive constant $\kappa_{lbg}$ such that
$$\|g_k\| \ge \kappa_{lbg} \qquad (11)$$
holds for all $k$. Denote the index set of the iterates added to the filter by $\{k_i\}$, so that there exists a subsequence $\{x_{k_i}\}$ of iterates accepted by the filter $F$; by the definition of $k_i$, for every $l \ge 1$ there exists an index $j_l$ for which the filter acceptance condition (12) holds. Since there are only finitely many choices of $j_l$, without loss of generality we set $j_l = j$. Letting $l \to \infty$ in (12) and using (11) yields a contradiction. Thus the result is proved.

Now, we give the result for the remaining case.

Theorem 4.3 Suppose A1-A5 hold, $|S| = \infty$ and $|A| < \infty$. Then there is a limit point of the sequence of iterates generated by the algorithm RFTR which is a first-order critical point; the proof follows from the assumption A5 and Lemma 3.4.

Second Order Convergence
We now exploit second-order information on the objective function to discuss the second order convergence of the sequence. We therefore introduce the following additional assumptions.
A6 The model is asymptotically second-order coherent with the objective function near first-order critical points, i.e.,
$$\lim_{k \to \infty} \|\nabla_{xx} m_k(x_k) - \nabla_{xx} f(x_k)\| = 0 \quad \text{whenever} \quad \lim_{k \to \infty} \|g_k\| = 0.$$

A7 The model Hessian is Lipschitz continuous, i.e., there exists a constant $\kappa_{lch} > 0$ such that
$$\|\nabla_{xx} m_k(x) - \nabla_{xx} m_k(y)\| \le \kappa_{lch} \|x - y\|$$
for all $x, y \in \Omega$ and all $k$.

Lemma 5.1 Suppose that A1-A7 hold. Suppose also that there exist a sequence $\{k_i\}$ and a constant $\epsilon > 0$ such that $\lambda_{\min}[\nabla_{xx} m_{k_i}(x_{k_i})] \le -\epsilon$ for all $i$ sufficiently large. Finally, suppose that $\lim_{i \to \infty} \|s_{k_i}\| = 0$. Then every iteration $k_i$ is successful and the trust-region radius is not decreased for all $i$ sufficiently large.
Proof. We first deduce that every iteration $k_i$ is successful for $i$ sufficiently large. By the mean value theorem and (13), for some points $\xi_{k_i}$ and $\zeta_{k_i}$ in the segment joining $x_{k_i}$ and $x_{k_i} + s_{k_i}$, the difference $\tilde\rho_{k_i} - 1$ can be split into three terms. When $i$ goes to infinity, by our assumption that $s_{k_i}$ tends to 0 and the bounds above, the following holds: combining the assumptions A2 and A7, the first and third terms of the resulting right-hand side tend to 0; meanwhile, the second term tends to 0 because of the assumption A6 and Theorems 4.1, 4.2 and 4.3. As a consequence, $\tilde\rho_{k_i}$ tends to 1 when $i$ goes to infinity, and is thus larger than the acceptance threshold for $i$ sufficiently large.
The rest of the proof is similar to Lemma 3.8 in [25].

Theorem 5.2 Suppose that A1-A7 hold and that the complete sequence of iterates $\{x_k\}$ converges to the unique limit point $x^*$. Then $x^*$ is a second order critical point of (1).

Proof. By Theorems 4.1, 4.2 and 4.3, $g(x^*) = 0$. We suppose, by contradiction, that $x^*$ is not a second-order critical point. Then, by the assumption A6, there exists $k_0$ such that $\lambda_{\min}[\nabla_{xx} m_k(x_k)] \le -\epsilon < 0$ holds for all $k \ge k_0$. Meanwhile, since the whole sequence converges, we know that $\|s_k\|$ tends to 0. Thus, Lemma 5.1 applies and yields a contradiction, and therefore $x^*$ is a second-order critical point of (1).

Numerical Experiments
In this section, preliminary numerical tests of the algorithm BTR, the algorithm FTR [22], the algorithm RTR [25] and the algorithm RFTR are given. The Matlab codes (Version 7.4.0.287, R2007a) were written corresponding to those algorithms. For the numerical tests, we use the trust-region radius update rule proposed in Conn et al. [13].
Here $k_{\max}$ denotes the maximal iteration number. We choose 24 test problems from [29], where "S201" means problem 201 in the Schittkowski (1987) collection [29], 12 test problems from CUTE [25,30], and the famous Extended Rosenbrock test problem. In the following tables, "n" means the test problem's dimension; "nBTR", "nFTR", "nRTR", "nRFTR" mean the number of iterations of the algorithms BTR, FTR, RTR and RFTR, respectively; "ng1", "ng2", "ng3", "ng4" mean the number of gradient evaluations of the algorithms BTR, FTR, RTR and RFTR, respectively.
"r" means the rank of the number of iterations of the algorithm RFTR among the four algorithms, whose value is in {1, 2, 3, 4}: "1" means that the number of iterations of the algorithm RFTR is the smallest among the four algorithms, so the algorithm RFTR is the best one among the four; "4" means that the number of iterations of the algorithm RFTR is the largest among the four algorithms, so the algorithm RFTR is the worst one among the four. "F" means that the algorithm does not stop before the maximal iteration number is reached.
In Table 1, among the 24 test problems there are 20 problems on which the iteration number of the algorithm RFTR is the smallest, 2 problems on which its rank is second, and 2 problems on which its iteration number is the largest. The numerical results show that the total number of trust region subproblems solved by the algorithm RFTR is the smallest.

In Table 2, there are only 2 cases in which the rank is second; in all the other cases the algorithm RFTR is the best. Moreover, the algorithm RFTR becomes more and more effective as the problem's dimension increases.

In Table 3, there is only 1 case in which the rank is second; in all the other cases the algorithm RFTR is the best. Moreover, the retrospective idea clearly takes effect on the problems COSINE, ERRINROS and LOGHAIRY.

Conclusions and Perspectives
The trust region method is very reliable and robust and has very strong convergence properties; it is now one of the most effective classes of algorithms for solving unconstrained optimization. The basic trust region algorithm is a monotone descent algorithm, i.e., the value of the objective function along the iterate sequence $\{x_k\}$ strictly decreases monotonically. If the iterates follow the bottom of a curved narrow valley, the monotone descent algorithm converges very slowly. The idea of the nonmonotone methods [23,24] abandons the restriction that the objective function value must descend, which allows the sequence of iterates to follow the bottom of curved narrow valleys much more loosely and hopefully results in longer and more efficient steps.
Combining the trust region method with the filter technique relaxes, in some sense, the monotonicity condition for accepting the trial step. The filter technique improves the numerical performance for some problems.
The new algorithm RFTR presented in this paper combines the filter technique with the retrospective idea, and the total number of trust region subproblems solved by the algorithm RFTR is decreased. On the other hand, our algorithm also looks like a self-adaptive method based on the trust-region framework. Meanwhile, unlike other self-adaptive methods [26,27], which need to compute the gradient value and function value at an auxiliary point, our algorithm measures the acceptance of the previous iterate and the current iterate by the new and old model functions, respectively, which keeps the robustness property of the trust-region method.
ratio k   uses the reduction in k m instead of the reduction in 1 k Theorem 4.1, 4.2, 4.3,   0 g x   .We suppose, by contradiction, that the assumption A6, there exists 0 k such that, for all   to Step 1. Similar to the algorithm BTR, the stopping criterion can be installed in Step 1, such as A   .
as , p k is sufficiently large.S   and A   imply that k k x