A Scaled Conjugate Gradient Method Based on New BFGS Secant Equation with Modified Nonmonotone Line Search

In this paper, we propose and analyze a new scaled conjugate gradient method based on a modified secant equation of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method and on a new modified nonmonotone line search technique. The method incorporates the modified BFGS secant equation in an effort to include second order information of the objective function. The new secant equation uses both gradient and function value information, and its update formula inherits the positive definiteness of the Hessian approximation for general convex functions. In order to improve the likelihood of finding a global optimal solution, we introduce a new modified nonmonotone line search technique. It is shown that, for nonsmooth convex problems, the proposed algorithm is globally convergent. Numerical results show that this new scaled conjugate gradient algorithm is promising and efficient for solving not only convex but also some large scale nonsmooth nonconvex problems in the sense of the Dolan-Moré performance profiles.


Introduction
The conjugate gradient (CG) method and the quasi-Newton method are two popular iterative methods for solving smooth unconstrained optimization problems, and the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method is one of the most efficient quasi-Newton methods for small and medium sized unconstrained optimization problems [1] [2] [3] [4]. The iterates are computed by

x_{k+1} = x_k + α_k d_k,

where α_k is a step size and d_k is a search direction. For a continuously differentiable function h: R^n → R, the minimization problem

min_{x ∈ R^n} h(x) (2)

has been well studied for several decades. The conjugate gradient method is among the preferable methods for solving problem (2), with search direction d_k given by

d_k = −∇h_k + β_k d_{k−1}, d_0 = −∇h_0, (3)

where ∇h_k is the gradient of the objective function h(x) at the k-th iterate and β_k is a scalar characterizing the particular CG method.
Some well-known formulas for the scalar β_k are the Hestenes-Stiefel (HS) [5], Fletcher-Reeves (FR) [6], Polak-Ribière-Polyak (PRP) [7], and Dai-Yuan (DY) [8] formulas, given by

β_k^{HS} = ∇h_k^T y_{k−1} / (d_{k−1}^T y_{k−1}), β_k^{FR} = ||∇h_k||² / ||∇h_{k−1}||²,

β_k^{PRP} = ∇h_k^T y_{k−1} / ||∇h_{k−1}||², β_k^{DY} = ||∇h_k||² / (d_{k−1}^T y_{k−1}),

where y_{k−1} = ∇h_k − ∇h_{k−1}. The BFGS method updates a Hessian approximation B_k by

B_{k+1} = B_k − (B_k s_k s_k^T B_k)/(s_k^T B_k s_k) + (y_k y_k^T)/(y_k^T s_k),

where s_k is defined as s_k = x_{k+1} − x_k. This update preserves positive definiteness if y_k^T s_k > 0, which is known as the curvature condition. The BFGS method has very interesting properties and remains one of the most respected quasi-Newton methods for unconstrained optimization [11]. The theory of the BFGS method and its global convergence have been established by many researchers (see [12]). For convex objective functions, using some special inexact line searches, it has been proved that the BFGS method is globally convergent (see [13] [14] [15]). However, when the objective function is nonconvex, the BFGS method under exact line search may fail to converge [16]. Moreover, Dai [17] proved that the BFGS method may fail for nonconvex functions with the Wolfe line search techniques given in (4) and (5). For τ ∈ [0, 1], it is not difficult to notice that the denominator of (11) is a convex combination of the denominators of the conjugate parameters of the HS and PRP conjugate gradient methods. The choice of the spectral parameter given in (10) ensures the sufficient descent property of the search direction independently of the line search. The convergence of that method was analyzed under a new modified nonmonotone line search with some mild conditions. However, this spectral CG method uses only first order information and excludes second order information. When the problem dimension is large, CG methods are more effective than BFGS methods in terms of CPU time, but in terms of the numbers of iterations and function evaluations, BFGS methods are better. In order to combine the remarkable properties of the CG and BFGS methods and to overcome their drawbacks, many hybrids of CG and BFGS methods have been introduced for unconstrained smooth optimization [27] [28] [29].
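To make the four classical choices of β_k concrete, the following is a minimal illustrative sketch (not part of the paper's algorithm; the function and variable names are our own):

```python
import numpy as np

def cg_betas(g_new, g_old, d_old):
    """Classical conjugate gradient parameters (HS, FR, PRP, DY).

    g_new, g_old: gradients at the current and previous iterates;
    d_old: previous search direction.  Illustrative helper only.
    """
    y = g_new - g_old  # gradient difference y_{k-1}
    return {
        "HS": (g_new @ y) / (d_old @ y),
        "FR": (g_new @ g_new) / (g_old @ g_old),
        "PRP": (g_new @ y) / (g_old @ g_old),
        "DY": (g_new @ g_new) / (d_old @ y),
    }

def cg_direction(g_new, d_old, beta):
    # d_k = -grad h_k + beta_k * d_{k-1}
    return -g_new + beta * d_old
```

Note that the four formulas share only two distinct numerators and two distinct denominators, which is what makes convex combinations of them (as in the spectral method mentioned above) natural.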
However, the usage of these methods is mainly restricted to smooth optimization problems. Recently, Yuan et al. [30] [31] [32] [33] introduced some CG approaches to solve large scale nonsmooth convex problems using smoothing regularization, and under some assumptions the global convergence properties of these approaches were analyzed. Yuan and Wei [34] proposed the Barzilai-Borwein (BB) gradient method with a nonmonotone line search to solve nonsmooth convex optimization problems. Some implementable quasi-Newton methods have also been introduced for the same problem (see [35] [36] [37] [38]). More recently, Ou and Zhou [39] introduced a modified scaled BFGS preconditioned CG algorithm and, under appropriate assumptions, proved its global convergence for nonsmooth convex functions.
Motivated by the work of Ou and Zhou [39], in this paper we propose a hybrid of a scaled CG method and a modified BFGS method, combining the simplicity of the CG method with the Hessian approximation of the BFGS method. Our work is mainly focused on developing a scaled conjugate search direction that includes second order information of the objective function by incorporating the modified secant equation of the BFGS method. In contrast to the work of Ou and Zhou [39], our method uses both function value and gradient information of the objective function. Moreover, our method leads to a better descent direction than the CG methods proposed so far. To the best of our knowledge, this is the first work to combine a scaled CG algorithm with the modified BFGS secant equation for nonsmooth optimization. The paper is organized as follows. In the next section, we consider the nonsmooth convex problem and review its basic results. In Section 3, we present the new scaled CG method based on the modified BFGS secant equation.

Nonsmooth Convex Problems and Their Basic Results
In this section, we consider the unconstrained optimization problem

min_{x ∈ R^n} f(x), (12)

where f: R^n → R is a possibly nonsmooth convex function. This problem is equivalent to the problem

min_{x ∈ R^n} F(x), (13)

where F is the Moreau-Yosida regularization of f [40], defined by

F(x) = min_{z ∈ R^n} { f(z) + (1/(2λ)) ||z − x||² }, (14)

and λ is a positive parameter. The function F is a finite-valued, continuously differentiable convex function even though the function f may be nondifferentiable (see [40]). Let p(x) be the unique solution of (14). In what follows, we can express F and its gradient as

F(x) = f(p(x)) + (1/(2λ)) ||p(x) − x||², g(x) := ∇F(x) = (x − p(x))/λ.

Moreover, the gradient of F is globally Lipschitz continuous, i.e.,

||g(x) − g(y)|| ≤ (1/λ) ||x − y|| for all x, y ∈ R^n.

The point x ∈ R^n is an optimal solution to (12) if and only if g(x) = 0 (see [40]). Furthermore, under reasonable conditions the gradient of F is semismooth, and some of its remarkable properties are given in [41] [42].
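A one-dimensional example helps fix the ideas. For f(x) = |x| the minimizer p(x) of (14) has the well-known closed form of the soft-thresholding operator, and F is the Huber function. The sketch below (an illustration under that choice of f, not part of the paper) evaluates F and g via p(x):

```python
import numpy as np

lam = 0.5  # regularization parameter lambda > 0 (illustrative choice)

def f(x):
    # nonsmooth convex model function f(x) = |x|
    return abs(x)

def p(x):
    # unique minimizer of f(z) + |z - x|^2 / (2*lam):
    # for f = |.| this is the soft-thresholding operator
    return np.sign(x) * max(abs(x) - lam, 0.0)

def F(x):
    # Moreau-Yosida regularization, evaluated at its minimizer p(x)
    z = p(x)
    return f(z) + (z - x) ** 2 / (2 * lam)

def g(x):
    # gradient of F: g(x) = (x - p(x)) / lam
    return (x - p(x)) / lam
```

Although f is nondifferentiable at 0, F is smooth everywhere: near the origin F behaves like x²/(2λ), and far from it like |x| − λ/2, with g globally Lipschitz of constant 1/λ.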
Several methods have been proposed to solve (13) by incorporating ideas from bundle methods and quasi-Newton methods [43] [44] [45], but it is burdensome to evaluate the exact value of p(x) at any given point x [46]. Luckily, for each x ∈ R^n and any ε > 0, we can find an approximation p^α(x, ε) ∈ R^n such that

f(p^α(x, ε)) + (1/(2λ)) ||p^α(x, ε) − x||² ≤ F(x) + ε.

Therefore, we can approximate F(x) and g(x) by

F^α(x, ε) = f(p^α(x, ε)) + (1/(2λ)) ||p^α(x, ε) − x||² (19)

and

g^α(x, ε) = (x − p^α(x, ε))/λ, (20)

respectively. Implementable algorithms to define such a p^α(x, ε) for the nonsmooth convex model can be found in [47]. The noticeable attributes of these approximations are summarized in the following proposition.

Proposition 1. Let F^α(x, ε) and g^α(x, ε) be defined by (19) and (20), respectively. Then

F(x) ≤ F^α(x, ε) ≤ F(x) + ε and ||g^α(x, ε) − g(x)|| ≤ √(2ε/λ).

Proposition 1 shows that the approximations F^α(x, ε) and g^α(x, ε) can be made arbitrarily close to the exact values of F(x) and g(x), respectively.

A Scaled CG Method Based on New BFGS Secant Equation
In this section, we introduce the new scaled CG search direction that incorporates the modified BFGS secant equation, and then describe the new algorithm for solving nonsmooth problems. We make use of a modified nonmonotone line search technique introduced in [23] to compute a step size. Based on the above approximations, we redefine the search direction of the CG method (3) to solve problem (13), where ε is an appropriately chosen positive number. Ou and Zhou [39] provided a search direction whose vectors y*_k and t_k in (27) are built from gradient values only; it is easy to observe that (27) uses no function value information. In order to use both gradient and function value information, we replace (27) and (29) by modified quantities, respectively. Thus, the BFGS method with the new secant equation and update formula uses both gradient and function value information, and the resulting Hessian approximation remains positive definite. Now, based on the above search direction, we describe our new scaled CG algorithm.

Algorithm 1.
Step 0. Given the required parameters and an initial point x_0 ∈ R^n.

Step 1. If ||g^α(x_k, ε_k)|| is smaller than the prescribed stopping tolerance, then stop; else go to the next step.
Step 4. If (38) does not hold, set α_k := βα_k and go to Step 5.
It can be observed that the line search technique in step 5 of Algorithm 1 is a nonmonotone line search technique with some modifications.
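The flavor of such a nonmonotone acceptance test can be sketched as follows. This is a stand-in of the classical Grippo-Lampariello-Lucidi (GLL) type, which compares against the maximum of a few recent function values; the paper's actual modified condition (38) differs in its details, and the function name and parameters below are our own:

```python
def nonmonotone_armijo(phi, grad_dot_d, f_hist, alpha0=1.0,
                       delta=1e-4, beta=0.5, max_back=50):
    """Backtracking Armijo line search of nonmonotone (GLL) type.

    phi(alpha) evaluates F(x_k + alpha * d_k); grad_dot_d is g_k^T d_k < 0;
    f_hist holds the last few function values.  The acceptance test uses
    max(f_hist) instead of F(x_k), so occasional increases of F are allowed
    while overall progress is still enforced.
    """
    f_ref = max(f_hist)           # nonmonotone reference value
    alpha = alpha0
    for _ in range(max_back):
        if phi(alpha) <= f_ref + delta * alpha * grad_dot_d:
            return alpha          # step accepted
        alpha *= beta             # shrink: alpha_k := beta * alpha_k
    return alpha
```

Allowing the objective to increase occasionally is what improves the likelihood of escaping narrow valleys and finding a global optimal solution, as discussed in the introduction.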

Convergence Analysis
In this section, we establish the global convergence of our method for the nonsmooth convex problem (12). To prove the global convergence of Algorithm 1, the following lemmas are needed.
Lemma 1. Assume that the search direction d_k is generated by Algorithm 1. Then for all k ≥ 0 there exist positive constants c₁ and c₂ such that

g_k^T d_k ≤ −c₁ ||g_k||² and ||d_k|| ≤ c₂ ||g_k||,

i.e., the direction satisfies the sufficient descent condition and belongs to a trust region.
Proof. If k = 0, then d_0 = −g_0, and both inequalities hold trivially.
Proof. Using the mean value theorem, we obtain the required intermediate inequality; combining it with (42), the claim follows. Thus, the proof is completed.
By (40), (41) and (44), this contradicts our assumption on F; hence the theorem is proved.
Theorem 2. Let the conditions in Lemma 1 and Theorem 1 hold. Then Algorithm 1 converges for the nonsmooth problem (12); in particular, any accumulation point x* of the sequence generated by Algorithm 1 satisfies x* = p(x*). Therefore, x* is an optimal solution of the nonsmooth problem (12).

Numerical Experiments for Large Scale Nonsmooth Problems
In this section, we present some numerical experiments to examine the efficiency of the proposed method, denoted SCG-MBFGS. The parameters for the other three methods are chosen as in [30] [31] and [39], respectively. Table 1 shows the numerical results of SCG-MBFGS, MPRP, MHS and MSBFGS-CG on the given test problems. Among the columns of Table 1, f(x) denotes the value of f(x) at the final iteration.
From the numerical results in Table 1, and from Figure 1 and Figure 2, it is not difficult to see that SCG-MBFGS performs better than the other methods in terms of the numbers of iterations and function evaluations. Figure 3 indicates that MHS is comparable to SCG-MBFGS in terms of CPU time; since the search direction of MHS is developed with only first order information while SCG-MBFGS, MPRP and MSBFGS-CG use second order information, it is reasonable for MHS to need less CPU time.
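The comparisons above are stated in the sense of Dolan-Moré performance profiles. A minimal sketch of how such profiles are computed from a cost table is given below (illustrative helper, not the paper's code; `T` and `taus` are names we introduce):

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More performance profiles.

    T is an (n_problems, n_solvers) array of costs (e.g. CPU time or
    iteration counts; np.inf can mark a failure).  For each tau in taus,
    returns the fraction of problems on which each solver is within a
    factor tau of the best solver -- rho_s(tau) in Dolan-More notation.
    """
    T = np.asarray(T, dtype=float)
    best = T.min(axis=1, keepdims=True)   # best cost per problem
    ratios = T / best                     # performance ratios r_{p,s}
    # rho_s(tau) = |{p : r_{p,s} <= tau}| / n_problems
    return np.array([[np.mean(ratios[:, s] <= tau)
                      for s in range(T.shape[1])] for tau in taus])
```

A solver whose curve ρ_s(τ) sits highest at τ = 1 wins most often; a curve that reaches 1 quickly as τ grows indicates robustness, which is how Figures 1-3 are read.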

Conclusion
In this paper, we propose a new scaled conjugate gradient method which incorporates a modified secant equation of the BFGS method. This modified secant equation contains both function value and gradient information of the objective function, and its Hessian approximation update generates positive definite matrices. Under a modified nonmonotone line search and some mild conditions, the strong global convergence of the proposed method is established for nonsmooth convex problems. The search direction of our new method satisfies the sufficient descent condition and belongs to a trust region. Compared with existing nonsmooth CG methods, the search direction of our approach achieves stronger descent. Numerical results and related comparisons show that the proposed method is effective for solving large scale nonsmooth optimization problems.