
**New Parameter of CG-Method with Exact Line Search for Unconstrained Optimization**


*Open Access Library Journal*, **7**, 1-8. doi: 10.4236/oalib.1106236.

1. Introduction

The conjugate gradient (CG) method is one of the most important ways to find the minimum of a function in unconstrained optimization.

The CG method is widely used because of its low memory requirements. The unconstrained optimization problem can be expressed as follows:

${\mathrm{min}}_{x\in {R}^{n}}f\left(x\right)$ (1)

where $f:{R}^{n}\to R$ is a continuously differentiable function. The CG method generates a sequence of iterates of the form

${x}_{k+1}={x}_{k}+{\alpha}_{k}{d}_{k},\quad k=0,1,2,\cdots$ (2)

where ${x}_{k}$ is the current iterate and
${\alpha}_{k}>0$ is the step size obtained by the exact line search:

${\alpha}_{k}={\mathrm{min}}_{\alpha >0}f\left({x}_{k}+\alpha {d}_{k}\right)$ (3)

and d_{k} is the search direction, which we get as follows:

${d}_{k}=\{\begin{array}{ll}-{g}_{k}& \text{for}\text{ }k=0\\ -{g}_{k}+{\beta}_{k}{d}_{k-1}& \text{for}\text{ }k\ge 1\end{array}$ (4)

where k is an integer, ${g}_{k}$ is the gradient of the function f(x) at the point ${x}_{k}$ , and ${\beta}_{k}$ is the conjugate gradient coefficient associated with f(x) at ${x}_{k}$ .
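The exact line search (3) is easy to experiment with on a strictly convex quadratic $f\left(x\right)=\frac{1}{2}{x}^{\text{T}}Gx-{b}^{\text{T}}x$ , where it has a closed form. The following NumPy sketch is ours, for illustration only (the function name and the test problem are assumptions, not from the paper):

```python
import numpy as np

def exact_step(G, g, d):
    """Exact line search step for the quadratic f(x) = 0.5*x^T G x - b^T x.

    phi(alpha) = f(x + alpha*d) is quadratic in alpha; setting
    phi'(alpha) = d^T g + alpha * d^T G d = 0 gives the exact minimizer.
    """
    return -(g @ d) / (d @ G @ d)

# small strictly convex example (assumed test data)
G = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = np.zeros(2)
g = G @ x - b        # gradient of the quadratic at x
d = -g               # steepest-descent direction
alpha = exact_step(G, g, d)
g_new = G @ (x + alpha * d) - b   # gradient at the new point
```

After the exact step, the new gradient satisfies ${g}_{k+1}^{\text{T}}{d}_{k}=0$ , which is precisely the exact-line-search (ELS) property the paper's proofs rely on.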

Some well-known conjugate gradient coefficients are:

${\beta}_{k}^{FR}=\frac{{g}_{k}^{\text{T}}{g}_{k}}{{\Vert {g}_{k-1}\Vert}^{2}},\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\beta}_{k}^{PR}=\frac{{g}_{k}^{\text{T}}\left({g}_{k}-{g}_{k-1}\right)}{{\Vert {g}_{k-1}\Vert}^{2}}$

${\beta}_{k}^{HS}=\frac{{g}_{k}^{\text{T}}\left({g}_{k}-{g}_{k-1}\right)}{{\left({g}_{k}-{g}_{k-1}\right)}^{\text{T}}{d}_{k-1}},\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\beta}_{k}^{DY}=\frac{{g}_{k}^{\text{T}}{g}_{k}}{{\left({g}_{k}-{g}_{k-1}\right)}^{\text{T}}{d}_{k-1}}$

${\beta}_{k}^{LS}=\frac{{g}_{k}^{\text{T}}\left({g}_{k}-{g}_{k-1}\right)}{-{g}_{k-1}^{\text{T}}{d}_{k-1}},\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\beta}_{k}^{CD}=-\frac{{g}_{k}^{\text{T}}{g}_{k}}{{d}_{k-1}^{\text{T}}{g}_{k-1}}$

The conjugate gradient coefficient ${\beta}_{k}\in R$ is a scalar whose choice distinguishes the different CG methods, where ${g}_{k-1},{g}_{k}$ denote the gradients of f(x) at the points ${x}_{k-1},{x}_{k}$ , respectively.

The above methods are known as:

Fletcher and Reeves (FR) [1], Polak and Ribière (PR) [2], Hestenes and Stiefel (HS) [3], Dai and Yuan (DY) [4], Liu and Storey (LS) [5], and Conjugate Descent (CD) by Fletcher [6].
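For concreteness, the six classical coefficients translate directly into code. The sketch below is ours (the function names are illustrative); a useful sanity check is that under an exact line search on a strictly convex quadratic, where ${g}_{k}^{\text{T}}{d}_{k-1}=0$ , ${g}_{k}^{\text{T}}{g}_{k-1}=0$ and ${g}_{k-1}^{\text{T}}{d}_{k-1}=-{\Vert {g}_{k-1}\Vert}^{2}$ , all six formulas coincide:

```python
import numpy as np

def beta_fr(g, g_prev, d_prev):
    return (g @ g) / (g_prev @ g_prev)              # Fletcher-Reeves

def beta_pr(g, g_prev, d_prev):
    return (g @ (g - g_prev)) / (g_prev @ g_prev)   # Polak-Ribiere

def beta_hs(g, g_prev, d_prev):
    y = g - g_prev
    return (g @ y) / (y @ d_prev)                   # Hestenes-Stiefel

def beta_dy(g, g_prev, d_prev):
    y = g - g_prev
    return (g @ g) / (y @ d_prev)                   # Dai-Yuan

def beta_ls(g, g_prev, d_prev):
    return (g @ (g - g_prev)) / -(g_prev @ d_prev)  # Liu-Storey

def beta_cd(g, g_prev, d_prev):
    return -(g @ g) / (d_prev @ g_prev)             # Conjugate Descent

# assumed example vectors satisfying the ELS relations above
g_prev = np.array([1.0, 0.0, 0.0])
d_prev = np.array([-1.0, 2.0, 0.0])
g = np.array([0.0, 0.0, 2.0])
betas = [f(g, g_prev, d_prev) for f in
         (beta_fr, beta_pr, beta_hs, beta_dy, beta_ls, beta_cd)]
# all six coefficients give 4.0 for this example
```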

These methods behave identically on strictly convex quadratic functions, but their behavior on general non-quadratic functions can differ considerably. Most studies of these methods examine their global convergence properties.

In recent years, many efforts have been directed towards constructing new CG formulas that combine good numerical performance with global convergence.

2. The New Conjugate Gradient and Its Algorithm

It is well known that numerical optimization methods are iterative, and no single method is suitable for all types of problems. Each method has its own advantages and drawbacks: it may be efficient for some classes of problems and inefficient for others.

The new conjugate gradient coefficient is

${\beta}_{k}^{ME}=\frac{{g}_{k}^{\text{T}}{g}_{k}}{{\left({g}_{k}+{d}_{k-1}\right)}^{\text{T}}{d}_{k-1}}$ (5)

New method algorithm

Step (1): Set $\u03f5>0,{d}_{0}=-{g}_{0},k=0$ and choose an initial point ${x}_{0}$ .

Step (2): Calculate ${\beta}_{k}^{ME}$ from (5).

Step (3): Calculate ${d}_{k}=-{g}_{k}+{\beta}_{k}^{ME}{d}_{k-1}$ . If $\Vert {g}_{k}\Vert =0$ , stop.

Step (4): Calculate ${\alpha}_{k}={\mathrm{min}}_{\alpha >0}f\left({x}_{k}+\alpha {d}_{k}\right)$ .

Step (5): Calculate the new point by the iterative formula:

${x}_{k+1}={x}_{k}+{\alpha}_{k}{d}_{k}$ (6)

Step (6): If $f\left({x}_{k+1}\right)<f\left({x}_{k}\right)$ and $\Vert {g}_{k}\Vert \le \u03f5$ , stop; otherwise, set k = k + 1 and go to Step (2).
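On a strictly convex quadratic, where the exact line search of Step (4) has a closed form, the algorithm above can be sketched in a few lines of NumPy. This is our illustrative implementation under that assumption, not the authors' code, and the test matrix is made up:

```python
import numpy as np

def cg_me(G, b, x0, eps=1e-8, max_iter=1000):
    """CG with the new coefficient beta_ME = g^T g / ((g + d_prev)^T d_prev),
    applied to f(x) = 0.5*x^T G x - b^T x (gradient g = G x - b), for which
    the exact line search is alpha = -(g^T d)/(d^T G d)."""
    x = x0.astype(float)
    g = G @ x - b
    d = -g                                # Step (1): d_0 = -g_0
    for k in range(max_iter):
        if np.linalg.norm(g) <= eps:      # stopping test ||g_k|| <= eps
            break
        alpha = -(g @ d) / (d @ G @ d)    # Step (4): exact line search
        x = x + alpha * d                 # Step (5): x_{k+1} = x_k + alpha_k d_k
        g = G @ x - b
        beta = (g @ g) / ((g + d) @ d)    # Step (2): beta_ME from (5)
        d = -g + beta * d                 # Step (3): new search direction
    return x

# the minimizer of 0.5*x^T G x - b^T x solves G x = b
G = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x_star = cg_me(G, b, np.zeros(2))
```

For this 2-D example the iterates coincide with those of exact CG, and the solution of $Gx=b$ is reached in two steps.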

The coefficient ${\beta}_{k}$ is chosen in such a way that ${d}_{k+1}$ is G-conjugate to ${d}_{0},{d}_{1},{d}_{2},\cdots ,{d}_{k}$ .

Lemma (1)

In the conjugate direction algorithm

${g}_{k+1}^{\text{T}}{d}_{i}=0\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{for}\text{\hspace{0.17em}}\text{all}\text{\hspace{0.17em}}\text{\hspace{0.05em}}k,\text{\hspace{0.17em}}0\le k\le n-1\text{\hspace{0.05em}}\text{\hspace{0.17em}}\text{and}\text{\hspace{0.17em}}\text{\hspace{0.05em}}0\le i\le k.$

Proposition: In the conjugate gradient algorithm, the directions ${d}_{0},{d}_{1},\cdots ,{d}_{n-1}$ are G-conjugate.

Proof: By induction.

We first show that

${d}_{0}^{\text{T}}G{d}_{1}=0$

$\begin{array}{c}{d}_{0}^{\text{T}}G{d}_{1}={d}_{0}^{\text{T}}G\left(-{g}_{1}+{\beta}_{1}{d}_{0}\right)\\ =-{d}_{0}^{\text{T}}G{g}_{1}+{\beta}_{1}{d}_{0}^{\text{T}}G{d}_{0}\end{array}$

by ELS

$=\frac{{\beta}_{1}{d}_{0}^{\text{T}}\left({g}_{1}-{g}_{0}\right)}{{\alpha}_{0}}$ where ${\alpha}_{0}$ is as in (3)

$=\frac{-{\beta}_{1}{d}_{0}^{\text{T}}{g}_{0}}{{\alpha}_{0}}$

$=-\frac{{g}_{1}^{\text{T}}{g}_{1}}{{\left({g}_{1}+{d}_{0}\right)}^{\text{T}}{d}_{0}}\cdot \frac{{d}_{0}^{\text{T}}{g}_{0}}{{\alpha}_{0}}$ , which is zero by Lemma (1) and ELS.

Now assume that ${d}_{k-1}^{\text{T}}G{d}_{k}=0$ holds; we prove that ${d}_{k}^{\text{T}}G{d}_{k+1}=0$ :

$\begin{array}{c}{d}_{k}^{\text{T}}G{d}_{k+1}={d}_{k}^{\text{T}}G\left(-{g}_{k+1}+{\beta}_{k+1}{d}_{k}\right)\\ =-{d}_{k}^{\text{T}}G{g}_{k+1}+{\beta}_{k+1}{d}_{k}^{\text{T}}G{d}_{k}\\ =\frac{{\beta}_{k+1}{d}_{k}^{\text{T}}\left({g}_{k+1}-{g}_{k}\right)}{{\alpha}_{k}}\quad \text{where}\text{ }{\alpha}_{k}\text{ }\text{is as in (3)}\\ =-{\beta}_{k+1}\frac{{d}_{k}^{\text{T}}{g}_{k}}{{\alpha}_{k}}\\ =\frac{-{g}_{k+1}^{\text{T}}{g}_{k+1}}{{\left({g}_{k+1}+{d}_{k}\right)}^{\text{T}}{d}_{k}}\cdot \frac{{d}_{k}^{\text{T}}{g}_{k}}{{\alpha}_{k}}\end{array}$

By Lemma (1) and ELS we get ${d}_{k}^{\text{T}}G{d}_{k+1}=0$ .
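The orthogonality and conjugacy relations used in this proof can be checked numerically on a small quadratic. The sketch below is our own (the test problem is assumed, and the exact step uses the closed form valid for quadratics); it verifies Lemma (1) and the G-conjugacy of the first two directions:

```python
import numpy as np

def beta_me(g, d_prev):
    # the new coefficient (5): g^T g / ((g + d_prev)^T d_prev)
    return (g @ g) / ((g + d_prev) @ d_prev)

# strictly convex quadratic f(x) = 0.5*x^T G x - b^T x (assumed data)
G = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = np.zeros(2)

g0 = G @ x - b
d0 = -g0
a0 = -(g0 @ d0) / (d0 @ G @ d0)   # exact step (3), closed form on a quadratic
x = x + a0 * d0
g1 = G @ x - b
d1 = -g1 + beta_me(g1, d0) * d0

ortho = abs(g1 @ d0)              # Lemma (1): g_{k+1}^T d_k = 0 under ELS
conj = abs(d0 @ G @ d1)           # d_0^T G d_1 = 0, as in the proposition
```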

Fulfillment of the descent condition ${g}_{k}^{\text{T}}{d}_{k}<0$ :

For the new method we have

${g}_{k}^{\text{T}}{d}_{k}=-{g}_{k}^{\text{T}}{g}_{k}+{\beta}_{k}^{ME}{g}_{k}^{\text{T}}{d}_{k-1}$

By ELS, we get

${g}_{k}^{\text{T}}{d}_{k}=-{g}_{k}^{\text{T}}{g}_{k}=-{\Vert {g}_{k}\Vert}^{2}<0$

So ${g}_{k}^{\text{T}}{d}_{k}<0$ .

Thus the descent condition holds.

3. Global Convergence

The global convergence analysis using the exact line search (ELS) proceeds under the following assumptions:

1) The function f(x) is continuously differentiable and bounded below on the level set
$L=\left\{x:f\left(x\right)\le f\left({x}_{0}\right)\right\}$ , where ${x}_{0}$ is the initial point, and is defined in a neighborhood N of L.

2) The gradient satisfies the Lipschitz condition: there exists a constant L > 0 such that

$\Vert g\left(x\right)-g\left(y\right)\Vert \le L\Vert x-y\Vert ,\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{for}\text{\hspace{0.17em}}\text{all}\text{\hspace{0.17em}}x,y\in N$

Under these assumptions, we have the following result, due to Zoutendijk [7].

Lemma 2: Suppose assumption 1) holds. Consider a conjugate gradient method of the form (2) and (4), where ${d}_{k}$ is a descent search direction and
${\alpha}_{k}$ is obtained by the exact line search. Then the following condition, known as the Zoutendijk condition, holds:

$\underset{k=0}{\overset{\infty}{{\displaystyle \sum}}}\frac{{\left({g}_{k}^{\text{T}}{d}_{k}\right)}^{2}}{{\Vert {d}_{k}\Vert}^{2}}<\infty $ (7)

From Lemma (2), we can obtain a convergence theorem for the conjugate gradient (CG) method using

${\beta}_{k}^{ME}=\frac{{g}_{k}^{\text{T}}{g}_{k}}{{\left({g}_{k}+{d}_{k-1}\right)}^{\text{T}}{d}_{k-1}}$ (8)

Theorem 1: Suppose that assumption 1) is satisfied. Consider any CG method of the form (4), where ${\alpha}_{k}$ is obtained by the exact minimization rule. Then either

$\underset{k\to \infty}{\mathrm{lim}}\Vert {g}_{k}\Vert =0\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{or}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\underset{k=0}{\overset{\infty}{{\displaystyle \sum}}}\frac{{\left({g}_{k}^{\text{T}}{d}_{k}\right)}^{2}}{{\Vert {d}_{k}\Vert}^{2}}<\infty $ (9)

Proof. By contradiction: if Theorem 1 is not true, then there exists a constant $c>0$ such that

$\Vert {g}_{k}\Vert \ge c$ (10)

${d}_{k}=-{g}_{k}+{\beta}_{k}{d}_{k-1}$

${d}_{k}+{g}_{k}={\beta}_{k}{d}_{k-1}$

Taking the squared norm of both sides gives

${\Vert {d}_{k}\Vert}^{2}+2{g}_{k}^{\text{T}}{d}_{k}+{\Vert {g}_{k}\Vert}^{2}={\left|{\beta}_{k}\right|}^{2}{\Vert {d}_{k-1}\Vert}^{2}$

${\Vert {d}_{k}\Vert}^{2}={\left|{\beta}_{k}\right|}^{2}{\Vert {d}_{k-1}\Vert}^{2}-2{g}_{k}^{\text{T}}{d}_{k}-{\Vert {g}_{k}\Vert}^{2}$ (11)

But by the exact line search, ${g}_{k}^{\text{T}}{d}_{k}=-{\Vert {g}_{k}\Vert}^{2}$ .

Dividing both sides of (11) by ${\left({g}_{k}^{\text{T}}{d}_{k}\right)}^{2}$ gives

$\begin{array}{c}\frac{{\Vert {d}_{k}\Vert}^{2}}{{\left({g}_{k}^{\text{T}}{d}_{k}\right)}^{2}}=\frac{{\left|{\beta}_{k}\right|}^{2}{\Vert {d}_{k-1}\Vert}^{2}}{{\left({g}_{k}^{\text{T}}{d}_{k}\right)}^{2}}-\frac{2{g}_{k}^{\text{T}}{d}_{k}}{{\left({g}_{k}^{\text{T}}{d}_{k}\right)}^{2}}-\frac{{\Vert {g}_{k}\Vert}^{2}}{{\left({g}_{k}^{\text{T}}{d}_{k}\right)}^{2}}\le \frac{{\left|{\beta}_{k}\right|}^{2}{\Vert {d}_{k-1}\Vert}^{2}}{{\left({g}_{k}^{\text{T}}{d}_{k}\right)}^{2}}+\frac{1}{{\Vert {g}_{k}\Vert}^{2}}\\ \le {\left\{\frac{{g}_{k}^{\text{T}}{g}_{k}}{{\left({g}_{k}+{d}_{k-1}\right)}^{\text{T}}{d}_{k-1}}\right\}}^{2}\frac{{\Vert {d}_{k-1}\Vert}^{2}}{{\left({g}_{k}^{\text{T}}{d}_{k}\right)}^{2}}+\frac{1}{{\Vert {g}_{k}\Vert}^{2}}\\ \le {\left\{\frac{{\Vert {g}_{k}\Vert}^{2}}{{g}_{k}^{\text{T}}{d}_{k-1}+{\Vert {d}_{k-1}\Vert}^{2}}\right\}}^{2}\frac{{\Vert {d}_{k-1}\Vert}^{2}}{{\left({\Vert {g}_{k}\Vert}^{2}\right)}^{2}}+\frac{1}{{\Vert {g}_{k}\Vert}^{2}}\\ \le \frac{{\Vert {g}_{k}\Vert}^{4}}{{\Vert {d}_{k-1}\Vert}^{4}}\frac{{\Vert {d}_{k-1}\Vert}^{2}}{{\Vert {g}_{k}\Vert}^{4}}+\frac{1}{{\Vert {g}_{k}\Vert}^{2}}\le \frac{1}{{\Vert {d}_{k-1}\Vert}^{2}}+\frac{1}{{\Vert {g}_{k}\Vert}^{2}}\end{array}$ (12)

But note that $\frac{1}{{\Vert {d}_{0}\Vert}^{2}}=\frac{1}{{\Vert {g}_{0}\Vert}^{2}}$ , then from (12) we get

$\frac{{\Vert {d}_{k}\Vert}^{2}}{{\left({g}_{k}^{\text{T}}{d}_{k}\right)}^{2}}\le \frac{1}{{\Vert {g}_{k-1}\Vert}^{2}}+\frac{1}{{\Vert {g}_{k}\Vert}^{2}}\le \underset{i=0}{\overset{k}{{\displaystyle \sum}}}\frac{1}{{\Vert {g}_{i}\Vert}^{2}}$

$\therefore \frac{{\left({g}_{k}^{\text{T}}{d}_{k}\right)}^{2}}{{\Vert {d}_{k}\Vert}^{2}}\ge \frac{{c}^{2}}{k}$ (13)

From (10) and (13), and since the harmonic series diverges, we get

$\underset{k=0}{\overset{\infty}{{\displaystyle \sum}}}\frac{{\left({g}_{k}^{\text{T}}{d}_{k}\right)}^{2}}{{\Vert {d}_{k}\Vert}^{2}}=\infty $

This contradicts the Zoutendijk condition in Lemma (2), which completes the proof. □

4. Numerical Results

In this section we present the numerical results of this research. The conjugate gradient methods ME (new), Dai-Yuan (DY), and Fletcher-Reeves (FR) were tested on test problems taken from Andrei [8]. The comparison is based on the number of iterations and the number of function evaluations (Table 1 and Table 2).

Table 1. Comparison of the algorithms for n = 100.

Table 2. Comparison of the algorithms for n = 1000.

5. Conclusion

A new parameter for the conjugate gradient method for large-scale unconstrained optimization problems is proposed. Numerical results show that the new method is competitive with, and often superior in practice to, the DY and FR methods.

A List of Test Functions

F_{1} Extended Trigonometric Function.

F_{2} Diagonal 2 function.

F_{3} Extended Tridiagonal −1 function.

F_{4} Extended Three Exponential Terms.

F_{5} Generalized PSC1 function.

F_{6} Extended PSC1 Function.

F_{7} Extended Block Diagonal BD1 function.

F_{8} Extended Quadratic Penalty QP1 function.

F_{9} Extended Tridiagonal −2 function.

F_{10} Nondquar (CUTE).

F_{11} DIXMAANC (CUTE).

F_{12} DIXMAANE (CUTE).

F_{13} EDENSCH function (CUTE).

F_{14} STAIRCASE S1/VARDIM function (CUTE).

F_{15} ENGVAL1 (CUTE).

F_{16} DENSCHNA (CUTE).

F_{17} DENSCHNB (CUTE).

F_{18} BIGGSB1 (CUTE).

F_{19} Diagonal 7.

F_{20} SINCOS.

F_{21} HIMMELBG (CUTE).

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Fletcher, R. and Reeves, C.M. (1964) Function Minimization by Conjugate Gradients. The Computer Journal, 7, 149-154. https://doi.org/10.1093/comjnl/7.2.149

[2] Polak, E. and Ribière, G. (1969) Note sur la convergence de méthodes de directions conjuguées. ESAIM: Mathematical Modelling and Numerical Analysis-Modélisation Mathématique et Analyse Numérique, 3, 35-43. https://doi.org/10.1051/m2an/196903R100351

[3] Hestenes, M.R. and Stiefel, E. (1952) Methods of Conjugate Gradients for Solving Linear Systems. Journal of Research of the National Bureau of Standards, 49, 409-436. https://doi.org/10.6028/jres.049.044

[4] Dai, Y.-H. and Yuan, Y. (1999) A Nonlinear Conjugate Gradient Method with a Strong Global Convergence Property. SIAM Journal on Optimization, 10, 177-182. https://doi.org/10.1137/S1052623497318992

[5] Liu, Y. and Storey, C. (1991) Efficient Generalized Conjugate Gradient Algorithms, Part 1: Theory. Journal of Optimization Theory and Applications, 69, 129-137. https://doi.org/10.1007/BF00940464

[6] Fletcher, R. (1987) Practical Methods of Optimization, Vol. 1, Unconstrained Optimization. Wiley, New York.

[7] Sun, J. and Zhang, J. (2001) Global Convergence of Conjugate Gradient Methods without Line Search. Annals of Operations Research, 103, 161-173. https://doi.org/10.1023/A:1012903105391

[8] Andrei, N. (2008) An Unconstrained Optimization Test Functions Collection. Advanced Modeling and Optimization, 10, 147-161.

Copyright © 2020 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.