In this paper, a new conjugate gradient (CG) method is introduced for solving systems of nonlinear equations. The method satisfies the descent condition and is globally convergent under the exact line search. In numerical tests it compared favorably with other methods in terms of the number of iterations and the number of function evaluations.

The conjugate gradient method is one of the most important methods for finding the minimum of a function in unconstrained optimization. It is widely used because it requires little memory. The unconstrained optimization problem can be expressed as follows:

$$\min_{x \in \mathbb{R}^n} f(x) \tag{1}$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is a continuously differentiable function. The CG method generates a sequence of iterates of the form

$$x_{k+1} = x_k + \alpha_k d_k, \quad k = 0, 1, 2, \ldots \tag{2}$$

where $x_k$ is the current iterate and $\alpha_k > 0$ is the step size obtained by the exact line search:

$$\alpha_k = \arg\min_{\alpha > 0} f(x_k + \alpha d_k) \tag{3}$$

and $d_k$ is the search direction, defined by

$$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k \ge 1, \end{cases} \tag{4}$$

where $k$ is an integer, $g_k$ is the gradient of $f(x)$ at the point $x_k$, and $\beta_k$ is the conjugate gradient coefficient.

Some well-known choices of the CG coefficient are:

$$\beta_k^{FR} = \frac{g_k^T g_k}{\|g_{k-1}\|^2}, \qquad \beta_k^{PR} = \frac{g_k^T (g_k - g_{k-1})}{\|g_{k-1}\|^2}$$

$$\beta_k^{HS} = \frac{g_k^T (g_k - g_{k-1})}{(g_k - g_{k-1})^T d_{k-1}}, \qquad \beta_k^{DY} = \frac{g_k^T g_k}{(g_k - g_{k-1})^T d_{k-1}}$$

$$\beta_k^{LS} = \frac{g_k^T (g_k - g_{k-1})}{-g_{k-1}^T d_{k-1}}, \qquad \beta_k^{CD} = -\frac{g_k^T g_k}{d_{k-1}^T g_{k-1}}$$

The coefficient $\beta_k \in \mathbb{R}$ is a scalar whose definition distinguishes the different CG methods; here $g_{k-1}$ and $g_k$ denote the gradients of $f(x)$ at the points $x_{k-1}$ and $x_k$, respectively.

The above methods are known as:

Fletcher and Reeves (FR) [
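As a minimal sketch, the six coefficient formulas listed above can be written in plain Python as follows. The helper names (`dot`, `sub`, `classic_betas`) and the sample vectors are illustrative assumptions, not part of the paper. The vectors are chosen so that $g_k$ is orthogonal to both $d_{k-1}$ and $g_{k-1}$; in that special case all six formulas give the same value.

```python
def dot(u, v):
    """Inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def sub(u, v):
    """Componentwise difference u - v."""
    return [a - b for a, b in zip(u, v)]


def classic_betas(g_new, g_old, d_old):
    """Return the six classical CG coefficients for g_k, g_{k-1}, d_{k-1}."""
    y = sub(g_new, g_old)  # y = g_k - g_{k-1}
    return {
        "FR": dot(g_new, g_new) / dot(g_old, g_old),
        "PR": dot(g_new, y) / dot(g_old, g_old),
        "HS": dot(g_new, y) / dot(y, d_old),
        "DY": dot(g_new, g_new) / dot(y, d_old),
        "LS": dot(g_new, y) / (-dot(g_old, d_old)),
        "CD": -dot(g_new, g_new) / dot(d_old, g_old),
    }


# Hypothetical vectors (assumption): g_new is orthogonal to d_old and g_old.
g_old = [1.0, 2.0]
d_old = [-1.0, -2.0]  # first direction, d_0 = -g_0
g_new = [2.0, -1.0]

betas = classic_betas(g_new, g_old, d_old)
for name, value in betas.items():
    print(name, value)
```

For general vectors the six values differ, which is what separates the methods in practice.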

On strictly convex quadratic functions these methods behave quite differently from the way they behave on general non-quadratic functions. In any case, most of them have been studied with respect to their global convergence properties.

In recent years, many attempts have been made to construct new CG formulas that combine good numerical performance with global convergence.

It is well known that numerical optimization methods are iterative and that no single method suits all types of problems. Each method has its own advantages and drawbacks: it is efficient for some types of problems and inefficient for others.

The new CG coefficient is

$$\beta_k^{ME} = \frac{g_k^T g_k}{(g_k + d_{k-1})^T d_{k-1}} \tag{5}$$
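The following is a minimal sketch of formula (5) in plain Python. The function name `beta_me`, the helper `dot`, and the sample vectors are assumptions made for illustration. The vectors satisfy $g_k^T d_{k-1} = 0$, as they would under an exact line search, so the denominator $(g_k + d_{k-1})^T d_{k-1} = g_k^T d_{k-1} + \|d_{k-1}\|^2$ reduces to $\|d_{k-1}\|^2$.

```python
def dot(u, v):
    """Inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def beta_me(g_new, d_old):
    """Coefficient (5): g_k^T g_k / ((g_k + d_{k-1})^T d_{k-1})."""
    g_plus_d = [gi + di for gi, di in zip(g_new, d_old)]
    return dot(g_new, g_new) / dot(g_plus_d, d_old)


# Hypothetical vectors (assumption) with g_new orthogonal to d_old.
g_new = [2.0, -1.0]
d_old = [-2.0, -4.0]

beta = beta_me(g_new, d_old)
reduced = dot(g_new, g_new) / dot(d_old, d_old)  # ||g_k||^2 / ||d_{k-1}||^2
print(beta, reduced)
```

When $g_k^T d_{k-1} \neq 0$ (an inexact line search), the two printed quantities would no longer coincide.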

New method algorithm

Step (1): Choose an initial point $x_0$ and a tolerance $\epsilon > 0$; set $d_0 = -g_0$ and $k = 0$.

Step (2): For $k \ge 1$, calculate $\beta_k^{ME}$ from (5).

Step (3): For $k \ge 1$, calculate $d_k = -g_k + \beta_k^{ME} d_{k-1}$. If $\|g_k\| = 0$, stop.

Step (4): Calculate $\alpha_k = \arg\min_{\alpha > 0} f(x_k + \alpha d_k)$.

Step (5): Calculate the new point with the following iterative formula:

$$x_{k+1} = x_k + \alpha_k d_k \tag{6}$$

Step (6): If $f(x_{k+1}) < f(x_k)$ and $\|g_k\| \le \epsilon$, stop. Otherwise, set $k = k + 1$ and go to Step (2).
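The steps above can be sketched in plain Python for one special case. This is an illustration under stated assumptions, not the authors' implementation: the objective is taken to be a strictly convex quadratic $f(x) = \tfrac{1}{2} x^T A x - b^T x$, so that the gradient is $Ax - b$ and the exact line search has the closed form $\alpha_k = -g_k^T d_k / (d_k^T A d_k)$. The function name `cg_me_quadratic`, the matrix `A`, the vector `b`, and the starting point are hypothetical, and the stopping test is simplified to $\|g_k\| \le \epsilon$ alone.

```python
import math


def dot(u, v):
    """Inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [dot(row, v) for row in A]


def cg_me_quadratic(A, b, x0, eps=1e-10, max_iter=1000):
    """Minimize 0.5 x^T A x - b^T x (A symmetric positive definite)
    with the direction update (4) and the coefficient (5)."""
    x = list(x0)
    g = [ax - bi for ax, bi in zip(matvec(A, x), b)]  # gradient A x - b
    d = [-gi for gi in g]                              # d_0 = -g_0
    for k in range(max_iter):
        if math.sqrt(dot(g, g)) <= eps:                # simplified stopping test
            return x, k
        # Exact line search, closed form for a quadratic (assumption).
        alpha = -dot(g, d) / dot(d, matvec(A, d))
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g = [ax - bi for ax, bi in zip(matvec(A, x), b)]
        # Coefficient (5), then the new direction from (4).
        g_plus_d = [gi + di for gi, di in zip(g, d)]
        beta = dot(g, g) / dot(g_plus_d, d)
        d = [-gi + beta * di for gi, di in zip(g, d)]
    return x, max_iter


# Hypothetical 2x2 test problem; its minimizer solves A x = b, i.e. x = (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_min, iterations = cg_me_quadratic(A, b, [2.0, 1.0])
print(x_min, iterations)
```

For a general nonlinear $f$, the closed-form step would have to be replaced by a one-dimensional minimization routine.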

The coefficient $\beta_k$ is chosen in such a way that $d_{k+1}$ is G-conjugate to $d_0, d_1, d_2, \ldots, d_k$.

Lemma (1)

In the conjugate direction algorithm,

$$g_{k+1}^T d_i = 0 \quad \text{for all } k,\ 0 \le k \le n-1 \text{ and } 0 \le i \le k.$$

Proposition: In the conjugate gradient algorithm, the directions $d_0, d_1, \ldots, d_{n-1}$ are G-conjugate.

Proof: By induction.

We first show that $d_0^T G d_1 = 0$:

$$
\begin{aligned}
d_0^T G d_1 &= d_0^T G(-g_1 + \beta_1 d_0) = -d_0^T G g_1 + \beta_1 d_0^T G d_0 \\
&= \beta_1 \frac{d_0^T (g_1 - g_0)}{\alpha_0} \quad \text{(by ELS, with } \alpha_0 \text{ as in (3))} \\
&= -\beta_1 \frac{d_0^T g_0}{\alpha_0} \\
&= -\frac{g_1^T g_1}{(g_1 + d_0)^T d_0} \cdot \frac{d_0^T g_0}{\alpha_0}
\end{aligned}
$$

By Lemma (1) and ELS, this expression is zero.

Now assume that $d_{k-1}^T G d_k = 0$ holds; we prove that $d_k^T G d_{k+1} = 0$:

$$
\begin{aligned}
d_k^T G d_{k+1} &= d_k^T G(-g_{k+1} + \beta_{k+1} d_k) = -d_k^T G g_{k+1} + \beta_{k+1} d_k^T G d_k \\
&= \beta_{k+1} \frac{d_k^T (g_{k+1} - g_k)}{\alpha_k} \quad \text{(with } \alpha_k \text{ as in (3))} \\
&= -\beta_{k+1} \frac{d_k^T g_k}{\alpha_k} \\
&= -\frac{g_{k+1}^T g_{k+1}}{(g_{k+1} + d_k)^T d_k} \cdot \frac{d_k^T g_k}{\alpha_k}
\end{aligned}
$$

By Lemma (1) and ELS we get $d_k^T G d_{k+1} = 0$.

The new method satisfies the descent condition $g_k^T d_k < 0$, as follows. From (4),

$$g_k^T d_k = -g_k^T g_k + \beta_k^{ME} g_k^T d_{k-1}$$

By ELS, $g_k^T d_{k-1} = 0$, so

$$g_k^T d_k = -g_k^T g_k = -\|g_k\|^2 < 0$$

Thus the descent condition holds.
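The descent argument can be checked numerically on a small example. The sketch below assumes a strictly convex quadratic $f(x) = \tfrac{1}{2} x^T A x - b^T x$, for which the exact step has a closed form; the matrix `A`, the vector `b`, the starting point, and all variable names are hypothetical. It performs one exact line search step and then evaluates $g_1^T d_0$ and $g_1^T d_1 + \|g_1\|^2$, both of which should vanish up to rounding.

```python
def dot(u, v):
    """Inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [dot(row, v) for row in A]


# Hypothetical test problem (assumption): gradient g(x) = A x - b.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x0 = [2.0, 1.0]

g0 = [ax - bi for ax, bi in zip(matvec(A, x0), b)]
d0 = [-gi for gi in g0]                           # d_0 = -g_0
alpha0 = -dot(g0, d0) / dot(d0, matvec(A, d0))    # exact step for a quadratic
x1 = [xi + alpha0 * di for xi, di in zip(x0, d0)]
g1 = [ax - bi for ax, bi in zip(matvec(A, x1), b)]

g1_plus_d0 = [gi + di for gi, di in zip(g1, d0)]
beta1 = dot(g1, g1) / dot(g1_plus_d0, d0)         # coefficient (5)
d1 = [-gi + beta1 * di for gi, di in zip(g1, d0)]

orthogonality = dot(g1, d0)               # g_1^T d_0, zero under ELS
descent_gap = dot(g1, d1) + dot(g1, g1)   # zero when g_1^T d_1 = -||g_1||^2
print(orthogonality, descent_gap, dot(g1, d1))
```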

The global convergence analysis under the exact line search (ELS) relies on the following assumptions:

1) In a neighborhood N of the level set $L = \{x : f(x) \le f(x_0)\}$, where $x_0$ is the initial point, the function $f(x)$ is defined, bounded, and continuously differentiable.

2) The gradient satisfies a Lipschitz condition: there exists a constant $L > 0$ such that

$$\|g(x) - g(y)\| \le L \|x - y\|, \quad \text{for all } x, y \in N$$

Under these assumptions we have the following lemma due to Zoutendijk [

Lemma 2: Suppose assumption 1) holds. Consider a conjugate gradient method in which $d_k$ is a descent search direction and $\alpha_k$ satisfies the exact line search rule (3). Then the following condition, known as the Zoutendijk condition, holds:

$$\sum_{k=0}^{\infty} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < \infty \tag{7}$$

From Lemma (2), we can obtain a convergence theorem for the CG method using

$$\beta_k^{ME} = \frac{g_k^T g_k}{(g_k + d_{k-1})^T d_{k-1}} \tag{8}$$

Theorem 1: Suppose that assumption 1) is satisfied. Consider any CG method of the form (4), where $\alpha_k$ is obtained by the exact minimization rule. Then either

$$\lim_{k \to \infty} \|g_k\| = 0 \quad \text{or} \quad \sum_{k=0}^{\infty} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < \infty \tag{9}$$

Proof. By contradiction: if Theorem 1 is not true, there exists a constant $c > 0$ such that

$$\|g_k\| \ge c \tag{10}$$

From (4),

$$d_k = -g_k + \beta_k d_{k-1}$$

$$d_k + g_k = \beta_k d_{k-1}$$

Squaring both sides gives

$$\|d_k\|^2 + 2 g_k^T d_k + \|g_k\|^2 = |\beta_k|^2 \|d_{k-1}\|^2$$

$$\|d_k\|^2 = |\beta_k|^2 \|d_{k-1}\|^2 - 2 g_k^T d_k - \|g_k\|^2 \tag{11}$$

But $g_k^T d_k = -c \|g_k\|^2$.

Dividing both sides of (11) by $(g_k^T d_k)^2$ gives

$$
\begin{aligned}
\frac{\|d_k\|^2}{(g_k^T d_k)^2} &= \frac{|\beta_k|^2 \|d_{k-1}\|^2}{(g_k^T d_k)^2} - \frac{2 g_k^T d_k}{(g_k^T d_k)^2} - \frac{\|g_k\|^2}{(g_k^T d_k)^2} \\
&\le \frac{|\beta_k|^2 \|d_{k-1}\|^2}{(g_k^T d_k)^2} + \frac{1}{\|g_k\|^2} \\
&\le \left\{ \frac{g_k^T g_k}{(g_k + d_{k-1})^T d_{k-1}} \right\}^2 \frac{\|d_{k-1}\|^2}{(g_k^T d_k)^2} + \frac{1}{\|g_k\|^2} \\
&\le \left\{ \frac{\|g_k\|^2}{g_k^T d_{k-1} + \|d_{k-1}\|^2} \right\}^2 \frac{\|d_{k-1}\|^2}{(\|g_k\|^2)^2} + \frac{1}{\|g_k\|^2} \\
&\le \frac{\|g_k\|^4}{\|d_{k-1}\|^4} \cdot \frac{\|d_{k-1}\|^2}{\|g_k\|^4} + \frac{1}{\|g_k\|^2} \\
&\le \frac{1}{\|d_{k-1}\|^2} + \frac{1}{\|g_k\|^2}
\end{aligned}
\tag{12}
$$

Noting that $\frac{1}{\|d_0\|^2} = \frac{1}{\|g_0\|^2}$, from (12) we get

$$\frac{\|d_k\|^2}{(g_k^T d_k)^2} \le \frac{1}{\|g_{k-1}\|^2} + \frac{1}{\|g_k\|^2} \le \sum_{i=0}^{k} \frac{1}{\|g_i\|^2}$$

$$\therefore \quad \frac{(g_k^T d_k)^2}{\|d_k\|^2} \ge \frac{c^2}{k} \tag{13}$$

From (10) and (13) we get

$$\sum_{k=0}^{\infty} \frac{(g_k^T d_k)^2}{\|d_k\|^2} = \infty$$

This contradicts the Zoutendijk condition in Lemma (2), which completes the proof. □

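In this section we report the numerical results. The ME, Dai and Yuan (DY), and Fletcher and Reeves (FR) conjugate gradient methods were tested; the tables list the number of function evaluations (Nof) and the number of iterations (NoI) for each test function F. Some test problems considered in Andrei [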

FR Nof | FR NoI | DY Nof | DY NoI | ME Nof | ME NoI | F |
---|---|---|---|---|---|---|
35 | 19 | 34 | 18 | 24 | 13 | F_{1} |
2025 | 2001 | 114 | 68 | 70 | 37 | F_{2} |
64 | 32 | 21 | 10 | 13 | 5 | F_{3} |
25 | 15 | 28 | 17 | 30 | 19 | F_{4} |
2103 | 2001 | 694 | 389 | 105 | 61 | F_{5} |
31 | 15 | 17 | 8 | 21 | 11 | F_{6} |
98 | 63 | 98 | 63 | 59 | 37 | F_{7} |
26 | 11 | 22 | 9 | 29 | 13 | F_{8} |
65 | 40 | 58 | 38 | 27 | 15 | F_{9} |
2355 | 2001 | 3735 | 1977 | 526 | 295 | F_{10} |
24 | 13 | 24 | 13 | 20 | 11 | F_{11} |
218 | 121 | 134 | 86 | 72 | 41 | F_{12} |
1202 | 69 | 150 | 28 | 104 | 53 | F_{13} |
1066 | 671 | 962 | 619 | 430 | 285 | F_{14} |
57 | 34 | 567 | 40 | 137 | 67 | F_{15} |
33 | 20 | 19 | 10 | 23 | 14 | F_{16} |
25 | 12 | 15 | 7 | 17 | 8 | F_{17} |
743 | 439 | 759 | 486 | 395 | 234 | F_{18} |
7 | 3 | 9 | 3 | 7 | 32 | F_{19} |
31 | 15 | 17 | 8 | 21 | 11 | F_{20} |
11 | 9 | 11 | 9 | 7 | 5 | F_{21} |
10244 | 7604 | 7488 | 3906 | 2137 | 1267 | Total |

FR Nof | FR NoI | DY Nof | DY NoI | ME Nof | ME NoI | F |
---|---|---|---|---|---|---|
65 | 38 | 65 | 38 | 31 | 14 | F_{1} |
2005 | 2001 | 292 | 179 | 210 | 137 | F_{2} |
129 | 77 | 29 | 15 | 28 | 13 | F_{3} |
3531 | 127 | 19 | 11 | 25 | 13 | F_{4} |
2073 | 2001 | 878 | 436 | 229 | 139 | F_{5} |
17 | 8 | 15 | 7 | 13 | 6 | F_{6} |
105 | 67 | 105 | 67 | 66 | 41 | F_{7} |
125 | 16 | 21 | 8 | 35 | 15 | F_{8} |
68 | 43 | 59 | 39 | 42 | 27 | F_{9} |
2066 | 2001 | 3773 | 2001 | 2012 | 2001 | F_{10} |
25 | 14 | 25 | 14 | 21 | 12 | F_{11} |
634 | 345 | 345 | 220 | 190 | 112 | F_{12} |
1967 | 98 | 642 | 45 | 136 | 66 | F_{13} |
2897 | 2001 | 3140 | 2001 | 2240 | 1925 | F_{14} |
3616 | 142 | 4477 | 157 | 130 | 63 | F_{15} |
35 | 19 | 18 | 9 | 28 | 17 | F_{16} |
23 | 11 | 15 | 7 | 17 | 8 | F_{17} |
2851 | 2001 | 3126 | 2001 | 2098 | 2001 | F_{18} |
9 | 4 | 11 | 4 | 9 | 23 | F_{19} |
17 | 8 | 15 | 7 | 13 | 6 | F_{20} |
11 | 9 | 11 | 9 | 8 | 6 | F_{21} |
22269 | 11031 | 17081 | 7275 | 7581 | 6645 | Total |
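To put the Total rows of the two tables in perspective, the short sketch below computes the ratio of the ME totals to the FR and DY totals. The numbers are copied from the Total rows above; the labels "first table" and "second table", the dictionary layout, and the function name `relative_cost` are assumptions made for illustration.

```python
# Totals as (Nof, NoI) pairs, copied from the Total rows of the two tables.
totals = {
    "first table": {"FR": (10244, 7604), "DY": (7488, 3906), "ME": (2137, 1267)},
    "second table": {"FR": (22269, 11031), "DY": (17081, 7275), "ME": (7581, 6645)},
}


def relative_cost(table, method, baseline):
    """Return (Nof ratio, NoI ratio) of `method` relative to `baseline`."""
    nof_m, noi_m = totals[table][method]
    nof_b, noi_b = totals[table][baseline]
    return nof_m / nof_b, noi_m / noi_b


for table in totals:
    for baseline in ("FR", "DY"):
        nof_ratio, noi_ratio = relative_cost(table, "ME", baseline)
        print(f"{table}: ME/{baseline} Nof = {nof_ratio:.2f}, NoI = {noi_ratio:.2f}")
```

A ratio below 1 means that ME needed fewer function evaluations or iterations in total than the baseline; per-function comparisons can still go either way, as the individual rows show.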

A new parameter for the conjugate gradient method for large-scale unconstrained optimization problems is proposed. The numerical results indicate that the new method is superior in practice to the competing DY and FR methods.

The authors declare no conflicts of interest regarding the publication of this paper.

Hady, M.M.A. and Younis, M.S. (2020) New Parameter of CG-Method with Exact Line Search for Unconstraint Optimization. Open Access Library Journal, 7: e6236. https://doi.org/10.4236/oalib.1106236

F_{1} Extended Trigonometric Function.

F_{2} Diagonal 2 function.

F_{3} Extended Tridiagonal −1 function.

F_{4} Extended Three Exponential Terms.

F_{5} Generalized PSC1 function.

F_{6} Extended PSC1 Function.

F_{7} Extended Block Diagonal BD1 function.

F_{8} Extended Quadratic Penalty QP1 function.

F_{9} Extended Tridiagonal −2 function.

F_{10} Nondquar (CUTE).

F_{11} DIXMAANC (CUTE).

F_{12} DIXMAANE (CUTE).

F_{13} EDENSCH function (CUTE).

F_{14} STAIRCASE S1/F_{52} VARDIM function (CUTE).

F_{15} ENGVAL1 (CUTE).

F_{16} DENSCHNA (CUTE).

F_{17} DENSCHNB (CUTE).

F_{18} DIGGSB1 (CUTE).

F_{19} Diagonal 7.

F_{20} SINCOS.

F_{21} HIMMELBG (CUTE).