
In this paper, we suggest and analyze some new derivative-free iterative methods for solving nonlinear equations using a trust-region method. We also give several examples to illustrate the efficiency of these methods. A comparison with other similar methods is also given. This technique can be used to suggest a wide class of new iterative methods for solving optimization problems. For solving linearly unconstrained optimization problems without derivatives, a derivative-free Funnel method for unconstrained nonlinear optimization is proposed. The study presents new interpolation-based techniques. The main work of this paper depends on some matrix computation techniques. A linear system is solved to obtain the required quadratic model at each iteration. The model is a polynomial built on the interpolation points, which is then minimized in a trust region.

The main aim of this paper is to study methods for solving optimization problems whose objective function does not have partial derivatives available at hand [

method starts with an initial data set of $N = \frac{(n+1)(n+2)}{2}$ points, where $n$ is the dimension of the problem. An initial guess $x_0$ is provided first; then, using random Householder and Givens matrices, the other $N - 1$ points are generated [
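The text does not spell out how the Householder and Givens matrices are combined, so the following Python sketch is only one plausible reading: each of the other $N - 1$ points is obtained by applying a random Householder reflection followed by a random Givens rotation to a random seed vector and adding the result to $x_0$. The names `householder`, `givens` and `random_sample_set`, the `radius` parameter and the scaling of the seed vectors are assumptions made for illustration.

```python
import numpy as np

def householder(v):
    """Householder reflection I - 2 v v^T / (v^T v) (an orthogonal matrix)."""
    v = np.asarray(v, dtype=float)
    return np.eye(v.size) - 2.0 * np.outer(v, v) / (v @ v)

def givens(n, i, j, theta):
    """n-by-n Givens rotation by angle theta in the (i, j) coordinate plane."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i] = c
    G[j, j] = c
    G[i, j] = -s
    G[j, i] = s
    return G

def random_sample_set(x0, radius=1.0, seed=0):
    """Return N = (n+1)(n+2)/2 points: x0 plus N-1 randomly generated points.

    Assumed construction (not taken from the paper): a random seed vector of
    length at most `radius` is transformed by a random Householder reflection
    and, when n >= 2, a random Givens rotation, then added to x0.
    """
    rng = np.random.default_rng(seed)
    x0 = np.asarray(x0, dtype=float)
    n = x0.size
    N = (n + 1) * (n + 2) // 2
    points = [x0]
    for _ in range(N - 1):
        seed_vec = rng.standard_normal(n)
        seed_vec *= radius * rng.uniform(0.1, 1.0) / np.linalg.norm(seed_vec)
        Q = householder(rng.standard_normal(n))
        if n >= 2:
            i, j = rng.choice(n, size=2, replace=False)
            Q = givens(n, i, j, rng.uniform(0.0, 2.0 * np.pi)) @ Q
        points.append(x0 + Q @ seed_vec)
    return np.array(points)

sample = random_sample_set([1.0, 2.0, 3.0], radius=0.5, seed=1)
```

Because both transformations are orthogonal, every generated point stays within `radius` of $x_0$; whether the resulting set is well poised still has to be checked separately.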

Many interpolation-based trust-region methods construct local polynomial interpolation-based models of the objective function and compute steps by minimizing these models inside a region using the standard trust-region methodology [

If the interpolation set $Y$ is poised, the basis of Lagrange polynomials $\{l_i(x)\}_{i=1}^{p}$ exists and is uniquely defined by [

Given a set of interpolation points $Y = \{y_1, y_2, \cdots, y_p\}$, a basis of $p$ polynomials $l_j(x)$, $j = 1, \cdots, p$, in $\mathcal{P}_n^d$ is called a basis of Lagrange polynomials if

$$l_j(y_i) = \delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases} \tag{1}$$

The unique polynomial $m(x)$ which interpolates $f(x)$ on $Y$ using this basis of Lagrange polynomials can be expressed as

$$m(x) = \sum_{i=1}^{p} f(y_i)\, l_i(x). \tag{2}$$

Moreover, for every poised set $Y = \{y_1, y_2, \cdots, y_p\}$, we have that

$$\sum_{i=1}^{p} l_i(x) = 1 \quad \text{for all } x \in \mathbb{R}^n. \tag{3}$$
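Properties (1) to (3) can be checked numerically on a small case. The Python sketch below assumes a linear basis $\{1, x_1, x_2\}$ in $\mathbb{R}^2$ and three hand-picked points; the helper names `linear_basis` and `lagrange_values` and the objective `f` are hypothetical and serve only as an illustration.

```python
import numpy as np

def linear_basis(x):
    """Values of the assumed basis {1, x_1, ..., x_n} at the point x."""
    return np.concatenate(([1.0], np.asarray(x, dtype=float)))

def lagrange_values(Y, x, basis=linear_basis):
    """Return the vector (l_1(x), ..., l_p(x)) for the poised set Y.

    With M[i, j] = basis_j(y_i), column j of inv(M) holds the coefficients
    of l_j, so that l_j(y_i) = delta_ij.
    """
    M = np.array([basis(y) for y in Y])
    return basis(x) @ np.linalg.inv(M)

# Three points in R^2 that are poised for linear interpolation.
Y = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

def f(x):
    """Hypothetical objective, chosen linear so that m reproduces it exactly."""
    return 3.0 + 2.0 * x[0] - x[1]

x = np.array([0.3, 0.4])
l = lagrange_values(Y, x)

delta = np.array([lagrange_values(Y, y) for y in Y])   # property (1)
m_x = sum(f(y) * li for y, li in zip(Y, l))            # formula (2)
partition = l.sum()                                    # property (3)
```

Here `delta` is the identity matrix, `partition` equals 1 up to rounding, and `m_x` agrees with `f(x)` because the assumed objective lies in the span of the basis.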

The accuracy of $m(x)$ as an approximation of the objective function $f$ in some region $B \subset \mathbb{R}^n$ can be quantified using the following notion [

$$\Lambda \geq \max_{1 \leq i \leq p} \max_{x \in B} |l_i(x)|. \tag{4}$$

The right-hand side of (4) is related to the Lebesgue constant $\Lambda_n$ of the set, which is defined as

$$\Lambda_n = \max_{x \in B} \sum_{i=1}^{n} |l_i(x)|, \tag{5}$$

see for instance [

$$\max_{1 \leq i \leq n} |l_i(x)| \leq \sum_{i=1}^{n} |l_i(x)| \leq n \max_{1 \leq i \leq n} |l_i(x)|, \tag{6}$$

we conclude that

$$\Lambda \leq \Lambda_n \leq n \Lambda. \tag{7}$$

The Lebesgue constant is a measure of the accuracy of polynomial interpolation on the given set of points. This suggests looking for a set of interpolation points with a small Lebesgue constant. Hence, the smaller $\Lambda$ is, the better the geometry of the set $Y$. Importantly for our purposes, Lagrange polynomial values and $\Lambda$-poisedness can be used to bound the errors in the model function and the model gradient. In particular, it is shown in Ciarlet and Raviart [
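As a rough numerical illustration of the bound (4) and the Lebesgue constant (5), the Python sketch below estimates both quantities by random sampling over a box $B$, for a well-spread and a nearly collinear set of three points in $\mathbb{R}^2$. The linear basis, the sampling approach and the names `poisedness_estimates`, `Y_good` and `Y_bad` are assumptions for illustration; sampling only gives lower estimates of the true maxima.

```python
import numpy as np

def poisedness_estimates(Y, lower, upper, samples=2000, seed=0):
    """Sampled estimates of max_i max_x |l_i(x)| and max_x sum_i |l_i(x)|.

    Assumes linear interpolation, i.e. the basis {1, x_1, ..., x_n} and
    exactly n + 1 points in Y. The box B is [lower, upper].
    """
    rng = np.random.default_rng(seed)
    Y = np.asarray(Y, dtype=float)
    M = np.column_stack([np.ones(len(Y)), Y])        # M[i, j] = basis_j(y_i)
    A = np.linalg.inv(M)                             # column j: coefficients of l_j
    X = rng.uniform(lower, upper, size=(samples, Y.shape[1]))
    Phi = np.column_stack([np.ones(samples), X])     # basis values at the samples
    L = np.abs(Phi @ A)                              # L[k, j] = |l_j(x_k)|
    return L.max(), L.sum(axis=1).max()

lower, upper = np.zeros(2), np.ones(2)
Y_good = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
Y_bad = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 0.01]])   # nearly collinear

lam_good, leb_good = poisedness_estimates(Y_good, lower, upper)
lam_bad, leb_bad = poisedness_estimates(Y_bad, lower, upper)
```

The nearly collinear set yields much larger values, and in both cases the sampled estimates satisfy the two-sided relation in (7) with the number of interpolation points as the factor.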

Using the natural basis $\bar{\phi} = \{1, x_1, x_2, \tfrac{1}{2} x_1^2, \tfrac{1}{2} x_2^2, x_1 x_2\}$ and a sample set $Y = \{y_1, y_2, y_3, y_4\}$ with $y_1 = (0, 0)$, $y_2 = (0, 1)$, $y_3 = (1, 0)$, $y_4 = (1, 1)$, the matrix $M(\bar{\phi}, Y)$ is

$$M(\bar{\phi}, Y) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0.5 & 0 \\ 1 & 1 & 0 & 0.5 & 0 & 0 \\ 1 & 1 & 1 & 0.5 & 0.5 & 1 \end{bmatrix}.$$

Choosing now the first four columns of $M(\bar{\phi}, Y)$, the system is determined but not well defined, since the matrix is singular. We see that the set $Y$ is not poised with respect to the sub-basis $\phi = \{1, x_1, x_2, \tfrac{1}{2} x_1^2\}$. If instead we select the sub-basis $\phi = \{1, x_1, x_2, x_1 x_2\}$, the set $Y$ is well poised: the corresponding matrix, consisting of the first, second, third, and sixth columns of $M(\bar{\phi}, Y)$, is well-conditioned, and a unique solution to this determined system exists.

is given by [
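The poisedness example above can be reproduced with a few lines of Python. This is a minimal sketch using NumPy; the variable names are chosen here for illustration, and the basis is taken with the fifth element read as $\tfrac{1}{2} x_2^2$, which is what the entries of the matrix imply.

```python
import numpy as np

# Sample set Y = {(0,0), (0,1), (1,0), (1,1)} from the example.
Y = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

def natural_basis(y):
    """Natural basis {1, x1, x2, x1^2/2, x2^2/2, x1*x2} evaluated at y."""
    x1, x2 = y
    return [1.0, x1, x2, 0.5 * x1**2, 0.5 * x2**2, x1 * x2]

M = np.array([natural_basis(y) for y in Y])      # 4-by-6 matrix of the example

M_first_four = M[:, [0, 1, 2, 3]]                # sub-basis {1, x1, x2, x1^2/2}
M_with_cross = M[:, [0, 1, 2, 5]]                # sub-basis {1, x1, x2, x1*x2}

rank_first_four = np.linalg.matrix_rank(M_first_four)   # rank-deficient
rank_with_cross = np.linalg.matrix_rank(M_with_cross)   # full rank
cond_with_cross = np.linalg.cond(M_with_cross)          # modest condition number
```

The first sub-matrix has rank 3 because its fourth column is half its second column on this sample set, while the second sub-matrix has full rank 4.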

We consider the bound-constrained optimization problem

$$\min_{x \in \mathbb{R}^n} f(x), \tag{8}$$

where $f$ is a nonlinear function from $\mathbb{R}^n$ into $\mathbb{R}$, which is bounded below, and where $l$ and $u$ are vectors of (possibly infinite) lower and upper bounds on $x$. We denote the feasible domain of this problem by $F$ [

a model of the form

$$m_k(x_k + s) = f(x_k) + g_k^T s + \tfrac{1}{2} s^T H_k s \tag{9}$$

(where $g_k$ and $H_k$ are the function's gradient and Hessian, respectively) is minimized inside a trust region $B_\infty(x_k, \Delta_k)$. As derivatives are not given, $g_k$ and $H_k$ are approximated by determining the model's coefficients (here represented by the vector $\alpha$) from the interpolation conditions [

$$m(y_i) = \sum_{j=1}^{p} \alpha_j \phi_j(y_i) = f(y_i), \quad i = 1, \cdots, p. \tag{10}$$

The points $y_1, \cdots, y_p$ considered in the interpolation conditions (10) form the interpolation set $Y_k$. The set $Y_k$ contains in our case at least $n + 1$ points and is chosen as a subset of $X_k$, the set of all points where the value of the objective function $f$ is known. How to choose this interpolation set is of course one of the main issues we have to address below, as not every set $Y_k$ is suitable because of poisedness issues [

A key ingredient of a trust-region algorithm is the strategy for choosing the trust-region radius $\Delta_k$ at each iteration. We base this choice on the agreement between the model function $m_k$ and the objective function $f$ at previous iterations. Given a step $p_k$, we define the ratio

$$\rho_k = \frac{f(x_k) - f(x_k + p_k)}{m_k(0) - m_k(p_k)}. \tag{11}$$

The numerator is the actual reduction, and the denominator is the predicted reduction (that is, the reduction in $f$ predicted by the model function). Model-based trust-region methods in particular have been developed and shown to perform well. This approach improves the efficiency of the method while maintaining its good theoretical convergence properties; furthermore, the unconstrained method applies a model-based derivative-free method [

The accuracy of the method is acceptable and relies on the contours of the objective function. Since the step $p_k$ is obtained by minimizing the model $m_k$ over a region that includes $p = 0$, the predicted reduction will always be nonnegative. Hence, if $\rho_k$ is negative, the new objective value $f(x_k + p_k)$ is greater than the current value $f(x_k)$, so the step must be rejected. On the other hand, if $\rho_k$ is close to 1, there is good agreement between the model $m_k$ and the function $f$ over this step, so it is safe to expand the trust region for the next iteration. If $\rho_k$ is positive but significantly smaller than 1, we do not alter the trust region, but if it is close to zero or negative, we shrink the trust region by reducing $\Delta_k$ at the next iteration [
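The radius update described here can be sketched in Python as follows. The thresholds 1/4 and 3/4, the shrink factor 1/4, the growth factor 2, the acceptance threshold `eta` and the cap `delta_max` on the radius are conventional textbook choices assumed for this sketch; the text itself only says "close to 1" and "close to zero or negative". The function names are hypothetical.

```python
def reduction_ratio(f_old, f_new, m_old, m_new):
    """Ratio (11): actual reduction divided by predicted reduction."""
    predicted = m_old - m_new
    if predicted <= 0.0:
        raise ValueError("predicted reduction must be positive")
    return (f_old - f_new) / predicted

def update_radius(rho, step_norm, delta, delta_max,
                  low=0.25, high=0.75, shrink=0.25, grow=2.0):
    """Return the trust-region radius for the next iteration.

    Assumed rule: shrink when rho < low; grow (up to delta_max) when
    rho > high and the step reached the boundary; otherwise keep delta.
    """
    if rho < low:
        return shrink * delta
    on_boundary = step_norm >= delta * (1.0 - 1e-8)
    if rho > high and on_boundary:
        return min(grow * delta, delta_max)
    return delta

def accept_step(rho, eta=0.0):
    """Accept the step only if the objective actually decreased enough."""
    return rho > eta

rho = reduction_ratio(f_old=5.0, f_new=3.0, m_old=5.0, m_new=1.0)
new_delta = update_radius(rho, step_norm=1.0, delta=1.0, delta_max=10.0)
```

In this usage example the ratio is 0.5, which lies between the two assumed thresholds, so the radius is left unchanged and the step is accepted.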

Here $\hat{\Delta}$ is an overall bound on the step lengths. Note that the radius is increased only if $\|p_k\|$ actually reaches the boundary of the trust region. If the step stays strictly inside the region, we infer that the current value of $\Delta_k$ is not interfering with the progress of the algorithm, so we leave its value unchanged for the next iteration. The shape of the points looks like a moving cone. To turn the algorithm into a practical one, we need to focus on solving the trust-region sub-problem

$$\min_{p \in \mathbb{R}^n} m_k(p) = f_k + g_k^T p + \tfrac{1}{2} p^T B_k p \quad \text{s.t. } \|p\| \leq \Delta_k. \tag{12}$$

In discussing this matter, we sometimes drop the iteration subscript $k$ and restate the problem as follows:

$$\min_{p \in \mathbb{R}^n} m(p) = f + g^T p + \tfrac{1}{2} p^T B p \quad \text{s.t. } \|p\| \leq \Delta.$$
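One standard way to obtain an inexpensive approximate solution of this sub-problem is the Cauchy point, the minimizer of the model along the steepest-descent direction within the trust region. The Python sketch below is a generic illustration under the assumption of a Euclidean-norm trust region of radius `delta`; it is not claimed to be the solver used in this paper, and the name `cauchy_point` is chosen here.

```python
import numpy as np

def cauchy_point(g, B, delta):
    """Minimizer of m(p) = f + g^T p + 0.5 p^T B p along -g with ||p|| <= delta."""
    g = np.asarray(g, dtype=float)
    B = np.asarray(B, dtype=float)
    g_norm = np.linalg.norm(g)
    if g_norm == 0.0:
        return np.zeros_like(g)          # already stationary for the model
    curvature = g @ B @ g
    if curvature <= 0.0:
        tau = 1.0                        # model decreases all the way to the boundary
    else:
        tau = min(g_norm**3 / (delta * curvature), 1.0)
    return -tau * (delta / g_norm) * g

def model_decrease(g, B, p):
    """m(0) - m(p) for the quadratic model."""
    return -(g @ p + 0.5 * p @ B @ p)

g = np.array([1.0, 0.0])
B = np.eye(2)
p_inside = cauchy_point(g, B, delta=10.0)    # unconstrained minimizer along -g
p_boundary = cauchy_point(g, B, delta=0.5)   # step cut off at the boundary
```

With the large radius the step is $(-1, 0)$ and lies strictly inside the region; with the small radius it is $(-0.5, 0)$ on the boundary. In both cases the model value decreases.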

A first step to exact solutions is given by the following:

Assume that at the current iterate $x_k$ we have a set of sample points

$$Y = \{y_1, y_2, \cdots, y_q\},$$

with $y_i \in \mathbb{R}^n$, $i = 1, 2, \cdots, q$. Assume further that $x_k$ is an element of this set and that no point in $Y$ has a lower function value than $x_k$. This point depends on Householder and Givens matrices [

$$m_k(x_k + p) = c + g^T p + \tfrac{1}{2} p^T G p = f(y). \tag{13}$$

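In general $q = \frac{(n+1)(n+2)}{2}$, and the conditions are given in matrix form as follows:

$$m(x + p) = c + \begin{bmatrix} g_1 & \cdots & g_n \end{bmatrix} \begin{bmatrix} p_1 \\ \vdots \\ p_n \end{bmatrix} + \frac{1}{2} \begin{bmatrix} p_1 & \cdots & p_n \end{bmatrix} \begin{bmatrix} G_{11} & \cdots & G_{1n} \\ \vdots & \ddots & \vdots \\ G_{n1} & \cdots & G_{nn} \end{bmatrix} \begin{bmatrix} p_1 \\ \vdots \\ p_n \end{bmatrix} = f(y). \tag{14}$$

Writing one such equation for each sample point $y_i = x + p^{(i)}$, $i = 1, \cdots, q$, gives a linear system whose $i$-th row is built from the components of $p^{(i)}$:

$$\begin{bmatrix} 1 & p_1 & \cdots & p_n & \tfrac{1}{2} p_1^2 & \cdots & \tfrac{1}{2} p_n^2 & p_1 p_2 & p_1 p_3 & \cdots & p_{n-1} p_n \\ \vdots & \vdots & & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \end{bmatrix} \begin{bmatrix} c \\ g_1 \\ \vdots \\ g_n \\ G_{11} \\ \vdots \\ G_{nn} \\ G_{12} \\ \vdots \\ G_{n-1,n} \end{bmatrix} = \begin{bmatrix} f(y_1) \\ \vdots \\ f(y_q) \end{bmatrix}.$$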

Algorithm (4.1)

1) Input the sample set Y = [ y 1 , y 2 , ⋯ , y N ], where N = 1 2 ( n + 1 ) ( n + 2 ) is the size of Y.

2) Solve the interpolation system (14) for the coefficient vector h, then extract c , g , G from h:

    c = h ( 1 ) ;
    g = h ( 2 : n + 1 ) ;
    for i = 1 , 2 , ⋯ , n
        G ( i , i ) = h ( n + 1 + i ) ;
    end
    m = 2 n + 1 ;
    for i = 1 , 2 , ⋯ , n − 1
        i i = m + ( i − 1 ) ( 2 n − i ) / 2 ;
        for j = i + 1 , ⋯ , n
            G ( i , j ) = h ( i i + j − i ) ;
            G ( j , i ) = G ( i , j ) ;
        end
    end

3) Output c , g , G.
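The steps of Algorithm (4.1) can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions: the coefficient vector `h` is ordered as in the algorithm (constant, gradient, diagonal of $G$, then the off-diagonal entries row by row), the rows of the system are built from the displacements $p = y - x_k$, and the function names `quadratic_row` and `fit_quadratic_model` are hypothetical.

```python
import numpy as np

def quadratic_row(p):
    """Row [1, p_1..p_n, p_1^2/2..p_n^2/2, p_1 p_2, p_1 p_3, ..., p_{n-1} p_n]."""
    n = p.size
    cross = [p[i] * p[j] for i in range(n - 1) for j in range(i + 1, n)]
    return np.concatenate(([1.0], p, 0.5 * p**2, cross))

def fit_quadratic_model(Y, fvals, xk):
    """Solve the interpolation system and return (c, g, G)."""
    Y = np.asarray(Y, dtype=float)
    N, n = Y.shape
    if N != (n + 1) * (n + 2) // 2:
        raise ValueError("need exactly (n+1)(n+2)/2 sample points")
    A = np.array([quadratic_row(y - xk) for y in Y])
    h = np.linalg.solve(A, np.asarray(fvals, dtype=float))
    c = h[0]
    g = h[1:n + 1]
    G = np.diag(h[n + 1:2 * n + 1])      # diagonal entries G_ii
    k = 2 * n + 1                        # first off-diagonal coefficient
    for i in range(n - 1):
        for j in range(i + 1, n):
            G[i, j] = G[j, i] = h[k]     # symmetric fill
            k += 1
    return c, g, G

# Usage: recover a known quadratic from six sample points in R^2.
xk = np.array([0.5, -0.2])
c_true, g_true = 1.0, np.array([2.0, -1.0])
G_true = np.array([[4.0, 1.0], [1.0, 3.0]])
offsets = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1], [1, 1]], dtype=float)
Y = xk + offsets
fvals = [c_true + g_true @ p + 0.5 * p @ G_true @ p for p in offsets]
c, g, G = fit_quadratic_model(Y, fvals, xk)
```

The running index `k` replaces the closed-form offset `ii` of the algorithm; both visit the off-diagonal coefficients in the same order. `np.linalg.solve` raises an error if the sample set is not poised, which is the situation the geometry discussion above is meant to avoid.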

If $x^*$ is a local minimizer and $f$ is continuously differentiable in an open neighborhood of $x^*$, then $\nabla f(x^*) = 0$ [

If $x^*$ is a local minimizer of $f$ and $\nabla^2 f$ exists and is continuous in an open neighborhood of $x^*$, then $\nabla f(x^*) = 0$ and $\nabla^2 f(x^*)$ is positive semidefinite [

Suppose that $\nabla^2 f$ is continuous in an open neighborhood of $x^*$ and that $\nabla f(x^*) = 0$ and $\nabla^2 f(x^*)$ is positive definite. Then $x^*$ is a strict local minimizer of $f$ [

Let $f$ be twice Lipschitz continuously differentiable in a neighborhood of a point $x^*$ at which the second-order sufficient conditions (Theorem 4.4) are satisfied. Suppose the sequence $\{x_k\}$ converges to $x^*$ and that for all $k$ sufficiently large, the trust-region algorithm based on $B_k = \nabla^2 f(x_k)$ chooses steps $p_k$ that satisfy the Cauchy-point-based model reduction criterion and are asymptotically similar to Newton steps $p_k^N$ whenever $\|p_k^N\| \leq \tfrac{1}{2} \Delta_k$, that is [

$$\|p_k - p_k^N\| = o(\|p_k^N\|). \tag{15}$$

Then the trust-region bound $\Delta_k$ becomes inactive for all $k$ sufficiently large and the sequence $\{x_k\}$ converges superlinearly to $x^*$ [

The following functions, used in testing the method, have been selected from [

1) $f(x) = 100 (x_2 - x_1^2)^2 + (1 - x_1^2)$

2) $f(x) = (x_1 - x_3)^4 + 7 (x_2 - 1)^2 + 10 (x_3 - 10)^4$
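For experimentation, the two test functions can be written in Python exactly as they are stated above. The names `f1` and `f2` are chosen here for illustration; note that the last term of the first function is taken as $(1 - x_1^2)$, as written in the text, not the squared term of the classical Rosenbrock function.

```python
def f1(x):
    """First test function, as written: 100 (x2 - x1^2)^2 + (1 - x1^2)."""
    x1, x2 = x
    return 100.0 * (x2 - x1**2) ** 2 + (1.0 - x1**2)

def f2(x):
    """Second test function: (x1 - x3)^4 + 7 (x2 - 1)^2 + 10 (x3 - 10)^4."""
    x1, x2, x3 = x
    return (x1 - x3) ** 4 + 7.0 * (x2 - 1.0) ** 2 + 10.0 * (x3 - 10.0) ** 4

value_f1 = f1([1.0, 1.0])          # both terms vanish at (1, 1)
value_f2 = f2([10.0, 1.0, 10.0])   # all three terms vanish at (10, 1, 10)
```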

The Output (Tables 1-12)

| xx = | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 1.5129 | 1.5129 | 1.5129 | 1.5146 | 1.5160 | 1.5160 | 1.5173 | 1.5178 | 1.5187 | 1.5187 |
| 2.3718 | 2.3718 | 2.3718 | 2.3711 | 2.3701 | 2.3701 | 2.3709 | 2.3705 | 2.3702 | 2.3702 |
| 4.9805 | 4.9805 | 4.9805 | 4.9813 | 4.9807 | 4.9807 | 4.9790 | 4.9790 | 4.9791 | 4.9791 |

| f_x = | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| −0.6015 | −0.6015 | −0.6015 | −0.6977 | −0.7813 | −0.7813 | −0.8325 | −0.8578 | −0.9005 | −0.9005 |

| xx = | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 1.5194 | 1.5194 | 1.5194 | 1.5194 | 1.5194 | 1.5194 | 1.5194 | 1.5194 | 1.5194 | 1.5194 |
| 2.3689 | 2.3689 | 2.3689 | 2.3689 | 2.3689 | 2.3689 | 2.3689 | 2.3689 | 2.3689 | 2.3689 |
| 4.9789 | 4.9789 | 4.9789 | 4.9789 | 4.9789 | 4.9789 | 4.9789 | 4.9789 | 4.9789 | 4.9789 |

| f_x = | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| −0.9432 | −0.9443 | −0.9443 | −0.9443 | −0.9443 | −0.9443 | −0.9443 | −0.9452 | −0.9452 | −0.9452 |

| xx = | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 1.5281 | 1.5281 | 1.5281 | 1.5284 | 1.5285 | 1.5285 | 1.5285 | 1.5285 | 1.5285 | 1.5285 |
| 2.3693 | 2.3693 | 2.3693 | 2.3690 | 2.3690 | 2.3690 | 2.3690 | 2.3690 | 2.3690 | 2.3690 |
| 4.9708 | 4.9708 | 4.9708 | 4.9705 | 4.9704 | 4.9704 | 4.9704 | 4.9704 | 4.9704 | 4.9704 |

| f_x = | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| −1.2180 | −1.2180 | −1.2180 | −1.2264 | −1.2299 | −1.2299 | −1.2299 | −1.2299 | −1.2299 | −1.2299 |

| xx = | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| −0.0964 | −0.4549 | −1.0881 | 2.3169 | 2.3169 | 2.3290 | 2.4203 | 2.4203 | 2.1784 | 2.1784 |
| −0.7355 | −2.3894 | −2.1112 | −0.7489 | −0.7475 | −0.7235 | −0.4007 | −0.4006 | 0.5426 | 0.5427 |
| 0.9289 | 3.8014 | 8.0268 | 9.3400 | 9.3408 | 9.3526 | 9.3281 | 9.3281 | 9.4005 | 9.4005 |
| 3.7442 | 4.0715 | 4.5942 | 1.3074 | 1.3083 | 1.2972 | 1.1319 | 1.1318 | 0.3410 | 0.3411 |

| f_x = 1.0e+004 × | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 6.7821 | 1.4986 | 0.0499 | 0.0024 | 0.0023 | 0.0023 | 0.0016 | 0.0016 | 0.0003 | 0.0003 |

| xx = | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 2.1816 | 2.1815 | 2.1815 | 2.1815 | 2.1623 | 2.1624 | 2.1624 | 2.1626 | 2.1737 | 2.1743 |
| 0.5507 | 0.5522 | 0.5508 | 0.5508 | 0.6030 | 0.6030 | 0.6031 | 0.6038 | 0.6257 | 0.6608 |
| 9.3986 | 9.4015 | 9.4050 | 9.4050 | 9.4486 | 9.4486 | 9.4486 | 9.4495 | 9.4679 | 9.5020 |
| 0.3479 | 0.3498 | 0.3511 | 0.3511 | 0.4060 | 0.4060 | 0.4060 | 0.4052 | 0.4156 | 0.4375 |

| f_x = | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 3.1698 | 3.1352 | 3.1143 | 3.1141 | 2.5203 | 2.5198 | 2.5197 | 2.5090 | 2.2481 | 1.8850 |

| xx = | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 2.1799 | 2.1799 | 2.1799 | 2.1843 | 2.1843 | 2.1850 | 2.1850 | 2.1851 | 2.1850 | 2.1850 |
| 0.6730 | 0.6730 | 0.6730 | 0.6900 | 0.6900 | 0.6912 | 0.6943 | 0.6943 | 0.6966 | 0.6966 |
| 9.5097 | 9.5097 | 9.5097 | 9.5167 | 9.5167 | 9.5174 | 9.5197 | 9.5198 | 9.5219 | 9.5219 |
| 0.4306 | 0.4306 | 0.4306 | 0.4480 | 0.4480 | 0.4481 | 0.4479 | 0.4479 | 0.4484 | 0.4484 |

| f_x = | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 1.7791 | 1.7790 | 1.7789 | 1.6613 | 1.6613 | 1.6512 | 1.6273 | 1.6270 | 1.6083 | 1.6082 |

In this paper it has been shown that using a randomly selected set of data points within an interpolation-based method for derivative-free optimization is adequate and practical. Householder and Givens matrices were used to generate these randomly selected points. The randomness comes from the random seed vectors, which were then transformed by Householder and Givens matrices into the required data set.

I would like to thank my supervisor, Dr. Muhsin Hassan Abdallah who was a great help to me.

The author declares no conflicts of interest regarding the publication of this paper.

Mu’lla, M.A.M. (2019) An Algorithm for the Derivative-Free Unconstrained Optimization Based on a Moving Random Cone Data Set. Open Access Library Journal, 6: e5652. https://doi.org/10.4236/oalib.1105652