A Gauss-Newton-Based Broyden’s Class Algorithm for Parameters of Regression Analysis

In this paper, a Gauss-Newton-based Broyden's class method for estimating the parameters of regression problems is presented. The global convergence of the method is established under suitable conditions. Numerical results show that the proposed method is promising.


Introduction
It is well known that regression analysis arises in economics, finance, trade, law, meteorology, medicine, biology, chemistry, engineering, physics, education, history, sociology, psychology, and so on (see [1]-[7]). The classical regression model is defined by

Y = h(X_1, X_2, ..., X_p) + ε,

where Y is the response variable, X_i is the i-th predictor variable, i = 1, 2, ..., p, p > 0 is an integer constant, and ε is the error. If h is a linear function, then we obtain the linear regression model

Y = β_0 + β_1 X_1 + β_2 X_2 + ... + β_p X_p + ε,   (1.1)

which is the simplest regression model, where β_0, β_1, ..., β_p are the regression parameters. Otherwise, the model is called a nonlinear regression model. It is well known that many nonlinear regression models can be linearized [8]-[13], and many authors have therefore devoted themselves to the linear model [14]-[19]. We will concentrate on the linear model in what follows. One of the most important tasks of regression analysis is to estimate the parameters β_0, β_1, ..., β_p. The least squares method is an important fitting method for determining them:

min over β_0, β_1, ..., β_p of  Σ_{i=1}^m [h_i − (β_0 + β_1 X_{i1} + ... + β_p X_{ip})]²,   (1.2)

where h_i is the observed value of the i-th response variable, X_{i1}, ..., X_{ip} are the p observed values of the i-th predictor variables, and m is the number of observations. If the dimension p and the number m are small, then the parameters β_0, β_1, ..., β_p can be obtained by the extreme-value method of calculus. From the definition of (1.2), it is not difficult to see that problem (1.2) is the same as the unconstrained optimization problem

min_{x ∈ R^n} f(x).   (1.3)

For regression problem (1.3), if the dimension n is large and the function f is complex, it is difficult to solve the problem by the extreme-value method of calculus. Numerical methods are then often used, such as the steepest descent method, the Newton method, and the Gauss-Newton method (see [5]-[7] et al.); many statistical software packages are based on this idea. A numerical, i.e., iterative, method generates a sequence of points {x_k} that terminates at, or converges to, a point x* in some sense. The line search method is one of the most effective numerical methods; it is defined by

x_{k+1} = x_k + α_k d_k,

where the steplength α_k is determined by a line search, and d_k, which distinguishes the different line search methods [20]-[30], is a descent direction of f at x_k. We have given a line search method for regression problems and obtained good results (see [31] for details).
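As a concrete illustration of the least-squares problem (1.2) and its optimization form (1.3), the following Python sketch (our own; the data and the fixed steplength are invented for illustration) builds the objective f and runs a plain steepest-descent iteration x_{k+1} = x_k + α_k d_k with d_k = −∇f(x_k):

```python
import numpy as np

def make_objective(X, h):
    """Least-squares objective f(beta) = 0.5 * sum of squared residuals
    for the linear model h_i ~ beta_0 + beta_1 * X_i1 + ... + beta_p * X_ip."""
    A = np.column_stack([np.ones(len(h)), X])  # design matrix with intercept
    def f(beta):
        r = A @ beta - h
        return 0.5 * (r @ r)
    def grad(beta):
        return A.T @ (A @ beta - h)  # gradient of f
    return f, grad

# tiny data set, invented for this sketch
X = np.array([[1.0], [2.0], [3.0], [4.0]])
h = np.array([2.1, 3.9, 6.2, 8.1])
f, grad = make_objective(X, h)

# steepest descent with a fixed steplength (d_k = -grad f(x_k))
beta = np.zeros(2)
for _ in range(2000):
    beta -= 0.01 * grad(beta)
```

For this small quadratic objective the iterates converge to the solution of the normal equations A^T A β = A^T h; a practical method would of course choose α_k by a line search rather than a fixed value.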
In order to solve problem (1.3), one main goal is to find some point x* such that

g(x) = ∇f(x) = 0,   (1.4)

where ∇f(x) is the gradient of f(x). In this paper, we concentrate on the equation problem (1.4), where g: R^n → R^n is continuously differentiable (linear or nonlinear). Assume that the Jacobian ∇g(x) of g is symmetric for every x ∈ R^n. Similar to (1.3), the following iterative formula is often used to solve problem (1.4):

x_{k+1} = x_k + α_k d_k,

where d_k is a search direction, α_k is a steplength along d_k, and x_k is the k-th iterative point. For (1.4), Griewank [32] first established a global convergence theorem for a quasi-Newton method with a suitable line search. A nonmonotone backtracking inexact quasi-Newton algorithm [33] and trust region algorithms [34,35] were subsequently presented. A Gauss-Newton-based BFGS (Broyden, Fletcher, Goldfarb, and Shanno, 1970) method was proposed by Li and Fukushima [36] for solving symmetric nonlinear equations. Inspired by their ideas, Wei [37] and Yuan [38,39] made further studies. Recently, Yuan and Lu [40]-[45] obtained some new methods for symmetric nonlinear equations.
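In the least-squares setting, the gradient system (1.4) has a symmetric Jacobian automatically, since ∇g = ∇²f = A^T A for the linear model. The short check below (our own sketch, with invented data) verifies the symmetry and that solving g(x) = 0 recovers the regression parameters:

```python
import numpy as np

# g(x) = grad f(x) = A^T (A x - h): the gradient system (1.4) for least squares
A = np.array([[1.0, 2.0], [1.0, 3.0], [1.0, 5.0]])
h = np.array([4.1, 6.0, 9.8])

def g(x):
    return A.T @ (A @ x - h)

# the Jacobian of g is A^T A, which is symmetric -- the setting of Section 2
J = A.T @ A
assert np.allclose(J, J.T)

# solving g(x) = 0 recovers the least-squares parameters
x_star = np.linalg.solve(J, A.T @ h)
assert np.allclose(g(x_star), 0)
```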
The authors of [36] only discussed updated matrices generated by the BFGS formula. Can the updated matrices be produced by the more general Broyden's class? This paper gives a positive answer; moreover, the presented method is applied to regression analysis. The major contribution of this paper is an extension of the method in [36] to Broyden's class and its application to regression problems. Numerical results on practical statistical problems show that the given method is effective. Throughout this paper, the following notation is used: ‖·‖ denotes the Euclidean norm, and g_k denotes g(x_k).
In the next section, the method of Li and Fukushima [36] is stated. Our algorithm is proposed in Section 3. Under some reasonable conditions, the global convergence of the given algorithm is established in Section 4. In Section 5, numerical results are reported. In the last section, a conclusion is given.

A Gauss-Newton-Based BFGS Method [36]
Li and Fukushima [36] proposed a new BFGS update formula defined by

B_{k+1} = B_k − (B_k s_k s_k^T B_k)/(s_k^T B_k s_k) + (y_k y_k^T)/(y_k^T s_k),   (2.1)

where B_0 is an initial symmetric positive definite matrix, s_k = x_{k+1} − x_k, and y_k = g_{k+1} − g_k, so that B_{k+1} satisfies the secant equation B_{k+1} s_k = y_k. The search direction d_k is obtained by solving the linear equation

B_k d_k + g_k = 0.   (2.2)
If ‖s_k‖ is sufficiently small and B_k is positive definite, then B_k ≈ ∇g_k, and since ∇g is symmetric they have the approximate relation

d_k = −B_k^{-1} g_k ≈ −(∇g_k²)^{-1} ∇g_k g_k,

the right-hand side being the Gauss-Newton direction for min ½‖g(x)‖². So the solution of (2.2) is an approximate Gauss-Newton direction, and the methods (2.1) and (2.2) are called the Gauss-Newton-based BFGS method. In order to get the steplength α_k, the line search

‖g(x_k + α_k d_k)‖² ≤ ‖g_k‖² − σ_1 ‖α_k d_k‖² − σ_2 ‖α_k g_k‖² + ε_k ‖g_k‖²   (2.3)

is used, where σ_1, σ_2 > 0 are constants and the positive sequence {ε_k} satisfies

Σ_{k=0}^∞ ε_k < ∞.   (2.4)

Li and Fukushima [36] only discussed updated matrices generated by the BFGS formula. In this paper, we will prove that the updated matrices can also be produced by the more general Broyden's class. Moreover, the presented method is applied to the regression problem (1.3). Numerical results show that the given method is promising.
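A backtracking realization of the line search (2.3)-(2.4) might look as follows; the reduction factor rho, the trial steps, and the default constants are our own choices and are not prescribed by [36]:

```python
import numpy as np

def line_search(g, x, d, eps_k, sigma1=1e-4, sigma2=1e-4, rho=0.5, max_back=30):
    """Backtracking search for the largest alpha in {1, rho, rho^2, ...}
    satisfying condition (2.3):
      ||g(x + a d)||^2 <= ||g(x)||^2 - sigma1 ||a d||^2
                          - sigma2 ||a g(x)||^2 + eps_k ||g(x)||^2."""
    gx = g(x)
    gx2 = gx @ gx
    alpha = 1.0
    for _ in range(max_back):
        gn = g(x + alpha * d)
        lhs = gn @ gn
        rhs = gx2 - sigma1 * alpha**2 * (d @ d) - sigma2 * alpha**2 * gx2 + eps_k * gx2
        if lhs <= rhs:
            return alpha
        alpha *= rho
    return alpha
```

Because of the relaxation term ε_k‖g_k‖², an acceptable steplength eventually exists even when d_k is a poor direction for ‖g‖²; summability (2.4) keeps the total relaxation finite.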

Algorithm
Now we give our algorithm as follows.
Algorithm 1 (Gauss-Newton-Based Broyden's Class Algorithm)
Step 0: Choose an initial point x_0 ∈ R^n, an initial symmetric positive definite matrix B_0 ∈ R^{n×n}, constants σ_1, σ_2 > 0, and a positive sequence {ε_k} satisfying (2.4). Set k := 0.
Step 1: If ‖g_k‖ = 0, stop. Otherwise, solve the linear Equation (2.2) to obtain the search direction d_k.
Step 2: Determine the steplength α_k by the line search (2.3) with a backtracking process.
Step 3: Let the next iterate be x_{k+1} = x_k + α_k d_k.
Step 4: Update B_k by the Broyden's class formula

B_{k+1} = B_k − (B_k s_k s_k^T B_k)/(s_k^T B_k s_k) + (y_k y_k^T)/(y_k^T s_k) + θ_k (s_k^T B_k s_k) v_k v_k^T,

where s_k = x_{k+1} − x_k, y_k = g_{k+1} − g_k, v_k = y_k/(y_k^T s_k) − B_k s_k/(s_k^T B_k s_k), and θ_k is the Broyden parameter.
Step 5: Let k := k + 1 and go to Step 1.
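The Broyden's-class update in Step 4 can be sketched as below; θ = 0 recovers the BFGS formula (2.1) of [36] and θ = 1 gives DFP. This is the standard Broyden's-class form written in our own notation; the restriction the paper places on θ_k is the one stated in Equation (4.1).

```python
import numpy as np

def broyden_class_update(B, s, y, theta=0.0):
    """One Broyden's-class update of B from the pair (s, y).
    theta = 0 gives BFGS, theta = 1 gives DFP; the update keeps B
    symmetric positive definite whenever y^T s > 0."""
    Bs = B @ s
    sBs = s @ Bs
    ys = y @ s
    v = y / ys - Bs / sBs
    return (B - np.outer(Bs, Bs) / sBs + np.outer(y, y) / ys
            + theta * sBs * np.outer(v, v))
```

Note that v_k^T s_k = 0, so the correction term vanishes on s_k and every member of the class satisfies the secant equation B_{k+1} s_k = y_k.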

Global Convergence
In this section, we establish the global convergence of Algorithm 1 under a condition on the Broyden parameter θ_k, stated below as Equation (4.1). Let Ω be the level set defined by Ω = {x : ‖g(x)‖ ≤ η}, where η is a positive constant. Similar to [33], [36]-[39], the following assumptions are needed.
Assumption A. 1) g is continuously differentiable on an open convex set Ω_1 containing Ω.
2) The Jacobian ∇g of g is symmetric, bounded, and uniformly nonsingular on Ω_1; i.e., there exist positive constants M ≥ m > 0 such that ‖∇g(x)‖ ≤ M and m‖d‖ ≤ ‖∇g(x)d‖ hold for all x ∈ Ω_1 and all d ∈ R^n.

By Assumption A, and similarly to Lemma 2.2 in [36], it is not difficult to obtain the following lemma, so we only state it but omit the proof. By Equations (4.5)-(4.7), we have Equation (4.10); by Equation (4.10), we obtain Equation (4.9). The proof is complete. □

Lemma 4.4. Proof. Omitted; the proof can be found in [21]. □

Let us denote

cos θ_k = (s_k^T B_k s_k) / (‖s_k‖ ‖B_k s_k‖).
The proof of the following lemma is motivated by the methods in [46,47].
Using this and Equation (4.6), we obtain the desired bound. From Equation (3.2), we consider the following two cases. 1) From Equation (4.17), combining this with Equation (4.6) and [46], we get the bound (4.23). Therefore, by Equation (4.23), we have Equation (4.24) for all indices. Since the term inside the brackets in Equation (4.22) is less than or equal to zero, we conclude from Equations (4.22) and (4.24) that the function w(t) is nonpositive for all t > 0, achieves its maximum value 0 at t = 1, and satisfies w(t) → −∞ both as t → ∞ and as t → 0. It then follows that Equation (4.25) holds for some constants β_2' and β_3. By Equation (4.25), we obtain the corresponding bounds for all indices. Since k' is a fixed integer and the matrices B_i are positive definite, we can take smaller β_1, β_2 and a larger β_3 if necessary so that this lemma holds for all i ≤ k'. Therefore Equation (4.12) holds for at least half of the indices. Similar to [36], it is not difficult to obtain the global convergence theorem of Algorithm 1, so we only state it as follows but omit the proof.

Numerical Results
In this section, we report results of numerical experiments with the proposed method. We test two practical statistical problems to show the efficiency of Algorithm 1.

Problem 1. Table 1 gives data on the age x and the average height H of a pine tree. Our objective is to find an approximate functional relation between the age and the average height; namely, we need to find the regression equation of H on x. It is easy to see that the age x and the average height H have a parabolic relation, so we denote the regression function by H = β_0 + β_1 x + β_2 x².

Problem 2. Consider a linear regression model in which Y is the overall appraisal of a supervisor, X_1 denotes the handling of employee complaints, X_2 refers to not permitting special privileges, X_3 is the opportunity to learn, X_4 is promotion based on work achievement, X_5 refers to being too critical of poor performance, and X_6 is the rate of advancement to better jobs.
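Since the parabolic model of Problem 1 is linear in its coefficients, it can be fitted by ordinary least squares; the sketch below uses made-up (age, height) pairs in place of the paper's Table 1 data:

```python
import numpy as np

# hypothetical (age, height) pairs standing in for the paper's Table 1
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
H = np.array([2.0, 4.8, 9.5, 15.8, 23.4])

# design matrix of the parabolic model H = b0 + b1*x + b2*x^2
A = np.column_stack([np.ones_like(x), x, x**2])
beta, *_ = np.linalg.lstsq(A, H, rcond=None)
residuals = H - A @ beta
```

The same least-squares problem is what Algorithm 1 solves iteratively through the gradient system (1.4); the direct solve here serves only as a baseline for comparison.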
In the experiments, all codes were written in MATLAB 7.5, and ε_k was chosen as a positive summable sequence depending on k, where k is the number of iterations. The initial matrix B_0 was always set to be the unit matrix. We stop the program when the condition ‖g(x)‖ ≤ 10^{-5} is satisfied.
In order to compare the efficiency of these algorithms, the residual sum of squares is defined by

RSS = Σ_{i=1}^m [h_i − (β̂_0 + β̂_1 X_{i1} + ... + β̂_p X_{ip})]²,

where β̂_0, β̂_1, ..., β̂_p are the parameters when the program stops or the solution obtained in some other way. Let RMS_p = RSS/(n − p), where n is the number of terms in the problem and p is the number of parameters; the smaller RMS_p is, the better the corresponding method [48]. The columns of Table 2 have the following meanings: x*: the approximate solution from the method of extreme value of calculus or from some software package.
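The comparison criterion can be computed as follows; we take RMS_p = RSS/(n − p), which is our reading of the description ("n is the number of terms and p the number of parameters"), and [48] should be consulted for the exact convention:

```python
import numpy as np

def rms_p(h, h_hat, p):
    """Residual mean square: residual sum of squares divided by the
    degrees of freedom n - p, where n = len(h) and p is the number
    of fitted parameters."""
    r = np.asarray(h, dtype=float) - np.asarray(h_hat, dtype=float)
    n = len(r)
    return (r @ r) / (n - p)
```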
We also solve these two problems by Algorithm 1. The numerical results in Table 2 indicate that Algorithm 1 performs better than the methods based on the extreme value of calculus or on software packages. We can therefore conclude that the numerical method can outperform the method of extreme value of calculus in some sense, and that some software for regression analysis could be further improved in the future. Moreover, with our proposed method the initial points do not influence the convergence of the sequence to a solution x*.
In Step 2 of Algorithm 1, combined with a backtracking process, a new line search technique based on (2.3) is used to determine the steplength.

Lemma 4.2
Let Assumption A be satisfied and consider the line search (2.3). If s_k ≠ 0, then there is a constant m_1 > 0 such that y_k^T s_k ≥ m_1 ‖s_k‖² for all k sufficiently large.

Lemma 4.3
Let Assumption A be satisfied and consider the line search (2.3). By Lemma 4.1 and Equation (2.4), we immediately obtain the inequalities

Σ_{k=0}^∞ ‖α_k d_k‖² < ∞  and  Σ_{k=0}^∞ ‖α_k g_k‖² < ∞.   (4.8)

Theorem 4.1
Let Assumption A and Equation (4.1) hold. Then the sequence {x_k} generated by Algorithm 1 satisfies

lim inf_{k→∞} ‖g_k‖ = 0.   (4.27)

x̄: the solution when the program is terminated. x_0: the initial point. NI: the total number of iterations.