Decrease of the Penalty Parameter in Differentiable Penalty Function Methods

We propose a simple modification of differentiable penalty methods for solving nonlinear programming problems. The modification decreases the penalty parameter and the ill-conditioning of the penalty method, leading to faster convergence to the optimal solution. We extend the modification to the augmented Lagrangian method and report numerical results on several nonlinear programming test problems, showing the effectiveness of the proposed approach.


Introduction
Solving nonlinear programming (NLP) problems via a penalty method was first introduced by Courant [1] in 1943. Fiacco and McCormick [2] developed barrier methods for solving NLP problems. Murray [3] showed that the Hessian matrix of the penalty method is ill-conditioned. Since then, many approaches for reducing the ill-conditioning of penalty methods have been proposed. To avoid excessive growth of the penalty parameter, Zangwill [4] introduced exact nondifferentiable penalty functions and Fletcher [5] introduced continuously differentiable exact penalty functions. Other exact penalty methods have been studied in [6]-[13] and elsewhere. In addition, Mongeau [14] decreased the penalty parameter in exact penalty methods for solving linear programming problems. Here, using the general ideas of Mongeau, we propose an approach to reduce the penalty parameter in the differentiable penalty method for solving NLP problems.

The Basic Idea
Consider the following programming problem:
$$\text{(NLP)}\qquad \min_{x}\; f(x) \quad \text{subject to} \quad g_j(x) \le 0,\quad j = 1,\dots,m,$$
where $f$ and the $g_j$ are twice continuously differentiable functions.


For a penalty parameter $\sigma > 0$, a common penalty function for (NLP) is
$$H_1(x,\sigma) = f(x) + \sigma P(x), \qquad P(x) = \sum_{j=1}^{m}\left[g_j^{+}(x)\right]^2, \qquad g_j^{+}(x) = \max\{0,\, g_j(x)\}.$$
A penalty problem for (NLP) is defined as follows:
$$\text{(PEN)}\qquad \min_x\; H_1(x,\sigma).$$
The gradient and Hessian of $H_1$ can be calculated as follows:
$$\nabla H_1(x,\sigma) = \nabla f(x) + 2\sigma \sum_{j=1}^{m} g_j^{+}(x)\,\nabla g_j(x),$$
$$\nabla^2 H_1(x,\sigma) = \nabla^2 f(x) + 2\sigma \sum_{j=1}^{m}\left[ g_j^{+}(x)\,\nabla^2 g_j(x) + s_j(x)\,\nabla g_j(x)\,\nabla g_j(x)^{\mathsf T}\right], \qquad (2.1)$$
where $s_j(x) = 1$ if $g_j(x) > 0$ and $s_j(x) = 0$ otherwise. Note that, due to the continuity of the second derivatives, the Hessian matrices $\nabla^2 f$, $\nabla^2 P$ and $\nabla^2 g_j$ are symmetric.
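To make these formulas concrete, here is a minimal Python sketch (our own illustration: the two-variable toy problem and all names are assumptions, not taken from the paper) that evaluates $H_1$ and $\nabla H_1$ for a single inequality constraint.

```python
import numpy as np

# Toy problem (illustrative only): min f(x) s.t. g(x) <= 0
f  = lambda x: (x[0] - 2.0)**2 + (x[1] - 1.0)**2
df = lambda x: np.array([2.0 * (x[0] - 2.0), 2.0 * (x[1] - 1.0)])
g  = lambda x: x[0] + x[1] - 1.0              # one constraint: g(x) <= 0
dg = lambda x: np.array([1.0, 1.0])

def H1(x, sigma):
    """Quadratic penalty H_1(x, sigma) = f(x) + sigma * [g^+(x)]^2."""
    return f(x) + sigma * max(0.0, g(x))**2

def grad_H1(x, sigma):
    """Gradient: grad f(x) + 2 * sigma * g^+(x) * grad g(x)."""
    return df(x) + 2.0 * sigma * max(0.0, g(x)) * dg(x)

x = np.array([1.5, 1.5])
print(H1(x, 10.0), grad_H1(x, 10.0))
```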
The condition number of a square matrix $A$ is given by
$$K(A) = \lVert A \rVert\,\lVert A^{-1} \rVert.$$
If $K(A)$ is large, then $A$ is said to be ill-conditioned. For a symmetric matrix $A$, it can be shown that
$$K(A) = \frac{|\lambda_{\max}|}{|\lambda_{\min}|},$$
where $\lambda_{\max}$ and $\lambda_{\min}$ are the largest and smallest eigenvalues of the matrix $A$, respectively.
Assume that there are $r$ active constraints at $x^*$, the optimal solution of (NLP), and that the gradients of these constraints are linearly independent. Then the matrix
$$V = \sum_{j \in \mathcal{A}(x^*)} \nabla g_j(x^*)\,\nabla g_j(x^*)^{\mathsf T},$$
where $\mathcal{A}(x^*)$ denotes the set of active constraints at $x^*$, has rank equal to $r$ and thus has $r$ nonzero eigenvalues.
Equation (2.1) then implies that, as $\sigma \to \infty$, at least $r$ eigenvalues of $\nabla^2 H_1$ tend to infinity. It has been shown in [15] that exactly $r$ eigenvalues tend to infinity while the other $n - r$ eigenvalues tend to finite limits, which implies the ill-conditioning of the Hessian of the penalty method.
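The growth of the condition number is easy to observe numerically. In the following sketch (our own illustration with a made-up two-variable problem; not from the paper), one eigenvalue of $\nabla^2 H_1$ grows like $2\sigma$ along the active-constraint direction while the other stays at $2$, so $K(\nabla^2 H_1) \to \infty$.

```python
import numpy as np

# Illustrative case: f(x) = ||x||^2 and one violated linear constraint with
# grad g = (1, 0), so by (2.1) the penalty Hessian is 2*I + 2*sigma*(grad g)(grad g)^T.
grad_g = np.array([1.0, 0.0])
for sigma in [1e0, 1e2, 1e4, 1e6]:
    H = 2.0 * np.eye(2) + 2.0 * sigma * np.outer(grad_g, grad_g)
    lam = np.linalg.eigvalsh(H)               # eigenvalues, sorted ascending
    print(f"sigma={sigma:8.0e}  eigenvalues={lam}  K(H)={lam[-1] / lam[0]:.1e}")
```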
To avoid the ill-conditioning, instead of the usual penalty function we consider the following function:
$$H_2(x,\sigma) = \frac{f(x)}{\sigma} + P(x).$$
Its corresponding penalty problem for (NLP) is $\min_x H_2(x,\sigma)$. Since
$$\nabla^2 H_2(x,\sigma) = \frac{1}{\sigma}\,\nabla^2 f(x) + \nabla^2 P(x),$$
if $\nabla^2 P$ is of full rank (for example, if $P$ is a strictly convex function), then all eigenvalues of $\nabla^2 H_2$ tend to finite nonzero limits as $\sigma \to \infty$. For instance, in a simple one-dimensional example whose optimal solution is $x^* = 1$, the Hessian $\nabla^2 H_2$ tends to a fixed number as $\sigma \to \infty$. Although under these assumptions the Hessian of $H_2$ is not ill-conditioned, there is another problem. For every feasible point $x$ we have $P(x) = 0$, and for too large a $\sigma$ the value of $f(x)/\sigma$ is very close to zero. Thus, near the boundary of the feasible region, $H_2$ is almost zero, and this causes premature termination of the penalty method. So the penalty method with $H_2$ only gives a feasible point; it does not converge to the optimal solution, or converges very slowly.
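This failure mode can be reproduced in a few lines (our own hypothetical experiment; scipy's default BFGS tolerances are an assumption): for very large $\sigma$ the gradient of $f(x)/\sigma$ falls below the solver's tolerance at the feasible starting point, so minimization of $H_2$ stops without any progress toward the optimum.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative problem: min (x-2)^2  s.t.  x <= 3   (optimum x* = 2)
f = lambda x: (x[0] - 2.0)**2
g = lambda x: x[0] - 3.0                      # g(x) <= 0

H2 = lambda x, sigma: f(x) / sigma + max(0.0, g(x))**2

for sigma in [1.0, 1e4, 1e8]:
    res = minimize(lambda z: H2(z, sigma), x0=np.array([0.0]), method="BFGS")
    # For sigma = 1e8 the gradient -4/sigma at x = 0 is below the default
    # tolerance, so BFGS stops at the feasible but non-optimal starting point.
    print(f"sigma={sigma:8.0e}  x={res.x[0]: .6f}  iterations={res.nit}")
```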
Thus, to have the advantages of both $H_1$ and $H_2$, we consider the following combined formula:
$$H_3(x,\sigma) = \frac{f(x)}{\varphi(\sigma)} + \sigma P(x),$$
where $\varphi$ is a positive and increasing function in terms of $\sigma$. This penalty function applies the penalty twice: once by multiplying $P(x)$ by $\sigma$, and again by dividing $f(x)$ by $\varphi(\sigma)$. In fact, $H_3$ is equivalent to the following penalty function, from which a factor $\varphi(\sigma)$ has been factorized:
$$H_4(x,\sigma) = f(x) + \sigma\varphi(\sigma)\,P(x) = \varphi(\sigma)\,H_3(x,\sigma).$$
This leads to faster convergence of the penalty method using $H_3$ than of that using $H_4$. We use the following general formula instead of $H_3$ for (PEN):
$$H_{\alpha}(x,\sigma) = \frac{f(x)}{\sigma^{\alpha}} + \sigma P(x), \qquad \alpha \ge 0,$$
which corresponds to the choice $\varphi(\sigma) = \sigma^{\alpha}$; note that $\alpha = 0$ recovers the ordinary penalty function $H_1$.

Lemma 2.1. Suppose that $x_\sigma$ solves (PEN) with the penalty function $H_{\alpha}$ and that the points $x_\sigma$ lie in a compact subset of $X$. Then any limit point of $x_\sigma$ as $\sigma \to \infty$ is a solution to (NLP).
Proof. Consider the following problem:
$$\min_x\; \sigma^{\alpha} H_{\alpha}(x,\sigma) = f(x) + \sigma^{1+\alpha} P(x);$$
clearly, this problem is equivalent to (PEN) with penalty parameter $\sigma^{1+\alpha}$ in place of $\sigma$, since multiplying the objective by the positive constant $\sigma^{\alpha}$ does not change its minimizers. Since $\sigma^{1+\alpha} \to \infty$ as $\sigma \to \infty$, the standard convergence theorem for penalty methods, applied with the penalty parameter $\sigma^{1+\alpha}$, yields the result.
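A minimal outer loop for the modified penalty method could look as follows (a sketch under our own assumptions: a hypothetical toy problem, scipy's BFGS as the unconstrained solver, and a simple feasibility-based stopping rule; the paper's MATLAB implementation instead uses a Goldstein line search with a modified BFGS direction).

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative problem: min (x0-2)^2 + (x1-1)^2  s.t.  x0 + x1 <= 1;  optimum (1, 0)
f = lambda x: (x[0] - 2.0)**2 + (x[1] - 1.0)**2
P = lambda x: max(0.0, x[0] + x[1] - 1.0)**2

def H_alpha(x, sigma, alpha):
    """New penalty function f(x)/sigma^alpha + sigma*P(x); alpha = 0 gives H_1."""
    return f(x) / sigma**alpha + sigma * P(x)

def penalty_method(alpha, x0, sigma=1.0, rate=10.0, tol=1e-8, max_outer=20):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_outer):
        x = minimize(lambda z: H_alpha(z, sigma, alpha), x, method="BFGS").x
        if P(x) < tol:               # near-feasible: accept the iterate
            break
        sigma *= rate                # otherwise increase the penalty parameter
    return x, sigma

x, sigma = penalty_method(alpha=0.5, x0=[0.0, 0.0])
print("solution ~", x, " final sigma:", sigma)
```

Note that the effective penalty parameter is $\sigma^{1+\alpha}$, so the same infeasibility level is reached with a much smaller stored $\sigma$ than in the ordinary method; this is the sense in which the modification decreases the penalty parameter.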

Extension to Augmented Lagrangian Methods
The augmented Lagrangian for problem (NLP) is defined as follows (we write the standard form for inequality constraints):
$$A_1(x,\lambda,\sigma) = f(x) + \frac{1}{2\sigma}\sum_{j=1}^{m}\left(\left[\max\{0,\ \lambda_j + \sigma g_j(x)\}\right]^2 - \lambda_j^2\right).$$

It has been shown that if $\lambda^*$ is the Lagrange multiplier vector of (NLP) at the optimal solution $x^*$, then for large enough $\sigma$, minimization of $A_1(\cdot,\lambda^*,\sigma)$ gives the optimal solution of (NLP). Thus, $A_1$ is said to be exact for solving (NLP).
Since the value of $\lambda^*$ is not available at the outset, the following formula is usually used for updating the values of the $\lambda_j$:
$$\lambda_j \leftarrow \max\{0,\ \lambda_j + \sigma g_j(x)\}.$$
Thus, following the discussion of the previous section, instead of $A_1$ we consider the following penalty function:
$$A_{\varphi}(x,\lambda,\sigma) = \frac{f(x)}{\varphi(\sigma)} + \frac{1}{2\sigma\varphi(\sigma)^2}\sum_{j=1}^{m}\left(\left[\max\{0,\ \lambda_j + \sigma\varphi(\sigma)\,g_j(x)\}\right]^2 - \lambda_j^2\right).$$
Since the ordinary augmented Lagrangian method for solving (NLP) is exact, and we also have
$$\varphi(\sigma)\,A_{\varphi}(x,\lambda,\sigma) = A_1\big(x,\lambda,\sigma\varphi(\sigma)\big), \qquad (3.1)$$
clearly, similarly to the ordinary augmented Lagrangian method, we have the following result.

Lemma 3.1. Suppose that the second order sufficient conditions for (NLP) are satisfied at $(x^*,\lambda^*)$. Then there exists a $\bar{\sigma} > 0$ such that for any $\sigma \ge \bar{\sigma}$, minimization of $A_{\varphi}(\cdot,\lambda^*,\sigma)$ gives the optimal solution $x^*$ of (NLP).

From (3.1), we can consider $A_{\varphi}$ as an ordinary augmented Lagrangian with penalty parameter $\sigma\varphi(\sigma)$. Thus, the new updating formula for the $\lambda_j$ is as follows:
$$\lambda_j \leftarrow \max\{0,\ \lambda_j + \sigma\varphi(\sigma)\,g_j(x)\}.$$
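The following sketch (our own illustration, under the same assumptions as the previous one) implements this modified augmented Lagrangian iteration with $\varphi(\sigma) = \sigma^{\alpha}$ and the update rule above.

```python
import numpy as np
from scipy.optimize import minimize

# Same illustrative problem: min (x0-2)^2 + (x1-1)^2  s.t.  x0 + x1 <= 1
f = lambda x: (x[0] - 2.0)**2 + (x[1] - 1.0)**2
g = lambda x: x[0] + x[1] - 1.0
phi = lambda sigma, alpha: sigma**alpha

def A_phi(x, lam, sigma, alpha):
    """Modified augmented Lagrangian: phi(sigma)*A_phi = A_1(x, lam, sigma*phi(sigma))."""
    p = phi(sigma, alpha)
    t = max(0.0, lam + sigma * p * g(x))
    return f(x) / p + (t**2 - lam**2) / (2.0 * sigma * p**2)

def aug_lag_method(alpha, x0, sigma=1.0, rate=4.0, tol=1e-8, max_outer=30):
    x, lam = np.asarray(x0, dtype=float), 0.0
    for _ in range(max_outer):
        x = minimize(lambda z: A_phi(z, lam, sigma, alpha), x, method="BFGS").x
        # First order multiplier update with effective parameter sigma*phi(sigma)
        lam = max(0.0, lam + sigma * phi(sigma, alpha) * g(x))
        if max(0.0, g(x))**2 < tol:
            break
        sigma *= rate
    return x, lam, sigma

x, lam, sigma = aug_lag_method(alpha=0.5, x0=[0.0, 0.0])
print("x ~", x, " lambda ~", lam, " sigma:", sigma)   # expect x ~ (1, 0), lambda ~ 2
```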

Algorithms
Consider the following augmented Lagrangian problem for (NLP):
$$\min_x\; A_{\varphi}(x,\lambda,\sigma),$$
where $\bar{\lambda}$ is the average of the $\lambda_j$. For solving (NLP) via the augmented Lagrangian method we apply the following algorithm, which is similar to Algorithm 1 of [11] with the first order update rule for the Lagrange multipliers.

Algorithm 1. Choose $x^0$, $\lambda^0 \ge 0$ and $\sigma_0 > 0$. While the stopping test is not satisfied: minimize $A_{\varphi}(\cdot,\lambda^k,\sigma_k)$ approximately to obtain $x^{k+1}$; update the multipliers by the first order rule $\lambda_j^{k+1} = \max\{0,\ \lambda_j^k + \sigma_k\varphi(\sigma_k)\,g_j(x^{k+1})\}$; if the infeasibility has not decreased sufficiently, increase $\sigma_k$. end(if) end(while)
For solving (NLP) via the penalty method, we refine Algorithm 1 by setting $\lambda = 0$ and removing the step of its updating. Also, in the line search method of the algorithm we solve the following problem:
$$\min_{t \ge 0}\; H_{\alpha}(x + t d,\ \sigma),$$
where $d$ is the current search direction.
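The one-dimensional subproblem can be illustrated as follows (a sketch; the use of scipy's `minimize_scalar` and the bracketing interval are our assumptions, not the paper's line search procedure).

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Same toy problem as in the penalty-method sketch above
f = lambda x: (x[0] - 2.0)**2 + (x[1] - 1.0)**2
P = lambda x: max(0.0, x[0] + x[1] - 1.0)**2
H_alpha = lambda x, sigma, alpha: f(x) / sigma**alpha + sigma * P(x)

x = np.array([0.0, 0.0])
d = np.array([1.0, 0.25])          # a descent direction for H_alpha at x
res = minimize_scalar(lambda t: H_alpha(x + t * d, 10.0, 0.5),
                      bounds=(0.0, 10.0), method="bounded")
print("step length t* ~", res.x)
```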

Test Results
Algorithm 1 is programmed in MATLAB 7.6 and run on a PC with a 1.8 GHz processor and 1 GB of RAM. For solving the subproblems we use a line search algorithm: the step length is determined by the Goldstein test and the direction is determined by the BFGS formula with Powell's modifications [16] (negative eigenvalues are considered as zero). The function $\varphi$ is considered as $\varphi(\sigma) = \sigma^{\alpha}$, a positive and increasing function in terms of $\sigma$. For each test problem we take a fixed initial point.
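For reference, the Goldstein test for accepting a step length can be written compactly as below (a sketch; the constant c and the sample values are our assumptions, not the paper's settings).

```python
import numpy as np

def goldstein_ok(F, gradF, x, d, t, c=0.25):
    """Goldstein test, 0 < c < 1/2: accept step t along d if
       F(x) + (1-c)*t*s <= F(x + t*d) <= F(x) + c*t*s, where s = gradF(x)'d."""
    s = float(np.dot(gradF(x), d))           # directional derivative (< 0 for descent)
    Ft, F0 = F(x + t * d), F(x)
    return F0 + (1.0 - c) * t * s <= Ft <= F0 + c * t * s

# Usage on a simple quadratic: accepted steps form an interval around the minimizer
F = lambda x: float(np.dot(x, x))
gradF = lambda x: 2.0 * x
x, d = np.array([1.0, 1.0]), np.array([-1.0, -1.0])
print([t for t in (0.1, 0.5, 1.0, 1.5) if goldstein_ok(F, gradF, x, d, t)])
```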
All the test problems with one or more constraints are selected from Hock and Schittkowski's set [17] and Schittkowski's set [18], available in [19]. The characteristics of the test problems are listed in Table 1, where n is the number of variables, m the total number of constraints, m_NL the number of nonlinear constraints, and "objective" the type of the objective function (linear/nonlinear).
The computational results for the penalty method and the augmented Lagrangian method are summarized in Tables 2 and 3, respectively. The following symbols are used in these tables: val* = optimal value of the test problem; val = the obtained optimal value; iter = number of iterations; eval = number of function evaluations; eval_0 = eval for the ordinary penalty method (α = 0); time = CPU time (in seconds) to reach the solution. Note that table rows corresponding to α = 0 show numerical results for the ordinary methods, and the other rows show numerical results for the new methods.
As seen in Tables 2 and 3, the performance of the penalty methods with the new formulas is significantly better than that with the usual formulas. The new penalty methods decrease the number of iterations and the number of function evaluations and, as expected, notably reduce the penalty parameter. We observed in the computational results that, although convergence is faster for larger $\alpha$, for some test problems a larger $\alpha$ increases the distance between the obtained solution and the optimal solution. Note that using $\alpha > 1$ sometimes makes the first term of $H_{\alpha}$ converge to zero faster than the second term, and this causes termination of the penalty method at the boundary of the feasible region. Thus, we suggest using $\sigma^{\alpha}$ with $\alpha \le 1$; that is, using a $\varphi$ of order greater than $O(\sigma)$ is not recommended. For more efficiency, we suggest a particular choice of $\varphi(\sigma) = \sigma^{\alpha}$, $\alpha \le 1$, for the penalty method and another for the augmented Lagrangian method. In Figure 1, the number of function evaluations of the ordinary penalty method ($\alpha = 0$) and of the new penalty method are compared; a similar comparison is made for the ordinary and new augmented Lagrangian methods.

Conclusions
We proposed a simple modification of penalty methods and showed that the new penalty methods perform better than the usual penalty methods. Computational results on several test problems showed that the number of iterations decreases and the amount of computation is significantly reduced.
