
A New Global Scalarization Method for Multiobjective Optimization with an Arbitrary Ordering Cone

PP. 154-163
DOI: 10.4236/am.2017.82013

ABSTRACT

We propose a new scalarization method which consists in constructing, for a given multiobjective optimization problem, a single scalarization function, whose global minimum points are exactly vector critical points of the original problem. This equivalence holds globally and enables one to use global optimization algorithms (for example, classical genetic algorithms with “roulette wheel” selection) to produce multiple solutions of the multiobjective problem. In this article we prove the mentioned equivalence and show that, if the ordering cone is polyhedral and the function being optimized is piecewise differentiable, then computing the values of a scalarization function reduces to solving a quadratic programming problem. We also present some preliminary numerical results pertaining to this new method.

1. Introduction

Scalarization is one of the most commonly used methods of solving multiobjective optimization problems. It consists in replacing the original multiobjective problem by a scalar optimization problem, or a family of scalar optimization problems, which is, in a certain sense, equivalent to the original problem. The existing scalarization methods can be divided into two groups:

1) Methods that use some representation of a given multiobjective problem as a parametrized family of scalar optimization problems. Such scalarization methods should have the following two properties (see [1], p. 77): (i) an optimal solution of each scalarized problem is efficient (in some sense) for the original multiobjective problem; (ii) every efficient solution of the multiobjective problem can be obtained as an optimal solution of an appropriate scalarized problem by adjusting the parameter value. Some examples of possible scalarizations of this kind are given, for instance, in [1] (pp. 77-78) and [2].

2) Methods that use local equivalence of a multiobjective optimization problem and some scalar optimization problem whose formulation depends on a given point. Such equivalence enables one to solve the multiobjective problem locally by using necessary and/or sufficient optimality conditions formulated for the scalar problem (for examples of such an approach, see [3], Thm. 1 and [4], Prop. 2.1 and 2.2).

There are also scalarization approaches which combine properties of both groups, such as the Pascoletti-Serafini scalarization [5] (for a survey of different scalarization methods, see [6], Chapter 2; for adaptive algorithms using different scalarizations, see [6], Chapter 4; for scalarizations in the context of variable ordering structures, see [7], Chapters 4 and 5).

In this paper, we propose a new scalarization method different from the above-mentioned ones. It consists in constructing, for a given multiobjective optimization problem, a single scalarization function, whose global minimum points are exactly vector critical points in the sense of [8] for the original problem. This equivalence holds globally and enables one to use global optimization algorithms designed for scalar-valued problems (for example, classical genetic algorithms with “roulette wheel” selection) to solve the original multiobjective problem. We also show that, if we consider an order defined by a polyhedral cone and the function being optimized is piecewise differentiable, then computing the values of a scalarization function reduces to solving a quadratic programming problem.

So far, the term “scalarization function” has been used for a scalar-valued function defined on the image space of an optimization problem, which transforms a vector-valued objective function into a scalar-valued one (see [9], Thm. 1.1). However, by using such a scalarization, we are able to find only some (usually a small part of) Pareto solutions, or efficient points, of the original multiobjective optimization problem, while the other Pareto solutions are lost. Contrary to this approach, our scalarization function is defined on the space of feasible solutions of the original problem and attains the minimum (zero) value on the set of vector critical points for this problem. The set of vector critical points is larger than the set of efficient solutions and can serve as an approximation of the latter.

The purpose of this research is to describe the idea of our new scalarization method and to present some underlying theory for the case of an unconstrained multiobjective optimization problem. The extension to constrained optimization is also possible and will be the subject of further investigations.

2. A Global Scalarization Function for an Arbitrary Ordering Cone

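Let $\Omega$ be an open set in $\mathbb{R}^n$, and let $f = (f_1, \ldots, f_p) : \Omega \to \mathbb{R}^p$ be a locally Lipschitzian vector function. Suppose that $C$ is a closed convex pointed cone in $\mathbb{R}^p$ with nonempty interior. We denote by $C^+$ the positive polar cone to $C$, i.e.,

$$C^+ := \{ z \in \mathbb{R}^p : \langle z, y \rangle \ge 0, \ \forall y \in C \}, \tag{1}$$

where $\langle \cdot, \cdot \rangle$ is the usual inner product in $\mathbb{R}^p$. The partial order relation in $\mathbb{R}^p$ is defined by

$$y \preceq z \quad \text{if and only if} \quad z - y \in C, \tag{2}$$

for all $y, z \in \mathbb{R}^p$. We consider the following multiobjective optimization problem:

$$\text{minimize } f(x) \ \text{ subject to } x \in \Omega. \tag{3}$$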
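As a small illustration of the order relation (2), the following sketch checks $y \preceq z$ for a polyhedral cone. The inequality description $C = \{v : Gv \ge 0\}$ by a matrix `G`, and the names `in_cone` and `precedes`, are assumptions made for this example only; they do not appear in the paper.

```python
def in_cone(G, v, tol=1e-12):
    """Return True if G v >= 0 componentwise, i.e. v lies in C = {v : G v >= 0}.

    G is a list of rows. This inequality description of C is an assumption
    of the sketch and covers polyhedral cones only.
    """
    return all(sum(g_j * v_j for g_j, v_j in zip(row, v)) >= -tol for row in G)


def precedes(G, y, z):
    """Order relation (2): y precedes z if and only if z - y belongs to C."""
    diff = [z_j - y_j for y_j, z_j in zip(y, z)]
    return in_cone(G, diff)


# Pareto cone in R^2: C = {v : v_1 >= 0, v_2 >= 0}.
G_pareto = [[1.0, 0.0], [0.0, 1.0]]
print(precedes(G_pareto, (1.0, 2.0), (2.0, 3.0)))  # True: z - y = (1, 1) is in C
print(precedes(G_pareto, (1.0, 3.0), (2.0, 2.0)))  # False: z - y = (1, -1) is not in C
```

The second pair is incomparable in both directions, which reflects the fact that (2) defines only a partial order.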

Definition 1 [10] We define the (Clarke’s) generalized Jacobian of $f$ at $\bar{x} \in \Omega$ as follows:

$$\partial f(\bar{x}) := \operatorname{co} \left\{ \lim_{n \to \infty} Jf(x_n) : x_n \to \bar{x}, \ Jf(x_n) \text{ exists} \right\}, \tag{4}$$

where $Jf(x)$ denotes the usual Jacobian matrix of $f$ at $x$ whenever $f$ is Fréchet differentiable at $x$, and “co” denotes the convex hull of a set.

We will denote by $\mathbb{R}^{p \times n}$ the vector space of all $p \times n$ real matrices. It follows from ([10], Prop. 2.6.2(a)) that $\partial f(\bar{x})$ is a nonempty convex compact subset of $\mathbb{R}^{p \times n}$. The calculation of Clarke’s generalized Jacobian in the general case can be quite difficult due to the lack of exact calculus rules. For piecewise differentiable functions, however, there is a representation of the generalized Jacobian as the convex hull of a finite number of Jacobian matrices, which was obtained by Scholtes in [11]. To formulate this result, we need some additional definitions.

Definition 2 Let $\Omega$ be an open subset of $\mathbb{R}^n$ and let $f_i : \Omega \to \mathbb{R}^p$, $i = 1, \ldots, k$, be a collection of continuous functions.

(i) A function $f : \Omega \to \mathbb{R}^p$ is said to be a continuous selection of the functions $f_1, \ldots, f_k$ on the set $U \subset \Omega$ if $f$ is continuous on $U$ and $f(x) \in \{ f_1(x), \ldots, f_k(x) \}$ for every $x \in U$.

(ii) A function $f : \Omega \to \mathbb{R}^p$ is called a $PC^1$-function if for every $\bar{x} \in \Omega$ there exists an open neighborhood $U \subset \Omega$ and a finite number of $C^1$-functions $f_i : U \to \mathbb{R}^p$, $i = 1, \ldots, k$, such that $f$ is a continuous selection of $f_1, \ldots, f_k$ on $U$. In this case, we call $f_1, \ldots, f_k$ the selection functions for $f$ at $\bar{x}$.

(iii) Let $f : \Omega \to \mathbb{R}^p$ be a $PC^1$-function and let $\bar{x} \in U \subset \Omega$ ($U$ open). Suppose that $f$ is a continuous selection of $f_1, \ldots, f_k$ on $U$. We define the set of essentially active indices for $f$ at $\bar{x}$ as follows:

$$I_f^e(\bar{x}) := \{ i \in \{1, \ldots, k\} : \bar{x} \in \operatorname{cl}(\operatorname{int} \{ x \in U : f(x) = f_i(x) \}) \}. \tag{5}$$

Proposition 3 ([11], Prop. 4.3.1) If $\Omega$ is an open subset of $\mathbb{R}^n$ and $f : \Omega \to \mathbb{R}^p$ is a $PC^1$-function with $C^1$ selection functions $f_i : U \to \mathbb{R}^p$, $i = 1, \ldots, k$, where $\bar{x} \in U \subset \Omega$, then

$$\partial f(\bar{x}) = \operatorname{co} \{ Jf_i(\bar{x}) : i \in I_f^e(\bar{x}) \}. \tag{6}$$
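As a quick illustration of Proposition 3 (a worked example added here, not taken from [11]), take $n = p = 1$, $\Omega = U = \mathbb{R}$ and $f(x) = |x|$, which is a continuous selection of the $C^1$ functions $f_1(x) = x$ and $f_2(x) = -x$. Both selection functions are essentially active at $0$, and only one of them is essentially active elsewhere:

```latex
\begin{aligned}
\{x : f(x) = f_1(x)\} &= [0, +\infty), &
\{x : f(x) = f_2(x)\} &= (-\infty, 0], \\
I_f^e(0) &= \{1, 2\}, &
\partial f(0) &= \operatorname{co}\{Jf_1(0), Jf_2(0)\} = \operatorname{co}\{1, -1\} = [-1, 1], \\
I_f^e(\bar{x}) &= \{1\} \quad (\bar{x} > 0), &
\partial f(\bar{x}) &= \{1\} \quad (\bar{x} > 0).
\end{aligned}
```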

Definition 4 [8] Let $\bar{x} \in \Omega$. We say that

(i) $\bar{x}$ is a vector critical point for problem (3) if there exist $z \in C^+ \setminus \{0_p\}$ and $A \in \partial f(\bar{x})$ such that

$$z^T A = 0_n, \tag{7}$$

where $0_n$ is the zero vector in $\mathbb{R}^n$;

(ii) $\bar{x}$ is an efficient solution for (3) if

$$(f(\Omega) - f(\bar{x})) \cap (-C) = \{0_p\}; \tag{8}$$

(iii) $\bar{x}$ is a weakly efficient solution for (3) if

$$(f(\Omega) - f(\bar{x})) \cap (-\operatorname{int} C) = \emptyset; \tag{9}$$

(iv) $\bar{x}$ is a local weakly efficient solution for (3) if there exists a neighborhood $U$ of $\bar{x}$ such that

$$(f(\Omega \cap U) - f(\bar{x})) \cap (-\operatorname{int} C) = \emptyset. \tag{10}$$

It is obvious that the implications (ii) $\Rightarrow$ (iii) $\Rightarrow$ (iv) hold in Definition 4. The implication (iv) $\Rightarrow$ (i) (for locally Lipschitzian $f$) follows from [12] (Thm. 5.1 (i)(b)). Some opposite implications can be obtained under additional assumptions of generalized convexity type. In particular, Gutiérrez et al. [8] have identified the class of pseudoinvex functions for which (i) $\Rightarrow$ (iii) holds, and the class of strong pseudoinvex functions for which (i) $\Rightarrow$ (ii) holds.

Definition 5 [13] Let $C$ be a nontrivial convex cone in $\mathbb{R}^p$. A nonempty convex subset $B$ of $C$ is called a base for $C$ if each nonzero element $z \in C$ has a unique representation of the form $z = \lambda b$ with $\lambda > 0$ and $b \in B$.

Remark 6 If $B$ is a base of the nontrivial convex cone $C$, then $0_p \notin B$.

Lemma 7 (a finite-dimensional version of [13], Lemma 2.2.17) Let $C$ be a nontrivial closed convex cone in $\mathbb{R}^p$ with $\operatorname{int} C \ne \emptyset$. If $\bar{y} \in \operatorname{int} C$, then the set

$$B := \{ z \in C^+ : \langle z, \bar{y} \rangle = 1 \} \tag{11}$$

is a compact base for $C^+$.

In the sequel, we consider a fixed vector $\bar{y} \in \operatorname{int} C$ and a base $B$ for $C^+$ defined by (11). In order to define a global scalarization function for problem (3), we first consider the following mapping $h : \mathbb{R}^p \times \mathbb{R}^{p \times n} \to \mathbb{R}^n$:

$$h(y, A) := y^T A. \tag{12}$$

Lemma 8 A point $\bar{x} \in \Omega$ is a vector critical point for problem (3) if and only if

$$0_n \in h(B \times \partial f(\bar{x})). \tag{13}$$

Proof. If $\bar{x} \in \Omega$ is a vector critical point for problem (3), then equality (7) holds for some $z \in C^+ \setminus \{0_p\}$ and $A \in \partial f(\bar{x})$. Since $B$ is a base for $C^+$, there exist $\lambda > 0$ and $b \in B$ such that $z = \lambda b$. Then, by (7),

$$h(b, A) = b^T A = 0_n, \tag{14}$$

so that (13) holds. Conversely, if (14) is true for some $b \in B$ and $A \in \partial f(\bar{x})$, then by Definition 5 and Remark 6, we have $b \in C^+ \setminus \{0_p\}$. Taking $z = b$ in Definition 4, we see that $\bar{x}$ is a vector critical point for (3).

For a nonempty subset $S$ of $\mathbb{R}^n$, let $d(\cdot, S) : \mathbb{R}^n \to \mathbb{R}$ be the distance function of $S$, defined as follows:

$$d(x, S) := \inf \{ \| x - u \| : u \in S \}, \tag{15}$$

where $\| \cdot \|$ denotes the Euclidean norm. We now introduce the following scalarization function $s : \Omega \to [0, +\infty)$:

$$s(x) := d(0_n, h(B \times \partial f(x))). \tag{16}$$

Note that $s$ depends on the choice of $\bar{y}$. The name “scalarization function” is justified by the following.

Theorem 9 A point $\bar{x} \in \Omega$ is a vector critical point for problem (3) if and only if $s(\bar{x}) = 0$.

Proof. If $\bar{x}$ is a vector critical point for (3), then by Lemma 8, condition (13) holds, which gives $s(\bar{x}) = 0$. Conversely, suppose that $s(\bar{x}) = 0$. Since $h$ is continuous and the sets $B$ and $\partial f(\bar{x})$ are compact in $\mathbb{R}^p$ and $\mathbb{R}^{p \times n}$, respectively, the set $h(B \times \partial f(\bar{x}))$ is also compact; hence it is closed. Therefore, the equality $s(\bar{x}) = 0$ implies condition (13).
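A simple sanity check of Theorem 9 is the single-objective case. The following worked special case is added for illustration, under the extra assumptions $p = 1$, $C = [0, +\infty)$ and $\bar{y} = 1$. The function $s$ then reduces to the distance from zero to the Clarke generalized gradient, so its zeros are the usual Clarke stationary points:

```latex
\begin{aligned}
C^+ &= \{ z \in \mathbb{R} : zy \ge 0 \ \ \forall y \ge 0 \} = [0, +\infty), \\
B &= \{ z \in C^+ : z \cdot 1 = 1 \} = \{1\}, \\
h(B \times \partial f(x)) &= \{ 1 \cdot A : A \in \partial f(x) \} = \partial f(x), \\
s(x) &= d(0_n, \partial f(x)), \qquad
s(x) = \| \nabla f(x) \| \ \text{ if } f \text{ is } C^1 \text{ near } x.
\end{aligned}
```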

Having defined the scalarization function s, we can now replace problem (3) by the following scalar optimization problem:

$$\text{minimize } s(x) \ \text{ subject to } x \in \Omega. \tag{17}$$

Obviously, problems (3) and (17) are not equivalent because there may exist vector critical points which are not (weakly) efficient solutions for (3). Nevertheless, by solving problem (17) we can obtain some approximation of the set of solutions to (3).

Computing the distance function in (16) is not easy in the general case, but under additional assumptions on both C and f, it is possible to apply some existing algorithms to perform this task. The details are described below.

Definition 10 ([14], p. 170) A convex set $D$ in $\mathbb{R}^p$ is called polyhedral if it can be expressed as the intersection of some finite collection of closed half-spaces, that is, there exist vectors $b_i \in \mathbb{R}^p$ and numbers $\beta_i \in \mathbb{R}$ such that

$$D = \{ y \in \mathbb{R}^p : \langle y, b_i \rangle \le \beta_i, \ i = 1, \ldots, m \}. \tag{18}$$

A convex cone which is a polyhedral set is called a polyhedral cone.

Theorem 11 Suppose that the ordering cone $C$ in $\mathbb{R}^p$ is polyhedral and the function $f : \Omega \to \mathbb{R}^p$ is $PC^1$. Let $\bar{y} \in \operatorname{int} C$, let $B$ be a base for $C^+$ defined by (11) and let $h$ be the function defined by (12). Then, for each $x \in \Omega$, the set $h(B \times \partial f(x))$ is polyhedral, or equivalently, it can be represented as the convex hull of a finite number of points in $\mathbb{R}^n$.

Proof. It follows from ([14], Thm. 19.1) that a convex set $D$ in $\mathbb{R}^p$ is polyhedral if and only if it is finitely generated, which means that there exist vectors $a_1, \ldots, a_l$ such that, for a fixed integer $k$, $0 \le k \le l$, $D$ consists of all the vectors of the form

$$x = \lambda_1 a_1 + \cdots + \lambda_k a_k + \lambda_{k+1} a_{k+1} + \cdots + \lambda_l a_l, \tag{19}$$

where

$$\lambda_1 + \cdots + \lambda_k = 1, \quad \lambda_i \ge 0 \ \text{ for } i = 1, \ldots, l. \tag{20}$$

In particular, if $D$ is bounded, then no $\lambda_i$ can be arbitrarily large, which implies that $k = l$, and conditions (19)-(20) reduce to

$$x \in \operatorname{co} \{ a_1, \ldots, a_k \}.$$

By assumption, $C$ is polyhedral, hence, by [14] (Corollary 19.2.2), $C^+$ is also a polyhedral cone, which implies that its base $B$ is a polyhedral set. By Proposition 3, $\partial f(x)$ is the convex hull of a finite collection of $p \times n$ matrices, so it is a polyhedral set in $\mathbb{R}^{p \times n}$. It is easy to prove that the Cartesian product of two polyhedral sets is a polyhedral set and that the image of a polyhedral set under a linear transformation is a polyhedral set (see [15], Proposition A.3.4). Therefore, $h(B \times \partial f(x))$ is a polyhedral set in $\mathbb{R}^n$.

Theorem 11 reduces the problem of computing the values $s(x)$ given by (16) to the problem of computing the Euclidean projection of $0_n$ onto the polyhedron $h(B \times \partial f(x))$. This is a particular case of a quadratic programming problem (see [16], p. 398). There are also specialized algorithms designed for computing such projections (see [17] [18]).
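To make this reduction concrete, here is a minimal sketch that approximates the distance from the origin to the convex hull of finitely many given points. It uses Gilbert's algorithm (a Frank-Wolfe iteration with exact line search), chosen here only for brevity; it is not one of the specialized methods of [17] [18], and a general quadratic programming solver could be used instead. The function name `min_norm_in_hull`, the tolerance and the iteration limit are assumptions of this example, and the generating points are taken as input.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def min_norm_in_hull(points, tol=1e-12, max_iter=10000):
    """Approximate the minimum-norm point of co{points} by Gilbert's algorithm.

    Returns (x, norm_x): x is a point of the convex hull (list of floats) and
    norm_x approximates the distance from the origin to the hull.
    """
    x = list(points[0])                    # start at an arbitrary generator
    for _ in range(max_iter):
        # Generator minimizing the linearized objective <x, v>.
        v = min(points, key=lambda p: dot(x, p))
        gap = dot(x, x) - dot(x, v)        # Frank-Wolfe gap, nonnegative
        if gap <= tol:
            break                          # x is (nearly) optimal
        d = [xi - vi for xi, vi in zip(x, v)]
        step = min(1.0, gap / dot(d, d))   # exact line search on [x, v]
        x = [xi - step * di for xi, di in zip(x, d)]
    return x, math.sqrt(dot(x, x))


# Segment joining (1, 0) and (0, 1): the closest point to 0 is (0.5, 0.5).
x_min, dist = min_norm_in_hull([(1.0, 0.0), (0.0, 1.0)])
print(x_min, dist)
```

The iteration converges slowly when the projection lies in the relative interior of a face, so for production use a dedicated projection method or QP solver is the safer choice.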

3. The Case of Two Objectives

For two objectives, under differentiability assumptions, it is possible to find some representation of the scalarization function $s$ in terms of the gradients $\nabla f_1$ and $\nabla f_2$. Let $p = 2$ and suppose that the mapping $f = (f_1, f_2)$ is continuously differentiable on $\mathbb{R}^n$. Denote by $\nabla f_i(x)$ the gradient of $f_i$ at $x$ ($i = 1, 2$), regarded as a row vector. Then (4) implies

$$\partial f(\bar{x}) = \{ Jf(\bar{x}) \} = \left\{ \begin{bmatrix} \nabla f_1(\bar{x}) \\ \nabla f_2(\bar{x}) \end{bmatrix} \right\}. \tag{21}$$

The following theorem will help to compute the scalarization function (16) for bi-objective problems.

Theorem 12 Let $p = 2$, $\bar{y} \in \operatorname{int} C$, and let $B$ be the compact base for $C^+$ defined by (11). Then there exist vectors $b^i = (b_1^i, b_2^i) \in B$, $i = 1, 2$, such that

$$h(B \times \partial f(\bar{x})) = \operatorname{co} \{ b_1^1 \nabla f_1(\bar{x}) + b_2^1 \nabla f_2(\bar{x}), \ b_1^2 \nabla f_1(\bar{x}) + b_2^2 \nabla f_2(\bar{x}) \}. \tag{22}$$

Proof. It follows from (11) that $B$ is a subset of some line in $\mathbb{R}^2$. Moreover, by Lemma 7, $B$ is compact and convex, so it must be a closed line segment. Denote by $b^1$ and $b^2$ the endpoints of $B$. Using (21) and the linearity of $h$ with respect to the first argument, we obtain

$$\begin{aligned}
h(B \times \partial f(\bar{x})) &= h(\operatorname{co} \{ b^1, b^2 \} \times \{ Jf(\bar{x}) \}) \\
&= h(\{ (\lambda b^1 + (1 - \lambda) b^2, Jf(\bar{x})) : 0 \le \lambda \le 1 \}) \\
&= \{ \lambda h(b^1, Jf(\bar{x})) + (1 - \lambda) h(b^2, Jf(\bar{x})) : 0 \le \lambda \le 1 \} \\
&= \operatorname{co} \{ h(b^1, Jf(\bar{x})), h(b^2, Jf(\bar{x})) \} \\
&= \operatorname{co} \{ b_1^1 \nabla f_1(\bar{x}) + b_2^1 \nabla f_2(\bar{x}), \ b_1^2 \nabla f_1(\bar{x}) + b_2^2 \nabla f_2(\bar{x}) \}.
\end{aligned}$$

Pareto Optimization

We now consider the case of classical Pareto optimization, i.e., when $C = \mathbb{R}^2_+$. We have $C^+ = C$. Let $\bar{y} = (1, 1) \in \operatorname{int} C$; then by Lemma 7 the set

$$B := \{ z \in C^+ : z_1 + z_2 = 1 \}$$

is a compact base for $C^+$, and $B$ is the closed line segment joining the two points $b^1 = (1, 0)$ and $b^2 = (0, 1)$. According to Theorem 12, we have

$$h(B \times \partial f(x)) = \operatorname{co} \{ \nabla f_1(x), \nabla f_2(x) \},$$

hence, the scalarization function has the form

$$s(x) = d(0, \operatorname{co} \{ \nabla f_1(x), \nabla f_2(x) \}).$$

For any point $x \in \mathbb{R}^n$, there are two possible cases:

(i) $\nabla f_1(x) = \nabla f_2(x)$. Then $s(x) = \| \nabla f_1(x) \| = \| \nabla f_2(x) \|$.

(ii) $\nabla f_1(x) \ne \nabla f_2(x)$. Then $s(x)$ is the distance from $0$ to the line segment $S$ joining $\nabla f_1(x)$ and $\nabla f_2(x)$.

We now consider case (ii). The line $L$ passing through $\nabla f_1(x)$ and $\nabla f_2(x)$ is parametrized as $L(t) = b + ta$, where $b := \nabla f_1(x)$ is a point on the line, and $a := \nabla f_2(x) - \nabla f_1(x)$ is the line direction. The closest point on the line $L$ to $0$ is the projection of $0$ onto $L$, which is equal to

$$q := b + t_0 a, \quad \text{where } t_0 = -\frac{\langle a, b \rangle}{\langle a, a \rangle} = -\frac{\langle a, b \rangle}{\| a \|^2}.$$

Using the same parametrization, we can represent the line segment $S$ as follows:

$$S = \{ b + ta : 0 \le t \le 1 \}.$$

Therefore, if $t_0 \le 0$, then the point in $S$ closest to $0$ is $b$. Similarly, if $t_0 \ge 1$, then the point in $S$ closest to $0$ is $b + a$. Finally, if $0 < t_0 < 1$, then the point in $S$ closest to $0$ is $q$. Hence, the function $s$ can be described as follows:

$$s(x) = \begin{cases} \| b \| & \text{if } t_0 \le 0, \\ \| b + t_0 a \| & \text{if } 0 < t_0 < 1, \\ \| b + a \| & \text{if } t_0 \ge 1. \end{cases} \tag{23}$$

Taking into account the definitions of $a$ and $b$ above, we see that this scalarization function depends on the values of the gradients of $f_1$ and $f_2$ only, so it is easily computable.
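Formula (23) translates directly into code. The sketch below takes the two gradients as plain sequences of floats; the function name `s_pareto` is an assumption of this example. In the two clamped cases it returns the norm of the endpoint itself, which equals $\|b\|$ or $\|b + a\|$ but avoids recomputing $b + a$ in floating point.

```python
import math


def _norm(v):
    """Euclidean norm of a sequence of floats."""
    return math.sqrt(sum(c * c for c in v))


def s_pareto(grad_f1, grad_f2):
    """Distance from 0 to the segment joining grad_f1 and grad_f2, as in (23)."""
    b = list(grad_f1)                                   # point on the line
    a = [g2 - g1 for g1, g2 in zip(grad_f1, grad_f2)]   # line direction
    aa = sum(c * c for c in a)
    if aa == 0.0:                                       # case (i): equal gradients
        return _norm(b)
    t0 = -sum(ai * bi for ai, bi in zip(a, b)) / aa
    if t0 <= 0.0:
        return _norm(b)                                 # closest point is b
    if t0 >= 1.0:
        return _norm(grad_f2)                           # closest point is b + a
    return _norm([bi + t0 * ai for ai, bi in zip(a, b)])  # closest point is q


print(s_pareto((1.0, 0.0), (0.0, 1.0)))   # interior case, q = (0.5, 0.5)
print(s_pareto((1.0, 1.0), (2.0, 2.0)))   # t0 <= 0, closest point is b
print(s_pareto((-1.0, 0.0), (1.0, 0.0)))  # opposite gradients, s = 0
```

The last call shows the situation of Theorem 9: when the two gradients point in opposite directions, the segment contains the origin and the point is vector critical.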

Example 13 (problem FON in [19], p. 187) Let $f = (f_1, f_2) : \mathbb{R}^3 \to \mathbb{R}^2$ be defined by

$$f_1(x) = 1 - \exp \left( -\sum_{i=1}^{3} \left( x_i - \frac{1}{\sqrt{3}} \right)^2 \right), \tag{24}$$

$$f_2(x) = 1 - \exp \left( -\sum_{i=1}^{3} \left( x_i + \frac{1}{\sqrt{3}} \right)^2 \right). \tag{25}$$

The authors of [19] consider problem (3), where $\Omega = [-4, 4]^3$, and state that the set of efficient (Pareto) solutions for this problem is equal to the set of points $x = (x_1, x_2, x_3)$ satisfying

$$x_1 = x_2 = x_3 \in [-1/\sqrt{3}, 1/\sqrt{3}]. \tag{26}$$

Here the set $\Omega$ is closed (contrary to the rest of our paper), but this constraint is in fact inessential and the problem can also be considered on the whole space $\mathbb{R}^3$. Computing the partial derivatives of $f_1$ and $f_2$, we obtain from (24)-(25)

$$\frac{\partial f_1}{\partial x_j}(x) = 2 \left( x_j - \frac{1}{\sqrt{3}} \right) \exp \left( -\sum_{i=1}^{3} \left( x_i - \frac{1}{\sqrt{3}} \right)^2 \right), \quad j = 1, 2, 3, \tag{27}$$

$$\frac{\partial f_2}{\partial x_j}(x) = 2 \left( x_j + \frac{1}{\sqrt{3}} \right) \exp \left( -\sum_{i=1}^{3} \left( x_i + \frac{1}{\sqrt{3}} \right)^2 \right), \quad j = 1, 2, 3. \tag{28}$$

We have designed a program in Maple to compute $s(x)$, using formulae (23) and (27)-(28). This program consists of three nested loops for the values of the variables $x_1, x_2, x_3$, each variable taking values from −4 to 4 in steps of 0.01. We have obtained $s(x) = 0$ for each $x$ satisfying (26), and $s(x) > 0$ for all other points $x$. However, there are some points $x$ for which the values $s(x)$ are very small; the smallest value obtained is

$$s(4, 4, 4) = s(-4, -4, -4) = \alpha := 0.79802094823 \times 10^{-26}. \tag{29}$$

There are no other points at which $s(x) < \alpha$, except the Pareto optimal solutions (26).

This example shows that one must be careful when using global optimization algorithms to minimize $s$ because points like the ones appearing in (29) can be easily misclassified as vector critical points.
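The computation in Example 13 can be sketched in a few lines of Python. This is not the authors' Maple program: the function names `fon_gradients` and `s_fon` are assumptions of this sketch, only a handful of points are evaluated instead of the full grid, and all values are computed in double precision.

```python
import math

C0 = 1.0 / math.sqrt(3.0)


def fon_gradients(x):
    """Gradients of f1 and f2 from (27)-(28) at a point x in R^3."""
    e1 = math.exp(-sum((xi - C0) ** 2 for xi in x))
    e2 = math.exp(-sum((xi + C0) ** 2 for xi in x))
    g1 = [2.0 * (xj - C0) * e1 for xj in x]
    g2 = [2.0 * (xj + C0) * e2 for xj in x]
    return g1, g2


def s_fon(x):
    """Scalarization (23) for FON: distance from 0 to the segment [grad f1, grad f2]."""
    b, g2 = fon_gradients(x)
    a = [v - u for u, v in zip(b, g2)]          # direction a = grad f2 - grad f1
    aa = sum(c * c for c in a)
    t0 = 0.0 if aa == 0.0 else -sum(ai * bi for ai, bi in zip(a, b)) / aa
    if t0 <= 0.0:
        closest = b                             # endpoint grad f1
    elif t0 >= 1.0:
        closest = g2                            # endpoint grad f2 = b + a
    else:
        closest = [bi + t0 * ai for ai, bi in zip(a, b)]
    return math.sqrt(sum(c * c for c in closest))


for point in [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5), (1.0, 0.0, 0.0),
              (4.0, 4.0, 4.0), (-4.0, -4.0, -4.0)]:
    print(point, s_fon(point))
```

The first two points satisfy (26) and give values at the level of rounding error, the third is clearly positive, and the two corner points give a value of roughly $8 \times 10^{-27}$, in line with (29). In floating-point arithmetic such corner values are hard to distinguish from the rounding-level values on the Pareto set, which illustrates the misclassification risk described above.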

4. Conclusion

We have presented a new scalarization method for solving multiobjective optimization problems which is based on computing the Euclidean distance from the origin to some subset determined by the generalized Jacobian of the mapping being optimized. This article contains the main underlying theory and only some preliminary numerical computations pertaining to this method. More numerical results will be presented in a subsequent paper.

Acknowledgements

The authors are grateful to an anonymous referee for his/her comments which have improved the quality of the paper.

Cite this paper

Rahmo, E. and Studniarski, M. (2017) A New Global Scalarization Method for Multiobjective Optimization with an Arbitrary Ordering Cone. Applied Mathematics, 8, 154-163. doi: 10.4236/am.2017.82013.

References

[1] Ehrgott, M. and Gandibleux, X. (Eds.) (2002) Multiple Criteria Optimization: State of the Art Annotated Bibliography Surveys. Kluwer, Boston.
[2] Wierzbicki, A.P. (1986) On the Completeness and Constructiveness of Parametric Characterizations to Vector Optimization Problems. Operations-Research-Spektrum, 8, 73-87.
https://doi.org/10.1007/BF01719738
[3] Bakhtin, V.I. and Gorokhovik, V.V. (2010) First and Second Order Optimality Conditions for Vector Optimization Problems on Metric Spaces. Proceedings of the Steklov Institute of Mathematics, 269, S28-S39.
https://doi.org/10.1134/s0081543810060040
[4] Ginchev, I., Guerraggio, A. and Rocca, M. (2005) Isolated Minimizers and Proper Efficiency for C0,1 Constrained Vector Optimization Problems. Journal of Mathematical Analysis and Applications, 309, 353-368.
https://doi.org/10.1016/j.jmaa.2005.01.041
[5] Pascoletti, A. and Serafini, P. (1984) Scalarizing Vector Optimization Problems. Journal of Optimization Theory and Applications, 42, 499-524.
https://doi.org/10.1007/BF00934564
[6] Eichfelder, G. (2008) Adaptive Scalarization Methods in Multiobjective Optimization. Springer, Berlin.
https://doi.org/10.1007/978-3-540-79159-1
[7] Eichfelder, G. (2014) Variable Ordering Structures in Vector Optimization. Springer, Berlin.
[8] Gutiérrez, C., Jiménez, B., Novo, V. and Ruiz-Garzon, G. (2016) Vector Critical Points and Efficiency in Vector Optimization with Lipschitz Functions. Optimization Letters, 10, 47-62.
https://doi.org/10.1007/s11590-015-0850-2
[9] Nakayama, H., Yun, Y. and Yoon, M. (2009) Sequential Approximate Multiobjective Optimization Using Computational Intelligence. Springer, Berlin.
[10] Clarke, F.H. (1983) Optimization and Nonsmooth Analysis. John Wiley & Sons, New York.
[11] Scholtes, S. (2012) Introduction to Piecewise Differentiable Equations. Springer, New York.
https://doi.org/10.1007/978-1-4614-4340-7
[12] Guerraggio, A. and Luc, T. (2001) Optimality Conditions for C1,1 Vector Optimization Problems. Journal of Optimization Theory and Applications, 109, 615-629.
https://doi.org/10.1023/A:1017519922669
[13] Gopfert, A., Riahi, H., Tammer, Ch. and Zalinescu, C. (2003) Variational Methods in Partially Ordered Spaces. Springer, New York.
[14] Rockafellar, R.T. (1970) Convex Analysis. Princeton University Press, Princeton.
[15] Lange, K. (2013) Optimization. 2nd Edition, Springer, New York.
https://doi.org/10.1007/978-1-4614-5838-8
[16] Boyd, S. and Vandenberghe, L. (2004) Convex Optimization. Cambridge University Press, Cambridge.
https://doi.org/10.1017/CBO9780511804441
[17] Arioli, M., Laratta, A. and Menchi, O. (1984) Numerical Computation of the Projection of a Point onto a Polyhedron. Journal of Optimization Theory and Applications, 43, 495-525.
https://doi.org/10.1007/BF00935003
[18] Mückeley, M. (1992) Computing the Vector in the Convex Hull of a Finite Set of Points Having Minimal Length. Optimization, 26, 15-26.
https://doi.org/10.1080/02331939208843839
[19] Deb, K., Pratap, A., Agarwal, S. and Meyarivan, T. (2002) A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6, 182-197.
https://doi.org/10.1109/4235.996017

  

Copyright © 2017 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.