A New Global Scalarization Method for Multiobjective Optimization with an Arbitrary Ordering Cone
1. Introduction
Scalarization is one of the most commonly used methods of solving multiobjective optimization problems. It consists in replacing the original multiobjective problem by a scalar optimization problem, or a family of scalar optimization problems, which is, in a certain sense, equivalent to the original problem. The existing scalarization methods can be divided into two groups:
1) Methods that use some representation of a given multiobjective problem as a parametrized family of scalar optimization problems. Such scalarization methods should have the following two properties (see [1], p. 77): (i) an optimal solution of each scalarized problem is efficient (in some sense) for the original multiobjective problem, (ii) every efficient solution of the multiobjective problem can be obtained as an optimal solution of an appropriate scalarized problem by adjusting the parameter value. Some examples of possible scalarizations of this kind are given, for instance, in [1] (pp. 77-78) and [2].
2) Methods that use local equivalence of a multiobjective optimization problem and some scalar optimization problem whose formulation depends on a given point. Such equivalence enables one to solve the multiobjective problem locally by using necessary and/or sufficient optimality conditions formulated for the scalar problem (for examples of such an approach, see [3], Thm. 1 and [4], Prop. 2.1 and 2.2).
There are also scalarization approaches which combine properties of both groups, such as the Pascoletti-Serafini scalarization [5] (for a survey of different scalarization methods, see [6], Chapter 2; for adaptive algorithms using different scalarizations, see [6], Chapter 4; for scalarizations in the context of variable ordering structures, see [7], Chapters 4 and 5).
In this paper, we propose a new scalarization method different from the above-mentioned ones. It consists in constructing, for a given multiobjective optimization problem, a single scalarization function, whose global minimum points are exactly vector critical points in the sense of [8] for the original problem. This equivalence holds globally and enables one to use global optimization algorithms designed for scalar-valued problems (for example, classical genetic algorithms with “roulette wheel” selection) to solve the original multiobjective problem. We also show that, if we consider an order defined by a polyhedral cone and the function being optimized is piecewise differentiable, then computing the values of a scalarization function reduces to solving a quadratic programming problem.
So far, the term “scalarization function” has been used for a scalar-valued function defined on the image space of an optimization problem, which transforms a vector-valued objective function into a scalar-valued one (see [9], Thm. 1.1). However, by using such a scalarization, we are able to find only some (usually a small part) of the Pareto solutions, or efficient points, of the original multiobjective optimization problem, while the other Pareto solutions are lost. In contrast to this approach, our scalarization function is defined on the space of feasible solutions of the original problem and attains its minimum value (zero) on the set of vector critical points for this problem. The set of vector critical points is larger than the set of efficient solutions and can serve as an approximation of the latter.
The purpose of this research is to describe the idea of our new scalarization method and to present some underlying theory for the case of an unconstrained multiobjective optimization problem. The extension to constrained optimization is also possible and will be the subject of further investigations.
2. A Global Scalarization Function for an Arbitrary Ordering Cone
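Let $\Omega$ be an open set in $\mathbb{R}^n$, and let $f = (f_1, \ldots, f_p) : \Omega \to \mathbb{R}^p$ be a locally Lipschitzian vector function. Suppose that C is a closed convex pointed cone in $\mathbb{R}^p$ with nonempty interior. We denote by $C^+$ the positive polar cone to C, i.e.,
$$C^+ := \{\lambda \in \mathbb{R}^p : \langle \lambda, y \rangle \ge 0 \ \text{for all } y \in C\}, \tag{1}$$
where $\langle \cdot, \cdot \rangle$ is the usual inner product in $\mathbb{R}^p$. The partial order relation in $\mathbb{R}^p$ is defined by
$$y \le_C z \iff z - y \in C \tag{2}$$
for all $y, z \in \mathbb{R}^p$. We consider the following multiobjective optimization problem:
$$\text{minimize } f(x) \ \text{ subject to } x \in \Omega. \tag{3}$$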
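Definition 1 [10] We define the (Clarke’s) generalized Jacobian of f at $x \in \Omega$ as follows:
$$\partial f(x) := \operatorname{co}\Big\{ \lim_{i \to \infty} Jf(x_i) : x_i \to x,\ f \text{ is Fréchet differentiable at } x_i \Big\}, \tag{4}$$
where $Jf(x)$ denotes the usual Jacobian matrix of f at x whenever f is Fréchet differentiable at x, and “co” denotes the convex hull of a set.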
We will denote by $\mathbb{R}^{p \times n}$ the vector space of all $p \times n$ real matrices. It follows from ([10], Prop. 2.6.2(a)) that $\partial f(x)$ is a nonempty convex compact subset of $\mathbb{R}^{p \times n}$. The calculation of Clarke’s generalized Jacobian in the general case can be quite difficult due to the lack of exact calculus rules. For piecewise differentiable functions, however, there is a representation of the generalized Jacobian as the convex hull of a finite number of Jacobian matrices, which was obtained by Scholtes in [11]. To formulate this result, we need some additional definitions.
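Definition 2 Let Ω be an open subset of $\mathbb{R}^n$ and let $f_i : \Omega \to \mathbb{R}^p$, $i = 1, \ldots, k$, be a collection of continuous functions.
(i) A function $f : \Omega \to \mathbb{R}^p$ is said to be a continuous selection of the functions $f_1, \ldots, f_k$ on the set $U \subset \Omega$ if f is continuous on U and $f(x) \in \{f_1(x), \ldots, f_k(x)\}$ for every $x \in U$.
(ii) A function $f : \Omega \to \mathbb{R}^p$ is called a PC1-function if for every $\bar{x} \in \Omega$ there exists an open neighborhood $U \subset \Omega$ of $\bar{x}$ and a finite number of C1-functions $f_i : U \to \mathbb{R}^p$, $i = 1, \ldots, k$, such that f is a continuous selection of $f_1, \ldots, f_k$ on U. In this case, we call $f_1, \ldots, f_k$ the selection functions for f at $\bar{x}$.
(iii) Let $f : \Omega \to \mathbb{R}^p$ be a PC1-function and let $\bar{x} \in U \subset \Omega$ (U open). Suppose that f is a continuous selection of $f_1, \ldots, f_k$ on U. We define the set of essentially active indices for f at $\bar{x}$ as follows:
$$I_f^e(\bar{x}) := \big\{ i \in \{1, \ldots, k\} : \bar{x} \in \operatorname{cl} \operatorname{int} \{x \in U : f(x) = f_i(x)\} \big\}. \tag{5}$$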
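Proposition 3 ([11], Prop. 4.3.1) If Ω is an open subset of $\mathbb{R}^n$ and $f : \Omega \to \mathbb{R}^p$ is a PC1-function with C1 selection functions $f_i : U \to \mathbb{R}^p$, $i = 1, \ldots, k$, at $\bar{x}$, where $U \subset \Omega$ is an open neighborhood of $\bar{x}$, then
$$\partial f(\bar{x}) = \operatorname{co}\{ Jf_i(\bar{x}) : i \in I_f^e(\bar{x}) \}. \tag{6}$$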
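Definition 4 [8] Let $\bar{x} \in \Omega$. We say that
(i) $\bar{x}$ is a vector critical point for problem (3) if there exist $\lambda \in C^+ \setminus \{0\}$ and $A \in \partial f(\bar{x})$ such that
$$A^{T} \lambda = 0_n, \tag{7}$$
where $0_n$ is the zero vector in $\mathbb{R}^n$;
(ii) $\bar{x}$ is an efficient solution for (3) if
$$(f(\Omega) - f(\bar{x})) \cap (-C \setminus \{0\}) = \emptyset; \tag{8}$$
(iii) $\bar{x}$ is a weakly efficient solution for (3) if
$$(f(\Omega) - f(\bar{x})) \cap (-\operatorname{int} C) = \emptyset; \tag{9}$$
(iv) $\bar{x}$ is a local weakly efficient solution for (3) if there exists a neighborhood U of $\bar{x}$ such that
$$(f(U \cap \Omega) - f(\bar{x})) \cap (-\operatorname{int} C) = \emptyset. \tag{10}$$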
It is obvious that the implications (ii) ⇒ (iii) ⇒ (iv) hold in Definition 4. The implication (iv) ⇒ (i) (for locally Lipschitzian f) follows from [12] (Thm. 5.1 (i)(b)). Some opposite implications can be obtained under additional assumptions of generalized convexity type. In particular, Gutiérrez et al. [8] have identified the class of pseudoinvex functions for which (i) ⇒ (iii) holds, and the class of strong pseudoinvex functions for which (i) ⇒ (ii) holds.
Definition 5 [13] Let C be a nontrivial convex cone in $\mathbb{R}^p$. A nonempty convex subset B of C is called a base for C if each nonzero element $z \in C$ has a unique representation of the form $z = tb$ with $t > 0$ and $b \in B$.
Remark 6 If B is a base of the nontrivial convex cone C, then $0 \notin B$.
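Lemma 7 (a finite-dimensional version of [13], Lemma 2.2.17) Let C be a nontrivial closed convex cone in $\mathbb{R}^p$ with $\operatorname{int} C \ne \emptyset$. If $c \in \operatorname{int} C$, then the set
$$B := \{\lambda \in C^+ : \langle c, \lambda \rangle = 1\} \tag{11}$$
is a compact base for $C^+$.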
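In the sequel, we consider a fixed vector $c \in \operatorname{int} C$ and a base B for $C^+$ defined by (11). In order to define a global scalarization function for problem (3), we first consider the following mapping $h : \mathbb{R}^p \times \mathbb{R}^{p \times n} \to \mathbb{R}^n$:
$$h(\lambda, A) := A^{T} \lambda. \tag{12}$$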
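Lemma 8 A point $\bar{x} \in \Omega$ is a vector critical point for problem (3) if and only if
$$0_n \in h(B \times \partial f(\bar{x})). \tag{13}$$
Proof. If $\bar{x}$ is a vector critical point for problem (3), then equality (7) holds for some $\lambda \in C^+ \setminus \{0\}$ and $A \in \partial f(\bar{x})$. Since B is a base for $C^+$, there exist $b \in B$ and $t > 0$ such that $\lambda = tb$. Then, by (7),
$$h(b, A) = A^{T} b = t^{-1} A^{T} \lambda = 0_n, \tag{14}$$
so that (13) holds. Conversely, if (14) is true for some $b \in B$ and $A \in \partial f(\bar{x})$, then by Definition 5 and Remark 6, we have $b \in C^+ \setminus \{0\}$. Taking $\lambda := b$ in Definition 4, we see that $\bar{x}$ is a vector critical point for (3).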
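For a nonempty subset S of $\mathbb{R}^n$, let $d_S : \mathbb{R}^n \to \mathbb{R}$ be the distance function of S, defined as follows:
$$d_S(y) := \inf\{\|y - z\| : z \in S\}, \tag{15}$$
where $\|\cdot\|$ denotes the Euclidean norm. We now introduce the following scalarization function $s : \Omega \to \mathbb{R}$:
$$s(x) := d_{h(B \times \partial f(x))}(0_n). \tag{16}$$
Note that $s$ depends on the choice of $c$. The name “scalarization function” is justified by the following.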
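Theorem 9 A point $\bar{x} \in \Omega$ is a vector critical point for problem (3) if and only if $s(\bar{x}) = 0$.
Proof. If $\bar{x}$ is a vector critical point for (3), then by Lemma 8, condition (13) holds, which gives $s(\bar{x}) = 0$. Conversely, suppose that $s(\bar{x}) = 0$. Since h is continuous and the sets B and $\partial f(\bar{x})$ are compact in $\mathbb{R}^p$ and $\mathbb{R}^{p \times n}$, respectively, the set $h(B \times \partial f(\bar{x}))$ is also compact; hence it is closed. Therefore, the equality $s(\bar{x}) = 0$ implies condition (13).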
Having defined the scalarization function s, we can now replace problem (3) by the following scalar optimization problem:
$$\text{minimize } s(x) \ \text{ subject to } x \in \Omega. \tag{17}$$
Obviously, problems (3) and (17) are not equivalent because there may exist vector critical points which are not (weakly) efficient solutions for (3). Nevertheless, by solving problem (17) we can obtain some approximation of the set of solutions to (3).
Computing the distance function in (16) is not easy in the general case, but under additional assumptions on both C and f, it is possible to apply some existing algorithms to perform this task. The details are described below.
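Definition 10 ([14], p. 170) A convex set D in $\mathbb{R}^n$ is called polyhedral if it can be expressed as the intersection of some finite collection of closed half-spaces, that is, there exist vectors $b_1, \ldots, b_m \in \mathbb{R}^n$ and numbers $\beta_1, \ldots, \beta_m \in \mathbb{R}$ such that
$$D = \{x \in \mathbb{R}^n : \langle x, b_i \rangle \le \beta_i, \ i = 1, \ldots, m\}. \tag{18}$$
A convex cone which is a polyhedral set is called a polyhedral cone.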
Theorem 11 Suppose that the ordering cone C in $\mathbb{R}^p$ is polyhedral and the function $f : \Omega \to \mathbb{R}^p$ is PC1. Let $c \in \operatorname{int} C$, let B be a base for $C^+$ defined by (11) and let h be the function defined by (12). Then, for each $x \in \Omega$, the set $h(B \times \partial f(x))$ is polyhedral, or equivalently, it can be represented as the convex hull of a finite number of points in $\mathbb{R}^n$.
Proof. It follows from ([14], Thm. 19.1) that a convex set D in $\mathbb{R}^n$ is polyhedral if and only if it is finitely generated, which means that there exist vectors $a_1, \ldots, a_m$ such that, for a fixed integer k, $0 \le k \le m$, D consists of all the vectors of the form
$$x = \lambda_1 a_1 + \cdots + \lambda_m a_m, \tag{19}$$
where
$$\lambda_1 + \cdots + \lambda_k = 1, \quad \lambda_i \ge 0 \ \text{for } i = 1, \ldots, m. \tag{20}$$
In particular, if D is bounded, then no $\lambda_i$ can be arbitrarily large, which implies that $k = m$, and conditions (19) - (20) reduce to
$$D = \operatorname{co}\{a_1, \ldots, a_m\}.$$
By assumption, C is polyhedral, hence, by [14] (Corollary 19.2.2), $C^+$ is also a polyhedral cone, which implies that its base B is a polyhedral set. By Proposition 3, $\partial f(x)$ is the convex hull of a finite collection of $p \times n$ matrices, so it is a polyhedral set in $\mathbb{R}^{p \times n}$. It is easy to prove that the Cartesian product of two polyhedral sets is a polyhedral set and that the image of a polyhedral set under a linear transformation is a polyhedral set (see [15], Proposition A.3.4). Therefore, $h(B \times \partial f(x))$ is a polyhedral set in $\mathbb{R}^n$.
Theorem 11 reduces the problem of computing the values $s(x)$ given by (16) to the problem of computing the Euclidean projection of $0_n$ onto the polyhedron $h(B \times \partial f(x))$. This is a particular case of a quadratic programming problem (see [16], p. 398). There are also specialized algorithms designed for computing such projections (see [17] [18]).
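To make the projection step concrete, here is a minimal Python sketch that approximates the distance from the origin to the convex hull of finitely many points. It assumes the polyhedron is already available as a finite list of generating points, and it uses a Frank-Wolfe iteration with exact line search (often called Gilbert's algorithm) rather than a general quadratic programming solver or the specialized methods of [17] [18]. The function names `min_norm_point` and `distance_to_hull` are illustrative only.

```python
import math


def min_norm_point(points, tol=1e-12, max_iter=10_000):
    """Approximate the point of co(points) closest to the origin.

    `points` is a non-empty list of equal-length coordinate sequences.
    Frank-Wolfe with exact line search; a sketch, not a production solver.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    x = list(points[0])  # start at an arbitrary generator
    for _ in range(max_iter):
        # Generator minimizing the linearization <x, p> of 0.5*||.||^2 at x.
        v = min(points, key=lambda p: dot(x, p))
        d = [vi - xi for vi, xi in zip(v, x)]
        gap = -dot(x, d)  # duality gap <x, x - v>, nonnegative
        dd = dot(d, d)
        if gap <= tol or dd == 0.0:
            break
        # Exact minimizer of ||x + step*d||^2 over step in [0, 1].
        step = min(1.0, gap / dd)
        x = [xi + step * di for xi, di in zip(x, d)]
    return x


def distance_to_hull(points):
    """Euclidean distance from the origin to co(points)."""
    x = min_norm_point(points)
    return math.sqrt(sum(c * c for c in x))


if __name__ == "__main__":
    print(distance_to_hull([(1.0, 0.0), (0.0, 1.0)]))
    print(distance_to_hull([(1.0, 0.0), (-1.0, 0.0)]))
```

The iteration only needs inner products with the generators, which keeps the sketch short. Its convergence can be slow when the closest point lies on a face of the hull, so a dedicated QP or projection routine is likely preferable in practice.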
3. The Case of Two Objectives
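For two objectives, under differentiability assumptions, it is possible to find some representation of the scalarization function s in terms of the gradients $\nabla f_1$ and $\nabla f_2$. Let p = 2 and suppose that the mapping $f = (f_1, f_2) : \Omega \to \mathbb{R}^2$ is continuously differentiable on $\Omega$. Denote by $\nabla f_i(x)$ the gradient of $f_i$ at x (i = 1, 2). Then (4) implies
$$\partial f(x) = \{Jf(x)\} = \left\{ \begin{pmatrix} \nabla f_1(x)^{T} \\ \nabla f_2(x)^{T} \end{pmatrix} \right\}. \tag{21}$$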
The following theorem will help to compute the scalarization function (16) for bi-objective problems.
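Theorem 12 Let p = 2, $c \in \operatorname{int} C$, and let B be the compact base for $C^+$ defined by (11). Then there exist vectors $b^1 = (b^1_1, b^1_2)$, $b^2 = (b^2_1, b^2_2) \in \mathbb{R}^2$, such that
$$h(B \times \partial f(x)) = \operatorname{co}\{ b^1_1 \nabla f_1(x) + b^1_2 \nabla f_2(x), \ b^2_1 \nabla f_1(x) + b^2_2 \nabla f_2(x) \}. \tag{22}$$
Proof. It follows from (11) that B is a subset of some line in $\mathbb{R}^2$. Moreover, by Lemma 7, B is compact and convex, so it must be a closed line segment. Denote by $b^1$ and $b^2$ the endpoints of B. Using (21) and the linearity of h with respect to the first argument, we obtain
$$h(B \times \partial f(x)) = \{ Jf(x)^{T} \lambda : \lambda \in \operatorname{co}\{b^1, b^2\} \} = \operatorname{co}\{ Jf(x)^{T} b^1, \ Jf(x)^{T} b^2 \},$$
which is (22).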
Pareto Optimization
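We now consider the case of classical Pareto optimization, i.e., when $C = \mathbb{R}^2_+$. We have $C^+ = \mathbb{R}^2_+$. Let $c = (1, 1)$, then by Lemma 7 the set $B = \{\lambda \in \mathbb{R}^2_+ : \lambda_1 + \lambda_2 = 1\}$ is a compact base for $\mathbb{R}^2_+$, and B is the closed line segment joining the two points $(1, 0)$ and $(0, 1)$. According to Theorem 12, we have
$$h(B \times \partial f(x)) = \operatorname{co}\{\nabla f_1(x), \nabla f_2(x)\},$$
hence, the scalarization function has the form
$$s(x) = d_{\operatorname{co}\{\nabla f_1(x), \nabla f_2(x)\}}(0_n).$$
For any point $x \in \Omega$, there are two possible cases:
(i) $\nabla f_1(x) = \nabla f_2(x)$. Then $s(x) = \|\nabla f_1(x)\|$.
(ii) $\nabla f_1(x) \ne \nabla f_2(x)$. Then $s(x)$ is the distance from 0 to the line segment S joining $\nabla f_1(x)$ and $\nabla f_2(x)$.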
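We now consider case (ii). The line L passing through $\nabla f_1(x)$ and $\nabla f_2(x)$ is parametrized as
$$L = \{b + tm : t \in \mathbb{R}\},$$
where $b := \nabla f_1(x)$ is a point on the line, and $m := \nabla f_2(x) - \nabla f_1(x)$ is the line direction. The closest point on the line L to 0 is the projection of 0 onto L which is equal to
$$q := b + t_0 m, \quad \text{where } t_0 := -\frac{\langle b, m \rangle}{\langle m, m \rangle}.$$
Using the same parametrization, we can represent the line segment S as follows:
$$S = \{b + tm : t \in [0, 1]\}.$$
Therefore, if $t_0 \le 0$, then the point in S closest to 0 is b. Similarly, if $t_0 \ge 1$, then the point in S closest to 0 is $b + m$. Finally, if $0 < t_0 < 1$, then the point in S closest to 0 is q. Hence, in case (ii), the function s can be described as follows:
$$s(x) = \begin{cases} \|b\| & \text{if } t_0 \le 0, \\ \|b + m\| & \text{if } t_0 \ge 1, \\ \|q\| & \text{if } 0 < t_0 < 1. \end{cases} \tag{23}$$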
Taking into account the definitions of $b$ and $m$ above, we see that this scalarization function depends on the values of gradients of $f_1$ and $f_2$ only, so it is easily computable.
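The case analysis for two objectives under the Pareto cone can be sketched in a few lines of Python. The function below takes the two gradients as plain sequences, sets b to the first gradient and m to their difference, and clamps the projection parameter to [0, 1], which covers the three branches of the segment distance in one expression. The name `s_two_gradients` is hypothetical and the code is only an illustration of the formula, not the authors' implementation.

```python
import math


def s_two_gradients(g1, g2):
    """Distance from the origin to the segment joining gradients g1 and g2."""
    def dot(u, v):
        return sum(a * c for a, c in zip(u, v))

    b = list(g1)                          # point on the line
    m = [v - u for u, v in zip(g1, g2)]   # line direction g2 - g1
    mm = dot(m, m)
    if mm == 0.0:
        # Case (i): both gradients coincide, the segment is a single point.
        return math.sqrt(dot(b, b))
    t0 = -dot(b, m) / mm                  # parameter of the projection of 0 onto the line
    t = min(1.0, max(0.0, t0))            # clamp to the segment: b, b + m, or q
    closest = [bi + t * mi for bi, mi in zip(b, m)]
    return math.sqrt(dot(closest, closest))


if __name__ == "__main__":
    print(s_two_gradients((1.0, 0.0), (0.0, 1.0)))   # interior projection
    print(s_two_gradients((1.0, 0.0), (-1.0, 0.0)))  # origin lies on the segment
    print(s_two_gradients((1.0, 1.0), (2.0, 2.0)))   # closest point is an endpoint
```

A value of zero is returned exactly when the origin lies on the segment, that is, when some convex combination of the two gradients vanishes.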
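Example 13 (problem FON in [19], p. 187) Let $f = (f_1, f_2) : \mathbb{R}^3 \to \mathbb{R}^2$ be defined by
$$f_1(x) := 1 - \exp\Big( -\sum_{i=1}^{3} \Big( x_i - \frac{1}{\sqrt{3}} \Big)^2 \Big), \tag{24}$$
$$f_2(x) := 1 - \exp\Big( -\sum_{i=1}^{3} \Big( x_i + \frac{1}{\sqrt{3}} \Big)^2 \Big). \tag{25}$$
The authors of [19] consider problem (3), where $\Omega = [-4, 4]^3$, and state that the set of efficient (Pareto) solutions for this problem is equal to the set of points $x = (x_1, x_2, x_3)$ satisfying
$$x_1 = x_2 = x_3 \in \Big[ -\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}} \Big]. \tag{26}$$
Here the set $\Omega$ is closed (contrary to the rest of our paper), but this constraint is in fact inessential and the problem can also be considered on the whole space $\mathbb{R}^3$. Computing the partial derivatives of $f_1$ and $f_2$, we obtain from (24) - (25), for $j = 1, 2, 3$,
$$\frac{\partial f_1}{\partial x_j}(x) = 2 \Big( x_j - \frac{1}{\sqrt{3}} \Big) \exp\Big( -\sum_{i=1}^{3} \Big( x_i - \frac{1}{\sqrt{3}} \Big)^2 \Big), \tag{27}$$
$$\frac{\partial f_2}{\partial x_j}(x) = 2 \Big( x_j + \frac{1}{\sqrt{3}} \Big) \exp\Big( -\sum_{i=1}^{3} \Big( x_i + \frac{1}{\sqrt{3}} \Big)^2 \Big). \tag{28}$$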
We have designed a program in Maple to compute $s(x)$, using formulae (23) and (27) - (28). This program consists of three nested loops for the values of the variables $x_1, x_2, x_3$, each variable taking values from −4 to 4 in steps of 0.01. We have obtained $s(x) = 0$ for each x satisfying (26), and $s(x) > 0$ for all other points x. However, there are some points x for which the values $s(x)$ are very small; the smallest value obtained is
(29)
There are no other points at which $s(x) = 0$, except the Pareto optimal solutions (26).
This example shows that one must be careful when using global optimization algorithms to minimize s because points like the ones appearing in (29) can be easily misclassified as vector critical points.
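The following Python sketch (not the authors' Maple program) evaluates the two-objective scalarization for the FON problem at a few sample points chosen here purely for illustration. It assumes the standard FON objectives with shift 1/√3 and the Pareto cone with the segment-distance formula for s; the names `fon_gradients` and `s_fon` are hypothetical.

```python
import math

R = 1.0 / math.sqrt(3.0)  # shift used in both FON objectives


def fon_gradients(x):
    """Gradients of f1 and f2 for the FON problem at a point x in R^3."""
    e1 = math.exp(-sum((xi - R) ** 2 for xi in x))
    e2 = math.exp(-sum((xi + R) ** 2 for xi in x))
    g1 = [2.0 * (xi - R) * e1 for xi in x]
    g2 = [2.0 * (xi + R) * e2 for xi in x]
    return g1, g2


def s_fon(x):
    """Distance from the origin to the segment joining the two gradients."""
    g1, g2 = fon_gradients(x)
    m = [v - u for u, v in zip(g1, g2)]
    mm = sum(c * c for c in m)
    if mm == 0.0:
        t = 0.0
    else:
        t0 = -sum(u * c for u, c in zip(g1, m)) / mm
        t = min(1.0, max(0.0, t0))  # clamp the projection to the segment
    q = [u + t * c for u, c in zip(g1, m)]
    return math.sqrt(sum(c * c for c in q))


if __name__ == "__main__":
    for point in [(0.0, 0.0, 0.0), (0.3, 0.3, 0.3), (1.0, 0.0, 0.0), (4.0, 4.0, 4.0)]:
        print(point, s_fon(point))
```

At points on the diagonal inside the Pareto set the two gradients point in opposite directions, so the computed value is zero up to rounding. At a corner such as (4, 4, 4) both gradients are damped by the exponential factors, so the value is positive but far below 1e-20, which illustrates why a fixed numerical tolerance can misclassify such points.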
4. Conclusion
We have presented a new scalarization method for solving multiobjective optimization problems which is based on computing the Euclidean distance from the origin to some subset determined by the generalized Jacobian of the mapping being optimized. This article contains the main underlying theory and only some preliminary numerical computations pertaining to this method. Further numerical results will be presented in future work.
Acknowledgements
The authors are grateful to an anonymous referee for his/her comments which have improved the quality of the paper.