A Mean-Field Stochastic Maximum Principle for Optimal Control of Forward-Backward Stochastic Differential Equations with Jumps via Malliavin Calculus
1. Introduction
In contrast to the stochastic control problems (e.g. [1] [2]) studied in the complete information case (and, in [1], with Brownian motion only), the performance functional that we investigate involves the mean of functionals of the state variables (hence the name mean-field). Problems of this type occur in many applications; for example, in a continuous-time Markowitz mean-variance portfolio selection model, the variance term involves a quadratic function of the expectation. The inclusion of this mean term introduces major technical difficulties, among them a time inconsistency which leads to the failure of the dynamic programming approach. Recently, there has been increasing interest in the study of this type of stochastic control problem; see for example [3] [4] and [5].
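To see where the quadratic function of the expectation arises (a minimal illustration, not the general functional (1.6) below), recall that

$$\operatorname{Var}\big(X(T)\big) = E\big[X(T)^2\big] - \big(E[X(T)]\big)^2.$$

The second term is a nonlinear (quadratic) function of $E[X(T)]$; it does not obey the tower property of conditional expectations, so Bellman's principle, and with it the dynamic programming approach, breaks down.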
On the other hand, since we allow the coefficients of the system (1.1)-(1.2) below to be stochastic processes, and also because our control must be adapted to a partial information filtration, this problem is not of Markovian type and hence cannot be solved by dynamic programming even if the mean term were not present. We instead investigate the maximum principle and derive an explicit form for the adjoint processes. The approach we employ is Malliavin calculus, which enables us to express the duality involved via the Malliavin derivative. Our paper is related to the recent papers [6] and [7]. In [6], the authors consider a mean-field type stochastic control problem where the dynamics is governed by a controlled forward SDE with jumps and the information available to the controller is possibly less than the overall information. Malliavin calculus is employed to derive a maximum principle for the optimal control of such a system, with the adjoint process expressed explicitly. [7] presents various versions of the maximum principle for optimal control (not of mean-field type) of forward-backward stochastic differential equations with jumps, together with a Malliavin calculus approach which allows one to handle non-Markovian systems. The motivation of [7] is risk minimization via g-expectation.
This paper can be considered as a continuation of [6] and [7]. We consider a mean-field type stochastic control problem where the dynamics is governed by a forward-backward stochastic differential equation (FBSDE) driven by Lévy processes and the information available to the controller is possibly less than the overall information. All the system coefficients and the objective performance functional are allowed to be random, possibly non-Markovian. Malliavin calculus will be employed to derive a maximum principle for the optimal control of such a system in which the adjoint processes are explicitly expressed.
As in the paper [6], we emphasize that our problem should be distinguished from the partial observation control problem, where it is assumed that the controls are based on noisy observations of the state process. For the latter type of problem, there is a rich literature (see, e.g. [1] [8] [9] [10] [11] [12]). Note that the methods and results in the partial observation case do not apply to our situation. On the other hand, there are several existing works on the stochastic maximum principle (either completely or partially observed) where the adjoint processes are explicitly expressed (see, e.g. [8] [10] [12] [13]). However, these works all essentially employ the stochastic flow technique, over which Malliavin calculus has an advantage in terms of numerical computation (see, e.g. [14]).
Now let us state our problem. Suppose the state process $X(t) = X^{(u)}(t, \omega)$; $t \in [0, T]$, $\omega \in \Omega$, of our system is described by the following coupled forward-backward system of SDEs.

Forward system in the controlled process $X(t)$:

$$\begin{cases} dX(t) = b\big(t, X(t), u(t), \omega\big)\,dt + \sigma\big(t, X(t), u(t), \omega\big)\,dB(t) + \displaystyle\int_{\mathbb{R}_0} \theta\big(t, X(t), u(t), \zeta, \omega\big)\,\tilde N(dt, d\zeta), \\ X(0) = x \in \mathbb{R}. \end{cases} \tag{1.1}$$

Backward system in the unknown processes $Y(t)$, $Z(t)$, $K(t, \zeta)$:

$$\begin{cases} dY(t) = -g\big(t, X(t), Y(t), Z(t), K(t, \cdot), u(t), \omega\big)\,dt + Z(t)\,dB(t) + \displaystyle\int_{\mathbb{R}_0} K(t, \zeta)\,\tilde N(dt, d\zeta), \\ Y(T) = h\big(X(T)\big). \end{cases} \tag{1.2}$$
Here $B(t)$ and $\eta(t)$, given by

$$\eta(t) = \int_0^t \int_{\mathbb{R}_0} \zeta\, \tilde N(ds, d\zeta), \quad t \in [0, T], \tag{1.3}$$

are a 1-dimensional Brownian motion (see [15] Theorem 13.5) and an independent pure jump Lévy martingale, respectively, on a given filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \in [0, T]}, P)$; here $\mathbb{R}_0 := \mathbb{R} \setminus \{0\}$. Thus

$$\tilde N(dt, d\zeta) := N(dt, d\zeta) - \nu(d\zeta)\,dt \tag{1.4}$$

is the compensated jump measure of $\eta(\cdot)$, where $N(dt, d\zeta)$ is the jump measure and $\nu(d\zeta)$ is the Lévy measure of the Lévy process $\eta$. The process $u(t)$ is our control process, assumed to be $\mathcal{F}_t$-adapted and to have values in a given open convex set $U \subseteq \mathbb{R}$. The coefficients $b = b(t, x, u, \omega)$, $\sigma = \sigma(t, x, u, \omega)$, $\theta = \theta(t, x, u, \zeta, \omega)$ and $g = g(t, x, y, z, k, u, \omega)$ are given $\mathcal{F}_t$-predictable processes (for each fixed value of the state and control arguments).
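As a concrete special case of (1.3)-(1.4) (chosen only for illustration): if $\nu(d\zeta) = \lambda_0\, \delta_1(d\zeta)$ for some intensity $\lambda_0 > 0$, then all jumps have size 1, $N(t) := N\big((0, t] \times \mathbb{R}_0\big)$ is a Poisson process with intensity $\lambda_0$, and

$$\eta(t) = \int_0^t \int_{\mathbb{R}_0} \zeta\, \tilde N(ds, d\zeta) = N(t) - \lambda_0 t,$$

the compensated Poisson process; the condition (1.5) below holds since $\int_{\mathbb{R}_0} \zeta^2\, \nu(d\zeta) = \lambda_0 < \infty$.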
Let $T > 0$ be a given constant. For simplicity, we assume that

$$\int_{\mathbb{R}_0} \zeta^2\, \nu(d\zeta) < \infty. \tag{1.5}$$

Suppose in addition that we are given a subfiltration

$$\mathcal{E}_t \subseteq \mathcal{F}_t, \quad t \in [0, T],$$

representing the information available to the controller at time $t$ and satisfying the usual conditions. For example, we could have

$$\mathcal{E}_t = \mathcal{F}_{(t - \delta)^+}, \quad t \in [0, T], \text{ for some fixed } \delta > 0,$$

meaning that the controller gets a delayed information compared to $\mathcal{F}_t$.
Let $\mathcal{A}_{\mathcal{E}}$ denote a given family of controls, contained in the set of $\mathcal{E}_t$-predictable controls $u(\cdot)$ such that the system (1.1)-(1.2) has a unique strong solution. If $u \in \mathcal{A}_{\mathcal{E}}$, then we call $u$ an admissible control, and we require that $u(t) \in U$ for all $t \in [0, T]$ a.s., for all $u \in \mathcal{A}_{\mathcal{E}}$, where $U$ is the convex set introduced above.
Suppose we are given a performance functional of the form

$$J(u) = E\bigg[\int_0^T f\big(t, X(t), E[X(t)], u(t), \omega\big)\, dt + \varphi\big(X(T), E[X(T)]\big) + \psi\big(Y(0)\big)\bigg], \quad u \in \mathcal{A}_{\mathcal{E}}, \tag{1.6}$$

where $E$ denotes expectation with respect to $P$; $f = f(t, x, m, u, \omega)$, $\varphi = \varphi(x, m)$ and $\psi = \psi(y)$ are given functions such that

$$E\bigg[\int_0^T \big|f\big(t, X(t), E[X(t)], u(t)\big)\big|\, dt + \big|\varphi\big(X(T), E[X(T)]\big)\big|\bigg] < \infty$$

for all $u \in \mathcal{A}_{\mathcal{E}}$; the map $(t, \omega) \mapsto f(t, x, m, u, \omega)$ is an $\mathcal{F}_t$-predictable process for each fixed $x, m, u$; and $\psi$ is a given function with

$$E\big[\big|\psi\big(Y(0)\big)\big|\big] < \infty. \tag{1.7}$$
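For instance (a special case of (1.6), included to connect with the motivation in the Introduction): taking $f \equiv 0$, $\psi \equiv 0$ and $\varphi(x, m) = x - \gamma (x - m)^2$ with $\gamma > 0$ gives

$$J(u) = E[X(T)] - \gamma \operatorname{Var}\big(X(T)\big),$$

the continuous-time mean-variance criterion, in which the mean $E[X(T)]$ of the state enters the terminal term.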
The control problem we consider is the following:
Problem 1.1 (Partial information optimal control). Find $\Phi \in \mathbb{R}$ and $u^* \in \mathcal{A}_{\mathcal{E}}$ (if it exists) such that

$$\Phi := \sup_{u \in \mathcal{A}_{\mathcal{E}}} J(u) = J(u^*). \tag{1.8}$$
2. A Brief Review of Malliavin Calculus for Lévy Processes
In this section, we recall the basic definitions and properties of Malliavin calculus for the Brownian motion $B(\cdot)$ and the compensated jump measure $\tilde N(\cdot, \cdot)$ needed in this paper, for the reader's convenience.
Let $L^2(P)$ be the space of all $\mathbb{R}$-valued, $\mathcal{F}_T$-measurable, and square-integrable random variables. Let $L^2(\lambda^n)$ be the space of deterministic real functions $f$ on $[0, T]^n$ such that

$$\|f\|^2_{L^2(\lambda^n)} := \int_{[0, T]^n} f^2(t_1, \dots, t_n)\, dt_1 \cdots dt_n < \infty, \tag{2.1}$$

where $\lambda$ denotes the Lebesgue measure on $[0, T]$.

Let $L^2\big((\lambda \times \nu)^n\big)$ be the space of deterministic real functions $f$ on $([0, T] \times \mathbb{R}_0)^n$ such that

$$\|f\|^2_{L^2((\lambda \times \nu)^n)} := \int_{([0, T] \times \mathbb{R}_0)^n} f^2(t_1, \zeta_1, \dots, t_n, \zeta_n)\, dt_1\, \nu(d\zeta_1) \cdots dt_n\, \nu(d\zeta_n) < \infty. \tag{2.2}$$

The subspaces $\tilde L^2(\lambda^n)$ and $\tilde L^2\big((\lambda \times \nu)^n\big)$ of symmetric functions can be similarly denoted.
General references for this presentation are [16] [17] and [18]. See also the book [19].
2.1. Malliavin Calculus for $B(\cdot)$
A natural starting point is the Wiener-Itô chaos expansion theorem (see [18] Theorem 1.1.2), which states that any $F \in L^2(P)$ can be written as

$$F = \sum_{n=0}^{\infty} I_n(f_n) \tag{2.3}$$

for a unique sequence of symmetric deterministic functions $f_n \in \tilde L^2(\lambda^n)$, where $\lambda$ is Lebesgue measure on $[0, T]$ and

$$I_n(f_n) := n! \int_0^T \int_0^{t_n} \cdots \int_0^{t_2} f_n(t_1, \dots, t_n)\, dB(t_1)\, dB(t_2) \cdots dB(t_n) \tag{2.4}$$

(the $n$-times iterated integral of $f_n$ with respect to $B(\cdot)$) for $n = 1, 2, \dots$, and $I_0(f_0) := f_0$ when $f_0$ is a constant.

Moreover, we have the isometry

$$E\big[F^2\big] = \|F\|^2_{L^2(P)} = \sum_{n=0}^{\infty} n!\, \|f_n\|^2_{L^2(\lambda^n)}. \tag{2.5}$$
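As a simple illustration of the expansion (2.3)-(2.5) (a standard example, stated here for concreteness): for $F = B^2(T)$, the Itô formula gives $B^2(T) = T + 2\int_0^T B(t)\, dB(t)$, so

$$B^2(T) = T + I_2(f_2) \quad \text{with } f_0 = T, \ f_2(t_1, t_2) \equiv 1, \ f_n = 0 \text{ otherwise},$$

and the isometry (2.5) is confirmed by $E\big[B^4(T)\big] = 3T^2 = T^2 + 2!\, \|f_2\|^2_{L^2(\lambda^2)}$.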
Definition 2.1 (Malliavin derivative $D_t$). Let $\mathbb{D}_{1,2}^{(B)}$ be the space of all $F \in L^2(P)$ such that its chaos expansion (2.3) satisfies

$$\|F\|^2_{\mathbb{D}_{1,2}^{(B)}} := \sum_{n=1}^{\infty} n\, n!\, \|f_n\|^2_{L^2(\lambda^n)} < \infty. \tag{2.6}$$

For $F \in \mathbb{D}_{1,2}^{(B)}$ and $t \in [0, T]$, we define the Malliavin derivative of $F$ at $t$ (with respect to $B(\cdot)$), $D_t F$, by

$$D_t F := \sum_{n=1}^{\infty} n\, I_{n-1}\big(f_n(\cdot, t)\big), \tag{2.7}$$

where the notation $I_{n-1}(f_n(\cdot, t))$ means that we apply the $(n-1)$-times iterated integral to the first $n - 1$ variables $t_1, \dots, t_{n-1}$ of $f_n(t_1, \dots, t_n)$ and keep the last variable $t_n = t$ as a parameter.

One can easily check that

$$E\bigg[\int_0^T (D_t F)^2\, dt\bigg] = \sum_{n=1}^{\infty} n\, n!\, \|f_n\|^2_{L^2(\lambda^n)} = \|F\|^2_{\mathbb{D}_{1,2}^{(B)}}, \tag{2.8}$$

so $(t, \omega) \mapsto D_t F(\omega)$ belongs to $L^2(\lambda \times P)$.
Some other basic properties of the Malliavin derivative $D_t$ are the following:

1) Chain rule ([18], page 29). Suppose $F_1, \dots, F_m \in \mathbb{D}_{1,2}^{(B)}$ and that $\phi : \mathbb{R}^m \to \mathbb{R}$ is $C^1$ with bounded partial derivatives. Then $\phi(F_1, \dots, F_m) \in \mathbb{D}_{1,2}^{(B)}$ and

$$D_t\, \phi(F_1, \dots, F_m) = \sum_{i=1}^{m} \frac{\partial \phi}{\partial x_i}(F_1, \dots, F_m)\, D_t F_i. \tag{2.9}$$

2) Integration by parts/duality formula ([18], page 35). Suppose $u(t)$ is an $\mathcal{F}_t$-adapted process with $E\big[\int_0^T u^2(t)\, dt\big] < \infty$, and let $F \in \mathbb{D}_{1,2}^{(B)}$. Then

$$E\bigg[F \int_0^T u(t)\, dB(t)\bigg] = E\bigg[\int_0^T u(t)\, D_t F\, dt\bigg]. \tag{2.10}$$
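As a quick sanity check of (2.9) and (2.10) (a standard computation; the quadratic function below does not have bounded derivative, so strictly speaking one applies (2.9) after a truncation argument): since $B(T) = I_1(1)$, we have $D_t B(T) = 1$ for $t \le T$, and the chain rule gives

$$D_t\big(B^2(T)\big) = 2 B(T)\, D_t B(T) = 2 B(T).$$

Taking $F = B^2(T)$ and $u(t) = B(t)$ in (2.10), the left-hand side is $E\big[B^2(T) \int_0^T B(t)\, dB(t)\big] = \frac{1}{2} E\big[B^2(T)\big(B^2(T) - T\big)\big] = T^2$, while the right-hand side is $E\big[\int_0^T B(t) \cdot 2 B(T)\, dt\big] = 2 \int_0^T t\, dt = T^2$, as expected.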
2.2. Malliavin Calculus for $\tilde N(\cdot, \cdot)$
The construction of a stochastic derivative/Malliavin derivative in the pure jump martingale case follows the same lines as in the Brownian motion case. In this case, the corresponding Wiener-Itô chaos expansion theorem states that any $F \in L^2(P)$ (where, in this case, $\mathcal{F}_t$ is the $\sigma$-algebra generated by $\eta(s) = \int_0^s \int_{\mathbb{R}_0} \zeta\, \tilde N(dr, d\zeta)$; $0 \le s \le t$) can be written as

$$F = \sum_{n=0}^{\infty} I_n(f_n), \quad f_n \in \tilde L^2\big((\lambda \times \nu)^n\big), \tag{2.11}$$

where $\tilde L^2\big((\lambda \times \nu)^n\big)$ is the space of functions $f_n(t_1, \zeta_1, \dots, t_n, \zeta_n)$; $t_i \in [0, T]$, $\zeta_i \in \mathbb{R}_0$, such that $f_n \in L^2\big((\lambda \times \nu)^n\big)$ and $f_n$ is symmetric with respect to the pairs of variables $(t_1, \zeta_1), \dots, (t_n, \zeta_n)$.

It is important to note that in this case the $n$-times iterated integral $I_n(f_n)$ is taken with respect to $\tilde N(dt, d\zeta)$ and not with respect to $d\eta(t)$. Thus, we define

$$I_n(f_n) := n! \int_0^T \int_{\mathbb{R}_0} \int_0^{t_n} \int_{\mathbb{R}_0} \cdots \int_0^{t_2} \int_{\mathbb{R}_0} f_n(t_1, \zeta_1, \dots, t_n, \zeta_n)\, \tilde N(dt_1, d\zeta_1) \cdots \tilde N(dt_n, d\zeta_n) \tag{2.12}$$

for $f_n \in \tilde L^2\big((\lambda \times \nu)^n\big)$.

Then the Itô isometry for stochastic integrals with respect to $\tilde N(dt, d\zeta)$ gives the following isometry for the chaos expansion:

$$\|F\|^2_{L^2(P)} = \sum_{n=0}^{\infty} n!\, \|f_n\|^2_{L^2((\lambda \times \nu)^n)}. \tag{2.13}$$
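For example (a direct check of (2.11)-(2.13), using the standing assumption (1.5)): the Lévy martingale $\eta(T)$ from (1.3) has the one-term chaos expansion

$$\eta(T) = I_1(f_1), \quad f_1(t, \zeta) = \zeta,$$

and (2.13) gives $E\big[\eta^2(T)\big] = \|f_1\|^2_{L^2(\lambda \times \nu)} = T \int_{\mathbb{R}_0} \zeta^2\, \nu(d\zeta) < \infty$.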
As in the Brownian motion case, we use the chaos expansion to define the Malliavin derivative. Note that in this case there are two parameters $t, \zeta$, where $t$ represents time and $\zeta \ne 0$ represents a generic jump size.

Definition 2.2 (Malliavin derivative $D_{t,\zeta}$) ([16] [17]). Let $\mathbb{D}_{1,2}^{(\tilde N)}$ be the space of all $F \in L^2(P)$ such that its chaos expansion (2.11) satisfies

$$\|F\|^2_{\mathbb{D}_{1,2}^{(\tilde N)}} := \sum_{n=1}^{\infty} n\, n!\, \|f_n\|^2_{L^2((\lambda \times \nu)^n)} < \infty. \tag{2.14}$$

For $F \in \mathbb{D}_{1,2}^{(\tilde N)}$, we define the Malliavin derivative of $F$ at $(t, \zeta)$ (with respect to $\tilde N(\cdot, \cdot)$), $D_{t,\zeta} F$, by

$$D_{t,\zeta} F := \sum_{n=1}^{\infty} n\, I_{n-1}\big(f_n(\cdot, t, \zeta)\big), \tag{2.15}$$

where $I_{n-1}(f_n(\cdot, t, \zeta))$ means that we perform the $(n-1)$-times iterated integral with respect to $\tilde N$ to the first $n - 1$ variable pairs $(t_1, \zeta_1), \dots, (t_{n-1}, \zeta_{n-1})$, keeping $(t_n, \zeta_n) = (t, \zeta)$ as a parameter.

In this case we get the isometry

$$E\bigg[\int_0^T \int_{\mathbb{R}_0} (D_{t,\zeta} F)^2\, \nu(d\zeta)\, dt\bigg] = \sum_{n=1}^{\infty} n\, n!\, \|f_n\|^2_{L^2((\lambda \times \nu)^n)} = \|F\|^2_{\mathbb{D}_{1,2}^{(\tilde N)}}. \tag{2.16}$$

(Compare with (2.8).)
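For example (continuing the illustration above): since $\eta(T) = I_1(f_1)$ with $f_1(t, \zeta) = \zeta$, Definition 2.2 gives

$$D_{t, \zeta}\, \eta(T) = f_1(t, \zeta) = \zeta, \quad (t, \zeta) \in [0, T] \times \mathbb{R}_0.$$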
The properties of $D_{t,\zeta}$ corresponding to the properties (2.9) and (2.10) of $D_t$ are the following:

1) Chain rule ([17] [20]). Suppose $F_1, \dots, F_m \in \mathbb{D}_{1,2}^{(\tilde N)}$ and that $\phi : \mathbb{R}^m \to \mathbb{R}$ is continuous and bounded. Then $\phi(F_1, \dots, F_m) \in \mathbb{D}_{1,2}^{(\tilde N)}$ and

$$D_{t,\zeta}\, \phi(F_1, \dots, F_m) = \phi\big(F_1 + D_{t,\zeta} F_1, \dots, F_m + D_{t,\zeta} F_m\big) - \phi(F_1, \dots, F_m). \tag{2.17}$$

2) Integration by parts/duality formula ([17]). Suppose $\Psi(t, \zeta)$ is $\mathcal{F}_t$-adapted and $E\big[\int_0^T \int_{\mathbb{R}_0} \Psi^2(t, \zeta)\, \nu(d\zeta)\, dt\big] < \infty$, and let $F \in \mathbb{D}_{1,2}^{(\tilde N)}$. Then

$$E\bigg[F \int_0^T \int_{\mathbb{R}_0} \Psi(t, \zeta)\, \tilde N(dt, d\zeta)\bigg] = E\bigg[\int_0^T \int_{\mathbb{R}_0} \Psi(t, \zeta)\, D_{t,\zeta} F\, \nu(d\zeta)\, dt\bigg]. \tag{2.18}$$
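Note that, unlike (2.9), the chain rule (2.17) is a difference operator rather than a derivation. For instance (with $\phi(x) = x^2$, which is handled by a truncation argument since it is not bounded):

$$D_{t,\zeta}\big(\eta^2(T)\big) = \big(\eta(T) + D_{t,\zeta}\, \eta(T)\big)^2 - \eta^2(T) = 2\zeta\, \eta(T) + \zeta^2,$$

where the extra term $\zeta^2$ has no analogue in the Brownian case.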
We let $\mathbb{D}_{1,2} := \mathbb{D}_{1,2}^{(B)} \cap \mathbb{D}_{1,2}^{(\tilde N)}$ denote the set of all random variables which are Malliavin differentiable with respect to both $B(\cdot)$ and $\tilde N(\cdot, \cdot)$.
3. The Stochastic Maximum Principle
We now return to Problem 1.1 given in the introduction. We make the following assumptions:
Assumptions 3.1.

(A1) The functions $b(t, x, u, \omega)$, $\sigma(t, x, u, \omega)$, $\theta(t, x, u, \zeta, \omega)$, $g(t, x, y, z, k, u, \omega)$, $f(t, x, m, u, \omega)$, $\varphi(x, m)$, $\psi(y)$ and $h(x)$ are all continuously differentiable ($C^1$) with respect to the arguments (if depending on them) $x \in \mathbb{R}$, $m \in \mathbb{R}$, $y \in \mathbb{R}$, $z \in \mathbb{R}$, $k$ and $u \in U$ for each $t \in [0, T]$ and a.a. $\omega \in \Omega$.
(A2) For all $t, h$ such that $0 \le t < t + h \le T$, and all bounded $\mathcal{E}_t$-measurable random variables $\alpha = \alpha(\omega)$, the control

$$\beta_\alpha(s) := \alpha(\omega)\, \chi_{[t, t+h]}(s), \quad s \in [0, T],$$

belongs to $\mathcal{A}_{\mathcal{E}}$.
(A3) For all $u, \beta \in \mathcal{A}_{\mathcal{E}}$ with $\beta$ bounded, there exists $\delta > 0$ such that $u + y\beta \in \mathcal{A}_{\mathcal{E}}$ for all $y \in (-\delta, \delta)$. Furthermore, if we define the derivative processes

$$x(t) := \frac{d}{dy} X^{(u + y\beta)}(t)\Big|_{y=0}, \tag{3.1}$$

$$y(t) := \frac{d}{dy} Y^{(u + y\beta)}(t)\Big|_{y=0}, \tag{3.2}$$

$$z(t) := \frac{d}{dy} Z^{(u + y\beta)}(t)\Big|_{y=0}, \qquad k(t, \zeta) := \frac{d}{dy} K^{(u + y\beta)}(t, \zeta)\Big|_{y=0}, \tag{3.3}$$

then the families

(3.4)

and

(3.5)

are $\lambda \times P$-uniformly integrable, and the family

(3.6)

is $P$-uniformly integrable.
(A4) For all $u, \beta \in \mathcal{A}_{\mathcal{E}}$ with $\beta$ bounded, the processes $x(t)$, $y(t)$, $z(t)$ and $k(t, \zeta)$ exist and satisfy the equations

$$\begin{cases} dx(t) = \Big[\dfrac{\partial b}{\partial x}(t)\, x(t) + \dfrac{\partial b}{\partial u}(t)\, \beta(t)\Big] dt + \Big[\dfrac{\partial \sigma}{\partial x}(t)\, x(t) + \dfrac{\partial \sigma}{\partial u}(t)\, \beta(t)\Big] dB(t) \\ \qquad\qquad + \displaystyle\int_{\mathbb{R}_0} \Big[\dfrac{\partial \theta}{\partial x}(t, \zeta)\, x(t) + \dfrac{\partial \theta}{\partial u}(t, \zeta)\, \beta(t)\Big] \tilde N(dt, d\zeta), \\ x(0) = 0, \end{cases} \tag{3.7}$$

$$\begin{cases} dy(t) = -\Big[\dfrac{\partial g}{\partial x}(t)\, x(t) + \dfrac{\partial g}{\partial y}(t)\, y(t) + \dfrac{\partial g}{\partial z}(t)\, z(t) + \displaystyle\int_{\mathbb{R}_0} \nabla_k g(t, \zeta)\, k(t, \zeta)\, \nu(d\zeta) + \dfrac{\partial g}{\partial u}(t)\, \beta(t)\Big] dt \\ \qquad\qquad + z(t)\, dB(t) + \displaystyle\int_{\mathbb{R}_0} k(t, \zeta)\, \tilde N(dt, d\zeta), \\ y(T) = h'\big(X(T)\big)\, x(T), \end{cases} \tag{3.8}$$

where we used the simplified notation

$$\frac{\partial b}{\partial x}(t) := \frac{\partial b}{\partial x}\big(t, X(t), u(t)\big), \quad \frac{\partial \theta}{\partial x}(t, \zeta) := \frac{\partial \theta}{\partial x}\big(t, X(t), u(t), \zeta\big), \quad \text{etc.} \tag{3.9}$$
(A5) For all $u \in \mathcal{A}_{\mathcal{E}}$, with the definitions (3.1), (3.2) and (3.3), the following process $G(t, s)$, $0 \le t \le s \le T$:

(3.10)

exists, and we now define the adjoint processes $p(t)$, $q(t)$ and $r(t, \zeta)$, $0 \le t \le T$, $\zeta \in \mathbb{R}_0$, as follows:

$$p(t) := K(t) + \int_t^T \frac{\partial H_0}{\partial x}(s)\, G(t, s)\, ds, \tag{3.11}$$

$$q(t) := D_t\, p(t), \tag{3.12}$$

$$r(t, \zeta) := D_{t, \zeta}\, p(t), \tag{3.13}$$

with the process $K(t)$ given by

(3.14)

and the process $H_0(s, x, u)$ given by

(3.15)

The above processes all exist for $0 \le t \le T$, $\zeta \in \mathbb{R}_0$. Above and in the following, we use the shorthand notation $\frac{\partial H_0}{\partial x}(s) := \frac{\partial H_0}{\partial x}\big(s, X(s), u(s)\big)$.
We now define the Hamiltonian for this problem: $H : [0, T] \times \mathbb{R} \times \mathbb{R} \times \mathbb{R} \times \mathbb{R} \times \mathcal{R} \times U \times \Omega \to \mathbb{R}$ is defined by

$$H(t, x, m, y, z, k, u, \omega) := f(t, x, m, u, \omega) + p(t)\, b(t, x, u, \omega) + q(t)\, \sigma(t, x, u, \omega) + \int_{\mathbb{R}_0} r(t, \zeta)\, \theta(t, x, u, \zeta, \omega)\, \nu(d\zeta) + \lambda(t)\, g(t, x, y, z, k, u, \omega), \tag{3.16}$$

where $\mathcal{R}$ denotes the set of functions $k : \mathbb{R}_0 \to \mathbb{R}$. The process $\lambda(t)$ is given by the forward equation

$$\begin{cases} d\lambda(t) = \lambda(t^-)\Big[\dfrac{\partial g}{\partial y}(t)\, dt + \dfrac{\partial g}{\partial z}(t)\, dB(t) + \displaystyle\int_{\mathbb{R}_0} \nabla_k g(t, \zeta)\, \tilde N(dt, d\zeta)\Big], \\ \lambda(0) = \psi'\big(Y(0)\big), \end{cases} \tag{3.17}$$

for $0 \le t \le T$.
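Since (3.17) is linear in $\lambda$, it can be solved explicitly (a standard observation, stated here under the additional assumption $\nabla_k g(t, \zeta) > -1$, which makes the logarithms below well defined): the Doléans-Dade exponential gives

$$\lambda(t) = \psi'\big(Y(0)\big) \exp\bigg(\int_0^t \Big\{\frac{\partial g}{\partial y}(s) - \frac{1}{2}\Big(\frac{\partial g}{\partial z}\Big)^2(s)\Big\}\, ds + \int_0^t \frac{\partial g}{\partial z}(s)\, dB(s) + \int_0^t \int_{\mathbb{R}_0} \Big\{\ln\big(1 + \nabla_k g(s, \zeta)\big) - \nabla_k g(s, \zeta)\Big\}\, \nu(d\zeta)\, ds + \int_0^t \int_{\mathbb{R}_0} \ln\big(1 + \nabla_k g(s, \zeta)\big)\, \tilde N(ds, d\zeta)\bigg).$$

In particular, $\lambda(t)$ keeps the sign of $\psi'(Y(0))$ for all $t \in [0, T]$.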
We can now formulate our stochastic maximum principle:
Theorem 3.1 (Partial information equivalence principle). Suppose $u \in \mathcal{A}_{\mathcal{E}}$ with corresponding solutions $X(t)$, $Y(t)$, $Z(t)$, $K(t, \zeta)$, $\lambda(t)$ of (1.1), (1.2) and (3.17). Assume that the random variables defining the adjoint processes $p(t)$, $q(t)$ and $r(t, \zeta)$ in (3.10)-(3.15) belong to $\mathbb{D}_{1,2}$ for all $0 \le t \le s \le T$, $\zeta \in \mathbb{R}_0$, and that

(3.18)

(3.19)

(3.20)

Then the following are equivalent:

i) $\dfrac{d}{dy} J(u + y\beta)\Big|_{y=0} = 0$ for all bounded $\beta \in \mathcal{A}_{\mathcal{E}}$.

ii) $E\Big[\dfrac{\partial H}{\partial u}\big(t, X(t), E[X(t)], Y(t), Z(t), K(t, \cdot), u\big)\, \Big|\, \mathcal{E}_t\Big]_{u = u(t)} = 0$ for a.a. $t \in [0, T]$ and a.a. $\omega \in \Omega$.
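Remark. Note that ii) involves a conditional expectation even in the full information case $\mathcal{E}_t = \mathcal{F}_t$: the adjoint processes $p(t)$, $q(t)$ and $r(t, \zeta)$ defined through (3.10)-(3.15) depend on the future of the system (through the terminal values and the Malliavin derivatives) and are therefore in general not $\mathcal{F}_t$-adapted. In that case, the condition reads

$$E\Big[\frac{\partial H}{\partial u}(t)\, \Big|\, \mathcal{F}_t\Big]_{u = u(t)} = 0 \quad \text{for a.a. } t, \omega,$$

and the conditioning cannot simply be dropped.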
Proof. (i) ⇒ (ii): Assume that (i) holds and note that
(3.21)
and
(3.22)
Then
(3.23)
By the duality formulae (2.10) and (2.18), applied with the appropriate choices of $F$, we get

(3.24)
Similarly, using the Fubini theorem to obtain the last equality below, we have

(3.25)
Changing the notation (interchanging the roles of the variables $s$ and $t$), this becomes

(3.26)
Combining (3.24) and (3.26) and using (3.14), we get
(3.27)
Then by the Itô formula and (3.17),
(3.28)
Now by (3.16) we have
(3.29)
Hence, we conclude
(3.30)
Combining (3.23), (3.27) and (3.30), we get
(3.31)
This holds for all bounded $\beta \in \mathcal{A}_{\mathcal{E}}$. In particular, if we apply this to $\beta = \beta_\alpha(s) := \alpha\, \chi_{[t, t+h]}(s)$, where $\alpha = \alpha(\omega)$ is bounded and $\mathcal{E}_t$-measurable and $0 \le t < t + h \le T$, we get, by (3.7), $x(s) = 0$ for $0 \le s \le t$, and (3.31) can be written

(3.32)

where

(3.33)

and

(3.34)
Note that with this choice of $\beta$ we have, for $t + h \le s \le T$,

(3.35)
Hence, by the Itô formula,

(3.36)

where $G$ is defined in (3.10). Note that $G(t, s)$ does not depend on $h$. Then

(3.37)
where $H_0$ is defined in (3.15). Differentiating with respect to $h$ at $h = 0$ gives

(3.38)
From this we see that

(3.39)
Therefore, by (3.36)
(3.40)
By (3.7) we have
(3.41)
Therefore, by (3.40) and (3.41)
(3.42)
where
(3.43)
and
(3.44)
Recall that $q(t) = D_t\, p(t)$ and $r(t, \zeta) = D_{t, \zeta}\, p(t)$ by (3.12) and (3.13). By the duality formulae (2.10) and (2.18), we have

(3.45)
Since $\alpha$ is $\mathcal{E}_t$-measurable, we see that

(3.46)
We conclude from (3.42)-(3.46) that
(3.47)
Moreover, we see directly that
(3.48)
By differentiating (3.32) with respect to $h$ at $h = 0$, we thus obtain the equation

(3.49)
Using (3.11), equation (3.49) can be written
(3.50)
Since this holds for all bounded $\mathcal{E}_t$-measurable random variables $\alpha$, we conclude that

(3.51)
(ii) ⇒ (i): Conversely, suppose (3.51) holds for some $u \in \mathcal{A}_{\mathcal{E}}$. Then we can reverse the argument above to show that (3.32) holds for all $\beta_\alpha(s) = \alpha\, \chi_{[t, t+h]}(s)$ with $\alpha$ bounded and $\mathcal{E}_t$-measurable. Then (3.32) holds for all linear combinations of such $\beta_\alpha$. Since all bounded $\beta \in \mathcal{A}_{\mathcal{E}}$ can be approximated by such linear combinations, it follows that (3.32) holds for all bounded $\beta \in \mathcal{A}_{\mathcal{E}}$. Hence, by reversing the remaining part of the argument above, we conclude that (ii) ⇒ (i).
4. Conclusion
In this paper, we have considered a mean-field type stochastic control problem where the dynamics is governed by a forward-backward stochastic differential equation driven by Lévy processes and the information available to the controller is possibly less than the overall information. All the system coefficients and the objective performance functional are allowed to be random, possibly non-Markovian. Malliavin calculus is employed to derive a maximum principle for the optimal control of such a system in which the adjoint processes are explicitly expressed.
Acknowledgements
The work was partially done while the first author was visiting the University of Kansas. She would like to thank Professor David Nualart and Professor Yaozhong Hu for providing a stimulating working environment.
Funding

The work of Qing Zhou is supported by the National Natural Science Foundation of China (Nos. 11471051 and 11371362). The work of Yong Ren is supported by the National Natural Science Foundation of China (No. 11371029).