On Global Minimization for the Value Function in Affine Optimal Control Problems
1. Introduction and Preliminaries
The regularity properties of the value function associated with an optimal control problem have been studied in depth over the last decades, extensively using tools from geometric control theory and nonsmooth analysis. It is well known that the value function associated with an optimal control problem may fail to be everywhere differentiable; in general it may not even be continuous.
In [1] the authors define a value function on the set of terminal states associated with an affine optimal control problem and study its regularity properties. Under certain assumptions, the authors identify subsets of the attainable set on which the value function is continuous and differentiable. In this paper, from the viewpoint of optimization methods, we first solve a global minimization problem for the value function defined in [1], and then we show that the value function is continuous and differentiable at a global minimizer.
Indeed, the value function defined in [1] takes its values through optimal controls subject to a terminal constraint. In this paper we solve a global minimization problem via an optimal control problem without terminal constraints.
The optimal control problem we deal with in this paper is the following:
(1.1)
such that
Here the function appearing in the cost is smooth and bounded from above, and the vector function and the matrix function defining the affine dynamics are smooth on the whole state space.
Here a is a given initial condition and y is a given terminal condition. An admissible control is a vector function in an appropriate function space such that the solution of the state equation starting from a is well defined on the whole interval. The set of admissible controls is denoted accordingly.
For the initial state a, we define the value function associated with problem (1.1) to be a function of terminal states as follows: for a terminal point y,
(1.2)
with the understanding that the value is infinite if y cannot be attained by admissible trajectories in time T. The value function is bounded below, since the cost is bounded below by the assumption that the function in the cost is bounded from above. Define the attainable set as the set of points that can be reached from a by admissible trajectories over the time interval [0, T]. We always assume the attainable set is nonempty.
In [1] the authors have studied the regularity properties of the value function on a dense subset of the attainable set.
In this paper we provide a computational approach to the following minimization problem:
(1.3)
Meanwhile, we study the regularity properties of the value function
at a global minimizer.
Remark 1.1. In practice, when the state-dependent term in the cost vanishes, the cost functional is the square of the L2 norm of the control; by minimizing the value function one can then find an optimal target at which the system operates at minimal cost.
To minimize the value function over the attainable set, we need to solve the following optimal control problem without terminal-state restriction.
(1.4)
such that
Remark 1.2. We see that, for a point y of the attainable set and an admissible control u steering the affine system from a to y,
(1.5)
Remark 1.3. Suppose that problem (1.4) is solvable, i.e. it admits an optimal control, and let this optimal control steer the primal affine system from a to some terminal point. By (1.2) and (1.5), we have
(1.6)
Since the value function is bounded below and not identically infinite, if problem (1.4) is solvable then by (1.6) the infimum in (1.3) is finite; thus the minimization (1.3) is meaningful. Moreover, we will show that problem (1.4) is solvable only if the minimization problem (1.3) is solvable.
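Since the paper's formulas are not reproduced in this extraction, the following sketch uses an assumed scalar instance (purely illustrative, not the paper's system): dynamics x' = u on [0, T] with cost (1/2) times the integral of u(t)^2, for which the terminal-constrained value function is V(y) = (y - a)^2 / (2T). The script checks numerically the relation behind (1.2) and (1.5): every admissible control reaching y costs at least V(y), with equality for the constant control.

```python
import random

# Hypothetical scalar instance (not the paper's system): x' = u on [0, T],
# cost J(u) = 1/2 * integral of u(t)^2 dt, initial state a, terminal state y.
# The constant control u = (y - a)/T attains V(y) = (y - a)^2 / (2*T), and by
# Cauchy-Schwarz no admissible control reaching y can cost less.

T, a, y, L = 1.0, 0.0, 2.0, 50
h = T / L

def cost(u):                      # cost of a piecewise-constant control
    return 0.5 * h * sum(ui * ui for ui in u)

V_y = (y - a) ** 2 / (2.0 * T)    # value function of the assumed instance

random.seed(0)
costs = []
for _ in range(200):
    u = [random.uniform(-5.0, 5.0) for _ in range(L)]
    shift = (y - a) / T - sum(u) / L    # enforce the terminal constraint x(T) = y
    u = [ui + shift for ui in u]
    costs.append(cost(u))

best_random = min(costs)
const_cost = cost([(y - a) / T] * L)
print(V_y, const_cost, best_random)
```

Every randomly generated control reaching y costs at least V(y), and the constant control attains it, mirroring the inequality (1.5) and the definition (1.2).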
In this paper we focus on the Hamilton-Jacobi-Bellman equation [2] [3] associated with problem (1.4). We present parameterized convection-diffusion equations for a viscosity approximation [4] [5] [6] to the Hamilton-Jacobi-Bellman equation. A parameterized convection-diffusion equation then yields a piecewise differentiable flow approximating the optimal objective value of problem (1.4).
The rest of the paper is organized as follows. In section 2, we study the global minimization problem of the value function over the attainable set. In section 3, two results are given on the continuity and differentiability of the value function at a global minimizer. Section 4 is devoted to a computational approach to the minimization problem of the value function. Two examples illustrating the computational approach for a linear-quadratic optimal control problem under terminal constraint are presented in section 5. In section 6, we derive an iteration of difference equations for implementing the computational approach of section 4. The last section concludes.
2. Minimizing the Value Function over the Attainable Set
For problem (1.1), to minimize the value function over the attainable set, we consider the optimal control problem (1.4). For problem (1.4), we define its value function as follows:
(2.1)
Theorem 2.1. If problem (1.4) is solvable (i.e. there exists an optimal control of (1.4)), then
(2.2)
Proof. Let
be an optimal control of the problem
, which steers the control affine system (in (1.4)) from a to
. By the definition of
((1.2)), noting that by (1.5) for all controls u steering the affine control system from a to
satisfying
, we have
(2.3)
Thus we have
(2.4)
On the other hand, if there were a point of the attainable set at which the value function is strictly smaller, then, noting (1.5) and (1.2), there would be a control u steering the affine system from a to that point with strictly smaller cost, contradicting the optimality of the chosen control for problem (1.4). Together with (2.4), this proves the claim. The proof of Theorem 2.1 is completed.
Theorem 2.2. If a vector attains the minimum in (1.3), and some admissible control steers the system from a to this vector with cost equal to the minimum value, then problem (1.4) is solvable.
Proof. Since the minimization problem (1.3) is solvable, we have a point such that
(2.5)
We need to show that an admissible control
steering the system from a to
such that
is an optimal control of
. Let
be an arbitrary admissible control steering the system from a to
. We will show that
. By (1.5), (2.5) and the definition of
, also noting the assumption
, we have
(2.6)
Since this admissible control is arbitrary, by (2.6) we see that the chosen control is an optimal control of problem (1.4). The proof of Theorem 2.2 is completed.
Remark 2.1. In the proofs of Theorem 2.1 and Theorem 2.2, the basic fact is that both problems share the same control system and assign the same cost to the same admissible control. By these two theorems, the optimal control of problem (1.4) steers the control system from the initial state to a minimizer, over the attainable set, of the value function of problem (1.1). Conversely, if a point is a minimizer of the value function over the attainable set, and a control steering the system from the initial point a to it is optimal for the corresponding terminal-constrained problem, then that control is an optimal control of (1.4).
3. The Regularity Properties of the Value Function at a Minimizer Point
In this section we assume that problem (1.4) is solvable. Let an optimal control of problem (1.4) steer the primal affine system from the initial point a to a terminal point; by (2.4), this terminal point is a minimizer of the value function over the attainable set.
We know that the end-point map is smooth [1] [7] [8] [9] [10]. In the following theorems we further assume that the end-point map is an open map at the optimal control.
Theorem 3.1. If the minimizer lies in the interior of the attainable set and the end-point map is an open map at the optimal control, then the value function of problem (1.1) is continuous at this minimizer.
Proof. Since
is an optimal control of the problem
, which steers the primal affine system from a to
, we have
by (2.4). Since the end-point map is open at
, there are positive numbers
and
such that the image of
under the end-point map covers
. Given
. We can also choose positive number
so small that, when
, the inequality
holds (see Proposition 32 in [3]). Since
, we can choose positive number
above so small that, for all
,
and
noting that
is the minimizer of
. In other words, now, for given
, we have two positive numbers
and
such that:
1) the image of
under the end-point map covers
;
2) when
, the inequality
holds;
3) when
, we have
and
.
Now for
, in
, we have an admissible control v steering the primal affine system from a to
. By the definition of
, also noting the relationships
and
, we have the following inequalities:
(3.1)
Consequently,
(3.2)
The proof of Theorem 3.1 is completed.
Next we study the differentiability of the value function
at a minimizer point.
Theorem 3.2. If the minimizer lies in the interior of the attainable set and the end-point map is a one-to-one open map at the optimal control, then the value function of problem (1.1) is differentiable at this minimizer.
Proof. Since all functions
are smooth, we see that the functional
is F-differentiable at
. Noting that the end-point map is an open map at
and
for all admissible control near
in
, we have
(3.3)
which also implies that, for a small positive r, there exists
such that, as
,
(3.4)
Since the end-point map
is an open map at
, there are positive numbers
and
such that
is surjective. There exists a smooth right inverse
such that
for every y in
. Fix local coordinates around the initial state a, and let
and
denote some balls of radius
centered at a and
, respectively. Due to
being smooth, there exist positive numbers
and
such that, for every
,
(3.5)
Pick any point
such that
, with
. Then by (3.5) there exists
satisfying
and
such that
. Noting that
(3.6)
When
(
), by (3.1), (3.2), (3.4), (3.6), also noting that
, we have
and
(3.7)
By (3.5) we see that when
, i.e.
(
is picked above), we have
. Thus, by (3.7), (3.6), (3.3) we have
Consequently, the value function
of
is differentiable at the minimizer
of
and the corresponding derivative is zero. The proof of Theorem 3.2 is completed.
Remark 3.1. In both theorems above we assume the minimizer lies in the interior of the attainable set, since any candidate minimizer of the value function should be interior to the attainable set. Since the attainable set is the image of the set of admissible controls under the end-point map, it is reasonable to assume that the end-point map is an open map at the optimal control. In [1] the authors assume the end-point map to be open and a submersion at an optimal control and study the regularity properties of the value function on a subset. In this paper we only assume that the minimizer lies in the image of an end-point map which is open at the optimal control; we do not assume the end-point map to be a submersion.
4. An Extremal Flow for Minimizing the Value Function of Affine Optimal Control Problems under Terminal Constraint
In this section, for problem (1.1), to minimize the value function over the attainable set, we construct a so-called extremal flow for computing the optimal value of problem (1.4) numerically. We focus on the following HJB equation:
(4.1)
with boundary condition
.
By elementary optimization, for given
, we see that
is the unique minimizer of
over
. Then we have
(4.2)
Thus we can rewrite the equation in (4.1) as the following PDE:
(4.3)
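In generic notation (assumed here, since the paper's symbols are not reproduced: dynamics x' = f(x) + G(x)u, running control cost (1/2)|u|^2, and p the gradient of the value function), the elementary optimization behind (4.2) is a completion of squares:

```latex
\min_{u}\Big[\tfrac12\,|u|^{2} + p^{\top}\big(f(x) + G(x)\,u\big)\Big]
  = p^{\top} f(x) - \tfrac12\,\big|G(x)^{\top} p\big|^{2},
\qquad
u^{*}(x) = -\,G(x)^{\top} p .
```

Substituting the minimizer back eliminates the inner minimization, which is how the equation in (4.1) is rewritten as the explicit PDE (4.3).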
Remark 4.1. By classical PDE theory [4] [5] [6] , we know that a viscosity solution of the PDE in (4.3) can be obtained from smooth solutions
to the family of convection-diffusion equations
(4.4)
(parameterized by
) in the limits as
, where
. The convergence of (4.4) to (4.3) as
has been established by classical PDE results on viscosity approximation (see [2]). Thus in equation (4.4) the diffusion term
converges to zero locally uniformly as
.
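Remark 4.1 can be illustrated on a one-dimensional toy model (an assumption for illustration only, not the paper's equation): the eikonal-type problem |v'| = 1 on (0, 1) with v(0) = v(1) = 0, whose viscosity solution is v(x) = min(x, 1 - x). Adding a small diffusion term and marching the parabolic regularization to steady state produces a smooth approximation that converges to this non-smooth solution as the viscosity parameter tends to zero:

```python
# Vanishing-viscosity sketch (toy 1-D model, not the paper's HJB equation):
# march  v_t = eps * v_xx + 1 - |v_x|  to steady state, so the limit solves
# -eps * v'' + |v'| = 1 on (0, 1), v(0) = v(1) = 0.  As eps -> 0 the profile
# approaches the viscosity solution v(x) = min(x, 1 - x).

N = 51                       # grid points on [0, 1]
dx = 1.0 / (N - 1)
eps = 0.02                   # small viscosity parameter
dt = 0.25 * dx * dx / eps    # stable explicit step (diffusion number 0.25)

v = [0.0] * N
for _ in range(8000):        # time-march to steady state
    w = v[:]
    for i in range(1, N - 1):
        vx = (v[i + 1] - v[i - 1]) / (2.0 * dx)           # central gradient
        vxx = (v[i + 1] - 2.0 * v[i] + v[i - 1]) / dx**2  # discrete Laplacian
        w[i] = v[i] + dt * (eps * vxx + 1.0 - abs(vx))
    v = w

exact = [min(i * dx, 1.0 - i * dx) for i in range(N)]     # viscosity solution
err = max(abs(a - b) for a, b in zip(v, exact))
print(err)
```

The viscous profile deviates from the tent function by about eps near the kink at x = 1/2, consistent with the diffusion term vanishing locally uniformly as eps tends to zero.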
For computing
numerically, we define extremal flows as follows.
Definition 4.1. Given
, for a solution
of the PDE problem (4.4), we call
an extremal flow if it is a solution of the Cauchy initial value problem
(4.5)
with a feedback control
(4.6)
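In generic terms (our notation; the paper's formulas are not reproduced), an extremal flow is the solution of x' = f(x) + G(x) u(t, x) driven by a feedback built from the gradient of the approximate value function. A minimal integrator sketch, with the dynamics and the feedback passed in as plain functions:

```python
import math

def rk4_flow(f, G, feedback, x0, T, steps):
    """Integrate x' = f(x) + G(x) * feedback(t, x) by classical RK4 (scalar sketch)."""
    h = T / steps
    x, t = x0, 0.0
    for _ in range(steps):
        def rhs(t, x):
            return f(x) + G(x) * feedback(t, x)
        k1 = rhs(t, x)
        k2 = rhs(t + h / 2, x + h / 2 * k1)
        k3 = rhs(t + h / 2, x + h / 2 * k2)
        k4 = rhs(t + h, x + h * k3)
        x += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return x

# Illustrative check (assumed instance, not the paper's): f = 0, G = 1 and the
# linear feedback u(t, x) = -x give x' = -x, so x(T) = x0 * exp(-T).
xT = rk4_flow(lambda x: 0.0, lambda x: 1.0, lambda t, x: -x, x0=1.0, T=1.0, steps=100)
print(xT, math.exp(-1.0))
```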
The following theorem claims that optimal value
of the problem
can be approximated by solving equations (4.4) and (4.5).
Theorem 4.1. Let
denote a solution of the PDE in (4.4) corresponding to a positive real number
. Given
, if
is an extremal flow related to
and
is the corresponding feedback control (see (4.5), (4.6)), then we have
(4.7)
Proof. By (4.4), (4.5) and (4.6), we have
(4.8)
Integrating the equality in (4.8) with respect to t from 0 to T, noting that
, we have
(4.9)
On the other hand, if
is another trajectory corresponding to an admissible control
with respect to
, we have
(4.10)
then for each
, we have
(4.11)
Integrating the above inequality over
, noting
, we obtain
(4.12)
By (4.9) and (4.12), we have
(4.13)
Let
be an optimal control and
be the corresponding optimal trajectory with respect to
. Using (4.13) for the optimal pair
, noting the fact
, we have
(4.14)
By (4.14), noting that
is an admissible feedback control, we have
which yields
(4.15)
Noting that, in the Equation (4.4), the diffusion term
converges to zero locally uniformly as
(see Remark 4.1), we can show that, on a compact set
which contains the optimal trajectory
and the flow
,
(4.16)
Thus, by the Lebesgue convergence theorem, we have
(4.17)
and
(4.18)
Thus by (4.15), (4.16), (4.17), (4.18) we have, as
,
(4.19)
It follows from Theorem 2.1 that
The theorem has been proved.
In the proof of Theorem 4.1, replacing
with zero, the extremal flow will not depend on
. In the same way we can prove the following result.
Theorem 4.2. If
satisfies the PDE
(4.20)
and
is the solution to Cauchy initial value problem
(4.21)
with a feedback control
(4.22)
then we have
(4.23)
5. Examples on Linear-Quadratic Optimal Control Problem under Terminal State Constraint for Illustrating Theorem 4.2
Example 5.1. We consider the following linear-quadratic optimal control problem with terminal state constraint:
(5.1)
and the corresponding linear-quadratic optimal control problem:
(5.2)
where in (5.1) and (5.2),
and
.
By classical LQ optimal control theory [3] , there exists an absolutely continuous symmetric matrix function
, defined for
, which satisfies the matrix Riccati Differential Equation on
:
(5.3)
Moreover, the LQ optimal control problem
is solvable.
To use Theorem 4.2, we see that the function
satisfies the following HJB equation
(5.4)
For
, we have
For
to find an extremal flow
, we solve the Cauchy initial value problem
(5.5)
Let
be the solution of the matrix differential equation
By classical ordinary differential equation theory,
is the fundamental solution associated to
and the solution of (5.5) is given by
Then we have a feedback control
(5.6)
By Theorem 4.2 we have
(5.7)
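Because the concrete data of (5.1)-(5.3) are not reproduced in this extraction, the following sketch checks the Riccati machinery on an assumed scalar LQ instance (not the paper's example): x' = u with cost (1/2) times the integral of x^2 + u^2 over [0, T]. The Riccati equation is then p' = p^2 - 1 with p(T) = 0, solved by p(t) = tanh(T - t); we integrate it backward with RK4 and compare:

```python
import math

# Assumed scalar LQ instance (not the paper's data): x' = u,
# cost 1/2 * integral of (x^2 + u^2) dt.  The ansatz v = 1/2 * p(t) * x^2 in the
# HJB equation gives the Riccati ODE  p'(t) = p(t)^2 - 1,  p(T) = 0,
# whose closed-form solution is p(t) = tanh(T - t).

def riccati_p0(T, steps=1000):
    """Integrate q' = 1 - q^2 forward in s = T - t by RK4; returns p(0) = q(T)."""
    h = T / steps
    q = 0.0
    for _ in range(steps):
        k1 = 1 - q**2
        k2 = 1 - (q + h / 2 * k1) ** 2
        k3 = 1 - (q + h / 2 * k2) ** 2
        k4 = 1 - (q + h * k3) ** 2
        q += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return q

p0 = riccati_p0(T=1.0)
print(p0, math.tanh(1.0))   # the optimal feedback at t = 0 is u = -p(0) * x
```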
Remark 5.1. We will provide an approximation approach to compute
in the following example.
Example 5.2. We consider the following linear-quadratic optimal control problem with terminal state constraint:
(5.8)
and the corresponding linear-quadratic optimal control problem:
(5.9)
Similarly to the PDE in (5.4), the HJB equation for this example is
(5.10)
As in Example 5.1, we have
(5.11)
where
satisfies Riccati Differential Equation:
(5.12)
We solve the Cauchy initial value problem
(5.13)
to find an extremal flow
and the feedback control
By Theorem 4.2, for this example we have
(5.14)
For a numerical approach to compute
, in the following we present a sequence of flows converging to the extremal flow, together with the corresponding feedback controls, for an approximation of
.
By the iteration method given in [11] , we have a sequence of differentiable functions
, satisfying
(5.15)
(5.16)
such that
converges uniformly to the solution
of the equation in (5.12). Then we have a sequence of
such that, for
,
(5.17)
Noting that
is bounded and
converges uniformly to the solution
, we see that
is uniformly bounded. Therefore, by the Bellman-Gronwall inequality, we can show that
is uniformly bounded. Further, we show that
converges uniformly to
which is the solution of the equation in (5.13) as follows. We have, for
and
,
Then for
and
, by a standard integral estimate, we have
By the Bellman-Gronwall inequality, we have
(5.18)
Noting that
are uniformly bounded
,
is bounded on
and
converges uniformly to zero, by (5.18) we have shown that
converges uniformly to
on
. Meanwhile, if we define
, then
converges uniformly to the feedback control
on
. Noting that
we have
(5.19)
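The iteration (5.15)-(5.16) from [11] is a Picard-type scheme producing a uniformly convergent sequence of approximations to the Riccati solution. On an assumed scalar stand-in (illustration only; the paper's matrices are not shown), the scheme for q' = 1 - q^2, q(0) = 0 reads q_{k+1}(s) = integral from 0 to s of (1 - q_k(r)^2) dr and converges to q(s) = tanh(s):

```python
import math

# Picard iteration for the scalar Riccati ODE q' = 1 - q^2, q(0) = 0
# (an assumed stand-in for the iteration (5.15)-(5.16); the exact solution
# is q(s) = tanh(s)).  Each sweep integrates the previous iterate.

L = 1000                          # grid on [0, 1]
h = 1.0 / L
s = [i * h for i in range(L + 1)]

q = [0.0] * (L + 1)               # initial iterate q_0 = 0
for _ in range(30):               # Picard sweeps
    rhs = [1.0 - qi * qi for qi in q]
    new = [0.0] * (L + 1)
    for i in range(L):            # cumulative trapezoidal integral
        new[i + 1] = new[i] + 0.5 * h * (rhs[i] + rhs[i + 1])
    q = new

err = max(abs(qi - math.tanh(si)) for qi, si in zip(q, s))
print(err)
```

After a modest number of sweeps the remaining error is dominated by the quadrature, mirroring the uniform convergence used above to pass to the limit in the flows and feedback controls.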
6. A Numerical Approach to Compute the Optimal Value of the General Affine Optimal Control Problem
In this section, we present an iteration of difference equations to illustrate the approximation of
given by Theorem 4.1 concerning the affine optimal control problem
.
Given
. Let
satisfy
(6.1)
with a feedback control
(6.2)
By the result in Theorem 4.1, we need to compute
. Consider the function
(6.3)
Noting the expression of the cost functional of
in (1.4), we will estimate
(6.4)
Let
. Divide the time interval
evenly into L small intervals
, with
,
. Let
. Define
(6.5)
By classical numerical analysis [12], we have
(6.6)
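The approximation (6.5)-(6.6) replaces the cost integral by a sum over the L subintervals. As a sketch (with an assumed integrand, since the paper's cost functional is not reproduced), the composite trapezoidal rule on an even grid:

```python
import math

def trapezoid(g, t0, t1, L):
    """Composite trapezoidal rule for the integral of g over [t0, t1] with L subintervals."""
    h = (t1 - t0) / L
    total = 0.5 * (g(t0) + g(t1))
    for i in range(1, L):
        total += g(t0 + i * h)
    return h * total

# Assumed running cost 1/2 * u(t)^2 with u(t) = sin(pi * t):
# the integral over [0, 1] equals 1/4 exactly.
I = trapezoid(lambda t: 0.5 * math.sin(math.pi * t) ** 2, 0.0, 1.0, L=200)
print(I)
```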
By (6.3), for computing
, we need to estimate
by the following difference equation
(6.7)
with appearing
. Therefore, next for a given
we need to estimate
. We present an iteration of difference equations as follows to compute
numerically.
On
we define
(6.8)
The equation in (4.4) can be rewritten as
(6.9)
with the boundary condition
.
For simplicity, we restrict our discussion to the state in
as follows. For a given
, along the direction
, we have a linear function
such that
(6.10)
Noting that
. Then when
, at
, we have the following difference equation:
(6.11)
(6.12)
Write the state
in
. Denote
by
. We focus on difference equations on
. For positive integers:
, let
and
. For
, denote
and
. Let
denote the approximate grid value of the solution
and
(6.13)
(6.14)
We use
to denote the piecewise bi-linear interpolant of
,
,
,
and
, for all
.
By the difference iteration method with (6.11)-(6.14) and
(a piecewise bi-linear interpolant of
), for a given
, we write an algorithm for approximating
as follows.
Algorithm 4.1.
1) Set
,
;
2) Compute
3) For
,
(6.15)
(6.16)
(6.17)
(6.18)
4) Compute
(6.19)
Remark 6.1. The above discretization scheme is essentially Euler’s method in the characteristic direction, which keeps it stable in t [12]. The system matrix associated with the third step of the algorithm above is symmetric and positive definite if
is sufficiently smaller than
. We can exploit these special properties by using an efficient iterative solver that takes advantage of the symmetry and positive definiteness of the matrix.
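As a sketch of the solver suggested in Remark 6.1 (the matrix here is the generic 1-D implicit-diffusion operator, an assumption, since the paper's system matrix is not displayed), the conjugate gradient method needs only matrix-vector products and exploits symmetry and positive definiteness:

```python
# Conjugate gradient for an SPD tridiagonal system (sketch for Remark 6.1).
# A = I + mu * tridiag(-1, 2, -1) is the generic implicit-diffusion step
# matrix (an assumed stand-in for the paper's system matrix); it is
# symmetric and positive definite for mu > 0.

n, mu = 50, 0.3

def matvec(x):
    y = [0.0] * n
    for i in range(n):
        y[i] = (1.0 + 2.0 * mu) * x[i]
        if i > 0:
            y[i] -= mu * x[i - 1]
        if i < n - 1:
            y[i] -= mu * x[i + 1]
    return y

def cg(b, tol=1e-12, maxiter=200):
    x = [0.0] * n
    r = b[:]                      # residual b - A x for x = 0
    p = r[:]
    rs = sum(ri * ri for ri in r)
    for _ in range(maxiter):
        Ap = matvec(p)
        alpha = rs / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new < tol * tol:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

b = [1.0] * n
x = cg(b)
residual = max(abs(bi - axi) for bi, axi in zip(b, matvec(x)))
print(residual)
```

Because the matrix is well conditioned for moderate mu, the iteration converges in far fewer than n steps, which is the efficiency gain the remark alludes to.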
Remark 6.2. In Algorithm 4.1, for computing
, by (6.15) we only need linear computations:
(6.20)
noting that
have been obtained in previous steps.
7. Conclusion
It is well known that, in general, the value function of an optimal control problem is non-smooth, and studying the regularity properties of a non-smooth function is difficult. In this paper, we studied the regularity properties of the value function of an affine optimal control problem by solving the global minimization problem for the value function over the attainable set. We also provided a computational approach to this global minimization via a convection-diffusion equation. In future work, we may consider global optimization of non-smooth functions with the help of optimal control methods.