Symplectic Numerical Approach for Nonlinear Optimal Control of Systems with Inequality Constraints

This paper proposes a system representation for unifying control design and numerical calculation in nonlinear optimal control problems with inequality constraints in terms of the symplectic structure. The symplectic structure is derived from Hamiltonian systems that are equivalent to Hamilton-Jacobi equations. In the representation, the constraints can be described as an inputstate transformation of the system. Therefore, it can be seamlessly applied to the stable manifold method that is a precise numerical solver of the Hamilton-Jacobi equations. In conventional methods, e.g., the penalty method or the barrier method, it is difficult to systematically assign the weights of penalty functions that are used for realizing the constraints. In the proposed method, we can separate the adjustment of weights with respect to objective functions from that of penalty functions. Furthermore, the proposed method can extend the region of computable solutions in a state space. The validity of the method is shown by a numerical example of the optimal control of a vehicle model with steering limitations.


Introduction
An optimal control is one of the most important strategies in control design.Optimal control problems for non-linear systems can be formulated by Hamilton-Jacobi equations [1].Although a lot of approximate methods were presented (see [3]), a precise solving method of the equations had been left undeveloped for a long time.As it is now a definitive numerical solver called the stable manifold method was recently presented by [2], and it has been applied to many practical problems [3].The stable manifold method is a symplectic numerical scheme of equivalent Hamiltonian systems derived from the Hamilton Jacobi equations.
On the other hand, almost all actual control systems possess not only nonlinearity, but also constraints with respect to, e.g., inputs or state variables.Constrained optimal problems have attracted a lot of attentions, and many methods, e.g., the penalty method or the barrier method have been proposed for such systems [4] [5].The common strategy of such approaches is to create constraints by rapidly increasing nonlinear weights in cost functions for preventing states from violating constraints.This idea has been widely applied in various situations, and yielded many useful results, e.g., [6]- [12].However, in nonlinear optimal control problems with constraint, approaches based on numerical solutions obtained from the stable manifold method have not been sufficiently discussed yet to our knowledge.
This paper proposes the formulation for integrating inequality constraints in the nonlinear optimal regulator design using the stable manifold method.The formulation is derived from the input-output linearization technique [11] [13].We can apply the stable manifold method to input affine first order nonlinear system.For such a system, solutions can be numerically calculated even if analytical solutions are not available or their behaviors are very complex.Thus, the usual control design procedure of the stable manifold method can be seamlessly applied to constrained systems described by the proposed formulation.The optimality of the controllers can be proven in the same way of the barrier method.Indeed, we show that the change caused by the transformation can be interpreted as the addition of another penalty in objective functions.
Furthermore, our formulation has the following three advantages.First, a heuristic parameter tuning of weights is practically required for convergence of calculations in many cases.However, in conventional methods, the weights of error functions such as quadratic terms with respect to states or inputs and penalty functions are described as a single objective function.Thus, it is difficult to independently adjust each weight, because solutions in nonlinear optimizations are quite sensitive with respect to such a change.In our method, the constraint is described as a part of control systems by using the transformation between inputs and state variables; therefore, the weights can be separately adjusted.Second, in the barrier method (or the interior penalty method), it is not so easy to set initial states of numerical calculations in a constrained region in some cases.In the stable manifold method, optimal orbits are calculated from a stable point at the origin to the surrounding area of the origin in the inverse direction of time evolutions.Therefore, we can systematically search solutions intersecting a given initial condition.Third, cost functions with Lagrange multiplier are used for incorporating equality constraints in addition to penalty functions in the multiplier method.The method is based on a local optimality, i.e., the Karush-Kuhn-Tucker condition, and solutions are approximately calculated by an iterative calculation, e.g., the sequential quadratic programming method [14].In our method, optimal orbits obtained from the stable manifold method can be regarded as (semi-)global optimal solutions if they sufficiently cover a state space.Furthermore, there is no need to employ any approximation of solutions and any system reduction in the design.
This paper is constructed as follows: Section 2 states basic definitions of nonlinear optimal control problems and provides the brief introduction of our method.In Section 3, we first present a formulation of nonlinear optimal control problems with constraints in terms of differentiable saturation functions.Next, an augmented system including the constraints is introduced from the formulation.Then, we discuss three topics: the implementation of the system with the stable manifold method, the optimality of the formulation, and the extension to multi-input and multi-constraint cases.Finally, in Section 4, a numerical control example for a vehicle model with steering limitations modeled by a saturation function is illustrated to show the validity of our method.

Problem Setting and Brief Summary of Proposed Method
In this section, we make a brief summary of nonlinear optimal control problems and formalize a class of control systems with constraints described by inequalities.

Control Systems with Inequality Constraints
Let us consider the following nonlinear system: with an initial condition ( )  are the vector fields consisting of smooth functions, and the state and the input are denoted by, respectively, ( ) n x t ∈  and ( ) m u t ∈  .The vector representation of system (1) is as follows: Then, we consider the following constraint with respect to a state variable i x for some 1 i n where min 0 a < and max 0 a > are constants.Our method has been constructed with a mind to be applied to the stable manifold method [2].Hence, we assume the following conditions that are required for the application.

Nonlinear Optimal Regulator
We first recall the standard setting of nonlinear optimal control problems.
Problem 1. Find a control input u in (1) minimizing the objective function x Qx u Ru p f x g x u That is, Problem 1 can be rewritten as the problem of minimizing H.Then, from the stationary condition with respect to u, we obtain the optimal feedback as follows: ( ) ( ) By substituting (6) into (5) and using dynamic programming, we get the following Hamilton-Jacobi equation: If we can solve (7) with respect to x and p, then the solution ( ) , x p of (7) can determine the optimal gain ( ) p t in the feedback (6) for each state ( ) x t at a time 0 t ≤ < ∞ .The stable manifold method, which is the main tool in this paper, is an integral recursion formula for calculating solutions of the Hamiltonian systems that are transformed from the Hamilton-Jacobi Equation ( 7) that are directly difficult to solve (see Appendix).

Conventional Methods for Constrained Optimal Problems
The penalty method and the barrier method are extended optimization methods for treating constraints.In these methods, the penalty function ( ) P x and the barrier function ( ) are respectively added to the cost function (4), where S is the domain within the constraint, int S denotes the internal of S, and x S → ∂ means that the infimum of a certain norm between x and any y S ∈ ∂ tends to be zero.( ) P x becomes larger when x get more further away from the unconstrained region S; therefore, it is ex- pected that solutions avoid violating constraints.However, the constraint is not exactly guaranteed, and the behaviors are depend on the setting of weights multiplied to ( ) P x .( ) x is defined only in the internal of S. In- itial conditions in numerical calculations should be selected to satisfy this constraint, and this setting is not always easy.

Concept of Proposed Method
This section briefly summarizes the proposed method for applying the stable manifold method to nonlinear optimal problems with inequality constraints.The main idea of the method is to transform the problem as a standard problem for unconstrained systems.That is, we derive an input transformation that acts as the constraints with respect to a state i x from the input-state linearization [11] [13].Let us consider the original control system Σ , and the input transformer z Σ that is added to Σ (see Figure 1).Σ has the input u and the full-state output y, and z Σ has the external input d u and the output z y that is defined as the function u of x and z x that realizes the constraint (e.g., ( ) z η in Figure 2, details will be explained later) as follows: where z x is defined by using the virtual variable ( ) , , , , 0 , 0, 0, , 0,1 .
The input of Σ , i.e., the output of z Σ is designed for realizing the constraint (3).Hence, we only have to solve the standard nonlinear optimal control problem of the transformed system z Σ ⋅ Σ instead of that of the system Σ with the constraints.
Example 1.Let us consider the nonlinear system under the constraint On the other hand, by substituting the relation 2 x u =  of the system ( 13) into ( 14), the following relation is given: where z  in the right side of ( 15) has been regarded as a new input d u .Hence, we can augment the system (13) as follows: Furthermore, we can reduce this expression by eliminating 2 x as follows: ( ) This system actually includes the inequality constraint.Moreover, the system ( 16) is an input affine first order nonlinear system, and it satisfies Assumption 1.Hence, we can apply the stable manifold method to the nonlinear optimal control problem of ( 16) instead of ( 13) with the constraint.In this case, the objective function is given as where Q and R are some weight matrices, and we have defined the state variables

Formulation of Nonlinear Optimal Control Problem with Constraints for Stable Manifold Method
In this section, we formally state the formulation of the previously discussed basic concept for constrained systems, and then we prove the optimality of the formulation.Finally, we show that the formulation is generalized for multi-input-output systems with higher relative degrees.

Differentiable Saturation Function
For simplification, we first introduce the standard formulation for single-input-output systems (1) with the constraint (3) from [11].We shall introduce the following well-known definition [13] as a preparation for the generalization.Definition 1.Let y be the output of the system (1).We call ρ the relative degree of y if ρ is the minimum integer satisfying ≠ , where f L y is the Lie derivative of ( ) y x with respect to ( ) and we denote In this paper, we consider the following function describing inequality constraints.Definition 2. Let i y x = be the output of the system (1) with the relative degree ρ for some where ( ) ( ) ( ) for some 1 i n ≤ ≤ and ( ) are functions of the time t, the k-th order time derivative of i x and η are denoted by ( ) Proof.By integrating the both side of ( ) ( ) ( ) with respect to t, we get , where k C is some constant of integration.Here, if we set ( ) ( ) ( ) ( ) ( ) .
In the same way of the integration for each l, we can see that (20) is equal to (19). Thus, we use the relation (20) instead of the inequality constraint (3) in the nonlinear optimal control design.Remark 1.In Definition 2, the function η is defined as a bijection.The surjectivity can be relaxed by re- stricting a state space to some open subdomain including the origin.The injectivity is used for guaranteeing the uniqueness of solutions around the origin.If we don't assume injective, there exist a lot of z satisfying this relation at an initial time, e.g., considering a sinusoidal wave function as a constraint or in the case of systems with cyclic coordinates.Here, we assumed the injectivity to avoid such complexities.

Augmented System Representation Including Constraints
In this section, we clarify the representation of the connected system z Σ ⋅ Σ described in Section 2. Proposition 3. Let i y x = be the output of the system (1) with the relative degree ρ for some 1 i n ≤ ≤ .Let η be a ρ-th differentiable saturation function describing the constraint (3) with respect to i x .Then, for any natural number l ρ ≤ , we can describe as follows: where l φ is some function.Proof.By applying a time differentiation to (19), we get For 1 l = , (23) holds if 1 0 φ = .We assume that (23) holds for 1 l j = − .Then, the following relation holds: By regarding the first and second terms in (25) as j φ , we can see that (23) holds for l j = .Because η is C ρ - class, (23) holds for any l. Proposition 4. Consider the system (1) with the state i x constrained by (3) for some be the output of the system with the relative degree ρ .By regarding ( )   p z as a new input d u in the relation (20), we can augment the system (1) as follows: where we assume that the initial condition is ( ) ( ) ( ) ( ) ( ) and we have defined .
Proof.From the direct calculation of (23) for l ρ = , we can obtain , , .
Hence, we can determine u by using the second equality. In Proposition 4, the subsystem with respect to z corresponds with z Σ explained in Section 2. The system (26) can be reduced as follows.

Unifying Augmented Systems with Stable Manifold Method
Let us consider designing an optimal feedback (6) for the system (30) by using the stable manifold method.To apply the stable manifold method, we must check whether Assumption 1 holds.The first assumption obviously holds.
Lemma 1.Consider the constrained system (30).Then, ( ) Proof.In the original system (1), we assumed that ( ) From the definition of differentiable saturation functions, we obtain ( ) Y. Abe et al.

241
( )  Consequently, the following condition is obtained from the above fact.Theorem 5. Consider the nonlinear optimal regulator design of the system (2) that satisfies Assumption 1 with the constraint (3) in terms of the augmented system (30).The stable manifold method can be applied to the system if the linearized system of the augmented system (30) is stabilizable.
Proof.By Corollary 1 and Lemma 1, we can prove the applicability of the stable manifold method for the system.A stable manifold is defined by the adjoint variable ( ) for a certain solution ( ) V x of the Hamilton-Jacobi equation.If the linearized system (30) is stabilizable, there exists the stabilizing solution of the Riccati equation corresponding to the Hamilton-Jacobi equation, and there also exists a solution ( ) V x of the Hamilton-Jacobi equation such that ( ) ( ) ( ) ( ) ( ) − is asymptotic stable [15].
In this case, p derived from the stabilizing solution ( ) V x can be calculated by the stable manifold method. Furthermore, we can derive the following Hamiltonian system from the above representation in (30) for the constrained system.
Lemma 2. Equation (7) for the system (30) can be transformed into the equivalent Hamiltonian system One of the most important advantages of this implementation of the stable manifold method is the following symplectic property of the numerical scheme.Proposition 6.The numerical precision on the optimality of controllers obtained from the stable manifold method for the constrained optimal problem can be checked by the condition whether the Hamiltonian ( ) , H x p , i.e., the left-side of the Hamilton-Jacobi Equation ( 7) is sufficiently close to zero.
Proof.The Hamilton-Jacobi equation is defined by ( ) , 0 H x p = .Therefore, by using the representation in Lemma 2, if the Hamiltonian of the equivalent Hamiltonian system (33) is approximately zero, state variables of the Hamiltonian system can be regarded as a solution of the Hamilton-Jacobi equation.

Optimality of Problem Setting for Augmented Systems
This section shows that the formulation using the augmented system (30) is reasonable in the sense of optimal problems.Our purpose is to find u that is subject to minimize the cost function J in Problem 1.However, the costs with respect to if the i-th components i q of Q for all 1 i n ≥ + are sufficiently small, then a solution of Problem 2 gives an approximation of that of Problem 1 with arbitrary accuracy.
To prove the above theorem, we prepare the following facts.
where the sketch of the function W is illustrated in Figure 3, is the monotonic decreasing positive function such that ( ) ( ) is the monotonic increasing positive function such that ( ) ( ) Proof.Let i q be the i-th component of Q.Now, we can define : , where we used the relation in Lemma 3. Therefore, Equation (34) can be rewritten as because d R is sufficiently close to zero. Proof of Theorem 1. From Lemma 4, we can obtain the correspondence between ( 4) and ( 34) under the assumptions.On the other hand, in the barrier method, if there exist an optimal solution opt x ∞ is the solution of optimal problems with constraints.Hence, the solutions of Problem 2 are those of optimal problems with constraints.

Extension to Multi-Input and Multi-Constraint Cases
In this section, we describe the basic idea of extending the previous discussion on single input systems (i.e., 1 m = ) to multi-input systems (i.e., 1 m > ) in the framework of the stable manifold method.As the simplest case, if inputs are isolated with each other in their input-output relations, i.e., they do not have a common output, multi-input system representation can be defined by independently applying the same way of the single input case to each input with constraint.On the other hand, there might exist many inputs 1 2 , , u u  derived from a certain y by differentiations.In this case, we can actually employ the same procedure [11].Hence, we only explain the basic idea of the extension here.We first consider the following case of two inputs with different relative degrees for simplification.Proposition 8. Consider the system (1).Let y be the output of the system with the relative degree 1 ρ with respect to 1 u and the relative degree 2 ρ with respect to 2 u , where 1 2 ρ ρ ≠ and 1 u and 2 u are not related with each other through their derivatives.Then, the effect of 2 u is included in the first term of the right hand of the following equation: ( ) u appears after ( ) ; however, this term does not generate 1 u after further differentiations.
 The above discussion can be easily extended to the case of multi inputs with different relative degrees.Next, we consider the case when many inputs with the same relative degree are derived from a certain y by differentiations.In this case, u is not uniquely determined from (29), because 1 m × -matrix.However, we can choose a representative input by using the following procedure.
Definition 3. Let i α be the i-th component of ρ can be written as , , as in (29).There exists some j satisfying 0 j α ≠ in the left side of (41), because 1 0 Hence, we choose one of such j as 0 j , and define , , . As a result, in the multi constraint case, we only have to apply the above two extensions to each input, and the integration to the stable manifold method is basically the same construction of the single constraint case.

Numerical Example
This section shows the numerical result of the nonlinear optimal control for a vehicle model to demonstrate of the validity of the proposed method.

Control Model
Let us consider a 2-wheel vehicle model that is equivalent to a 4-wheel vehicle under the following assumptions: the characteristics of wheels are same, resistive forces except for the friction between tires and grounds are negligible, and the equilibrium point of the system is a state of a steady driving with a constant speed.The state equation of the model is given by sin cos 0 2 0 2 d cos , 0 d 1 0 where the state variables β , r, θ , and δ denote the slip angle at the center of gravity (COG), the yaw rate, the body angle, and the steering angle, respectively, and the following physical constant parameters are used: the mass 990 m = , the speed V, the moment of inertia 683 I = , the distance , , where the subscript i means the front wheel if i f = and the rear wheel if i r = , and we have defined the fric- tion constant 0.4 µ = between road surfaces and tires (wet condition), the experimental parameters 3.25 B = , 1.23 E = and 6.00 , and the gravitational acceleration 9.8 v g = .

Control Design
Let us design an optimal regulator for the vehicle model with the inequality constraint as an angle limitation of the front wheels.The purpose is to stabilize the state variables =  as an input, the relative degree of δ with respect to u is 1.Then, by setting d u z =  , u is eliminated from the system (43).Furthermore, by substituting δ into ( ) z η , δ is also eliminated.Hence, we have where , and we have used ( )

Numerical Results
We solved the nonlinear optimal control problem for the system (48) with the constraint (47) by the stable manifold method.The result was compared that of the penalty method with the penalty function ( ) , where 0 π 9 d = , and 1000 p n = by numerical experimentations.In this comparison, we reduced the difference between the both methods as far as possible by using the following small weights with respect to inputs: .
Figure 4 shows two finite sets of optimal orbits designed by the proposed method and the penalty method under the convergence condition that the Hamiltonian ( ) , H x p in ( 7) is sufficiently small: , where z has been replaced by δ in the axis of the figures for understandability.Figure 5 shows the compari- son result of single optimal orbits derived from the both method.A stable manifold is as a set of all solutions consisting of the pair of a controlled optimal orbit x and an optimal gain p along the time evolution.Thus, Figure 4 and Figure 5 shows the stable manifolds projected to the state space spanned by x, although stable manifolds lie on the space with x and p coordinates.We can see that the stable manifold of the proposed method covers a wider area that is equivalent to the stabilizable region in the state space.
Figure 6 shows the time responses using the conventional and proposed method, respectively, for the initial condition in the common area of the projected stable manifolds in , respectively.The optimality was improved by the proposed method, but the difference between them seems to be small.However, this comes as a result of the setting that the weights were chosen as a small value.
On the other hand, Figure 7 shows the time responses for the initial condition in the area where is quite near (i.e., the outside of) the projected stable manifold of the conventional method in the left graph of Figure 4, and also in the projected stable manifold of the proposed method in the right graph of  .The motion within 0.25 seconds in the case of the penalty method oscillates; however, the tire is slipping and the stability did not lose.In this case, the values of the cost functions are 0.7898 J = and 0.7814 J = , respectively, and the optimality was more improved from the previous example.Finally, Figure 8 shows the time responses for the initial condition in the area where is far from the projected stable manifold of the conventional method in the left graph of .Then, the conventional method caused the unstable motion as in Figure 8. Almost all trials using the penalty method in this area not only lost optimality, but also failed stabilization.

Conclusion and Future Work
This paper presented a control system representation that seamlessly can be applied to a precise numerical solver of Hamilton-Jacobi equations called the stable manifold method in the optimal regulator design for nonlinear systems with inequality constraints.The representation was derived from the slack variable method and the input-output linearization technique without any approximation of solutions and any system reduction.We clarified the following four facts: how to integrate the representation to stable manifold method, the weight adjustments of error and penalty functions which can be separated in this formulation, the optimality of the formulation, and the extension of the representation to multi-input and multi-constraint cases.Finally, the validity of the method was shown by the numerical example of a vehicle model with steering limitations modeled by a saturation function.
The extensions of our method to time-variable or state-dependent constraints, practical numerical simulations using more detailed vehicle models, and discussions on the robustness of the controller obtained from our method are possible future works.
where the weights Q and R are diagonal matrices such that 0 Q ≥ and 0 R > .The Hamiltonian H in the variational calculus of the above problem can be introduced by using Lagrange multiplier n p ∈  as follows:

Figure 1 .
Figure 1.Input transformation of control systems.

in Figure 2 ,
. By using a smooth function ( ) z η which will be formally defined as a differentiable saturation function in the next section, with the range (  and a new state z ∈  , the inequality constraint can be written by equivalent equation of the description (see Proposition 1):

Problem 2 . 7 .
are not included in J at present.Thus, we define a new objective function as follows.Find a control input u in (1) minimizing the objective function for the augmented system (30), where the weights Q , R and d R are diagonal matrices such that 0 Q ≥ , 0 R > and 0 d R > .The purpose of the term d d d u R u Τ is to converge the input d u to regions that are close to the bounds of the constraints.Thus, if d R is sufficiently small, the effect of this term becomes negligible in the sense of the original optimal problem.Theorem If the weight d R in (34) and the parameter  of η in Definition 2 are sufficiently small, and

Lemma 4 .
If the i-th components i q of Q for all 1 i n ≥ + are sufficiently small, the cost function (34) can be considered as the cost function (4) with the nonlinear weight
of the system (1) with the relative degree ρ .Let η be a ρ -th saturation function describing the constraint (3) with respect to i x .Then, the system with the input (42) satisfies the constraint for any input constructed in the same way in Proposition 4. Note that other inputs i u except for 0 i j = are canceled by the first term in (42).
0 under a constant speed V.The system (43) satisfies Assumption 1.By using a differentiable saturation function η and a new va- riable z, we describe the limitation min max differentiation of the both side of the relation, we obtain saturation function.Indeed, the model (48) is stabilizable.

Figure 4
in the projected stable manifold of the proposed method in the right graph of Figure 4:

Figure 5 .
Figure 5.Comparison of optimal orbits of penalty method (blue) and proposed method (red).

Y. Abe et al. 246 Figure 6 .
Figure 6.Time response for initial condition in stable manifold (left: penalty method, right: proposed method).

Figure 7 .
Figure 7. Time response for initial condition near stable manifold (left: penalty method, right: proposed method).

Figure 8 .
Figure 8.Time response for initial condition outside of stable manifold (left: penalty method, right: proposed method).