Hamiltonian Servo: Control and Estimation of a Large Team of Autonomous Robotic Vehicles

This paper proposes a novel Hamiltonian servo system, a combined modeling framework for the control and estimation of a large team/fleet of autonomous robotic vehicles. The Hamiltonian servo framework represents a high-dimensional, nonlinear and non-Gaussian generalization of the classical Kalman servo system. After defining the Kalman servo as a motivation, we define the affine Hamiltonian neural network for adaptive nonlinear control of a team of UGVs in continuous time. We then define a high-dimensional Bayesian particle filter for the estimation of a team of UGVs in discrete time. Finally, we formulate a hybrid Hamiltonian servo system by combining the continuous-time control and the discrete-time estimation into a coherent framework that works like a predictor-corrector system.


Introduction
Today's military forces face an increasingly complex operating environment, whether dealing with humanitarian operations or in the theater of war. In many situations one has to deal with a wide variety of issues, from logistics to a lack of communications infrastructure [1]. One may even be denied satellite access, and hence be unable to use services such as GPS for geo-location, or to reconnect back to HQ [2].
Given the increasing success of artificial intelligence (AI) techniques in recent times and the advancements made in unmanned vehicle (UV) design, autonomous systems, in particular teams or swarms of autonomous vehicles, have become a potentially attractive solution to the complexity issues facing the military today. Teams or swarms of intelligent autonomous vehicles would also introduce a level of survivability to the systems, which would increase their utility in such complex and hostile environments [3].
A problem which arises in utilizing large teams or swarms of autonomous vehicles is one of control. Recently, the US Defense Advanced Research Projects Agency (DARPA) has sought to address some of the control issues associated with large swarms through its Offensive Swarm-Enabled Tactics (OFFSET) program [4].
There is therefore a clear need for both local and global control of large teams or swarms of UVs. This would allow for autonomous or semi-autonomous operation of large teams of UVs that would be easily manageable.
This paper proposes a Hamiltonian servo system, a combined modeling framework for the control and estimation of such autonomous teams of UVs. We demonstrate the ability of this framework to handle the high-dimensional, nonlinear and non-Gaussian nature of the problem at hand, and show it to be a generalization of the classical Kalman servo system. Finally, the results of a number of simulations are presented to demonstrate the validity of the framework, through the authors' implementation in Mathematica and C++ (provided in the Appendix).
Slightly more specifically, we recall here that traditional robotics of the 1970s-80s was mainly focused on building humanoid robots, manipulators and legged locomotion robots. Its modeling framework was linear dynamics and linear control, that is, linearized mechanics of multi-body systems (derived using Newton-Euler, Lagrangian, Gibbs-Appell or kinetostatic equations of motion), controlled by Kalman's linear quadratic controllers (for a comprehensive review, see [5]-[11]). The pinnacle of this approach to robotics in the last decade has been the famous Honda robot ASIMO (see [12]), with a related Hamiltonian biomechanical simulator [13].
In contrast, contemporary robotics research has mainly been focused on field robotics, based on estimation rather than control, resulting in a variety of simultaneous localization and mapping (SLAM) algorithms (see [14]-[18] and the OpenSLAM web-site [19]).
Instead of following one of these two sides of robotics, in the present paper we try to combine them. From a robotics point of view, control and estimation are two complementary sides, or necessary components, of the efficient manipulation of autonomous robotic vehicles. In this paper we develop a computational framework (designed in Wolfram Mathematica and implemented in C++) that unifies both of these components. From a mathematical point of view, we provide a nonlinear, non-Gaussian and high-dimensional generalization of the classical Kalman servo system, outlined in the next section.

Motivation: Kalman Servo
In the 1960s, Rudolf Kalman developed the concept of a controller-estimator servo system, built on three vector spaces: the state $x(t) \in \mathbb{R}^n$ is an $n$-vector of state variables, the input $u(t) \in \mathbb{R}^m$ is an $m$-vector of inputs, and the output $y(t) \in \mathbb{R}^k$ is a $k$-vector of outputs; $x_0$ is the initial state vector. The matrix variables are the following maps:

i) The $n \times n$-dimensional state dynamics:
$$\dot{x}(t) = A\,x(t) + B\,u(t), \qquad y(t) = C\,x(t), \qquad x(0) = x_0; \eqno(1)$$

ii) The Kalman feedback controller, the optimal LQR-controller:¹
$$u(t) = -K_c\,x(t), \eqno(2)$$
with the controller gain matrix $K_c = R^{-1}B^{T}P_c$, where $P_c$ is determined by the corresponding controller Riccati ODE, and $Q$ and $R$ are the weight matrices from the quadratic cost function:
$$J = \int_0^{\infty}\left(x^{T}Q\,x + u^{T}R\,u\right)\mathrm{d}t; \eqno(3)$$

iii) The Kalman filter, the optimal LQG-estimator,² driven by the (additive, zero-mean, delta-correlated) Gaussian white noise $\eta$:
$$\dot{\hat{x}}(t) = A\,\hat{x}(t) + B\,u(t) + K_f\left[y(t) - C\,\hat{x}(t)\right], \eqno(4)$$
with the filter gain matrix $K_f = P_f\,C^{T}R_\eta^{-1}$, determined by the matrix $P_f$ governed by the filter Riccati ODE:
$$\dot{P}_f = A\,P_f + P_f A^{T} - P_f C^{T} R_\eta^{-1} C\,P_f + Q_\eta, \qquad P_f(0) = P_f^0,$$
where $P_f^0$ is the initial filter matrix and $Q_\eta, R_\eta$ are the covariance matrices from the noisy $\eta$-generalization of the quadratic cost function (3).

¹The acronym LQR means linear quadratic regulator. ²The acronym LQG means linear quadratic Gaussian.

Defined in this way, Kalman's controller-estimator servo system (Figure 1) has a two-cycle action, similar to a predictor-corrector algorithm: in the first time-step the servo controls the plant (e.g., a UGV), in the second step it estimates the plant's $(x, y)$ coordinates, in the next step it corrects the control, then it estimates again, and so on.
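The two-cycle action of the Kalman servo can be sketched in C++ for a scalar plant. The plant coefficients, the noise covariances and the fixed feedback gain below are purely illustrative choices (not taken from any model in this paper), and a constant gain stands in for the full Riccati-based gains:

```cpp
#include <cmath>

// Minimal sketch of Kalman's controller-estimator servo for a scalar plant
//   x' = a*x + b*u,   y = c*x   (Euler-discretized with step dt).
// All coefficients (a, b, c, q, r, k_c) are illustrative assumptions.
struct KalmanServo {
    double a, b, c;      // plant, input and output coefficients
    double q, r;         // process and measurement noise covariances
    double k_c;          // fixed LQR-style feedback gain, u = -k_c * xhat
    double xhat = 0.0;   // state estimate
    double p = 1.0;      // estimate covariance

    // Corrector: Kalman measurement update of (xhat, p) from observation y.
    void estimate(double y) {
        double k_f = p * c / (c * c * p + r);   // filter gain
        xhat += k_f * (y - c * xhat);           // innovation correction
        p = (1.0 - k_f * c) * p;
    }

    // Predictor: apply feedback control and propagate the estimate one step.
    double control(double dt) {
        double u = -k_c * xhat;                 // LQR-style feedback
        xhat += dt * (a * xhat + b * u);        // time update (mean)
        p += dt * (2.0 * a * p + q);            // time update (covariance)
        return u;
    }
};
```

Alternating `estimate` and `control` reproduces the two-cycle control/estimate action described above.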
In the remainder of this paper we will develop a nonlinear, non-Gaussian and high-dimensional generalization of the Kalman servo.

Control of an Autonomous Team of UGVs: Affine Hamiltonian Neural Network
Now we switch from matrix notation to index notation ($\alpha, \beta = 1, \dots, n$), to label the positions of the $n \times n$ individual UGVs within the swarm's global plane coordinates, longitude and latitude, respectively. The first step in the nonlinear generalization of the Kalman servo is to replace the linear, low-dimensional state dynamics (1) and control (2)-(3) with a nonlinear, adaptive and high-dimensional dynamics and control system, as follows.
In the recent paper [23], we proposed a dynamics and control model for a joint autonomous land-air operation, consisting of a swarm of unmanned ground vehicles (UGVs) and a swarm of unmanned aerial vehicles (UAVs). In the present paper we focus on a swarm of UGVs only, from both the control and the estimation perspective. It can be modeled as an affine Hamiltonian control system (see [24]) with many degrees-of-freedom (DOFs), reshaped in the form of a $(q, p)$-pair of attractor matrix equations with nearest-neighbor couplings, defined by the matrix of affine Hamiltonians $H_{\alpha\beta}(q,p,u)$. This pair of attractor matrix equations, Equations (5)-(6), defines the activation dynamics of a bidirectional recurrent neural network, where the superscripts $(q, p)$ denote the corresponding Hamiltonian equations and the partial derivative is abbreviated as $\partial_u H \equiv \partial H / \partial u$.³

Equations (5)-(6) are briefly derived as follows (for technical details, see [23] and the references therein). Given the swarm configuration manifold $M$ (a product of the Euclidean groups of motion of the individual robotic vehicles), the affine Hamiltonian function $H : T^*M \times \mathbb{R}^m \to \mathbb{R}$ is defined by (see [24]):
$$H(q,p,u) = H_0(q,p) - \sum_j u_j(t,q,p)\,H^j(q,p), \eqno(7)$$
where $H_0(q,p) : T^*M \to \mathbb{R}$ is the physical Hamiltonian function (kinetic plus potential energy of the swarm) and the control inputs $u_j(t,q,p)$ are defined using recurrent Lie derivatives, Equation (8).

³For simplicity, we are dealing with a square $(n \times n)$-matrix of UGV configurations. The whole approach also works for a rectangular matrix.
Using (7) and (8), the affine Hamiltonian control system can be formulated as
$$\dot{q}^i = \partial H / \partial p_i + \partial D / \partial p_i + v^i(t,q,p), \eqno(9)$$
$$\dot{p}_i = F_i(t,q,p) - \partial H / \partial q^i + \partial D / \partial q^i, \eqno(10)$$
where $D$ is Rayleigh's dissipative function, while $v^i(t,q,p)$ and $F_i(t,q,p)$ are velocity and force controllers, respectively.
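A minimal single-DOF sketch of such an affine Hamiltonian control system with Rayleigh dissipation can be integrated with the semi-implicit (symplectic) Euler scheme. The toy Hamiltonian $H_0 = p^2/2 + q^2/2$, the affine coupling $H_1(q) = q$ and the damping coefficient below are illustrative assumptions only, not the swarm model:

```cpp
#include <cmath>

// Hedged sketch of a single-DOF affine Hamiltonian control system
//   H(q, p, u) = H0(q, p) - u * H1(q),  with toy H0 = p^2/2 + q^2/2,
// integrated by semi-implicit (symplectic) Euler with Rayleigh damping.
// H1(q) = q and gamma are illustrative choices.
struct AffineHamiltonian {
    double q = 1.0, p = 0.0;
    double gamma = 0.3;                 // Rayleigh dissipation coefficient

    // One integration step under control input u; returns the energy H0.
    double step(double u, double dt) {
        // pdot = -dH0/dq + u * dH1/dq - gamma * p   (here dH1/dq = 1)
        p += dt * (-q + u - gamma * p);
        q += dt * p;                    // qdot = dH0/dp
        return 0.5 * (p * p + q * q);   // current physical energy
    }
};
```

With a constant input $u$, the damped system settles at the equilibrium $q = u$, $p = 0$, illustrating the attractor-like role the control terms play in the swarm equations.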
The affine Hamiltonian control system (9)-(10) can be reshaped into a matrix form suitable for a UGV swarm or a planar formation of a UAV swarm, by evaluating the dissipation terms into cubic terms in the brackets, and by replacing the velocity and force controllers with the attractors $A^q$ and $A^p$, respectively (with strength $\varphi$). If we add the synaptic weights $\omega_{\alpha\beta}$, we arrive at our recurrent neural network Equations (5)-(6).
Next, to make Equations (5)-(6) adaptive, we use the abstract Hebbian learning scheme:
New Value = Old Value + Innovation,
which in our setting formalizes as the weight dynamics (11)-(12) (see [25]), driven by the error function $e(t) = q(t) - q^{\mathrm{ref}}(t)$, where the innovations are given in matrix signal form (generalized from [26]) in terms of the sigmoid signal functions $S(q)$, $S(p)$ and their derivatives. The affine Hamiltonian control system (5)-(6) defined in this way, which is also a recurrent neural network with Hebbian learning (11)-(12), represents our nonlinear, adaptive and high-dimensional generalization of Kalman's linear state dynamics (1) and control (2)-(3).
Using the Wolfram Mathematica code given below, the Hamiltonian neural net was simulated for 1 sec, with matrix dimension $n = 10$ (i.e., with 100 neurons in each of the matrices $q_{\alpha\beta}$ and $p_{\alpha\beta}$, which are the longitudes and latitudes of a large fleet of 100 UGVs; see Figures 2-4). The Figures illustrate both the stability and the convergence of the synaptic weights $\omega_{\alpha\beta}$, together with their spreading from zero initial conditions. These plots are the simulation output of the recurrent activation dynamics given by Equations (5)-(6), with the Hebbian learning dynamics given by Equations (11)-(12), starting from zero initial coordinates and momenta and random initial synaptic weights; the code uses hyperbolic tangent activation functions and their derivatives.
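Independently of the Mathematica code, the coupled activation and Hebbian learning dynamics can be sketched in plain C++. The couplings below are a deliberately simplified, illustrative form of Equations (5)-(6) and (11)-(12): the attractors are reduced to constants and the nearest-neighbor terms are omitted for brevity:

```cpp
#include <array>
#include <cmath>

// Simplified sketch of the matrix recurrent net with Hebbian adaptation:
//   activation:  qdot = A_q + w * tanh(p),  pdot = A_p - w * tanh(q)
//   learning:    wdot = -w + tanh(q) * tanh(p)   ("new = old + innovation")
// These concrete couplings are illustrative; the paper's Eqs. (5)-(6) and
// (11)-(12) additionally carry nearest-neighbor and attractor-field terms.
constexpr int N = 10;   // a 10 x 10 fleet, 100 UGV coordinates per matrix

struct HamiltonianNet {
    std::array<std::array<double, N>, N> q{}, p{}, w{};   // zero-initialized

    void step(double Aq, double Ap, double dt) {
        for (int a = 0; a < N; ++a)
            for (int b = 0; b < N; ++b) {
                double sq = std::tanh(q[a][b]), sp = std::tanh(p[a][b]);
                q[a][b] += dt * (Aq + w[a][b] * sp);      // activation (q)
                p[a][b] += dt * (Ap - w[a][b] * sq);      // activation (p)
                w[a][b] += dt * (-w[a][b] + sq * sp);     // Hebbian innovation
            }
    }
};
```

Because the tanh signals are bounded and the weight equation contains a leak term, the synaptic weights remain bounded while spreading out from zero, mirroring the converging behavior seen in Figure 4.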

Estimation of an Autonomous Team of UGVs: High-Dimensional Bayesian Particle Filter
The second step in the nonlinear generalization of the Kalman servo is to replace the linear/Gaussian Kalman filter (4) with a general nonlinear/non-Gaussian Monte Carlo filter, as follows.

Recursive Bayesian Filter
Recall that the celebrated Bayes rule relates three basic conditional probabilities: i) the Prior $P(A)$ (an initial degree-of-belief in event A, that is, the Initial Odds); ii) the Likelihood $P(B \mid A)$ (a degree-of-belief in event B given A, that is, the New Evidence); and iii) the Posterior $P(A \mid B)$ (the Final Odds):
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}. \eqno(13)$$
In statistical settings, Bayes rule (13) can be rewritten as:
$$p(x_t \mid y_{1:t}) = \frac{p(y_t \mid x_t)\,p(x_t \mid y_{1:t-1})}{Z_t}, \eqno(15)$$
where $Z_t$ is the normalizing constant, or partition function, defined by:
$$Z_t = \int p(y_t \mid x_t)\,p(x_t \mid y_{1:t-1})\,\mathrm{d}x_t. \eqno(16)$$
In the Bayesian filter (15)-(16), given the sequence of observations $y_{1:t}$, the posterior fully defines the available information about the state of the system and the noisy environment: the filter propagates the posterior (embodying time and measurement updates at each iteration) across the nonlinear state-space model, and it is therefore the maximum a posteriori estimator of the system state. In the special case of a linear system and a Gaussian environment, the filter (15)-(16) has an optimal closed-form solution, which is the Kalman filter (4). In the general case of nonlinear dynamics and/or a non-Gaussian environment, however, it is no longer feasible to search for closed-form solutions of the integrals in (15)-(16), so we are left with suboptimal, approximate, Monte Carlo solutions, called particle filters.
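One recursion of the Bayesian filter (15)-(16) can be illustrated on a discrete grid, where the integral defining the partition function $Z_t$ reduces to a sum. The grid and the Gaussian likelihood below are illustrative assumptions:

```cpp
#include <cmath>
#include <vector>

// Hedged sketch of one Bayes update on a discrete state grid:
//   posterior(x) = likelihood(y | x) * prior(x) / Z,
// with Z the partition function of Eq. (16) reduced to a finite sum.
// The Gaussian likelihood with std-dev sigma is an illustrative choice.
std::vector<double> bayes_update(const std::vector<double>& prior,
                                 const std::vector<double>& grid,
                                 double y, double sigma) {
    std::vector<double> post(prior.size());
    double Z = 0.0;                                    // partition function
    for (std::size_t i = 0; i < prior.size(); ++i) {
        double d = (y - grid[i]) / sigma;
        post[i] = std::exp(-0.5 * d * d) * prior[i];   // likelihood * prior
        Z += post[i];
    }
    for (double& v : post) v /= Z;                     // normalize to sum 1
    return post;
}
```

Starting from a uniform prior, a single observation concentrates the posterior mass on the grid points nearest the measurement, exactly the corrector action of Equation (15).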

Sequential Monte Carlo Particle Filter
Now we consider a general, discrete-time, nonlinear, probabilistic SSM, defined by a Markov process $\{x_t\}_{t \ge 1} \subseteq \mathbb{R}^{n_x}$, which is observed indirectly via a measurement process $\{y_t\}_{t \ge 1} \subseteq \mathbb{R}^{n_y}$. This SSM consists of two probability density functions: the dynamics $f(\cdot)$ and the measurement $h(\cdot)$. Formally, we have a system of nonlinear difference equations, given both in Bayesian probabilistic formulation (left-hand side) and in nonlinear state-space formulation (right-hand side):
$$x_{t+1} \sim p(x_{t+1} \mid x_t) \quad \Longleftrightarrow \quad x_{t+1} = f(x_t) + \epsilon_t, \eqno(17)$$
$$y_t \sim p(y_t \mid x_t) \quad \Longleftrightarrow \quad y_t = h(x_t) + \nu_t, \eqno(18)$$
with initial state $x_1 \sim \mu(x_1)$, where $f$ and $h$ are the state and output vector-functions, $\epsilon_t$ and $\nu_t$ are non-Gaussian process and measurement noises, and $\mu$ is a given non-Gaussian distribution. The state filtering problem is to recover information about the current state $x_t$ in (17) based on the available measurements $y_{1:t}$ in (18) (see, e.g., [29]; for a recent review, see [30] and the references therein).
The principal solution to the nonlinear/non-Gaussian state filtering problem is provided by the following recursive Bayesian measurement and time update equations:
$$p(x_t \mid y_{1:t}) = \frac{p(y_t \mid x_t)\,p(x_t \mid y_{1:t-1})}{p(y_t \mid y_{1:t-1})}, \eqno(19)$$
$$p(x_t \mid y_{1:t-1}) = \int p(x_t \mid x_{t-1})\,p(x_{t-1} \mid y_{1:t-1})\,\mathrm{d}x_{t-1}, \eqno(20)$$
where, in the time update (20), $p(x_t \mid x_{t-1})$ is the dynamics, $p(x_{t-1} \mid y_{1:t-1})$ is the old filter, and the left-hand side is the predictor. In the particular case of linear/Gaussian SSMs (17)-(18), the state filtering problem (19)-(20) has an optimal closed-form solution for the filter PDF $p(x_t \mid y_{1:t})$, given by the Kalman filter (4). In the general case of nonlinear/non-Gaussian SSMs, however, such an optimal closed-form solution does not exist.
The general state filtering problem (19)-(20) associated with a nonlinear/non-Gaussian SSM (17)-(18) can only be numerically approximated, using sequential Monte Carlo (SMC) methods. The SMC approximation to the filter PDF, denoted $\hat{p}(x_t \mid y_{1:t})$, is an empirical distribution represented as a weighted sum of Dirac delta functions:
$$\hat{p}(x_t \mid y_{1:t}) = \sum_{i=1}^{N} w_t^i\,\delta(x_t - x_t^i), \eqno(21)$$
where the samples $\{x_t^i\}_{i=1}^N$ are called particles (point-masses 'spread out' in the state space); each particle $x_t^i$ represents one possible system state and the corresponding weight $w_t^i$ assigns its probability. The particle filter (PF) defined in this way plays the role of the Kalman filter for nonlinear/non-Gaussian SSMs.
The PF approximates the filter PDF $p(x_t \mid y_{1:t})$ using the SMC delta-approximation (21).
Briefly, a PF can be interpreted as a sequential application of the SMC technique called importance sampling (IS; see, e.g., [31]). It starts from the initial approximation
$$\hat{p}(x_1) = \sum_{i=1}^{N} w_1^i\,\delta(x_1 - x_1^i),$$
where the particles $\{x_1^i\}_{i=1}^N$ are sampled independently from some proposal distribution $r(x_1)$. To account for the discrepancy between the proposal distribution and the target distribution, the particles are assigned importance weights, given by the ratio between the target and the proposal, $w_1^i \propto p(x_1^i)/r(x_1^i)$, where the weights are normalized to sum to one (for a recent technical review see, e.g., [30]).
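The importance-sampling step can be sketched as follows; the target density (a unit-variance Gaussian centered at 2, known only up to a constant) and the wider Gaussian proposal are illustrative choices, and the normalizing constants cancel in the self-normalized weights:

```cpp
#include <cmath>
#include <random>

// Hedged importance-sampling sketch: estimate E[x] under a target density
// known up to a constant (here proportional to exp(-(x-2)^2/2), i.e. N(2,1)),
// using samples from a wider proposal N(0,3). All densities are illustrative.
double is_mean(int n, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> proposal(0.0, 3.0);
    double num = 0.0, den = 0.0;
    for (int i = 0; i < n; ++i) {
        double x = proposal(gen);
        double target = std::exp(-0.5 * (x - 2.0) * (x - 2.0));  // unnormalized
        double prop   = std::exp(-0.5 * x * x / 9.0) / 3.0;      // up to 1/sqrt(2*pi)
        double w = target / prop;    // importance weight = target / proposal
        num += w * x;
        den += w;                    // self-normalization replaces summing to one
    }
    return num / den;                // self-normalized IS estimate of E[x]
}
```

Since the true target mean is 2, the estimate converges there as the number of particles grows.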
The IS then proceeds in an inductive fashion: the previous filter approximation, inserted into the time-update Equation (20), gives a mixture distribution (22), approximating the predictor PDF $p(x_t \mid y_{1:t-1})$. Subsequent insertion of (22) as the predictor into the measurement-update Equation (19) gives the corresponding approximation of the filter PDF. Since the resulting weighted particle set $\{x_t^i, w_t^i\}_{i=1}^N$ can be used to iteratively approximate the filter PDF at all future times, this completes the PF algorithm, with an overall computational complexity of $O(N)$ (as pioneered by [31]; for more technical details see, e.g., [30] and the references therein).
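A complete bootstrap PF iteration (predict, weight, resample) can be sketched in C++ for a scalar SSM. The toy dynamics, the noise levels and the particle count below are illustrative assumptions, not the model used in the Appendix code:

```cpp
#include <cmath>
#include <random>
#include <vector>

// Hedged bootstrap particle filter sketch for a scalar nonlinear SSM:
//   x_{t+1} = 0.9*x_t + 0.1*sin(x_t) + process noise,   y_t = x_t + meas noise.
// Each step: propagate particles (time update), weight them by the Gaussian
// likelihood of y_t (measurement update), then resample. Model is illustrative.
struct ParticleFilter {
    std::vector<double> x;   // particles
    std::mt19937 gen;

    explicit ParticleFilter(int n, unsigned seed) : x(n, 0.0), gen(seed) {}

    // One predict-weight-resample cycle; returns the filter-mean estimate.
    double step(double y) {
        std::normal_distribution<double> pnoise(0.0, 0.1);
        std::vector<double> w(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] = 0.9 * x[i] + 0.1 * std::sin(x[i]) + pnoise(gen);  // predict
            double d = y - x[i];
            w[i] = std::exp(-0.5 * d * d / 0.04);   // likelihood, sigma = 0.2
        }
        std::discrete_distribution<int> pick(w.begin(), w.end());
        std::vector<double> xr(x.size());
        for (std::size_t i = 0; i < x.size(); ++i) xr[i] = x[pick(gen)];
        x.swap(xr);                                 // resample
        double mean = 0.0;
        for (double xi : x) mean += xi;
        return mean / static_cast<double>(x.size());
    }
};
```

The resampling step keeps the complexity at $O(N)$ per iteration while preventing weight degeneracy, which is exactly the role it plays in the algorithm above.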
We remark that the pinnacle of PF theory is the so-called Rao-Blackwellized (or marginalized) PF, a special kind of factored PF in which the state-space dynamics (17) is split into a linear/Gaussian part and a nonlinear/non-Gaussian part, so that each particle has an optimal linear-Gaussian Kalman filter associated with it (see [32]-[40]). However, in our case of highly nonlinear and adaptive dynamics, Rao-Blackwellization gives no advantage, while its mathematics is significantly heavier than that of an ordinary PF.

Hamiltonian Servo Framework for General Control and Estimation
The Hamiltonian servo system is our hybrid global control-estimation framework for a large team/fleet of UGVs (e.g., $10 \times 10 = 100$), consisting of the following components: i) Adaptive Control, which is nonlinear and high-dimensional, defined by the Hamiltonian neural network with Hebbian learning, Equations (23)-(24) (the matrix form of Equations (5)-(6) and (11)-(12)); ii) a Measurement process for the vehicle coordinates; and iii) recursive Bayesian Estimation, defined by the particle filter, Equations (25)-(27), in which bracketed superscripts $(i), (n)$ label the IS weights. This Hamiltonian servo framework has been implemented in C++ (using Visual Studio 2015; see Appendix 1.2). A sample simulation output of the servo system is given in Figure 5: the plot shows the controlled dynamics governed by the Hamiltonian neural network (23)-(24), together with the recurrent estimation (larger amplitudes), governed by the Bayesian particle filter (25)-(27). The Figure demonstrates the basic stability and convergence of the servo system.
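The two-cycle predictor-corrector structure of the servo loop can be condensed into a minimal C++ skeleton for a single vehicle coordinate. The gains are illustrative, and the one-line innovation correction merely stands in for the full particle-filter estimation step:

```cpp
// Schematic two-cycle Hamiltonian-servo loop for one vehicle coordinate:
// a continuous-time control step (Euler-integrated) alternates with a
// discrete-time estimation step. Both gains are illustrative assumptions;
// in the full framework the corrector is the Bayesian particle filter.
struct ServoLoop {
    double q_hat = 0.0;   // estimated coordinate

    // Predictor: attractor-style feedback toward the target waypoint,
    // propagating the estimate one continuous-time step dt.
    double control_step(double target, double dt) {
        double u = 1.5 * (target - q_hat);
        q_hat += dt * u;
        return u;
    }

    // Corrector: discrete-time innovation update from the measurement y.
    void estimate_step(double y) {
        q_hat += 0.5 * (y - q_hat);
    }
};
```

Alternating `control_step` and `estimate_step` reproduces, in miniature, the control-estimate-correct cycle of the full Hamiltonian servo.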

Conclusion
In this paper we have outlined a novel control and estimation framework called the Hamiltonian servo system. The framework was shown to be a generalization of the classical Kalman servo system. Using the example of a large team/fleet of 81 ($9 \times 9$) unmanned ground vehicles (UGVs), it was shown that the framework provides the required control and estimation. The approach encompasses a three-tiered system of adaptive control, a measurement process and recursive Bayesian filtering, which was demonstrated to enable control and estimation of high-dimensional, nonlinear and non-Gaussian systems.

High-Dimensional Bayesian Particle Filter
The following C++ code uses the Armadillo matrix library [41].

Figure 1. Kalman's fundamental controller-estimator servo system, including the state-space dynamics, the Kalman filter and the feedback controller. The first equation in (1) is called the state or dynamics equation, and the second equation in (1) is called the output or measurement equation.

The field attractor lines $A^q$ and $A^p$ (with the field strength $\varphi$) are defined in the simulated urban environment using the A-star algorithm, which finds the shortest Manhattan distance from point A to point B on the environmental digital map (i.e., street directory).

⁴Here the $s_c$ are the coefficients of the linear ODE of order $r$ for the error function $e_j(t) = q_j(t) - q_j^{\mathrm{ref}}(t)$.

The figure illustrates the stability and convergence of the recurrent Hamiltonian neural net. C++ code for the Hamiltonian neural net is given in Appendix 1.1.

Figure 2. Adaptive control for a large fleet (a $10 \times 10$ matrix) of 100 UGVs: time evolution of the coordinates $q_{\alpha\beta}(t)$.

Figure 4. Adaptive control for a large fleet (a $10 \times 10$ matrix) of 100 UGVs: time evolution of the synaptic weights $\omega_{\alpha\beta}(t)$. As expected, we can see the converging behavior.

Figure 5. Sample output of the Hamiltonian servo framework applied to a large fleet (a $9 \times 9$ matrix) of 81 UGVs. As expected, the plot shows almost periodic time evolution of the coordinates $q_{\alpha\beta}(t)$.

The final step is to assign the importance weights to the new particles, under the Gaussian measurement noise. Note that here we use the inverse dynamics (given the coordinates, calculate the velocities, accelerations and forces) as a bridge between the robots' self-localization and their control: we measure the coordinates $q_{\alpha\beta}$ at discrete time steps $t$ and from them calculate the momenta $p_{\alpha\beta}$.