1. Introduction
In Bayesian statistics, the parameters of a model are treated as random variables with probability distributions, and the posterior distribution is computed by Bayes’ theorem.
In a Hamiltonian dynamical system, time evolution is defined by Hamilton's equations and expressed by canonical transformations (symplectic diffeomorphisms) of the phase space. Abstractly, phase spaces are symplectic manifolds and equations of motion are Hamiltonian vector fields. Under the time evolution of a Hamiltonian dynamical system, the Hamiltonian function and the phase volume are preserved. These are direct consequences of the skew-symmetry of the symplectic structure. When the dimension of the phase space is greater than or equal to 4, there are further conserved quantities called symplectic capacities. Symplectic capacities are far from trivial and are a deep result in symplectic geometry. For details, see [1] [2] [3] [4].
In this paper we prove that Bayesian updating for the mean vector of a multivariate normal population can be expressed by an affine symplectic diffeomorphism (affine canonical transformation). The main result is the following.
Theorem 1. There exists a linear symplectic diffeomorphism
on
such that the first component of the composition
maps a prior distribution to the posterior, where
is the parallel translation
on
.
To reformulate Bayesian updating from a symplectic-geometric viewpoint in this theorem, we consider a cotangent space whose base space contains the population mean
. The reason why we use the cotangent space is as follows. If we assume
is a point in
, to express Bayes’ theorem by an affine symplectic transformation on
, in the case where variance is known, we have to find an element
in
such that
. (Note that
are all
.) However, in Bayesian updating one usually fixes the form of the posterior, which is a section of the density function, and then normalizes it:
(1)
It is well known that every canonical transformation is volume-preserving. Hence we cannot expect such a transformation to exist. Moreover, in this case the population mean
is in
, so we can treat only the even-dimensional case. The key to removing this drawback is to use Lagrangian submanifolds. Consider a symplectic manifold that has a Lagrangian submanifold containing
, and construct the desired canonical transformation on the total space. A canonical transformation of the total space may change the measure on a Lagrangian submanifold.
There is another approach to Bayesian inference, from a symplectic-contact geometric viewpoint, due to Mori [5] [6]. Mori considers the square of the parameter space of normal distributions and a Lagrangian submanifold of it to describe Bayes’ theorem by Hamiltonian flows, and he simultaneously obtains Bayesian updating for the mean and the variance in the univariate case. Taking Mori’s considerations into account, we should use the Poincaré type symplectic form to express Bayesian updating for covariance matrices, while we use the canonical symplectic structure on
for mean vectors. For information geometry and its relation to symplectic geometry, see [7] [8] [9].
2. Bayesian Updating
In this section we review Bayes’ theorem for multivariate normal distributions. For details, see e.g. [10].
Consider the posterior distribution of the mean vector of a multivariate normal distribution with a given covariance matrix. Fix a positive definite symmetric matrix
. First we treat the case where the covariance matrix
is known. Let the prior distribution
of
be
:
(2)
The posterior distribution given a sample y is
and
(3)
where
(4)
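As an illustration of the known-covariance case, the following is a minimal sketch of the standard conjugate update for a normal mean. The function name and the argument names (`mu0`, `Sigma0` for the prior, `Sigma` for the known covariance, `ys` for the samples) are assumptions made for this example and need not match the notation of (2)-(4).

```python
import numpy as np

def update_mean_known_cov(mu0, Sigma0, Sigma, ys):
    """Posterior N(mu_n, Sigma_n) of the mean, for prior N(mu0, Sigma0)
    and i.i.d. samples ys ~ N(mu, Sigma) with Sigma known (hypothetical helper)."""
    ys = np.atleast_2d(ys)               # shape (m, n): m samples in R^n
    m = ys.shape[0]
    prec0 = np.linalg.inv(Sigma0)        # prior precision
    prec = np.linalg.inv(Sigma)          # sampling precision
    Sigma_n = np.linalg.inv(prec0 + m * prec)            # precisions add
    mu_n = Sigma_n @ (prec0 @ mu0 + prec @ ys.sum(axis=0))
    return mu_n, Sigma_n

# Example: standard normal prior, identity covariance, one sample.
mu_n, Sigma_n = update_mean_known_cov(
    np.zeros(2), np.eye(2), np.eye(2), np.array([2.0, 4.0])
)
```

With these inputs the posterior precision is twice the identity, so the posterior mean is the midpoint between the prior mean and the sample.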
Next we consider the case where the variance
is unknown. If we denote the prior distribution of
by
,
, then the posterior is
,
, where
(5)
3. Symplectic Group and Affine Canonical Transformation
In this section we review properties of the symplectic group and Hamiltonian flows.
Denote the set of all linear symplectic transformations on
by
(6)
and call it the symplectic group, where
.
Let z be a vector in
and
be the canonical symplectic structure on the 2n-dimensional vector space
, then a necessary and sufficient condition for
is
. For any
we have
, where
denotes the determinant of the matrix S. We also have
for
. In general
is a connected Lie group of dimension
, and the Lie algebra is given by
. If we write
in terms of
block matrices by
, then
(7)
Hence the inverse matrix of S is given by
. For details, see Abraham-Marsden [1] and de Gosson [2] .
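The properties above can be checked numerically. The sketch below assumes the sign convention J = [[0, I], [-I, 0]] and builds a symplectic matrix as a product of two elementary ones; the matrices `A0` and `B0` are arbitrary choices made for this example. It then tests the defining relation, the unit determinant, and the block formula for the inverse.

```python
import numpy as np

n = 2
I = np.eye(n)
Z = np.zeros((n, n))
J = np.block([[Z, I], [-I, Z]])            # assumed sign convention for J

A0 = np.array([[2.0, 1.0], [0.0, 1.0]])    # any invertible n x n matrix
B0 = np.array([[1.0, 0.5], [0.5, 3.0]])    # any symmetric n x n matrix

S_diag = np.block([[A0, Z], [Z, np.linalg.inv(A0).T]])  # diag(A0, A0^{-T})
S_shear = np.block([[I, B0], [Z, I]])                   # shear by symmetric B0
S = S_shear @ S_diag                       # product of symplectic matrices

# Block decomposition S = [[A, B], [C, D]] and the block formula for S^{-1}.
A, B = S[:n, :n], S[:n, n:]
C, D = S[n:, :n], S[n:, n:]
S_inv_blocks = np.block([[D.T, -B.T], [-C.T, A.T]])

is_symplectic = np.allclose(S.T @ J @ S, J)
det_S = np.linalg.det(S)
inverse_matches = np.allclose(S_inv_blocks @ S, np.eye(2 * n))
```

Replacing J by -J leaves all three checks unchanged, so the opposite sign convention gives the same group.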
If we consider the time evolution of
by a linear Hamiltonian flow, the resulting function is again a multivariate normal density.
Lemma 1. If we evolve a density function such that
by a linear Hamiltonian system with transition matrix
, then we have
, where
.
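Lemma 1 can be illustrated numerically. In the sketch below, the matrix `S`, the mean `m` and the covariance `Sigma` are arbitrary values chosen for this example (any 2 x 2 matrix with determinant 1 is symplectic). Since det S = 1, transporting a density along the linear map is just composition with the inverse of S, and the result is compared pointwise with the normal density with mean S m and covariance S Sigma S^T.

```python
import numpy as np

def gaussian_density(z, mean, cov):
    """Density of N(mean, cov) evaluated at the point z."""
    d = len(mean)
    diff = z - mean
    quad = diff @ np.linalg.solve(cov, diff)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm

S = np.array([[2.0, 1.0], [1.0, 1.0]])       # det S = 1, hence symplectic on R^2
m = np.array([0.5, -1.0])                    # mean of the initial density
Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])   # covariance of the initial density

S_inv = np.linalg.inv(S)
points = [np.array([0.0, 0.0]), np.array([1.0, 2.0]), np.array([-0.7, 0.4])]

# Transported density rho(S^{-1} z); no Jacobian factor because det S = 1.
transported = np.array([gaussian_density(S_inv @ z, m, Sigma) for z in points])
# Density predicted by the lemma: N(S m, S Sigma S^T).
predicted = np.array([gaussian_density(z, S @ m, S @ Sigma @ S.T) for z in points])
```

The two arrays agree up to floating-point error because the quadratic forms coincide and det(S Sigma S^T) = det(Sigma).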
For any Hamiltonian equation
(8)
the transition matrix is given by
and satisfies
for any t.
Lemma 1 shows that if a normally distributed density function evolves under a linear Hamiltonian equation, the result is again normally distributed, and its covariance is obtained from the original covariance by multiplying by the transition matrix from the left and by its transpose from the right. The proof is straightforward. By
and
, we have
4. Proof of the Theorem
To prove the theorem, we explicitly construct affine symplectic diffeomorphisms.
First we consider the case of known variance. Let
be the canonical symplectic structure on
. We regard
as lying in the first factor of
, and take the matrix
which corresponds to the prior distribution. Let
(9)
then we have
and
(10)
Hence the Bayesian updating can be expressed as
(11)
where
denotes the parallel translation
on
, and
denotes the composition of maps.
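To make the statement concrete, here is a minimal sketch of one possible affine symplectic map that realizes a known-variance update. It is an assumed construction for illustration and is not claimed to coincide with the matrix in (9): it uses the block-diagonal symplectic matrix diag(A, A^{-T}) with A = Sigma1^{1/2} Sigma0^{-1/2}, followed by a translation in the first factor. The helper names `sym_sqrt` and `affine_symplectic_update` are hypothetical.

```python
import numpy as np

def sym_sqrt(M):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.T

def affine_symplectic_update(mu0, Sigma0, mu1, Sigma1):
    """Return (S, b) such that the first component of z -> S z + b maps
    N(mu0, Sigma0) to N(mu1, Sigma1). Illustrative construction only."""
    n = len(mu0)
    A = sym_sqrt(Sigma1) @ np.linalg.inv(sym_sqrt(Sigma0))  # A Sigma0 A^T = Sigma1
    Z = np.zeros((n, n))
    S = np.block([[A, Z], [Z, np.linalg.inv(A).T]])         # diag(A, A^{-T})
    b = np.concatenate([mu1 - A @ mu0, np.zeros(n)])        # shift first factor only
    return S, b

# Example: prior N(mu0, Sigma0), known covariance Sigma, one sample y.
mu0, Sigma0 = np.zeros(2), np.eye(2)
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
y = np.array([1.0, -1.0])
prec0, prec = np.linalg.inv(Sigma0), np.linalg.inv(Sigma)
Sigma1 = np.linalg.inv(prec0 + prec)          # posterior covariance
mu1 = Sigma1 @ (prec0 @ mu0 + prec @ y)       # posterior mean

S, b = affine_symplectic_update(mu0, Sigma0, mu1, Sigma1)
A = S[:2, :2]
J = np.block([[np.zeros((2, 2)), np.eye(2)], [-np.eye(2), np.zeros((2, 2))]])
```

The block A alone is not volume-preserving on the base space, but S is symplectic on the total space, which is consistent with the remark in the introduction that a canonical transformation may change the measure on a Lagrangian submanifold.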
In the case of unknown variance, we take a matrix
corresponding to the prior distribution and set
(12)
then the desired transformation is given by
(13)
5. Conclusion
In this paper we have shown that Bayesian updating can be expressed by an affine symplectic diffeomorphism on
whose base space contains the population mean vector. Bayesian updating is widely used in many areas, and the posterior is now usually determined implicitly by computer. Our theorem, however, expresses the posterior explicitly and concretely and gives a dynamical interpretation of Bayesian updating.