Correct Classification Rates in Multi-Category Discriminant Analysis of Spatial Gaussian Data

This paper discusses the problem of classifying a multivariate Gaussian random field observation into one of several categories specified by different parametric mean models. The classifier under investigation is based on the plug-in Bayes classification rule (PBCR), formed by replacing the unknown parameters in the Bayes classification rule (BCR) with estimators of the category parameters. This extends previous work from the two-category case to the multi-category case. Novel closed-form expressions for the Bayes classification probability and for the actual correct classification rate associated with the PBCR are derived. These correct classification rates are suggested as performance measures for the classification procedure. An empirical study has been carried out to analyze the dependence of the derived classification rates on the category parameters.


Introduction
Much work has been done concerning the error rates in two-category discrimination of uncorrelated observations (see e.g. [1]). Several methods for estimating the error rates in discriminant analysis of spatial data have recently been proposed (see e.g. [2] [3]).
The multi-category problem, however, has very rarely been addressed because most of the methods proposed for two categories do not generalize. Schervish [4] considered the problem of classification into one of three known normal populations by a single linear discriminant function. Techniques for multi-category probability estimation by combining all pairwise comparisons have been investigated by several authors (see e.g. [5]). An empirical comparison of different methods of error rate estimation in multi-category linear discriminant analysis for multivariate homoscedastic Gaussian data was performed by Hirst [6]. The Bayesian multiclass classification problem for correlated Gaussian observations was empirically studied by Williams [7]. A novel model-free estimation method for multiclass conditional probability based on conditional quantile regression functions was theoretically and numerically studied by Xu [8]. Correct classification rates in multi-category classification of independent multivariate Gaussian observations were provided by Schervish [9]. We generalize the above results to the problem of classifying multivariate spatially correlated Gaussian observations.
We propose a method of multi-category discriminant analysis that exploits the Bayes classification rule, which is optimal in the sense of minimum misclassification probability in the case of complete statistical certainty (see [10], chapter 6). In practice, however, a complete statistical description of the populations is usually not possible. Then, given a training sample, the parametric plug-in Bayes classification rule, formed by replacing the unknown parameters in the BCR with their estimators, is used.
Šaltytė and Dučinskas [11] derived the asymptotic approximation of the expected error rate when classifying an observation of a scalar Gaussian random field into one of two classes with different regression mean models and a common variance. This result was generalized to a multivariate spatial-temporal regression model in [12]. However, the observations to be classified are assumed to be independent of the training samples in all the publications listed above. The assumption of independence for the classification of scalar GRF observations was removed by Dučinskas [2]. The multivariate two-category case has been considered in Dučinskas [13] and Dučinskas and Dreižienė [14]. Formulas for the error rates for multiclass classification of a scalar GRF observation are derived in [15]. The authors of the above papers have focused on maximum likelihood (ML) estimators because of the tractability of the covariance matrix of these estimators. In the present paper, we extend the investigation of the performance of the PBCR to the multi-category case. Novel closed-form expressions for the actual correct classification rate (ACCR) are derived.
Using the derived formulas, the performance of the PBCR is numerically analyzed in the case of a stationary Gaussian random field on a square lattice with the exponential covariance function. The dependence of the correct classification rate and ACCR values on the range parameter is investigated.
The rest of the paper is organized as follows. Section 2 presents concepts and notions concerning the BCR applied to multi-category classification of a multivariate Gaussian random field (MGRF) observation. The Bayes probability of correct classification is derived. In Section 3, the actual correct classification rate incurred by the PBCR is considered and its closed-form expression is derived. Numerical examples based on simulated data are presented in Section 4 in order to illustrate the theoretical results. The effect of the range parameter on the values of the ACCR is examined.

The Main Concepts and Definitions
The main objective of this paper is to classify a single observation of an MGRF. In the model of the observation, $\mu_l$ represents a mean component and $B_l$ is a matrix of parameters. The error term is generated by a random field whose covariance function, for all pairs of locations $s$ and $u$, is $r(s-u)\Sigma$, where $r(s-u)$ is the spatial correlation function and $\Sigma$ is the variance-covariance matrix with elements $\{\sigma_{ij}\}$. So we deal with the so-called intrinsic covariance model (see [16]).
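To make the intrinsic covariance model concrete, the following minimal Python sketch assembles the covariance of observations stacked location by location as the Kronecker product of a spatial correlation matrix and $\Sigma$. The exponential form of the correlation function follows the covariance family named in the Introduction, but its parametrization, the range value `alpha`, the lattice coordinates `locs`, and the entries of `sigma` are illustrative assumptions and are not taken from the paper.

```python
import math

def exp_corr(s, u, alpha):
    """Assumed exponential correlation r(s - u) = exp(-|s - u| / alpha)."""
    return math.exp(-math.dist(s, u) / alpha)

def corr_matrix(locs, alpha):
    """Spatial correlation matrix among the given locations."""
    return [[exp_corr(s, u, alpha) for u in locs] for s in locs]

def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

# Hypothetical 2 x 2 lattice and a 2-variate field (p = 2).
locs = [(0, 0), (0, 1), (1, 0), (1, 1)]
sigma = [[1.0, 0.5],
         [0.5, 2.0]]
alpha = 2.0

R = corr_matrix(locs, alpha)
# Block (i, j) of cov equals r(s_i - s_j) * sigma: the intrinsic structure.
cov = kron(R, sigma)
```

Each $p \times p$ block of `cov` is a scalar multiple of `sigma`, so the cross-covariance between any two locations differs from $\Sigma$ only through the spatial correlation, which is the defining feature of the intrinsic model.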
Consider the problem of classifying the vector of observations of $Z$ at location $s_0$, denoted by $Z_0$, given the training sample $T$. In the model of $T$, $B$ is the matrix of category mean parameters and $E$ is the $n \times p$ matrix of random errors, which has a matrix-variate normal distribution, i.e. $E \sim N_{n \times p}(0, R \otimes \Sigma)$.

Here $R$ denotes the spatial correlation matrix among the components (rows) of $T$. In the rest of the paper, the realization (observed value) of the training sample $T$ will be denoted by $t$. Denote by $r_0$ the vector of spatial correlations between $Z_0$ and the observations in $T$. Notice that in category $\Omega_l$, the conditional distribution of $Z_0$ given $T = t$ is Gaussian, with category-specific conditional means and conditional covariance matrix $\Sigma_{0t}$.

The marginal and conditional squared Mahalanobis distances between categories $\Omega_k$ and $\Omega_l$ ($k, l = 1, \dots, L$) for an observation taken at location $s = s_0$ are specified in terms of the corresponding means and covariance matrices. It is easy to notice that $d_{kl}$ does not depend on the realizations of $T$ and depends only on their locations. Under the assumption of complete parametric certainty of the populations, the prior probabilities $\pi_l$ of the populations are taken as known. There is no loss of generality in focusing attention on category $\Omega_L$, since the numbering of the categories is arbitrary. Let the set of population parameters be denoted by $\Psi$.

Consider the log ratios of the conditional densities in categories $\Omega_L$ and $\Omega_l$. These functions will be called pairwise discriminant functions (PDF). The Bayes rule (BR) (see [10], chapter 6) is then expressed through these functions.

Probabilities and Rates of Correct Classification
Set $M$ as the $r$-dimensional vector and $V$ as the variance-covariance matrix used in the following lemma.

Lemma 1. The conditional probability of correct classification for category $\Omega_L$ due to the BCR specified in (4) is expressed as an integral of $\varphi_r(\cdot)$, the probability density function of the $r$-variate normal distribution with mean vector $M$ and variance-covariance matrix $V$.

Proof. Recall that, by definition (see e.g. [4] [9]), the probability of correct classification due to the aforementioned BCR is the probability of correctly classifying $Z_0$ when it comes from $\Omega_l$. The probability measure $P_{0t}$ is based on the conditional distribution of $Z_0$ given $T = t$ in category $\Omega_k$, with means and variance-covariance matrix specified in (1), (2). $Z_0$ may be expressed in a form that allows a substitution of variables in (5) involving the identity matrix $I_p$. After this substitution, the probability of correct classification can be rewritten, and straightforward calculations complete the proof of the lemma.
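A probability expressed through an $r$-variate normal density can be checked numerically. The sketch below is a plain Monte Carlo estimate, not the closed-form expression of Lemma 1. It assumes that correct classification into $\Omega_L$ corresponds to all $r$ pairwise discriminant functions being nonnegative, and that these functions are jointly normal with mean vector `mean` and covariance matrix `cov`. The function names, the numerical inputs, and the number of draws are hypothetical.

```python
import math
import random

def cholesky(v):
    """Lower-triangular factor low with low * low^T = v (v positive definite)."""
    n = len(v)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(v[i][i] - s)
            else:
                low[i][j] = (v[i][j] - s) / low[j][j]
    return low

def orthant_probability(mean, cov, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(W_1 >= 0, ..., W_r >= 0) for W ~ N_r(mean, cov)."""
    rng = random.Random(seed)
    low = cholesky(cov)
    r = len(mean)
    hits = 0
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(r)]
        # Transform independent standard normals into a draw from N_r(mean, cov).
        w = [mean[i] + sum(low[i][k] * z[k] for k in range(i + 1))
             for i in range(r)]
        if all(x >= 0.0 for x in w):
            hits += 1
    return hits / n_draws

def std_normal_cdf(x):
    """Standard normal distribution function, used only as a reference value."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Hypothetical three-category example (r = 2) with correlated components.
estimate = orthant_probability([1.0, 0.5], [[1.0, 0.4], [0.4, 1.0]])
```

For $r = 1$ the estimate can be compared with $\Phi(M / \sqrt{V})$, which gives a simple sanity check before the routine is applied to larger $r$.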
In practical applications, not all statistical parameters of the populations are known. The estimators of the unknown parameters can then be found from the training sample. When the estimators of the unknown parameters are plugged into the Bayes discriminant function (BDF), the plug-in BDF (PBDF) is obtained. In this paper we assume that the true values of the parameters $B$ and $\Sigma$ are unknown.
Let $\hat{B}$ and $\hat{\Sigma}$ be the estimators of $B$ and $\Sigma$ based on the training sample $T$. Replacing $\Psi$ by $\hat{\Psi}$ in (3), we get the plug-in BDF (PBDF). The classification rule based on the PBCR is then associated with the plug-in PDF (PPDF).

Definition 1. The actual correct classification rate incurred by the PBCR is the rate of correct classification associated with the PPDF.

It is seen in Figure 1 that the closest location to be classified is location A and the farthest is location C. CCR and ACCR are largest for location A and smallest for location C. It can be concluded that closer locations give better accuracy.

The Bayes rule minimizing the probability of misclassification is based on the logarithm of the ratio of conditional densities.
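A rule of this kind can be sketched in a few lines of Python. The sketch assumes Gaussian conditional densities with a common covariance matrix, so the normalizing constants cancel in every log ratio and each category is scored by its log prior minus half the squared Mahalanobis distance. The function names, the category means, the covariance matrix, and the priors are hypothetical. The last category in the list plays the role of $\Omega_L$.

```python
import math

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(n):
            if i != col:
                f = m[i][col] / m[col][col]
                m[i] = [x - f * y for x, y in zip(m[i], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def scores(z, means, cov, priors):
    """log(prior) - 0.5 * squared Mahalanobis distance, one value per category."""
    out = []
    for mean, prior in zip(means, priors):
        diff = [zi - mi for zi, mi in zip(z, mean)]
        w = solve(cov, diff)
        out.append(math.log(prior) - 0.5 * sum(d * x for d, x in zip(diff, w)))
    return out

def pairwise_discriminants(z, means, cov, priors):
    """Log density ratio plus log prior ratio of the last category vs each other."""
    s = scores(z, means, cov, priors)
    return [s[-1] - value for value in s[:-1]]

def bayes_classify(z, means, cov, priors):
    """0-based index of the category with the largest score."""
    s = scores(z, means, cov, priors)
    return max(range(len(s)), key=s.__getitem__)

# Hypothetical three-category example with a bivariate observation.
means = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
cov = [[1.0, 0.3], [0.3, 1.0]]
priors = [1 / 3, 1 / 3, 1 / 3]
z = [0.1, 1.8]
label = bayes_classify(z, means, cov, priors)
```

The observation is assigned to the last category exactly when every value returned by `pairwise_discriminants` is nonnegative, which is the pairwise form of the same rule.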
$I_p$ denotes the $p$-dimensional identity matrix.

Lemma 2. The actual correct classification rate due to PBDR is ... $r$-dimensional vector with components ...