
This paper addresses the problem of classifying a multivariate Gaussian random field observation into one of several categories specified by different parametric mean models. We investigate the classifier based on the plug-in Bayes classification rule (PBCR), formed by replacing the unknown parameters in the Bayes classification rule (BCR) with estimators of the category parameters. This extends previous work from the two-category case to the multi-category case. Novel closed-form expressions for the Bayes probability of correct classification and for the actual correct classification rate (ACCR) associated with the PBCR are derived. These correct classification rates are proposed as performance measures for the classification procedure. An empirical study is carried out to analyze the dependence of the derived classification rates on the category parameters.

Much work has been done concerning the error rates in two-category discrimination of uncorrelated observations (see e.g. [

The multi-category problem, however, has very rarely been addressed because most of the methods proposed for two categories do not generalize. Schervish [

We propose a method of multi-category discriminant analysis that essentially exploits the Bayes classification rule, which is optimal in the sense of minimum misclassification probability in the case of complete statistical certainty (see [

Šaltytė and Dučinskas [

By using the derived formulas, the performance of the PBCR is numerically analyzed in the case of a stationary Gaussian random field on a square lattice with an exponential covariance function. The dependence of the correct classification rate and the ACCR values on the range parameter is investigated.

The rest of the paper is organized as follows. Section 2 presents concepts and notation concerning the BCR applied to multi-category classification of a multivariate Gaussian random field (MGRF) observation. The Bayes probability of correct classification is derived. In Section 3, the actual correct classification rate incurred by the PBCR is considered and its closed-form expression is derived. Numerical examples based on simulated data are presented in Section 4 to illustrate the theoretical results. The effect of the range parameter on the ACCR values is examined.

The main objective of this paper is to classify a single observation of MGRF

The model of observation

Here

where

Consider the problem of classification of the vector of observation of

Then the model of

where

Here

Denote by

Notice that in category

where conditional means

and conditional covariance matrix

The marginal and conditional squared Mahalanobis distances between categories

and

It is easy to notice that
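The squared Mahalanobis distance between two category means under a common covariance matrix can be sketched as follows (a minimal illustration; the function name and the identity-covariance example are assumptions, since the paper's own symbols are lost here):

```python
import numpy as np

def mahalanobis_sq(mu_k, mu_l, sigma):
    """Squared Mahalanobis distance between two category means
    under a common covariance matrix sigma."""
    d = np.asarray(mu_k, float) - np.asarray(mu_l, float)
    return float(d @ np.linalg.solve(sigma, d))

# With the identity covariance the distance reduces to the
# squared Euclidean distance between the means.
print(mahalanobis_sq([0, 0], [3, 4], np.eye(2)))  # → 25.0
```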

Under the assumption of complete parametric certainty of the populations and for known prior probabilities of the populations

There is no loss of generality in focusing attention on category

Denote the log ratio of conditional densities in categories

where

These functions will be called pairwise discriminant functions (PDF).

Then Bayes rule (BR) (see [

classify
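A minimal sketch of the Bayes rule for Gaussian categories with a common covariance matrix follows (function names and the equal-prior setting are assumptions; assigning to the category with the maximum prior-weighted log-density is equivalent to requiring all pairwise discriminant functions to be non-negative; the means are those of the numerical example below):

```python
import numpy as np

def log_gauss(x, mu, sigma):
    """Log-density of N(mu, sigma) evaluated at x."""
    d = x - mu
    _, logdet = np.linalg.slogdet(sigma)
    return -0.5 * (len(x) * np.log(2 * np.pi) + logdet
                   + d @ np.linalg.solve(sigma, d))

def bayes_classify(x, means, sigma, priors):
    """Assign x to the category maximizing prior-weighted density;
    equivalent to all pairwise discriminant scores being >= 0."""
    scores = [np.log(p) + log_gauss(x, m, sigma)
              for m, p in zip(means, priors)]
    return int(np.argmax(scores))

means = [np.array([0.0, 0.0]), np.array([-1.0, 2.0]), np.array([-2.0, 2.0])]
print(bayes_classify(np.array([0.1, -0.2]), means, np.eye(2), [1/3] * 3))  # → 0
```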

Set

Lemma 1. The conditional probability of correct classification for category

Here

Proof. Recall that, under the definition (see e.g. [

It is the probability of correct classification of

where

After making the substitution of variables

Set

After straightforward calculations we show that
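Lemma 1 gives a closed form for the multi-category case; for orientation, the classical two-category special case with equal priors, where the Bayes probability of correct classification equals Φ(Δ/2) for Mahalanobis distance Δ, can be sketched as:

```python
from math import erf, sqrt

def std_normal_cdf(z):
    # Phi(z) expressed via the error function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def two_class_bayes_pc(delta):
    """Bayes probability of correct classification for two
    equiprobable Gaussian classes at Mahalanobis distance delta."""
    return std_normal_cdf(delta / 2.0)

# Indistinguishable classes (delta = 0) give chance performance.
print(round(two_class_bayes_pc(0.0), 3))  # → 0.5
```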

In practical applications, not all statistical parameters of the populations are known. The unknown parameters are then estimated from a training sample. When estimators of the unknown parameters are plugged into the Bayes discriminant function (BDF), the plug-in BDF (PBDF) is obtained. In this paper we assume that the true values of the parameters

Let

Then replacing

Then the classification rule based on PBCR is associated with plug-in PDF (PPDF) in the following way: classify
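A minimal sketch of a plug-in classifier, assuming a known common covariance and unknown means estimated by training-sample averages (the function name and the toy training sample are assumptions for illustration):

```python
import numpy as np

def plug_in_classify(x, X_train, y_train, sigma, priors):
    """Plug-in Bayes rule: unknown category means are replaced by
    training-sample averages before applying the Bayes rule."""
    labels = sorted(set(y_train))
    y = np.asarray(y_train)
    mu_hat = [X_train[y == k].mean(axis=0) for k in labels]
    scores = [np.log(p) - 0.5 * (x - m) @ np.linalg.solve(sigma, x - m)
              for m, p in zip(mu_hat, priors)]
    return labels[int(np.argmax(scores))]

# Tiny labelled training sample (two observations per category).
X_train = np.array([[0.0, 0.0], [0.2, 0.0],
                    [-1.0, 2.0], [-1.2, 2.0],
                    [-2.0, 2.0], [-2.2, 2.0]])
y_train = [0, 0, 1, 1, 2, 2]
print(plug_in_classify(np.array([0.0, 0.0]), X_train, y_train,
                       np.eye(2), [1/3] * 3))  # → 0
```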

Definition 1. The actual correct classification rate incurred by PBCR associated with PPDF is

Set

Lemma 2. The actual correct classification rate due to the PBCR is

where

Proof. It is obvious that in population

Set

After straightforward calculations we show that

A simulation study comparing the proposed Bayes probability of correct classification and the actual correct classification rate incurred by the PBCR was carried out for the three-class case

In this example, observations are assumed to arise from a bivariate stationary Gaussian random field

Set

Estimators of

where
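The exponential covariance model used in this example can be sketched as follows (a minimal construction, assuming the parameterization C(h) = σ² exp(−h/θ) with range parameter θ; function and variable names are illustrative):

```python
import numpy as np

def exp_cov_matrix(locs, sigma2=1.0, theta=1.0):
    """Covariance matrix of a stationary GRF with exponential
    covariance C(h) = sigma2 * exp(-h / theta) at the given locations."""
    locs = np.asarray(locs, float)
    # Pairwise Euclidean distances between all locations.
    h = np.linalg.norm(locs[:, None, :] - locs[None, :, :], axis=-1)
    return sigma2 * np.exp(-h / theta)

# A 2x2 square lattice with unit spacing.
grid = [(i, j) for i in range(2) for j in range(2)]
C = exp_cov_matrix(grid, sigma2=1.0, theta=1.0)
print(C.shape)  # → (4, 4)
```

Larger values of θ make the field more strongly correlated across the lattice, which is the dependence examined in the results below.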

The considered set of training locations with the indicated class labels is shown in

So we have small training sample sizes (i.e.

Simulations were performed with geoR, a free and open-source package for geostatistical analysis included in the statistical computing software R (http://www.r-project.org/). Each case was simulated 100 times (runs) and
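The run-averaging step can be sketched as a Monte Carlo estimate of the correct classification rate. This is a simplification of the paper's setup (it draws independent observations rather than spatially correlated field values, and uses an assumed nearest-mean classifier with the example's category means):

```python
import numpy as np

rng = np.random.default_rng(0)

def empirical_ccr(classify, mean, label, sigma, n_runs=100):
    """Monte Carlo estimate of the correct classification rate of one
    category: draw observations from N(mean, sigma) and count how often
    the classifier recovers the true label."""
    L = np.linalg.cholesky(sigma)
    hits = 0
    for _ in range(n_runs):
        x = np.asarray(mean) + L @ rng.standard_normal(len(mean))
        hits += (classify(x) == label)
    return hits / n_runs

# Nearest-mean classifier with the example's category means.
means = [np.array([0.0, 0.0]), np.array([-1.0, 2.0]), np.array([-2.0, 2.0])]
classify = lambda x: int(np.argmin([np.sum((x - m) ** 2) for m in means]))
rate = empirical_ccr(classify, means[0], 0, np.eye(2))
print(0.0 <= rate <= 1.0)  # → True
```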

| θ | CCR A (0,0) | CCR B (−1,2) | CCR C (−2,2) | ACCR A (0,0) | ACCR B (−1,2) | ACCR C (−2,2) |
|---|---|---|---|---|---|---|
| 1 | 0.633777 | 0.634955 | 0.487689 | 0.544697 | 0.427956 | 0.404184 |
| 2 | 0.882716 | 0.839326 | 0.599707 | 0.636842 | 0.501045 | 0.483608 |
| 3 | 0.973217 | 0.948266 | 0.716064 | 0.725356 | 0.565855 | 0.480018 |
| 4 | 0.999777 | 0.955340 | 0.810298 | 0.773966 | 0.639186 | 0.553055 |

It can be seen in