Estimation of the unknown mean, *μ*, and variance, *σ*^{2}, of a univariate Gaussian distribution given a single study variable x is considered. We propose an approach that does not require initialization of the unknown sufficient distribution parameters. The approach is motivated by linearizing the Gaussian distribution through differential techniques, and estimating *μ* and *σ*^{2} as regression coefficients using the ordinary least squares method. Two simulated datasets on hereditary traits and morphometric analysis of housefly strains are used to evaluate the proposed method (PM), the maximum likelihood estimation (MLE), and the method of moments (MM). The methods are evaluated by re-estimating the required Gaussian parameters on both large and small samples. The root mean squared error (RMSE), mean error (ME), and standard deviation (SD) are used to assess the accuracy of the PM and MLE; confidence intervals (CIs) are also constructed for the ME estimate. The PM compares well with both the MLE and MM approaches, as all produce estimates whose errors have good asymptotic properties; small CIs are also observed for the ME under the PM and MLE. The PM can be used symbiotically with the MLE to provide initial approximations at the expectation-maximization step.

The Gaussian distribution is a continuous function characterized by the mean µ and variance σ^{2}. It is regarded as the most widely applied distribution across the science disciplines, since it can be used to approximate several other distributions. We consider a single observation x obtained from a univariate Gaussian distribution with both the mean µ and the variance σ^{2} unknown, that is, x ~ N(µ, σ^{2}). The parameters µ and σ^{2} are referred to as sufficient parameters in most of the statistics literature because they contain all the information about the probability distribution function, see Equation (1).

An important problem in statistics is to obtain information about the mean, µ, and the variance, σ^{2}, of a given population. The estimation of these parameters is central in areas such as machine learning, pattern recognition, neural networks, signal processing, computer vision and feature extraction, see [

The rationale and motivation for the proposed approach are presented in Section 2. The methodological steps and the datasets simulated to validate the proposed approach are discussed in Section 3. Explicit estimation steps using the ordinary least squares method are presented in Section 4. Statistical analysis results on simulations are presented in Section 5. The error distribution analyses are presented in Section 6. Accuracy results for the proposed method (PM) and maximum likelihood estimation (MLE) methods are presented in Section 7. In Sections 8 and 9 we provide a thorough discussion of the results and some concluding remarks on the study findings.

Numerical methods for estimating the parameters of a Gaussian distribution function are well known, for example the bisection, Newton-Raphson, secant, false position, and Gauss-Seidel methods, see [

We transform the Gaussian density function (1) into a new function that is linear with respect to some of the unknown parameters, or their combinations, in an appropriate form. For linearization, we consider the derivatives of the parent function (1). The unknown regression parameters are then estimated using the ordinary least squares (OLS) method. The employed framework was first proposed by [
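As a sketch of this linearization idea, note that the Gaussian density satisfies f'(x)/f(x) = (µ − x)/σ², which is linear in x, so regressing an estimate of f'(x)/f(x) on x recovers µ and σ² from the fitted coefficients. The grid, parameter values, and numerical differentiation below are our illustrative assumptions, not the paper's exact derivation:

```python
import numpy as np

mu_true, sigma_true = 63.8, 2.77  # illustrative "true" parameters

# Evaluate the Gaussian density on a grid around the mean
x = np.linspace(mu_true - 3 * sigma_true, mu_true + 3 * sigma_true, 201)
f = np.exp(-(x - mu_true) ** 2 / (2 * sigma_true**2)) / (sigma_true * np.sqrt(2 * np.pi))

# Numerical first derivative of the density
df = np.gradient(f, x, edge_order=2)

# Linearization: f'(x)/f(x) = mu/sigma^2 - x/sigma^2 = a + b*x
y = df / f
A = np.column_stack([np.ones_like(x), x])
a, b = np.linalg.lstsq(A, y, rcond=None)[0]

# Back out the Gaussian parameters from the regression coefficients
sigma2_hat = -1.0 / b
mu_hat = a * sigma2_hat
print(mu_hat, sigma2_hat)  # approximately 63.8 and 2.77**2
```

In practice the density itself is unknown and must first be estimated from the sample, which is where the method's practical error enters.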

In the course of estimating the parameters using the PM, we anticipate that there is a shift of the estimated parameters from their “true” values. The magnitude of this shift is what is commonly referred to as accuracy, and it is computed as the difference between the known values and the estimates from the underlying process [

It is a common requirement to estimate the parameters of a Gaussian distribution in most data-modelling settings involving normally distributed observations. To our knowledge, the method presented in this section has not been considered before in the statistical literature we have reviewed. The approach is to transform the original Gaussian function (1) by taking its first derivative and subsequently introducing new parameters, either linearly or in combination.

Re-arranging Equation (5)

We observe from Equation (6) that the original function (1) is contained in both the first and second terms. Hence, we write Equation (6) as

where

Introducing new parameters in Equation (8) to formulate a model linear in the new parameters, we obtain a simple linear model of the form

where

There are well-recognised approaches for obtaining the parameter,

1) Each of the independent variables (in this case

2) The model contains at most one unidentified parameter that does not have an independent variable.

3) All the discrete terms are summed to yield the ultimate model value [

Parameter estimation is an important aspect in most statistical modelling frameworks. The major goal of estimation is to obtain the numerical values of the regression coefficients associated with individual regressors or combinations of the regressors [
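For the simple linear model used here, the OLS coefficients have the familiar closed form; a minimal sketch (the function name is ours):

```python
import numpy as np

def ols_simple(x, y):
    """Closed-form OLS estimates for the simple linear model y = b0 + b1*x."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    # Slope: covariance of x and y divided by variance of x
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    # Intercept: forces the fitted line through the sample means
    b0 = y.mean() - b1 * x.mean()
    return b0, b1

b0, b1 = ols_simple([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1)  # 1.0 2.0
```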

If a dataset say,

be an estimation of

We estimate the error, since an important part of estimation is assessing how much the computed value will vary due to noise in the dataset. When information concerning the deviations is not available, there is no basis on which the estimated value can be compared to the “true” or target value [

The sum of squares of the errors over all the data points is

In Equation (14), variables

So that,

as the goal function for the ordinary least squares estimation of the parameters

The estimates of Gaussian distribution parameters are then estimated as
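Under the linearization sketched above, with fitted model $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ for $y = f'(x)/f(x)$, the back-transformation to the Gaussian parameters presumably takes the form (our notation, reconstructing the elided expression):

```latex
\[
\hat{\sigma}^2 = -\frac{1}{\hat{\beta}_1},
\qquad
\hat{\mu} = \hat{\beta}_0\,\hat{\sigma}^2 = -\frac{\hat{\beta}_0}{\hat{\beta}_1}.
\]
```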

In order to evaluate the performance of the proposed method (PM), we perform simulations of the fathers' and daughters' heights using Mathematica software [

We now estimate the known means and standard deviations of the considered datasets using the PM, MLE and MM. The analysis is done on two samples,

| | MM | MLE | PM |
|---|---|---|---|
| Mean | 67.71 | 67.82 | 67.74 |
| Standard deviation | 2.84 | 2.72 | 2.80 |

| | MM | MLE | PM |
|---|---|---|---|
| Mean | 63.83 | 63.71 | 63.80 |
| Standard deviation | 2.72 | 2.76 | 2.77 |

| | MM | MLE | PM |
|---|---|---|---|
| Mean | 44.55 | 45.78 | 45.15 |
| Standard deviation | 3.98 | 3.76 | 3.98 |

We are frequently faced with processing volumes of data whose generative process we are uncertain about, yet it is always necessary to understand sampling theory and statistical inference before carrying out any parameter estimation in statistical modelling problems [

We aim to establish the distribution of the errors from the PM in comparison with those from the standard method, the MLE. We would wish to use the usual statistical techniques, such as the Pearson chi-square, Jarque-Bera, and Kolmogorov-Smirnov tests, to check the errors for normality, but such tests are usually overly sensitive on large datasets. Visual methods have therefore been preferred, see Figures 1-6, and these have several advantages [

The error distribution can with little effort be observed from a histogram of the sampled errors, in which the error counts are plotted. Such a histogram presents an overview of the normality of the error distribution, see Figures 1-6. For comparison with normality, normal distribution curves are superimposed on the histograms. The figures illustrate the distribution of errors,

Better diagnostic methods for checking deviations from a normal distribution are the so-called quantile-quantile (Q-Q) plots, see [
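The Q-Q idea can also be applied numerically: for normally distributed errors, the ordered errors are nearly collinear with the theoretical normal quantiles. A small sketch (the simulated error sample is our illustrative assumption):

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(1)
errors = rng.normal(0.0, 0.09, size=500)  # illustrative error sample

# Q-Q construction: order statistics vs. theoretical normal quantiles
ordered = np.sort(errors)
probs = (np.arange(1, errors.size + 1) - 0.5) / errors.size
theo = np.array([NormalDist().inv_cdf(p) for p in probs])

# Near-normal errors give a nearly linear Q-Q relation
r = np.corrcoef(theo, ordered)[0, 1]
print(round(r, 3))  # close to 1 for Gaussian errors
```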

When the parent dataset is normally distributed and no outliers are exhibited, as shown in Section 6, then the accuracy measures in

In Table 4, e_{i} denotes the difference between the observed and the estimated value, where i indexes the sampled data points and n is the sample size. Assuming that the generated errors follow a normal distribution, as established in Section 6 (see Figures 1-6), then from the theory of errors it is well known that 68.3% of the data will fall within the interval ME ± SD.
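The 68.3% figure from the theory of errors is easy to verify empirically on a simulated error sample (the sample here is our illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(2)
e = rng.normal(0.0, 1.0, size=100_000)  # simulated Gaussian errors

me, sd = e.mean(), e.std(ddof=1)
# Fraction of errors falling within one standard deviation of the mean error
inside = np.mean((e > me - sd) & (e < me + sd))
print(round(inside, 3))  # approximately 0.683
```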

Results generated by the standard measures of

Tables 5-8 show the accuracy measures considered to evaluate the performance of the PM and the MLE, on two datasets of different sizes, that is

The PM has been compared with some of the current methods in use, that is, the MM and MLE. These were preferred

| Measure | Formula |
|---|---|
| Root mean square error | RMSE = √( (1/n) Σ e_{i}² ) |
| Mean error | ME = (1/n) Σ e_{i} |
| Standard deviation | SD = √( (1/(n−1)) Σ (e_{i} − ME)² ) |
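These three measures can be computed from the error sample in one small helper (a sketch; the function name and the toy data are ours):

```python
import numpy as np

def accuracy_measures(observed, estimated):
    """RMSE, mean error and SD of the errors e_i = observed_i - estimated_i."""
    e = np.asarray(observed, dtype=float) - np.asarray(estimated, dtype=float)
    rmse = np.sqrt(np.mean(e**2))   # root mean square error
    me = np.mean(e)                 # mean error (bias)
    sd = np.std(e, ddof=1)          # standard deviation of the errors
    return rmse, me, sd

rmse, me, sd = accuracy_measures([1.0, 2.0, 3.0], [0.9, 1.8, 2.9])
```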

| Accuracy measure | Value (inches) | 95% CI |
|---|---|---|
| Root mean square error | 0.18 | - |
| Mean error | 0.15 | [−0.0333, 0.3368] |
| Standard deviation | 0.09 | - |

| Accuracy measure | Value (inches) | 95% CI |
|---|---|---|
| Root mean square error | 0.15 | - |
| Mean error | 0.12 | [−0.0601, 0.2947] |
| Standard deviation | 0.09 | - |

| Accuracy measure | Value (mm) | 95% CI |
|---|---|---|
| Root mean square error | 0.73 | - |
| Mean error | 0.47 | [−0.6437, 1.5741] |
| Standard deviation | 0.57 | - |

| Accuracy measure | Value (mm) | 95% CI |
|---|---|---|
| Root mean square error | 0.66 | - |
| Mean error | 0.55 | [−0.1710, 1.2610] |
| Standard deviation | 0.37 | - |

due to their computational simplicity and availability in most statistical software packages. Secondly, the MLE method is widely preferred and applied due to its good asymptotic properties. Three standard datasets from [

Section 5 contains the computation results for the PM, MM and MLE. Tables 1-3 show the parameter estimates obtained from the methods. It is observed that all the approaches give results comparable with the “true” or required values of the parameters given in the captions of the respective tables.

In order to use the standard techniques employed for accuracy measurement, the errors have been tested for normality, see Section 6. Statistical visualization techniques were preferred to other statistical tests, which are said to be sensitive in the presence of outliers and large datasets [

This research laid out an easy approach to computing the parameters of a univariate normal distribution, an important distribution in applied statistics and in most of the science disciplines. It serves as a platform or benchmark for studying more complex distributions, such as mixtures of two or more Gaussians, mixtures of exponentials, and other continuous distributions that are very useful in pattern recognition, machine learning and unsupervised learning. The simplicity of the approach saves computation time and guarantees convergence to the required values; this is not usually the case with conventional analytical and numerical methods, which may fail or take a long time to converge depending on the quality of the initial approximations.

The authors wish to thank the Directorate of Research and Innovation of Tshwane University of Technology for funding the research under the Postdoctoral Research Fund 2014/2015. We also thank the anonymous reviewer and editors, whose criticisms led to an improved version of the manuscript.

Cliff R. Kikawa, Michael Y. Shatalov, Petrus H. Kloppers and Andrew C. Mkolesia (2015) On the Estimation of a Univariate Gaussian Distribution: A Comparative Approach. Open Journal of Statistics, 5, 445-454. doi: 10.4236/ojs.2015.55046