Generalized Minimum Perpendicular Distance Square Method of Estimation

For the case of heteroscedasticity, a Generalized Minimum Perpendicular Distance Square (GMPDS) method is suggested in place of the traditionally used Generalized Least Square (GLS) method for fitting a regression line, with the aim of obtaining a better fitted regression line, so that the estimated line is the one closest to the observed points. The mathematical form of the estimator of the parameters is presented. A logical argument behind the relationship between the slopes of the lines $\hat{Y}_i = \hat{\beta}_0^{*} + \hat{\beta}_1^{*} X_i$ and $\hat{X}_i = \hat{\beta}_0^{**} + \hat{\beta}_1^{**} Y_i$ is given.


Introduction
Linear regression has a long history of development, from the very beginning of the eighteenth century until today. A large body of literature is available in this area. This literature deals with the estimation of the regression coefficients and the constant by the Ordinary Least Square (OLS) method, i.e. by minimizing the sum of squares of the vertical distances between the observed points and the assumed regression line, traditionally known as the OLS estimation procedure.
M. F. Hossain and G. Khalaf (2009) showed that the OLS method does not minimize the actual distance from the observed points to the fitted regression line. They suggested the minimum perpendicular distance square (MPDS) method of estimation for simple linear regression in the case of homoscedasticity, as an alternative to the traditional OLS method. However, regression disturbances whose variances are not constant across observations are heteroscedastic. Heteroscedasticity arises in numerous applications, in both cross-section and time-series data. For example, even after accounting for firm sizes, we expect to observe greater variation in the profits of large firms than in those of small ones. The variance of profits might also depend on product diversification, research and development expenditure, and industry characteristics, and therefore might also vary across firms of similar sizes. When analyzing family spending patterns, we observe greater variation in expenditure on certain commodity groups among high-income families than among low-income ones, due to the greater discretion allowed by higher incomes [1]. The MPDS method is not suitable for this type of heteroscedastic situation because it was established only for the homoscedastic case.
In this paper we consider the minimum perpendicular distance square method in the case of heteroscedasticity, which we call the Generalized Minimum Perpendicular Distance Square (GMPDS) method.

Problems of Ordinary Least Square (OLS) and Generalized Least Square (GLS) Method
Suppose the simple linear regression model is
$$Y_i = \beta_0 + \beta_1 X_i + u_i, \qquad i = 1, 2, \dots, n,$$
where the response variable $Y$ is related to the explanatory variable $X$ through the regression coefficient $\beta_1$, the constant intercept $\beta_0$ and the random disturbance term $u_i$.
We assume that the disturbance terms $u_i$ satisfy all the assumptions of the classical linear regression model.

Both the OLS method and the GLS method estimate the regression coefficients by minimizing the sum of squares of the vertical distances (weighted, in the case of GLS) from the observed points to the assumed regression line.
The OLS estimators are:
$$\hat{\beta}_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}, \qquad \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}.$$
The important assumption for applying the OLS method is that the variance of each disturbance term $u_i$, conditional on the chosen values of the explanatory variables, is some constant number (the homoscedasticity assumption). If the data violate this assumption, that is, if the variance of each disturbance term $u_i$ conditional on the chosen values of the explanatory variables is not constant (say $\sigma_i^2$), then we cannot apply OLS, and in this case we apply the GLS estimation procedure for estimating the parameters [2].
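The GLS estimators are:
$$\hat{\beta}_1^{\mathrm{GLS}} = \frac{\left(\sum w_i\right)\left(\sum w_i X_i Y_i\right) - \left(\sum w_i X_i\right)\left(\sum w_i Y_i\right)}{\left(\sum w_i\right)\left(\sum w_i X_i^2\right) - \left(\sum w_i X_i\right)^2}, \qquad \hat{\beta}_0^{\mathrm{GLS}} = \bar{Y}_w - \hat{\beta}_1^{\mathrm{GLS}} \bar{X}_w,$$
where $w_i = 1/\sigma_i^2$, $\bar{X}_w = \sum w_i X_i / \sum w_i$, and $\bar{Y}_w = \sum w_i Y_i / \sum w_i$.
The problem with OLS and GLS estimation is that they do not minimize the real distance from the observed points to the fitted regression line; rather, they minimize the vertical distance from the observed points to the fitted regression line. For this reason we have the well-known result
$$\hat{\beta}_{YX}\,\hat{\beta}_{XY} = r^2 \le 1,$$
where $\hat{\beta}_{XY}$ is the estimated regression coefficient of $X$ on $Y$, $\hat{\beta}_{YX}$ is the estimated regression coefficient of $Y$ on $X$, and $r$ is the corresponding sample correlation coefficient between $X$ and $Y$. If OLS and GLS minimized the real distance (error), this product should be unity, that is, $\hat{\beta}_{YX}\,\hat{\beta}_{XY} = 1$. But with the OLS and GLS methods this only occurs if the data are perfectly correlated, that is, $r = \pm 1$. In real-life problems this type of perfect correlation occurs only in rare cases.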
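The identity between the product of the two least-squares slopes and $r^2$ can be checked numerically. The following is a minimal sketch for the unweighted (OLS) case only; the function names and the five data points are illustrative assumptions, not values from the paper.

```python
import math


def ols_slope(x, y):
    """OLS slope of y regressed on x (minimizes vertical distances)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def pearson_r(x, y):
    """Sample correlation coefficient between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Illustrative data (hypothetical, not from the paper).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.3, 2.9, 4.4, 4.8, 6.1]

b_yx = ols_slope(x, y)   # slope of Y on X
b_xy = ols_slope(y, x)   # slope of X on Y
product = b_yx * b_xy
r2 = pearson_r(x, y) ** 2

print(product, r2)       # the two numbers agree and are below 1
```

Because the points are not exactly collinear, the product stays strictly below one, which is the gap the perpendicular-distance approach is meant to close.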
The Minimum Perpendicular Distance Square method suggested by Hossain and Khalaf (2009) produces estimators for which $\hat{\beta}_{YX}\,\hat{\beta}_{XY} = 1$ in all cases, which indicates that the errors are really minimized and that the method gives a more accurate result than OLS [3].

Concept of Minimum Perpendicular Distance Square (MPDS) Estimation
The real distances of the observed points $(X_i, Y_i)$ from the assumed regression line $Y = \beta_0 + \beta_1 X$ are not the vertical distances, i.e. the height of the point minus the height of the regression line, $u_i = Y_i - \beta_0 - \beta_1 X_i$.

In fact, the actual distances from the line are the perpendicular distances (as indicated in Figure 1). These perpendicular distances are also positive or negative according to whether the point lies above or below the line. Hence estimating $\beta_1$ and $\beta_0$ by minimizing the sum of the squares of these perpendicular distances will produce the fitted regression line closest to the points $(X_i, Y_i)$, which may be used for more accurate prediction purposes.
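The difference between the two notions of distance can be made concrete with the standard point-to-line formula, under which the signed perpendicular distance of $(X_i, Y_i)$ from $Y = \beta_0 + \beta_1 X$ is $(Y_i - \beta_0 - \beta_1 X_i)/\sqrt{1 + \beta_1^2}$. The sketch below uses hypothetical function names and an arbitrary point and line chosen only for illustration.

```python
import math


def vertical_distance(x, y, b0, b1):
    """Signed vertical distance: height of the point minus height of the line."""
    return y - (b0 + b1 * x)


def perpendicular_distance(x, y, b0, b1):
    """Signed perpendicular distance from (x, y) to the line Y = b0 + b1 * X."""
    return (y - b0 - b1 * x) / math.sqrt(1.0 + b1 ** 2)


# Illustrative point (1, 3) and line Y = 1 + X (assumed values).
v = vertical_distance(1.0, 3.0, 1.0, 1.0)       # 1.0
p = perpendicular_distance(1.0, 3.0, 1.0, 1.0)  # 1 / sqrt(2), about 0.707

print(v, p)
```

The perpendicular distance is never larger in magnitude than the vertical one, and the two share the same sign, so a point above the line has a positive distance under either definition.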

The Generalized Minimum Perpendicular Distance Squares Method (GMPDSM)
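Let us consider the two-variable linear regression function
$$Y_i = \beta_0 + \beta_1 X_i + u_i,$$
which for ease of algebraic manipulation we write as
$$Y_i = \beta_0 X_{0i} + \beta_1 X_i + u_i, \qquad (1)$$
where $X_{0i} = 1$ for each $i$ and the response variable $Y$ is related to the explanatory variable $X$ through the regression coefficient $\beta_1$, the constant intercept $\beta_0$ and the random disturbance term $u_i$. We know that one of the important assumptions of the classical linear regression model is that the variance of each disturbance term $u_i$, conditional on the chosen values of the explanatory variables, is some constant number equal to $\sigma^2$. This is the assumption of homoscedasticity. Symbolically,
$$E(u_i^2) = \sigma^2, \qquad i = 1, 2, \dots, n.$$
Now suppose that this assumption fails, so that $E(u_i^2) = \sigma_i^2$, and suppose the heteroscedastic variances $\sigma_i^2$ are known. Then dividing (1) by $\sigma_i$ on both sides, we get
$$\frac{Y_i}{\sigma_i} = \beta_0 \frac{X_{0i}}{\sigma_i} + \beta_1 \frac{X_i}{\sigma_i} + \frac{u_i}{\sigma_i},$$
which for ease of exposition we write as
$$Y_i^{*} = \beta_0^{*} X_{0i}^{*} + \beta_1^{*} X_i^{*} + u_i^{*},$$
where the transformed (starred) variables are the original variables divided by (the known) $\sigma_i$. We use the notation $\beta_0^{*}$ and $\beta_1^{*}$, the parameters of the transformed model, to distinguish them from the usual MPDS parameters $\beta_0$ and $\beta_1$. Now we see
$$\operatorname{var}(u_i^{*}) = E\left(u_i^{*2}\right) = E\left(\frac{u_i}{\sigma_i}\right)^{2} = \frac{1}{\sigma_i^2} E\left(u_i^2\right) \quad \text{since } \sigma_i^2 \text{ is known}$$
$$= \frac{1}{\sigma_i^2}\,\sigma_i^2 \quad \text{since } E\left(u_i^2\right) = \sigma_i^2$$
$$= 1,$$
which is a constant. That is, the variance of the transformed disturbance term is now homoscedastic.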
This procedure transforms the original variables in such a way that the transformed variables satisfy the assumptions of the classical model. Applying the MPDS method to this transformed model to estimate the parameters is what we call the Generalized Minimum Perpendicular Distance Squares Method (GMPDSM). In short, GMPDS is MPDS on the transformed variables that satisfy the classical regression assumptions. The estimators thus obtained are known as GMPDSM estimators.
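The transformation itself is a one-line rescaling of each observation by its known standard deviation. The following minimal sketch assumes the $\sigma_i$ are known; the function name `transform`, the data and the trial coefficients are hypothetical and serve only to show that a residual of the transformed model equals the original residual divided by $\sigma_i$.

```python
def transform(x, y, sigma):
    """Divide each observation by its known standard deviation sigma_i.

    Returns the transformed constant column (1 / sigma_i), the transformed
    explanatory variable and the transformed response.
    """
    x0_star = [1.0 / s for s in sigma]
    x_star = [xi / s for xi, s in zip(x, sigma)]
    y_star = [yi / s for yi, s in zip(y, sigma)]
    return x0_star, x_star, y_star


# Illustrative data and trial coefficients (assumed, not from the paper).
x = [1.0, 2.0, 4.0]
y = [2.0, 5.0, 7.0]
sigma = [0.5, 1.0, 2.0]
b0, b1 = 1.0, 2.0

x0_star, x_star, y_star = transform(x, y, sigma)

# Residuals of the original model and of the transformed model.
resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
resid_star = [ys - b0 * x0s - b1 * xs
              for x0s, xs, ys in zip(x0_star, x_star, y_star)]

print(resid)       # residuals on the original scale
print(resid_star)  # each one is the original residual divided by sigma_i
```

Observations with a large $\sigma_i$ are shrunk towards zero, so they contribute less to any sum of squares computed on the transformed scale.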

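Perpendicular Distance from the Points $(X_i^{*}, Y_i^{*})$ to the Line $Y_i^{*} = \beta_0^{*} X_{0i}^{*} + \beta_1^{*} X_i^{*}$
Let us consider the two-variable linear regression function $Y_i = \beta_0 X_{0i} + \beta_1 X_i + u_i$. Dividing both sides by $\sigma_i$ we have
$$\frac{Y_i}{\sigma_i} = \beta_0 \frac{X_{0i}}{\sigma_i} + \beta_1 \frac{X_i}{\sigma_i} + \frac{u_i}{\sigma_i}.$$
For estimating $\beta_0^{*}$ and $\beta_1^{*}$ we need to determine the perpendicular distance from the observed point $(X_i^{*}, Y_i^{*})$ to the line $Y_i^{*} = \beta_0^{*} X_{0i}^{*} + \beta_1^{*} X_i^{*}$. The perpendicular distance is
$$\hat{u}_i^{*} = \frac{Y_i^{*} - \beta_0^{*} X_{0i}^{*} - \beta_1^{*} X_i^{*}}{\sqrt{1 + \beta_1^{*2}}} = \frac{Y_i - \beta_0^{*} - \beta_1^{*} X_i}{\sigma_i \sqrt{1 + \beta_1^{*2}}}.$$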

Parameter Estimation Based on GMPDS Method
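To obtain the GMPDS estimators, we minimize the sum of squares of the perpendicular distances $\hat{u}_i^{*}$ from the points $(X_i^{*}, Y_i^{*})$ to the line; the following steps are taken.
$$S = \sum \hat{u}_i^{*2} = \sum \frac{\left(Y_i^{*} - \beta_0^{*} X_{0i}^{*} - \beta_1^{*} X_i^{*}\right)^2}{1 + \beta_1^{*2}} = \sum \frac{w_i \left(Y_i - \beta_0^{*} - \beta_1^{*} X_i\right)^2}{1 + \beta_1^{*2}}, \qquad (5)$$
where the weights are $w_i = 1/\sigma_i^2$. Differentiating (5) with respect to $\beta_1^{*}$ and setting the derivative equal to zero at $\hat{\beta}_1^{*}$ gives
$$\left(1 + \hat{\beta}_1^{*2}\right) \sum w_i X_i \left(Y_i - \hat{\beta}_0^{*} - \hat{\beta}_1^{*} X_i\right) + \hat{\beta}_1^{*} \sum w_i \left(Y_i - \hat{\beta}_0^{*} - \hat{\beta}_1^{*} X_i\right)^2 = 0. \qquad (6)$$
Again differentiating Equation (5) with respect to $\beta_0^{*}$ and equating to zero at $\hat{\beta}_0^{*}$ gives
$$\hat{\beta}_0^{*} = \bar{Y}_w - \hat{\beta}_1^{*} \bar{X}_w, \qquad (7)$$
where $\bar{X}_w = \sum w_i X_i / \sum w_i$ and $\bar{Y}_w = \sum w_i Y_i / \sum w_i$. Using Equation (7) in Equation (6) we get
$$SPwxy\,\hat{\beta}_1^{*2} + \left(SSwx - SSwy\right)\hat{\beta}_1^{*} - SPwxy = 0,$$
where $SSwx = \sum w_i (X_i - \bar{X}_w)^2$, $SSwy = \sum w_i (Y_i - \bar{Y}_w)^2$ and $SPwxy = \sum w_i (X_i - \bar{X}_w)(Y_i - \bar{Y}_w)$. So the solution of the above equation is:
$$\hat{\beta}_1^{*} = \frac{-\left(SSwx - SSwy\right) \pm \sqrt{\left(SSwx - SSwy\right)^2 + 4\,SPwxy^2}}{2\,SPwxy}.$$
Using this result in Equation (7) we can estimate $\beta_0^{*}$, and hence the fitted line $\hat{Y}_i = \hat{\beta}_0^{*} + \hat{\beta}_1^{*} X_i$. In this method we get two regression coefficients. It can be proved that the "+" solution, i.e. $\hat{\beta}_{1(+)}^{*}$, gives the minimum of (5), and hence we suggest the reader use $\hat{\beta}_{1(+)}^{*}$ as the regression coefficient; accordingly, the regression constant $\beta_0^{*}$ can be estimated by using Equation (7).

Copyright © 2012 SciRes. AM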
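The estimation steps can be sketched in a few lines of code. This is a minimal illustration, not the authors' implementation: it assumes the objective is $\sum w_i (Y_i - \beta_0^{*} - \beta_1^{*} X_i)^2 / (1 + \beta_1^{*2})$ with $w_i = 1/\sigma_i^2$, takes the "+" root of the resulting quadratic, and uses hypothetical names (`gmpds_fit`, `objective`) and made-up data.

```python
import math


def gmpds_fit(x, y, sigma):
    """Return (b0, b1) from the '+' root of the weighted perpendicular fit.

    Assumes known standard deviations sigma_i and weights w_i = 1 / sigma_i**2.
    """
    w = [1.0 / s ** 2 for s in sigma]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of X
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of Y
    ss_wx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    ss_wy = sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, y))
    sp_wxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    if sp_wxy == 0.0:
        raise ValueError("weighted cross-product is zero; slope is undefined")
    d = ss_wy - ss_wx
    b1 = (d + math.sqrt(d * d + 4.0 * sp_wxy ** 2)) / (2.0 * sp_wxy)  # '+' root
    b0 = ybar - b1 * xbar
    return b0, b1


def objective(b0, b1, x, y, sigma):
    """Sum of squared perpendicular distances in the transformed model."""
    total = sum(((yi - b0 - b1 * xi) / s) ** 2 for xi, yi, s in zip(x, y, sigma))
    return total / (1.0 + b1 ** 2)


# Illustrative heteroscedastic data (assumed values, not from the paper).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.3, 2.9, 4.4, 4.8, 6.1]
sigma = [0.5, 0.5, 1.0, 1.0, 2.0]

b0_hat, b1_hat = gmpds_fit(x, y, sigma)
print(b0_hat, b1_hat, objective(b0_hat, b1_hat, x, y, sigma))
```

Evaluating `objective` at nearby coefficient values gives a simple numerical check that the "+" root is a minimum rather than the maximum that the other root corresponds to.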

Estimation of Regression Coefficient by Using GMPDS for the Model $X_i = \beta_0^{**} + \beta_1^{**} Y_i + u_i^{**}$
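To estimate the regression coefficient $\beta_1^{**}$ and the regression constant $\beta_0^{**}$ by minimizing the sum of squares of the error terms $\hat{u}_i^{**}$, taken to be the perpendicular distances from the observed points to the fitted line $\hat{X}_i = \hat{\beta}_0^{**} + \hat{\beta}_1^{**} Y_i$, we follow steps similar to those in Section 3.2.
Differentiating the sum of squares with respect to $\beta_0^{**}$ and $\beta_1^{**}$, setting the derivatives equal to zero and solving for $\hat{\beta}_0^{**}$ and $\hat{\beta}_1^{**}$, we get the following solutions:
$$\hat{\beta}_1^{**} = \frac{-\left(SSwy - SSwx\right) \pm \sqrt{\left(SSwy - SSwx\right)^2 + 4\,SPwxy^2}}{2\,SPwxy}, \qquad \hat{\beta}_0^{**} = \bar{X}_w - \hat{\beta}_1^{**} \bar{Y}_w,$$
where $SSwx$, $SSwy$ and $SPwxy$ are the weighted sums of squares and products about the weighted means $\bar{X}_w$ and $\bar{Y}_w$, with weights $w_i = 1/\sigma_i^2$. Here we also get two regression coefficients and, for the same reason as mentioned in Section 3.2, we suggest the reader use the "+" solution $\hat{\beta}_{1(+)}^{**}$ as the regression coefficient; accordingly, the estimate of $\beta_0^{**}$ may be obtained to fit the regression line of $X$ on $Y$.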

Relationship between Regression Coefficients
If we use the GMPDS method to estimate the regression coefficients $\beta_1^{*}$ and $\beta_1^{**}$ as indicated in Sections 3.2 and 3.3, by minimizing the error terms $\hat{u}_i^{*}$ and $\hat{u}_i^{**}$ respectively (the perpendicular distances from these lines to the observed points), we get
$$\hat{\beta}_{1(+)}^{*}\,\hat{\beta}_{1(+)}^{**} = 1,$$
which indicates that when the regression coefficients are estimated by the GMPDS method in the case of heteroscedasticity, the error term is really minimized. This is a new angle from which to advocate the advantage of our suggested method (GMPDSM) for estimating regression coefficients in the case of heteroscedasticity.
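The unit product of the two slopes can be verified numerically. The sketch below assumes the same weights $w_i = 1/\sigma_i^2$ are used for both regressions and that the "+" root is taken in each case; the helper names and the data are hypothetical. For contrast it also computes the product of the two weighted least-squares slopes, which equals the squared weighted correlation.

```python
import math


def weighted_moments(x, y, w):
    """Weighted sums of squares and cross-products about the weighted means."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    ss_x = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    ss_y = sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, y))
    sp_xy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    return ss_x, ss_y, sp_xy


def perpendicular_slope(ss_a, ss_b, sp_ab):
    """'+' root slope for regressing b on a under the perpendicular criterion."""
    d = ss_b - ss_a
    return (d + math.sqrt(d * d + 4.0 * sp_ab ** 2)) / (2.0 * sp_ab)


# Illustrative heteroscedastic data (assumed values, not from the paper).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.3, 2.9, 4.4, 4.8, 6.1]
sigma = [0.5, 0.5, 1.0, 1.0, 2.0]
w = [1.0 / s ** 2 for s in sigma]

ss_x, ss_y, sp_xy = weighted_moments(x, y, w)

slope_y_on_x = perpendicular_slope(ss_x, ss_y, sp_xy)  # Y regressed on X
slope_x_on_y = perpendicular_slope(ss_y, ss_x, sp_xy)  # X regressed on Y
product = slope_y_on_x * slope_x_on_y

# Weighted least-squares slopes for comparison: product is the weighted r^2.
wls_product = (sp_xy / ss_x) * (sp_xy / ss_y)

print(product, wls_product)
```

Algebraically, writing $D = SSwy - SSwx$ and $R = \sqrt{D^2 + 4\,SPwxy^2}$, the product is $(R + D)(R - D)/(4\,SPwxy^2) = 1$, so the result does not depend on the particular data used here.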

Concluding Remarks
The MPDS method of estimation actually minimizes the real distances from the observed points to the fitted regression line, whereas the OLS and GLS methods fail to do so because they use the vertical distances from the observed points to the fitted regression line. However, one of the crucial assumptions of the MPDS method, and also of the traditional OLS method, is that the variance of each disturbance term is some constant number ($\sigma^2$), so we cannot apply the MPDS method when this assumption is violated. That is, in the presence of heteroscedasticity, OLS and MPDS are not suitable. In this paper our main focus is on minimum perpendicular deviations in the case of heteroscedasticity, and we have shown mathematically that the GMPDS method gives an estimator for which the error term is really minimized. Hence we propose the GMPDS method for the case of heteroscedasticity.