Identifying Unusual Observations in Ridge Regression Linear Model Using Box-Cox Power Transformation Technique

The Box-Cox power transformation [1] is now widely used in regression analysis, and over the last two decades considerable attention has been given to diagnostic methods for it, much of which has involved deleting influential data cases. The pioneering work of [2] studied local influence under perturbation of the constant variance assumption in the Box-Cox unbiased regression linear model. Tsai and Wu [3] used the local influence method of [2] to assess the effect of case-weights perturbation on the transformation-power estimator in the Box-Cox unbiased regression linear model. Many authors have noted that the observations that are influential for biased estimators differ from those that are influential for unbiased estimators. In this paper I describe a diagnostic method for assessing local influence, under perturbation of the constant variance assumption, on the transformation parameter in the Box-Cox biased ridge regression linear model. Two real macroeconomic data sets are used to illustrate the methodology.


Introduction
Deletion diagnostics for assessing influential cases on the power transformation parameter estimator in the Box-Cox linear unbiased regression model have been studied intensively over the last two and a half decades (see [4][5][6]). Rather than deleting the influential case, Cook [7] proposed a general method for assessing the local influence of minor perturbations of a statistical model. Lawrance [2] adapted Cook's approach to obtain a diagnostic that can be used to examine the local changes of the transformation-parameter estimator caused by small perturbations of the constant-variance assumption. Tsai and Wu [8] analyzed the case-deletion model directly and obtained a more accurate and reliable transformation power estimator in the weighted regression model. Tsai and Wu [3] also applied a case-weights perturbation scheme to obtain an alternative local influence diagnostic that takes into account the perturbation effects of the Jacobian.
In the literature, many authors have noted that the observations that are influential for ridge-type estimators differ from those for the corresponding least squares estimator (see [9]). The aim of this paper is to apply local influence, under minor perturbation of the constant variance assumption, to the biased ridge regression Box-Cox power transformation model. The structure of the paper is as follows. Section 2 sets out the transformation for the ridge regression model used to deal with local influence on the power transformation estimator. Section 3 gives the calculation of the maximum local influence for the ridge estimator. In Section 4 two real macroeconomic data sets are used to illustrate the methodology. Conclusions are given in the last section.

Transformation for Ridge Regression Model
A parametric family of transformations takes a response vector $y$ into $y^{(\lambda)}$, where $\lambda$ is a scalar transformation parameter. It is assumed that there is a value $\lambda_0$ of $\lambda$ for which $y^{(\lambda_0)}$ follows a standard regression model with design matrix $X$. Thus, it is assumed that

$$y^{(\lambda_0)} = X\beta_R + \varepsilon_R, \qquad (1)$$

where $\beta_R = (X'X + kI)^{-1}X'y$ is the ridge regression estimator (RRE) proposed by Hoerl and Kennard (see [10,11]) with biasing parameter $k$, $X$ is a known full-rank $n \times p$ matrix, and $\varepsilon_R$ is the $n \times 1$ random error vector of the RRE. The error vector has a multivariate normal distribution with $\mathrm{Var}(\varepsilon_R) = \sigma^2 I$, where $I$ is an identity matrix. Equation (1) was originally proposed by Box and Cox [1]. If the Jacobian of the transformation from $y$ to $y^{(\lambda)}$ is denoted by $J(\lambda)$, then $z^{(\lambda)} = y^{(\lambda)} / J(\lambda)^{1/n}$ is the natural scaling of $y^{(\lambda)}$ suggested by the likelihood.
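The quantities in this section can be illustrated with a minimal sketch. It assumes the standard Box-Cox form $y^{(\lambda)} = (y^{\lambda} - 1)/\lambda$ (with $\log y$ at $\lambda = 0$), whose Jacobian is $J(\lambda) = \prod_i y_i^{\lambda - 1}$, so that $J(\lambda)^{1/n}$ is the geometric mean of $y$ raised to the power $\lambda - 1$. The function names and the simulated data are illustrative only and are not taken from the paper.

```python
import numpy as np

def box_cox(y, lam):
    """Box-Cox power transformation y^(lam) for strictly positive y."""
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("Box-Cox requires strictly positive responses")
    if abs(lam) < 1e-12:
        return np.log(y)                      # limiting case lam -> 0
    return (y**lam - 1.0) / lam

def scaled_box_cox(y, lam):
    """Scaled transform z^(lam) = y^(lam) / J(lam)^(1/n).

    With J(lam) = prod(y_i^(lam - 1)), J(lam)^(1/n) equals the
    geometric mean of y raised to the power (lam - 1).
    """
    y = np.asarray(y, dtype=float)
    geometric_mean = np.exp(np.mean(np.log(y)))
    return box_cox(y, lam) / geometric_mean ** (lam - 1.0)

def ridge_estimator(X, y, k):
    """Ridge estimator (X'X + kI)^(-1) X'y with biasing parameter k."""
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

# Illustrative (simulated) data, not one of the paper's data sets.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = np.exp(0.3 * X[:, 0] + rng.normal(scale=0.1, size=20))
z = scaled_box_cox(y, 0.5)
beta_r = ridge_estimator(X, z, k=0.01)
```

Setting `k = 0` in `ridge_estimator` recovers the ordinary least squares estimator, which is a convenient check when comparing biased and unbiased fits.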

Local Influence Approach for Power Transformation on RRE
For assessing the local influence on the power transformation estimator, Lawrance [2] obtained a diagnostic by perturbing the constant variance assumption.
First, it is assumed that the variance of $\varepsilon_R$ under perturbation is

$$\mathrm{Var}(\varepsilon_R) = \sigma^2 W^{-1}, \qquad (2)$$

where $\omega = \omega_0 + al$, $\omega_0 = 1$, $\omega$ denotes the $n \times 1$ vector of case-weights for the regression, and $l$ is a fixed nonzero vector of unit length in $R^q$.
The distribution function of the random error in the RRE linear model is […], where $J(\lambda)$ is the Jacobian of the transformation. Under perturbation, the distribution in Equation (3) becomes […]. The corresponding log-likelihood function for the untransformed observation $y$ is […]. The profile likelihood function for $\lambda$ is thus obtained by maximizing over the RRE $\beta_R$ and $\sigma^2$ for the given data set $y$. The maximum likelihood estimator (MLE) of $\beta_R$ is […], and the MLE of $\sigma^2$ is $\hat{\sigma}^2$, which has to be estimated from Equation (5) as […], where the […]

OPEN ACCESS OJS
From the above results, the likelihood function for the transformed observation $y$ with perturbation can be written as […], where […]. The resulting MLE $\hat{\lambda}_\omega$ of $\lambda$ can be found by minimizing Equation (6). Furthermore, the estimator $\hat{\lambda}_\omega$ can be regarded as a surface with Euclidean coordinates $\omega$. A curve over this surface is mapped from a straight-line path that passes through the point of the null perturbation. The direction and location of this path are specified by $\omega = 1 + al$, passing through the null point, where the quantity $a$ measures distance along the line and the elements of $l$ are the direction cosines of Lawrance's local influence diagnostic. The partial derivative of Equation (6) with respect to $\lambda$ is […], from which the result […] is obtained. But it is known that […]. The transpose of Equation (7) can be written as […].
The local influence diagnostic is the slope of the curve on the $\hat{\lambda}_\omega$ surface at the point of null perturbation, $a = 0$. If the slope is 0, small perturbations along the path have no effect. The direction of the path points to the data cases being perturbed; the weighted set of cases most sensitive to local perturbations is thus specified by the direction that makes the path slope greatest at the null point $a = 0$. This is the key idea in the local-influence approach. It is not principally the value of the slope but the direction of maximum slope that is important and that forms the main diagnostic. This description is the basis of Cook's presentation when just one parameter is being considered; there is then no need to use a likelihood-displacement measure of the distance between $\hat{\lambda}_\omega$ and $\hat{\lambda}$. This also removes the need to consider the curvature of the likelihood displacement, and avoids a loss of sign in connection with […]
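The idea of a slope along the path $\omega = 1 + al$ can be checked numerically. The sketch below is not the closed-form diagnostic of this paper. It assumes that the perturbed errors have variance $\sigma^2/\omega_i$, that the ridge fit is the weighted one $(X'WX + kI)^{-1}X'Wy^{(\lambda)}$ with $W = \mathrm{diag}(\omega)$, and that $\hat{\lambda}_\omega$ minimizes the resulting negative profile log-likelihood. The slope at $a = 0$ is then approximated by a central difference. All function names and the simulated data are hypothetical.

```python
import numpy as np

def box_cox(y, lam):
    """Box-Cox transform of positive y; log(y) in the limit lam = 0."""
    return np.log(y) if abs(lam) < 1e-12 else (y**lam - 1.0) / lam

def neg_profile_loglik(lam, y, X, k, omega):
    """Negative profile log-likelihood in lam, dropping terms free of lam.

    ASSUMPTION: Var(eps_i) = sigma^2 / omega_i and a weighted ridge fit.
    """
    n, p = X.shape
    t = box_cox(y, lam)
    XtW = X.T * omega                          # X'W with W = diag(omega)
    beta = np.linalg.solve(XtW @ X + k * np.eye(p), XtW @ t)
    resid = t - X @ beta
    rss = np.sum(omega * resid**2)             # weighted residual sum of squares
    # (lam - 1) * sum(log y) is the log-Jacobian of the transformation.
    return 0.5 * n * np.log(rss / n) - (lam - 1.0) * np.sum(np.log(y))

def lambda_hat(y, X, k, omega, lo=-2.0, hi=2.0, tol=1e-10):
    """Golden-section search for the minimizing lam on [lo, hi]."""
    f = lambda lam: neg_profile_loglik(lam, y, X, k, omega)
    g = (np.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:                            # minimum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - g * (hi - lo)
            fc = f(c)
        else:                                  # minimum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + g * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)

def directional_slope(y, X, k, l, h=1e-3):
    """Central-difference slope of lambda_hat(1 + a*l) at a = 0."""
    l = np.asarray(l, dtype=float)
    l = l / np.linalg.norm(l)                  # unit-length direction
    ones = np.ones(len(y))
    up = lambda_hat(y, X, k, ones + h * l)
    down = lambda_hat(y, X, k, ones - h * l)
    return (up - down) / (2.0 * h)

# Illustrative (simulated) data, not one of the paper's data sets.
rng = np.random.default_rng(1)
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = np.exp(1.0 + X[:, 1:] @ np.array([0.4, -0.3]) + 0.2 * rng.normal(size=n))
lam0 = lambda_hat(y, X, 0.05, np.ones(n))      # null perturbation, a = 0
first_case = np.zeros(n)
first_case[0] = 1.0
slope_case1 = directional_slope(y, X, 0.05, first_case)
```

A derivative-free search is used here only to keep the sketch free of extra dependencies; the search interval `[-2, 2]` is an arbitrary choice and assumes a single minimum inside it.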

Calculation of the Direction of Maximum Local Influence in RRE
In this section a method is developed for finding the direction that maximizes the slope of $\hat{\lambda}(a)$ at $a = 0$. Let $L$ denote the diagonal matrix with diagonal entries […].
A. JAHUFER
The maximum likelihood estimator $\hat{\lambda}(a)$ satisfies Equation (8), where each term is a function of $a$. Hence, differentiate Equation (8) with respect to $a$, first differentiating with respect to $\hat{\lambda}(a)$ when dealing with the variables $Z^{(\lambda)}$ and […]; this gives Equation (10), where […] is the second-order derivative of $z^{(\lambda)}$ with respect to $\lambda$ and can be called the second constructed variable; all $z^{(\lambda)}$ terms in Equation (10) are evaluated at […]. Differentiating the perturbation matrix […] with respect to $a$ then gives […], where […] and $M$ is a square matrix. Hence, Equation (10) becomes […]. The direction $l_{\max}$ of maximum slope is now determined by Equation (11), which gives the following results.
The matrix $M$ can be written as $M = ELF$, where $E$ and $F$ are symmetric matrices. Therefore, […]. Let the terms containing […] be denoted by $t$ and $t'$, respectively. Equation (11) then becomes […]. The direction $l_{\max}$ of maximum slope is now easily determined from Equation (12), that is, […], and the $i$-th element of $l_{\max}$ is […]
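Because only one parameter is involved, the slope at $a = 0$ along a unit direction $l$ is the inner product of $l$ with the gradient of $\hat{\lambda}$ with respect to $\omega$ at $\omega = 1$, so the direction of maximum slope is that gradient scaled to unit length. The sketch below uses this fact to approximate $l_{\max}$ by finite differences. It is a numerical stand-in, not the closed form of Equation (13). The function `toy_lambda_hat` is a hypothetical placeholder for an estimator of $\lambda$ as a function of the case-weights.

```python
import numpy as np

def slope_gradient(lambda_hat, n, h=1e-4):
    """Central-difference gradient of lambda_hat(omega) at omega = 1.

    Entry i is the slope obtained when only the i-th case weight is perturbed.
    """
    grad = np.empty(n)
    for i in range(n):
        step = np.zeros(n)
        step[i] = h
        grad[i] = (lambda_hat(1.0 + step) - lambda_hat(1.0 - step)) / (2.0 * h)
    return grad

def max_slope_direction(grad):
    """Unit vector maximizing the directional slope l'grad (Cauchy-Schwarz)."""
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        raise ValueError("the slope is zero in every direction")
    return grad / norm

# Hypothetical smooth estimator used only to exercise the functions.
coef = np.array([0.5, -2.0, 0.1, 1.0])

def toy_lambda_hat(omega):
    return 0.2 + coef @ (omega - 1.0)

grad = slope_gradient(toy_lambda_hat, n=4)
l_max = max_slope_direction(grad)
```

In practice `toy_lambda_hat` would be replaced by a function that refits the transformation parameter for a given weight vector; the cases with the largest absolute elements of `l_max` are the ones the direction singles out.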

Finally, consider the slope of $\hat{\lambda}(a)$ at $a = 0$ when just the $i$-th variance is perturbed; denoting it by $\lambda'_i$ gives […]. The result in Equation (14) is the local-influence version of computing $\lambda$ after deleting the $i$-th data case, an operation analogous to the global perturbation $\omega_i = 0$ and […]; the global perturbation is itself algebraically intractable without approximation (see [5]).

Macroeconomic Impact of Foreign Direct Investment (MIFDI) in Sri Lanka Data Set
Sun [12] studied MIFDI in China for 1979-1996. Based on his theory, MIFDI data were collected for Sri Lanka from 1978 to 2004 to illustrate the methodology derived in this paper. The data set consists of four regressors (Foreign Direct Investment (FDI), Gross Domestic Product Per Capita (GDPPC), Exchange Rate (ER) and Interest Rate (IR)) and one response variable (Total Domestic Investment (TDI)), with 27 observations. The selected variables were tested for three statistical conditions: 1) cointegration, 2) constant error variance and 3) multicollinearity. The test results showed that: 1) the variables are cointegrated, with the same order of integration, I(1), at the 1% level of significance; 2) the estimated Durbin-Watson value for the linear model is 2.0131, which was taken as satisfying the constant error variance condition (strictly, a Durbin-Watson value near 2 indicates no first-order autocorrelation in the errors); and 3) the scaled condition number of this data set is 31244, a large value that suggests an unusually high level of multicollinearity among the regressors (the proposed cutoff is 30; see [9]). Hence, the RRE is preferable to the ordinary least squares estimator for fitting a model to this data set.
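A scaled condition number of this kind can be computed as in the following sketch. It assumes the common definition in which each column of the design matrix is scaled to unit length and the condition number is the ratio of the largest to the smallest singular value; the paper does not state which variant it uses. Some authors report the eigenvalue ratio of the scaled $X'X$ instead, which is the square of this value. The function name and the example matrices are illustrative.

```python
import numpy as np

def scaled_condition_number(X):
    """Condition number of X after scaling each column to unit length.

    ASSUMPTION: ratio of largest to smallest singular value of the
    column-scaled matrix; the eigenvalue ratio of X'X is its square.
    """
    X = np.asarray(X, dtype=float)
    X_scaled = X / np.linalg.norm(X, axis=0)   # unit-length columns
    s = np.linalg.svd(X_scaled, compute_uv=False)  # sorted in descending order
    return s[0] / s[-1]

# Orthogonal columns give the minimum value of 1.
well_conditioned = np.array([[1.0, 0.0], [0.0, 2.0]])
# Two nearly identical columns give a very large value.
nearly_collinear = np.array([[1.0, 1.0], [1.0, 1.0001], [1.0, 0.9999]])

kappa_good = scaled_condition_number(well_conditioned)
kappa_bad = scaled_condition_number(nearly_collinear)
```

Column scaling matters because the raw condition number changes with the units of the regressors, while the scaled version does not.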
The transformation parameter $\lambda$ estimated for this data set using the Box-Cox transformation model (the necessary formulas for its implementation are given in the Appendix) is $\hat{\lambda} = 0.236$.
The $R^2$ after the scale-preserving transformation is 96.6%, the standard deviation $\sigma$ is 1.23374, and there is hardly any interaction; these are considerable improvements over the original values of 95.9% and 12665.3, respectively, and a large interaction. The RRE biasing parameter estimated for this data set is $k = 0.0063$. The $l_{\max}$ values are estimated using Equation (13); the values and the corresponding index plot are given in Table 1 and Figure 1, respectively.
From Table 1 and Figure 1, it can be observed that the five most influential cases are 1, 3, 23, 14, and 20, in this order, in the Box-Cox transformation analysis for RRE in the MIFDI data.

Longley Data
The second data set is the Longley data [13], used to explain the influential observations on the Liu estimator. The scaled condition number of this data set is 43,275 (see [14]). This large value suggests the presence of severe multicollinearity among the regressors. Cook [15] used these data to identify the influential observations for the ordinary least squares estimator using Cook's $D_i$ and found that cases 5, 16, 4, 10, and 15 (in this order) were the most influential. Walker and Birch [14] analyzed the same data to detect anomalous cases in ridge regression using a global influence method and observed that cases 16, 10, 4, 15 and 5 (in this order) were the most influential. Shi and Wang [16] also analyzed the same data to detect influential cases on the ridge regression estimator using […]
For the Longley data, the parameter $\lambda$ estimated using the Box-Cox transformation model (the necessary formulas for its implementation are given in the Appendix) is $\hat{\lambda} = 1.475$. The adjusted $R^2$ after the scale-preserving transformation is 99.3%, the standard deviation $\sigma$ is 3.42981, and there is hardly any interaction. Against the original values of 99.3% and 16.8976, respectively, and a large interaction, the adjusted $R^2$ is unchanged while the standard deviation and the interaction improve considerably. The RRE biasing parameter for the Longley data is $k = 0.00146$. The $l_{\max}$ values are estimated using Equation (13), and the index plot of these values is given in Figure 2.
From the index plot in Figure 2 it can be seen that the five most influential cases are 13, 10, 14, 8, and 12, in this order, in the Box-Cox transformation analysis for RRE in the Longley data. Compared with the influential cases found in the previous studies, this method detects some new influential cases.
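Reading the most influential cases off an index plot amounts to ordering the elements of $l_{\max}$. The sketch below assumes that cases are ranked by the absolute value of their element, which is the usual reading of such plots but is not stated explicitly in the text. The numbers are made up for illustration and are not the values from Table 1 or Figure 2.

```python
import numpy as np

def most_influential_cases(l_max, top=5):
    """Return 1-based case numbers ordered by decreasing |l_max| element.

    ASSUMPTION: influence is judged by absolute size, ignoring sign.
    """
    l_max = np.asarray(l_max, dtype=float)
    order = np.argsort(-np.abs(l_max), kind="stable")
    return [int(i) + 1 for i in order[:top]]

# Hypothetical direction vector for six cases.
example_l_max = [0.1, -0.7, 0.3, 0.05, -0.2, 0.6]
top_three = most_influential_cases(example_l_max, top=3)
```

Since no cutoff values exist for these diagnostics, a ranking of this kind only orders the cases; it does not decide how many of them should be treated as influential.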

Conclusions
In this paper, I have studied the Box-Cox power transformation for the biased RRE, which seems practical and can play a considerable part in RRE data analysis. The local influence measure introduced focuses on perturbing the constant variance assumption. The influential cases detected by this method for the biased RRE differ from those detected by the global and local influence methods for RRE. Although no conventional cutoff points have been introduced or developed for the RRE Box-Cox power transformation diagnostic quantities, the index plot appears to be a promising and conventional procedure for disclosing influential cases. The lack of cutoff values remains a bottleneck for the influence method. In addition, the issue of accommodating influential cases has not been studied. These are open issues for future research.

[…] for $a = 0$ in an arbitrary direction $l$. Let $W$ in Equation (2) have diagonal entries […]

Figure 1. Index plot of $l_{\max}$ in RRE using Box-Cox transformation in MIFDI data.
Figure 2. Index plot of $l_{\max}$ in RRE using Box-Cox transformation in Longley data.