The Shapiro-Wilk test (SWT) for normality is well known for its competitive power against numerous one-dimensional alternatives. Several extensions of the SWT to multiple dimensions have also been proposed. This paper investigates the relative strength and rotational robustness of some SWT-based normality tests. In particular, Royston's H-test and the SWT-based test proposed by Villaseñor-Alva and González-Estrada have R packages available for testing multivariate normality; they are therefore user friendly, but they lack rotational robustness compared to the test proposed by Fattorini. A numerical power comparison is provided for illustration, along with some practical guidelines on choosing among these SWT-type tests.

Normal distributions are of central importance in statistical inference and in numerous applications. Thus, testing for normality, including assessing multivariate normality, has been studied extensively in statistics. For instance, in a research monograph, Thode [

Originally created to test univariate distributions for normality, given univariate data

where

vector and covariance matrix of the order statistics of a random standard normal sample of size

riate data

inverse of its covariance matrix

degenerate with probability one. Without loss of generality, let

Under the null hypothesis,

where

age called mvShapiroTest that makes it very user friendly [

The multivariate normal distributions have rotational invariance. In particular,

[

roTest or the Royston test. The FA test statistic is given by

bust power properties of the FA test, Thode [

The Iris data set is a well-known multivariate data set collected to measure the morphologic variation of Iris flowers of three related species. The data set consists of 50 samples from each of three species of Iris: setosa, virginica and versicolor. For each sample, four variables were measured: the length and the width of the sepals and of the petals, in centimeters. Fisher [

The above Iris data example makes it clear that testing for X and testing for U(X) can have dramatically different powers for the mvShapiro.Test and royston.test. We therefore conducted further simulations for a wide variety of alternatives. Indeed, neither of these two tests has robust power against rotational alternatives when the marginal distributions of X are independent. They seriously lack rotational robustness compared to the FA test. More specifically, the R package mvShapiroTest was used to evaluate the test statistics, the critical values, and powers of the test discussed in [

that exceeded the previously calculated critical values under 50,000 samples from each alternative. Similarly, the royston package in R was used to calculate critical values of the royston.test based on 500,000 samples from the standard bivariate normal distribution and the empirical power based on 50,000 samples from each alternative distribution. Using the same setup, the critical values and power of the FA test [

The simulated power is illustrated in

Columns marked "X" test normality of X = (X_{1}, X_{2}); columns marked "U(X)" test normality of U(X) = (X_{1} − X_{2}, X_{1} + X_{2}).

| Alternative distribution for X = (X_{1}, X_{2}) | FA, X | mvShapiro, X | Royston H, X | R6 | FA, U(X) | mvShapiro, U(X) | Royston H, U(X) |
|---|---|---|---|---|---|---|---|
| N(0,1)*N(0,1) | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
| Exponential*Exponential | 100 | 100 | 100 | 100 | 100 | 90 | 89 |
| χ²(2)*χ²(2) | 100 | 100 | 100 | 100 | 100 | 90 | 89 |
| χ²(5)*χ²(5) | 92 | 99 | 97 | 94 | 93 | 48 | 46 |
| χ²(10)*χ²(10) | 62 | 80 | 72 | 66 | 62 | 25 | 23 |
| Lognormal(0,0.5)*Lognormal(0,0.5) | 96 | 99 | 98 | 97 | 97 | 67 | 62 |
| Lognormal(0,0.25)*Lognormal(0,0.25) | 48 | 62 | 57 | 50 | 49 | 20 | 18 |
| Gamma(5,1)*Gamma(5,1) | 63 | 80 | 73 | 66 | 63 | 26 | 24 |
| t(2)*t(2) | 96 | 98 | 98 | 98 | 96 | 84 | 75 |
| t(5)*t(5) | 45 | 51 | 55 | 52 | 46 | 26 | 26 |
| Beta(1,1)*Beta(1,1) | 51 | 93 | 79 | 63 | 52 | 4 | 3 |
| Beta(1,2)*Beta(1,2) | 73 | 97 | 89 | 81 | 75 | 12 | 10 |
| Beta(2,2)*Beta(2,2) | 6 | 23 | 13 | 6 | 6 | 3 | 2 |
| Logistic(0,1)*Logistic(0,1) | 24 | 27 | 30 | 27 | 24 | 12 | 14 |
| Halfnormal*Halfnormal | 93 | 100 | 97 | 95 | 93 | 35 | 33 |
| Weibull(1)*Weibull(1) | 100 | 100 | 100 | 100 | 100 | 90 | 89 |
| Weibull(1.5)*Weibull(1.5) | 89 | 98 | 95 | 92 | 89 | 38 | 36 |
| Pearson II(0) | 26 | 49 | 34 | 35 | 26 | 49 | 33 |
| Pearson II(1) | 6 | 11 | 7 | 6 | 6 | 11 | 7 |
| Pearson VII(4) | 30 | 37 | 40 | 49 | 30 | 38 | 39 |
| Pearson VII(5) | 43 | 26 | 28 | 35 | 43 | 25 | 28 |
| N(0,1)*Exponential | 99 | 99 | 98 | 98 | 99 | 43 | 45 |
| N(0,1)*Beta(1,1) | 33 | 47 | 40 | 23 | 32 | 5 | 32 |
| N(0,1)*Beta(1,2) | 51 | 61 | 55 | 38 | 52 | 9 | 54 |
| N(0,1)*Halfnormal | 75 | 80 | 75 | 68 | 75 | 17 | 42 |
| N(0,1)*Gamma(5,1) | 41 | 47 | 43 | 34 | 41 | 14 | 3 |
| N(0,1)*t(2) | 81 | 82 | 84 | 81 | 81 | 51 | 7 |
| N(0,1)*t(5) | 28 | 30 | 33 | 30 | 29 | 14 | 9 |
| N(0,1)*Chisq(2) | 99 | 100 | 99 | 97 | 99 | 42 | 7 |
| N(0,1)*Chisq(5) | 73 | 79 | 74 | 60 | 74 | 21 | 1 |
| N(0,1)*Weibull(1) | 99 | 99 | 99 | 99 | 99 | 44 | 47 |
| N(0,1)*Weibull(1.5) | 68 | 74 | 69 | 62 | 68 | 18 | 47 |
| NMIX(.5, 2, 0, 0) | 10 | 4 | 3 | 7 | 11 | 19 | 14 |
| NMIX(.5, 2, 0, 0.9) | 66 | 53 | 56 | 67 | 67 | 62 | 61 |
| NMIX(.75, 2, .9, 0) | 86 | 85 | 69 | 84 | 86 | 69 | 71 |
| NMIX(.75, 2, 0, .9) | 51 | 6 | 15 | 47 | 53 | 65 | 63 |
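To make the transformation U(X) = (X_{1} − X_{2}, X_{1} + X_{2}) used in the table concrete, the following R sketch applies it to one simulated sample and compares statistics before and after. It is only an illustration under stated assumptions: the sample size n = 50, the seed, and the helper names `fa_stat` and `sw_margins` are hypothetical choices, not taken from the paper. `fa_stat` is a compact restatement of the FA statistic (the minimum Shapiro-Wilk W over data-driven projections), and `sw_margins` simply reports the univariate W of each coordinate.

```r
set.seed(42)
n <- 50                                  # assumed sample size, for illustration only

# One sample with independent exponential margins
X <- cbind(rexp(n), rexp(n))

# U(X) = (X1 - X2, X1 + X2), written as a matrix acting on each row of X
U  <- rbind(c(1, -1), c(1, 1))
UX <- X %*% t(U)

# Hypothetical helper: minimum Shapiro-Wilk W over the n projections of the
# data onto the directions S^{-1}(x_i - xbar). W is scale invariant, so using
# S instead of (n - 1) S leaves the statistic unchanged.
fa_stat <- function(X) {
  X <- as.matrix(X)
  centered   <- sweep(X, 2, colMeans(X))
  directions <- solve(cov(X), t(centered))   # p x n matrix of directions
  Y <- X %*% directions                      # n x n matrix of projected samples
  min(apply(Y, 2, function(y) shapiro.test(y)$statistic))
}

# Hypothetical helper: Shapiro-Wilk W of each coordinate separately
sw_margins <- function(M) apply(M, 2, function(y) shapiro.test(y)$statistic)

fa_x  <- fa_stat(X)
fa_ux <- fa_stat(UX)

print(c(FA_X = fa_x, FA_UX = fa_ux))   # agree up to floating-point error
print(sw_margins(X))                   # W of the original coordinates
print(sw_margins(UX))                  # W of the transformed coordinates
```

The FA statistic is unchanged by the invertible linear map U because the projection directions transform together with the data, whereas the coordinate-wise W statistics do change. This is in line with the near-identical FA columns for X and U(X) in the table.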

ternative χ^{2}(5)*χ^{2}(5) stands for a bivariate distribution with the two independent marginal distributions each being the χ^{2}(5) distribution. ^{2}(5)*χ^{2}(5) [^{2}(5)*χ^{2}(5) when testing

If we have prior information that

angle in the Royston test among six fixed angles

comparable to the FA test in the bivariate case. Of course, combining SWT-based tests with other non-SWT type tests (e.g. the kurtosis test [

The authors would like to thank Dr. Ming Zhou and Dr. Jiqiang Guo for help with R programming and for constructive conversations. Y.S.’s research is partially supported by the NYU NIEHS Center Grant P30 ES00260 and the NYU Cancer Center Support Grant 2P30 CA16087.

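# Shapiro-Wilk W statistic of a univariate sample
SW <- function(X) shapiro.test(X)$statistic

# FA statistic: smallest W over the n projections of the data onto the
# directions ((n - 1) S)^{-1} (X_i - mean), one direction per observation
FA <- function(X) {
  X <- as.matrix(X)
  n <- NROW(X)
  p <- NCOL(X)
  mu <- apply(X, 2, mean)
  nSinver <- solve((n - 1) * cov(X))
  Y <- X %*% t((X - matrix(rep(mu, n), ncol = p, byrow = TRUE)) %*% nSinver)
  return(min(apply(Y, 2, SW)))
}
## END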
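
The simulation design described in the text (critical values from samples drawn under the bivariate standard normal null, then power as the proportion of alternative samples whose statistic falls beyond the critical value) could be sketched for the FA statistic roughly as follows. This is a scaled-down illustration, not the authors' script: the sample size n = 50, the 5% level, the replicate counts (far below the 500,000 and 50,000 used in the paper), and the helper name `fa_stat` are all assumptions made here. Small values of the statistic are taken as evidence against normality, so the critical value is a lower quantile.

```r
set.seed(1)

# Hypothetical helper restating the FA statistic: minimum Shapiro-Wilk W
# over the projections onto S^{-1}(x_i - xbar)
fa_stat <- function(X) {
  X <- as.matrix(X)
  centered   <- sweep(X, 2, colMeans(X))
  directions <- solve(cov(X), t(centered))
  min(apply(X %*% directions, 2, function(y) shapiro.test(y)$statistic))
}

n      <- 50     # assumed sample size
alpha  <- 0.05   # assumed significance level
n_null <- 200    # null replicates (the paper reports 500,000)
n_alt  <- 100    # alternative replicates (the paper reports 50,000)

# Null distribution of the statistic under the bivariate standard normal
null_stats <- replicate(n_null, fa_stat(matrix(rnorm(2 * n), ncol = 2)))
crit <- quantile(null_stats, alpha, names = FALSE)

# Empirical power against the Exponential*Exponential alternative
alt_stats <- replicate(n_alt, fa_stat(matrix(rexp(2 * n), ncol = 2)))
power_exp <- mean(alt_stats < crit)

print(c(critical_value = crit, power = power_exp))
```

With so few replicates the critical value is noisy; the replicate counts would need to be raised toward the paper's values before comparing the output with the table.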