
1. Introduction
Tests of covariance matrices in multivariate statistical analysis have wide applications in many fields of research and practice, such as target detection [1], face recognition [2] and so on. They have attracted considerable interest since the 1940s. However, most existing research on this topic focuses on testing the covariance matrix rather than the cross-covariance matrix. In some circumstances, not all entries of the covariance matrix are of concern, so testing whether a cross-covariance matrix equals a specified one becomes an important issue. For instance, a test of time-reversibility (see Section 4 for more details) can be transformed into a cross-covariance matrix test. Therefore, like the covariance matrix test, it is of great practical interest to develop methods for the cross-covariance matrix test.
Over the past several years, many types of statistics have been proposed to test various equalities of covariance matrices. The first type comprises statistics based on the likelihood ratio (LR). Mauchly [3] made one of the earliest attempts with a likelihood-ratio approach; his statistic depends on the determinant and the trace of the sample covariance matrix, and requires that the sample covariance matrix be non-singular, which holds with probability one when the sample size exceeds the dimension. Gupta and Xu [4] generalized the likelihood ratio test to non-normal distributions by deriving the asymptotic expansion of the test statistic under the null hypothesis when the sample size is moderate. Later, Jiang et al. [5] proved, with the aid of Selberg integrals, that the likelihood ratio test statistic has an asymptotic normal distribution under two different assumptions. Statistics of this first type can also be extended to high-dimensional data. For instance, Bai et al. [6] used central limit theorems for linear spectral statistics of sample covariance matrices and of random F-matrices, and proposed a modification of the likelihood ratio test to cope with high-dimensional effects. Subsequently, Niu et al. [7] considered testing the mean vector and covariance matrix simultaneously with high-dimensional non-Gaussian data; they applied the central limit theorem for linear spectral statistics of sample covariance matrices and established a new modification of the likelihood ratio test. The second type comprises statistics based on empirical distance. Let
be a p-dimensional random sample drawn from a normal distribution with mean vector
and covariance matrix
. Nagao [8] proposed a test statistic
(1)
to test the null hypothesis
versus the alternative
, where
with
,
and I is the identity matrix. Thus the null hypothesis should be rejected when the observed value of
exceeds the critical value determined by a pre-assigned significance level. The third type of statistic is based on the largest eigenvalue of the covariance matrix and random matrix theory. For instance, Cai et al. [9] studied the limiting laws of the coherence of an
random matrix in the high-dimensional setting where p can be much larger than n; they then considered testing the structure of the covariance matrix of a high-dimensional Gaussian distribution, where the random matrix plays a crucial role in the construction of the test. The last type is a statistic based on the examination of a fixed column of the sample covariance matrix. Gupta and Bodnar [10] proposed an exact test on the structure of the covariance matrix; their statistic examines a fixed column of the sample covariance matrix and can be applied even when the sample size is much smaller than the dimension.
The statistics mentioned above for the covariance matrix test are applicable when the dispersion matrix has a Wishart distribution or the distribution of the test statistic is derivable, so that the asymptotic properties of these statistics can be obtained. In many circumstances, however, the asymptotic distribution of the test statistic is complicated in the absence of strict normality or when the Wishart distribution is unavailable. In this paper, we provide a new method for testing the cross-covariance matrix rather than the covariance matrix, which can be more efficient in problems where the variances are not of concern. Moreover, the proposed test does not rely on the Wishart distribution and can be implemented by a parametric bootstrap scheme.
The proposed statistic is based on the Frobenius norm of the difference between the sample cross-covariance matrix and the given matrix. In principle, it can detect any deviation of the cross-covariance matrix from a pre-assigned one. Several numerical examples show that it is more powerful than some competing methods in detecting a cross-covariance matrix deviating from the pre-assigned matrix.
Recently, tests of time-reversibility (TR) have drawn much attention because time-reversibility is a necessary condition for an independent and identically distributed (i.i.d.) sequence. As is known, i.i.d. sequences and stationary Gaussian models are time-reversible. In contrast, a linear non-Gaussian process is time-irreversible, except when its coefficients satisfy particular constraints [11]. Several TR tests have been proposed as specification checks in model construction [12] [13] [14] [15]. In this paper, the TR test is based on the copula spectral density kernel (CSDK) proposed by Dette et al. [16], which is more informative than the traditional spectral density, since the CSDK captures serial dependence beyond that described by covariances. The CSDK
(defined in (17)) is indexed by couple
of quantile levels, where
and
, with
being the one-dimensional marginal cumulative distribution function of a strictly stationary univariate process
. Obviously, the time series is pairwise time reversible if and only if
for all
and all
, where
is the imaginary part of a complex number a. Thus the imaginary part of CSDK is equal to zero if
is time-reversible. Thus we can transform the problem of testing pairwise time-reversibility into that of testing whether the imaginary part of the CSDK is zero. By Theorem 3.3 of Dette et al. [16], we derive a covariance matrix
(defined in (24)), and we find that time-reversibility implies that the cross-covariance matrix in
is equal to a zero matrix. Theoretically, we can therefore transform the problem of testing pairwise time-reversibility into that of testing the specification of a cross-covariance matrix.
Throughout the paper, we denote by
equality in distribution, and define
and
as the real part and imaginary part of a complex number a, respectively. For matrix notation,
and
denote the
identity matrix and
zero matrix, respectively;
and
represent the determinant and trace of the matrix M, respectively;
indicates the Frobenius norm of M.
and
denote a chi-square distribution and a Student t distribution with q degrees of freedom, respectively.
The rest of the paper is organized as follows. Section 2 presents the test statistic, together with the bootstrap scheme for computing the p-value of the cross-covariance matrix test. Section 3 reports empirical results examining the performance of the proposed test on simulated data. Section 4 illustrates an application to detecting deviations from time-reversibility of a time series. Section 5 contains our conclusions.
2. Test Statistic and Its Distributional Approximation
2.1. Test Statistic
Let
be an independent sample from
, a multivariate normal distribution with mean vector
, covariance matrix
, where
is expressed by
, and
is a block matrix given by
(2)
with
being the cross-covariance matrix. In this section, we consider the problem of testing
The test statistic is constructed based on the Frobenius norm of the difference between the sample cross-covariance matrix and the given matrix. In the derivation, no assumption on p, like
or
, is required. Since the Wishart distribution is not available here, we implement the derivation with the aid of a parametric bootstrap scheme. We define the test statistic
(3)
where
(4)
with
and
. In (3),
, which can detect any deviation
of cross-covariance from the pre-specified matrix
.
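As an illustration, a statistic of this Frobenius-norm form can be computed as follows. This Python sketch assumes T is the squared Frobenius norm of the difference between the sample cross-covariance matrix and the hypothesized matrix; the exact scaling used in the paper is given in (3) and (4).

```python
import numpy as np

def cross_cov_stat(X, Y, Sigma12_0):
    """Sketch of a Frobenius-norm cross-covariance statistic.

    X: (n, p1) sample, Y: (n, p2) sample (paired rows).
    Sigma12_0: hypothesized (p1, p2) cross-covariance matrix.
    The exact scaling in the paper's Eq. (3) is assumed here to be
    the plain squared Frobenius norm.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    S12 = Xc.T @ Yc / (n - 1)               # sample cross-covariance
    return np.sum((S12 - Sigma12_0) ** 2)   # squared Frobenius norm
```

By construction the statistic is zero exactly when the sample cross-covariance coincides with the hypothesized matrix, and it grows with any entrywise deviation.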
2.2. Bootstrap Approximation of the Null Distribution
Let
be an independent sample drawn from
.
and
are the estimators of the parameters
and
. Suppose that the pseudo data set
is resampled from
, where
(5)
and
is expressed by
. The bootstrap statistic is defined as
(6)
where
(7)
with
and
.
To study the bootstrap approximation of the null distribution, we need the following conditions:
Condition (A1) Let
be the cumulative distribution function (cdf) of the bootstrap statistic
,
be the empirical distribution function of the bootstrap sample
. Let
(8)
We assume that the quantity in (8) vanishes as t tends to infinity.
Condition (A2) As M tends to infinity,
converges weakly to
in distribution, provided that
a.s., where F denotes the cdf of the statistic T and B is the Brownian bridge on [0,1].
Theorem 1. Suppose F is nondegenerate. Suppose also that
is the nominal size of the test. Under conditions (A1) and (A2), if
satisfies
(9)
then
(10)
The result (10) follows almost immediately from Corollary 4.2 of Bickel et al. [17]. By Lemma 8.11 of Bickel et al. [17], we obtain that
has a continuous distribution, and
converges to the
-quantile of the law of
.
2.3. Algorithm for Calculating Test p-Value
To carry out the parametric bootstrap procedure for the proposed test, the bootstrap p-value is approximated by the following steps.
Step 1. Calculate an observation
of statistic T.
Step 2. Estimate the covariance matrix
and
by sample covariance matrices, say
and
.
Step 3. Resample from
and calculate the value of bootstrap statistic
, where
is defined in (5).
Step 4. Repeat Step 3 M times, and compute the p-value as p-value =
, where
denotes an indicator on the set A, which equals 1
when A occurs, and 0 otherwise.
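Steps 1-4 can be sketched as follows. This is a hedged Python illustration: the statistic is assumed to be the squared Frobenius norm of the difference between the sample cross-covariance and the hypothesized matrix, and the null covariance is formed by plugging the hypothesized cross-covariance block into the estimated covariance matrix, in the spirit of (5).

```python
import numpy as np

def bootstrap_pvalue(X, Y, Sigma12_0, M=500, rng=None):
    """Parametric-bootstrap p-value for H0: Sigma12 = Sigma12_0 (Steps 1-4).

    Assumptions of this sketch: squared-Frobenius-norm statistic, and a
    null covariance built by imposing Sigma12_0 on the cross blocks of
    the estimated covariance matrix.
    """
    rng = np.random.default_rng(rng)
    n, p1 = X.shape

    def stat(U, V):
        Uc, Vc = U - U.mean(axis=0), V - V.mean(axis=0)
        S12 = Uc.T @ Vc / (len(U) - 1)
        return np.sum((S12 - Sigma12_0) ** 2)

    t_obs = stat(X, Y)                       # Step 1: observed statistic

    Z = np.hstack([X, Y])
    S = np.cov(Z, rowvar=False)              # Step 2: estimate covariance
    S_null = S.copy()                        # impose H0 on the cross blocks
    S_null[:p1, p1:] = Sigma12_0
    S_null[p1:, :p1] = Sigma12_0.T
    mu = Z.mean(axis=0)

    count = 0
    for _ in range(M):                       # Steps 3-4: resample M times
        Zb = rng.multivariate_normal(mu, S_null, size=n)
        if stat(Zb[:, :p1], Zb[:, p1:]) >= t_obs:
            count += 1
    return count / M
```

The p-value is the proportion of bootstrap statistics at least as large as the observed one, matching the indicator average in Step 4.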
3. A Simulation Study
3.1. Comparison Study
We briefly describe the tests compared in the current paper: the modified LR test [18] and the test of Nagao [8]. These two tests are used to test
Let
be an independent sample from
, the modified LR test statistic is based on
(11)
where
,
,
, and
. As is known,
, where the
stands for the p-dimensional Wishart distribution with
degrees of freedom and covariance matrix
. Anderson [19] derived the limiting distribution of the modified LR test statistic
with the help of the Wishart distribution: when
,
is
asymptotically distributed as
, where
denotes the
natural logarithm. Then, Nagao [8] proposed a test statistic
(defined in (1)) which can be regarded as a measure of departure from the null hypothesis.
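For reference, a Nagao-type distance statistic is classically proportional to tr[(S Σ₀⁻¹ − I)²]. The following Python sketch assumes the constant n/2; the exact form used in this paper is given in (1).

```python
import numpy as np

def nagao_stat(X, Sigma0):
    """Sketch of a Nagao-type distance statistic for H0: Sigma = Sigma0.

    The classical form is proportional to tr[(S @ inv(Sigma0) - I)^2];
    the constant n/2 is an assumption of this sketch.
    """
    n, p = X.shape
    S = np.cov(X, rowvar=False)                 # sample covariance
    D = S @ np.linalg.inv(Sigma0) - np.eye(p)   # departure from H0
    return 0.5 * n * np.trace(D @ D)
```

The statistic vanishes when the sample covariance equals the hypothesized matrix, so large values measure departure from the null hypothesis.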
In what follows, we propose the statistic T (3) to test
(12)
So far, no test methods are available for this problem. Thus, we choose statistics
and
to test
(13)
for the comparative study, where the cross-covariance matrix in
is equal to
. Testing the structure of the covariance matrix can also detect the deviation of cross-covariance from the pre-specified matrix. For a pre-specified level of significance
, the null hypothesis in (13) is rejected if
(14)
or
(15)
where
denotes the
quantile of the empirical distribution of statistic
.
We employ simulation data to evaluate the performance of the proposed statistic T, statistics
and
when applied to test the hypotheses (12) and (13) at a significance level of
. Empirical sizes and powers of the proposed test are computed based on
bootstrap resamples and 500 repetitions, and those of tests (14) and (15) are based on 500 Monte Carlo replications. In the simulation study, we choose the mean vector
, dimension
, and take sample size
. The results are shown in Table 1.
Results in Table 1 show that the empirical type I error rates of the three statistics are all very close to the pre-specified nominal value. Moreover, the proposed test has higher empirical power than its counterparts, tests (14) and (15). With increasing sample size, the change in the empirical power of test (15) is barely noticeable, while the performance of the proposed test improves significantly; this means that statistic
is not sensitive to changes in the cross elements of the covariance matrix, while statistic T can detect any deviation of the cross-covariance from the pre-specified matrix
. Thus, when the variances in the covariance matrix are not of concern, we recommend applying statistic T to test the equality hypothesis about a cross-covariance matrix.
3.2. Bootstrap Asymptotic Study
In this section, we employ simulated data to investigate whether the performance of the proposed test is sensitive to the diagonal block matrices. For this purpose, we consider two choices of
: one is
and the other is
We propose statistic T to test
(16)
For the two covariance matrices mentioned above, we run a simulation with
bootstrap resamples and 500 repetitions to obtain the empirical sizes and powers of the proposed test at significance level
, where we take sample size
and choose mean vector
. The results are shown in Table 2 and Table 3.
For each sample size
and each nominal size
, Table 2 shows the empirical rejection probabilities of the proposed test. We present simulation results of
and
in the first panel
Table 1. Rejection probabilities of the proposed test, tests (14) and (15) from simulated data.
Table 2. Probability of committing the type I error of the proposed test in testing (16) for two different
.
Table 3. Empirical rejection probabilities of the proposed test in testing (16) for two different
.
and the second panel, respectively. We see that the empirical type I error rates in the two cases are close to their nominal sizes. For the nominal size
, Table 3 shows how the empirical rejection probability of the proposed test changes with respect to five different sample sizes
. Although the diagonal block matrices differ, the empirical powers improve significantly with increasing sample size n. Last but not least, when
, the empirical powers in both cases reach their maximum. From the simulation experiments in Table 2 and Table 3, we find that the proposed test is not sensitive to the diagonal block matrices. This is because the proposed statistic T depends only on the sample cross-covariance matrix. Thus we can conclude that the proposed test still achieves good performance even when the variance components of the covariance matrix change.
4. An Empirical Application: Testing for Pairwise Time-Reversibility
4.1. Time Reversible Time Series and Prior Specification
A formal statistical definition of pairwise time-reversibility is as follows.
Definition 1. A time series
is pairwise time-reversible if, for all positive integers k, the random vectors
and
have the same joint probability distributions.
Under this definition, one can show that pairwise time-reversibility implies stationarity; equivalently, nonstationarity implies time-irreversibility [14]. Clearly,
is time reversible when
is i.i.d. Thus, for the study of testing time-reversibility, the pairwise case is generally considered. For instance, Ramsey et al. [12] proposed a pairwise TR test statistic that consists of a sample estimate of the symmetric-bicovariance function, given by the difference between two bicovariances of
. Later, Chen et al. [11] considered pairwise time-reversibility and proposed a new test based on the symmetry of the distribution of
rather than on moments. More recently, Dette et al. [16] briefly analyzed the pairwise time-reversibility of four different time series models with the aid of quantile-based spectral analysis.
In this section, we primarily focus on testing for pairwise time-reversibility and the test method is based on copula spectral density kernel (CSDK) proposed by Dette et al. [16] . Let
be a strictly stationary univariate process; the CSDK is defined as
(17)
where
is the
-quantile of the marginal distribution of the process
, i.e.
. The
(copula cross-covariance kernel) of lag
was also introduced by Dette et al. [16] and is defined as
(18)
where
, and F denotes the one-dimensional marginal cumulative distribution function of the process
. Compared with traditional covariances, the concept of the copula cross-covariance kernel is appropriate for describing serial copulas.
The collection of CSDKs for different
provides a full characterization of the copulas associated with the pairs
, and accounts for many important dynamic features of
, such as changes in the conditional shape (skewness, kurtosis), time-irreversibility, or dependence in the extremes, which the traditional second-order spectra cannot capture [20].
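As an illustration of (18), the copula cross-covariance kernel can be estimated from data by replacing F with the empirical distribution of the ranks. The following Python sketch is an illustrative assumption, not the paper's estimator (the paper works with the smoothed rank-based Laplace periodogram).

```python
import numpy as np

def copula_cross_cov(x, k, tau1, tau2):
    """Rank-based estimate of the copula cross-covariance kernel of (18),
    gamma_k(tau1, tau2) = Cov(1{F(X_t) <= tau1}, 1{F(X_{t-k}) <= tau2}).

    A minimal sketch: F is replaced by the empirical cdf via normalized ranks.
    """
    n = len(x)
    u = np.argsort(np.argsort(x)) / (n - 1)   # normalized ranks, approx F(X_t)
    a = (u[k:] <= tau1).astype(float)         # indicator at time t
    b = (u[:n - k] <= tau2).astype(float)     # indicator at time t - k
    return np.mean(a * b) - np.mean(a) * np.mean(b)
```

For a pairwise time-reversible series, this kernel is symmetric in its quantile arguments, i.e. gamma_k(tau1, tau2) = gamma_k(tau2, tau1) for all lags k, which is the empirical counterpart of the vanishing imaginary part of the CSDK.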
4.2. Test for Time-Reversibility
In the sequel, we will concentrate on testing for pairwise time-reversibility. Obviously, we have
for all
if and only if
for all
and all
. For the purpose of pairwise time-reversibility test, we consider the problem of testing
(19)
for all
. The method introduced by Dette et al. [16] for estimating the
is first to calculate the rank-based Laplace periodogram (RLP) and then to smooth it to obtain a consistent estimator. Let
be the observation from a strictly stationary univariate process
. Like Dette et al. [16] , we define
by
(20)
where
(21)
is the so-called check function [21] ,
,
. We extend the definition of
to a piecewise constant function on
as follows:
(22)
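The check function in (21) can be sketched in a few lines of Python; the demonstration that minimizing its average loss recovers the empirical quantile is an added illustration, not part of the paper.

```python
import numpy as np

def check_function(x, tau):
    """Koenker-Bassett check function rho_tau(x) = x * (tau - 1{x < 0}),
    the loss minimized in quantile regression (cf. Eq. (21))."""
    x = np.asarray(x, dtype=float)
    return x * (tau - (x < 0))
```

The defining property is that the constant c minimizing the average of check_function(z - c, tau) over a sample z is the empirical tau-quantile of z, which is why this loss underlies the quantile-regression-based Laplace periodogram.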
Let
. We denote
, where
and
. Then, under standard mixing conditions (Theorem 3.3 of Dette et al. [16] ), for
,
converges to a zero-mean real Gaussian distribution with covariance matrix
(23)
Not all the entries in
are of concern when testing the hypotheses (19), so we consider the block covariance matrix
,
(24)
Thus, for the test of the hypotheses (19), we can transform the problem into one of testing whether the cross-covariance matrix in (24) is a one-dimensional zero matrix. Here, we define the random vectors X and Y as
where the smoothed rank-based Laplace periodogram
denotes a consistent estimator of CSDK, and
(25)
Let
be an independent sample, where
is expressed by
. The test p-value is approximated by the bootstrap scheme described previously.
4.3. An Illustration Example
Because the accurate values of the CSDKs are unavailable, it is difficult to evaluate our methodology on general models by simulation [20]. We consider two AR(1) models (models 1 and 2) of the form
(26)
since their CSDKs can be computed numerically. In model 1,
are independent
-distributed random variables, while in model 2 they are independent Student t-distributed with 1 degree of freedom. For model 1, the imaginary component of the CSDK vanishes, which reflects that the process is time-reversible; for model 2, there exists a time-irreversible impact of extreme values on the central ones [16].
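The two models can be simulated directly. In the sketch below, the AR coefficient `phi = 0.5` is an assumed value; the paper's coefficient appears in (26).

```python
import numpy as np

def simulate_ar1(n, phi=0.5, innovations="gaussian", rng=None):
    """Simulate an AR(1) model X_t = phi * X_{t-1} + eps_t as in Eq. (26).

    innovations="gaussian" gives model 1 (N(0,1) errors, time-reversible);
    innovations="t1" gives model 2 (Student t with 1 df, i.e. heavy tails).
    phi = 0.5 is an assumed coefficient for illustration.
    """
    rng = np.random.default_rng(rng)
    if innovations == "gaussian":
        eps = rng.normal(size=n)
    else:
        eps = rng.standard_t(df=1, size=n)
    x = np.empty(n)
    x[0] = eps[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x
```

Applying a rank-based or quantile-based analysis to series generated this way reproduces the qualitative contrast discussed below: symmetric behavior across quantile pairs for Gaussian innovations, and tail-versus-center asymmetry for the t(1) innovations.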
Recently, Dette et al. [16] and Kley et al. [22] proposed estimating the CSDKs by smoothing the RLP, which can be defined via quantile regression (QR) or a clipped time series (CT). The finite-sample performance of the smoothed RLP can be examined using the R package quantspec [22], which makes the smoothed (QR- or CT-based) RLP a good reference for calculating the test p-value. We take the smoothed QR-based RLP to perform the pairwise TR test. For each generated pseudo-random time series, we computed the smoothed QR-based RLP using the Epanechnikov kernel and bandwidth
. For each generated dataset and each pair of
, the test p-value is computed by the previous bootstrap procedure.
For each of the two models, we generated datasets, each containing pseudo-random time series of lengths
and
respectively. We set
. Each boxplot is based on 50 realizations of the realized p-values on a log scale. For each pair
, the boxplots of the two models are presented on the left, middle and right, respectively. For each boxplot, the median, the extreme points, and the box formed by the lower and upper quartiles are shown in Figure 1 and Figure 2.
Next we discuss the simulation results for the AR(1) process. We find from Figure 1 that the lower quartiles of the boxplots of both models are greater than
at quantile pair
, which means the null hypothesis in (19) cannot be rejected, i.e.
is approximately equal to
Figure 1. Boxplots of the estimated
-value for different
, and
.
Figure 2. Boxplots of the estimated
-value for different
, and
.
; this means that an AR process with Gaussian innovations or
distributed innovations is time-reversible. Also, when
or
, these observations also reflect the fact that AR process with Gaussian innovations are time-reversible. However, for
distributed innovations, this phenomenon only takes place for the extreme quantiles (
), and does not hold for
and
or
. One important reason is that there exists a marked discrepancy between the tail and central dependence structures when the innovations
of the AR(1) process are non-Gaussian. Figure 2 also shows that the AR process with Gaussian innovations is time-reversible, while the case of
distributed innovations is time-reversible only for extreme quantiles. The above results indicate that the imaginary parts of the CSDK are nonzero in the latter case, suggesting time-irreversibility.
5. Conclusions
In this paper, we proposed a new statistic (3) for testing the specification of the cross-covariance matrix. The test statistic is based on the Frobenius norm of the difference between the sample cross-covariance matrix and the given matrix. The asymptotic properties of the test statistic were obtained with the help of a bootstrap scheme. By computing the empirical size and power of the proposed test, the rationality of the test statistic was confirmed. The advantage of the proposed statistic is twofold. First, the comparative study showed that its empirical power is clearly superior to the others in detecting any deviation of the cross-covariance from the pre-assigned matrix. Second, there is no need for a complex derivation of the distribution of statistic T; the performance of the test can be assessed with only a few simulation studies.
However, one challenge is to determine whether the test performs well when the data are high-dimensional; this will be our future work.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (grant number: 11671416).