General Theory of Antithetic Time Series

A generalized antithetic time series theory for exponentially derived antithetic random variables is developed. The correlation function between a generalized gamma distributed random variable and its pth exponent is derived. We prove that the correlation approaches minus one as the exponent approaches zero from the left and the shape parameter approaches infinity.


Introduction
Serially correlated random variables arise in ways that both benefit and bias mathematical models for science, engineering and economics. In one widespread example, a mathematical formula is used to create uniformly distributed pseudo random numbers for use in Monte Carlo simulation. The numbers are serially correlated because they are generated by a formula. The benefit is that the same pseudo random numbers can be recreated at will, and two or more simulation experiments can be compared without regard to the pseudo random numbers. The correlation is designed to be very small so as not to bias the results of a simulation experiment. Still, some bias is unavoidable when using serially correlated numbers (Ferrenberg, Landau, and Wong [1]).
Another widespread example is a regression model in which the dependent variable is serially correlated. The result is biased model parameter estimates because the independence assumption of the Gauss-Markov theorem is violated (see Griliches [2], Nerlove [3], Koyck [4], Klein [5]). Similarly, the independence assumption of Fuller and Hasza [6] and Dufour [7] would not apply. The absence of any relevant information from a model will express itself in the patterns of the error term. If complete avoidance of bias requires normally distributed data, then the absence of normality is like missing information. Bias may also be due to missing data points (Chandan and Jones [8], Li, Nychka and Amman [9]). Assume that a perfect model is postulated for a given application in which the population to which the data belong is known exactly. The model must be fitted to a sample of data, not the population. However, once the sample is taken, the distribution is automatically truncated and distorted, and the fitted model is biased. Regardless of the method of fitting, some sampling bias, however small, is unavoidable. One approach aimed at improving model performance is to combine the results from different models. For an extensive discussion and review of traditional combining see Bunn [10], Diebold [11], Clemen [12], Makridakis et al. [13], and Winkler [14].
Economics researchers have commented on serial correlation bias. Hendry and Mizon [15] and Hendry [16] considered common factor analysis (Mizon [17]) and suggested that serial correlation is a feature for representing dynamic relationships in economic models. This in turn implies that economics allows for serial correlation (see Pindyck and Rubinfeld [18]). Time domain methods for detecting the nature and presence of serial correlation were considered by Durbin and Watson [19] and Durbin [20]. Spectral methods were considered by Hendry [16], Osborn [21] and Espasa [22]. Even if serial correlation can be a tool for studying the nature of economics, it is detrimental to long range forecasting models. Whatever the source of bias may be, the only possibility for long range forecasting is to completely eliminate the bias.

Background
Inversely correlated random numbers were suggested by Hammersley and Morton [23] for use in Monte Carlo computer simulation experiments. In that application, a single computer simulation is replaced by two simulations. One simulation uses random numbers $r$ uniformly distributed on $(0, 1)$. The other simulation uses $1 - r$.
The expectation is that the average of the results of these two simulations has a smaller variance than either one alone. In practice, the variance sometimes decreases, but sometimes it increases. See also Kleijnen [24]. The theory of combining antithetic lognormally distributed random variables that contain negatively correlated components was introduced by Ridley [25]. The Ridley [25] antithetic time series theorem establishes the corresponding limiting correlation of minus one for the special case of lognormally distributed $X_t$, $t = 1, 2, 3, \ldots$ (see also the Ridley [25] antithetic fitted function theorem and antithetic fitted error variance function theorem). Similarly, antithetic forecasts obtained from a time series model can be combined so as to eliminate bias in the forecast error. Ridley [26] applied combined antithetic forecasting to a wide range of data distributions. Ridley [27] demonstrated the methodology for optimizing weights for combining antithetic forecasts. See also Ridley and Ngnepieba [28] and Ridley, Ngnepieba, and Duke [29]. The antithetic variables proof in Ridley [25] was for the special case of $X_t$ lognormally distributed.
The implication for using a biased mathematical model to investigate economic, engineering and scientific phenomena is that estimates obtained from the model are biased. Estimates of future values extrapolated from the model are also biased. As the forecast horizon increases, the bias accumulates and the extrapolations diverge from the actual values. This is most pronounced in the case of investigations into global warming phenomena. There, the horizon is by definition very far into the future. The smallest bias will accumulate, so much so that conclusions may be as much an artifact of the mathematical model as they are about climate dynamics. Combining antithetic extrapolations can dynamically remove the bias in the extrapolated values.
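The Hammersley and Morton [23] construction described at the start of this section, in which one simulation uses $r$ and the other uses $1 - r$, can be illustrated with a minimal Python sketch. The function name `pair_average_variance`, the two test integrands, the sample size and the seed are all illustrative assumptions and do not come from the cited papers. The sketch compares the variance of the two-run average when the second run uses $1 - r$ against the variance when the second run uses an independent draw.

```python
import math
import random
import statistics


def pair_average_variance(f, n_pairs=20000, antithetic=True, seed=1):
    """Variance of the average of two simulation results f(r) and f(partner).

    With antithetic=True the partner is 1 - r; otherwise it is an
    independent uniform draw. All names here are hypothetical.
    """
    rng = random.Random(seed)
    averages = []
    for _ in range(n_pairs):
        r = rng.random()
        partner = 1.0 - r if antithetic else rng.random()
        averages.append(0.5 * (f(r) + f(partner)))
    return statistics.variance(averages)


def monotone(u):
    # Monotone integrand: f(r) and f(1 - r) are negatively correlated.
    return math.exp(u)


def symmetric(u):
    # Symmetric about 0.5: f(r) equals f(1 - r), so pairing cannot help.
    return (u - 0.5) ** 2


if __name__ == "__main__":
    for name, f in [("monotone", monotone), ("symmetric", symmetric)]:
        v_anti = pair_average_variance(f, antithetic=True)
        v_indep = pair_average_variance(f, antithetic=False)
        print(name, "antithetic:", v_anti, "independent:", v_indep)
```

For the monotone integrand the antithetic pairing lowers the variance of the average, while for the symmetric integrand it raises it, which is consistent with the remark that the variance sometimes decreases and sometimes increases.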

Proposed Research
The antithetic gamma variables discussed in this research are defined as follows. Definition 1. Two random variables are antithetic if their correlation is negative. A bivariate collection of random variables is asymptotically antithetic if its limiting correlation approaches minus one asymptotically (see antithetic gamma variables theorem below).
$\{X_t(\xi)\}$ is an ensemble of random variables, where $\xi$ belongs to a sample space and $t$ belongs to an index set representing time, such that $X_t$ is a discrete realization of a gamma stationary stochastic process from the ensemble, and $X_t$, $t = 1, 2, 3, \ldots$, are serially correlated. In this paper, we extend the discovery by Ridley [25] beyond the lognormal distribution. The gamma distribution is very important for technical reasons, since it is the parent of the exponential distribution and can explain many other distributions. That is, a wide range of distributions can be represented by the gamma distribution. We will explore these possibilities by examining the correlation between $X$ and $X^p$ when $X$ is gamma distributed. Of particular interest is the correlation between $X$ and $X^p$ as $p \to 0^-$. A graph of the correlation between $X$ and $X^p$ as $p \to 0$ is shown in Figure 1.
We begin by reviewing the obvious results for the case when $p$ is positive. The correlation is positive when $p$ is positive and exactly one when $p$ is one. This is expected. As $p$ moves away from one, the correlation decreases. As $p$ approaches zero from the right, the correlation falls, albeit very slowly. This is also expected. In the case when $p$ is negative, the correlation behaves quite differently. The result is entirely counterintuitive. As expected, the correlation is negative. But, unlike when $p$ is positive, as $p$ approaches zero from the left, the absolute value of the correlation increases. Furthermore, the actual correlation approaches minus one, not zero.
One purpose of this paper is to derive an analytical function for the correlation between $X$ and $X^p$ when $X$ is gamma distributed. A second purpose is to explore, by extensive computation, the behavior of the correlation as $p$ approaches zero from the left. The trivial case of $p$ equal to zero, where the correlation is zero, is of no interest. We are interested in $p$ inside a delta neighborhood of zero, not zero itself. Finally, we prove that the limiting value of the correlation is minus one.
The paper is organized as follows. In Section 2 we review the gamma distribution. In Section 3 we derive the analytic function for the correlation. In Section 4 we prove its limiting value. In Section 5 we use MATLAB [30] to compute correlations for a wide range of values generated from the gamma distribution. In Section 6 we outline the method for using antithetic variables to dynamically remove bias from the fitted and forecast values obtained from a time series model. Examples include computer simulated data. Section 7 contains conclusions and suggestions for further research.

The Gamma Distribution
The gamma distribution is very important for technical reasons, since it is the parent of the exponential distribution and can explain many other distributions. Its probability density function (pdf) (see Hogg and Ledolter [31]) is $f(x) = \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}}\, x^{\alpha - 1} e^{-x/\beta}$, $x > 0$, where $\alpha > 0$ is the shape parameter and $\beta > 0$ is the scale parameter.

Correlation between X and X^p
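Let $X_t$, $t = 1, 2, 3, \ldots$, be a discrete realization of a generalized gamma stochastic process. For $p \in \mathbb{R}$, from Appendix A, the pth moment of the gamma distribution is given by $E[X^p] = \beta^p\,\Gamma(\alpha + p)/\Gamma(\alpha)$. Let $\rho$ be the correlation between $X_t$ and $X_t^p$. Then $\rho = \operatorname{Cov}(X_t, X_t^p) / \sqrt{\operatorname{Var}(X_t)\,\operatorname{Var}(X_t^p)}$.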
Therefore, using Equations (3) and (6), Equation (5) becomes a closed-form expression in the gamma functions $\Gamma(\alpha)$, $\Gamma(\alpha + p)$ and $\Gamma(\alpha + 2p)$. The expression can result in a complex number when an argument of the gamma function is negative. This is avoided if $2|p| < \alpha$. In any case, since we are only interested in $p$ approaching zero from the left, this condition will always be satisfied when $\alpha > 0$.
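The steps leading to the closed form can be sketched from the standard gamma moment formula. This is a reconstruction rather than the original typeset derivation: it assumes $\beta$ is a scale parameter and that $\alpha + 2p > 0$, and it carries no equation numbers because the original numbering cannot be recovered from the text.

```latex
\begin{align*}
E[X^p] &= \beta^{p}\,\frac{\Gamma(\alpha + p)}{\Gamma(\alpha)},
  \qquad \alpha + p > 0, \\
\operatorname{Var}(X) &= \alpha\beta^{2}, \\
\operatorname{Var}(X^p) &= E[X^{2p}] - \left(E[X^p]\right)^{2}
  = \beta^{2p}\,
    \frac{\Gamma(\alpha)\,\Gamma(\alpha + 2p) - \Gamma(\alpha + p)^{2}}
         {\Gamma(\alpha)^{2}}, \\
\operatorname{Cov}(X, X^p) &= E[X^{p+1}] - E[X]\,E[X^p]
  = \beta^{p+1}\,\frac{(\alpha + p)\,\Gamma(\alpha + p) - \alpha\,\Gamma(\alpha + p)}{\Gamma(\alpha)}
  = \beta^{p+1}\,\frac{p\,\Gamma(\alpha + p)}{\Gamma(\alpha)}, \\
\rho &= \frac{\operatorname{Cov}(X, X^p)}
             {\sqrt{\operatorname{Var}(X)\,\operatorname{Var}(X^p)}}
  = \frac{p\,\Gamma(\alpha + p)}
         {\sqrt{\alpha\left[\Gamma(\alpha)\,\Gamma(\alpha + 2p)
            - \Gamma(\alpha + p)^{2}\right]}}.
\end{align*}
```

Under these assumptions $\beta$ cancels, so the correlation depends only on $\alpha$ and $p$. A second-order expansion of $\ln\Gamma$ about $\alpha$ suggests that as $p \to 0^-$ the expression tends to $-1/\sqrt{\alpha\,\psi'(\alpha)}$, where $\psi'$ is the trigamma function. Since $\alpha\,\psi'(\alpha) \to 1$ as $\alpha \to \infty$, this agrees with the limiting value of minus one stated in the paper.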

Correlation versus p
The effect of $p$ on the correlation is demonstrated by calculating the correlation coefficient from Equation (8) for various values of $\alpha$ and $\beta$. The correlation coefficients are listed in Table 1 and plotted in Figure 3.
As $\alpha$ increases, the gamma distribution takes shapes that are more symmetrical. Also, as $\alpha$ increases the standard deviation increases. This is indicative of greater spread on both sides of the mode of $X$. Equation (9) expresses the correlation in terms of $\sigma$. In practice the value of $\alpha$ will be that for the actual data under study. It cannot be modified. Still, it appears that the effect of $p$ on reversing the correlation is greatest for symmetrical distributions.
To validate Equation (8), correlations were also computed from values generated from the gamma distribution. The results are shown in Table 2. The coefficients are almost identical to the theoretical values obtained from Equation (8) and listed in Table 1. In practice, the data may include relatively few observations. To investigate the small sample correlation coefficient, the correlation coefficient is calculated for $n = 100$.
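The comparison between theoretical and simulated correlations can be sketched in Python. This is not the paper's MATLAB code. It assumes the closed form $\rho = p\,\Gamma(\alpha+p)/\sqrt{\alpha[\Gamma(\alpha)\Gamma(\alpha+2p) - \Gamma(\alpha+p)^2]}$, which follows from the standard gamma moments with $\beta$ as a scale parameter, and the function names, sample size and seed are hypothetical. The closed form is evaluated in log space because the bracketed difference is very small when $p$ is near zero.

```python
import math
import random


def gamma_power_corr(p, alpha):
    """Theoretical corr(X, X**p) for X ~ gamma(shape=alpha), any scale.

    Assumed closed form; requires p != 0 and alpha + 2p > 0.
    """
    if p == 0 or alpha + 2 * p <= 0:
        raise ValueError("requires p != 0 and alpha + 2p > 0")
    # d = log( Gamma(alpha) * Gamma(alpha + 2p) / Gamma(alpha + p)**2 )
    d = (math.lgamma(alpha) + math.lgamma(alpha + 2 * p)
         - 2 * math.lgamma(alpha + p))
    # expm1 keeps precision when d is tiny (p close to zero).
    return p / math.sqrt(alpha * math.expm1(d))


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def sample_corr(p, alpha, beta=1.0, n=20000, seed=0):
    """Correlation between simulated gamma values and their pth powers."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale).
    xs = [rng.gammavariate(alpha, beta) for _ in range(n)]
    ys = [x ** p for x in xs]
    return pearson(xs, ys)


if __name__ == "__main__":
    p = -0.001
    for alpha in (1.0, 10.0, 100.0):
        print(alpha, gamma_power_corr(p, alpha), sample_corr(p, alpha))
```

Calling `sample_corr` with `n=100` gives a rough feel for the small-sample behavior discussed above, although a single seed is only one realization.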

Bias Reduction
Consider an autoregressive time series $x_t$ of $n$ discrete observations obtained from a gamma distribution with a large shape parameter, to which a least squares model is fitted. The original and antithetic fitted values are combined as $\hat{x}_c(\tau) = \omega\,\hat{x}(\tau) + (1 - \omega)\,\hat{x}'(\tau)$, where $\omega$ is the combining weight. A shift parameter $\lambda$ similar to that discussed by Box and Cox [32] is used to facilitate the power transformation and further improve the combined fitted mse. $\lambda$ (determined by grid search) can be added to each value of $x_t$ to obtain $z_t = x_t + \lambda$ prior to applying the power transformation, and subtracted after conversion back to the original units, leaving the mean unchanged. While the data may be from a stationary time series, they are of necessity a truncated sample. Any truncated data sample will fall short of the complete distributional properties of the population from which it is drawn, and therefore of the property of stationarity. Of the terms $p$, $\lambda$, and $\omega$, only $p$ is unique to antithetic time series analysis, and it is not a fitted parameter.
When implemented, $p$ is actually a constant set to $-0.001$, an approximation of $0^-$. Also, since $\hat{\rho}_{z z^p} = -1$, the transformation involving $p$ is linear, and does not imply that the original model should have been non-linear. Like the use of $\lambda$ here, it is common practice to apply various transformations such as logarithm, square root and Box and Cox [32] that add no new information, but make the data better conform to the assumptions of a postulated model. If there were no bias, or if antithetic combining did not reduce bias, then $\omega$ would simply be equal to one and only the original postulated model would apply.
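The combining step can be illustrated with a small Python sketch. It assumes the combined value is a weighted sum $\omega f + (1 - \omega) f'$ of the original fitted values $f$ and the antithetic fitted values $f'$, and it chooses $\omega$ by a closed-form least squares solution. That choice of estimator is an assumption made for illustration and is not necessarily the weight optimization used in Ridley [27]. The antithetic fitted values are taken as given, since the conversion back to original units is not specified here, and all function names are hypothetical.

```python
def shift_power(x, lam=0.0, p=-0.001):
    """Power transform z_t ** p of the shifted series z_t = x_t + lam."""
    return [(v + lam) ** p for v in x]


def mse(actual, fitted):
    """Mean squared error between two equal-length sequences."""
    return sum((y - f) ** 2 for y, f in zip(actual, fitted)) / len(actual)


def least_squares_weight(actual, fitted, antithetic_fitted):
    """Weight omega minimizing the mse of omega*f + (1 - omega)*f'.

    Writing the error as (y - f') - omega*(f - f') gives
    omega = sum((y - f')(f - f')) / sum((f - f')**2).
    Returns 1.0 (original model only) when f and f' coincide.
    """
    num = sum((y - a) * (f - a)
              for y, f, a in zip(actual, fitted, antithetic_fitted))
    den = sum((f - a) ** 2 for f, a in zip(fitted, antithetic_fitted))
    return 1.0 if den == 0.0 else num / den


def combine(fitted, antithetic_fitted, omega):
    """Combined fitted values omega*f + (1 - omega)*f'."""
    return [omega * f + (1.0 - omega) * a
            for f, a in zip(fitted, antithetic_fitted)]


if __name__ == "__main__":
    # Toy numbers only; they are not data from the paper.
    f = [1.0, 2.0, 3.0, 4.0]
    a = [1.5, 1.0, 3.5, 3.0]
    y = [1.2, 1.6, 3.1, 3.7]
    w = least_squares_weight(y, f, a)
    print(w, mse(y, f), mse(y, combine(f, a, w)))
```

Because $\omega = 1$ is always available, the least squares weight can never give a larger in-sample mse than the original fitted values. Out-of-sample gains are a separate question.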

Computer Simulation
To illustrate, consider a model fitted to computer simulated data based on stationary autoregressive processes, containing 1060 observations generated from gamma distributions with a range of shape parameters $\alpha$. The results are shown in Table 3. As $\alpha$ increases, the fitted mse's increase, indicative, as expected, of the increase in the variance in the data. The combined fitted mse's are all lower than the original fitted mse's. The average gain is a reduction in fitted mse of 5.5%. This demonstrates that for a wide range of gamma distributions, combining antithetic fitted values can reduce the component of error that is due to systematic bias, leaving only random error. The sensitivities of the fitted mse and the 1000-period forecast horizon mse to the forecast origin ($n$) are shown in Table 4. As $n$ increases from 51 to 60, the combined fitted mse's are lower than the original fitted mse's. The average gain is a reduction of 11.1%. The average gain in the combined forecast mse over the original forecast mse is a reduction of 6.9%. The forecast mse sensitivities to forecast horizon ($N$) are shown in Table 5. As $N$ increases from 100 to 700, the combined forecast mse's are lower than the original forecast mse's. The average gain is a reduction of 6.1%.

Conclusion
The correlation between a gamma distributed random variable and its pth power was derived. It was proved that the correlation approaches minus one as $p$ approaches zero from the left and the shape parameter approaches infinity. This counterintuitive result extends a previous finding of a similar result for lognormally distributed random variables. The gamma distribution was modified so as to emulate a range of distributions, showing that antithetic time series analysis can be generalized to all data distributions that are likely to occur in practice. The gamma distribution is unimodal. A suggestion for future research is to investigate the correlation between a random variable and its pth power when its distribution is multimodal. Another suggestion is to compare the effectiveness of the Hammersley and Morton [23] antithetic random numbers with antithetic random numbers constructed from the method described in this paper. Combining antithetic extrapolations can dynamically reduce bias due to model misspecifications such as serial correlation, non-normality or truncation of the distribution due to data sampling. Removing bias will eliminate the divergence between the extrapolated and actual values. In the particular case of climate models, removing bias can reveal the true long range climate dynamics. This will be most useful in models designed to investigate the phenomenon of global warming. Beyond the examples discussed here, antithetic combining has broad implications for mathematical statistics, statistical process control, engineering and scientific modeling.