General Theory of Antithetic Time Series

Pierre Ngnepieba*, Dennis Ridley*

SBI, Florida A&M University, Tallahassee, USA
Department of Mathematics, Florida A&M University, Tallahassee, USA

* E-mail: pierre.ngnepieba@famu.edu (PN); dridley@fsu.edu (DR)

Journal of Applied Mathematics and Physics (JAMP), 2015, 3(12), 1726-1741. ISSN 2327-4352. Scientific Research Publishing. doi: 10.4236/jamp.2015.312197

Received 6 November 2015; accepted 27 December 2015; published 30 December 2015

Copyright © 2014 by authors and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/

A generalized antithetic time series theory for exponentially derived antithetic random variables is developed. The correlation function between a generalized gamma distributed random variable and its pth exponent is derived. We prove that the correlation approaches minus one as the exponent approaches zero from the left and the shape parameter approaches infinity.

Keywords: Antithetic Time Series Theory, Antithetic Random Variables, Bias Reduction, Gamma Distribution, Inverse Correlation, Serial Correlation
1. Introduction

Serially correlated random variables arise in ways that both benefit and bias mathematical models for science, engineering and economics. In one widespread example, a mathematical formula is used to create uniformly distributed pseudorandom numbers for use in Monte Carlo simulation. The numbers are serially correlated because they are generated by a formula. The benefit is that the same pseudorandom numbers can be recreated at will, so two or more simulation experiments can be compared without the random numbers themselves being a source of difference. The correlation is designed to be very small so as not to bias the results of a simulation experiment. Still, some bias is unavoidable when using serially correlated numbers (Ferrenberg, Landau, and Wong).

Another widespread example is a regression model in which the dependent variable is serially correlated. The result is biased model parameter estimates because the independence assumption of the Gauss-Markov theorem is violated (see Griliches, Nerlove, Koyck, Klein). Similarly, the independence assumption of Fuller and Hasza and Dufour would not apply. The absence of any relevant information from a model will express itself in the patterns of the error term. If complete avoidance of bias requires normally distributed data, then the absence of normality is like missing information. Bias may also be due to missing data points (Chandan and Jones; Li, Nychka and Ammann). Assume that a perfect model is postulated for a given application in which the population to which the data belong is known exactly. The model must be fitted to a sample of data, not the population. However, once the sample is taken, the distribution is automatically truncated and distorted, and the fitted model is biased. Regardless of the method of fitting, some sampling bias, however small, is unavoidable. One approach aimed at improving model performance is to combine the results from different models. For an extensive discussion and review of traditional combining see Bunn, Diebold, Clemen, Makridakis et al., and Winkler.

Economics researchers have commented on serial correlation bias. Hendry and Mizon and Hendry considered common factor analysis (Mizon) and suggested that serial correlation is a feature for representing dynamic relationships in economic models. This in turn implies that economics allows for serial correlation (see Pindyck and Rubinfeld). Time domain methods for detecting the nature and presence of serial correlation were considered by Durbin and Watson and Durbin. Spectral methods were considered by Hendry, Osborn and Espasa. Even if serial correlation can be a tool for studying the nature of economics, it is detrimental to long range forecasting models. Whatever the source of bias may be, the only possibility for long range forecasting is to completely eliminate the bias.

1.1. Background

Inversely correlated random numbers were suggested by Hammersley and Morton for use in Monte Carlo computer simulation experiments. In that application, a single computer simulation is replaced by two simulations. One simulation uses uniformly distributed random numbers r in [0, 1]. The other simulation uses 1 − r. The expectation is that the average of the results of these two simulations has a smaller variance than either one. In practice, the variance sometimes decreases, but sometimes it increases. See also Kleijnen.
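The pairing of r with 1 − r can be illustrated with a minimal sketch. The integrand (`math.exp`), the number of pairs and the function name `pair_estimates` are assumptions chosen for illustration; they are not taken from Hammersley and Morton or from this paper.

```python
import math
import random
import statistics


def pair_estimates(f, n_pairs, antithetic, seed=0):
    """Return n_pairs two-point averages of f over uniform random numbers.

    antithetic=False: each pair uses two independent uniforms.
    antithetic=True:  each pair uses r and its mirror image 1 - r.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_pairs):
        r = rng.random()
        partner = 1.0 - r if antithetic else rng.random()
        estimates.append(0.5 * (f(r) + f(partner)))
    return estimates


# Hypothetical integrand: exp is monotone on [0, 1], the favourable case.
independent = pair_estimates(math.exp, 5000, antithetic=False)
mirrored = pair_estimates(math.exp, 5000, antithetic=True)

var_independent = statistics.variance(independent)
var_mirrored = statistics.variance(mirrored)
print(var_independent, var_mirrored)
```

For a monotone integrand such as this one, f(r) and f(1 − r) are negatively correlated and the mirrored pairs have the smaller variance. For an integrand that is symmetric about 1/2, such as (r − 1/2)^2, the two members of a pair are identical and the variance of the pair average doubles relative to independent pairs, which is one way the increase mentioned above can arise.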

The theory of combining antithetic lognormally distributed random variables that contain negatively correlated components was introduced by Ridley. The Ridley antithetic time series theorem states that "if $X_t > 0$, $t = 1, 2, 3, \ldots$ is a discrete realization of a lognormal stochastic process, such that $\ln X_t \sim N(\mu, \sigma^2)$, then if the correlation between $X_t$ and $X_t^p$ is $\rho$, then $\lim_{p \to 0^-,\ \sigma \to 0} \rho = -1$." Antithetic variables can be combined so as to eliminate bias in fitted values associated with any autoregressive time series model (see the Ridley antithetic fitted function theorem, and antithetic fitted error variance function theorem). Similarly, antithetic forecasts obtained from a time series model can be combined so as to eliminate bias in the forecast error. Ridley applied combined antithetic forecasting to a wide range of data distributions. Ridley demonstrated the methodology for optimizing weights for combining antithetic forecasts. See also Ridley and Ngnepieba and Ridley, Ngnepieba, and Duke. The antithetic variables proof in Ridley was for the special case of lognormally distributed $X_t$.

The implication for using a biased mathematical model to investigate economic, engineering and scientific phenomena is that estimates obtained from the model are biased. Estimates of future values extrapolated from the model are also biased. As the forecast horizon increases, the bias accumulates and the extrapolations diverge from the actual values. This is most pronounced in the case of investigations into global warming phenomena. There, the horizon is by definition very far into the future. The smallest bias will accumulate, so much so that conclusions may be as much an artifact of the mathematical model as they are about climate dynamics. Combining antithetic extrapolations can dynamically remove the bias in the extrapolated values.

1.2. Proposed Research

The antithetic gamma variables discussed in this research are defined as follows.

Definition 1. Two random variables are antithetic if their correlation is negative. A bivariate collection of random variables is asymptotically antithetic if its limiting correlation approaches minus one asymptotically (see antithetic gamma variables theorem below).

Definition 2. $\{X_t(\xi)\}$ is an ensemble of random variables, where $\xi$ belongs to a sample space and t belongs to an index set representing time, such that $x_t$ is a discrete realization of a gamma stationary stochastic process from the ensemble, and $x_t$, $t = 1, 2, \ldots, n$, are serially correlated.

In this paper, we extend the discovery by Ridley beyond the lognormal distribution. The gamma distribution is very important for technical reasons, since it is the parent of the exponential distribution and can represent a wide range of other distributions. We will explore these possibilities by examining the correlation between X and $X^p$ when X is gamma distributed. Of particular interest is the correlation between X and $X^p$ as p approaches zero from the left. A graph of the correlation between X and $X^p$ as p approaches zero is shown in Figure 1. We begin by reviewing the obvious results for the case when p is positive. The correlation is positive when p is positive and exactly one when p is one. This is expected. As p moves away from one, the correlation decreases. As p approaches zero from the right, the correlation falls, albeit very slowly. This is also expected. In the case when p is negative, the correlation behaves quite differently. As expected, the correlation is negative. The rest of the behavior is counterintuitive: unlike when p is positive, as p approaches zero from the left, the absolute value of the correlation increases. Furthermore, the actual correlation approaches minus one, not zero.

One purpose of this paper is to derive an analytical function for the correlation between X and $X^p$ when X is gamma distributed. A second purpose is to explore, by extensive computation, the behavior of the correlation as p approaches zero from the left. The trivial case of p equal to zero, where the correlation is zero, is of no interest. We are interested in p inside a delta neighborhood of zero, not zero itself. Finally, we prove that the limiting value of the correlation is minus one.

The paper is organized as follows. In Section 2 we review the gamma distribution. In Section 3 we derive the analytic function for the correlation. In Section 4 we prove its limiting value. In Section 5 we use MATLAB  to compute correlations for a wide range of values generated from the gamma distribution. In Section 6 we outline the method for using antithetic variables to dynamically remove bias from the fitted and forecast values obtained from a time series model. Examples include computer simulated data. Section 7 contains conclusions and suggestions for further research.

2. The Gamma Distribution

The gamma distribution is very important for technical reasons, since it is the parent of the exponential distribution and can explain many other distributions. Its probability density function (pdf) (see Hogg and Ledolter) is:

$$f(x) = \frac{x^{\alpha-1} e^{-x/\beta}}{\Gamma(\alpha)\,\beta^{\alpha}}, \quad x > 0, \tag{1}$$

where $\alpha > 0$ is the shape parameter and $\beta > 0$ is the scale parameter. The gamma function is defined as

$$\Gamma(\alpha) = \int_0^{\infty} t^{\alpha-1} e^{-t}\,dt. \tag{2}$$

A graph of the gamma probability density function for various parameter values is shown in Figure 2.

Figure 1. Behavior of r as p approaches 0.

Figure 2. Exploring the effect of varying parameter values in the pdf of the gamma distribution.

3. Correlation between X and $X^p$

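Let $X_t$, $t = 1, 2, \ldots, n$, be a discrete realization of a generalized gamma stochastic process with shape parameter $\alpha$ and scale parameter $\beta$. For $\alpha + p > 0$, from Appendix A, the pth moment of the gamma distribution is given by

$$E[X^p] = \frac{\beta^{p}\,\Gamma(\alpha+p)}{\Gamma(\alpha)}. \tag{3}$$

Therefore, since $\Gamma(\alpha+1) = \alpha\Gamma(\alpha)$, the mean is

$$\mu = E[X] = \alpha\beta,$$

the second moment is

$$E[X^2] = \alpha(\alpha+1)\beta^2,$$

and the variance is

$$\sigma^2 = E[X^2] - (E[X])^2 = \alpha\beta^2. \tag{4}$$

Let $\rho$ be the correlation between $X$ and $X^p$. Then

$$\rho = \frac{\mathrm{Cov}(X, X^p)}{\sigma_X\,\sigma_{X^p}} \tag{5}$$

and

$$\mathrm{Cov}(X, X^p) = E[X^{p+1}] - E[X]\,E[X^p].$$

Using Equation (3)

$$\sigma_{X^p}^2 = E[X^{2p}] - (E[X^p])^2 = \beta^{2p}\left[\frac{\Gamma(\alpha+2p)}{\Gamma(\alpha)} - \frac{\Gamma(\alpha+p)^2}{\Gamma(\alpha)^2}\right]. \tag{6}$$

Therefore, using Equations (3) and (6), Equation (5) becomes

$$\rho = \frac{\beta^{p+1}\dfrac{\Gamma(\alpha+p+1)}{\Gamma(\alpha)} - \alpha\beta\,\beta^{p}\dfrac{\Gamma(\alpha+p)}{\Gamma(\alpha)}}{\sqrt{\alpha}\,\beta\,\beta^{p}\sqrt{\dfrac{\Gamma(\alpha+2p)}{\Gamma(\alpha)} - \dfrac{\Gamma(\alpha+p)^2}{\Gamma(\alpha)^2}}}. \tag{7}$$

Since $\Gamma(\alpha+p+1) = (\alpha+p)\Gamma(\alpha+p)$, Equation (7) becomes,

$$\rho = \frac{p\,\Gamma(\alpha+p)}{\sqrt{\alpha}\,\sqrt{\Gamma(\alpha)\Gamma(\alpha+2p) - \Gamma(\alpha+p)^2}}. \tag{8}$$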

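From Equation (4), $\alpha = \sigma^2/\beta^2$, where $\sigma$ is the standard deviation of X and $\beta$ is the scale parameter, and the correlation can be expressed in terms of $\sigma$ by substituting this value of $\alpha$ into Equation (8):

$$\rho = \frac{p\,\beta\,\Gamma(\sigma^2/\beta^2 + p)}{\sigma\,\sqrt{\Gamma(\sigma^2/\beta^2)\,\Gamma(\sigma^2/\beta^2 + 2p) - \Gamma(\sigma^2/\beta^2 + p)^2}}. \tag{9}$$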

Equation (8) can produce a complex number when the argument of a gamma function is negative. This is avoided if $\alpha + 2p > 0$. In any case, since we are only interested in p approaching zero from the left, this condition will always be satisfied for p sufficiently close to zero.
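Equation (8) can be evaluated numerically with a short sketch. The function name `corr_x_xp` and the grid of values are assumptions made for illustration, and the log-gamma rewriting is a numerical convenience rather than the authors' MATLAB implementation. The scale parameter cancels in Equation (8), so only p and the shape parameter are needed.

```python
import math


def corr_x_xp(p, alpha):
    """Correlation between X and X**p for gamma X with shape alpha, per Equation (8).

    Equation (8) is rewritten as p / sqrt(alpha * (exp(d) - 1)), where
    d = ln G(alpha) + ln G(alpha + 2p) - 2 ln G(alpha + p), so that large
    gamma values never have to be formed or subtracted directly.
    """
    if p == 0 or alpha <= 0 or alpha + 2 * p <= 0:
        raise ValueError("need p != 0, alpha > 0 and alpha + 2p > 0")
    d = math.lgamma(alpha) + math.lgamma(alpha + 2 * p) - 2 * math.lgamma(alpha + p)
    return p / math.sqrt(alpha * math.expm1(d))


# Illustrative grid (values chosen here, not taken from the paper's tables).
for alpha in (1, 5, 25):
    row = [round(corr_x_xp(p, alpha), 4) for p in (-0.4, -0.1, -0.001)]
    print(alpha, row)
```

With p = 1 the function returns one, as it must, and for small negative p the value moves toward minus one as the shape parameter grows. For extremely small |p| or very large shape parameters the difference d is dominated by rounding error, so this sketch should not be pushed far beyond the ranges shown.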

4. Antithetic Gamma Variables Theorem

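Theorem 1. If $X_t > 0$, $t = 1, 2, 3, \ldots, n$, is a discrete realization of a generalized gamma stochastic process with shape parameter $\alpha$, then if $\rho$ is the correlation between $X_t$ and $X_t^p$, then

$$\lim_{p \to 0^-,\ \alpha \to \infty} \rho = -1.$$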

See proof in Appendix B.

5. Correlation versus p

The effect of p on the correlation is demonstrated by calculating the correlation coefficient from Equation (8) for various values of $\alpha$ and p. The correlation coefficients are listed in Table 1 and plotted in Figure 3 and Figure 4. From Figure 3 and Figure 4, for all values of $\alpha$, the correlation coefficient gets closer to $-1$ as $p \to 0^-$. For all values of p, the correlation coefficient gets closer to $-1$ as $\alpha$ increases. From Figure 3, smaller values of $\alpha$ produce distributions that are more asymmetrical, and larger values produce distributions that are more symmetrical. Also, as $\alpha$ increases the standard deviation increases. This is indicative of greater spread about both sides of the mode of X. Equation (9) expresses the correlation in terms of the standard deviation. In practice the value of the standard deviation will be that for the actual data under study. It cannot be modified. Still, one might say that it appears that the effect of p on reversing the correlation is greatest for symmetrical distributions.

Table 1 and Figure 3 captions: correlation computed using Equation (8) for shape parameter values up to 25 and a range of values of p.

To validate Equation (8), the MATLAB random number generator GAMRND($\alpha$, $\beta$, n) is used to generate n random numbers $x_i$, $i = 1, 2, \ldots, n$, from the gamma distribution in Equation (1) with parameters $\alpha$, $\beta$. The correlation $\rho$ is estimated from the sample correlation coefficient (r). One application of the correlation reversal is to remove bias in values extrapolated from a time series model (see Appendix C). The gamma distribution is immediately applicable when $\alpha$ is large, such that $\rho$ is approximately minus one. For smaller $\alpha$, the difference between $\rho$ and minus one may introduce an error in estimating values extrapolated from the time series model. The sample correlation coefficient is obtained from

$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2\,\sum_{i=1}^{n}(y_i - \bar{y})^2}},$$

where $y_i = x_i^p$, $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, and $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$.

The results are shown in Table 2. The coefficients are almost identical to the theoretical values obtained from Equation (8) and listed in Table 1. In practice, the data may include relatively few observations. To investigate the small sample correlation coefficient, the correlation coefficient is also calculated for small values of n.
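The validation described above can be sketched in Python, with `random.gammavariate` standing in for the MATLAB generator. The parameter values, sample size, seed and function names are assumptions chosen for illustration, so the numbers produced will not reproduce Table 2.

```python
import math
import random


def theoretical_corr(p, alpha):
    """Equation (8), evaluated through log-gamma for numerical stability."""
    d = math.lgamma(alpha) + math.lgamma(alpha + 2 * p) - 2 * math.lgamma(alpha + p)
    return p / math.sqrt(alpha * math.expm1(d))


def sample_corr(xs, ys):
    """Sample correlation coefficient r between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulated_corr(p, alpha, beta, n, seed=0):
    """Draw n gamma(alpha, beta) values and return r between x and x**p."""
    rng = random.Random(seed)
    xs = [rng.gammavariate(alpha, beta) for _ in range(n)]
    ys = [x ** p for x in xs]
    return sample_corr(xs, ys)


# Hypothetical settings: shape 25, scale 2, p = -0.001, 20000 draws.
r_hat = simulated_corr(-0.001, 25.0, 2.0, 20000)
rho = theoretical_corr(-0.001, 25.0)
print(r_hat, rho)
```

Repeating the call with a much smaller n gives a feel for the small sample behavior mentioned above, since r then varies noticeably from seed to seed.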

6. Bias Reduction

Consider an autoregressive time series of n discrete observations $x_t$, $t = 1, 2, \ldots, n$, obtained from a gamma distribution with a large shape parameter, to which a least squares model is fitted. Let the fitted values be $\hat{x}_t$. Next, consider the combined weighted average fitted values $\hat{x}_{c,t} = \omega\hat{x}_t + (1-\omega)\hat{x}'_t$. The parameter $\omega$ is a combining weight. The fitted values $\hat{x}_t$ and $\hat{x}'_t$ are antithetic in the sense that they contain components of error that are biased and that, when weighted, are perfectly negatively correlated. The antithetic component is estimated from

$$\hat{x}'_t = \bar{x} + r\,\frac{s_x}{s_{\hat{x}^p}}\left(\hat{x}_t^{\,p} - \overline{\hat{x}^{p}}\right),$$

where the exponent p of the power transformation is set to a small negative value, r denotes the sample correlation coefficient (here between $x_t$ and $\hat{x}_t^{\,p}$) and s denotes sample standard deviation (see Appendix C for an outline of how inverse correlation can be used to eliminate bias in $\hat{x}_t$). The expectation is that if $\hat{x}_t$ are biased, then $\hat{x}_{c,t}$ will exhibit diminishing bias as $p \to 0^-$. If $\hat{x}_t$ are unbiased then $\omega = 1$ and the combined fitted values are just the original fitted values. The corresponding combined forecast values are obtained by applying the same combination to the forecast values.

Figure 4 caption: behavior of the correlation using Equation (8) for various parameter values.

Table 2 caption: values of the sample correlation coefficient.

A shift parameter similar to that discussed by Box and Cox is used to facilitate the power transformation and further improve the combined fitted mse. The shift (determined by grid search) can be added to each data value prior to applying the power transformation and subtracted after conversion back to the original units, leaving the mean unchanged. While the data may be from a stationary time series, they are of necessity a truncated sample. Any truncated data sample will fall short of the complete distributional properties of the population from which it is drawn, and therefore of the property of stationary data. The

antithetic time series is rewritten and computed from, , where

, is the sample correlation between and, where and are sample standard deviations in and respectively, and where and are chosen so as to minimize the combined fitted mse for. The antithetic forecast values are computed from

,.

Of the exponent p, the shift parameter and the combining weight, only p is unique to antithetic time series analysis, and it is not a fitted parameter. When implemented, p is actually a constant set to a small negative value that approximates zero from the left. Also, since the correlation is then approximately minus one, the transformation involving p is effectively linear, and does not imply that the original model should have been non-linear. Like the use of p here, it is common practice to apply various transformations such as logarithm, square root and Box and Cox that add no new information, but make the data better conform to the assumptions of a postulated model. If there were no bias, or if antithetic combining did not reduce bias, then the combining weight would simply be equal to one and only the original postulated model would apply.

Computer Simulation

To illustrate, consider a model fitted to computer simulated data based on stationary autoregressive processes, containing 1060 observations generated from a first order autoregressive model, where to avoid initialization problems, the first 250 generated values are dropped, and the gamma distributed random numbers are obtained from MATLAB. From the 1060 values, different models are fitted from the first 50, 51, ..., 60 values. Each model is used to forecast 1000 one-step-ahead forecast values corresponding to periods 51 - 1050, 52 - 1051, ..., 61 - 1060. This simple first order autoregressive model is chosen for its ease of understanding and transparency. It is perfect for the population from which the data are sampled. The sample sizes are typical of what can be expected in practice, and the outcomes from model fitting are subject to sampling bias.

The results are shown in Table 3. As the shape parameter increases, the fitted mse's increase, indicative, as expected, of the increase in the variance in the data. The combined fitted mse's are all lower than the original fitted mse's. The average gain is a reduction in fitted mse of 5.5%. This demonstrates that for a wide range of gamma distributions, combining antithetic fitted values can reduce the component of error that is due to systematic bias, leaving only random error. The fitted mse and 1000 period forecast horizon mse sensitivities to forecast origin (n) are shown in Table 4. As n increases from 51 to 60, the combined fitted mse's are lower than the original fitted mse's. The average gain is a reduction of 11.1%. The average gain in the combined forecast mse over the original forecast mse is a reduction of 6.9%. The forecast mse sensitivities to forecast horizon (N) are shown in Table 5. As N increases from 100 to 700, the combined forecast mse's are lower than the original forecast mse's. The average gain is a reduction of 6.1%.

Table 3. Fitted mean square error (mse) for gamma distributed autoregressive processes and various values of the shape parameter.

Shape   Shift   Weight   Original   Combined   Reduction %
5       370     −46      1.377      1.209      12.2
10      520     −39      4.753      4.394      7.6
15      481     −17      6.892      6.732      2.3
20      575     −11      6.922      6.824      1.4
25      529     −43      9.686      9.304      3.9
Average                                        5.5

Table 4. Fitted mean square error (mse) and one thousand period forecast mean square error (mse) for gamma distributed autoregressive processes of length n.

Forecast                   Fitted mse             Forecast mse
origin n  Shift   Weight   Original   Combined    Original   Combined
50        370     −46      1.377      1.209       6.158      5.760
51        411     −46      1.351      1.185       6.230      5.407
52        252     −46      1.354      1.202       6.493      5.641
53        446     −39      1.397      1.266       6.071      5.943
54        310     −55      1.371      1.160       6.070      5.611
55        261     −37      1.356      1.254       5.941      5.549
56        465     −41      1.345      1.189       6.108      5.720
57        400     −39      1.331      1.218       6.234      5.592
58        507     −36      1.312      1.191       6.142      6.224
59        405     −32      1.290      1.175       6.044      5.687
60        354     −62      1.434      1.207       5.600      5.300
Average                    1.356      1.205       6.099      5.676
Combined reduction %                  11.1                   6.9

Table 5. Forecast mean square error (mse) for gamma distributed autoregressive processes.

Forecast     Forecast mse
horizon N    Original   Combined   Reduction %
100          5.973      5.581      6.6
150          6.529      6.059      7.2
200          5.426      5.220      3.8
250          6.559      6.152      6.2
300          6.455      6.181      4.2
350          6.390      6.051      5.3
400          5.982      5.630      5.9
450          5.906      5.554      6.0
500          5.972      5.607      6.1
550          6.114      5.699      6.8
600          6.063      5.616      7.4
650          5.818      5.401      7.2
700          5.632      5.243      6.9
Average      6.063      5.692      6.1

7. Conclusion

The correlation between a gamma distributed random variable and its pth power was derived. It was proved that the correlation approaches minus one as p approaches zero from the left and the shape parameter approaches infinity. This counterintuitive result extends a previous finding of the similar result for lognormally distributed random variables. The gamma distribution was modified so as to emulate a range of distributions, showing that antithetic time series analysis can be generalized to all data distributions that are likely to occur in practice. The gamma distribution is unimodal. A suggestion for future research is to investigate the correlation between a random variable and its pth power when its distribution is multimodal. Another suggestion is to compare the effectiveness of the Hammersley and Morton antithetic random numbers with antithetic random numbers constructed from the method described in this paper. Combining antithetic extrapolations can dynamically reduce bias due to model misspecifications such as serial correlation, non-normality or truncation of the distribution due to data sampling. Removing bias will eliminate the divergence between the extrapolated and actual values. In the particular case of climate models, removing bias can reveal the true long range climate dynamics. This will be most useful in models designed to investigate the phenomenon of global warming. Beyond the examples discussed here, antithetic combining has broad implications for mathematical statistics, statistical process control, engineering and scientific modeling.

Acknowledgements

The authors would like to thank Dennis Duke for probing questions and good discussions.

Cite this paper

Pierre Ngnepieba, Dennis Ridley (2015) General Theory of Antithetic Time Series. Journal of Applied Mathematics and Physics, 03, 1726-1741. doi: 10.4236/jamp.2015.312197

Appendix A: pth Order Moment for the Gamma Distribution

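The pth moment of the gamma distribution with shape parameter $\alpha$ and scale parameter $\beta$ is derived as follows:

$$E[X^p] = \int_0^{\infty} x^{p}\,\frac{x^{\alpha-1} e^{-x/\beta}}{\Gamma(\alpha)\,\beta^{\alpha}}\,dx = \int_0^{\infty} \frac{x^{\alpha+p-1} e^{-x/\beta}}{\Gamma(\alpha)\,\beta^{\alpha}}\,dx. \tag{A.1}$$

Multiplying and dividing by $\Gamma(\alpha+p)\,\beta^{\alpha+p}$, Equation (A.1) becomes

$$E[X^p] = \frac{\beta^{p}\,\Gamma(\alpha+p)}{\Gamma(\alpha)} \int_0^{\infty} \frac{x^{\alpha+p-1} e^{-x/\beta}}{\Gamma(\alpha+p)\,\beta^{\alpha+p}}\,dx. \tag{A.2}$$

Since the integrand is the pdf for a gamma distribution with the shape parameter $\alpha + p$, the integral equals one,

and equation (A.2) becomes

$$E[X^p] = \frac{\beta^{p}\,\Gamma(\alpha+p)}{\Gamma(\alpha)}. \tag{A.3}$$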
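As a sanity check on the moment formula, the closed form can be compared with a Monte Carlo average. This is only a sketch: the function names, parameter values, sample size and seed are assumptions for illustration.

```python
import math
import random


def gamma_moment(p, alpha, beta):
    """Closed-form E[X**p] = beta**p * Gamma(alpha + p) / Gamma(alpha)."""
    if alpha <= 0 or beta <= 0 or alpha + p <= 0:
        raise ValueError("need alpha > 0, beta > 0 and alpha + p > 0")
    return beta ** p * math.exp(math.lgamma(alpha + p) - math.lgamma(alpha))


def monte_carlo_moment(p, alpha, beta, n=100000, seed=1):
    """Average of x**p over n draws from gamma(shape=alpha, scale=beta)."""
    rng = random.Random(seed)
    return sum(rng.gammavariate(alpha, beta) ** p for _ in range(n)) / n


# Hypothetical check with a negative, non-integer exponent.
exact = gamma_moment(-0.5, 5.0, 2.0)
estimate = monte_carlo_moment(-0.5, 5.0, 2.0)
print(exact, estimate)
```

Setting p = 1 and p = 2 in `gamma_moment` recovers the mean and second moment used in Section 3.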

Appendix B: Proof of the Antithetic Gamma Variables Theorem

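By applying the Taylor expansion around $\alpha$ to Equation (8) we have

$$\Gamma(\alpha+p) = \Gamma(\alpha) + p\,\Gamma'(\alpha) + \frac{p^2}{2}\Gamma''(\alpha) + R_1(p), \tag{B.1}$$

$$\Gamma(\alpha+2p) = \Gamma(\alpha) + 2p\,\Gamma'(\alpha) + 2p^2\,\Gamma''(\alpha) + R_2(p), \tag{B.2}$$

$$\Gamma(\alpha+p)^2 = \Gamma(\alpha)^2 + 2p\,\Gamma(\alpha)\Gamma'(\alpha) + p^2\left[\Gamma'(\alpha)^2 + \Gamma(\alpha)\Gamma''(\alpha)\right] + R_3(p), \tag{B.3}$$

where $\Gamma'(\alpha)$, $\Gamma''(\alpha)$ are the first and second derivatives of $\Gamma(\alpha)$ and $R_i(p)$ represents the remainder, which is of order $p^3$.

The combination of equations (B.1)-(B.3) reduces Equation (8) to

$$\rho = \frac{p\,\Gamma(\alpha+p)}{\sqrt{\alpha}\,\sqrt{p^2\left[\Gamma(\alpha)\Gamma''(\alpha) - \Gamma'(\alpha)^2\right] + R(p)}}.$$

Therefore, since $p/|p| = -1$ for negative p,

$$\lim_{p \to 0^-} \rho = \frac{-\Gamma(\alpha)}{\sqrt{\alpha}\,\sqrt{\Gamma(\alpha)\Gamma''(\alpha) - \Gamma'(\alpha)^2}}. \tag{B.4}$$

By using the polygamma function (see Abramowitz and Stegun)

$$\psi^{(1)}(\alpha) = \frac{d^2}{d\alpha^2}\ln\Gamma(\alpha) = \frac{\Gamma(\alpha)\Gamma''(\alpha) - \Gamma'(\alpha)^2}{\Gamma(\alpha)^2},$$

Equation (B.4) is transformed into

$$\lim_{p \to 0^-} \rho = \frac{-1}{\sqrt{\alpha\,\psi^{(1)}(\alpha)}}. \tag{B.5}$$

The digamma function for real $\alpha$, as $\alpha \to \infty$, is

$$\psi(\alpha) = \ln\alpha - \frac{1}{2\alpha} - \frac{1}{12\alpha^2} + \cdots$$

(see also Bernardo).

Its derivative is the polygamma function

$$\psi^{(1)}(\alpha) = \frac{1}{\alpha} + \frac{1}{2\alpha^2} + \frac{1}{6\alpha^3} - \cdots$$

And,

$$\alpha\,\psi^{(1)}(\alpha) = 1 + \frac{1}{2\alpha} + \frac{1}{6\alpha^2} - \cdots$$

From which, $\lim_{\alpha \to \infty} \alpha\,\psi^{(1)}(\alpha) = 1$. Finally, the limit in Equation (B.5) is

$$\lim_{p \to 0^-,\ \alpha \to \infty} \rho = -1.$$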
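The two limits in the proof can be checked numerically. The sketch below assumes the limit as p approaches zero from the left has the form −1/sqrt(α ψ'(α)), with ψ' the trigamma function; the helper names and the trigamma routine (recurrence plus the standard asymptotic series) are illustrative choices, not the authors' code.

```python
import math


def trigamma(x):
    """Trigamma psi'(x) for x > 0: shift x upward with psi'(x) = psi'(x+1) + 1/x**2,
    then apply the asymptotic series 1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7)."""
    if x <= 0:
        raise ValueError("x must be positive")
    total = 0.0
    while x < 10.0:
        total += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    total += inv + 0.5 * inv2 + inv * inv2 * (1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 / 42.0))
    return total


def limit_corr(alpha):
    """Assumed p -> 0- limit of Equation (8): -1 / sqrt(alpha * trigamma(alpha))."""
    return -1.0 / math.sqrt(alpha * trigamma(alpha))


def corr_x_xp(p, alpha):
    """Equation (8), evaluated through log-gamma for numerical stability."""
    d = math.lgamma(alpha) + math.lgamma(alpha + 2 * p) - 2 * math.lgamma(alpha + p)
    return p / math.sqrt(alpha * math.expm1(d))


# Compare Equation (8) at a small negative p with the limiting expression.
for alpha in (1.0, 5.0, 25.0, 100.0):
    print(alpha, corr_x_xp(-0.001, alpha), limit_corr(alpha))
```

The second column tracks the third closely, and the third column approaches minus one only as the shape parameter grows, which is why both limits appear in the theorem.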

Appendix C: Inverse Correlation and Bias Elimination

Consider a gamma distributed time series $X_t$ with a large shape parameter from which $x_t$, $t = 1, 2, \ldots, n$, are observations. We have shown that for very small negative p, $x_t$ and $x_t^p$ are nearly perfectly correlated, albeit negatively, so we can express $x_t^p$ in the original units of $x_t$, by means of the linear regression of $x_t$ on $x_t^p$ as follows:

$$x_t = a + b\,x_t^p + \varepsilon_t \tag{C.1}$$

where $\varepsilon_t$ is an error term.

As p approaches zero from the left, near perfect correlation between $x_t$ and $x_t^p$ ensures that the error term becomes negligible, and a near perfect estimate of $x_t$ is obtained from $a + b\,x_t^p$.

Now, suppose that

is a time series model. If there is any bias due either to serial correlation in or sampling error in estimating the model, the estimated model will be biased such that. The estimated parameters of this model will be biased. That is unavoidable. Therefore, any estimate of from this model will also be biased.

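To remove this bias, we power transform $\hat{x}_t$ to obtain $\hat{x}_t^{\,p}$. Then, we use Equation (C.1) to convert back to the original units of $x_t$. Hence

$$\hat{x}'_t = \hat{a} + \hat{b}\,\hat{x}_t^{\,p}$$

where $\hat{a}$ and $\hat{b}$ are least squares estimates obtained from the regression of $x_t$ on $\hat{x}_t^{\,p}$, and the error approaches 0.

Denoting sample standard deviation by s and correlation coefficient by r, the least squares estimates are $\hat{b} = r\,s_x/s_{\hat{x}^p}$ and $\hat{a} = \bar{x} - \hat{b}\,\overline{\hat{x}^{p}}$, so that

$$\hat{x}'_t = \bar{x} + r\,\frac{s_x}{s_{\hat{x}^p}}\left(\hat{x}_t^{\,p} - \overline{\hat{x}^{p}}\right)$$

(see also the Ridley antithetic fitted function theorem).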

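Both estimates $\hat{x}_t$ and $\hat{x}'_t$ contain errors. These errors contain two components. One component is purely random and one component is bias. Combining the estimates dynamically cancels the bias components, leaving only the purely random components. See Appendix D for the proof of how this can occur. The combining weights discussed in Appendix D are theoretical, expressed in terms of errors that are unknown and unobservable, so we must rely on the approximation as follows. The combined estimate is obtained from

$$\hat{x}_{c,t} = \omega\,\hat{x}_t + (1-\omega)\,\hat{x}'_t$$

where the value of $\omega$ is chosen so as to minimize the mse

$$\mathrm{mse} = \frac{1}{n}\sum_{t=1}^{n}\left(x_t - \hat{x}_{c,t}\right)^2.$$

Consider the error in $\hat{x}_{c,t}$, $e_{c,t} = x_t - \hat{x}_{c,t} = (x_t - \hat{x}'_t) - \omega(\hat{x}_t - \hat{x}'_t)$. Then $\mathrm{mse} = \frac{1}{n}\sum_t \left[(x_t - \hat{x}'_t) - \omega(\hat{x}_t - \hat{x}'_t)\right]^2$. Differentiating with respect to $\omega$, $\frac{d\,\mathrm{mse}}{d\omega} = -\frac{2}{n}\sum_t (\hat{x}_t - \hat{x}'_t)\left[(x_t - \hat{x}'_t) - \omega(\hat{x}_t - \hat{x}'_t)\right]$. Setting the derivative to zero and solving for $\omega$,

$$\omega = \frac{\sum_t (\hat{x}_t - \hat{x}'_t)(x_t - \hat{x}'_t)}{\sum_t (\hat{x}_t - \hat{x}'_t)^2}.$$

This optimal $\omega$ yields the minimum mse, because $\frac{d^2\,\mathrm{mse}}{d\omega^2} = \frac{2}{n}\sum_t (\hat{x}_t - \hat{x}'_t)^2 = 0$ iff $\hat{x}_t = \hat{x}'_t$ for all t, in which case the combined and original fitted values coincide, and it is positive otherwise.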

The steps for obtaining the combined antithetic fitted values are outlined as follows:

Step 1: Estimate the model parameters and fitted values.

Step 2: Set p to a small negative value.

Step 3: Calculate the antithetic fitted values.

Step 4: Calculate the combined fitted values.
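The steps above can be sketched as follows, starting from fitted values that have already been obtained in Step 1. Several things here are assumptions made for illustration: the antithetic values are taken to be the least squares regression of the observations on the power transformed fitted values, the weight is the closed-form least squares minimizer of the fitted mse, the shift parameter is omitted, and the names `antithetic_combine`, `mse` and the small data set are hypothetical.

```python
import math


def mse(actual, estimate):
    """Mean square error between two equal-length sequences."""
    return sum((a - e) ** 2 for a, e in zip(actual, estimate)) / len(actual)


def antithetic_combine(actual, fitted, p=-0.001):
    """Return (weight, antithetic, combined) for positive fitted values.

    Step 2: p is a small negative constant.
    Step 3: regress actual on fitted**p to express fitted**p in original units.
    Step 4: pick the weight that minimizes the fitted mse and combine.
    """
    if len(actual) != len(fitted) or min(fitted) <= 0:
        raise ValueError("need equal lengths and strictly positive fitted values")
    n = len(actual)
    z = [f ** p for f in fitted]
    x_bar = sum(actual) / n
    z_bar = sum(z) / n
    szz = sum((zi - z_bar) ** 2 for zi in z)
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(actual, z))
    slope = sxz / szz  # equals r * s_x / s_z
    antithetic = [x_bar + slope * (zi - z_bar) for zi in z]
    diff = [f - a for f, a in zip(fitted, antithetic)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        weight = 1.0  # both estimates coincide, nothing to combine
    else:
        weight = sum(d * (x - a) for d, x, a in zip(diff, actual, antithetic)) / denom
    combined = [weight * f + (1.0 - weight) * a for f, a in zip(fitted, antithetic)]
    return weight, antithetic, combined


# Hypothetical observations and fitted values, for illustration only.
actual = [2.0, 3.0, 2.5, 4.0, 3.5, 5.0]
fitted = [2.2, 2.8, 2.9, 3.6, 3.9, 4.5]
weight, antithetic, combined = antithetic_combine(actual, fitted)
print(weight, mse(actual, fitted), mse(actual, combined))
```

Because a weight of one reproduces the original fitted values, the in-sample combined mse from this sketch cannot exceed the original fitted mse. Whether a gain carries over to forecasts depends on the data and is not guaranteed by this construction.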

Likewise, the unbiased combined estimate of a future value at time is obtained from

Appendix D: Antithetic Fitted Error Variance Reduction

Consider a gamma distributed time series with a large shape parameter. Next, consider a minimum mean square error fitted value obtained from a stationary first-autoregressive process, given by where and and are least-squares estimates of and, respectively, such that

Therefore, as, and since is stationary so that, and since the errors are serially correlated so that,

(see also Fuller  , p. 404). Consider as an estimate of. From (D.1), and given that the time series is stationary, then as and,

where

due only to errors resulting from serial correlation. Therefore,

Next, consider another fitted value, obtained from the linear projection of the asymptotically antithetic series on, without the introduction of any new error,

Substituting for

where is the correlation between and, and

is the antithetic error due to the serial correlation, but corresponding to.

The expansion of will

contain the constant, the product of p and some function of and as follows:

as. Substituting into Equation (D.5),

Now

Substituting from Equation (D.2) and (D.6) and since and are fixed for the data and model,

Substituting for and

Substituting from (D.3) and factoring out

and since (see Appendix B), and and as, then

from which we see that there are many

ways in which the combined error variance can be less than the original error variance in Equation (D.3). In

particular when, the error variance due to systematic serial correlation

vanishes. The only error variance remaining will be due purely to random error unexplained by the original model.

References

Ferrenberg, A.M., Landau, D.P. and Wong, Y.J. (1992) Monte Carlo Simulations: Hidden Errors from "Good" Random Number Generators. Physical Review Letters, 69, 3382-3384. http://dx.doi.org/10.1103/PhysRevLett.69.3382
Griliches, Z. (1961) A Note on Serial Correlation Bias in Estimates of Distributed Lags. Econometrica, 29, 65-73. http://dx.doi.org/10.2307/1907688
Nerlove, M. (1958) Distributed Lags and Demand Analysis for Agricultural and Other Commodities. U.S.D.A Agricultural Handbook No. 141, Washington.
Koyck, L.M. (1954) Distributed Lags and Investment Analysis. North-Holland Publishing Co., Amsterdam.
Klein, L.R. (1958) The Estimation of Distributed Lags. Econometrica, 26, 553-565. http://dx.doi.org/10.2307/1907516
Fuller, W.A. and Hasza, D.P. (1981) Properties of Predictors from Autoregressive Time Series. Journal of the American Statistical Association, 76, 155-161. http://dx.doi.org/10.1080/01621459.1981.10477622
Dufour, J. (1985) Unbiasedness of Predictions from Estimated Vector Autoregressions. Econometric Theory, 1, 381-402. http://dx.doi.org/10.1017/S0266466600011270
Chandan, S. and Jones, P. (2005) Asymptotic Bias in the Linear Mixed Effects Model under Non-Ignorable Missing Data Mechanisms. Journal of the Royal Statistical Society: Series B, 67, 167-182. http://dx.doi.org/10.1111/j.1467-9868.2005.00494.x
Li, B., Nychka, D.W. and Ammann, C.M. (2010) The Value of Multiproxy Reconstruction of Past Climate. Journal of the American Statistical Association, 105, 883-911. http://dx.doi.org/10.1198/jasa.2010.ap09379
Bunn, D.W. (1979) The Synthesis of Predictive Models in Marketing Research. Journal of Marketing Research, 16, 280-283. http://dx.doi.org/10.2307/3150692
Diebold, F.X. (1989) Forecast Combination and Encompassing: Reconciling Two Divergent Literatures. International Journal of Forecasting, 5, 589-592. http://dx.doi.org/10.1016/0169-2070(89)90014-9
Clemen, R.T. (1989) Combining Forecasts: A Review and Annotated Bibliography. International Journal of Forecasting, 5, 559-583. http://dx.doi.org/10.1016/0169-2070(89)90012-5
Makridakis, S., Anderson, A., Carbone, R., Fildes, R., Hibon, M., Lewandowski, R., Newton, J., Parzen, E. and Winkler, R. (1982) The Accuracy of Extrapolation (Time Series) Methods: Results of a Forecasting Competition. Journal of Forecasting, 1, 111-153. http://dx.doi.org/10.1002/for.3980010202
Winkler, R.L. (1989) Combining Forecasts: A Philosophical Basis and Some Current Issues. International Journal of Forecasting, 5, 605-609. http://dx.doi.org/10.1016/0169-2070(89)90018-6
Hendry, D.F. and Mizon, G.E. (1978) Serial Correlation as a Convenient Simplification, Not a Nuisance: A Commentary on a Study of the Demand for Money by the Bank of England. The Economic Journal, 88, 549-563. http://dx.doi.org/10.2307/2232053
Hendry, D.F. (1976) The Structure of Simultaneous Equations Estimators. Journal of Econometrics, 4, 551-588. http://dx.doi.org/10.1016/0304-4076(76)90017-8
Mizon, G.E. (1977) Model Selection Procedures. In: Artis, M.J. and Nobay, A.D., Eds., Studies in Modern Economic Analysis, Basil Blackwell, Oxford.
Pindyck, R.S. and Rubinfeld, D.L. (1976) Econometric Models and Economic Forecasts. McGraw-Hill, New York.
Durbin, J. and Watson, G.S. (1950) Testing for Serial Correlation in Least Squares Regression: I. Biometrika, 37, 409-428.
Durbin, J. (1970) Testing for Serial Correlation in Least-Squares Regression When Some of the Regressors Are Lagged Dependent Variables. Econometrica, 38, 410-421. http://dx.doi.org/10.2307/1909547
Osborn, D.R., et al. (1976) Maximum Likelihood Estimation of Moving Average Processes. Journal of Economic and Social Measurement, 5, 75-87.
Espasa, D. (1977) The Spectral Maximum Likelihood Estimation of Econometric Models with Stationary Errors. 3, Applied Statistics and Economics Series. Vanderhoeck and Ruprecht, Gottingen.
Hammersley, J.M. and Morton, K.W. (1956) A New Monte Carlo Technique: Antithetic Variates. Mathematical Proceedings of the Cambridge Philosophical Society, 52, 449-475. http://dx.doi.org/10.1017/S0305004100031455
Kleijnen, J.P.C. (1975) Antithetic Variates, Common Random Numbers and Optimal Computer Time Allocation in Simulations. Management Science, 21, 1176-1185. http://dx.doi.org/10.1287/mnsc.21.10.1176
Ridley, A.D. (1999) Optimal Antithetic Weights for Lognormal Time Series Forecasting. Computers & Operations Research, 26, 189-209. http://dx.doi.org/10.1016/s0305-0548(98)00058-6
Ridley, A.D. (1995) Combining Global Antithetic Forecasts. International Transactions in Operational Research, 4, 387-398. http://dx.doi.org/10.1111/j.1475-3995.1995.tb00030.x
Ridley, A.D. (1997) Optimal Weights for Combining Antithetic Forecasts. Computers & Industrial Engineering, 2, 371-381. http://dx.doi.org/10.1016/s0360-8352(96)00296-3
Ridley, A.D. and Ngnepieba, P. (2014) Antithetic Time Series Analysis and the CompanyX Data. Journal of the Royal Statistical Society: Series A, 177, 83-94. http://dx.doi.org/10.1111/j.1467-985x.2012.12001.x
Ridley, A.D., Ngnepieba, P. and Duke, D. (2013) Parameter Optimization for Combining Lognormal Antithetic Time Series. European Journal of Mathematical Sciences, 2, 235-245.
MATLAB (2008) Application Program Interface Reference, Version 8. The Math Works, Inc.
Hogg, R.V. and Ledolter, J. (2010) Applied Statistics for Engineers and Physical Scientists. 3rd Edition, Prentice Hall, Upper Saddle River, 174.
Box, G.E.P. and Cox, D.R. (1964) An Analysis of Transformations. Journal of the Royal Statistical Society: Series B, 26, 211-252.
Abramowitz, M. and Stegun, I.A. (1964) Handbook of Mathematical Functions. Dover Publications, New York, 260 p.
Bernardo, J.M., et al. (1976) Algorithm AS 103: Psi (Digamma) Function. Journal of the Royal Statistical Society: Series C (Applied Statistics), 25, 315-317.
Fuller, W.A. (1996) Introduction to Statistical Time Series. Wiley, New York.