Co-Integration Models for Koyna and Warna Reservoirs, India

The Koyna region is seismically active and has many observed time series, including seismicity, reservoir water levels, and water levels in many bore wells. Because these parameters are interlinked, one series can be used to predict the others. If the series were stationary, correlation analysis could be used. However, most of these time series are nonstationary. In this case the co-integration method, which is borrowed from econometrics, can be used, and forecasting becomes possible. We have applied this methodology to the reservoir water level time series of this region and find them to be co-integrated. Therefore, the water levels of one reservoir are forecast from those of the other, since the two series never drift too far apart. The results suggest that joint modelling of both data sets based on the underlying physics can be useful for understanding predictability issues in reservoir-induced seismicity.


Introduction
Fluid pressure and transport play crucial roles in controlling seismicity in the near subsurface, and there are two ways of assessing these roles. In the theoretical approach, a model of the region is developed and validated against the available data. This model is then used to understand the process and to forecast future events. Such a methodology requires comprehensive knowledge of the distribution of physical properties and forces, which is not available because of the high cost of acquiring the relevant data. In the other, empirical, approach, the responses of the region are studied to build statistical models for forecasting the onset of future events, in which society has an immediate interest. For this approach we have two data sets, viz., the reservoir water level time series and the seismicity time series, and it is of interest to know whether any relationship exists between them. Hence we have an opportunity to find a long-term relationship between them, one that should persist even after short-term departures.
The seismicity in the month of September is 90% correlated with the rapid increase in water levels in June/July due to rain [1]. The cross-correlation between the seismicity and the water levels in the Koyna reservoir has been studied, and a time lag of 223 days was reported to correspond to the maximum correlation coefficient [2]. Stationary time series methods such as correlation analysis are often applied to such data, and false causal relationships are inferred. In a stationary time series the mean and variance are constant, which is not observed in the seismicity and reservoir water level data. The near subsurface is constantly changing owing to a variety of processes (both internal and external), which makes these time series nonstationary. For these reasons the cointegration method is a suitable tool: it seeks a linear combination of two nonstationary time series that is itself a stationary time series. The cointegration method has several interesting applications to nonstationary time series. For example, an error correction model based on cointegration theory has been applied to forecast the annual runoff at the Basishan and Fengman stations in the Songhua river basin [3]. We apply this methodology to find the relationship between the seismicity and reservoir water level time series.
The Koyna region has an impounded reservoir and continues to be seismically active; seismic activity became noticeable after the start of impoundment in 1961 [4] [5]. The area experienced a major shock of magnitude 6.5 in 1967. Since then, no other shock of such magnitude has taken place. Nevertheless, shocks of magnitude 5 and above occurred in 1967, 1968, 1973, 1993, 1994, 2000, 2005 and 2009. Figure 1 shows the seismicity time series with a minimum shock magnitude of 2. The Gutenberg-Richter relationship, log N = a - bM (N is the number of earthquakes and M the magnitude), has been fitted to these data [6], and the b values are observed to range from 0.9 to 1.2.
Figure 2(a) shows the Koyna water level data since the impounding of the reservoir in 1967. The water level rises after rainfall and falls as water is subsequently released. The maximum recorded water level is about 660 m and the minimum is 610 m, and the reservoir level follows an annual cycle. Figure 2(b) shows the Warna reservoir water level data from 1985, after the impounding of the reservoir. Various causes have been proposed for these earthquakes. The water load of the reservoir is one such mechanism. It has also been argued that water seepage from the reservoir through the porous basement and faults raises the pore pressure in the focal region and induces faulting. The pore pressure increase in the focal region depends on the hydraulic properties.
Therefore, understanding the seismicity in the region requires establishing a causative relationship between the water level variations and the seismicity data. As both time series are nonstationary, simple correlation coefficient analysis gives a false relationship. It is therefore desirable to use the co-integration method to find a causative relationship, and this is done in what follows.
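The claim that correlation analysis of nonstationary series can suggest false relationships can be illustrated with a small simulation. The sketch below is illustrative only: it uses synthetic data rather than the Koyna records, and the function names, sample sizes and the 0.5 threshold are assumptions. It generates pairs of series that are independent by construction and counts how often their correlation coefficient is nevertheless large.

```python
import numpy as np

def white_noise(rng, n):
    # Stationary series: constant mean and variance
    return rng.standard_normal(n)

def random_walk(rng, n):
    # Nonstationary series: cumulative sum of shocks (stochastic trend)
    return np.cumsum(rng.standard_normal(n))

def fraction_strongly_correlated(make_series, n_pairs=200, n_obs=500,
                                 threshold=0.5, seed=0):
    """Fraction of independent series pairs with |Pearson r| above threshold."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_pairs):
        x = make_series(rng, n_obs)
        y = make_series(rng, n_obs)  # generated independently of x
        r = np.corrcoef(x, y)[0, 1]
        hits += int(abs(r) > threshold)
    return hits / n_pairs

frac_noise = fraction_strongly_correlated(white_noise)
frac_walk = fraction_strongly_correlated(random_walk)
print(f"white noise pairs with |r| > 0.5: {frac_noise:.2f}")
print(f"random walk pairs with |r| > 0.5: {frac_walk:.2f}")
```

Because the paired series are unrelated, any large correlation found between the random walks is spurious. This is the situation that cointegration analysis is designed to guard against.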

Methodology
Non-stationary time series contain trends. The trends can be deterministic or stochastic. For instance, a random walk model has a stochastic trend. First we need to determine whether a stochastic trend is present in the time series. A widely used test for this is the Augmented Dickey-Fuller (ADF) test. In this test the differenced time series, $\Delta y_t$, is represented by an autoregressive model of order p, denoted AR(p):

$$\Delta y_t = \alpha + \beta t + \gamma y_{t-1} + \sum_{i=1}^{p} \delta_i \Delta y_{t-i} + \varepsilon_t \qquad (1)$$

Here $\alpha$, $\beta$, $\delta_i$ and $\gamma$ are constants, $\varepsilon_t$ is the random error term, and p is the number of lags. The null hypothesis tested is that the coefficient of the lagged term, $\gamma$, is zero. If this null hypothesis cannot be rejected, the time series is nonstationary. First we need to find the lag order p. This is done with the help of the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC). After the order is determined, Equation (1) is fitted to the time series by the ordinary least squares (OLS) method to obtain the constants $\alpha$, $\beta$, $\gamma$, $\delta_i$ and the variance of $\varepsilon_t$. From the estimate and standard error of $\gamma$, the tau statistic is defined by

$$\tau = \frac{\hat{\gamma}}{\mathrm{SE}(\hat{\gamma})}$$

If this tau statistic is greater than the critical value, the null hypothesis cannot be rejected and the time series has a stochastic trend.
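As an illustration of the ADF procedure described above, the following Python sketch fits the test regression by OLS and returns the tau statistic. It is not the EViews implementation used in this study. The function name `adf_tau`, the synthetic series, and the approximate 5% critical value of -3.41 (the commonly tabulated large-sample value for a regression with constant and trend) are assumptions made for illustration.

```python
import numpy as np

def adf_tau(y, p):
    """Tau statistic for gamma in the regression
    dy_t = alpha + beta*t + gamma*y_{t-1} + sum_i delta_i*dy_{t-i} + eps_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                       # dy[k] = y[k+1] - y[k]
    n = len(dy) - p                       # usable observations
    target = dy[p:]                       # left-hand side dy_t
    cols = [np.ones(n),                   # constant (alpha)
            np.arange(n, dtype=float),    # time trend (beta)
            y[p:-1]]                      # lagged level y_{t-1} (gamma)
    for i in range(1, p + 1):             # lagged differences (delta_i)
        cols.append(dy[p - i:len(dy) - i])
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    sigma2 = resid @ resid / (n - X.shape[1])   # error variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)       # OLS coefficient covariance
    return coef[2] / np.sqrt(cov[2, 2])         # gamma_hat / SE(gamma_hat)

rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(500))   # series with a stochastic trend
noise = rng.standard_normal(500)             # stationary series

tau_walk = adf_tau(walk, p=2)
tau_noise = adf_tau(noise, p=2)
CRITICAL_5PCT = -3.41                        # approximate value, assumed here
print("random walk tau:", round(tau_walk, 2))
print("white noise tau:", round(tau_noise, 2))
print("reject unit root for white noise:", tau_noise < CRITICAL_5PCT)
```

A tau value above the critical value (that is, less negative) means the unit-root hypothesis cannot be rejected. In practice a tested routine from an established statistics package would be preferable to this hand-written regression.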
After establishing the nonstationary nature of the time series, we proceed to establish a relation between the two series, the seismicity time series (reservoir water level changes), $y_t$, and the water level time series, $x_t$, as

$$y_t = a_0 + a_1 x_t + u_t$$

Here $u_t$ should be a stationary time series if the two series are cointegrated. The values of the $a_i$ are obtained by ordinary least squares (OLS), and the residual time series $u_t$ is computed from these coefficients. This residual series is fitted with the following model:

$$\Delta u_t = \gamma_1 u_{t-1} + \sum_{i=1}^{p} \delta_i \Delta u_{t-i} + \varepsilon_t$$

The ADF test is performed to test the null hypothesis that the coefficient $\gamma_1$ is zero. If the relevant tau statistic is greater than the critical value, the presence of a unit root cannot be rejected and the two time series are not cointegrated; otherwise they are cointegrated. This method is called the Engle-Granger two-step method. There are more advanced methods for multivariate problems. We use the freely available software EViews to implement the cointegration method.
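The two-step procedure described above can be sketched in Python as follows. This is a minimal illustration on synthetic data, not the EViews workflow used in the study. The helper names, the simulated series (a random walk around 630 and a second series built from it), the single lagged difference in the residual regression, and the approximate 5% Engle-Granger critical value of -3.34 for two variables are all assumptions.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and residuals."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

def residual_tau(u, p=1):
    """Tau for gamma_1 in du_t = gamma_1*u_{t-1} + sum_i delta_i*du_{t-i} + eps_t."""
    du = np.diff(u)
    target = du[p:]
    cols = [u[p:-1]] + [du[p - i:len(du) - i] for i in range(1, p + 1)]
    X = np.column_stack(cols)
    coef, resid = ols(X, target)
    sigma2 = resid @ resid / (len(target) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[0] / np.sqrt(cov[0, 0])

# Synthetic stand-ins for two level series (NOT the reservoir data)
rng = np.random.default_rng(1)
n = 600
x = 630 + np.cumsum(rng.standard_normal(n))    # nonstationary series
y = 5.0 + 0.9 * x + rng.standard_normal(n)     # shares the trend of x

# Step 1: cointegrating regression y_t = a0 + a1*x_t + u_t
A = np.column_stack([np.ones(n), x])
(a0, a1), u = ols(A, y)

# Step 2: unit-root test on the residuals u_t
tau = residual_tau(u, p=1)
CRITICAL_5PCT = -3.34                          # approximate value, assumed here
print("a1 estimate:", round(a1, 3))
print("residual tau:", round(tau, 2))
print("cointegrated at 5%:", tau < CRITICAL_5PCT)
```

The residual test uses critical values that differ from those of the ordinary ADF test, because the residuals come from an estimated regression. That is why a separate threshold is assumed here.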

The Akaike Information Criterion
The Akaike Information Criterion (AIC) is an approach to selecting a model from a set of models. The chosen model is the one that minimizes the Kullback-Leibler distance between the model and the truth. It is based on information theory, but a heuristic way to think about it is as a criterion that seeks a model that fits the truth well with few parameters. It is defined as:

$$\mathrm{AIC} = -2 \ln(\text{likelihood}) + 2K$$

where likelihood is the probability of the data given a model and K is the number of free parameters in the model. AIC scores are often shown as ΔAIC scores, the difference between each model and the best model (smallest AIC), so the best model has a ΔAIC of zero.
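To make the use of AIC for choosing a lag order concrete, the following sketch computes AIC and ΔAIC for autoregressive models of increasing order fitted by OLS. It assumes Gaussian errors, so that the maximized log-likelihood can be written in terms of the residual variance. The simulated AR(2) series, the function name `ar_aic`, and the maximum order of 6 are illustrative assumptions, not values from this study.

```python
import numpy as np

def ar_aic(y, p, max_p):
    """AIC of an AR(p) model fitted by OLS on a sample common to all orders."""
    y = np.asarray(y, dtype=float)
    target = y[max_p:]                            # same sample for every p
    cols = [np.ones(len(target))]                 # intercept
    cols += [y[max_p - i:len(y) - i] for i in range(1, p + 1)]  # lags 1..p
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    n = len(target)
    sigma2 = resid @ resid / n                    # ML estimate of error variance
    log_lik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)  # Gaussian log-likelihood
    k = p + 2                                     # intercept, p lags, variance
    return -2 * log_lik + 2 * k

# Simulate a stationary AR(2) series (synthetic, for illustration only)
rng = np.random.default_rng(2)
n = 2000
e = rng.standard_normal(n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + e[t]

max_p = 6
aic = {p: ar_aic(y, p, max_p) for p in range(max_p + 1)}
best_p = min(aic, key=aic.get)                    # order with the smallest AIC
delta_aic = {p: aic[p] - aic[best_p] for p in aic}
for p in aic:
    print(p, round(aic[p], 1), round(delta_aic[p], 1))
```

Fitting every candidate order on the same observations keeps the likelihoods comparable. AIC can select an order larger than the true one, so the ΔAIC values of neighbouring orders are worth inspecting rather than relying on the minimum alone.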

Results and Discussions
The reservoir water levels at Koyna and Warna for the years 1996-2008 are used to apply the cointegration theory. The lag parameters obtained using the Akaike information criterion, for the individual reservoir water levels as well as for the combination of both reservoirs, are shown in Table 1 and Table 2. For both reservoirs, the trend of the AIC values changes at lag 2.
Thus the fitted model is AR(2). Next we test the stationarity of all the data sets. To do so, we apply the ADF test to both time series with the order p; the results are shown in Table 3.
The tau statistics are recorded in Table 3. An ordinary least squares regression relating the variables X and Y gives the estimated coefficients 1.012132 and 40.17355, where Y_t and X_t are the Warna and Koyna reservoir water levels, respectively. A unit root test on the residuals of this regression establishes that the variables X and Y are cointegrated. The vector error correction model (VECM) is then estimated and used to calculate the water level in the Warna reservoir at a given time t. The water levels are calculated for the years 1998 to 2006 and plotted in Figure 3. From Figure 3 it is clear that the model values match the observed (original) values well. The model is further used to calculate the water levels in the Warna reservoir at different times from January 2005 to December 2006, and the values are given in Table 4. The results show that the maximum relative error of the forecast values is 25%. The numerical results also show that for a forecast one month after the period of observation the error is merely 3%, but for forecasts over a few months the error is larger. These results indicate that our model can forecast values within the tolerance limit for up to six points (months) only.

Conclusion
An error correction model (ECM) based on co-integration theory is established to analyse the daily water level changes in the Koyna and Warna reservoirs of this seismically active region. The nonstationary time series in the model are cointegrated, and regression analysis can be used to forecast the series.

Figure 3. Forecast of water levels in the Warna reservoir (red line is the observed water level, blue line is the water level forecast).

Table 1. Lagging order for both Koyna-Warna reservoir water levels.

Table 2. Calculated lag order 'p' for both Koyna-Warna reservoir water levels.

Table 3. The unit root test results of Koyna and Warna reservoirs.

Table 4. Calculated values of reservoir water levels.