Using the Markov Chain for the Generation of Monthly Rainfall Series in a Semi-Arid Zone

Numerous methodologies have been developed in the literature for generating rainfall. However, in semi-arid areas, where rainfall is markedly irregular, the applicability of these models remains an open question. The objective of this article is to propose a method for developing a stochastic generator of monthly rainfall series. The present work models the occurrence and the quantity of rain separately. The occurrence is treated in two stages. The first stage applies a Markov chain to the occurrence of annual states (dry, average and wet). The second stage uses the monthly ranks. The amount of rain is calculated from the historical series according to the monthly rank and the annual state. This method is applied to rainfall data recorded at five rainfall stations in the semi-arid region of central Tunisia. The usual statistical tests applied to the generated series have shown the validity of this method.


Introduction
The geographical location of Tunisia plays an important role in the availability of water resources. As a matter of fact, 70% of its surface is exposed to an arid or semi-arid climate. The country is characterized by a limited resource that is irregularly distributed in space and time. This represents a major challenge for managers, which explains the state's orientation towards the mobilization of surface water and the development of management plans for the immobilization and preservation (Louati et al., 1998 [1], Sakiss et al., 1994 [2], Lacombe, 2007 [3], Kingumbi et al., 2009 [4]).
The lack of long series of rainfall observations considerably limits the estimation of water supplies. It has been shown in the literature that generating data sets from short time series can lead to errors. Therefore, the most appropriate approach is to generate rainfall series in conjunction with a rainfall-runoff model (Chandler, 2003 [5], Chandler et al., 2005 [6], Selvalingam and Miura, 1978 [7], Rodriguez-Iturbe et al., 1987 [8], Kim and Olivera, 2010 [9]).
The ARIMA models have been discussed in detail by Salas et al. (1980) [20], Fortin et al. (1994) [21], Unal et al. (2004) [16] and Machiwal and Jha (2012) [19]. This type of model requires Gaussian series. As for global climate models, climate data are generated on large meshes. The shift to a finer mesh is made through spatial disaggregation. This method does not take into account the possible variability within the large mesh.
In the stochastic approach, rain is modeled by two processes, namely occurrence and amount (Sharma and Lall, 1999 [22]). The Markov chain has been widely used in the literature to model the occurrence of rain. Gabriel and Neumann (1962) [23] were the first to model daily rainfall using a Markov chain with two states: dry and wet. Selvalingam and Miura (1978) [7] compared the different Markov chain methods developed by several authors.
The monthly rainfall in semi-arid areas, particularly in Tunisia, is essentially characterized by the presence of zero values. In a study of the regionalization of monthly rainfall distribution laws in Tunisia, Merzougui and Slimani (2012) [24] showed that no statistical law could be fitted for the months of June, July and August. The annual rainfall is split unevenly and very randomly between the months.
The present article proposes a response to the difficulty of fitting a statistical law to the monthly rainfall series in the semi-arid region of Tunisia. This work is based on a new way of applying the Markov algorithm: generating ranks between 1 and 12 for each month, then generating the states of the years (dry, wet or average), and finally assigning monthly rain values to these ranks according to the state of the year, based on the historical observations.
The first part of this article describes the study area and the characteristics of rain at five stations of the Tunisian semi-arid zone. The second part is devoted to the mode of application of the Markov algorithm. The last part presents the results and discussion.

Characteristics of rain in the area of central Tunisia
The proposed methodology for generating rain first required a study of the rainfall time series. This step is essential for determining the variables that guide the choice of the generation approach. Indeed, studying the historical behavior of the data series makes it possible to adapt the generation method adequately. In Tunisia, several studies have assessed the variable "rain" at various time steps (Bargaoui, 1983 [25], Camus, 1985 [26], Sakiss et al., 1994 [2], Thabet and Thabet, 1995 [27], Zahar and Laborde et al., 1998 [28], Kingumbi et al., 2009 [4], Merzougui and Slimani, 2012 [24]).
A study of the succession of dry, humid and average scenarios was conducted on the annual rainfall series to look for a clear cyclicity. According to Sakiss et al. (1994) [2], a year is considered dry if the annual total is 40% below the annual average, very dry if it is 60% below the average, and humid if it is 40% above the average. Finally, a year is very humid if the annual total is 60% above the average.
In this article, only three scenarios were adopted. They are defined as follows: (1) The year is considered average if the annual rainfall lies between −40% and +40% of the inter-annual average; (2) The year is considered humid when the annual total exceeds the inter-annual average by more than 40%; (3) The year is considered dry if the annual total is more than 40% below the inter-annual average. The annual rainfall series recorded at the five stations studied in the present work are illustrated in Figure 1. These series show a very irregular and random succession of dry, humid and average scenarios. No clear cyclicity has emerged.
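The three-scenario classification can be illustrated with a minimal Python sketch. The function name, the state labels and the annual totals below are hypothetical and serve only to show the ±40% rule around the inter-annual average; they are not station data.

```python
from statistics import mean

def classify_years(annual_totals, threshold=0.40):
    """Label each year 'dry', 'average' or 'wet' relative to the inter-annual mean."""
    avg = mean(annual_totals)
    states = []
    for total in annual_totals:
        deviation = (total - avg) / avg  # relative departure from the mean
        if deviation > threshold:
            states.append("wet")
        elif deviation < -threshold:
            states.append("dry")
        else:
            states.append("average")
    return states

# Hypothetical annual totals in mm (illustrative values only)
totals = [300.0, 150.0, 520.0, 310.0, 290.0, 230.0]
print(classify_years(totals))
```

The threshold is kept as a parameter so that the five-class scheme of Sakiss et al. (1994) could be approximated by calling the same rule with 0.60.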

Data Used
The rain generator developed in this article has been applied to five rainfall stations in central Tunisia (Figure 2). The characteristics of the monthly rainfall data recorded at these stations are summarized in Table 1. The months of October and September show the highest averages. The months of June, July and August have the lowest averages for all stations. The coefficients of variation are high for both the most and the least rainy months, illustrating the strong irregularity of rainfall in the study area.
To characterize the distribution of annual rainfall among the months, we ranked the months of each year by the amount of monthly rainfall recorded. The rank of each month takes a value from 1 to 12. Table 2 shows the frequencies of appearance of these ranks for each month, and hence the contribution of monthly rainfall to the annual precipitation. The tables show that the monthly rainfall distribution is quite irregular. Although October is on average the rainiest month, at the Sidi Saad station during the period 1956-2010 it held the first rank in only 13% of cases, the third in 22%, the ninth in 11% and the last in 2%. Monthly values reach their maximum everywhere during the months from September to May. The least rainy month may be September just as it may be any other month. However, July has the highest frequency as the least rainy month. The month of October, which has the highest inter-annual monthly average, is in rank 12 7% of the time and in the first rank only 17% of the time. This finding illustrates the irregularity of rain and corroborates the results of the work already mentioned.
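The ranking of months and the resulting frequency table can be sketched as follows. This is an illustration under stated assumptions: rank 1 is taken to be the rainiest month of the year, ties are broken by calendar order, index 0 stands for January, and the two years of monthly totals are hypothetical.

```python
def rank_months(monthly_rain):
    """Return the rank (1 = rainiest, 12 = least rainy) of each of the 12 months.

    Ties are broken by calendar order, which is an assumption of this sketch.
    """
    order = sorted(range(12), key=lambda m: -monthly_rain[m])
    ranks = [0] * 12
    for position, month in enumerate(order, start=1):
        ranks[month] = position
    return ranks

def rank_frequencies(years):
    """freq[m][r - 1] = fraction of years in which month m held rank r."""
    counts = [[0] * 12 for _ in range(12)]
    for monthly_rain in years:
        for month, rank in enumerate(rank_months(monthly_rain)):
            counts[month][rank - 1] += 1
    n = len(years)
    return [[c / n for c in row] for row in counts]

# Two hypothetical years of monthly totals in mm, January to December
year_a = [30, 25, 20, 15, 10, 2, 0, 1, 40, 50, 35, 28]
year_b = [12, 40, 22, 18, 5, 0, 0, 3, 60, 20, 45, 31]
freq = rank_frequencies([year_a, year_b])
print(rank_months(year_a))
print(freq[9])  # rank frequencies for October (index 9)
```

Each row of `freq` sums to 1 because every month receives exactly one rank per year.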

Development of a Rain Generator
The approaches available in the literature for generating monthly rain carry constraints and assumptions that limit their use. These models were developed for humid regions whose characteristics differ from those of arid and semi-arid areas. Given the marked irregularity of the rainfall, the Markov model seems the most appropriate in this study area for generating monthly rainfall data based on the ranks. A Markov model is a stochastic model in which the evolution of future states depends only on the present state and not on the past states. A Markov chain is a Markov process in discrete time that models the evolution over time of a discrete quantity $X$ that can take a finite number of states $X_1, X_2, \ldots, X_n$ and passes from state $X_i$ at instant $t$ to state $X_j$ at the next instant $t + 1$ with a conditional probability $P_{ij}$ (Bargaoui, 1983 [25]).
The method proposed in this article is based on three steps:
1) The first step is to generate annual states (dry, wet and average) by using the Markov chain. The corresponding transition matrix is determined from the historical frequency of passage from each state to another.
2) The second step is to generate monthly ranks ranging from 1 to 12 for each month. The transition matrix is built from the historical frequency of appearance of each rank for each month.
3) Depending on the month, the generated rank and the generated state of the corresponding year, the amount of rain is assigned as the historical average for that rank, month and state.

Generating annual states:
In the Markov chain, rain occurrence is modeled by states, which can be dry and wet or a larger number of states depending on the amount of rain. Selvalingam and Miura (1978) [7] and Allen and Haan (1975) [29] were the first to consider several states of occurrence depending on the amount of rain, and not just the two conditions dry and wet.
The present work considers three annual states: State 0 (wet year), State 1 (average year) and State 2 (dry year). The study of the historical series makes it possible to describe the appearance of these dry, wet and average scenarios.
The transitions from one state to another define the transition matrix. It is formed by the transition probabilities $P_{ij}$, the probability of moving to state $j$ given that the current state is $i$, calculated by the following equation:

$$P_{ij} = \frac{N_{ij}}{N_i}$$

where $N_{ij}$ is the number of transitions from state $i$ to state $j$ and $N_i$ is the total number of transitions out of state $i$ to any state. Thus, we have 9 situations, described in Table 3.
The station "Haffouz" was chosen here as an example to illustrate the process of generating states. Table 4 gives the number of occurrences of each state during the historical period from 1968 to 2011. The transition matrix is shown in Table 5. A 100-year series of annual states (dry, wet and average) was then generated, based on the transition matrix and the Markov model. Table 6 shows only 15 years of generation as an example.
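The estimation of the transition probabilities from counts of observed transitions can be sketched in Python as below. The function name and the observed sequence are hypothetical; states are coded 0 (wet), 1 (average) and 2 (dry) as in the text, and a state that is never left in the record is given a row of zeros, which is an assumption of this sketch.

```python
def transition_matrix(states, n_states=3):
    """Estimate P[i][j] = N_ij / N_i from an observed sequence of state indices."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)  # N_i: number of transitions leaving state i
        # A state never left in the record gets a row of zeros (assumption)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Hypothetical sequence of annual states: 0 = wet, 1 = average, 2 = dry
observed = [1, 1, 2, 1, 0, 1, 1, 2, 2, 1]
P = transition_matrix(observed)
for row in P:
    print(row)
```

Every row with at least one observed transition sums to 1, which is the property required of a transition matrix.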
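Generating a sequence of annual states from a transition matrix can be sketched as follows. The matrix values are hypothetical and are not those of Table 5; the function name, the seed and the choice of initial state are likewise assumptions made for illustration.

```python
import random

def simulate_states(matrix, initial_state, n_years, seed=None):
    """Draw n_years successive states from a first-order Markov chain."""
    rng = random.Random(seed)
    states = []
    current = initial_state
    for _ in range(n_years):
        # The next state depends only on the current one (Markov property)
        current = rng.choices(range(len(matrix)), weights=matrix[current])[0]
        states.append(current)
    return states

# Hypothetical transition matrix (rows sum to 1): 0 = wet, 1 = average, 2 = dry
P = [
    [0.10, 0.70, 0.20],
    [0.15, 0.65, 0.20],
    [0.10, 0.60, 0.30],
]
series = simulate_states(P, initial_state=1, n_years=100, seed=42)
print(series[:15])  # first 15 generated years
```

Fixing the seed makes a generated series reproducible, which is convenient when comparing generated and observed statistics.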

Generation of ranks for each month:
After the months are ranked from 1 to 12 according to the contribution of each month's rainfall to the annual total, the table of the frequency of appearance of every rank for each month from January to December forms an array of size 12 × 12. Within this array, the sum of the elements for each rank is equal to 1. We therefore assumed that this table is the transition matrix used to generate monthly ranks for 100 years. For the "Haffouz" station, Table 7 shows the transition matrix of the monthly ranks. The ranks of the months of the year 2011 are taken as the initial condition for the generation of 100 years of ranks. As an example, Table 8 shows the ranks generated for only 15 years.
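The text does not spell out how the frequency table acts as a transition matrix or how the 2011 ranks enter the draw. The sketch below therefore takes the simplest possible reading, stated here as an assumption: the rank of each month is drawn independently from that month's frequencies. It uses no initial condition and does not force the twelve ranks of a year to be distinct, so it illustrates sampling from a frequency table and is not necessarily the authors' procedure. The uniform table is hypothetical.

```python
import random

def generate_ranks(freq, n_years, seed=None):
    """freq[m][r - 1]: probability that month m takes rank r.

    Returns n_years lists of 12 ranks, drawn independently month by month
    (an assumption of this sketch).
    """
    rng = random.Random(seed)
    rank_values = list(range(1, 13))
    return [
        [rng.choices(rank_values, weights=freq[m])[0] for m in range(12)]
        for _ in range(n_years)
    ]

# Hypothetical table in which every rank is equally likely for every month
uniform = [[1 / 12] * 12 for _ in range(12)]
generated = generate_ranks(uniform, n_years=100, seed=1)
print(generated[0])  # ranks generated for the first year
```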

Quantitative assignment of rain:
After generating the ranks for each month and the annual state (dry, humid, average) for the future 100-year series, we sought a method to estimate the corresponding quantities of rain. Methods based on generating rainfall from statistical laws cannot be applied here, since no law can be fitted for the months of June, July and August (Merzougui and Slimani, 2012 [24]). The method used herein is to calculate, from the observed series, the average monthly rainfall according to the rank and the state of the corresponding year, as summarized in Table 9. The average rainfall amount is then assigned to each month according to the generated rank and the corresponding generated annual state.
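The assignment of rainfall amounts can be sketched as a lookup table of historical averages keyed by annual state, month and rank. The record layout, function names and values below are hypothetical; in particular, the text does not say how combinations never observed in the history are handled, so the sketch simply returns None for them.

```python
from collections import defaultdict
from statistics import mean

def build_lookup(records):
    """records: iterable of (state, month, rank, rain_mm) from the observed series.

    Returns {(state, month, rank): mean observed rainfall}.
    """
    groups = defaultdict(list)
    for state, month, rank, rain in records:
        groups[(state, month, rank)].append(rain)
    return {key: mean(values) for key, values in groups.items()}

def assign_rain(lookup, state, ranks):
    """Map the 12 generated ranks of one year to rainfall amounts.

    Combinations absent from the history yield None (left open in this sketch).
    """
    return [lookup.get((state, month, rank)) for month, rank in enumerate(ranks)]

# Hypothetical observed records: (state, month index, rank, rainfall in mm)
records = [
    ("dry", 0, 1, 20.0),
    ("dry", 0, 1, 30.0),
    ("dry", 1, 2, 10.0),
]
lookup = build_lookup(records)
print(assign_rain(lookup, "dry", [1, 2] + [12] * 10))
```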

Results and Discussion
The series generated by the methodology outlined in the preceding section are shown in Figure 3 for the five stations.
The principle of stochastic simulation models is to generate random variables that respect the statistical structure of the process to be reproduced (Miquel, 2006 [17]). A monthly rainfall generator aims to preserve the monthly statistical characteristics. When these data are aggregated, the characteristics of the observed annual series are not necessarily preserved. The reliability of the model is judged by its ability to reproduce statistical characteristics comparable to those of the historical series (Selvalingam and Miura, 1978 [7], Salas et al., 1980 [20], Unal et al., 2004 [16], Kim and Olivera, 2010 [9], Kim and Olivera, 2012 [30]). These characteristics are essentially the means and the coefficients of variation.
The characteristics of the generated and observed rainfall are comparable for all months except September (Figure 4). This can be explained by the high variability of the rainfall recorded in that month, in which the strong floods on record occurred. Checking the homogeneity of the time series is another test of the reliability of the model and of the representativeness of the generated series. The method used herein is a graphical test based on simple cumulative monthly rainfall totals.
Figure 5 shows that no break or change in slope appears on the line of cumulative monthly totals for any station, hence the homogeneity of the generated series. Figure 6 shows the autocorrelation coefficients at lags of 1 to 7 for the generated and observed series of the five stations. The coefficients of the two series remain between −1 and 1.
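The comparison of monthly means and coefficients of variation can be sketched as follows. The function name and the two-year data set are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator was applied.

```python
from statistics import mean, pstdev

def monthly_stats(years):
    """years: list of 12-value lists. Returns (means, cvs), one value per month."""
    means, cvs = [], []
    for m in range(12):
        values = [year[m] for year in years]
        mu = mean(values)
        means.append(mu)
        # Coefficient of variation = standard deviation / mean (undefined if mean is 0)
        cvs.append(pstdev(values) / mu if mu else float("nan"))
    return means, cvs

# Hypothetical series: two years with constant monthly totals of 10 mm and 30 mm
years = [[10.0] * 12, [30.0] * 12]
means, cvs = monthly_stats(years)
print(means[0], cvs[0])
```

Applying the same function to the observed and to the generated series gives the two sets of statistics to be compared month by month.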
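The autocorrelation coefficients at lags 1 to 7 can be computed with the usual sample estimator, sketched below. The text does not state which estimator was used, so this choice is an assumption, and the monthly values are hypothetical.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation coefficient of a series at the given lag."""
    n = len(series)
    mu = sum(series) / n
    denom = sum((x - mu) ** 2 for x in series)
    num = sum((series[t] - mu) * (series[t + lag] - mu) for t in range(n - lag))
    return num / denom

# Hypothetical monthly rainfall values in mm
series = [12.0, 0.0, 35.5, 4.2, 0.0, 18.3, 7.1, 0.0, 22.4, 3.0, 41.0, 9.6]
for lag in range(1, 8):
    print(lag, round(autocorrelation(series, lag), 3))
```

With this estimator the coefficients always lie between −1 and 1, so what matters when comparing observed and generated series is how close the two sets of coefficients are at each lag.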

Conclusion
This study has shown the constraints that limit the application of conventional rain generation methods in a semi-arid environment such as Tunisia. The model developed on Tunisian rainfall stations generates the occurrence and the amount separately. The occurrence was generated through annual states and monthly ranks by using the Markov chain. The amount of rain was calculated from the observed series for each month depending on the state and the rank. The results show that the model preserves the statistical characteristics of every month except September, which shows high variability due to the heavy floods that usually occur in this month. A separate study of the month of September is the perspective opened by the present work.

Table 1. Statistical characteristics of rain recorded in the five studied rainfall stations.

Table 3. Transition matrix of the states.

Table 4. Number of occurrences of states.

Table 7. Transition matrix for the ranks of the Haffouz station.

Table 8. Monthly rankings generated for a period of 15 years for Haffouz station.

Table 9. Calculation of the observed monthly rainfall.