Simulation of a Daily Precipitation Time Series Using a Stochastic Model with Filtering

After we modified raw data for anomalies, we conducted spectral analysis using the data. In the frequency, the spectrum is best described by a decaying exponential function. For this reason, stochastic models characterized by a spectrum attenuated according to a power law cannot be used to model precipitation anomaly. We introduced a new model, the e-model, which properly reproduces the spectrum of the precipitation anomaly. After using the data to infer the parameter values of the e-model, we used the e-model to generate synthetic daily precipitation time series. Comparison with recorded data shows a good agreement. This e-model resembles fractional Brown motion (fBm)/fractional Lévy motion (fLm), especially the spectral method. That is, we transform white noise Xt to the precipitation daily time series. Our analyses show that the frequency of extreme precipitation events is best described by a Lévy law and cannot be accounted with a Gaussian distribution.


Introduction
Just as turbulence and clouds have been described using (random) fractals, geoscientific fields such as topographical fields, temporal or spatial rainfall fields, and earthquake-slip fields are often modeled using fractals (Gagnon, et al. [1]; Lavallée and Archuleta [2]; Lavallée [3]; Lovejoy and Schertzer [4]; Schertzer and Lovejoy [5]; Tchiguirinskaia, et al. [6]).If we specifically examine simulations of temporal or spatial rainfall field as one example, then two approaches might be used (Over and Gupta [7]): stochastic approaches, or physical or dynamical approaches.Regarding the former, many stochastic models of temporal and spatial rainfall fields have been developed.According to Over and Gupta [7], a pioneering research effort for modeling of this type was that of LeCam [8].Many stochastic models have been proposed, examples of which are the AR model (Brockwell and Davis [9]), ARMA model (Box and Jenkins [10]), NSRP model (Cowperhwait [11]), and the WGR model (Waymire,et al. [12]).Regarding precipitation models before 1990, which are not based on the fractal theory, see Valdes [13].In recent twenty years, several mono-fractal or multifractal models have been used to model the scaling property of rainfall fields (Over and Gupta [7]; Lovejoy and Schertzer [4]).One of these multifractal models, the discrete cascade model is used to model rain data in Over and Gupta [7].An alternative multifractal model, the continuous cascade model (e.g.Lovejoy and Schertzer [14]) is used to model rain fields in Wilson et al. [15].
As indicated above, geoscientific fields are often modeled by more physical or dynamical methods.Regarding precipitation, rainfall fields are often generated using meso-scale meteorological models or a global climate model.An important shortcoming of physical or dynamic approaches is their nature: they are time-consuming and resource-consuming.Simulation using stochastic approaches is based on random number generation.A simulation result by the latter is not deterministic but stochastic.Once one can generate random numbers, however, the random numbers must be transformed to the required data.
As described herein, we specifically examine the simulation of daily precipitation time series because simulation using daily precipitation is extremely important for flood design.For generating daily precipitation time series at specific observation station, we believe that the stochastic approach is superior to physical and dynamical approaches from the viewpoint of the cost of resources including time.
Although stochastic models of various kinds exist, models based on fractal theory or scaling theory present great advantages because of their capability of simply describing phenomena.Fractals were originally conceptualized by Mandelbrot [16].Multifractals were proposed later (Frisch and Parisi [17]).Regarding the general theory of fractals and multifractals, see Feder [18] and Falconer [19].For instance, Gagnon et al. [2] described the evolution from the (mono-) fractal model to the multifractal model, and explained the latter's superiority in terms of the topographical field.Lovejoy, Schertzer and others (e.g.[14]) have clarified that some kind of (temporal or spatial or both) rainfall data can be modeled by multifractal model.Their multifractal model is characterized by the terms "continuous cascades", "universal model", and "the fractional integrated flux model (FIF)", and so on.Hereinafter, we simply refer to their simulation model as "FIF".Because the FIF model is generally defined by the three parameters of , C1, and H, it is necessary to estimate these three parameters before conducting simulations using the multifractal FIF.Usually, two steps exist for obtaining fractal parameters.The first is Fourier analyses.From results of these analyses, one can confirm whether or not the field is fractal field.To parameterize some multifractal models, it is possible to use the Double Trace Moment (DTM; Lavallée [20]) and Fourier analysis (it should be noted that the DTM has only be validated with discrete cascade model (personal communication, Daniel Lavallée, 2013)).Regarding mono-fractals, we will interpret a spectral method for fBm/ fLm, which is one example for obtaining parameters or the nature of a mono-fractal, in Section 3. Using these steps, we can confirm the fractality of the field and obtain relevant parameters.
We have outlined the flow of our research here.1) Construction of data: Raw data is strongly affected by seasonal and periodic changes.We modified the data and made our data for analyses using raw data.2) Confirmation of fractality: Fourier analyses were conducted.The power spectra were obtained.If  (angular frequency) versus E() is log-log linear, then the data field is mono-fractal or multifractal.
3) If the data field is fractal, we can obtain three parameters of the universal model (multifractal), or try to apply fBm or fLm (a mono-fractal model).Subsequently, we must ascertain which model is appropriate.4) If the data field is not fractal, then we must first seek the appropriate filter.Then we must construct a new model.

Raw Data
For precipitation observations, more than 150 manned observation stations (offices of the Japan Meteorological Agency) exist throughout Japan.Although their beginning times of operation mutually differ, only 51 manned stations have operated for more than 100 years.For this study, we used daily precipitation time series data (hereinafter, often designated simply as "daily precipitation") at these 51 manned stations during 1901-2011.At 10 stations, however, the data are not complete because of natural disasters and wars.In Figure 1, we portray these 10 stations using red (Naha), orange (Kure), and yellow (Miyako, Fukui, Yokohama, Tsuruga, Kofu, Hamamatsu, Kobe and Sakai) circles.Missing periods are, respectively 10, 2, and 1 year.Green circles show stations that have complete (111 yr) data.

Anomaly R
We analyze the time series (which contains 365 data = 1 yr) of daily precipitation R. The total time series includes data of 111 years at most observation stations.First, we carry out the Fourier transform and investigate the relation between  versus the power spectrum   E  .As we described earlier, the relation is log-log linear: might imagine, however, the relation does not show log-log linearity because of various factors.As described later, we obtained a relation (an exponential function) other than the log-log linearity using anomaly.However, in reality, we were unable to obtain a smooth function if we used R itself.Therefore, we conducted modification of the data as described to remove these factors in the following manner: 1) Assuming that the Tsu Observation Station has mn daily precipitation data, where m is the year and n is the Julian day number, then we can use a variable R(m,n).First, we calculate the average value of each day over a period of 111 years.
2) Secondly, we conducted smoothing (low-pass filtering) the R(n).We obtained five-day moving averages . We obtained 361 values for Tsu Observation Station.
3) For each year with index m, we calculated anomaly . So we have 111 time series available for analysis. In

Spectral Method for fBm/fLm
As we described in the previous chapter, first we carry out Fourier analyses to confirm the fractality of R.If the field of R has a fractal nature, then the relational Equation (2) holds between  (angular frequency) and E() In that equation,  is the scale exponent and , s is the discrete variable, N represents the number of data R (=361 in this study), and F s is the discrete Fourier transform.
The spectral method, discussed in Lavallée [3] can be used to generate Bm/fBm/fLm.See Saupe [21] for details and other methods.Lavallée's method is the following: In this equation, X t represents independent and identically distributed random variables (white noise).Y t shows the generated fBm/fLm data.If X t is generated using the Gauss law, then Y t is fBm.If X t is generated using the Lévy law, then Y t is fLm.Here, one can imagine an inverse procedure: for certain data Y t , one might want to ascertain whether Y t is fBm or fLm.
The following Equation ( 4) is an inverse equation of Equation (3) in which Particularly, we collect Y t from a certain geoscientific field.After estimating  using Equation (2), we can obtain random variables X t using Equation (4).Subsequently, we can confirm which probability density function (PDF) is appropriate for X t .We assume that the candidate of the PDF is a Gaussian or Lévy distribution in this study.The PDF of the Lévy distribution cannot be expressed explicitly.The characteristic function is the following (Robust Analysis Inc. [22]; Nolan [23]); is a scale parameter (positive) and  is a location parameter (an arbitrary real number).The Gauss distribution is a special case of the Lévy distribution and  = 2 for the Gauss distribution.Detailed explanations have been made by Nolan [23], Zolotarev [24] and Uchiaikin and Zolotarev [25].

Spectral Analysis
First, we conducted spectral analysis using R.In practice, R is R(m, n), and 3  m 363 and 1901  n  2011.Furthermore, N in Equations ( 3) is 361, and 111 E() were averaged.Figure 3 (lower) shows the relation between  and E() on a double logarithm chart.The relation does not show log-log linearity.Figure 3 shows the relation at the Tsu Observation Station.However, a similar behavior has been observed for all the other observation stations considered in this study.These results suggest that stochastic process characterized by a spectrum attenuation given by a power law-for instance, fBm, fLm and other multifractal models (e.g.FIF)-cannot properly model the average spectrum E() observed for the anomaly R considered in this study.
Although the fractal model is inapplicable to daily precipitation, the procedure explained by Lavallée [3] is applicable to the data.E() on a normal chart.Not log-log linearity, but a certain relation is apparent.The relation is investigated in the following section.

Exponential Filter
We cannot generate R using the fBm/fLm model.Therefore, we tried to derive a filter other than the filter for fractal model to simulate R.We strove to apply the following Equation ( 6) for regression analysis.First, "a" and "b" are parameters which we must estimate.
We conducted regression analyses for all 51 observation stations.The range of correlation coefficient of  versus log E() was −0.74 -−0.97 (−0.92 ± 0.04; mean ± standard deviation).Therefore, we conclude that Equation ( 6) is applicable as a regression curve of  versus E() of precipitation anomaly.Liu et al. [26] discussed a similar stochastic process but with a filter given by a von Karman function.

Probability Law
In the previous section, we derived a new filter instead of a log-log filter (Equation ( 6)).Therefore, Equation ( 4), which was discussed in the preceding section, is modified as shown below.
In that expression, R is used instead of Y t in Equation (4).Equation (7) shows that X t is computed using R.We tried to clarify X t 's nature, specifically, which of the Gauss law or Lévy law is appropriate for generating X t .We compared histograms of X t and PDF of the Gaussian and Lévy law.Parameters of the Gaussian law are estimated using the L-moment method (Hosking and Walllis [27]).Parameters of the Lévy law were estimated using three methods: maximum likelihood, quantile, and the empirical characteristic function.In practice, we used Nolan's software [22,23].Figure 4 (upper) presents a histogram of X t (green circles) and PDFs of the Gaussian law (blue solid line) and Lévy (dashed line) in Tsu.Regarding Figure 4, parameters of the Lévy law were calculated using the maximum likelihood function.According to Figure 4 (upper), the Lévy law is more appropriate than the Gaussian law.Furthermore, we focused on the PDF tail, as earlier studies have done: Lavallée [3].There is an additional justification to focus on the tail of the distribution.In Geophysics, extreme but less frequent events are the ones mainly responsible for significantly perturbing the system under consideration and the potential cause of much damage to nature and society.Properly quantifying the frequency of these extreme events is Figure 4 (bottom) is the same graph as that shown in the upper panel, but on double logarithm graph to emphasize the tail's nature.The lower panel indicates the superiority of Lévy's law more clearly than the upper panel: the lower panel clearly shows that the Lévy law provide a better description of the extreme events located in the tail of the PDF of X t .X t is clearly related to daily precipitation.We intend to emphasize the reproducibility of extreme daily precipitation.Therefore, it is concluded that the Lévy's law is more appropriate.
We can quantitatively estimate goodness-of-fit of the PDFs of Lévy's law.We calculated the correlation coefficient between the histogram and PDFs of Lévy's law for 51 observation stations.Ranges of coefficients of correlation of normal graph were 0.86 ± 0.08 (1.00 ± 0.02), 0.80 ± 0.07 (0.92 ± 0.04) and 0.86 ± 0.06 (1.01 ± 0.04) by the three methods described above (numbers in the parentheses are slopes of the regression lines).Sufficiently high correlation and slopes that are almost unity were shown for all 51 stations for three methods.Regarding tail characteristics (double logarithm graph), similar results were obtained.Furthermore, these analyses demonstrate that Lévy's law is quantitatively superior to the Gaussian law.

Simulation
In the previous section, we show that the Lévy law properly describes the distribution of X t -especially extreme values.In this section we discussed how to use the emodel and Lévy law to simulate daily precipitation.This e-model resembles fBm/fLm, especially the spectral method, but it is not characterized by spectrum attenuation given by a power law but the spectrum is best described by a decaying exponential function.

Lévy Random Number
An expression for synthetic precipitation anomaly can be obtained by modifying Equation (3) in the following way (Equation ( 8)): the power law filter is replaced by the exponential filter given in Equation (6).
Therein, the random number X t is generated using the Lévy law.Four parameters of the distribution for each observation station were obtained as explained in the preceding section.R' is a generated precipitation anomaly.We intend to reproduce R (observed anomaly) using Equation (8).We use the truncated Lévy law (Lavallée [3]) in practice: overly large or small values of X t , which are outside of a certain allowable range, are removed.The allowable range is found using actual precipitation data.

Proportionality Factor of Generated R
We conducted spectral analysis of the generated R'.Results show that the spectral E() of R' is much smaller than that of R.Here, it is noteworthy that an equality is "" in Equation (3) but "=" in Equation (8), and that R is an observed real value and that R' is a generated value which is expected to reproduce R.Theoretically, R' is expected to be proportional to R (in the statistical sense).In fact, the cause of disagreement is apparently differences of two "a" values.How- Then we assume the following equation.
In that equation, k is a proportional factor that stands as a correction coefficient of "a" to reproduce R.We obtained correction factor k for E() of R' to coincide with E() of R.The distribution of k obtained from 51 observations was 2.44 ± 0.43.After we estimated the correction factors "k", we conducted spectral analysis of k R well coincides with that of observed R.We conclude that we are able to simulate R using our e-model including the k-correction procedure.

Correction of Negative Values
For this study, we generated 300 sets of R'.Then, the simulated daily precipitation (R') is calculable as In this case, the simulated anomaly (R'') is estimated as R R R      .We conducted spectral analysis of R''.Then we compared the spectra of R'' and R from the viewpoint of the   E() relation.6) show that E() of R'' is almost identical to that of R for 51 observation stations: removing negative values from simulated time series does not affect the result.

Conclusions
We strove to apply a stochastic model to daily precipitation time series recorded at 51 observation stations in Japan.Fourier analysis of precipitation anomaly suggests that the spectrum of precipitation anomaly is attenuated as an exponential function.Thus stochastic model of the fractal variety with a spectrum characterized by a spec-  trum following a power law must be disregarded.To model precipitation anomaly, we propose a new stochastic model: the e-model.Our findings are the following.
1) The e-model closely resembles fBm/fLm, but an exponential type filter is applicable to our data: anomaly time series data.
2) The e-model use an exponential type of filter: We generated daily precipitation time series using the e-model.The model generates some negative values.Therefore, we modify the negative values to zero values.This modification did not affect the reproducibility of the relation between  and E().
We emphasize the reproducibility of extreme daily precipitation.The e-model with Lévy's law can reproduce extreme daily precipitation.We intend to search for other phenomena to which the e-model and Lévy's law can be applied and which are predicted by the model.

Acknowledgements
We are grateful for Dr. Daniel

Figure 1 .
Figure 1.Locations of observation stations: observation stations having complete data obtained during 1901-2011, stations with 1 year of data were lost, stations with 2 years of data lost, and stations with 10 years of data lost are denoted respectively with green circles, yellow circles, orange circles, and red circles.

Figure 2 ,
black lines and red lines respectively represent R(n) and   R n (upper panel).The lower panel shows the anomaly time series of R(m, n) in 2011 at Tsu Observation Station.We will analyze this R(m, n) in the following sections.

Figure 3 (
Figure 3. (Upper) Power spectra of R  in Tsu on the normal chart.The blue dots are power spectra of R  in Tsu.The red curve is a regression curve; (Bottom) The same figure as the upper panel, but on the log-log chart.Blue dots represent power spectra of R  .The red line is a linear regression line.

Figure 4 .
Figure 4. (Upper) A histogram of X t (green dots), a PDF of the Gaussian law (blue curve) and the Lévy PDF (dashed curve).The histogram size has been reduced for comparison to PDF.Therefore, to be precise, the histogram is PDF; (Bottom) The same figure as shown in the upper panel, but on the double logarithm chart.The curve of the Lévy PDF (dashed curve) provides a better fit to the tail of the histogram (PDF curves) of X t .a potential outcome of the method discussed in the paper (personal communication, Daniel Lavallée, 2013).Figure4(bottom) is the same graph as that shown in the upper panel, but on double logarithm graph to emphasize the tail's nature.The lower panel indicates the superiority of Lévy's law more clearly than the upper panel: the lower panel clearly shows that the Lévy law provide a better description of the extreme events located in the tail of the PDF of X t .X t is clearly related to daily precipitation.We intend to emphasize the reproducibility of extreme daily precipitation.Therefore, it is concluded that the Lévy's law is more appropriate.We can quantitatively estimate goodness-of-fit of the PDFs of Lévy's law.We calculated the correlation coefficient between the histogram and PDFs of Lévy's law for 51 observation stations.Ranges of coefficients of correlation of normal graph were 0.86 ± 0.08 (1.00 ± 0.02), 0.80 ± 0.07 (0.92 ± 0.04) and 0.86 ± 0.06 (1.01 ± 0.04) by the three methods described above (numbers in the parentheses are slopes of the regression lines).Sufficiently high correlation and slopes that are almost unity

Figure 5
depicts a chart of the interannual change of the simulated R R k R      (blue line) and observed R in 2011 (red line) at Tsu Observation Station located in Mie.According to Figure 5, their characteristics are comparable.However, R' shows a negative value on some days.Furthermore, we defined the new variable R'' as shown below. if

Figure 5 .
Figure 5. Interannual change of precipitation R' (blue line) generated using our method and parameters obtained in the preceding section, and actual precipitation R (red line) in 2011 at the Tsu Observation Station.Both lines show similar characteristics.However, R' (blue line) has some negative values.

Figure 6 .
Figure 6.The same figure as Figure 3 (upper).Yellow dots represent power spectra of R  in Tsu.The red line is a linear regression line.The Blue dots and green line are for  R  .
Lavallée's many kind and valuable comments.This research was supported by scientific research funds (No. 21560539 and No. 23360244) from the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT) and the "Precise Impact Assessments on Climate Change" of the Program for Risk Information on Climate Change (SOUSEI Program), supported by MEXT.