Gauged river data play an important role in the modeling, planning and management of river basins. Among hydrological data, daily discharge data are particularly significant for determining the amount of energy production and for controlling the risks of floods and drought. Hence, the data need correct measurement, analysis, and reliable estimates. The purpose of the paper is to investigate whether all the stations in a river basin exhibit chaotic behavior. For this purpose, the daily discharge data of four gauge stations are examined using three nonlinear data analysis methods: 1) phase space reconstruction; 2) correlation dimension; and 3) local approximation, all of which help identify chaotic behavior. The results show that all stations exhibit chaotic character. Taking this chaotic character into account, the local approximation method is applied to assess the prediction accuracy. Because global warming is a serious threat to natural resources, prediction accuracy is becoming a key factor in ensuring sustainability. Hence, this study serves as an example of the implementation of chaotic analysis, based on the results obtained from these methods.

Changing patterns of river flow have potentially significant impacts on water quality, water abstraction, flooding and habitat availability for a range of aquatic and riparian species. It is suggested that water resource planning may need to account for these changes before many of them become statistically significant [

This paper applies three nonlinear dynamic methods to daily river discharge data. The first is phase space analysis, which describes the evolution of the behavior of a nonlinear system and reconstructs it using the delay-time embedding method suggested by Takens [

The nature of dynamics of a real-world system may be stochastic, deterministic or in between. The character of a system can be identified, at least as a preliminary indicator, by using the phase space concept. A popular method for identification of phase space of a time series was presented by Takens [

where T is the delay time and d is the embedding dimension. For driven systems whose dynamics can be reduced to a set of inherently deterministic behaviors, the trajectories converge towards a subset of the phase space, called the attractor.
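The delay-time reconstruction can be illustrated with a minimal Python sketch. It assumes the usual delay-coordinate convention, in which each reconstructed vector is built from d samples of the series spaced T steps apart; the function name `delay_embed` is hypothetical and is not taken from the paper or from TISEAN.

```python
import numpy as np

def delay_embed(x, d, T):
    """Return the delay vectors (x[j], x[j+T], ..., x[j+(d-1)T]) as rows.

    x : 1-D time series, d : embedding dimension, T : delay time in samples.
    """
    x = np.asarray(x, dtype=float)
    n_vectors = len(x) - (d - 1) * T
    if n_vectors <= 0:
        raise ValueError("series too short for this d and T")
    # Column k holds the series shifted by k*T samples.
    return np.column_stack([x[k * T : k * T + n_vectors] for k in range(d)])

# Usage: a 10-sample series embedded with d = 3 and T = 2 gives 6 vectors.
Y = delay_embed(np.arange(10), d=3, T=2)
print(Y.shape)
```

Each row of `Y` is one point of the reconstructed trajectory, so plotting the first two or three columns against each other gives the phase portrait.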

The time delay

where i is the total number of samples.
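As an illustration of how a delay time can be read from the average mutual information, the following sketch estimates the AMI from a two-dimensional histogram and returns the first local minimum. The histogram estimator, the number of bins and the function names are assumptions made for this example; they are not necessarily what TISEAN or the authors used.

```python
import numpy as np

def average_mutual_information(x, lag, bins=16):
    """Histogram estimate (in bits) of the AMI between x(t) and x(t + lag), lag >= 1."""
    x = np.asarray(x, dtype=float)
    a, b = x[:-lag], x[lag:]
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = joint / joint.sum()                 # joint probabilities
    p_a = p_ab.sum(axis=1, keepdims=True)      # marginal of x(t)
    p_b = p_ab.sum(axis=0, keepdims=True)      # marginal of x(t + lag)
    mask = p_ab > 0                            # skip empty cells (0 * log 0 = 0)
    return float(np.sum(p_ab[mask] * np.log2(p_ab[mask] / (p_a @ p_b)[mask])))

def first_minimum_lag(x, max_lag, bins=16):
    """Return the lag of the first local minimum of the AMI curve.

    Falls back to the global minimum when no local minimum is found.
    """
    ami = [average_mutual_information(x, lag, bins) for lag in range(1, max_lag + 1)]
    for k in range(1, len(ami) - 1):
        if ami[k] < ami[k - 1] and ami[k] <= ami[k + 1]:
            return k + 1                       # ami[k] belongs to lag k + 1
    return int(np.argmin(ami)) + 1
```

The first minimum is a common heuristic: it marks the shortest lag at which successive coordinates share little information while remaining dynamically related.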

The false nearest neighbor (FNN) algorithm examines, in dimension m, the nearest neighbor of every vector Y_{j} as it behaves in dimension m + 1. If that neighbor is a true neighbor of Y_{j}, it comes to the neighborhood of Y_{j} through dynamical origins. On the other hand, if the neighbor moves far away from Y_{j} when the dimension m is increased, it is declared a false nearest neighbor, as it arrived in the neighborhood of Y_{j} in dimension m by projection from a distant part of the attractor. When the percentage of these false nearest neighbors drops to zero, the geometric structure of the attractor has been unfolded and the orbits of the system are distinct and do not cross (or overlap). A key step in the false nearest neighbor algorithm is deciding, as the embedding dimension is increased, whether a nearest neighbor is false. Two criteria are generally used (Sangoyomi et al., 1996). These are:

1) If^{th} vector has a FNN (where ^{th} vector (i.e.,_{t},

2) If^{th} vector has a FNN (where ^{th} vector and to its nearest neighbor with embedding [
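The two tests can be sketched in Python as follows. Because the exact expressions are not legible in the text above, this example assumes the commonly used form of the criteria: a neighbor is false if the extra coordinate separates the pair by more than `r_tol` times their distance in dimension m, or if their distance in dimension m + 1 exceeds `a_tol` times the standard deviation of the series. The threshold values and all names are illustrative assumptions.

```python
import numpy as np

def fnn_fraction(x, m, T, r_tol=15.0, a_tol=2.0):
    """Fraction of false nearest neighbors when going from dimension m to m + 1."""
    x = np.asarray(x, dtype=float)
    n = len(x) - m * T                        # vectors that also have an (m+1)-th coordinate
    emb = np.column_stack([x[k * T : k * T + n] for k in range(m)])
    extra = x[m * T : m * T + n]              # coordinate added in dimension m + 1
    sigma = np.std(x)                         # proxy for the attractor size
    false_count = 0
    for j in range(n):
        dist = np.linalg.norm(emb - emb[j], axis=1)
        dist[j] = np.inf                      # exclude the point itself
        nn = int(np.argmin(dist))
        r_m = dist[nn]
        gap = abs(extra[j] - extra[nn])
        crit1 = gap > r_tol * r_m             # neighbor jumps away in the new coordinate
        crit2 = np.sqrt(r_m ** 2 + gap ** 2) > a_tol * sigma  # pair is not close at all
        if crit1 or crit2:
            false_count += 1
    return false_count / n
```

In practice the function would be evaluated for m = 1, 2, 3, ... and the embedding dimension taken as the first m at which the fraction falls to (nearly) zero. The brute-force neighbor search is quadratic in the series length, so a k-d tree would be preferable for records of several thousand days.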

The estimate of the dimension of the system reveals the presence of chaos through the structure of the dimension. In this study, the Correlation Dimension Method (CDM) was used: the correlation dimension of the system indicates whether the behavior is chaotic and also guides the choice of the embedding dimension. If the system has a fractal dimension, its character is assumed to be chaotic.

CDM is one of the most efficient methods to determine the presence of chaos. The method uses a fractal dimension, which is non-integer for chaotic systems. For an m-dimensional phase space the correlation function

where H is the Heaviside step function, defined by Equations (4)-(6):

and

N is the number of points on the reconstructed attractor, r is the radius of the sphere centered on Y_{i} or Y_{j}. If the time series is characterized by an attractor, then for positive values of r the correlation function

where a is a constant; and D_{2} is the correlation exponent or the slope of the

For a finite dataset, there is a radius r below which there are no pairs of points (depopulation); when the radius approaches the diameter of the cloud of points, the number of pairs increases no further as the radius increases (saturation). The scaling region is found somewhere between depopulation and saturation. For a stochastic process, D_{2} varies linearly with increasing m without reaching a saturation value, whereas for a deterministic process the value of D_{2} saturates after a certain m.
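The correlation function and the slope estimate can be sketched as below. The example assumes the standard pair-counting form, in which the Heaviside function counts the pairs of reconstructed points closer than r and the count is normalized by the number of pairs; the function names and the choice of radii are assumptions for illustration.

```python
import numpy as np

def pairwise_distances(Y):
    """Euclidean distances between all distinct pairs of rows of Y."""
    Y = np.asarray(Y, dtype=float)
    diff = Y[:, None, :] - Y[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    i, j = np.triu_indices(len(Y), k=1)       # each pair counted once
    return dist[i, j]

def correlation_sum(distances, r):
    """Fraction of pairs closer than r, i.e. the Heaviside count divided by the number of pairs."""
    return np.count_nonzero(distances < r) / distances.size

def correlation_exponent(Y, radii):
    """Least-squares slope of log C(r) against log r over the supplied radii."""
    distances = pairwise_distances(Y)
    radii = np.asarray(radii, dtype=float)
    c = np.array([correlation_sum(distances, r) for r in radii])
    keep = c > 0                              # drop depopulated radii
    slope, _intercept = np.polyfit(np.log(radii[keep]), np.log(c[keep]), 1)
    return float(slope)
```

The radii passed to `correlation_exponent` should lie inside the scaling region. Repeating the estimate for increasing m and checking whether the slope saturates is what distinguishes deterministic from stochastic behavior in the sense described above. The full distance matrix is held in memory, so this sketch is only suitable for a few thousand points.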

A correct phase-space reconstruction in a dimension m facilitates an interpretation of the underlying dynamics in the form of an m-dimensional map f_{T}. According to Equation (9):

where Y_{j} and _{T}. domain into many subsets (neighborhoods), in order to determine a proper value for f_{T}. In other words, the dynamics of the system is described step by step locally in the phase space. Before applying reconstruction procedure it is necessary to have some information such as, embedding dimension and delay time. One of the independent coordinates mentioned above is taken as the time series itself. The remaining coordinates are formed by its _{i} with time performs prediction. Considering the relation between the points X_{t} and

In this prediction method, the change of X_{t} with time on the attractor is assumed to be the same as those of nearby points,^{th} order polynomial

where:

and

In order to obtain a stable solution, the number of rows in the Jacobian matrix A must satisfy the relation in Equation (17):

As stated by Porporato and Ridolfi [
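A minimal sketch of a local approximation step is given below. It assumes a first-order (local linear) polynomial fitted by least squares to the k nearest neighbors of the current state, with k chosen larger than the number of fitted coefficients so that the system is overdetermined; the default k = 2(m + 1) and all names are assumptions for illustration, not the authors' settings.

```python
import numpy as np

def local_linear_predict(x, m, T, horizon=1, k=None):
    """Predict x[len(x) - 1 + horizon] from the k nearest neighbors of the last state."""
    x = np.asarray(x, dtype=float)
    if k is None:
        k = 2 * (m + 1)                       # more rows than the m + 1 unknowns
    times = np.arange((m - 1) * T, len(x))
    # State at time t: (x[t], x[t-T], ..., x[t-(m-1)T]).
    emb = np.column_stack([x[times - j * T] for j in range(m)])
    query = emb[-1]
    n_cand = len(emb) - horizon               # states whose future value is known
    cand = emb[:n_cand]
    targets = x[times[:n_cand] + horizon]
    nearest = np.argsort(np.linalg.norm(cand - query, axis=1))[:k]
    # Least-squares fit: target = c0 + c . state, using the neighbors only.
    A = np.column_stack([np.ones(len(nearest)), cand[nearest]])
    coef, *_ = np.linalg.lstsq(A, targets[nearest], rcond=None)
    return float(coef[0] + query @ coef[1:])

# Usage: one-step prediction of a sampled sine wave.
series = np.sin(0.2 * np.arange(500))
print(local_linear_predict(series[:-1], m=2, T=1), series[-1])
```

Longer forecasts can be produced either by increasing `horizon` or by appending each prediction to the series and repeating the one-step prediction.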

The Root Mean Square Error (RMSE) is a frequently used measure of the difference between values predicted by a model and the values actually observed from the environment that is being modeled. These individual differences are also called residuals, and the RMSE serves to aggregate them into a single measure of predictive power. The RMSE of a model prediction with respect to the estimated variable X_{model} is defined as the square root of the mean squared error.

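$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(X_{obs,i} - X_{model,i}\right)^{2}} \qquad (18)$$

In Equation (18), X_{obs} are the observed values and X_{model} are the modelled values at time/place i, and n is the number of compared values. The root-mean-square error (RMSE) measures the standard deviation of the residuals. The RMSE is always non-negative; the best value is zero, and the higher the value, the poorer the model performance.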

1) Normalized Root Mean Square Error

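Non-dimensional forms of the RMSE are useful because one often wants to compare RMSE values that have different units. The RMSE is normalized by the range of the observed data, as in Equation (19):

$$\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{X_{obs,\max} - X_{obs,\min}} \qquad (19)$$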
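The two error measures can be computed with a few lines of Python. This is a sketch that follows the definitions given in the text (square root of the mean squared error, normalized by the observed range); the function names and the sample values are illustrative only.

```python
import numpy as np

def rmse(obs, model):
    """Root mean square error between observed and modelled values."""
    obs = np.asarray(obs, dtype=float)
    model = np.asarray(model, dtype=float)
    return float(np.sqrt(np.mean((obs - model) ** 2)))

def nrmse(obs, model):
    """RMSE divided by the range (max - min) of the observed values."""
    obs = np.asarray(obs, dtype=float)
    return rmse(obs, model) / float(obs.max() - obs.min())

# Usage with made-up numbers: one residual of 2 among four values.
obs = [1.0, 2.0, 3.0, 4.0]
model = [1.0, 2.0, 3.0, 6.0]
print(rmse(obs, model), nrmse(obs, model))
```

Because the NRMSE is dimensionless, it allows stations with very different mean discharges to be compared on the same scale.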

The quantity^{2} takes the value between the

The daily discharge data of four gauge stations of the Turkish General Directorate of State Hydraulic Works (DSİ) in the Yesil Irmak River basin were examined. The records are daily and contain no missing values. The squared stations on

The entire dataset is divided into two parts: the first 25 years (1977-2001) of the data are used in the phase-space reconstruction and identification of system behavior, and the subsequent 1-year dataset (2001-2002) is used for prediction.

The first step in reconstructing the original phase space is to estimate the phase parameters, namely the delay time and the embedding dimension. The method of average mutual information (AMI) (Equation (2)) was used to quantify the delay time. TISEAN version 3.0.1 package program [_{t} versus

Gauge station | Observation years | Data length (days) | Mean flow rate (m^{3}/s) | Maximum flow rate (m^{3}/s) | Minimum flow rate (m^{3}/s) | Standard deviation (σ) |
---|---|---|---|---|---|---|
1401 | 1976-2002 | 9500 | 69.95 | 981 | 2.89 | 78.94 |
1412 | 1976-2002 | 9500 | 6.98 | 125 | 0.77 | 9.181 |
1414 | 1976-2002 | 9500 | 4.30 | 75 | 0.81 | 5.90 |
1418 | 1976-2002 | 9500 | 18.70 | 140 | 1.02 | 23.60 |

The CDM is used to calculate the embedding dimension for the dataset, using the delay times in

Gauge station | Delay time | Embedding dimension |
---|---|---|
1401 | 59 days | 12 |
1412 | 95 days | 12 |
1414 | 74 days | 15 |
1418 | 75 days | 15 |

indication of the existence of deterministic dynamics. The saturated correlation dimensions are shown in

The entire dataset of 26 years was divided into two parts: the first 25 years of data were used in the phase space reconstruction, and predictions were made for the subsequent 1 year (2001-2002) of data. The NRMSE and R^{2} were used to determine the prediction achievement. Such results indicate the appropriateness of the phase-space-based nonlinear prediction technique employed herein on the daily discharge data of the gauge stations of the Yesil Irmak River Basin. The selected evaluation criteria (correlation coefficient (R^{2})) reveal

Gauge station | Correlation dimension (D_{2}) |
---|---|
1401 | 3.78 |
1412 | 3.12 |
1414 | 3.88 |
1418 | 3.81 |

Embedding dimension | NRMSE 1401 | NRMSE 1412 | NRMSE 1414 | NRMSE 1418 |
---|---|---|---|---|
2 | 0.014 | 0.057 | 0.013 | 0.034 |
3 | **0.010** | 0.059 | 0.014 | 0.038 |
4 | 0.011 | 0.069 | **0.012** | **0.032** |
5 | 0.011 | **0.054** | 0.013 | 0.036 |
6 | 0.011 | 0.060 | 0.014 | 0.040 |
7 | 0.012 | 0.060 | 0.014 | 0.041 |
8 | 0.012 | 0.061 | 0.014 | 0.040 |
9 | 0.013 | 0.060 | 0.015 | 0.045 |
10 | 0.012 | 0.061 | 0.015 | 0.050 |

that the best prediction is achieved when the optimum embedding dimension (m_{opt}), the one giving the lowest NRMSE, is selected. The bold values in the NRMSE table mark m_{opt} for each station.
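The selection of m_{opt} amounts to taking, for each station, the embedding dimension with the lowest NRMSE. The short sketch below does this with the NRMSE values tabulated above; the variable and function names are illustrative.

```python
import numpy as np

# Embedding dimensions 2..10 and the tabulated NRMSE values per gauge station.
embedding_dims = np.arange(2, 11)
nrmse_table = {
    "1401": [0.014, 0.010, 0.011, 0.011, 0.011, 0.012, 0.012, 0.013, 0.012],
    "1412": [0.057, 0.059, 0.069, 0.054, 0.060, 0.060, 0.061, 0.060, 0.061],
    "1414": [0.013, 0.014, 0.012, 0.013, 0.014, 0.014, 0.014, 0.015, 0.015],
    "1418": [0.034, 0.038, 0.032, 0.036, 0.040, 0.041, 0.040, 0.045, 0.050],
}

def optimum_embedding_dimension(dims, scores):
    """Return the embedding dimension with the lowest NRMSE."""
    return int(dims[int(np.argmin(scores))])

m_opt = {
    station: optimum_embedding_dimension(embedding_dims, scores)
    for station, scores in nrmse_table.items()
}
print(m_opt)  # {'1401': 3, '1412': 5, '1414': 4, '1418': 4}
```

The optimum dimensions found this way (3 to 5) are well below the embedding dimensions listed with the delay times, and the NRMSE varies only slightly between neighboring dimensions.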

This paper investigates possible chaotic behavior in the daily discharge data from the gauge stations in the Yesil Irmak Basin, Turkey. The analysis was performed on four gauge stations with a record interval of 26 years (1976-2002). The focus of the paper was on identifying chaotic behavior in the time series, with the immediate concern of how any chaotic dynamics found in the series could be used in practical implementations.

The results in this paper can be summarized as follows:

1) The phase space of the data series was reconstructed. For this purpose, two components (time lag and embedding dimension) were determined. Both average mutual information (AMI) and the autocorrelation function are used in the literature to determine the time lag. Because AMI is the nonlinear counterpart of the autocorrelation function, the AMI method was used to determine the time lag for the phase space reconstruction.

2) In phase of determining the embedding dimension, which also determines the degrees of freedom, false nearest neighbors (FNN) [

3) The local prediction model was also applied to evaluate its prediction performance. In this prediction model, the dynamics of the system are described step by step locally in the phase space. The predicted values are in good agreement with the observations, as indicated by high values of the correlation coefficient.

Studies reporting the possible presence of nonlinear determinism in river flow and other hydrologic series have often been criticized, essentially because of the use of chaotic analysis methods as a tool for identifying nonlinear determinism. Hence, the literature on the implementation of nonlinear methods in hydrological processes is still considered to be in its infancy. This study attempts to address this issue by identifying the nonlinear deterministic nature of river flows. Understanding how land and water management practices, singly and in combination, change stream and river flows is a key to maintaining and restoring natural flow regimes [

The question of whether a given river flow (or any other hydrologic and geomorphic) series can be modeled better by stochastic methods or by nonlinear deterministic methods has come under increasing scrutiny in recent times [

For this purpose, we strongly believe that it is time for researchers to evaluate whole-basin data (where data are available) in order to understand the behavior of the river basin system for accurate management and control.