1. Introduction
A wide variety of rainfall-runoff models have been developed and applied for water resources planning, which is vital for flood control and management. Traditionally, hydrologists and water resources researchers have used conventional modeling techniques: either deterministic models that include the physics of the underlying process, or system-theoretic (black box) models. However, these models require a large quantity of data and a complex calibration methodology. Most hydrological models either give unsatisfactory results or become cumbersome to apply. Many researchers report that these models fail to capture the high flows in a hydrograph [1,2] because the data available in the high-flow domain (5% of the total calibration patterns) are too limited to capture the nonlinear dynamics.
Recently, researchers have focused on decomposing the flow hydrograph data to enhance the performance of hydrologic models. Most studies have used either statistical techniques or soft decomposing techniques for data decomposition [3]. These studies include automated base flow separation and recession analysis [4], spectral analysis [5], wavelet transforms and runoff time series analysis [6-9], modular neural networks (MNN) [10], the self-organizing map (SOM) classifier [11,12] and the self-organizing linear output map (SOLO) [13]. Most of these studies conclude that decomposing and partitioning the data results in better model performance.
The artificial neural network (ANN) is a system-theoretic model that has gained momentum in the last few decades, as it has been successfully applied to a wide range of problems in hydrology [2,14-16]. It is used to develop a relationship between input and output variables from existing data. Jain and Srinivasulu [3] proposed an integrated approach to model the decomposed flow hydrograph using ANN and conceptual techniques. The streamflow decomposition was based on physical processes: the input-output data were divided into segments and a separate model was fitted to each segment [3]. However, models developed using this distributed approach make the solution procedure significantly more complex [3]. In this study, efforts are made to develop a simplified ANN-based decomposed streamflow (SD-ANN) model that does not require any prior knowledge or understanding of the physical processes. The data are divided into two states, namely rise and fall, based on the current state. The proposed model is compared with the feed-forward ANN (FF-ANN) model on a real case example, the Kolar basin, India.
This paper is organized as follows. Section 2 gives a brief introduction to ANNs. Section 3 describes the proposed methodology. Section 4 presents the case study on the Kolar basin, India. Section 5 presents the results and discussion, and Section 6 concludes the paper with a summary and conclusions.
2. Artificial Neural Network
ANNs are highly interconnected mathematical models whose structure is analogous to that of the human brain. They attempt to reproduce the massively parallel local processing and distributed storage properties that are believed to exist in the human brain [17]. The simple processing units of an ANN are called “neurons”. Neurons with similar characteristics are grouped into a single layer: neurons in the input layer receive input from an external source and transmit it to the neurons in an adjacent layer, which may be either a hidden layer or the output layer. The structure of the ANN model is shown in Figure 1.
The general mathematical form of an ANN model is given as:
$$\hat{y}_k = f_o\left(\sum_{j} w_{jk}\, f_h\left(\sum_{i} w_{ij}\, x_i + b_j\right) + b_k\right) \qquad (1)$$
where $x_i$ is the input of the ANN model, $w_{ij}$ is the weight connecting input nodes to hidden nodes, $w_{jk}$ is the weight connecting hidden nodes to output nodes, $b_j$ and $b_k$ are the biases at the hidden and output layers respectively, and $f_h$ and $f_o$ are the activation functions at the hidden and output layers respectively.
The weights $w_{ij}$ and $w_{jk}$ are usually determined by minimizing the quadratic error function,
$$E = \frac{1}{2}\sum_{k}\left(y_k - \hat{y}_k\right)^2 \qquad (2)$$
Once the ANN model has been executed, the error at the output layer can be computed, provided the observed output is known:
$$e_k = y_k - \hat{y}_k \qquad (3)$$
where $e_k$ is the error at the output layer, $y_k$ is the observed stream flow and $\hat{y}_k$ is the estimated stream flow.
Through repeated feed-forward calculation and back-propagation of the errors, the connection strengths are updated until the output reaches an acceptable level of accuracy. This process is called training of the ANN. Once the network has been trained, it can be tested using the testing data.
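The feed-forward calculation and one back-propagation update can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the implementation used in this study: it assumes a single hidden layer, sigmoid activations at both layers, plain batch gradient descent on the quadratic error, and targets scaled to the interval (0, 1). The function names, array shapes and the learning rate `lr` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation, assumed here for both layers."""
    return 1.0 / (1.0 + np.exp(-z))


def forward(x, w_ih, b_h, w_ho, b_o):
    """Feed-forward pass; x has shape (n_patterns, n_inputs)."""
    h = sigmoid(x @ w_ih + b_h)        # hidden-layer outputs
    y_hat = sigmoid(h @ w_ho + b_o)    # network outputs
    return h, y_hat


def quadratic_error(y, y_hat):
    """Half the sum of squared output errors."""
    return 0.5 * np.sum((y - y_hat) ** 2)


def train_step(x, y, w_ih, b_h, w_ho, b_o, lr=0.05):
    """One batch back-propagation update; returns the new parameters."""
    h, y_hat = forward(x, w_ih, b_h, w_ho, b_o)
    e = y - y_hat                                  # output-layer error
    delta_o = -e * y_hat * (1.0 - y_hat)           # dE/d(output pre-activation)
    delta_h = (delta_o @ w_ho.T) * h * (1.0 - h)   # error propagated to hidden layer
    return (w_ih - lr * x.T @ delta_h,
            b_h - lr * delta_h.sum(axis=0),
            w_ho - lr * h.T @ delta_o,
            b_o - lr * delta_o.sum(axis=0))
```

Calling `train_step` repeatedly on the calibration patterns and monitoring `quadratic_error` corresponds to the training loop described above; the trained parameters are then passed to `forward` with the testing data.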
3. Model Development
Determining the significant input variables is an essential step in ANN modeling [18,19]. Cross-correlation is used to find the relationship between the variables [2,19-21] and is one of the most popular analytical techniques for selecting appropriate inputs [18]. Observed relationships between the training samples and the connection weights enhance the generalization ability of an ANN model [22].
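As an illustration of correlation-based input selection, the sketch below computes the correlation between rainfall at time t − k and flow at time t for a range of lags k. It is a minimal example, not the exact procedure of the cited references; the function name and arguments are hypothetical, and both series are assumed to be equally spaced and of equal length.

```python
import numpy as np


def lagged_cross_correlation(rain, flow, max_lag):
    """Correlation between rain[t - k] and flow[t] for k = 0..max_lag."""
    rain = np.asarray(rain, dtype=float)
    flow = np.asarray(flow, dtype=float)
    n = len(flow)
    corr = np.empty(max_lag + 1)
    for k in range(max_lag + 1):
        # Pair rainfall at time t - k with flow at time t.
        corr[k] = np.corrcoef(rain[:n - k], flow[k:])[0, 1]
    return corr
```

Lags with high correlation are candidate inputs. Passing the flow series as both arguments gives the autocorrelation at the same lags.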
The inputs to the SD-ANN model were selected using the cross- and auto-correlation method proposed by Sudheer et al. [2]. The significant input variables were found to be the effective rainfalls at lag time steps of t − 9, t − 8 and t − 7 ($R_{t-9}$, $R_{t-8}$ and $R_{t-7}$) using the cross-correlation function, and the river flow values at lag time steps of t − 1 and t − 2 ($Q_{t-1}$ and $Q_{t-2}$) using the autocorrelation function [23]. The output of the model is the river flow at time t ($Q_t$). Thus $Q_t$ is represented as
$$Q_t = f\left(R_{t-9}, R_{t-8}, R_{t-7}, Q_{t-1}, Q_{t-2}\right) \qquad (4)$$
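The input-output patterns implied by this lag structure can be assembled as in the following sketch. The function name `build_patterns` and its arguments are hypothetical; the default lags follow the text (rainfall at t − 9, t − 8 and t − 7, flow at t − 1 and t − 2), and both series are assumed to be hourly and of equal length.

```python
import numpy as np


def build_patterns(rain, flow, rain_lags=(9, 8, 7), flow_lags=(1, 2)):
    """Return (X, y): lagged rainfall and flow inputs, and the flow at time t."""
    rain = np.asarray(rain, dtype=float)
    flow = np.asarray(flow, dtype=float)
    if rain.shape != flow.shape:
        raise ValueError("rain and flow must have the same length")
    max_lag = max(max(rain_lags), max(flow_lags))
    t = np.arange(max_lag, len(flow))   # time steps with a full set of lags
    columns = [rain[t - k] for k in rain_lags] + [flow[t - k] for k in flow_lags]
    X = np.column_stack(columns)        # one row per input pattern
    y = flow[t]                         # target: flow at time t
    return X, y
```

With the default lags, the first nine time steps of each series produce no pattern because their lagged rainfall values are not available.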
In this study, the hourly input data are divided into two cases based on the previous data sets:
1) Rise: In the rise pattern the value of runoff at time t is greater than that at time step t − 1, i.e., $Q_t > Q_{t-1}$.
2) Fall: In the fall pattern the value of runoff at time step t − 1 is greater than that at time t, i.e., $Q_{t-1} > Q_t$.
Figure 2 shows the proposed methodology, in which Model 1 decomposes the data into classes (i.e., rise and fall) based on the input variables, Model 2 is the calibrated ANN model for the rise, and Model 3 is the calibrated ANN model for the fall.
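A minimal sketch of the rise/fall partitioning of calibration patterns follows. It labels each pattern by comparing the flow at time t with the flow at time t − 1, as in the definitions above. Two points are assumptions made for illustration: patterns with equal flows are assigned to the fall class, and the flow at t − 1 is taken to be stored in column `q_prev_col` of the input matrix. How Model 1 assigns a class at forecast time, when the flow at time t is not yet known, is not covered by this sketch.

```python
import numpy as np


def split_rise_fall(X, y, q_prev_col=3):
    """Split patterns into rise (Q_t > Q_{t-1}) and fall (all others)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rise = y > X[:, q_prev_col]   # True where the flow increased
    # Assumption: ties (Q_t == Q_{t-1}) are placed in the fall class.
    return (X[rise], y[rise]), (X[~rise], y[~rise])
```

Each of the two subsets would then be used to calibrate its own ANN (Model 2 for the rise and Model 3 for the fall).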
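Statistical indices, namely the coefficient of correlation (Cc), the root-mean-square error (RMSE) and the Nash-Sutcliffe efficiency (NSE) [24], are used to evaluate the performance of the model. These indices are defined as
$$C_c = \frac{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)\left(\hat{y}_i - \bar{\hat{y}}\right)}{\sqrt{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2 \sum_{i=1}^{n}\left(\hat{y}_i - \bar{\hat{y}}\right)^2}} \qquad (5)$$
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \qquad (6)$$
$$NSE = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \qquad (7)$$
where $y_i$ is the observed runoff value, $\hat{y}_i$ is the predicted runoff value, $\bar{y}$ is the mean of the observed runoff values, $\bar{\hat{y}}$ is the mean of the predicted runoff values, and $n$ is the number of patterns.
Figure 1. Typical model structure of the FF-ANN model.
Figure 2. Methodology of the streamflow decomposed based ANN model.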
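The three performance indices (Cc, RMSE and NSE) can be computed as in the sketch below, which uses their standard textbook definitions. The function names are hypothetical, and the observed and predicted series are assumed to be one-dimensional and of equal length.

```python
import numpy as np


def coefficient_of_correlation(obs, pred):
    """Pearson correlation between observed and predicted runoff."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    d_obs = obs - obs.mean()
    d_pred = pred - pred.mean()
    return np.sum(d_obs * d_pred) / np.sqrt(np.sum(d_obs ** 2) * np.sum(d_pred ** 2))


def rmse(obs, pred):
    """Root-mean-square error between observed and predicted runoff."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return np.sqrt(np.mean((obs - pred) ** 2))


def nse(obs, pred):
    """Nash-Sutcliffe efficiency; 1 indicates a perfect fit."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)
```

A prediction with a constant offset from the observations still has a correlation of 1, while its RMSE and NSE degrade. This is one reason the three indices are reported together.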
4. Case Study
A case study on the Kolar River basin is used to demonstrate the proposed SD-ANN method. FF-ANN and SD-ANN models were developed to forecast the runoff at a 1-hour lead time. Hourly data for the monsoon season (i.e., July, August and September) over a 3-year period (1987 to 1989) were used. Areal average values of the rainfall data from three rain gauge stations were used in the study.
The Kolar River is a tributary of the river Narmada that drains an area of about 1350 km² before its confluence with the Narmada near Neelkant (Figure 3). In this study the catchment area up to the Satrana gauging site is considered, which covers 903.87 km². The 75.3-km-long river course lies between north latitude and east longitude. Further details on the basin are given by Nayak et al. [25].
From the data available for the 3 years, 6525 patterns (input-output pairs) were identified for the study and split into a calibration set (5500 patterns, 1987-1988 data) and a validation set (1025 patterns, 1989 data). The 1025 validation patterns correspond to a continuous hydrograph.
The sigmoid function is used as the activation function at both the hidden layer and the output layer, as it is easily differentiable:
$$f(x) = \frac{1}{1 + e^{-x}} \qquad (8)$$
Figure 3. Map of Kolar river basin [25].
The back-propagation algorithm [26] was used to calculate the network parameters. Adaptive learning and momentum rates were employed for model training [25].
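The text does not specify how the momentum and adaptive learning rates were implemented. The sketch below shows one common form of each, purely as an illustration: a classical momentum update, and a simple heuristic that raises the learning rate when the error falls and lowers it when the error rises. The function names and the factors `momentum`, `up` and `down` are assumed values, not the settings used in this study.

```python
import numpy as np


def momentum_step(w, velocity, grad, lr, momentum=0.9):
    """Classical momentum update; returns the new weights and velocity."""
    velocity = momentum * velocity - lr * grad   # blend previous step with new gradient
    return w + velocity, velocity


def adapt_learning_rate(lr, prev_error, error, up=1.05, down=0.7):
    """Raise the rate after an improvement, cut it after a deterioration."""
    return lr * up if error < prev_error else lr * down
```

In a training loop, `momentum_step` would be applied to each weight array with the gradient from back-propagation, and `adapt_learning_rate` would be called once per epoch with the previous and current calibration errors.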
5. Results and Discussions
As discussed earlier, the SD-ANN model is used to forecast the river flow for the Kolar basin at a lead time of 1 hour. The performance of the proposed SD-ANN model and the FF-ANN model has been evaluated using the coefficient of correlation (Cc), the Nash-Sutcliffe efficiency (NSE) and the root-mean-square error (RMSE) between the observed and estimated flow values. The statistics in Table 1 indicate that the runoff predicted by the SD-ANN model is more accurate than that predicted by the FF-ANN model. The performance of the two models in terms of Cc and NSE is very similar and satisfactory: the correlation coefficients of both models are very close to unity, and the efficiency of both models is greater than 90%, which is highly satisfactory according to Shamseldin [27]. The RMSE of the proposed model, however, is lower than that of the FF-ANN model. The high flows are also modeled better by the proposed SD-ANN model than by the FF-ANN model (Figure 4). These results indicate that decomposing the streamflow has a considerable impact on model performance.
Table 1. Statistical indices: comparison between SD-ANN and FF-ANN models.
Figure 4. Computed streamflows for a typical event during validation.
6. Summary and Conclusion
In this study, a simplified ANN-based decomposed streamflow model is developed. The proposed data decomposition does not require any prior knowledge or understanding of the physical processes. The data are divided into two states, namely rise and fall, based on the current state. The performance of the proposed SD-ANN model is compared with that of the feed-forward ANN model in terms of statistical indices: the coefficient of correlation, the coefficient of efficiency and the root-mean-square error. The exercise was carried out for hourly data from the Kolar river basin, India. The proposed SD-ANN model and the FF-ANN model show similar results in terms of the statistical indices, except for the RMSE, where the former outperforms the latter. Further, the SD-ANN model outperforms the FF-ANN model in the prediction of high flows. The results show the significance of streamflow decomposition compared with modeling a single hydrograph. The performance of the SD-ANN model remains to be tested on other time scales. Further extensions of this model can be examined to improve the forecasting accuracy [28].
7. Acknowledgements
The authors thank the Vellore Institute of Technology, Vellore, India, for providing the necessary facilities to carry out this research work.