In this study, the capability of two types of models in simulating runoff was evaluated: the Hydrological Simulation Program-Fortran (HSPF), a process-based model, and an artificial neural network (ANN), a data-driven model. The study area is the Balkhichai River watershed in northwest Iran. HSPF is a semi-distributed, deterministic, continuous and physically based model that can simulate the hydrologic cycle and the associated water quantity and quality processes on pervious and impervious land surfaces and in streams. The ANN is probably the most successful machine learning technique; its flexible mathematical structure can identify complex non-linear relationships between input and output data without requiring an understanding of the nature of the phenomena. A statistical approach based on the cross-, auto- and partial autocorrelation of the observed data is used as an alternative to the trial and error method for identifying model inputs. The performance of the ANN and HSPF models in the calibration and validation stages is compared against the observed runoff values to identify the best-fit forecasting model according to a number of selected performance criteria. The results indicated that the runoff simulated by the ANN was generally closer to the observed values than that predicted by HSPF.
Streamflow is one of the most important processes in the hydrological cycle and its prediction is vital for water resources management and planning [
Over the last decades, artificial intelligence techniques, such as the Artificial Neural Network (ANN), have been introduced and widely applied in hydrological studies as powerful alternative modelling tools [
HSPF is a semi-distributed, conceptual model that combines spatially distributed physical attributes into the hydrologic response units. In this model, surface runoff is simulated primarily as an infiltration-excess process. HSPF has been used for simulation of various hydrological conditions [
ANNs and HSPF have both been widely used for runoff, pollution and sediment simulation, and some studies have compared mathematical models with ANNs [
The study area is Balkhichai River watershed (
Temperature and precipitation are the two basic variables measured at the meteorological stations. The training input dataset includes a total of 2557 data records between 2004 and 2010. The testing input dataset consists of a total of 729 data records observed during the last 2 years (2011-2012). Hourly precipitation and temperature data were used as inputs to the HSPF model. Data on the different land use classes in the Balkhichai watershed were collected during a 3-day field survey of the watershed. Land use classifications for the Balkhichai River watershed were then derived from satellite image processing using maximum likelihood classification.
Characteristic | Ardebil station | Nir station |
---|---|---|
Latitude (˚N) | 38.15 | 38.02 |
Longitude (˚E) | 48.17 | 47.59 |
Elevation (m) | 1332 | 1450 |
Available data (years) | 1951-2013 | 1960-2013 |
Mean precipitation (mm) | 445 | 376 |
Mean temperature (˚C) | 10.3 | 7.4 |
In this study, Hydrological Simulation Program FORTRAN (HSPF) was used for simulation of Balkhichai River runoff. HSPF is a set of computer codes, which was developed by the US Environmental Protection Agency. It is based on the Stanford Watershed Model IV [
HSPF is a semi-distributed, deterministic, continuous and physically based model. PERLND, IMPLND, and RCHRES are its three main modules, which simulate pervious land segments, impervious land segments, and free-flowing reaches, respectively. Detailed information about these modules can be found in the literature [
Parameter | Definition | Units | Possible range (min) | Possible range (max) |
---|---|---|---|---|
INFILT | Index of infiltration capacity | mm/h | 0.25 | 12.7 |
AGWRC | Base groundwater recession | dimensionless | 0.85 | 0.999 |
LZSN | Lower zone nominal soil moisture storage | mm | 50.8 | 381 |
UZSN | Upper zone nominal soil moisture storage | mm | 1.27 | 50.4 |
DEEPFR | Fraction of groundwater inflow to deep recharge | dimensionless | 0 | 0.5 |
INTFW | Interflow inflow parameter | dimensionless | 1 | 10 |
IRC | Interflow recession parameter | dimensionless | 0.3 | 0.85 |
BASETP | Fraction of remaining ET from base flow | dimensionless | 0 | 0.2 |
LZETP | Lower zone ET parameter | dimensionless | 0.1 | 0.9 |
BASINS (Better Assessment Science Integrating Point and Nonpoint Sources) software based on topographic, soil properties and land use data. The estimated parameters are then introduced to HSPF. BASINS was developed to promote better assessment and integration of point and nonpoint sources in watershed and water quality management. It integrates several key environmental data sets with improved analysis techniques. Several types of environmental programs can benefit from such an integrated system at various stages of environmental management planning and decision making [
An ANN, inspired by studies of biological neural systems, is composed of processing elements called neurons or nodes [
A FFNN (Feed Forward Neural Network) consists of at least three layers of input, output and hidden layers. The input signals presented to the system in input layer are processed and forwarded into the hidden layer. The summation of the weighted input signals is transferred by a nonlinear activation function. The response of the network is compared with the actual observation results and the network error is calculated. The error of network is propagated backwards through the system and the weight coefficients are updated (
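No | Validation statistics | Expression | Range |
---|---|---|---|
1 | ENS (Nash-Sutcliffe efficiency) | | |
2 | R (correlation coefficient) | | |
3 | ME (mean error) | | |
4 | RMSE (root mean square error) | | |
5 | PWRMSE (peak-weighted root mean square error) | | |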
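The expression column of the table above did not survive in this copy, so the following is only a sketch of the five criteria named later in the text (ENS, R, ME, RMSE, PWRMSE) using their commonly used definitions. The sign convention for ME (simulated minus observed) and the peak-weighting form of PWRMSE are assumptions, not taken from the paper, and the function names are illustrative.

```python
import math

def nash_sutcliffe(obs, sim):
    """ENS: 1 minus squared-error sum over variance sum of the observations."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def correlation(obs, sim):
    """R: Pearson correlation coefficient between observed and simulated flow."""
    n = len(obs)
    mo, ms = sum(obs) / n, sum(sim) / n
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)

def mean_error(obs, sim):
    """ME: average of (simulated - observed); the sign convention is assumed."""
    return sum(s - o for o, s in zip(obs, sim)) / len(obs)

def rmse(obs, sim):
    """RMSE: square root of the mean squared error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def pwrmse(obs, sim):
    """PWRMSE: squared errors weighted by (obs + mean_obs) / (2 * mean_obs),
    so errors at high flows count more. This weighting form is assumed."""
    mean_obs = sum(obs) / len(obs)
    total = sum((o - s) ** 2 * (o + mean_obs) / (2.0 * mean_obs)
                for o, s in zip(obs, sim))
    return math.sqrt(total / len(obs))
```

Under these definitions a perfect simulation gives ENS = 1, R = 1 and zero for the three error measures, while a constant offset leaves R at 1 but lowers ENS.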
The daily discharge data from 2004 to 2010 were used for calibration (HSPF) and training (ANN), and the data from 2011 to 2012 were used for validation (HSPF) and testing (ANN).
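A chronological split of this kind can be sketched as follows. The helper names and the `(date, value)` record layout are hypothetical, and the sketch assumes one record per calendar day.

```python
from datetime import date, timedelta

def daily_dates(start, end):
    """All calendar days from start to end, inclusive."""
    n_days = (end - start).days + 1
    return [start + timedelta(days=i) for i in range(n_days)]

def split_by_year(records, last_training_year):
    """Split (date, value) records into an earlier and a later block.

    Records dated up to and including last_training_year go to the
    calibration/training block, the rest to validation/testing.
    """
    train = [r for r in records if r[0].year <= last_training_year]
    test = [r for r in records if r[0].year > last_training_year]
    return train, test

if __name__ == "__main__":
    # Placeholder values of 0.0 stand in for the observed discharge.
    records = [(d, 0.0) for d in daily_dates(date(2004, 1, 1), date(2012, 12, 31))]
    train, test = split_by_year(records, last_training_year=2010)
    print(len(train), len(test))
```

A complete calendar gives 2557 days for 2004-2010, which matches the training count reported earlier, and 731 days for 2011-2012, slightly more than the 729 testing records reported; the text does not explain the difference.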
The most common ANN network is the feed-forward network, which uses the back-propagation algorithm for training [
Here, we use the three-layer FFNN with one hidden layer and the common trial and error method to select the number of hidden nodes. Too many hidden layer neurons not only require a large computational time for accurate training, but may also result in overtraining. A neural network is said to be “over-trained” when the network focuses on the characteristics of individual data points rather than just capturing the general patterns presented in the entire training set.
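The three-layer network and the trial and error search over hidden nodes can be illustrated with a minimal pure-Python sketch. It assumes sigmoid hidden units, a single linear output, plain stochastic gradient descent and a mean squared error score on a held-out set; none of these choices, nor the names used, are stated in the paper.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class FFNN:
    """One hidden layer of sigmoid units feeding a single linear output."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        """Return the hidden activations and the network output."""
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        return h, y

    def train_step(self, x, target, lr):
        """One back-propagation update for the loss 0.5 * (y - target)^2."""
        h, y = self.forward(x)
        err = y - target
        for j, hj in enumerate(h):
            # Error propagated back to hidden unit j, using the old w2[j].
            grad_h = err * self.w2[j] * hj * (1.0 - hj)
            self.w2[j] -= lr * err * hj
            self.b1[j] -= lr * grad_h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
        self.b2 -= lr * err

def mse(net, data):
    return sum((net.forward(x)[1] - t) ** 2 for x, t in data) / len(data)

def fit(net, data, epochs=200, lr=0.1):
    """Train with per-sample updates and return the final MSE on data."""
    for _ in range(epochs):
        for x, t in data:
            net.train_step(x, t, lr)
    return mse(net, data)

def select_hidden_nodes(train, valid, candidates=(1, 2, 4, 8)):
    """Trial and error: train one network per candidate size and keep
    the size with the lowest error on the held-out set."""
    scores = {}
    for n_hidden in candidates:
        net = FFNN(len(train[0][0]), n_hidden)
        fit(net, train)
        scores[n_hidden] = mse(net, valid)
    return min(scores, key=scores.get), scores
```

Scoring candidates on held-out data rather than on the training set is one simple guard against the over-training described above, since a larger hidden layer can always lower the training error.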
Understanding the temporal relationships between climatic drivers and stream-flow is fundamental for the model development. Some studies use time-series correlation analysis to determine the temporal lag (number of time steps) between climate and flow variables [
Parameter | Value |
---|---|
INFILT | 2.5 mm/h |
AGWRC | 0.87 |
LZSN | 56.23 mm |
UZSN | 13.86 mm |
DEEPFR | 0.3 |
INTFW | 4 |
IRC | 0.7 |
BASETP | 0.1 |
LZETP | 0.6 |
respectively. Therefore, a total of seven variables were identified as inputs (Equation (1)).
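The lag analysis behind this kind of input selection can be sketched as below: the flow series is correlated with each driver shifted back by k time steps, and lags with a strong correlation become candidate inputs. The function names and the 0.5 threshold are illustrative assumptions; the lags actually selected in the study are not reproduced here.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0.0 or vb == 0.0:
        return 0.0
    return cov / math.sqrt(va * vb)

def lagged_correlation(driver, flow, max_lag):
    """Correlation between driver(t - k) and flow(t) for k = 0..max_lag.

    Passing the flow series as the driver gives its autocorrelation.
    """
    result = {}
    for k in range(max_lag + 1):
        shifted = driver[:len(driver) - k]  # driver values k steps earlier
        result[k] = pearson(shifted, flow[k:])
    return result

def candidate_lags(correlations, threshold=0.5):
    """Lags whose absolute correlation reaches the (assumed) threshold."""
    return [k for k, r in sorted(correlations.items()) if abs(r) >= threshold]

if __name__ == "__main__":
    # Synthetic example: flow simply repeats rainfall two steps later.
    rain = [0, 5, 1, 0, 3, 0, 0, 8, 2, 0, 1, 4]
    flow = [0, 0] + rain[:-2]
    print(lagged_correlation(rain, flow, max_lag=4))
```

The partial autocorrelation mentioned in the abstract needs an extra step (removing the effect of shorter lags) that this sketch leaves out.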
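After the appropriate input vector was identified, the network was trained to predict future data based on past and present data. In the present study, the input and output variables are first normalized linearly to the range between 0 and 1. The normalization is done using the following equation:
$$x_n = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x_n$ is the normalized value, $x$ is the original value, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the variable.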
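Linear scaling to the range 0 to 1 is commonly done with min-max normalization; the sketch below assumes that form. Taking the minimum and maximum from the training period only and reusing them for the testing period is a design choice made here for illustration, not a detail stated in the paper.

```python
def fit_minmax(values):
    """Return the (min, max) pair used for scaling, e.g. from the training set."""
    return min(values), max(values)

def normalize(values, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1."""
    return [(v - lo) / (hi - lo) for v in values]

def denormalize(scaled, lo, hi):
    """Invert normalize() to bring network outputs back to physical units."""
    return [s * (hi - lo) + lo for s in scaled]

if __name__ == "__main__":
    training_flow = [2.0, 4.0, 6.0]  # placeholder values, not observed data
    lo, hi = fit_minmax(training_flow)
    print(normalize(training_flow, lo, hi))
```

Values in the testing period that fall outside the training range map slightly below 0 or above 1 under this scheme.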
In comparing the results, the terms “calibration” and “validation” for the HSPF model are used as equivalents of “training” and “testing” for the ANN model, respectively. To estimate the relative performance of the models in runoff simulation, the evaluation criteria obtained from the ANN and HSPF models were compared for the same periods. ENS, R, ME, RMSE and PWRMSE are the statistical evaluation criteria, shown in
The ENS for the HSPF model ranged from 0.64 to 0.80 for the calibration period and from 0.70 to 0.74 for the validation period. Similarly, the R for the HSPF model ranged from 0.73 to 0.88 for the calibration period and from 0.82 to 0.92 for the validation period. The ME ranged from −1.58 to 1.89 for the calibration period and from −1.15 to −0.84 for the validation period. Also, the RMSE ranged from 1.97 to 4.39 for the calibration period and from 2.09 to 3.86 for the validation period. The PWRMSE for HSPF model ranged from 2.37 to 4.52 for the calibration period and from 2.71 to 3.27 for the validation period.
The ENS for the ANN model ranged from 0.72 to 0.89 for the calibration period and from 0.78 to 0.85 for the validation period. Similarly, the R for the ANN model ranged from 0.86 to 0.93 for the calibration period and from 0.91 to 0.94 for the validation period. The ME ranged from −1.39 to 1.72 for the calibration period and from −0.97 to −0.41 for the validation period. Also, the RMSE ranged from 1.07 to 3.75 for the calibration period and from 1.89 to 2.56 for the validation period. The PWRMSE for ANN model ranged from 1.16 to 3.23 for the calibration period and from 1.23 to 2.67 for the validation period.
The results indicated that both models were generally able to simulate streamflow well during both the calibration and validation periods. However, the values of the evaluation criteria show that the ANN simulated streamflow better than HSPF in both periods. There was a considerable difference between the values of ENS obtained from the ANN and HSPF models for the year 2004 (
Calibration set: 2004-2010; validation set: 2011-2012.
Evaluation criteria | Model | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 |
---|---|---|---|---|---|---|---|---|---|---|
ENS | ANN | 0.82 | 0.85 | 0.89 | 0.86 | 0.89 | 0.87 | 0.72 | 0.78 | 0.85 |
ENS | HSPF | 0.76 | 0.75 | 0.79 | 0.78 | 0.80 | 0.72 | 0.64 | 0.70 | 0.74 |
R | ANN | 0.92 | 0.88 | 0.93 | 0.91 | 0.90 | 0.92 | 0.86 | 0.91 | 0.94 |
R | HSPF | 0.88 | 0.73 | 0.86 | 0.84 | 0.82 | 0.84 | 0.74 | 0.82 | 0.92 |
ME | ANN | −0.14 | −0.85 | 1.12 | −1.39 | 0.89 | 0.41 | 1.72 | −0.41 | −0.97 |
ME | HSPF | −0.32 | −0.95 | 1.49 | −1.58 | 0.95 | 0.67 | 1.89 | −0.84 | −1.15 |
RMSE | ANN | 1.07 | 1.83 | 1.33 | 2.21 | 2.13 | 2.75 | 3.75 | 1.89 | 2.56 |
RMSE | HSPF | 2.19 | 2.47 | 1.97 | 2.85 | 3.84 | 3.77 | 4.39 | 2.09 | 3.86 |
PWRMSE | ANN | 1.16 | 1.48 | 2.11 | 2.72 | 2.89 | 2.12 | 3.23 | 1.23 | 2.67 |
PWRMSE | HSPF | 3.02 | 2.61 | 2.37 | 3.97 | 3.19 | 3.98 | 4.52 | 2.71 | 3.27 |
exhibit a closer scatter to the ideal line, thus indicating good runoff simulation for the Balkhichai River watershed. The scatter for the ANN model is clearly tighter than that of the HSPF model, and its relatively more symmetrical distribution about the ideal line in the figures supports the application of the ANN model. The ANN model was also more successful than HSPF in forecasting peak flows. In general, the results of this study showed that ANNs can be powerful tools in runoff simulation.
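The per-year ENS and RMSE values reported in the results table can be summarized with a short script. The sketch below copies those two criteria from the table; the period means and the year-by-year comparison are derived here for illustration and do not appear in the paper.

```python
YEARS = list(range(2004, 2013))
CALIBRATION = [y for y in YEARS if y <= 2010]
VALIDATION = [y for y in YEARS if y >= 2011]

# Values copied from the results table, one entry per year 2004..2012.
TABLE = {
    "ENS": {
        "ANN": [0.82, 0.85, 0.89, 0.86, 0.89, 0.87, 0.72, 0.78, 0.85],
        "HSPF": [0.76, 0.75, 0.79, 0.78, 0.80, 0.72, 0.64, 0.70, 0.74],
    },
    "RMSE": {
        "ANN": [1.07, 1.83, 1.33, 2.21, 2.13, 2.75, 3.75, 1.89, 2.56],
        "HSPF": [2.19, 2.47, 1.97, 2.85, 3.84, 3.77, 4.39, 2.09, 3.86],
    },
}

def period_mean(criterion, model, years):
    """Average of one criterion for one model over the given years."""
    by_year = dict(zip(YEARS, TABLE[criterion][model]))
    return sum(by_year[y] for y in years) / len(years)

def years_where_ann_better(criterion, higher_is_better):
    """Years in which the ANN value beats the HSPF value."""
    ann, hspf = TABLE[criterion]["ANN"], TABLE[criterion]["HSPF"]
    return [y for y, a, h in zip(YEARS, ann, hspf)
            if (a > h if higher_is_better else a < h)]

if __name__ == "__main__":
    for model in ("ANN", "HSPF"):
        print(model,
              round(period_mean("ENS", model, CALIBRATION), 3),
              round(period_mean("ENS", model, VALIDATION), 3))
    print(years_where_ann_better("RMSE", higher_is_better=False))
```

With these table values the ANN has the higher ENS and the lower RMSE in every year of both periods.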
One known advantage of the HSPF model is that it can produce reliable flow simulations at ungauged sites when climate and soil data are available. The rainfall-runoff relation is affected by climatic parameters and by various physical factors, e.g. slope, elevation, vegetation, soil moisture and groundwater; together these make the relation between rainfall and runoff non-linear and complex. In many watersheds, complete data on these factors are not available. Many physical models such as HSPF have been developed, but because they cannot incorporate all the necessary parameters, they are not as efficient as needed. The increasing use of ANNs, despite their short history, and the reliable results obtained with them point to their growing popularity and promising future.
This paper reports the results of a comparison between two different models for runoff simulation in the Balkhichai River watershed in Iran over the period 2004-2012. The performance of the models in the “calibration and training” and “validation and testing” stages was compared against the observed runoff values to identify the best-fit forecasting model according to a number of selected performance criteria. The comparison shows that the ANN model forecasts runoff better than HSPF. With a good training process and a suitable algorithm and number of nodes, the prediction becomes more accurate. Once the architecture of the network is defined, the weights are calculated through a learning process in which the ANN is trained to reproduce the desired output. The neural network predicted runoff with closer agreement between observed and predicted values than the HSPF model. ANNs are capable of daily runoff simulation; however, a slight overestimation is observed at low flows. Unlike process-based hydrological models, the ANN does not require watershed information and other physical parameters in the modeling process, which reduces the complexity of modeling the system. The time required to calibrate the ANN is much less than for HSPF, calibration needs less expertise and experience, and less data are required for simulation. If a number of scenarios are to be investigated to study the response of the catchment, HSPF may prove advantageous over ANNs. Another advantage of the HSPF model is that it can produce reliable runoff simulations at ungauged sites when climate and soil data are available.
In Iran, flow and precipitation records are relatively easier to obtain through governmental online resources than the physical characteristics of river basins, such as soil moisture, soil classes, groundwater level, infiltration and evaporation. Black-box models may therefore emerge as a faster tool to implement for flow forecasting. The results of this study can inform future comparisons of HSPF and ANN in daily simulation, specifically with respect to runoff prediction performance.