Comparison of Two Recurrent Neural Networks for Rainfall-Runoff Modeling in the Zou River Basin at Atchérigbé (Bénin)
1. Introduction
Rainfall-runoff modelling has been a central issue in hydrological research for several decades and has produced a large number of models in the literature. Following Beck (1991), these models can be classified as metric, conceptual, or physics-based. Another distinction in the literature relates to the level of prior knowledge available, which leads to three color-coded types of models: white, grey and black box. In the first case, the model is perfectly known; in the second, some physical insight is available, but several parameters still need to be determined from data (Carcano et al., 2006). In black-box models, no physical insight is available, and the structure of the model is chosen from families that show good flexibility and have been successfully employed in the past (Sjöberg et al., 1995). Recurrent Neural Networks (RNN) are one of these families and have been widely investigated in hydrology since the mid-1990s. They are a type of deep learning model suited to time series modelling (Yokoo et al., 2021).
One type of RNN, the Long Short-Term Memory (LSTM) network, has great potential for modelling time series with long-term dependencies, and it has therefore been applied to rainfall-runoff modelling. Kratzert et al. (2018) used meteorological data such as precipitation, air temperature, and radiation as input to model discharge at multiple watersheds in the United States. Furthermore, Kao et al. (2020), Li et al. (2020), and Xiang et al. (2020) applied the encoder-decoder version of LSTM to flow prediction. Their results show the high applicability of LSTM to rainfall-runoff modelling. Although LSTM has an advantage in accuracy, it has a disadvantage compared with the traditional RNN: because of its more complex structure, it requires much more computational resources. To address this issue, another type of RNN with a simpler structure, the Gated Recurrent Unit (GRU), was developed by Cho et al. (2014). Jeong and Park (2019) applied GRU and LSTM to groundwater level modelling, and Zohou et al. (2023) used LSTM and GRU in the Ouémé River basin at the Savè outlet in Bénin. In these studies, the accuracy of GRU is compared with that of LSTM.
To date, relatively few studies have used RNN rainfall-runoff models in the Zou River basin, and a clear picture of their performance is lacking. Furthermore, the hydrological models generally used in the study region struggle to adequately simulate high flows. The Zou River is one of the main tributaries of the Ouémé River, which is the most important river in the Republic of Bénin. To fill this gap, the present study examines river flow simulation using LSTM and GRU. To this end, the hyperparameters of the models are first optimized; the river discharge at the outlet of the catchment is then simulated; and finally, the performance of the two RNN models is evaluated.
2. Materials and Methods
2.1. Study Area and Data Used
The Zou basin at Atchérigbé is located between latitudes 7°14'30'' and 8°33'52'' North and longitudes 1°30'58'' and 2°13'32'' East and covers an area of 8491 km² (Figure 1). It lies in central-western Benin and extends slightly into Togolese territory (2.24%). It covers, entirely or in part, four municipalities of the Collines (Hills) Department (Bantè, Glazoué, Savalou and Dassa-Zoumè) and six municipalities of the Zou Department (Djidja, Za-Kpota, Bohicon, Covè, Zagnanado and Ouinhi). The climate in this area of central Benin is intermediate between the sub-equatorial climate of the coast and the Sudano-Sahelian climate of northern Benin (Houssou, 1998). It is essentially an area where the influences of the southwest monsoon and of the continental trade wind from the northeast, the harmattan, meet.
Figure 1. Geographical location of the Zou basin at Atchérigbé.
The precipitation data come from Météo-Bénin (the National Meteorological Agency of Benin), while the river discharge data were provided by the National Directorate of Water (DG-Eau). The study area contains seven rainfall stations (Savè, Ouesse, Kokoro, Tchaourou, Bassila, Penessoulou, Toui). The period 1988 to 2010 was chosen for the study as a good compromise given the length of the available records: after 2010, some stations in the investigated catchment were not well monitored.
2.2. Methods
2.2.1. Data Preprocessing
Before the data were fed into the LSTM and GRU models, a few transformations were applied: data normalization and the conversion of the time series into a supervised learning series. Normalization and standardization were used to reduce the complexity of the LSTM and GRU models (Le, 2020).
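The conversion of a time series into a supervised learning series can be sketched as a sliding window. The text does not give the window length or the exact pairing of inputs and targets, so the function name `make_supervised`, the parameter `n_lags`, and the choice of the discharge on the last day of the window as the label are assumptions made for illustration only.

```python
def make_supervised(inputs, target, n_lags):
    """Turn aligned daily series into (window, label) training pairs.

    inputs : list of per-day feature tuples, e.g. (precipitation, evapotranspiration)
    target : list of per-day discharge values, same length as inputs
    n_lags : number of consecutive days in each window (hypothetical choice)
    """
    if len(inputs) != len(target):
        raise ValueError("inputs and target must have the same length")
    if n_lags < 1 or n_lags > len(inputs):
        raise ValueError("n_lags must be between 1 and the series length")
    samples, labels = [], []
    for end in range(n_lags, len(inputs) + 1):
        # The window covers days end-n_lags .. end-1 (assumed pairing);
        # the label is the discharge on the last day of the window.
        samples.append(inputs[end - n_lags:end])
        labels.append(target[end - 1])
    return samples, labels


# Toy series: four days of (precipitation, evapotranspiration) and discharge.
toy_inputs = [(1.0, 0.1), (2.0, 0.2), (3.0, 0.3), (4.0, 0.4)]
toy_target = [10.0, 20.0, 30.0, 40.0]
toy_samples, toy_labels = make_supervised(toy_inputs, toy_target, n_lags=2)
```

With four days of data and `n_lags=2`, the function yields three samples, so a series of N days always produces N - n_lags + 1 training pairs.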
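Normalization scales each input variable (precipitation and evapotranspiration) separately to the range 0 - 1, the range of floating-point values where we have the most precision:
$$x_{\text{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \qquad (1)$$
where $x$ is the raw value and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the variable.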
Standardization rescales the output variable (discharge) by subtracting the mean (called centering) and dividing by the standard deviation, so that the distribution has a mean of zero and a standard deviation of one (Sun et al., 2021).
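The two scalings described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and fitting the scaling constants on the training data only (then reusing them for validation and test data) is an assumption that the text does not state.

```python
from statistics import mean, pstdev


def fit_minmax(values):
    """Return the (min, max) pair used for 0-1 normalization."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant series cannot be min-max scaled")
    return lo, hi


def minmax_scale(values, lo, hi):
    """Map values to the 0-1 range defined by lo and hi."""
    return [(v - lo) / (hi - lo) for v in values]


def fit_standard(values):
    """Return the (mean, population standard deviation) pair."""
    return mean(values), pstdev(values)


def standardize(values, mu, sigma):
    """Center on mu and scale by sigma."""
    return [(v - mu) / sigma for v in values]


def destandardize(values, mu, sigma):
    """Invert standardize(), e.g. to express predictions in discharge units."""
    return [v * sigma + mu for v in values]


# Toy values (not from the study): fit on "training" data, apply to new data.
rain_lo, rain_hi = fit_minmax([0.0, 5.0, 20.0])
scaled_rain = minmax_scale([10.0, 25.0], rain_lo, rain_hi)

flow_mu, flow_sigma = fit_standard([2.0, 4.0, 6.0])
scaled_flow = standardize([2.0, 4.0, 6.0], flow_mu, flow_sigma)
restored_flow = destandardize(scaled_flow, flow_mu, flow_sigma)
```

When the constants come from the training period only, a later value larger than the training maximum maps above 1, as the second element of `scaled_rain` shows.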
The hydrometeorological data are divided into three parts for the training, validation, and testing of the LSTM and GRU models (Table 1).
The first data set is used to train the models. It covers 60% of the dataset (01-01-1988 to 31-12-2001) and is used to learn the weights of the neurons that make up the network.
The second data set is used to validate the model parameters (validation set). It represents 20% of the dataset (01-01-2002 to 31-12-2005). This sample provides an unbiased evaluation of a model fitted on the training set while the hyperparameters of the models are tuned.
The third data set is used to test the real performance of the models. It also represents 20% of the dataset (01-01-2006 to 17-10-2010). This test sample is used only after the model is fully trained (using the training and validation sets). It provides an unbiased assessment of the final model fitted on the training dataset.
Table 1. Dataset split.
| Phase | Percentage | Period |
|---|---|---|
| Training set | 60% | 01-01-1988 to 31-12-2001 |
| Validation set | 20% | 01-01-2002 to 31-12-2005 |
| Test set | 20% | 01-01-2006 to 17-10-2010 |
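The chronological split in Table 1 can be checked with a short sketch that counts the daily records in each period. The period boundaries are those of Table 1; the helper names are hypothetical, and the sketch assumes one record per calendar day with no gaps.

```python
from datetime import date

# Period boundaries taken from Table 1 (inclusive on both ends).
PERIODS = {
    "training": (date(1988, 1, 1), date(2001, 12, 31)),
    "validation": (date(2002, 1, 1), date(2005, 12, 31)),
    "test": (date(2006, 1, 1), date(2010, 10, 17)),
}


def n_days(start, end):
    """Number of daily records between two dates, both included."""
    return (end - start).days + 1


def split_shares(periods):
    """Return each period's share of the total number of days."""
    counts = {name: n_days(*bounds) for name, bounds in periods.items()}
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


def assign_phase(day, periods=PERIODS):
    """Return the phase a date belongs to, or None if it is outside the record."""
    for name, (start, end) in periods.items():
        if start <= day <= end:
            return name
    return None


shares = split_shares(PERIODS)
```

Under the no-gap assumption, the three periods contain 5114, 1461 and 1751 days, i.e. about 61%, 18% and 21% of the record, which is consistent with the rounded 60/20/20 split reported in Table 1.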
2.2.2. Construction and Validation of Forecasting Models
An artificial neural network is an assembly of identical structural elements called cells (or neurons), interconnected like the cells of the vertebrate nervous system. When information propagates through the network from one layer to the next, the network is said to be of the “feed-forward” type (Riad et al., 2004). We distinguish three types of layers:
Input layer: the neurons in this layer receive the input values of the network and pass them on to the hidden neurons. Each neuron receives a single value, so it performs no summation.
Hidden layers: each neuron of such a layer receives information from several neurons of the previous layer, computes the weighted sum of these inputs, and transforms it with its activation function, which is generally a sigmoid function (Xiang et al., 2020), considered well suited to hydrological modelling. It then sends this response to the neurons of the next layer.
Output layer: its neurons play the same role as those of the hidden layers; the only difference is that their output is not linked to any other neuron.
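The computation performed by a hidden neuron, a weighted sum followed by a sigmoid activation, can be written out as follows. The function names and the toy weights are illustrative assumptions, not values from the study.

```python
import math


def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


def layer_output(inputs, weight_rows, biases):
    """Apply every neuron of a layer to the same input vector."""
    return [neuron_output(inputs, w, b) for w, b in zip(weight_rows, biases)]


# Toy hidden layer with two neurons and two inputs.
hidden = layer_output([1.0, 2.0], [[0.5, -0.25], [0.0, 0.0]], [0.0, 0.0])
```

An input-layer neuron would simply forward its value, and an output neuron uses the same weighted sum, possibly with a different activation.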
2.2.3. Recurrent Neural Networks
A recurrent neural network is an artificial neural network with recurrent connections. It consists of interconnected units (neurons) that interact non-linearly and whose structure contains at least one cycle. The units are connected by arcs (synapses), each of which has a weight. The output of a neuron is a nonlinear combination of its inputs. Recurrent neural networks are suitable for time series analysis.
The Long Short-Term Memory (LSTM) neural network (Hochreiter & Schmidhuber, 1997) is the recurrent architecture most widely used in practice to address the vanishing gradient problem. The idea behind LSTM is that each computational unit is associated with a hidden state h and a cell state c, which acts as a memory. The transition from $c_{t-1}$ to $c_t$ is done by a constant gain transfer equal to one (Abbot & Marohasy, 2014). In this way, errors are propagated to previous steps (up to 1000 steps in the past) without the gradient vanishing. The state of the cell can be modified through a gate that allows or blocks the update (input gate). Similarly, a gate controls whether the cell state is communicated at the output of the LSTM unit (output gate). The most common version of LSTM also uses a forget gate to reset the cell state.
The LSTM architecture is shown in Figure 2 (Hochreiter & Schmidhuber, 1997).
The different formulas for each gate (forget gate, input gate, output gate) are presented below:
(2)
(3)
(4)
(5)
(6)
(7)
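For reference, a commonly used formulation of the LSTM unit with a forget gate is given below. The notation is an assumption and may differ from that of Equations (2)-(7): $x_t$ is the input at time $t$, $h_t$ the hidden state, $c_t$ the cell state, $W$, $U$ and $b$ are learned weights and biases, $\sigma$ is the sigmoid function and $\odot$ the element-wise product.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) && \text{(input gate)} \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)} \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) && \text{(output gate)} \\
h_t &= o_t \odot \tanh\left(c_t\right) && \text{(hidden state)}
\end{aligned}
```

In this form, setting $f_t = 1$ and $i_t = 0$ gives $c_t = c_{t-1}$, which is the unit-gain transfer of the cell state described above.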
Figure 2. LSTM network architecture.
The GRU network is a variant of LSTM (Chung et al., 2014). GRU networks perform comparably to LSTM for time series prediction, and a GRU unit has fewer parameters to learn than an LSTM unit. A neuron is associated with only one hidden state, and the input and forget gates are merged into a single gate (Fang & Shao, 2022). The output gate is replaced by a reset gate. The architecture of the GRU network is given in Figure 3. In both the LSTM and GRU models, the input data are precipitation and evapotranspiration, while the output is the predicted river flow.
Figure 3. GRU network architecture.
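The statement that a GRU unit has fewer parameters than an LSTM unit can be made concrete by counting weights. The sketch below assumes the textbook formulations: an LSTM layer has four weight blocks (three gates plus the candidate state) and a GRU layer has three (two gates plus the candidate state), each block holding input weights, recurrent weights and one bias vector. Some software implementations add extra bias terms, so the counts reported by a given library may differ slightly.

```python
def block_param_count(n_inputs, n_units):
    """Parameters of one gate-like block: input weights, recurrent weights, bias."""
    return n_units * (n_inputs + n_units) + n_units


def lstm_param_count(n_inputs, n_units):
    """Textbook LSTM layer: forget, input and output gates plus candidate state."""
    return 4 * block_param_count(n_inputs, n_units)


def gru_param_count(n_inputs, n_units):
    """Textbook GRU layer: update and reset gates plus candidate state."""
    return 3 * block_param_count(n_inputs, n_units)


# Two inputs (precipitation, evapotranspiration) and 100 recurrent units.
n_lstm = lstm_param_count(2, 100)
n_gru = gru_param_count(2, 100)
```

For the same number of units and inputs, the GRU layer therefore has three quarters of the parameters of the LSTM layer under these assumptions.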
2.2.4. Hyperparameters Optimization for LSTM and GRU Models
When constructing recurrent neural network models, we face the choice of hyperparameters. A hyperparameter is a parameter whose value controls the learning process; hyperparameters are the tuning parameters of machine learning algorithms. Because the hyperparameters of an artificial neural network influence the performance of the model, the number of units in the LSTM and GRU layers, the batch size, and the learning rate of the optimizer were selected for optimization. Optimizing the hyperparameters of an LSTM or GRU model involves searching for the set of model configuration arguments that gives the best model performance on a specific data set. The hyperparameters to be optimized during the training phase of the LSTM and GRU models are:
Number of units: this must be chosen reasonably to find a trade-off between high bias and high variance. It depends on the size of the data set used for training.
Learning rate: this hyperparameter controls the speed of the gradient descent; depending on its value, more or fewer iterations are needed before the algorithm converges, i.e., before optimal learning of the network is achieved.
Batch size: the number of samples passed to the network at one time, also commonly referred to as a mini-batch. If the batch size is small, the patterns seen in each update are less representative and convergence can become difficult. If the batch size is large, learning is slow because the weights are updated only after many samples have been processed.
Number of epochs: the number of times the whole training set is presented to the model. It plays an important role in how well the model fits the training data.
The architectures of the recurrent neural network models developed consist of three layers, namely:
An input layer made up of vectors comprising the values of the input variables (precipitation and evapotranspiration);
A hidden layer (LSTM or GRU) composed of 100 units;
An output layer composed of a single neuron that predicts the value of the flow.
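The interplay between batch size and number of epochs described above can be illustrated by counting weight updates. The sketch assumes one update per mini-batch and a hypothetical training set of 5114 samples (the number of days in the training period, ignoring any samples lost to windowing); the batch sizes used here are illustrative, not the tuned values.

```python
import math


def updates_per_epoch(n_samples, batch_size):
    """Number of mini-batches, and hence weight updates, in one epoch."""
    return math.ceil(n_samples / batch_size)


def total_updates(n_samples, batch_size, n_epochs):
    """Weight updates performed over the whole training run."""
    return updates_per_epoch(n_samples, batch_size) * n_epochs


# Hypothetical training-set size and illustrative batch sizes.
small_batch_updates = total_updates(5114, batch_size=32, n_epochs=100)
large_batch_updates = total_updates(5114, batch_size=128, n_epochs=100)
```

For a fixed number of epochs, a larger batch size means fewer weight updates, which is one reason why batch size and number of epochs are usually tuned together.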
The optimizer used is Adam. Kingma and Ba (2014) list the following benefits of Adam for non-convex optimization problems: it is straightforward to implement; it is computationally efficient; it has low memory requirements; it is invariant to diagonal rescaling of the gradients; it is well suited to problems that are large in terms of data and/or parameters; and its hyperparameters have an intuitive interpretation and typically require little tuning.
The loss function chosen is the root mean square error. For the training phase of the LSTM and GRU models, the number of epochs was set to 100 so that the models are compared on the same basis. Model evaluation was performed using the test dataset. We evaluated the models by analyzing the curve of the loss function against the number of epochs (Vannieuwenhuyze, 2019).
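The Adam update mentioned above can be sketched for a single scalar parameter, following the update rule of Kingma and Ba (2014). The default values of `lr`, `beta1`, `beta2` and `eps` are those suggested in that paper; the function name and the toy gradients are assumptions for illustration.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update and return the new (theta, m, v).

    theta : current parameter value
    grad  : gradient of the loss with respect to theta
    m, v  : running first and second moment estimates (start at 0.0)
    t     : 1-based step counter, used for bias correction
    """
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    m_hat = m / (1.0 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1.0 - beta2 ** t)  # bias-corrected second moment
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# First step from the same starting point with two very different gradient scales.
theta_a, _, _ = adam_step(5.0, 10.0, 0.0, 0.0, t=1, lr=0.0017)
theta_b, _, _ = adam_step(5.0, 10000.0, 0.0, 0.0, t=1, lr=0.0017)
```

Both calls move the parameter by almost exactly the learning rate, which illustrates the invariance to gradient rescaling listed among the benefits of Adam.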
To assess the performance of the LSTM and GRU models, the Nash-Sutcliffe efficiency (NSE), the coefficient of determination (R²), and the root mean squared error (RMSE) are used.
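The three criteria can be computed as in the sketch below. The text does not give their formulas, so the usual hydrological definitions are assumed: NSE compares the squared errors with the variance of the observations, R² is taken as the squared Pearson correlation between observed and simulated series, and RMSE is expressed in the units of the series on which it is computed. The function names and toy values are illustrative.

```python
import math


def rmse(obs, sim):
    """Root mean squared error between observed and simulated values."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    squared_errors = sum((o - s) ** 2 for o, s in zip(obs, sim))
    obs_variability = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_errors / obs_variability


def r_squared(obs, sim):
    """Coefficient of determination as the squared Pearson correlation (assumed)."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov * cov / (var_obs * var_sim)


# Toy series (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.1, 1.9, 3.2, 3.8]
scores = {
    "NSE": nse(observed, simulated),
    "R2": r_squared(observed, simulated),
    "RMSE": rmse(observed, simulated),
}
```

With these definitions, NSE and R² can differ for the same simulation, because R² ignores systematic bias while NSE penalizes it.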
3. Results and Discussion
3.1. Models Training and Validation
Figure 4 and Figure 5 show the evolution of the loss function during the training of the LSTM and GRU models, respectively, against the number of epochs. The error during the training and test phases converges towards 0.1 after about 2 epochs for the LSTM model; for the GRU model, it converges after about 1 epoch in the test phase and 2 epochs in the training phase. This suggests that these machine learning models reach good results at a low computational cost.
Figure 4. Loss evolution curve during the training and validation for LSTM model.
Figure 5. Loss evolution curve during the training and validation for GRU model.
3.2. Hyperparameters Tuning Values
Figure 6 and Figure 7 show the values of the selected hyperparameters after optimization.
Table 2 gives a summary of the selected values of hyperparameters after optimization.
Table 2. Hyperparameter values.
| Models | Learning rate | Number of units | Number of epochs | Batch size |
|---|---|---|---|---|
| LSTM | 0.0017 | 79 | 13 | 74 |
| GRU | 0.01 | 85 | 35 | 90 |
Figure 6. Value of LSTM model hyperparameters.
Figure 7. Value of GRU model hyperparameters.
Both recurrent neural network models perform better with lower learning rates and a number of units smaller than 100. The number of epochs and the batch size have less influence on the models, although a higher number of epochs slightly improves predictions. Both models obtained good results in calibration and validation; the performance after the training phase is given in Table 3.
In calibration, the NSE and R² values largely exceed the acceptable thresholds in hydrology proposed by Moriasi et al. (2015). Similarly, the root mean square error is close to 0 (Table 3).
Table 3. Performance criteria of the models in calibration.
| Performance criteria | LSTM | GRU |
|---|---|---|
| R² | 0.888 | 0.9 |
| NSE | 0.886 | 0.9 |
| RMSE | 0.42 | 0.397 |
3.3. Simulation with LSTM and GRU Models
After the training phases, the river discharge was simulated with the two models (Figure 8 and Figure 9). Recession periods were generally well represented. However, the uncertainties associated with the peaks are greater than those associated with low flows. The less accurate prediction of peaks may be partly due to measurement errors during the exceptional flood years (2007 and 2010), in which overbank discharge was observed at the gauging station.
The performance of the models is given in Table 4. It shows that GRU performed slightly better than LSTM for the simulation of river discharge in the Zou basin at Atchérigbé.
Table 4. Performance criteria of the models in validation.
| Performance criteria | LSTM | GRU |
|---|---|---|
| R² | 0.865 | 0.9 |
| NSE | 0.851 | 0.865 |
| RMSE | 0.329 | 0.301 |
Figure 8. River discharge simulated with LSTM neural network.
Figure 9. River discharge simulated with GRU neural network.
Relatively few studies have investigated flow prediction in the Zou basin at Atchérigbé. Bossa et al. (2014) simulated daily discharge using the SWAT model. They calibrated the model over the period 2007-2008 and found a coefficient of determination R² of around 0.89, while the validation, carried out over the period 2001-2006, gave an R² of about 0.71. Sintondji et al. (2018) implemented a physics-based model (SWAT) to represent more reliably the physical processes, climate and human influences in the estimation of water balance and soil loss in this basin. Their results gave an R² of around 0.79 in calibration and 0.87 in validation using monthly data. The LSTM and GRU models used in the present study, which rely on daily data, perform better than the SWAT model used in that study. Hounkpè and Diekkrüger (2018) calibrated and validated a distributed model (WaSiM) to evaluate water resources and flood hazard in the Zou catchment, Benin, for the period 1991-2009. Their results showed that the model performance was acceptable given the uncertainties in discharge measurement, mainly for peak discharge. However, the values of their performance criteria remain lower than those obtained in the present study. Agon (2016) investigated the impact of rainfall variability on water resources in the Zou basin. He used the GR4J model and found R² = 0.65 and NSE = 0.76 in calibration and R² = 0.69 and NSE = 0.83 in validation.
For time-series problems, RNN-based models have proved effective at complex tasks. The almost equally good performance of the two models in simulating river flow was also shown by Le et al. (2021). This can be explained by the fact that the GRU architecture is a simpler variant of the LSTM architecture.
Furthermore, the good performance of LSTM and GRU may be related to the fact that the stochastic nature of precipitation data is better taken into account by RNN models than by statistical models. The use of these RNN models can therefore be extended to other basins, such as the Ouémé River basin, to better address the flood-related challenges at the Bonou outlet of that basin.
4. Conclusion
The main contribution of this paper was to investigate the potential of LSTM and GRU recurrent neural network models to simulate river flow in the Zou River basin at the Atchérigbé outlet. The trained and evaluated models achieved high accuracy and efficiency, and the GRU model obtained slightly better results than the LSTM model. Although these models performed well in simulating river flow, the role of hydrological models in the physical simulation of rainfall-runoff processes cannot be ignored. It would therefore be interesting to investigate hybrid models that combine supervised learning models and hydrological models to better solve problems in water resources management. Future studies should investigate the transferability of the trained models to other catchments with different hydro-climatic characteristics.
Acknowledgements
The authors thank researchers and institutions who provided datasets for this work.