
The objective of this study is to predict groundwater levels (GWLs) under different impact factors using an Artificial Neural Network (ANN) for a case study in Tra Noc Industrial Zone, Can Tho City, Vietnam. This was achieved by evaluating the current state of groundwater resources (GWR) exploitation, use and dynamics; setting up, calibrating and validating the ANN; and then predicting GWLs at different lead times. The results show that GWLs in the study area declined rapidly from 2000 to 2015, especially in the Middle-upper Pleistocene (qp2-3) and upper Pleistocene (qp3) aquifers, due to over-withdrawal by enterprises for production purposes. In response, an Official Letter of the People’s Committee of Can Tho City was issued and came into enforcement in 2012, resulting in a reduction of exploitation. The calibrated ANN structures demonstrate that GWLs can be predicted by considering different impact factors. The predicted results will help to raise awareness and draw the attention of local and central government to the need for a clear GWR management policy for the Mekong Delta, especially for industrial zones in urban areas such as Can Tho City.

Groundwater resources (GWR) play an important role in supplying water for domestic use and production to millions of people in the Mekong Delta [

There have been many studies of GWR dynamics using hydrogeological or statistical models. For instance, Radu Goru et al. (2001) [

In the Mekong River basin, So Kazama et al. (2007) [

Nguyen Tieng Vang and Tran Van Ty (2017) [

Artificial Neural Networks (ANNs) are among the most widely used tools for groundwater prediction, and many studies have applied them to predicting GWLs. Suja and Sindhu (2016) [

The objective of this study is to predict GWLs under different impact factors using an ANN for a case study in Tra Noc Industrial Zone, Can Tho city. This is achieved by evaluating the current state of GWR exploitation, use and dynamics; setting up, calibrating and validating the ANN for GWLs; and then predicting GWLs at different lead times.

Can Tho city is the youngest and largest urban area in the Mekong Delta, including 8 industrial zones with a total area of over 2366 ha. These industrial zones are located along the national highways and the Bassac River, which is one of the two branches of the Mekong River after it enters Vietnam. Industrial activities have caused serious environmental problems such as pollution of water sources, microbial contamination and subsidence. Tra Noc Industrial Zone has been developed since the 1990s and comprises Tra Noc 1 Industrial Zone (Tra Noc Ward, Binh Thuy District) and Tra Noc 2 Industrial Zone (Phuoc Thoi Ward, O Mon District), with a total planning area of 300 hectares (

Currently, there are 16 groundwater resources (GWR) monitoring stations/wells in Can Tho city, of which two stations (QT08 and QT16) are located in the study area. At each station, there are 3 monitoring wells in 3 aquifers at different depths (Middle-Upper Pleistocene (qp2-3), Upper Pleistocene (qp3) and Holocene floor (qh)). From 2000 to 2015, the GWLs of the Pleistocene aquifers (qp3 and qp2-3) in the Tra Noc Industrial Zone declined rapidly, whereas the groundwater levels (GWLs) in the Holocene aquifer remained relatively stable.

Data were collected on rainfall at Can Tho station, river water levels at two stations, average withdrawal discharge for industrial use, and observed GWLs in the Pleistocene aquifers (qp2-3 and qp3 layers) at different monitoring wells. Data and their sources are presented in

An Artificial Neural Network (ANN) consists of input, hidden and output layers, and each layer includes an array of processing nodes. An ANN is characterized by its structure (the pattern of connections between nodes), its connection weights, and its activation function. ANN models were developed using different combinations of the input parameters, and the best combination was selected based on the performance statistics.

Observed groundwater level (GWL) data were first used to initialize the ANN model at a given time so that it could reproduce water level variations from the input variables (rainfall, river water levels and withdrawal discharge from pumping). The ANN structures selected by trial and error were first calibrated on a training dataset to perform 1-, 2- and 3-month-ahead predictions of GWLs using past observed GWLs and the input variables. Simulations were then produced on a separate dataset by iteratively feeding back the predicted GWLs along with observed data.

To develop ANN, the neural network toolbox from the Visual Gene Developer (http://www.visualgenedeveloper.net/) [

Data pre-processing was carried out for analyzing and transforming the input and output variables to minimize noise and to highlight important relationships. The raw data were normalized between zero and one (unitless).

No. | Data (monthly basis) | Year | Sources |
---|---|---|---|
1 | Observation wells | 2004-2015 | Department of Natural Resources and Environment (DONRE) Can Tho city |
2 | Withdrawal discharge | 2004-2015 | |
3 | Groundwater level | 2004-2015 | |
4 | Rainfall | 2004-2015 | Center for Environment and Natural Resources of Can Tho city |
5 | Water level (in Bassac river) | 2004-2015 | |

Pre-processing:

$$y'_t = \frac{0.9\,(y_t - a)}{b - a} + 0.05 \tag{1}$$

Post-processing:

$$y_t = \frac{(b - a)\,(y'_t - 0.05)}{0.9} + a \tag{2}$$

where y_{t} is the observed data; a and b are the minimum and maximum values of the observed data, respectively; and y′_{t} is the normalized value of the observed data.
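The scaling in Equations (1) and (2) can be illustrated with a short Python sketch. The function names and the sample water levels are hypothetical and are used only for illustration; they are not taken from the monitoring data or from the software used in this study. Note that Equation (1) maps the minimum value a to 0.05 and the maximum value b to 0.95.

```python
def normalize(y, a, b):
    """Equation (1): scale an observed value y with minimum a and maximum b."""
    return 0.9 * (y - a) / (b - a) + 0.05


def denormalize(y_norm, a, b):
    """Equation (2): map a normalized value back to the original units."""
    return (b - a) * (y_norm - 0.05) / 0.9 + a


# Hypothetical groundwater levels in metres (illustrative values only).
levels = [-12.4, -10.1, -8.7, -11.3]
a, b = min(levels), max(levels)

scaled = [normalize(y, a, b) for y in levels]
restored = [denormalize(s, a, b) for s in scaled]
```

Applying `denormalize` to the output of `normalize` recovers the original values up to floating-point rounding, which is how network outputs would be converted back to metres.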

The structure of ANN is determined by trial and error. The number of nodes in the hidden layers and the stopping criteria were optimized in terms of obtaining precise and accurate output. The activation function of the hidden/output layers was set to a sigmoid function as this proved by trial and error to be the best in depicting the non-linearity of the modeled natural system, among a set of other options. There is no well-established direct method for selecting the number of hidden nodes for an ANN model for a given problem. Thus the common trial-and-error approach remains the most widely used method [

There are many kinds of neural networks, depending on their structure, function and training method. A typical feedforward neural network trained with a back-propagation learning algorithm was used. The computation at a typical node is presented below:

$$N = \sum_{i} w_i x_i, \qquad O = f(N) = \frac{1}{1 + e^{-N}} \tag{3}$$

where x_{i} is the input vector, O is the output vector, w_{i} is a weight factor between two nodes and f(N) is an activation function. Among the different kinds of activation functions, the sigmoid was used in this study. The back propagation learning algorithm is based on a generalized delta-rule accelerated by a momentum term [

To improve the performance of the network, the weight factors were adjusted using the following equation:

$$w_{ij}^{new} = w_{ij}^{old} + \eta \cdot \sum_{p} \delta_{pj}\, O_{pi} + \alpha \cdot \Delta w_{ij}^{old} \tag{4}$$

where η is the learning rate; α is the momentum coefficient; Δw is the previous weight factor change; O is the output; δ is the gradient-descent correction term; and p stands for pattern. The learning rate (η) and the momentum coefficient (α) were randomly generated from 0.01 to 1 and from 0 to 1, respectively.

Input variables | Pleistocene (qp2-3 and qp3) |
---|---|
Groundwater level (stations) | W(t), W(t-1), W(t-2), W(t-3) at QT08, QT16; W(t), W(t-1) at QT09 |
Rainfall (Can Tho station) | R(t) |
Water level (Bassac river) | WL(t) at Can Tho and Long Xuyen station |
GWR withdrawal | Tra Noc Industrial Zone |
Predicted GWL stations | QT08, QT16 |
Lead time (month) | 1, 2 and 3 |
ANN structures | 14-15-1 for qp2-3 and 12-15-1 for qp3 |

Total input nodes: 8 to 14; total output nodes: 1. ANN structures were tested with various numbers of hidden layers (1 to 5) and hidden nodes (5 to 15) to select the best structure. The optimum structures for qp2-3 and qp3 are 14-15-1 and 12-15-1 (input-hidden-output nodes), respectively.

The back propagation algorithm is applied as follows:

1) Normalize the training data and initialize all weights (normally to small random values between minus one and one);

2) Compute the output of neurons in the hidden layer and in the output layer;

3) Compute the error and the corresponding weight corrections;

4) Update all weights and repeat steps 2 and 3 for all training data;

5) Repeat steps 2 to 4 until the error converges to an acceptable level.
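The steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation in Visual Gene Developer: it uses a single hidden layer with three nodes, a constant input of 1.0 as a bias term, batch updates following Equation (4), and a tiny made-up training set that is assumed to be already normalized. All function and variable names are hypothetical.

```python
import math
import random


def sigmoid(n):
    """Sigmoid activation from Equation (3)."""
    return 1.0 / (1.0 + math.exp(-n))


def forward(x, w_hid, w_out):
    """Step 2: outputs of the hidden nodes and of the single output node.
    A constant 1.0 is appended as a bias input (an assumption)."""
    h = [sigmoid(sum(w * v for w, v in zip(ws, x + [1.0]))) for ws in w_hid]
    o = sigmoid(sum(w * v for w, v in zip(w_out, h + [1.0])))
    return h, o


def train(data, n_hidden=3, eta=0.5, alpha=0.5, epochs=1000, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # Step 1: small random initial weights between -1 and 1.
    w_hid = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    dw_hid = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
    dw_out = [0.0] * (n_hidden + 1)
    errors = []
    for _ in range(epochs):  # Step 5: repeat (fixed epoch count here).
        g_hid = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        g_out = [0.0] * (n_hidden + 1)
        sse = 0.0
        for x, target in data:
            h, o = forward(x, w_hid, w_out)
            err = target - o  # Step 3: error for this pattern.
            sse += err * err
            d_out = err * o * (1.0 - o)  # Delta term at the output node.
            for j, hj in enumerate(h + [1.0]):
                g_out[j] += d_out * hj  # Sum over patterns of delta * O.
            for j in range(n_hidden):
                d_hid = d_out * w_out[j] * h[j] * (1.0 - h[j])
                for i, xi in enumerate(x + [1.0]):
                    g_hid[j][i] += d_hid * xi
        # Step 4: Equation (4), new change = eta * sum + alpha * previous change.
        for j in range(n_hidden + 1):
            dw_out[j] = eta * g_out[j] + alpha * dw_out[j]
            w_out[j] += dw_out[j]
        for j in range(n_hidden):
            for i in range(n_in + 1):
                dw_hid[j][i] = eta * g_hid[j][i] + alpha * dw_hid[j][i]
                w_hid[j][i] += dw_hid[j][i]
        errors.append(sse / len(data))
    return w_hid, w_out, errors


# Made-up, already normalized training patterns: (inputs, target).
data = [([0.1, 0.2], 0.2), ([0.4, 0.3], 0.4), ([0.8, 0.7], 0.8), ([0.6, 0.9], 0.7)]
w_hid, w_out, errors = train(data)
_, prediction = forward([0.8, 0.7], w_hid, w_out)
```

The list `errors` holds the mean squared error per epoch, so it can be inspected to decide when the error has converged to an acceptable level instead of using a fixed number of epochs.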

The performance of the trained network was checked by determining the error between the predicted value and the observed one.

Available data were divided into two distinct sets, namely the training/calibration and validation sets. As the training set is used by the neural network to learn the patterns present in the data, 70% of the data were allocated to the calibration set (2004-2012) and 30% to the validation set (2013-2015). In this study, the networks were selected based on best performance on the training set, and a final check on the performance of the trained network was made using the validation set.
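A chronological split of this kind can be sketched as below. The record layout (year, month, value) and the function name are assumptions made for illustration, and the placeholder records stand in for the actual monthly series, which may contain gaps.

```python
def split_by_year(records, last_calibration_year=2012):
    """Split (year, month, value) records into calibration and validation sets."""
    calibration = [r for r in records if r[0] <= last_calibration_year]
    validation = [r for r in records if r[0] > last_calibration_year]
    return calibration, validation


# Placeholder monthly records for 2004-2015; the values are left empty.
records = [(year, month, None) for year in range(2004, 2016) for month in range(1, 13)]
calibration, validation = split_by_year(records)
```

Splitting by date rather than at random keeps the validation period strictly after the calibration period, which matches how the predictions would be used in practice.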

Three different criteria were used in order to evaluate the suitable networks and their abilities to produce accurate predictions.

The Root Mean Square Error (RMSE):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (X_i - Y_i)^2} \tag{5}$$

Efficiency Index:

$$EI = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2 - \sum_{i=1}^{n} (X_i - Y_i)^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \tag{6}$$

The R efficiency criterion:

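$$R = \frac{\mathrm{cov}_{XY}}{s_x\, s_y} \tag{7}$$

$$\mathrm{cov}_{XY} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}, \qquad s_x = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}, \qquad s_y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}$$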

where X_{i} is the observed data, X̄ is the mean of the observed data, Y_{i} is the calculated data, Ȳ is the mean of the calculated data and n is the number of observations. RMSE indicates the difference between the observed and calculated (ANN) values: the lower the RMSE, the more accurate the prediction. The goodness of fit between observed and calculated values is indicated by EI and R^{2}.
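The three criteria in Equations (5) to (7) can be computed with a few lines of Python. This is a sketch with hypothetical function names and made-up sample values, assuming the square roots implied by the root mean square error and by the standard deviations.

```python
import math


def rmse(obs, calc):
    """Equation (5): root mean square error."""
    n = len(obs)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(obs, calc)) / n)


def efficiency_index(obs, calc):
    """Equation (6): share of observed variance explained by the model."""
    mean_obs = sum(obs) / len(obs)
    total = sum((x - mean_obs) ** 2 for x in obs)
    residual = sum((x - y) ** 2 for x, y in zip(obs, calc))
    return (total - residual) / total


def correlation(obs, calc):
    """Equation (7): covariance divided by the two standard deviations."""
    n = len(obs)
    mean_obs, mean_calc = sum(obs) / n, sum(calc) / n
    cov = sum((x - mean_obs) * (y - mean_calc) for x, y in zip(obs, calc)) / (n - 1)
    s_x = math.sqrt(sum((x - mean_obs) ** 2 for x in obs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_calc) ** 2 for y in calc) / (n - 1))
    return cov / (s_x * s_y)


# Made-up values: the calculated series is offset from the observed by 1.
observed = [1.0, 2.0, 3.0, 4.0]
calculated = [2.0, 3.0, 4.0, 5.0]

error = rmse(observed, calculated)
ei = efficiency_index(observed, calculated)
r = correlation(observed, calculated)
```

In this example the constant offset leaves R at 1 while EI drops to 0.2, which shows why a high correlation alone does not guarantee an accurate prediction and why RMSE and EI are reported alongside it.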

The total exploitation rate of groundwater resources (GWR) in Tra Noc Industrial Zone from 2004 to 2016 is shown in ^{3}/day; 18,876 m^{3}/day and 20,210 m^{3}/day, respectively. The total exploitation of GWR thus increased almost sixfold over a period of 7 years. However, the enforcement of Official Letter No. 2946/UBND-KT dated 23/6/2010 of the People’s Committee of Can Tho city [

In addition, the enterprises in Tra Noc Industrial Zone have used a combination of different water sources for production and daily usage. Only 18.18% of enterprises used GWR alone; those using both tap water and GWR accounted for 63.64%; and the remainder used other combined sources (data not shown here). However, the exploitation of GWR for production showed an increasing trend again after 2012.

It can be seen in

In addition, these two figures suggest possible GWR recharge from rainwater, as GWLs followed rainfall with a short lag. According to the DONRE of Can Tho city (2011) [

All training was carried out with the neural network toolbox of Visual Gene Developer. Different ANN structures were tested by trial and error: the input layer consisted of various input nodes, with time lags of up to 3 months included (t, t-1, t-2 and t-3, where t is the present time step), until the optimum ANN structures were obtained. The output of the network is a prediction of the GWL at lead times of 1, 2 and 3 months. The number of hidden neurons was also determined by trial and error.
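The way lagged inputs and a lead time define one training sample can be sketched as follows. The function name, argument names and sample series are hypothetical, and the sketch uses a single well and one driver series for brevity, whereas the input sets in this study combine several stations and variables.

```python
def make_samples(gwl, drivers, max_lag=3, lead=1):
    """Pair lagged inputs at time t with the groundwater level at t + lead.

    gwl     -- groundwater level series W at one well
    drivers -- list of other series (e.g. rainfall, river level), same length
    """
    samples = []
    for t in range(max_lag, len(gwl) - lead):
        # W(t), W(t-1), ..., W(t-max_lag)
        inputs = [gwl[t - k] for k in range(max_lag + 1)]
        # Other variables taken at the present time step t only.
        inputs += [series[t] for series in drivers]
        samples.append((inputs, gwl[t + lead]))
    return samples


# Made-up series of ten monthly values, for illustration only.
gwl = [float(i) for i in range(10)]
rainfall = [100.0 + i for i in range(10)]

samples = make_samples(gwl, [rainfall], max_lag=3, lead=2)
```

Each lead time produces its own set of samples, so separate networks (or separate training runs) would be needed for the 1-, 2- and 3-month predictions under this layout.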

The results of ANN structure selection show that the optimum structures for qp2-3 and qp3 are 14-15-1 and 12-15-1 (input-hidden-output nodes), respectively. The number of nodes in the hidden layer has only a slight impact on the accuracy of prediction. These two structures were therefore selected for 1-, 2- and 3-month GWL prediction at QT08 and QT16, respectively.

The comparison between observed and 1-month predicted GWLs at QT08 at qp2-3 and qp3 layers, respectively are presented in

The correlations between GWLs and other impact factors such as rainfall, water levels in the Bassac River and GWR withdrawal for industrial uses were tested. The results show high negative correlations between GWLs and GWR withdrawal for industrial uses. In contrast, there are low correlations between GWLs and rainfall or water levels in the Bassac River (data not shown here). Therefore, further study should consider the future projection of GWR pumping for different purposes.

Performance statistics (EI, RMSE and R^{2}) are summarized in the table below. The fit between observed and predicted values is good, with high values of the Efficiency Index (EI) and the R efficiency (R^{2}): all EI and R^{2} values are 0.90 or higher. The Root Mean Square Error (RMSE), a measure of residual variance that shows the global goodness of fit between the predicted and observed GWLs, is low during both the calibration and validation periods, ranging from 0.06 m to 0.22 m.

Period | Index | QT08 (qp2-3), 1-month | QT08 (qp2-3), 2-month | QT08 (qp2-3), 3-month | QT16 (qp3), 1-month | QT16 (qp3), 2-month | QT16 (qp3), 3-month |
---|---|---|---|---|---|---|---|
Calibration (2004-2012) | EI | 0.97 | 0.96 | 0.94 | 0.98 | 0.98 | 0.97 |
Calibration (2004-2012) | RMSE (m) | 0.14 | 0.16 | 0.21 | 0.17 | 0.19 | 0.22 |
Calibration (2004-2012) | R^{2} | 0.98 | 0.97 | 0.96 | 0.98 | 0.98 | 0.98 |
Validation (2013-2015) | EI | 0.96 | 0.97 | 0.95 | 0.97 | 0.97 | 0.95 |
Validation (2013-2015) | RMSE (m) | 0.07 | 0.06 | 0.07 | 0.08 | 0.09 | 0.09 |
Validation (2013-2015) | R^{2} | 0.97 | 0.90 | 0.96 | 0.98 | 0.97 | 0.96 |

From

Greater demand for groundwater resources (GWR) for domestic and industrial production purposes has caused widespread exploitation of the resource. GWLs in the study area declined rapidly from 2000 to 2015, especially in the Middle-upper Pleistocene (qp2-3) and upper Pleistocene (qp3) layers, due to over-withdrawal of GWR by almost all the enterprises in the area. As a result, Official Letter No. 2946/UBND-KT of the People’s Committee of Can Tho City was issued and came into enforcement in 2012 to monitor the exploitation.

The application of an Artificial Neural Network (ANN) has demonstrated that groundwater levels (GWLs) can be predicted by considering different impact factors. The predicted results will help to draw the attention of local and central government to the need to formulate a clear GWR management policy for the Mekong Delta, especially for industrial zones in urban areas such as Can Tho city.

There are high negative correlations between GWLs and GWR withdrawal for industrial uses; therefore, further study should consider scenarios of GWR pumping for different purposes.

The authors express their sincere thanks to the Ministry of Education and Training (MOET) for supporting this study.

The authors declare no conflicts of interest regarding the publication of this paper.

Ty, T.V., Phat, L.V. and Hiep, H.V. (2018) Groundwater Level Prediction Using Artificial Neural Networks: A Case Study in Tra Noc Industrial Zone, Can Tho City, Vietnam. Journal of Water Resource and Protection, 10, 870-883. https://doi.org/10.4236/jwarp.2018.109050