Research on Water Quality Parameter Prediction Model Based on TCN

1. Introduction
Water is a precious natural resource, and water pollution is a pressing problem. To address it, researchers at home and abroad have carried out related studies and proposed a variety of solutions. Li Kaolin et al. [1] used a neural-network-assisted Kalman filter algorithm to process water quality monitoring data, reducing the mean absolute error of dissolved oxygen by more than 27%, the mean square error by more than 39%, and the root mean square error by more than 22%. The mean absolute error of pH was also reduced by more than 20%, providing data support for scientific aquaculture by farmers. Wang Xinyu et al. [2] built a water pollution detection model based on a graph convolutional neural network and transformed the anomaly detection problem into a classification problem. The results showed that this method effectively improved the accuracy of water pollution detection, although there is still room for improvement in experimental accuracy. The temporal convolutional network (TCN) has become a new method in the field of water quality prediction, and many scholars have proposed improvements based on it. The model can handle nonlinear time series prediction. Experimental results show that the TCN model outperforms the other models considered and gives stable predictions across different time delays. Water quality prediction builds a model from historical water quality data and then predicts the water quality parameters at a certain point in the future.
2. Neural Network Model
2.1. RNN
The recurrent structure of the RNN allows it to make full use of the sequential information in the data, so it has many advantages in processing time series [3]. In addition, its errors are corrected through backpropagation and gradient descent. However, RNNs also have problems: as time series grew longer, researchers found that RNNs performed worse on long sequences, meaning that they have poor long-term memory. At the same time, as the sequence length increases, the effective depth of the unrolled model increases, and vanishing and exploding gradients become difficult to avoid when computing the gradient [4]. The unit structure diagram is shown in Figure 1.
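The vanishing-gradient effect described above can be illustrated with a minimal sketch. It assumes a scalar RNN of the form h_t = tanh(w * h_{t-1} + x_t); the function name `gradient_through_time` and the weight value are hypothetical and chosen only for illustration, not taken from the experiments in this paper.

```python
import math

def gradient_through_time(w, inputs, h0=0.0):
    """Return d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1} + x_t)."""
    h, grad = h0, 1.0
    for x in inputs:
        h = math.tanh(w * h + x)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - h_t^2)
        grad *= w * (1.0 - h * h)
    return grad

short = gradient_through_time(0.5, [0.1] * 5)    # 5 time steps
long_ = gradient_through_time(0.5, [0.1] * 50)   # 50 time steps
print(short, long_)
```

Because every factor has magnitude at most |w|, the gradient over 50 steps with |w| < 1 is many orders of magnitude smaller than over 5 steps, which is why early inputs contribute little to learning in long sequences.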
2.2. LSTM
To address this problem, the LSTM was proposed. Unlike a traditional RNN, the LSTM saves the important features it has learned into long-term memory, and selectively retains, updates or forgets the stored long-term memory according to what it learns [5]. The core structure of the LSTM can be divided into four parts: the forget gate, the input gate, the cell state and the output gate [6]. The unit structure diagram is shown in Figure 2.
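1) The forget gate decides how much of the information in the previous cell state $C_{t-1}$ is forgotten and how much is retained. The mathematical expression is as follows:
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) \quad (2\text{-}1)$$
2) The input gate is composed of two parts, a sigmoid layer and a tanh layer, and represents the new information learned by the network at the current time. Its mathematical expression is as follows:
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i), \qquad \tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) \quad (2\text{-}2)$$
3) The cell state is updated by multiplying the old cell state $C_{t-1}$ by the forgetting factor $f_t$ to discard unwanted information, and then adding the new information learned by the current network to obtain the cell state at the current moment. The mathematical expression is as follows:
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \quad (2\text{-}3)$$
4) The output gate outputs the current hidden state $h_t$ of the cell, and determines which information in the current cell state $C_t$ is output through a sigmoid layer and a tanh layer. The mathematical expression is as follows:
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o), \qquad h_t = o_t \odot \tanh(C_t) \quad (2\text{-}4)$$
These expressions are written in the standard LSTM form, where $x_t$ is the input at time $t$, $h_{t-1}$ is the previous hidden state, $\sigma$ is the sigmoid function, $\odot$ denotes element-wise multiplication, and $W$ and $b$ are the weight matrices and biases of the corresponding gates.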
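The four steps above (forget gate, input gate, cell-state update, output gate) can be sketched as a single LSTM time step in plain Python. This is a minimal sketch of the standard LSTM update, not the implementation used in the experiments; the helper names `affine` and `lstm_step` and the `params` layout (gate name mapped to a weight matrix and bias) are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, v):
    """Return W @ v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params maps 'f', 'i', 'c', 'o' to (W, b)."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(*params["f"], z)]       # forget gate
    i_t = [sigmoid(a) for a in affine(*params["i"], z)]       # input gate
    c_hat = [math.tanh(a) for a in affine(*params["c"], z)]   # candidate state
    o_t = [sigmoid(a) for a in affine(*params["o"], z)]       # output gate
    # New cell state: keep part of the old state, add part of the candidate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    # Hidden state: gated, squashed view of the cell state
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate state to 0, so the step simply halves the previous cell state; this makes a convenient sanity check.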
2.3. TCN
The temporal convolutional network (TCN) has gradually emerged in the development of neural networks. It can not only retain historical information but also perform efficient parallel computation. When applied to long time series tasks, TCN performs better than a conventional CNN [7].
The TCN model is based on the CNN model, improved by using causal convolution to make it suitable for sequence modeling and by using dilated convolution and residual modules to retain historical data. Unlike a traditional CNN, causal convolution cannot see future data: it is a one-way structure and a strictly time-constrained model [8]. Simple causal convolution still has the limitation of a traditional CNN, namely that the length of time it can model is limited by the size of the convolution kernel. Capturing longer dependencies requires stacking many layers, because the receptive field grows only linearly with depth.
Unlike traditional convolution, dilated convolution samples the input at intervals, and the sampling interval (the dilation rate) is controlled by the layer depth. As a result, dilated convolution makes the effective window size grow exponentially with the number of layers. In this way, the convolutional network can obtain a large receptive field with relatively few layers, as shown in Figure 3.
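The one-way property of causal convolution can be made concrete with a minimal sketch. It assumes a one-dimensional input, a left zero-padded sequence and a hypothetical function name `causal_conv1d`; it is an illustration of the idea, not the layer used in the experiments.

```python
def causal_conv1d(x, kernel):
    """Causal 1-D convolution: y[t] = sum_k kernel[k] * x[t - k].

    Positions before the start of the sequence are treated as zero,
    so y[t] depends only on x[0..t] and never on future values.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            if t - k >= 0:
                acc += w * x[t - k]
        y.append(acc)
    return y

print(causal_conv1d([1.0, 2.0, 3.0, 4.0], [1.0, 1.0]))
```

Changing the last input value alters only the last output, which is the strict time constraint described above. With a kernel of size 2, each output sees just two time steps, so a long history requires many stacked layers.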
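The growth of the receptive field can be checked with a short calculation. The sketch below assumes one convolution per layer, a fixed kernel size and dilation rates that double at each layer (1, 2, 4, ...), which is a common TCN configuration but is not stated explicitly in this paper; `receptive_field` is a hypothetical helper.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of stacked 1-D convolutions, one per dilation rate."""
    return 1 + (kernel_size - 1) * sum(dilations)

layers = 4
plain = receptive_field(2, [1] * layers)                        # no dilation
dilated = receptive_field(2, [2 ** i for i in range(layers)])   # 1, 2, 4, 8
print(plain, dilated)  # 5 16
```

With four layers and kernel size 2, the undilated stack covers 5 time steps while the dilated stack covers 16, so the window grows linearly in the first case and exponentially in the second.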
Residual modules are an effective way to train deep networks because they allow information to be passed across layers. As shown in Figure 4, a residual block contains two layers of convolution and nonlinear mapping, with WeightNorm and Dropout added to each layer to regularize the network [9].
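The structure of such a residual block can be sketched in plain Python for a single-channel sequence. This is a simplified illustration under several assumptions: Dropout is treated as the identity (inference mode), weight normalization is written as w = g * v / ||v||, and the function names and arguments are hypothetical rather than taken from the experimental code.

```python
import math

def causal_conv(x, kernel, dilation):
    """Dilated causal convolution with left zero padding."""
    return [sum(w * x[t - k * dilation]
                for k, w in enumerate(kernel) if t - k * dilation >= 0)
            for t in range(len(x))]

def weight_norm(v, g):
    """Reparameterize a weight vector as g * v / ||v|| (v must be nonzero)."""
    norm = math.sqrt(sum(w * w for w in v))
    return [g * w / norm for w in v]

def relu(x):
    return [max(0.0, a) for a in x]

def residual_block(x, v1, g1, v2, g2, dilation):
    """Two weight-normalized dilated causal convolutions plus a skip connection."""
    out = relu(causal_conv(x, weight_norm(v1, g1), dilation))
    out = relu(causal_conv(out, weight_norm(v2, g2), dilation))
    # Skip connection: the input is added back to the transformed signal
    return [a + b for a, b in zip(x, out)]
```

The skip connection means the block only has to learn a correction to its input, which is what lets information pass across layers in a deep stack.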
3. Simulation and Analysis
3.1. Data Preprocessing
River water quality indexes include total phosphorus (P), total nitrogen (N), dissolved oxygen (DO), ammonia nitrogen (NH3-N), permanganate index (CODMn) and pH value [10]. The purpose of water quality prediction is to build a model from historical water quality data and then predict the water quality parameters at a certain point in the future, so as to provide a scientific basis for pollution prevention and water source protection. Since CODMn is directly related to water pollution, this paper adopts CODMn as a comprehensive index of river water quality and uses it to study and forecast the changes of CODMn in the water quality data of Taihu Lake from 2004 to 2016.
During data acquisition, equipment failures or operating errors often leave missing values and outliers in the data, which interfere with the experimental process and results. To preserve the accuracy of water quality prediction, the mean filling method is used to fill in the missing values, and the calculation formula is as follows:
(3-1)
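Mean filling can be sketched as follows. The sketch assumes that missing readings are represented as `None` and that each gap is replaced by the mean of all observed values in the series; whether the paper uses a global mean or a local (neighbouring) mean is not stated, so this choice and the function name `fill_missing_with_mean` are assumptions.

```python
def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot compute a mean without observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

codmn = [4.0, None, 6.0, None]  # illustrative readings, not real data
print(fill_missing_with_mean(codmn))  # [4.0, 5.0, 6.0, 5.0]
```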
3.2. Selection of Evaluation Indicators
To ensure the validity of the error measurement, the mean absolute error, mean square error and goodness of fit are used to reflect the accuracy of the model prediction. The mean absolute error (MAE) is the average of the absolute errors and directly reflects the typical size of the prediction error. The mean square error (MSE) averages the squared errors, so it penalizes large deviations more heavily. The smaller the MAE and MSE values, the higher the prediction accuracy. The goodness of fit (R2) measures how well the model fits the data; its value normally lies between 0 and 1 (it can become negative when the model fits worse than simply predicting the mean), and the closer it is to 1, the better the model fits the data [11]. The three evaluation indicators are defined as follows:
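$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \quad (3\text{-}2)$$
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \quad (3\text{-}3)$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \quad (3\text{-}4)$$
where $y_i$ is the true value, $\hat{y}_i$ is the predicted value, $\bar{y}$ is the mean of the true values in the test set, and $n$ is the number of test samples.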
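The three indicators can be computed directly from their standard definitions. The sketch below uses plain Python lists; the function names and the sample values are illustrative and are not results from this paper.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean square error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Goodness of fit: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
print(mae(y_true, y_pred), mse(y_true, y_pred), r2(y_true, y_pred))
```

For this toy example a single error of 1 over four samples gives MAE = 0.25 and MSE = 0.25, and R2 is approximately 0.8 because the residual sum of squares (1) is one fifth of the total sum of squares (5).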
3.3. Performance Evaluation
A temporal convolutional network formed by a series of residual modules is used to predict the value of CODMn, and the predicted results are compared with those of RNN and LSTM. The residual module includes two layers of dilated causal convolution, two layers of weight normalization, two activation layers, two Dropout layers and one residual connection. WeightNorm is added to the dilated causal convolution of each layer to normalize the weights. The ReLU nonlinear activation function is used to introduce nonlinearity. A Dropout layer is also added as a regularization method to mitigate overfitting.
The prediction experiments for the water quality parameter prediction models based on RNN, LSTM and TCN were run on a computer with an Intel i7-12700H 8-core processor and an Nvidia GeForce RTX 3060 graphics card, using the PyCharm development environment. The sorted sample set was input into the network for iterative training, and the optimal value of each parameter was determined according to the loss. The final neural network model was then constructed with the following experimental configuration: the training, validation and test sets were split in a ratio of 6:2:2, ReLU was used as the activation function, MSE as the loss function and Adam as the optimizer, and MAE, MSE and R2 were used as the model evaluation indexes. Other hyperparameter settings: the time step is 6, the number of training epochs is 1000, and the Dropout rate is 0.2.
The three models differ in the following settings. In the prediction experiment based on RNN, the batch size was 256 and the learning rate was 0.001. The loss curve of the trained model is shown in Figure 5, and the experimental prediction result is shown in Figure 6. In the prediction experiment based on LSTM, the batch size was 256 and the learning rate was 0.005. The loss curve of the trained model is shown in Figure 7, and the experimental prediction result is shown in Figure 8. In the prediction experiment based on TCN, the batch size was 128, the learning rate was 0.01, and the number of hidden layer nodes was 32. The loss curve of the trained model is shown in Figure 9, and the experimental prediction results are shown in Figure 10.
As can be seen from Table 1, the MSE and MAE of the TCN model are lower than those of the other models, so the TCN model has the best prediction effect and the highest prediction accuracy. The prediction error of the RNN model is much higher than that of the TCN model, mainly because the recurrent structure of the RNN suffers from problems such as exploding or vanishing gradients. In addition, when there is a large amount of training data, the RNN output cannot be calculated in parallel, so its prediction accuracy is not high. The TCN model avoids these problems, and its prediction results are more accurate when there is more input data. Compared with the prediction results of the RNN model, the MSE of the TCN model decreased by 0.30814%, the MAE decreased by 0.84763%, and the R2 increased by 6.42713%. Compared with the prediction results of the LSTM model, the MSE decreased by 0.07215%, the MAE decreased by 0.66228%, and the R2 increased by 5.76598%, indicating a better prediction effect.
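The data preparation implied by a time step of 6 and a 6:2:2 split can be sketched as follows. The sketch assumes one-step-ahead prediction (six consecutive values predict the next one) and a chronological split without shuffling; neither detail is stated in the paper, and the function names `make_windows` and `split_622` are hypothetical.

```python
def make_windows(series, time_step=6):
    """Turn a series into (window, next value) pairs for one-step-ahead prediction."""
    n = len(series) - time_step
    X = [series[i:i + time_step] for i in range(n)]
    y = [series[i + time_step] for i in range(n)]
    return X, y

def split_622(samples):
    """Split samples chronologically into training, validation and test sets (6:2:2)."""
    n = len(samples)
    a, b = n * 6 // 10, n * 8 // 10
    return samples[:a], samples[a:b], samples[b:]

series = list(range(16))           # placeholder values, not real CODMn data
X, y = make_windows(series, time_step=6)
train, val, test = split_622(list(zip(X, y)))
print(len(train), len(val), len(test))  # 6 2 2
```

Keeping the split chronological prevents the model from being validated or tested on windows that precede its training data, which matters for time series.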
Figure 6. CODMn value prediction based on RNN neural network.
Figure 8. CODMn value prediction based on LSTM neural network.
Figure 10. CODMn value prediction based on TCN neural network.
Table 1. Prediction results of water quality parameters.
4. Conclusion
In this paper, water quality parameters are predicted by constructing a TCN neural network model. Table 1 shows that the MSE and MAE of TCN are lower than those of RNN and LSTM, indicating that the error of TCN is smaller. The R2 of TCN is closer to 1 than that of the other two neural networks, so its degree of fit is higher. Therefore, the prediction accuracy of TCN is slightly higher than that of RNN and LSTM.