Water Quality Evaluation Using Back Propagation Artificial Neural Network Based on Self-Adaptive Particle Swarm Optimization Algorithm and Chaos Theory

To overcome the shortcomings of traditional methods of water quality evaluation, this paper proposes a novel model that combines particle swarm optimization (PSO), chaos theory, a self-adaptive strategy and a back propagation artificial neural network (BP ANN) to evaluate the water quality of the Weihe River in China. An improved PSO algorithm with a self-adaptive inertia weight and chaotic learning factors tuned by the logistic function was developed and used to optimize the network parameters of the BP ANN. The values of the average absolute deviation (AAD), root mean square error of prediction (RMSEP) and squared correlation coefficient are 0.0061, 0.0163 and 0.9903, respectively. Compared with other methods, such as BP ANN and PSO BP ANN, the proposed model displays the best prediction performance, with high precision and good correlation. The results show that the proposed method has good prediction ability for evaluating water quality. It is convenient, reliable and precise, and provides a good analysis and evaluation method for water quality.


Introduction
Water quality evaluation is an important link in the water research system and has become an indispensable part of almost all environmental quality evaluation. It not only accurately determines the pollution level of lakes and rivers and the trend of their future development, but also helps to utilize and protect water more efficiently; it thereby provides a directional and principled scheme and basis for the conservation of water sources [1] [2] [3] [4].
Currently, the main methods of water quality evaluation include single factor evaluation, principal component analysis and the comprehensive water quality identification index [5]-[12]. The single factor evaluation method uses the classification of the worst single water quality index to determine the classification of the comprehensive water quality. The method is simple and clear and directly relates water quality to the evaluation criteria, but it fails to give a comprehensive evaluation, and the accuracy of its results is poor. Principal component analysis (PCA) is an integrated model for water quality assessment that can be used to establish a comprehensive evaluation index with better effect, but it is difficult to obtain a good evaluation result if there are so many participating indices that the contribution rate of the principal components is reduced [9] [13] [14] [15] [16] [17]. Because water quality is affected by many factors, there is a complex nonlinear relationship between the evaluation indices and the water quality standard.
These traditional processing methods cannot address complex nonlinear problems well, and traditional mathematical evaluation methods are gradually being replaced by intelligent optimization algorithms.
In recent years, artificial neural network (ANN) technology has attracted much attention; it trains quickly, can approximate linear and nonlinear complex practical problems, and is widely used in water quality evaluation [13] [18] [19] [20]. For example, the back propagation artificial neural network (BP ANN) has been used to build water environment quality evaluation models, and the radial basis function artificial neural network (RBF ANN) has been adopted to evaluate water quality. Traditional neural networks have some shortcomings, including slow convergence and a tendency to become trapped in local extrema, so many improved neural network models have been applied successfully to water quality evaluation [9] [14] [21].
Particle swarm optimization (PSO) is one of the hot topics in the field of intelligent optimization. It has strong global search capability but is prone to premature convergence, whereas the BP ANN has strong local search capability. Therefore, to address the BP ANN's tendency to fall into local optima and its dependence on the choice of initial weights, this paper proposes an improved PSO algorithm based on chaos theory and an adaptive strategy and uses it to optimize the parameters of the BP ANN. The resulting hybrid artificial neural network prediction model is called CSAPSO BP ANN. The prediction performance of the model is then examined by applying it to water quality evaluation.

Improved Particle Swarm Optimization Algorithm CSAPSO
The PSO algorithm has many advantages, including easy implementation, few parameters to adjust, and good convergence speed and efficiency, which make PSO a typical swarm intelligence algorithm. In an n-dimensional search space with m particles, each particle is assumed to be a potential solution. During the iterative solution process, each particle updates its velocity and position by formulas (1) and (2) [22] [23] [24] [25]. The improved algorithm proposed in this study is called the CSAPSO algorithm.
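As an illustration of the velocity and position update, the following is a minimal sketch of the canonical PSO step with an inertia weight. It assumes the widely used textbook form of the update, which may differ in detail from the paper's formulas (1) and (2); the function name `pso_step` and all numeric settings are hypothetical.

```python
import numpy as np

def pso_step(x, v, pbest, gbest, w, c1, c2, rng):
    """One canonical PSO update for m particles in n dimensions.

    x, v, pbest have shape (m, n); gbest has shape (n,).
    w is the inertia weight; c1 and c2 are the learning factors.
    """
    m, n = x.shape
    r1 = rng.random((m, n))  # random factors for the personal-best term
    r2 = rng.random((m, n))  # random factors for the global-best term
    v_new = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    x_new = x + v_new
    return x_new, v_new

# Illustrative call with assumed settings (not the paper's Table 1 values).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(5, 3))
v = np.zeros_like(x)
x_new, v_new = pso_step(x, v, pbest=x.copy(), gbest=x[0],
                        w=0.7, c1=2.0, c2=2.0, rng=rng)
```

In CSAPSO, the fixed `w`, `c1` and `c2` used here would be replaced by the self-adaptive inertia weight and the chaotic learning factors described in this section.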
The self-adaptive adjustment strategy is adopted to adjust the inertia weight factor $\omega$ [26] [27]. Its definition uses $w_{max}$ and $w_{min}$, the maximum and minimum weights, respectively; $P_{gbest}(k)$, the global best fitness at the k-th iteration; $P_{lbestave}$, the average local best fitness; and $K_{max}$, the maximum number of iterations.
The learning factors $c_1$ and $c_2$ of the improved algorithm are obtained from the chaotic sequences generated by the classical Logistic map [28].
According to formula (3), the position $x_{i,d}$ of each dimension of the current particle $x_i$ is mapped to the interval [0, 1]. After $K$ iterations of the Logistic map, the chaotic sequences are obtained, where $cx_i$ denotes the chaotic variable, $cx_i^k$ is the value of $cx_i$ after the k-th iteration, and $K$ is the number of iterations of the chaotic map.
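The chaotic mapping can be sketched as follows. This is a hedged illustration only: it assumes the classical Logistic map with control parameter `mu = 4`, a simple linear normalization of the position to [0, 1], and a hypothetical target range `[c_min, c_max]` for the learning factors. None of these specific choices is confirmed by the text above.

```python
import numpy as np

def to_unit_interval(x, lower, upper):
    """Linearly map a position from [lower, upper] to [0, 1] (assumed form)."""
    return (x - lower) / (upper - lower)

def logistic_sequence(cx0, n_iter, mu=4.0):
    """Iterate the classical Logistic map cx <- mu * cx * (1 - cx).

    Returns the sequence cx^0, cx^1, ..., cx^n_iter.
    """
    cx = np.asarray(cx0, dtype=float)
    seq = [cx]
    for _ in range(n_iter):
        cx = mu * cx * (1.0 - cx)
        seq.append(cx)
    return np.array(seq)

def chaotic_learning_factor(cx, c_min=1.5, c_max=2.5):
    """Scale a chaotic variable in [0, 1] to a learning factor (assumed range)."""
    return c_min + (c_max - c_min) * cx

cx0 = to_unit_interval(-0.4, lower=-1.0, upper=1.0)  # position -> [0, 1]
seq = logistic_sequence(cx0, n_iter=10)
c1 = chaotic_learning_factor(seq[-1])
```

With `mu = 4` the map stays inside [0, 1] for any starting value in that interval, which is why the position is normalized first.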

CSAPSO BP ANN Model
In BP ANN, the model establishes the nonlinear relationship between input and output by determining the weights and deviations of each layer in the network. The CSAPSO BP ANN model is obtained by using the CSAPSO algorithm to optimize the weight vector $w_{ih}$, the weight vector $w_{ho}$ and the deviation vector $b_o$; each particle encodes these parameters. The procedure for CSAPSO BP ANN can be summarized as follows:
Step 1. Model initialization. The connection weights, biases and population parameters of the model are initialized randomly.
Step 2. Model training. The improved PSO algorithm is used to optimize the parameters of the BP ANN, with the particle structure designed as above.
Step 3. Adjustment of model parameters. Based on the output error, all parameters of the model are adjusted until the number of iterations reaches the set value or the error meets the set condition.
Step 4. Output. After training, the model outputs each parameter, and the trained model is then used for testing.
Table 1 shows each parameter of the model.
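The particle design described in this section can be sketched as a flat vector that is decoded into network parameters. This is a minimal illustration under assumptions not stated in the text: a sigmoid hidden activation, a linear output, and the mean squared error as fitness. The helper names `decode`, `forward` and `fitness` are hypothetical, and the 4-9-1 layer sizes are the structure reported elsewhere in the paper.

```python
import numpy as np

def decode(particle, n_in, n_hid, n_out):
    """Split a flat particle into w_ih, w_ho and b_o."""
    a = n_in * n_hid
    b = a + n_hid * n_out
    w_ih = particle[:a].reshape(n_in, n_hid)    # input -> hidden weights
    w_ho = particle[a:b].reshape(n_hid, n_out)  # hidden -> output weights
    b_o = particle[b:b + n_out]                 # deviation (bias) vector
    return w_ih, w_ho, b_o

def forward(particle, X, n_in, n_hid, n_out):
    """Forward pass; sigmoid hidden layer and linear output are assumptions."""
    w_ih, w_ho, b_o = decode(particle, n_in, n_hid, n_out)
    hidden = 1.0 / (1.0 + np.exp(-(X @ w_ih)))
    return hidden @ w_ho + b_o

def fitness(particle, X, y, n_in, n_hid, n_out):
    """Mean squared error of the decoded network (assumed fitness)."""
    pred = forward(particle, X, n_in, n_hid, n_out)
    y = np.asarray(y, dtype=float).reshape(pred.shape)
    return float(np.mean((pred - y) ** 2))

n_in, n_hid, n_out = 4, 9, 1
dim = n_in * n_hid + n_hid * n_out + n_out  # length of one particle
```

Each particle then has `dim` components, and the swarm searches this space for the parameter vector with the lowest fitness.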

Experimental Data
Based on the national surface water environmental quality standard (GB3838-2002), 718 groups of water quality assessment data were generated according to the concentration limits of the pollution factors for the six classes of the standard (as shown in Table 2) and used as the modeling database. Of these data, 70% (503 groups) were used for network training and 30% (215 groups) for network verification. In addition, 10 sets of test data were used to test the reliability of the model; the test samples are shown in Table 3.
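The 70/30 partition can be reproduced with a short sketch. The random shuffling, the seed and the function name `split_indices` are assumptions for illustration; the text does not say how the groups were assigned to each subset.

```python
import numpy as np

def split_indices(n_samples, train_fraction, rng):
    """Shuffle sample indices and split them into training and validation."""
    n_train = round(n_samples * train_fraction)
    perm = rng.permutation(n_samples)
    return perm[:n_train], perm[n_train:]

rng = np.random.default_rng(42)  # seed chosen arbitrarily
train_idx, val_idx = split_indices(718, 0.7, rng)
print(len(train_idx), len(val_idx))  # 503 215
```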
Table 1. CSAPSO BP ANN model parameter setting.

The average absolute deviation (AAD), the root mean square error of prediction (RMSEP) and the squared correlation coefficient ($R^2$) are adopted to evaluate the accuracy and reliability of the model. In their definitions, $N$ denotes the number of data samples, $\hat{y}_i$ and $\hat{y}_{ave}$ represent the predicted value and the mean of the predicted values, and $y_i$ and $y_{ave}$ are the experimental value and the experimental mean.
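The three indices can be computed as in the sketch below. It assumes the common definitions: AAD as the mean absolute difference, RMSEP as the root of the mean squared difference, and $R^2$ as the squared Pearson correlation between predicted and experimental values. The paper's exact formulas may differ, for example by using a relative deviation.

```python
import numpy as np

def aad(y_true, y_pred):
    """Average absolute deviation (assumed: mean absolute difference)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r_squared(y_true, y_pred):
    """Squared Pearson correlation between experimental and predicted values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    dt = y_true - y_true.mean()  # deviations from the experimental mean
    dp = y_pred - y_pred.mean()  # deviations from the predicted mean
    return float(np.sum(dt * dp) ** 2 / (np.sum(dt ** 2) * np.sum(dp ** 2)))

# Toy values for illustration only; not data from the paper.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
print(aad(y_true, y_pred), rmsep(y_true, y_pred), r_squared(y_true, y_pred))
```

Smaller AAD and RMSEP indicate smaller prediction error, while an $R^2$ closer to 1 indicates stronger agreement between predicted and experimental values.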

Model Structure
The CSAPSO BP ANN model uses a three-layer network architecture. The input layer takes 4 water quality evaluation indicators, namely DO, VP, CODmn and NH3-N, so the number of input nodes is 4. The number of nodes in the output layer is 1, which represents the predicted water quality.
The appropriate number of hidden-layer neurons generally differs from problem to problem, so a heuristic method is used to choose it: the number of neurons is increased from 5 to 15, giving a total of 11 CSAPSO BP ANN models. The optimal number of hidden-layer nodes is determined by calculating the AAD, RMSEP, $R^2$ and best fitness value of each model, as shown in Table 4.
Generally, the network with the smallest error and the highest correlation coefficient is regarded as the optimal network structure. In this study, the structure with the smallest AAD and RMSEP and the largest $R^2$ is chosen as the optimum according to Table 4. The structure whose hidden layer contains 9 neurons is the optimal hybrid neural model.
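The selection rule can be sketched as a simple ranking over candidate hidden-layer sizes. The lexicographic ordering used here (AAD first, then RMSEP, then $R^2$) is an assumption, since the text does not say how ties between the criteria are resolved, and the numbers in `demo` are placeholders rather than the values of Table 4.

```python
def select_hidden_size(results):
    """Pick the hidden-layer size with the smallest errors and largest R^2.

    results maps hidden_size -> (aad, rmsep, r2).
    """
    return min(results,
               key=lambda h: (results[h][0], results[h][1], -results[h][2]))

candidates = list(range(5, 16))  # 5 to 15 neurons, i.e. 11 models

# Placeholder numbers for illustration only; NOT the paper's Table 4.
demo = {
    5: (0.030, 0.050, 0.95),
    9: (0.010, 0.020, 0.99),
    15: (0.020, 0.040, 0.97),
}
best = select_hidden_size(demo)
```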

Results and Discussion
Verification serves two purposes: one is to verify the training effect of the model; the other is to fine-tune the network parameters so that the network performs better. The AAD and RMSEP show that the prediction error is small and the precision is high, and the $R^2$ shows that the predicted values correlate well with the real values. Across the three sets, the performance on the test set is slightly worse.
In conclusion, the CSAPSO BP ANN model has the best comprehensive performance. The AAD and RMSEP data show that the prediction accuracy of the CSAPSO BP ANN model is the highest, and the $R^2$ also reflects the best correlation among the models.
In terms of efficiency and accuracy, the data in the table also reflect the dominance of the CSAPSO BP ANN model. For accuracy, the RMSEP of the CSAPSO BP ANN model is the smallest, so its prediction capability is the strongest. The execution times of CSAPSO BP ANN, PSO BP ANN and BP ANN decrease in that order. The involvement of an intelligent algorithm is bound to increase the execution time, because such algorithms are iterative evolutionary algorithms and consume more time. BP ANN training does not involve an intelligent algorithm, so its execution time is the smallest. The CSAPSO BP ANN model additionally introduces the adaptive strategy and the chaotic mechanism into the intelligent algorithm, which makes the model take longer to execute. On the whole, however, the execution times are not long and all are within acceptable limits. Given this prediction cost, using the model to evaluate water quality is feasible and effective.

Conclusions
2) The CSAPSO BP ANN water quality evaluation model performs very well: the error between the predicted and experimental values is small and their correlation is high, so water quality can be predicted well.
3) The proposed water quality evaluation model can provide a new idea for other prediction fields.
In the BP ANN, $w_{ih}$, $w_{ho}$ and $b_o$ denote, respectively, the weight vector between the input layer and the hidden layer, the weight vector between the hidden layer and the output layer, and the deviation vector of the hidden layer. That is to say, the performance of the network depends on these three main parameters ($w_{ih}$, $w_{ho}$, $b_o$).
The structure of the CSAPSO BP ANN model was 4-9-1. The CSAPSO BP ANN was trained and verified on the data examples generated from the water quality evaluation standard, and the training curve is shown in Figure 1.

Figure 2 shows the relationship between the predicted values and the expected values in the training set. In the graph, the line and the dots indicate the expected values and the predicted data points, respectively. The vertical distance between a dot and the line shows the absolute error between the predicted value and the expected value. The predicted data points stay close to the straight line, which shows that the predicted values of the CSAPSO BP ANN model are in good agreement with the expected values and that the model predicts well on the training set. After training, the model is applied to the data in the validation set in order to verify the reliability of the trained network.

Figure 2. Prediction effect in training set.

Figure 3. Prediction effect in validation set.

Figure 4. Prediction effect in testing set.

Figure 5. Convergence curve chart of each comparison model.

Figure 6. Comparison chart of predicted values and expected values of each model.
a Dissolved oxygen. b Volatile phenol. c Permanganate index. d Ammonia nitrogen.

Table 3. Measured data for water quality evaluation.

Table 4. Optimization of the CSAPSO BP ANN topological structure.

As can be seen from the figure, the model converges quickly in the first 100 iterations, and the convergence error decreases especially rapidly in the first 50 iterations. After 350 iterations, the convergence error is stable and close to 0, so the precision is high and the model has been well trained.

Table 5. Related data of the model prediction.

Table 6 lists the evaluation index data of each model in water quality evaluation.

Table 6. Values of ARD, $R^2$, time and RMSEP for the comparison models.