Deep Learning Based Model Predictive Control for a Reverse Osmosis Desalination Plant

Reverse Osmosis (RO) desalination plants are highly nonlinear multi-input multi-output systems that are affected by uncertainties, constraints and physical phenomena, such as membrane fouling, that are difficult to describe mathematically. Such systems require effective control strategies that take these effects into account. One such strategy is nonlinear model predictive control (NMPC). However, an NMPC controller depends strongly on the accuracy of the internal model used for prediction in order to maintain feasible operating conditions of the RO desalination plant. Recurrent Neural Networks (RNNs), especially the Long Short-Term Memory (LSTM) network, can capture complex nonlinear dynamic behavior and provide long-range predictions even in the presence of disturbances. Therefore, this paper presents an NMPC controller for an RO desalination plant that uses an LSTM as the predictive model. It is tested on maintaining a given permeate flow rate and keeping the permeate concentration under a certain limit by manipulating the feed pressure. The results show good performance of the system.


Introduction
Recently, interest in and commercialization of desalination systems have increased owing to significant improvements in technology, in particular in membrane technology. The dynamics of an RO desalination system are highly nonlinear, constrained and subject to uncertainties such as membrane fouling and varying feed water quality. Therefore, the design of a suitable controller for the RO desalination system is a very challenging task. Several approaches exist for controlling nonlinear systems in general, such as the linear quadratic regulator (LQR) [1], the proportional integral derivative (PID) controller [2], backstepping control [3] and sliding mode control (SMC) [4]. Nevertheless, these techniques usually do not take the actual constraints of the process into account and consider only the control effects. Furthermore, the controller parameters are often chosen without a systematic procedure, so the optimality of the system cannot be guaranteed.
Model predictive control has been applied to control RO desalination processes [5] [6] [7] [8] [9]. The performance of a model predictive controller largely depends on the quality of the predictive model used, especially if the system is complex and highly nonlinear. Several techniques have been used for system identification for MPC, e.g., Kalman filtering [10] and maximum likelihood estimation [11] [12]. However, the Kalman filter requires a mathematical model of the system, which is very difficult to obtain for highly complex processes such as the RO desalination system, with its several unknown disturbances and physical phenomena such as membrane fouling. Artificial Neural Networks (ANNs) have proven to be very good function approximators and need no mathematical model, only the input-output data of the system [13]. ANNs, especially the Multilayer Perceptron (MLP), have been applied to MPC [14] [15] [16]. The MLP has some limitations for time-variant systems, because the learned results are static input-output maps. Furthermore, the number of prediction steps of the MLP is limited [17].
In [18] [19], Recurrent Neural Networks (RNNs) were introduced into the structure of the MPC because they can capture the system dynamics and provide long-range predictions [20]. It is well known that RNNs suffer from vanishing and exploding gradients, which can make their training difficult. We therefore propose to use a special form of RNN, the Long Short-Term Memory (LSTM) network. The LSTM is an RNN structure designed to model chronological sequences and their long-range dependencies more precisely than conventional RNNs [18].
Even though combining MPC with recurrent neural networks is not new [21] [22] [23], the application of an LSTM as the predictive model in MPC for desalination processes is hardly found in the literature. This motivated us to focus on system identification using an LSTM, with a view towards closed-loop MPC control of an RO desalination plant. The new contributions of this paper are the following:
• Introduction of an LSTM as the predictive model in MPC to capture nonlinearities
• The combined structure of LSTM and MPC is new to RO desalination control
The remainder of the paper is outlined as follows. Section 2 describes the model of the RO desalination plant and some scenarios for assessing the performance of the control system in closed loop. Section 3 presents the methods and materials, covering system identification using an LSTM and the problem formulation for the MPC with the identified LSTM as the prediction model. Finally, Section 4 presents and discusses the results of the system identification and of the closed-loop control simulations.
D. Karimanzira, T. Rauschenbach, Journal of Applied Mathematics and Physics

RO Desalination Plant Model and Control Scenarios
In this section, the model of the RO desalination plant and some scenarios for assessing the performance of the control system in closed loop will be described.

RO Desalination Plant Model
An RO desalination plant, shown in Figure 1, is used as the nonlinear plant to which the LSTM-based model predictive control algorithm is applied. The system includes two tanks: a feed tank and a permeate tank. The plant also includes a reverse osmosis unit and a high-pressure pump, which pumps the water from the feed tank into the RO unit at the feed pressure ($P_f$). From the inflows and outflows of the feed tank, it follows that the feed water total dissolved solids (feed TDS), i.e. the feed water concentration ($C_f$), changes constantly, because some TDS leave with the permeate ($C_p$), some TDS are lost through adhesion on the membrane surface and some TDS ($C_{in}$) enter the system with the filling water ($Q_{in}$) for the feed tank. The permeate concentration ($C_p$), the brine concentration ($C_b$), the total permeate quantity ($Q_p$) and the brine quantity ($Q_b$) at the outlet of the membrane module define the operating conditions of the RO unit itself, and they can be controlled by adjusting the feed pressure at the RO unit inlet.
The same can be done to characterize the permeate tank, which gives two equations for the mass and salt balances. Substituting as before yields Equations (6) and (8), which describe the permeate tank. The differential Equations (1), (5), (6) and (8) [...] where [...] (in $\mathrm{m^3/(m^2\,s\,kPa)}$) is the water permeability coefficient at the reference [...] (12) and (14), respectively: [...] is the cake layer resistance, and [...] is the intrinsic membrane resistance.

Control Scenarios
Since this study focuses mainly on process control, the system performance is evaluated in closed-loop control, where the system tracks a set-point. The objective of the controller is to bring the RO desalination system quickly and smoothly to the target set-point of the permeate flow rate and to keep the permeate concentration under a certain limit.

Methods and Materials
The procedure for using an LSTM as the predictive model in the MPC comprises several steps: 1) generating a dataset by acquiring data from the system under perturbations of the manipulated variable, in our case the feed pressure; 2) dividing the dataset into training and validation sets, training the LSTM on the training set and testing the network on the validation set for early stopping. Several hyperparameters need to be selected to obtain the best performance. This can be done manually, by training several network configurations and selecting the best-performing network, or automatically using Bayesian optimization [25]; 3) integrating the best-performing LSTM with the MPC; and 4) running closed-loop simulations with the LSTM-based MPC to evaluate its control performance.
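The early-stopping logic of step 2 can be sketched independently of any deep learning framework. This is a minimal illustration only: `train_step` and `validation_loss` are hypothetical callables standing in for one training epoch and one evaluation on the validation set, and the `patience` rule is an assumption, not a setting reported in this paper.

```python
def train_with_early_stopping(train_step, validation_loss, max_epochs=100, patience=5):
    """Train until the validation loss has not improved for `patience` epochs.

    train_step: hypothetical callable that performs one training epoch.
    validation_loss: hypothetical callable returning the current validation loss.
    Returns the epoch index and value of the best validation loss seen.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch in range(max_epochs):
        train_step()
        loss = validation_loss()
        if loss < best_loss:
            # New best model: remember it and reset the patience counter.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch, best_loss
```

In practice the network weights from `best_epoch` would be stored and restored after the loop stops.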

Internal Model Using Long Short Term Memory Network
System identification is the main focus of this section and comprises approximating the RO desalination system described by Equations (1)-(14).
The p-step ahead prediction problem is of vital interest for control using MPC. Deep neural networks are universal function approximators and can be used to capture the nonlinear dynamics of systems.
They are relatively simple to obtain and to evaluate in real time. Among them, LSTMs can better capture temporal dependencies in the dynamical system, which makes them particularly useful for predictive control. They can be used to make the required p-step ahead predictions of the state variables, because the prediction for time-step $p$ depends solely on the current state and all control actions up to time-step $p$. The time-step $p-1$ predictions used in the time-step $p$ prediction likewise depend on the current state and all control actions up to time-step $p-1$, and so on. Figure 2 shows the LSTM structure for the p-step ahead prediction problem.
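The recursive structure of a p-step ahead prediction can be sketched as a simple rollout in which each prediction is fed back as the next state. This is a sketch under assumptions: `step_model` is a hypothetical one-step predictor and `toy_step` is an invented stand-in, not the identified LSTM or the RO plant model.

```python
import numpy as np

def rollout(step_model, y0, controls):
    """Predict p steps ahead by feeding each prediction back into the model.

    step_model: one-step predictor mapping (state, control) -> next state.
    y0: current state, shape (n_states,).
    controls: planned control actions, one entry per time-step (length p).
    Returns the predictions, shape (p, n_states).
    """
    y = np.asarray(y0, dtype=float)
    predictions = []
    for u in np.asarray(controls, dtype=float):
        y = step_model(y, u)
        predictions.append(y)
    return np.array(predictions)

def toy_step(y, u):
    # Hypothetical stand-in dynamics for illustration only.
    return 0.5 * y + u

y_hat = rollout(toy_step, y0=[1.0, 0.0], controls=[1.0, 1.0, 1.0])
```

Changing the last control action leaves the earlier rows of `y_hat` unchanged, which reflects the dependence structure described above.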
It is made up of repeating cells, with four interacting components forming each layer. In our case each cell represents a time-step, so that the state of the cell for time-step $k$ serves as the input to the cell representing time-step $k+1$. Each cell contains a user-specifiable number $N$ of hidden nodes that encode the state representation. The cells use several gating functions, namely the "forget", "input" and "output" gates, which modulate the propagation of signals between cells. This cell structure mitigates the vanishing and exploding gradient problem.
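The gating mechanism can be illustrated with one forward step of a standard LSTM cell written in NumPy. This is a generic textbook cell given as a sketch; the dictionary layout of the weights, the dimensions and the random initialisation are assumptions for illustration and do not reproduce the network trained in this study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x, h_prev, c_prev, W, b):
    """One forward step of a standard LSTM cell.

    x: input at the current time-step, shape (n_in,).
    h_prev, c_prev: previous cell output and cell state, shape (n_hidden,).
    W: dict of weight matrices, each of shape (n_hidden, n_hidden + n_in).
    b: dict of bias vectors, each of shape (n_hidden,).
    """
    z = np.concatenate([h_prev, x])           # stacked vector [h_{k-1}, x_k]
    f = sigmoid(W["f"] @ z + b["f"])          # forget gate
    i = sigmoid(W["i"] @ z + b["i"])          # input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])    # new candidate values
    o = sigmoid(W["o"] @ z + b["o"])          # output gate
    c = f * c_prev + i * c_tilde              # updated cell state
    h = o * np.tanh(c)                        # cell output
    return h, c

# Example with hypothetical sizes: one input and four hidden nodes.
rng = np.random.default_rng(0)
n_in, n_hidden = 1, 4
W = {k: rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_in)) for k in "fico"}
b = {k: np.zeros(n_hidden) for k in "fico"}
h, c = lstm_cell_step(np.array([0.5]), np.zeros(n_hidden), np.zeros(n_hidden), W, b)
```

The additive update of the cell state `c` is the path along which gradients can flow over many time-steps.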
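The basic LSTM cell structure (Figure 2(b)) is fully mathematically described in the appendix of [19]. It has three inputs, denoted by $x_k$ (the current input), $h_{k-1}$ (the previous cell output) and $C_{k-1}$ (the previous cell state). The forget gate, the input gate and the layer of new candidate values are given by
$$f_k = \sigma\left(W_f \left[h_{k-1}, x_k\right] + b_f\right) \quad (15)$$
$$i_k = \sigma\left(W_i \left[h_{k-1}, x_k\right] + b_i\right) \quad (16)$$
$$\tilde{C}_k = \tanh\left(W_C \left[h_{k-1}, x_k\right] + b_C\right) \quad (17)$$
The two layers are then composed to determine the information to be stored as the cell state. The operator $*$ denotes point-wise multiplication. The point-wise multiplication of the input gate $i_k$ and the vector $\tilde{C}_k$ of the new candidate values gives the amount of information to be added to the LSTM cell state.
This result is added to the product of the forget gate $f_k$ and the previous cell state $C_{k-1}$:
$$C_k = f_k * C_{k-1} + i_k * \tilde{C}_k$$
The output gate then determines the cell output:
$$o_k = \sigma\left(W_o \left[h_{k-1}, x_k\right] + b_o\right), \qquad h_k = o_k * \tanh\left(C_k\right)$$
Here $y_k \in \mathbb{R}^2$ is the cell output, which corresponds to the state vector prediction for time-step $k$. $h_0$ is initialised in this study using $y_0$. The regressors required to predict [...]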

An LSTM is characterized by the values of the weights and biases of the different gates, i.e. the forget gate, the input gate and the output gate, for all layers, and these values constitute the set of parameters. These parameters are learnt from training data by minimizing the prediction error of the model on the training set, as measured by a user-specified loss function. Learning is performed with the back-propagation through time (BPTT) algorithm, which computes the gradient of the loss function with respect to the weights, together with an optimization algorithm that uses this gradient to adjust the weights. The adaptive moment estimation algorithm (Adam) [26] is a widely used example of such an optimization algorithm. In BPTT, the weights are initialized, the information is passed through the different gates, the output $h_k$ and the current cell state $C_k$ are calculated, the gradients at each time-step $k$ are calculated with the chain rule and, finally, all gradients are used to update the weights associated with the input gate, output gate and forget gate.
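To make the role of the optimizer concrete, a single Adam update can be sketched as follows. The hyperparameter defaults are the commonly used ones and the quadratic toy loss is an assumption chosen so that the gradient is known in closed form; in LSTM training the gradient would come from BPTT instead.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update of the weights w given the gradient of the loss.

    m, v: running first and second moment estimates of the gradient.
    t: update counter, starting at 1 (used for bias correction).
    """
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad**2
    m_hat = m / (1 - beta1**t)   # bias-corrected first moment
    v_hat = v / (1 - beta2**t)   # bias-corrected second moment
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Toy loss L(w) = ||w||^2 with gradient 2w (illustration only).
w = np.array([1.0, -2.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
for t in range(1, 2001):
    w, m, v = adam_step(w, 2 * w, m, v, t, lr=0.01)
```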

Data Acquisition
For training the LSTM, a dataset covering the whole operating range of the RO desalination plant was collected by perturbing the manipulated variable, the feed pressure, and recording the dynamic system response: a pre-defined sequence of the manipulated variable is applied to the system and the dynamic response is recorded. Such a signal for the feed pressure and the dynamic responses of the permeate flow rate $M_p$ and permeate concentration $x_p$, and of the total permeate quantity $Q_p$ and permeate concentration $C_p$, are shown in Figure 3, Figure 4(a) and Figure 4(b), respectively. $T_K$ denotes the final time-step of the perturbation experiment. The perturbation is sampled at $\Delta t$.
[...] $Q_{p,k}$ is the measured system output at time-step $k$ after [...]. $T_K - p$ data points can thus be extracted from each experimental sequence.
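One way to see where the count of $T_K - p$ data points comes from is to cut a recorded sequence into overlapping windows of length $p$. The sketch below assumes a particular indexing convention (the control applied at step $k$ influences the output at step $k+1$) and uses hypothetical array names; it is not taken from the original implementation.

```python
import numpy as np

def make_windows(u, y, p):
    """Cut one perturbation experiment into p-step training samples.

    u: manipulated variable (e.g. feed pressure), shape (T,).
    y: measured outputs, shape (T, n_outputs).
    p: prediction horizon in time-steps.
    Returns initial states (T - p, n_outputs), control windows (T - p, p)
    and target windows (T - p, p, n_outputs).
    """
    T = len(u)
    y0 = np.array([y[k] for k in range(T - p)])
    U = np.array([u[k:k + p] for k in range(T - p)])
    Y = np.array([y[k + 1:k + p + 1] for k in range(T - p)])
    return y0, U, Y

# Synthetic example: 10 samples, 2 outputs, horizon p = 3.
u = np.arange(10.0)
y = np.column_stack([u, 2.0 * u])
y0, U, Y = make_windows(u, y, p=3)
```

With 10 recorded samples and a horizon of 3, the function returns 7 samples, matching the count stated above.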
The feed water concentration, an input to the system, is uncertain. Therefore, Gaussian noise was added to its signal before it was used to excite the system (Figure 3).
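Adding Gaussian noise to an input signal can be sketched in a few lines. The nominal value, the standard deviation and the seed below are hypothetical numbers chosen for illustration and are not the values used in this study.

```python
import numpy as np

def add_gaussian_noise(signal, std, seed=None):
    """Return the signal with zero-mean Gaussian noise added.

    The result is clipped at zero because a concentration cannot be negative.
    """
    rng = np.random.default_rng(seed)
    signal = np.asarray(signal, dtype=float)
    noisy = signal + rng.normal(0.0, std, size=signal.shape)
    return np.clip(noisy, 0.0, None)

# Hypothetical nominal feed concentration held constant over 500 samples.
nominal = np.full(500, 35.0)
noisy = add_gaussian_noise(nominal, std=0.5, seed=0)
```

Fixing the seed makes the excitation signal reproducible between experiments.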
Following the usual approach in machine learning, the labeled dataset is split into three parts before training the LSTM: one part for training (data used for adapting the network weights), one for validation and one for testing.
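A three-way split can be sketched as below. The 70/15/15 fractions and the choice to split in chronological order (rather than at random) are assumptions for illustration; the split actually used in this study is not specified here.

```python
def chronological_split(n_samples, train_frac=0.7, val_frac=0.15):
    """Return slices for training, validation and test parts in time order."""
    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)
    train = slice(0, n_train)
    val = slice(n_train, n_train + n_val)
    test = slice(n_train + n_val, n_samples)
    return train, val, test

train, val, test = chronological_split(1000)
```

The returned slices can be applied directly to the arrays of samples, for example `U[train]` and `Y[train]` for hypothetical input and target arrays.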

Nonlinear Model Predictive Control Problem
The structure of the model predictive controller for an RO desalination system is shown in Figure 5. [...] Typically, the objective function is chosen to penalize large control effort, which means higher power consumption for the actuator, and discrepancies between the state vector and the set-point at each time instant. Constraints on inputs and outputs may also be included in the MPC formulation. Since MPC performance depends on the quality of the system's predictions, a reasonably accurate model obtained through system identification is crucial.
Equations (23)-(26) below describe the MPC problem [...]
Figure 5. Schematic representation of a model predictive controller with full state feedback.
where $p \in \mathbb{Z}^{+}$ is the prediction horizon [...]. In general, this optimization problem is not convex and therefore lacks the special structure that would guarantee global optimality. It is thus a nonlinear programming (NLP) problem, which can be solved with modern off-the-shelf solvers. [...] Table 1.
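As an illustration of the kind of problem described above, a generic NMPC formulation with a learned prediction model can be written as follows. The weighting matrices $Q$ and $R$, the model symbol $f_{\mathrm{LSTM}}$, the set-point $y^{\mathrm{sp}}$ and the bound symbols are assumptions introduced for this sketch and need not match Equations (23)-(26).

```latex
\begin{aligned}
\min_{u_k,\dots,u_{k+p-1}} \quad
  & \sum_{j=1}^{p} \left\lVert \hat{y}_{k+j} - y^{\mathrm{sp}}_{k+j} \right\rVert_{Q}^{2}
    + \sum_{j=0}^{p-1} \left\lVert \Delta u_{k+j} \right\rVert_{R}^{2} \\
\text{s.t.} \quad
  & \hat{y}_{k+j+1} = f_{\mathrm{LSTM}}\left(\hat{y}_{k+j},\, u_{k+j}\right),
    \qquad \hat{y}_{k} = y_{k}, \\
  & u_{\min} \le u_{k+j} \le u_{\max}, \qquad j = 0,\dots,p-1, \\
  & y_{\min} \le \hat{y}_{k+j} \le y_{\max}, \qquad j = 1,\dots,p,
\end{aligned}
```

Here $\Delta u_{k+j} = u_{k+j} - u_{k+j-1}$ is the change in the control action. Only the first element of the optimal sequence is applied to the plant, and the problem is solved again at the next time-step with the new measurement.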

Results and Discussions
The results are discussed in two parts: the first covers the results of the system identification and the second the closed-loop results for the MPC.

Model Identification Results
To measure the predictive capability of the LSTM model, we used the mean absolute [...]
[...] and inject the input to the plant
Step 4 At $t_{k+1}$, obtain the plant measurement $y_m$
Step 5 Corresponding to $y_m$, estimate the states [...]
The hyperparameters of the prediction model, such as the learning rate, batch size, dropout and filter size, need to be explored carefully to achieve the best prediction results. We use Bayesian optimization to search for these hyperparameters efficiently. The best LSTM for system identification found by the Bayesian optimization has the key parameters shown in Table 2.
Figure 7(a), Figure 8(a), Figure 9(a) and Figure 10(a) reveal a good fit of the LSTM to the training data for the permeate flow rate, total permeate flow, permeate concentration and total permeate concentration, respectively, and testify to the model's ability to reproduce highly dynamic outputs from highly dynamic training data. The predictive capability of the model was then validated on a different data set. Figure 7(b), Figure 8(b), Figure 9(b) and Figure 10(b) show that the model succeeded in capturing the general trends of previously unseen test data for the permeate flow rate, total permeate flow, permeate concentration and total permeate concentration, respectively.

LSTM-Based MPC Closed-Loop Control Performance
The MPC controller in this study was implemented in Python version 3.6.5 using the scipy.optimize.minimize function, with the sequential least squares programming (SLSQP) algorithm selected as the solver method.
The parameters of the MPC controller were set as shown in Table 4, and its main objective was to track a target set-point trajectory as quickly and smoothly as possible. The LSTM-based system was compared with a system that uses the true model of the RO desalination system, and the results are discussed in the following.
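A receding-horizon loop built on scipy.optimize.minimize with method="SLSQP" can be sketched as follows. This is a minimal illustration under assumptions: `toy_model` is an invented first-order stand-in for the identified LSTM, and the horizon, weights, bounds and output limit are hypothetical values, not those of Table 4.

```python
import numpy as np
from scipy.optimize import minimize

def toy_model(y, u):
    # Hypothetical one-step stand-in for the identified LSTM (not the RO plant).
    return 0.8 * y + 0.2 * u

def predict(y0, u_seq):
    """Roll the one-step model forward over the control sequence."""
    y, out = y0, []
    for u in u_seq:
        y = toy_model(y, u)
        out.append(y)
    return np.array(out)

def mpc_cost(u_seq, y0, u_prev, setpoint, r=0.01):
    """Tracking error plus a penalty on changes of the control action."""
    y_hat = predict(y0, u_seq)
    du = np.diff(np.concatenate([[u_prev], u_seq]))
    return np.sum((y_hat - setpoint) ** 2) + r * np.sum(du ** 2)

def mpc_step(y0, u_prev, setpoint, p=10, u_bounds=(0.0, 10.0), y_max=6.0):
    """Solve the horizon problem and return only the first control action."""
    # Inequality constraint in SLSQP form g(u) >= 0: predicted output below y_max.
    constraints = [{"type": "ineq", "fun": lambda u: y_max - predict(y0, u)}]
    result = minimize(mpc_cost, x0=np.full(p, u_prev),
                      args=(y0, u_prev, setpoint), method="SLSQP",
                      bounds=[u_bounds] * p, constraints=constraints)
    return result.x[0]

# Closed-loop simulation in which the plant equals the toy model.
y, u = 0.0, 0.0
for _ in range(30):
    u = mpc_step(y, u, setpoint=5.0)
    y = toy_model(y, u)
```

Replacing `toy_model` with a call to a trained network keeps the structure of the loop unchanged, although the gradients are then approximated by finite differences unless a Jacobian is supplied.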
The response graphs in Figure 11 show that the LSTM-based MPC strategy successfully tracks the reference signal, demonstrating the robustness and set-point tracking ability of the controller when applied to the RO desalination system. To be able to compare the performance of the two controllers quantitatively, we designed a [...]

Conclusion
A nonlinear model predictive controller for RO desalination systems has been presented. To take model uncertainties, constraints and nonlinear dynamics into account, the system uses an LSTM network as the predictive model. The LSTM can capture complex nonlinear dynamic behavior and provide long-range predictions even in the presence of disturbances. The main aim was to control the permeate flow rate, subject to the constraints on the permeate concentration, by manipulating the feed pressure. The LSTM-based MPC was tested on reference signals that exhibit the possible nonlinear process dynamics occurring inside a real RO desalination plant. The response graphs show that the NMPC strategy successfully tracks the reference signal. These results illustrate the tracking ability of the LSTM-based MPC controller: almost offset-free and very close set-point tracking is obtained with this strategy.