Application of Artificial Neural Networks Based Monte Carlo Simulation in the Expert System Design and Control of Crude Oil Distillation Column of a Nigerian Refinery

This research work compares two approaches to the expert system design and control of a crude oil distillation column (CODC): an artificial neural networks (ANN) model and an artificial neural networks based Monte Carlo (ANNBMC) simulation of random processes. Both were implemented in a MATLAB computer program and validated using experimental data obtained from a functioning crude oil distillation column of the Port-Harcourt Refinery, Nigeria. Ninety percent (90%) of the experimental data sets were used for training while ten percent (10%) were used for testing the networks. The maximum relative errors between the experimental and calculated data obtained from the output variables of the neural network for the CODC design were 1.98% and 0.57% when ANN only and ANNBMC were used respectively, while the corresponding maximum relative errors for the controller prediction were 0.346% and 0.124%. ANNBMC required more iteration steps: fewer than 2500 and 5000 were needed to achieve a training-error convergence of less than $10^{-7}$ for the design of the CODC and the controller respectively, while fewer than 400 and 700 iteration steps were needed to achieve a convergence of $10^{-4}$ using ANN only. The linear regression analysis performed revealed minimum and maximum prediction accuracies of 80.65% and 98.79% for ANN, and 98.38% and 99.98% for ANNBMC, in the CODC design. For the CODC controller, the minimum and maximum prediction accuracies were 92.83% and 99.34% for ANN, and 98.89% and 99.71% for ANNBMC, so both methodologies give excellent predictions. Hence, artificial neural networks based Monte Carlo simulation is an effective and better tool for the design and control of a crude oil distillation column.

L. T. Popoola, A. A. Susu


Introduction
Neural networks were inspired by the power, flexibility and robustness of the biological brain. They are computational analogs of the basic biological components of a brain: neurons, synapses and dendrites. Artificial neural networks (ANN) consist of many simple computational elements (summing units, or neurons, and weighted connections, or weights) that work together in parallel and in series [1]. An ANN has the ability to learn relationships between given sets of input and output data by changing the weights. This process is called training the ANN [2]. The most well known training algorithm is the Back Propagation (BP) algorithm [3] [4]. It uses the gradient descent method to minimize the total sum of squared errors, where the error is the difference between the desired and actual output. One of the most important properties of a trained ANN is its ability to generalize, which means that the ANN can generate a satisfactory set of outputs from inputs that were not used during its training process [5]. The performance of the ANN model is a function of several design parameters such as the number of hidden layers, the number of hidden neurons in each hidden layer, the size of the training set and the training parameters. Theoretical work in ANN has shown that a single hidden layer is sufficient to approximate any complex nonlinear function under quite general conditions. While too many hidden neurons can hinder the ANN's ability to generalize to data not seen during training by causing over-fitting, too few hidden neurons can cripple its ability to learn the mapping at hand [6].
An important stochastic and probabilistic tool that can be used in simulating the artificial neural networks model is the Monte Carlo simulation. Monte Carlo means using random numbers in scientific computing. More precisely, it means using random numbers as a tool to compute something that is not random. Monte Carlo analysis is a computer-based method of analysis developed in the 1940s that uses statistical sampling techniques to obtain a probabilistic approximation to the solution of a mathematical equation or model [7]. Numerical simulations of stochastic processes have become an important task in many engineering fields. Monte Carlo approaches are particularly suitable tools for those simulation purposes. Their usefulness in diverse engineering applications and other fields has been well established over a period of decades [8].
Zhang et al. [9] studied the microscopic structures of the binary mixture of methanol-hexane under different conditions by the Monte Carlo (MC) method. Alexander et al. [10] presented a spatial model using Monte Carlo simulation for the mean and correlation of highly dispersed count data and applied it to individual-level counts of the nematode Wuchereria bancrofti, a parasite of humans which causes the disease lymphatic filariasis. Gilks et al. [11] rejuvenated particles based on Markov chain Monte Carlo. Khu et al. [12] investigated the reduction of Monte Carlo simulation runs for uncertainty estimation in hydrological modelling. Kuo et al. [13] used data from Monte Carlo simulation to verify a proposed method integrating ART2 neural networks and a genetic K-means algorithm for analyzing web browsing paths in electronic commerce. Yuxiang et al. [14] investigated the diffusion behavior of methanol in different critical media (n-pentane, n-hexane, n-heptane and acetone) using the Monte Carlo (MC) method. Yeh et al. [15] proposed a methodology based on Monte Carlo simulation and ANN to estimate the reliability of a threshold voting system, which is a generalization of k-out-of-n systems. Sugiyama [16] reviewed three software packages for Monte Carlo simulation/risk analysis on a spreadsheet. A Monte Carlo particle model associated with neural networks for a tracking problem has been examined [17]. A general regression neural network and Monte Carlo simulation model for survival and growth of Salmonella on raw chicken skin as a function of serotype, temperature and time, for use in risk assessment, has been developed [18]. Liu [19] investigated volumetric estimation of existing accumulated oil and gas in a reservoir using Monte Carlo simulation. Safdari et al. [20] used artificial neural networks and Monte Carlo simulation in terms of uncertainty for the prediction of the budget deficit in Iran.
Though previous research works have applied artificial neural networks in oil refineries [21]-[25], none has applied artificial neural networks based Monte Carlo simulation in the expert system design and control of the crude oil distillation column of a refinery, as far as the literature reviewed is concerned. This makes this research work the first of its kind to apply the method in an oil refinery. An expert system is a computer system employing expert knowledge to attain high levels of performance in solving the problems within a specific domain area [26]. Today, expert systems have demonstrated their potential, gained credibility and are being widely used to solve a variety of problems in industry and government [27]. One of the complex problems whose control is amenable to an expert system is a crude oil distillation column [25]. The crude separation process involves many complex phenomena which have to be controlled in its best placement [23]. Controlling a distillation column starts by identifying the controlled, manipulated and load variables. Controlled variables are those variables that must be maintained at a precise value to satisfy column objectives [24]. The stages involved in transforming crude oil into finished products using a crude oil distillation column in a petroleum refinery have been discussed [19] [25] [28] [29]. Figure 1 shows a typical configuration of a crude oil distillation system.
In Figure 1, ADU and VDU are the atmospheric and vacuum distillation columns; TPA, MPA and BPA are the top, middle and bottom pump-arounds of the ADU; LGO and HGO are light gas oil and heavy gas oil for the atmospheric distillation column; VLGO and VHGO are light gas oil and heavy gas oil for the vacuum distillation column.
The objective of this research work is to develop a MATLAB computer program for the algorithms of the artificial neural networks based Monte Carlo simulation and to test it on existing running data of a crude oil distillation column in a refinery. A further objective is to compare the results computed using only artificial neural networks (ANN) and using artificial neural networks based Monte Carlo (ANNBMC) simulation, separately, with the real data of the running crude oil distillation column of the refinery. The results obtained when only the artificial neural networks model was used for the expert system design and control of the examined crude oil distillation column of the refinery have been presented elsewhere [25].

Scope
The network training is accomplished based on the standard method of error back-propagation. The search for the optimum adjustment of the weights and biases is realized with the aid of a gradient descent method operating with a generalized delta rule. Monte Carlo simulation is used as a stochastic and probabilistic tool for generating the random numbers used in adjusting the weights in the neural network.

Network Configuration and Data Conditioning
The output signal $y_k$ generated by neuron $k$ of the ANN is given as [30]:

$$y_k = \varphi\left(\sum_{j=1}^{m} w_{kj}\, x_j + b_k\right)$$

where $y_k$ = output signal; $\varphi$ = activation function; $m$ = total number of inputs to the neuron; $j$ = input index; $w_{kj}$ = synaptic weight of input $j$ for neuron $k$; $x_j$ = input signal; $b_k$ = bias value of neuron $k$.
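The activation function ($\varphi$) is required to be nonlinear and monotonically increasing from zero to unity. The logistic sigmoid function is given as [31]:

$$\varphi(x) = \frac{1}{1 + e^{-x}}$$

where $\varphi(x)$ = sigmoidal function of input $x$. The derivative of the logistic sigmoidal function, which must be computed millions of times during the network training, is [31]:

$$\varphi'(x) = \varphi(x)\,\left[1 - \varphi(x)\right]$$

where $\varphi'(x)$ = derivative of the sigmoidal function of input $x$. The signal flow through the network is summarized as [32]:

$$y^{(L)}_{j^{(L)}} = \varphi\left(\sum_{j^{(L-1)}=1}^{J^{(L-1)}} w^{(L)}_{j^{(L)} j^{(L-1)}}\; y^{(L-1)}_{j^{(L-1)}} + b^{(L)}_{j^{(L)}}\right)$$

where $J^{(L)}$ denotes the number of neurons $j^{(L)}$ in layer $L$, $w^{(L)}_{j^{(L)} j^{(L-1)}}$ represents the weight for the signal from neuron $j^{(L-1)}$ in layer $L - 1$ to neuron $j^{(L)}$ in layer $L$, and $b^{(L)}_{j^{(L)}}$ is the bias of neuron $j^{(L)}$ in layer $L$.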
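The relations above can be illustrated with a short sketch. It is written in Python rather than the MATLAB used in this work, and the 2-3-1 network with its weight values is purely hypothetical; it only shows how the sigmoid, its derivative and the layer-by-layer signal flow fit together.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: increases monotonically from 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_prime(x):
    """Derivative of the sigmoid, expressed through the function value."""
    s = sigmoid(x)
    return s * (1.0 - s)


def layer_forward(inputs, weights, biases):
    """Output of one layer; weights[k][j] links input j to neuron k."""
    return [
        sigmoid(sum(w_kj * x_j for w_kj, x_j in zip(w_k, inputs)) + b_k)
        for w_k, b_k in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Propagate a signal through a list of (weights, biases) layers."""
    signal = inputs
    for weights, biases in layers:
        signal = layer_forward(signal, weights, biases)
    return signal


# Hypothetical 2-3-1 network with made-up weights (not from the paper).
hidden = ([[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]], [0.0, 0.1, -0.1])
output = ([[0.3, -0.1, 0.2]], [0.05])
y = network_forward([1.0, 0.5], [hidden, output])
```

Because every layer ends in the sigmoid, each entry of `y` lies strictly between zero and one, which is why the raw process data must be conditioned before training and mapped back afterwards.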
The original raw data values $z_j$ are transformed into the input signals $x_j$ using the equation

$$x_j = \sigma_z^{-1}\,(z_j - \bar{z})$$

where $\bar{z}$ denotes the mean value of the raw data and $\sigma_z^{-1}$ is the inverse of the standard deviation of the raw data.
The generated process values ($p$) are determined using the equation

$$p = \sigma_z\, \Phi^{-1}(y) + \bar{z}$$

where $\Phi^{-1}(y)$ is the inverse function of the standard normal distribution of the network output $y$, and $\sigma_z$ and $\bar{z}$ are the standard deviation and mean values of the raw data $z$.
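A minimal Python sketch of this data conditioning is given below. The raw values are invented for illustration, the use of the population standard deviation is an assumption (the text does not say which estimator was used), and the helper names are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def condition(raw):
    """Standardize raw values z_j into input signals x_j."""
    z_bar = mean(raw)
    sigma_z = pstdev(raw)  # assumption: population standard deviation
    x = [(z - z_bar) / sigma_z for z in raw]
    return x, z_bar, sigma_z


def to_process_value(y, z_bar, sigma_z):
    """Map a network output y in (0, 1) back to a process value p."""
    return sigma_z * NormalDist().inv_cdf(y) + z_bar


# Made-up raw data, for illustration only.
raw = [340.0, 345.0, 350.0, 355.0, 360.0]
x, z_bar, sigma_z = condition(raw)
p_mid = to_process_value(0.5, z_bar, sigma_z)  # y = 0.5 maps to the mean
```

A network output of 0.5 corresponds to the mean of the raw data, and outputs closer to zero or one map to values progressively further below or above it.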

Artificial Neural Networks Back-Propagation Algorithm
The network training is accomplished based on the standard method of error back-propagation [33]. The free values of the neural network (weights and biases) are adjusted so that the network is capable of reproducing the training data with sufficient precision. Thus, for each sequence of process values $z(t_{n-r}), \ldots, z(t_{n-1})$, the network is intended to generate a prognosis $p_n$ for the subsequent process value $z(t_n)$ with a minimum prediction error given as [2]:

$$e_n = z(t_n) - p_n$$

and error energy given as [34]:

$$E = \frac{1}{2} \sum_{n=r+1}^{N} e_n^2$$

in which $N$ = length of the observed process record (training data); $r$ = order of the tapped delay line memory.
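To make the tapped delay line concrete, the following Python sketch builds the training pairs from a process record and evaluates an error energy. It assumes the usual half sum of squared prediction errors for the error energy, and both function names are hypothetical.

```python
def delay_windows(series, r):
    """Return (inputs, target) pairs: r past values and the next value."""
    return [(series[n - r:n], series[n]) for n in range(r, len(series))]


def error_energy(targets, prognoses):
    """Half the sum of squared prediction errors (assumed definition)."""
    return 0.5 * sum((z - p) ** 2 for z, p in zip(targets, prognoses))


# A record of N = 10 made-up values with a delay line of order r = 3
# yields N - r = 7 training pairs.
record = [0.1 * n for n in range(10)]
pairs = delay_windows(record, 3)
energy = error_energy([1.0, 2.0], [0.0, 0.0])
```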
The search for the optimum adjustment of the weights and biases is realized with the aid of a gradient descent method operating with a generalized delta rule. For each predicted process value, the prediction error of the neural network is retraced through the complete network (back-propagation) to compute changes of the weights and biases. This is done iteratively, as described subsequently, until the prediction error approaches the global minimum.
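One sequence of $r + 1$ successive process values $z(t_{n-r}), \ldots, z(t_n)$ is randomly selected from the training data with the aid of a discrete uniform distribution over the $N - r$ possible choices. Then, the error signal $e(q)$ in the current iteration step $q$ is determined with Equation (9) and is used to compute the local gradients $\partial E / \partial w_{kj}$ in the weight space, proportional to which the weights and biases are to be changed. Let $v^{(L)}_{j^{(L)}}(q)$ be the argument of the activation function of neuron $j^{(L)}$ in layer $L$, and $y^{(L-1)}_{j^{(L-1)}}(q)$ the signal from neuron $j^{(L-1)}$ in the previous layer $L - 1$. The new weights for the next iteration step are stated as [30]:

$$w^{(L)}_{j^{(L)} j^{(L-1)}}(q+1) = w^{(L)}_{j^{(L)} j^{(L-1)}}(q) + \alpha \left[ w^{(L)}_{j^{(L)} j^{(L-1)}}(q) - w^{(L)}_{j^{(L)} j^{(L-1)}}(q-1) \right] + \eta\, \delta^{(L)}_{j^{(L)}}(q)\, y^{(L-1)}_{j^{(L-1)}}(q)$$

where $\alpha$ is the momentum factor, $\eta$ is the learning rate,

$$\delta^{(L)}_{j^{(L)}}(q) = e(q)\, \varphi'\!\left(v^{(L)}_{j^{(L)}}(q)\right)$$

for the neuron $j^{(L)} = 1$ in the output layer, and

$$\delta^{(L)}_{j^{(L)}}(q) = \varphi'\!\left(v^{(L)}_{j^{(L)}}(q)\right) \sum_{j^{(L+1)}=1}^{J^{(L+1)}} \delta^{(L+1)}_{j^{(L+1)}}(q)\, w^{(L+1)}_{j^{(L+1)} j^{(L)}}(q)$$

for the neurons $j^{(L)}$ in the hidden layers. When the weight adjustment in iteration step $q$ is completed, the next sequence of $r + 1$ successive process values is randomly selected to proceed with the weight adjustment in iteration step $q + 1$. This procedure of iteratively adjusting the weights and biases is referred to as the sequential training mode, which possesses the advantage of being stochastic in nature. This induces a good performance in the search for the global minimum of the objective function.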
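The sequential training mode can be sketched in Python as follows. This is an illustration under stated assumptions, not the MATLAB program used in this work: it uses one hidden layer and a single output neuron, a generalized delta rule with learning rate `eta` and momentum factor `alpha`, a synthetic sine record already scaled into (0, 1), and arbitrarily chosen sizes, rates and seed.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(window, w_h, b_h, w_o, b_o):
    """Forward pass: r inputs -> hidden layer -> single output neuron."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, window)) + b)
              for row, b in zip(w_h, b_h)]
    return hidden, sigmoid(sum(w * h for w, h in zip(w_o, hidden)) + b_o)


def mean_sq_error(series, r, params):
    """Mean squared one-step prediction error over all N - r windows."""
    errs = [(series[n] - predict(series[n - r:n], *params)[1]) ** 2
            for n in range(r, len(series))]
    return sum(errs) / len(errs)


def train(series, r=3, n_hidden=4, eta=0.5, alpha=0.5, steps=5000, seed=0):
    rng = random.Random(seed)
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(r)] for _ in range(n_hidden)]
    b_h = [0.0] * n_hidden
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_o = 0.0
    # Previous changes of weights and biases, needed for the momentum term.
    dw_h = [[0.0] * r for _ in range(n_hidden)]
    db_h = [0.0] * n_hidden
    dw_o = [0.0] * n_hidden
    db_o = 0.0
    for _ in range(steps):
        n = rng.randrange(r, len(series))  # uniform over the N - r windows
        window, target = series[n - r:n], series[n]
        hidden, out = predict(window, w_h, b_h, w_o, b_o)
        e = target - out                       # prediction error
        delta_o = e * out * (1.0 - out)        # output-layer local gradient
        # Hidden-layer local gradients, computed with the old output weights.
        delta_h = [h * (1.0 - h) * delta_o * w for h, w in zip(hidden, w_o)]
        for k in range(n_hidden):
            dw_o[k] = alpha * dw_o[k] + eta * delta_o * hidden[k]
            w_o[k] += dw_o[k]
            db_h[k] = alpha * db_h[k] + eta * delta_h[k]
            b_h[k] += db_h[k]
            for j in range(r):
                dw_h[k][j] = alpha * dw_h[k][j] + eta * delta_h[k] * window[j]
                w_h[k][j] += dw_h[k][j]
        db_o = alpha * db_o + eta * delta_o
        b_o += db_o
    return w_h, b_h, w_o, b_o


# Synthetic record (not refinery data), already scaled into (0, 1).
series = [0.5 + 0.3 * math.sin(0.3 * n) for n in range(200)]
untrained = train(series, steps=0)
trained = train(series, steps=5000)
```

Comparing `mean_sq_error(series, 3, untrained)` with `mean_sq_error(series, 3, trained)` shows how far the randomly ordered single-window updates reduce the prediction error on this synthetic record.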

Monte Carlo Simulation
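A set of weighted particles (samples), drawn from the posterior distribution of the model parameters, is used to map integrals to discrete sums. More precisely, the posterior can be approximated by the following empirical estimate [35]:

$$\hat{p}(d\theta \mid D) = \frac{1}{N_s} \sum_{i=1}^{N_s} \delta_{\theta^{(i)}}(d\theta)$$

where the random samples $\{\theta^{(i)};\ i = 1, \ldots, N_s\}$ are drawn from the posterior distribution given the data $D$, and $\delta(d\cdot)$ denotes the Dirac delta function. Consequently, any expectations of the form [36]:

$$E[f(\theta)] = \int f(\theta)\, p(d\theta \mid D)$$

may be approximated by the following estimate [37]:

$$\bar{E}[f(\theta)] = \frac{1}{N_s} \sum_{i=1}^{N_s} f\!\left(\theta^{(i)}\right)$$

where the particles $\theta^{(i)}$ are assumed to be independent and identically distributed for the approximation to hold.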
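The idea of replacing an integral by a sample average can be shown with a small Python sketch. The standard normal distribution stands in here for the posterior of the model parameters, which is an assumption made purely for illustration, and `mc_expectation` is a hypothetical helper name.

```python
import random


def mc_expectation(f, sampler, n_samples, seed=0):
    """Approximate E[f(theta)] by averaging f over independent samples."""
    rng = random.Random(seed)
    return sum(f(sampler(rng)) for _ in range(n_samples)) / n_samples


# Stand-in distribution: theta ~ N(0, 1), for which E[theta^2] = 1.
estimate = mc_expectation(
    lambda theta: theta * theta,
    lambda rng: rng.gauss(0.0, 1.0),
    20000,
)
```

The estimate approaches the exact value of one as the number of samples grows, with a statistical error that shrinks in proportion to the inverse square root of the sample count.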

Algorithm and Architecture for the Design and Control of Crude Oil Distillation Column Using Artificial Neural Networks Based Monte Carlo Simulation
The algorithm for the design and control of the crude oil distillation column using artificial neural networks based Monte Carlo simulation, as stated in sub-sections 3.0.1, 3.0.2 and 3.0.3 above, is shown in Figure 2. The network architecture comprises thirteen (13) inputs, one hidden layer (nine nodes) and six (6) outputs (13-1-6), with a total of 28 nodes distributed over the layers. Figure 3 and Figure 4 show the neural network architectures for the design and control of the CODC respectively.

Results
In this research work, an artificial neural networks (ANN) model and an artificial neural networks based Monte Carlo (ANNBMC) simulation were developed separately for both the design and the controller of the crude oil distillation column (CODC) to check the accuracies of, and differences between, their outputs from the network architectures used. They were both validated using experimental data obtained from a functioning crude oil distillation column of the Port-Harcourt Refinery, Nigeria. Out of the one hundred and thirty (130) experimental data sets obtained, ninety percent (90%) were used for training the network while the remaining ten percent (10%) were used for testing the network to determine its prediction accuracy. A MATLAB program was written for the neural networks model and the artificial neural networks based Monte Carlo simulation. Figure 5 and Figure 6 show the training error against iteration number when ANN and ANNBMC were used for the design of the CODC with neural network architectures of 10 hidden neurons respectively.
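The 90/10 partition of the 130 data sets gives 117 training sets and 13 testing sets. A minimal Python sketch of such a split is shown below; the random shuffling and the seed are assumptions, since the text does not state how the data sets were assigned, and integers stand in for the actual records.

```python
import random


def split_data(records, train_fraction=0.9, seed=0):
    """Shuffle a copy of the records and split it into train and test."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Integers 0..129 stand in for the 130 experimental data sets.
train_set, test_set = split_data(list(range(130)))
```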

Discussion of Results
The maximum training error obtained when the artificial neural networks (ANN) model with an architecture of 10 hidden neurons was used for the crude oil distillation column (CODC) design was below 6, while convergence was achieved in fewer than 400 iterations, as shown in Figure 5. For the CODC design using ANNBMC, the training network converged to less than $10^{-6}$ in fewer than 2500 iterations with a training error of less than 1.6, as shown in Figure 6. A maximum training error of less than 1.5 was achieved with convergence in fewer than 700 iterations for the CODC controller using ANN only, while the maximum training error was less than 1.2 and converged to a value of less than $10^{-7}$ after about 5000 iterations for the CODC controller using ANNBMC, as depicted in Figure 7 and Figure 8 respectively. The prediction error decreases with increasing iteration number during the network training in all the plots of training error against iteration number. Although a larger number of iteration steps was required to achieve convergence of less than $10^{-7}$ when ANNBMC was used for both the design of the CODC and the controller than when ANN only was used, lower training errors were exhibited by the network architecture. The maximum relative errors between the experimental data and the calculated data obtained from the output variables of the neural network for the CODC design were 1.98% and 0.57% when ANN only and ANNBMC were used respectively. For the CODC controller, the maximum relative errors between the experimental data and the calculated data obtained from the output variables of the network architecture were 0.346% and 0.124% when ANN only and ANNBMC were used respectively. This indicates that the smallest errors were achieved when ANNBMC was used for both the CODC design and the controller prediction, which corresponds to its training network architecture having lower training error values with a higher number of iteration steps in both tasks. Thus, using ANNBMC shows that a better training of the network with a higher number of iteration steps reduces the relative error between the experimental data and the calculated data.
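The relative error measure used in this comparison can be written as a short Python sketch. The definition as the absolute difference divided by the experimental value is the usual one and is assumed here; the numbers in the example are invented and are not taken from Table 1 or Table 2.

```python
def relative_error_percent(experimental, computed):
    """Relative error of a computed value, in percent of the experiment."""
    return abs(experimental - computed) / abs(experimental) * 100.0


def max_relative_error_percent(exp_values, comp_values):
    """Largest relative error over paired experimental/computed values."""
    return max(relative_error_percent(e, c)
               for e, c in zip(exp_values, comp_values))


# Made-up values: errors of 1.0 % and 0.5 %, so the maximum is 1.0 %.
worst = max_relative_error_percent([100.0, 200.0], [101.0, 199.0])
```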
The linear regression analyses performed between the experimental data obtained from the refinery and the calculated data obtained from the neural network architecture, for the comparative test results for the prediction of T10 of AGO, T90 of Diesel, T100 of Kerosene, and the Naphtha, Kerosene, Diesel and AGO flow rates using ANN and ANNBMC, are shown in Figures 9-15 respectively. The correlation coefficients obtained for T100 of Kerosene, T90 of Diesel, T10 of AGO, and the naphtha, kerosene, diesel and AGO flow rates were 0.9387, 0.9879, 0.9236, 0.9315, 0.8065, 0.9499 and 0.9010 respectively when ANN only was used for the CODC design. The corresponding correlation coefficients were 0.9965, 0.9998, 0.9954, 0.9888, 0.9838, 0.9942 and 0.9958 respectively when ANNBMC was used for the CODC design. This analysis shows that the prediction accuracies for the CODC design were better when ANNBMC was used. The minimum and maximum prediction accuracies were 80.65% and 98.79% for ANN, and 98.38% and 99.98% for ANNBMC, in the CODC design. This indicates that the network architecture for the ANNBMC was rigorously trained for better prediction accuracies and thus predicts the design variables (output variables) of the crude oil distillation column better than using only ANN.
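The prediction accuracies quoted here are the correlation coefficients expressed as percentages (for example, 0.8065 corresponds to 80.65%). A minimal Python sketch of that calculation follows; it assumes the Pearson correlation coefficient between experimental and calculated values, and the function names and sample data are hypothetical.

```python
import math


def correlation_coefficient(xs, ys):
    """Pearson correlation between experimental and calculated values."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def prediction_accuracy_percent(r):
    """Express a correlation coefficient as a percentage."""
    return 100.0 * r


# Made-up, perfectly linear data gives a coefficient of one.
r_perfect = correlation_coefficient([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
accuracy = prediction_accuracy_percent(0.8065)
```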
The linear regression analyses performed between the experimental data obtained from the refinery and the calculated data obtained from the neural network architecture, for the comparative test results for the prediction of stripping steam to main column, LDO stripper, HDO stripper, and reflux flows 1, 2 and 3 using ANN and ANNBMC, are shown in Figures 16-21 respectively. The regression coefficients obtained between the experimental and calculated data were 0.9825, 0.9916, 0.9934, 0.9283, 0.9600 and 0.9717 for the stripping steam to main column, LDO stripper, HDO stripper, reflux flow 1 (Top Pump around), reflux flow 2 (Kerosene Pump around) and reflux flow 3 (Light Diesel Oil Pump around) respectively when ANN only was used for the CODC controller design. When ANNBMC was used for the CODC controller design, the corresponding regression coefficients were 0.9948, 0.9971, 0.9918, 0.9889, 0.9913 and 0.9944 respectively. The minimum and maximum prediction accuracies were 92.83% and 99.34% for ANN, and 98.89% and 99.71% for ANNBMC, in the CODC controller, so both give excellent predictions. However, ANNBMC still predicts better than ANN for the CODC controller. These results are also reflected in the minimal difference between the training errors (<1.6 and <1.2) of the network architectures for ANN and ANNBMC respectively for the CODC controller, as shown in Figure 7 and Figure 8.
Excellent predictions for both methodologies resulted from their maintenance at particular values for various inputs of the network architecture. The small deviations between their output variables and those of the PID controller from the refinery resulted from their excessive usage by the PID controller to meet the product specifications (T100 of Kerosene, T90 of Diesel and T10 of AGO, and the naphtha, kerosene, diesel and AGO flow rates). Thus, the artificial neural networks based Monte Carlo simulation controller is effective for predicting the output variables and for relating, as fully as possible, the non-linear behaviour existing among the various variables of the process.

The parameters $\alpha$ and $\eta$ in the weight adjustment are introduced to control the numerical behaviour of the iteration. Whereas the learning rate $\eta > 0$ determines the degree to which the actual error gradients affect the weight change, the momentum factor $\alpha \in [0, 1)$ acts as a delay parameter in the weight adjustment.

The inputs to the column are crude oil and steam flow while the outputs are Naphtha, Kerosene, Light Diesel Oil and Heavy Diesel Oil. Some quantities of Naphtha, Kerosene and Light Diesel Oil (reflux flows) are returned into the column while stripping (distillate flows) is sent to the storage tank.

Figure 2. Algorithm for design and control of crude oil distillation column using artificial neural networks based Monte Carlo simulation [2].

Figure 3. Architecture for the design of crude oil distillation column [25].

Figure 5. Training error vs iteration number for CODC design using ANN with 10 hidden neurons.

Figure 6. Training error vs iteration number for CODC design using ANNBMC with 10 hidden neurons.

Hint:
Comp a = Computed Values for the CODC Design using Artificial Neural Networks (ANN); Comp b = Computed Values for the CODC Design using Artificial Neural Networks Based Monte Carlo Simulation (ANNBMC); Exp 1 = Experimental Values of the CODC Design; Err a (%) = Error Percent between Exp 1 and Comp a; Err b (%) = Error Percent between Exp 1 and Comp b.

Hint:
Comp c = Computed Values for the CODC Controller using Artificial Neural Networks (ANN); Comp d = Computed Values for the CODC Controller using Artificial Neural Networks Based Monte Carlo Simulation (ANNBMC); Exp 2 = Experimental Values of the CODC Controller; Err c (%) = Error Percent between Exp 2 and Comp c; Err d (%) = Error Percent between Exp 2 and Comp d.

Figure 7. Training error vs iteration number for CODC controller using ANN with 10 hidden neurons.

Figure 16. Comparative test results for the prediction of stripping steam to main column using ANN and ANNBMC.

Figure 17. Comparative test results for the prediction of LDO stripper using ANN and ANNBMC.

Figure 18. Comparative test results for the prediction of HDO stripper using ANN and ANNBMC.

Figure 19. Comparative test results for the prediction of reflux flow 1 using ANN and ANNBMC.

Table 1(a), Table 1(b) and Table 1(c) show the test comparison results obtained between the values from the trained ANN and ANNBMC networks for the design of the CODC, while Table 2(a) and Table 2(b) show the results obtained for the controller of the CODC, compared with the experimental data of the CODC obtained from the Port-Harcourt Refinery, Nigeria.

Table 1. (a)-(c) Test results obtained using ANN and ANNBMC for crude oil distillation column design compared with experimental data from the refinery.

Table 2. (a)-(b) Test results obtained using ANN and ANNBMC for crude oil distillation column controller compared with experimental data from the refinery.