Application of Bayesian Approach in the Parameter Estimation of Continuous Lumping Kinetic Model of Hydrocracking Process

Hydrocracking is a catalytic reaction process used in petroleum refineries to convert the higher-boiling residue of crude oil into lighter hydrocarbon fractions such as gasoline and diesel. In this study, a modified continuous lumping kinetic approach is applied to model the hydrocracking of vacuum gas oil. The model is modified to take into consideration the effect of the reactor temperature on the reaction yield distribution. The model is calibrated by maximizing the likelihood function between the modeled and measured data at four different reactor temperatures. Bayesian parameter estimation is also applied to obtain the confidence interval of the model parameters by considering the uncertainty associated with the measurement errors and the model structural errors. Monte Carlo simulation is then applied to the posterior range of the model parameters to obtain the 95% confidence interval of the model outputs for each individual fraction of the hydrocracking products. Good agreement is observed between the output of the calibrated model and the measured data points. The Bayesian approach based on Markov Chain Monte Carlo simulation is shown to be efficient in quantifying the uncertainty associated with the parameter values of the continuous lumping model.


Introduction
S. S. H. Boosari et al.
Hydrocracking is a catalytic process in which hydrocarbon molecules with longer chains break into lighter hydrocarbons with shorter chains. Hydrocracking units in petroleum refineries are usually fed with the heavy fractions of residual oil that are not commercially valuable, and they produce more valuable lighter hydrocarbon fractions such as gasoline and diesel. The hydrocracking process is especially important because it maximizes the use of crude oil as its resources decline substantially. The turn toward unconventional oil and gas reservoirs in recent years is evidence of this importance [1] [2] [3]. Hydrocracking is a catalytic reaction carried out at high temperature and high pressure in the presence of hydrogen [4]. Due to coke formation, poison deposition, and solid-state transformation, catalyst deactivation occurs during the process lifetime and reduces the production yield. To compensate for the effects of catalyst deactivation during the catalyst lifetime, the reactor operating temperature should be adjusted (mainly increased) to maintain the desired production yield [5]. From this point of view, simulation is an essential tool for adjusting the operating temperature while keeping the production yields in an acceptable range. Process simulation is also necessary for reactor design to achieve selective intermediate distillate production [6].
Hydrocarbons cover a broad range of molecular weights and different types of organic compounds, which makes it difficult to monitor the consumption/production rates in the mixture of reactions in the hydrocracking process. Therefore, process engineers usually lump together groups of organic compounds that have similar physicochemical properties, based on their true boiling point (TBP) temperature rather than their molecular shape. Accordingly, kinetic modeling of heavy oil hydrocracking can follow either the discrete or the continuous lumping approach [4], with each lump indexed by its TBP temperature range.
The discrete lumps kinetic model considers each lump, characterized by its TBP temperature, as one reactive species [6] [7] [8] [9]. For example, Sedighi et al. [10] studied a 6-lump kinetic model by considering vacuum gas oil (VGO) with a TBP temperature greater than 380˚C, diesel (260˚C-380˚C), kerosene (150˚C-260˚C), heavy naphtha (90˚C-150˚C), light naphtha (40˚C-90˚C), and gases (<40˚C) as the discrete lumps of the process. In the discrete lumps approach, the heavier products associated with a higher TBP temperature can partially convert to the lighter products in a parallel/series reaction chain, and each reaction path has its own kinetic rate. Therefore, the number of model parameters grows with the number of lumps considered. When the number of model parameters is higher, the parameter estimation algorithm needs more measured data in order to estimate the parameter values accurately. Balasubramanian et al. [11] ... desired lump fractions [6]. Therefore, the continuous model has an advantage over the discrete lumped model, because any fraction of lumped products can be calculated from the same continuous distribution curve.
Both the discrete and the continuous kinetic models have unknown parameters that need to be estimated in order to calibrate the model. Regression techniques have been successfully used to minimize the misfit between the modeled and measured data. Sadeghi et al. [16] and Elizalde et al. [6] applied the continuous lumping model to different sets of measured data, minimized the least-squares error between the modeled and measured points, and obtained a point estimate of the model parameters. Kumar et al. [17] applied hybrid particle swarm optimization to estimate the continuous lumping parameter values.
However, point estimation methods do not account for the uncertainty associated with the parameter values. The measurement errors, the model structural error (due to model simplifications), and errors associated with the operating conditions (such as the isothermal reactor assumption) are the main sources of uncertainty in hydrocracking kinetic models and can affect the values of the estimated parameters. To address these uncertainties in the parameter estimation, a probabilistic approach can be considered. Different techniques have been applied in scientific fields to deal with uncertainties.
Commonly, Monte Carlo simulations over a pre-assumed range of parameters [18] [19] [20], the autoregressive moving average for uncertainty analysis associated with time series [21] [22] [23], the generalized likelihood uncertainty estimation (GLUE) method [24], and the Bayesian approach [25] [26] [27] [28] are used for uncertainty quantification. Albrecht [29] studied four types of reaction models of increasing complexity (a linearized second-order reaction, a single elementary reaction, a single elementary reaction coupled with temperature dependency, and a catalytic reaction cycle) with the aim of evaluating the confidence interval associated with each model parameter. He applied different techniques to regress the model parameters to experimental observations with artificially added noise, and concluded that for highly non-linear models, Markov Chain Monte Carlo (MCMC) algorithms utilizing a Bayesian approach estimate the uncertainty accurately.
Parameter estimation and uncertainty analysis using the Bayesian approach are widely used in different fields of science. Alikhani et al. [30] applied Bayesian inference to estimate the confidence interval of groundwater residence time distributions by using multiple groundwater age tracers as observed data points.
Alikhani et al. [31] evaluated the information content of long-term measured data from a municipal wastewater treatment plant to obtain the confidence interval of activated sludge model parameters. They assessed the level of information obtained from the measured data points by comparing the entropy of the posterior distribution of the parameters to that of their prior distributions. They concluded that in multi-dimensional parameter estimation, the level of information obtained from a set of measured data is different for each parameter, and therefore suggested performing a sensitivity analysis to select the subset of parameters to be estimated. Sun et al. [32] applied the Bayesian approach to estimate the isomer kinetic decay rate in the phytate (IP6) degradation pathway. Their reaction modeling network is conceptually similar to the discrete lumps kinetic models, in which higher-order molecules in a hierarchical degradation pathway convert to lower-order molecules. Therefore, the Bayesian inference approach is assumed to be suitable for this study to obtain the confidence interval of the hydrocracking model parameters.
Nonetheless, the Bayesian parameter estimation framework needs to be solved by applying MCMC algorithms, which usually require a relatively large number of model simulations. Therefore, an efficient numerical algorithm [33] should be selected to reduce the overall computation time of each simulation run.
In this study, the continuous lumping kinetic model is slightly modified to take into consideration the temperature dependency of the parameters. The Bayesian parameter estimation approach is applied to measured data obtained from a hydrocracking unit of a petroleum refinery to obtain the posterior credible intervals of the model parameters. Finally, Monte Carlo simulation is performed by taking samples over the posterior range of the estimated parameters to evaluate the confidence interval of the model output concentrations under different operating conditions.

Description of the Model
The continuous lumping kinetic model in this study is explained in detail in [6] and [13] based on the original model presented in Laxminarasimhan et al. [15].
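The first-order reactivity ($k$) of each component can be related to its $\theta$ by:

$$\frac{k}{k_{\max}} = \theta^{1/\alpha}$$

where $k_{\max}$ is the reactivity of the component with the highest TBP ($\theta = 1$) and $\alpha$ is the model parameter at the operating temperature $T$. The mass balance equation can then be obtained as:

$$\frac{\mathrm{d}c(k,t)}{\mathrm{d}t} = -k\,c(k,t) + \int_{k}^{k_{\max}} p(k,K)\,K\,c(K,t)\,D(K)\,\mathrm{d}K$$

where the right-hand side gives the consumption rate (the first term) and the production rate (the second term) of $c(k,t)$, the concentration of the component with reactivity $k$. $D(K)$ is the species-type distribution function that transfers the system with $N$ discrete components to a continuous lumping space with the following equation:

$$D(k) = \frac{N\alpha}{k_{\max}^{\alpha}}\,k^{\alpha-1}$$

The model is called continuous because a cumulative distribution is assigned to the fraction yield of the species. $p(k,K)$ represents the production yield of the component with reactivity $k$ from cracking of the component with reactivity $K$ and is given by:

$$p(k,K) = \frac{1}{S_0\sqrt{2\pi}}\left[\exp\left(-\left[\frac{(k/K)^{a_0}-0.5}{a_1}\right]^{2}\right) - A + B\right]$$

where $A$ and $B$ are defined as:

$$A = \exp\left(-\left(\frac{0.5}{a_1}\right)^{2}\right), \qquad B = \delta\left(1-\frac{k}{K}\right)$$

where $a_0$, $a_1$, and $\delta$ are the model parameters. $S_0$ is the condition that satisfies $\int_{0}^{K} p(k,K)\,D(k)\,\mathrm{d}k = 1$, and can be obtained as:

$$S_0 = \int_{0}^{K}\frac{1}{\sqrt{2\pi}}\left[\exp\left(-\left[\frac{(k/K)^{a_0}-0.5}{a_1}\right]^{2}\right) - A + B\right]D(k)\,\mathrm{d}k$$

Solving the continuous model results in a distribution of the component concentrations $c(k,t)$. To obtain the weight fraction of each discrete fraction, the following integration can be performed:

$$c_{i,j}(t) = \int_{k_i}^{k_j} c(k,t)\,D(k)\,\mathrm{d}k$$

where $c_{i,j}(t)$ is the concentration of a specific fraction between $TBP_i$ and $TBP_j$.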
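As an illustration of the yield function described above, the following minimal Python sketch evaluates $p(k,K)$ and its normalisation constant $S_0$ numerically. It assumes the standard continuous lumping forms of Laxminarasimhan et al. [15]: $k = k_{\max}\theta^{1/\alpha}$, $D(k) = N\alpha k^{\alpha-1}/k_{\max}^{\alpha}$, and the Gaussian-type kernel with $A = \exp(-(0.5/a_1)^2)$ and $B = \delta(1 - k/K)$. The function names and the parameter values are placeholders chosen for the example, not values from this study. Because $D(k)\,\mathrm{d}k = N\,\mathrm{d}\theta$ under these forms, the integral is taken on a uniform $\theta$ grid.

```python
import numpy as np


def reactivity(theta, k_max, alpha):
    """Reactivity k of a species with dimensionless TBP theta: k = k_max * theta**(1/alpha)."""
    return k_max * np.asarray(theta, dtype=float) ** (1.0 / alpha)


def _trapezoid(y, x):
    """Composite trapezoidal rule on a 1-D grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def _kernel(k, K, a0, a1, delta):
    """Unnormalised yield kernel [exp(-(((k/K)**a0 - 0.5)/a1)**2) - A + B] / sqrt(2*pi)."""
    ratio = np.asarray(k, dtype=float) / K
    A = np.exp(-((0.5 / a1) ** 2))
    B = delta * (1.0 - ratio)
    return (np.exp(-(((ratio ** a0 - 0.5) / a1) ** 2)) - A + B) / np.sqrt(2.0 * np.pi)


def yield_distribution(k, K, k_max, alpha, a0, a1, delta, n_species=1.0, n_grid=4001):
    """p(k, K), normalised so that the integral of p(k, K) * D(k) over 0..K equals 1."""
    # D(k) dk = N dtheta, so S0 is integrated on a uniform theta grid from 0 to theta(K).
    theta_K = (K / k_max) ** alpha
    theta = np.linspace(0.0, theta_K, n_grid)
    k_grid = reactivity(theta, k_max, alpha)
    s0 = n_species * _trapezoid(_kernel(k_grid, K, a0, a1, delta), theta)
    return _kernel(k, K, a0, a1, delta) / s0


# Placeholder parameter values (assumptions for the example only).
params = dict(k_max=1.5, alpha=0.8, a0=3.0, a1=5.0, delta=0.01)
K = 1.0
theta = np.linspace(0.0, (K / params["k_max"]) ** params["alpha"], 2001)
p = yield_distribution(reactivity(theta, params["k_max"], params["alpha"]), K, **params)
normalisation = _trapezoid(p, theta)  # with n_species = 1 this should be close to 1
```

The normalisation check at the end is a quick way to confirm that a chosen set of $a_0$, $a_1$, and $\delta$ produces a proper yield distribution before the mass balance is solved.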
To take into consideration the temperature effect on the model parameters, an Arrhenius-type relationship [34] is adopted into the model parameters as:

Bayesian Parameter Estimation
The main idea of applying Bayesian inference to parameter estimation is taken from the work by Alikhani et al. [31]. In this approach, a prior distribution is first assigned to each parameter. The prior range can be obtained from the values reported in other studies for similar operating conditions.
The confidence in the prior range can be improved by applying Bayes' theorem given the measured data points. Bayes' theorem can be written as:

$$p(\lambda, \sigma \mid \tilde{Y}) \propto p(\tilde{Y} \mid \lambda, \sigma)\,p(\lambda)$$

Applying Bayes' theorem yields the posterior distribution $p(\lambda, \sigma \mid \tilde{Y})$ of the model parameters $\lambda$ given a measured data set $\tilde{Y}$, while $p(\lambda)$ represents the prior distribution. A normal distribution is considered for all the parameters in this study.
In Bayes' theorem above, $p(\tilde{Y} \mid \lambda, \sigma)$ represents the likelihood function. By assuming a Gaussian distribution of errors, the likelihood function (Equation (12)) can be written as:

$$p(\tilde{Y} \mid \lambda, \sigma) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{\left(\tilde{y}_i - y_i(\lambda)\right)^{2}}{2\sigma^{2}}\right)$$

where $n$ is the total number of measured data points, $\tilde{y}_i$ is a single measured point, and $y_i(\lambda)$ is its corresponding model output for a given parameter set $\lambda$. $\sigma$ is the likelihood standard deviation, representing the quantified uncertainty associated with the measured data points. $\lambda$ represents the five kinetic parameters ($k_{\max}$, $\alpha$, $a_0$, $a_1$, and $\delta$) at the reference temperature ($T_r$).
Bayesian parameter estimation is particularly useful for systems whose unknown parameters depend strongly on the operating conditions, so that parameter values calibrated in one study are not suitable for direct use in another. Nonetheless, the information about the parameter values in other studies can be used to construct the prior distributions, and the Bayesian approach can extract the information in the observed data set to enhance our confidence in the parameter values [31]. To obtain the posterior distribution by using the Bayesian approach, the mechanistic model needs to be solved by using Markov Chain techniques in a random walk approach [30]. The Metropolis-Hastings algorithm is applied in this study to sample parameter sets $\lambda$ from the posterior distribution $p(\lambda, \sigma \mid \tilde{Y})$.
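The sampling procedure described above can be sketched in a few lines of Python. This is a minimal random-walk Metropolis-Hastings example under stated assumptions, not the implementation used in the study: a single first-order decay curve stands in for the continuous lumping model, the data are synthetic, and the prior means, prior standard deviations, and proposal step sizes are placeholder values. The unknowns are one rate constant and the likelihood standard deviation $\sigma$, both with normal priors.

```python
import numpy as np


def log_gaussian_likelihood(y_obs, y_model, sigma):
    """Log of the Gaussian likelihood with a common standard deviation sigma."""
    n = y_obs.size
    return -n * np.log(sigma * np.sqrt(2.0 * np.pi)) - 0.5 * np.sum((y_obs - y_model) ** 2) / sigma**2


def log_normal_prior(x, mean, std):
    """Sum of independent normal log-densities, one per parameter."""
    return float(np.sum(-0.5 * ((x - mean) / std) ** 2 - np.log(std * np.sqrt(2.0 * np.pi))))


def metropolis_hastings(log_post, start, step, n_iter, rng):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    current = np.asarray(start, dtype=float)
    current_lp = log_post(current)
    chain = np.empty((n_iter, current.size))
    accepted = 0
    for i in range(n_iter):
        proposal = current + step * rng.standard_normal(current.size)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio), evaluated in log space.
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        chain[i] = current
    return chain, accepted / n_iter


# Synthetic stand-in data (assumption): y = exp(-rate * t) with rate = 0.8 and noise 0.02.
rng = np.random.default_rng(0)
t = np.linspace(0.25, 2.0, 8)
y_obs = np.exp(-0.8 * t) + 0.02 * rng.standard_normal(t.size)

prior_mean = np.array([1.0, 0.05])  # placeholder priors for (rate, sigma)
prior_std = np.array([0.5, 0.05])


def log_posterior(x):
    rate, sigma = x
    if rate <= 0.0 or sigma <= 0.0:
        return -np.inf  # reject non-physical values
    return log_gaussian_likelihood(y_obs, np.exp(-rate * t), sigma) + log_normal_prior(x, prior_mean, prior_std)


chain, acceptance = metropolis_hastings(log_posterior, [0.5, 0.05], np.array([0.02, 0.005]), 6000, rng)
posterior = chain[2000:]  # discard burn-in
rate_interval = np.percentile(posterior[:, 0], [2.5, 97.5])  # 95% credible interval
```

Working with log-densities avoids numerical underflow when the product over all data points becomes very small, which is why the acceptance test compares differences of log-posteriors.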

Measured Data Points
Measured data points were obtained from the case study introduced in [5]. The measured data were collected from a hydrocracking unit of a petroleum refinery where VGO is fed at 395˚C and 20 MPa into two reactors in series. The reactors are filled with a NiO-WO₃/SiO₂-Al₂O₃ catalyst with a density of 790 kg/m³, a specific surface area of 235 m²/g, and a porosity of 0.5. The hydrocracking products were collected (Table 1) in the form of liquefied petroleum gas (LPG, <39˚C), naphtha (39˚C-150˚C), kerosene (150˚C-250˚C), diesel (250˚C-380˚C), and VGO (>380˚C). The measured data points at four different reactor temperatures (390˚C, 410˚C, 430˚C, and 450˚C) and six different residence times are shown in Table 1. Part of the measured data points is also shown in Figure 1 and Figure 2.

Table 1. Measured data points of the hydrocracking process used in this study [5].

The parameter values obtained by GA optimization were used as starting ... This finding is in good agreement with the parameter-temperature correlation presented in Elizalde et al. [6]. The small value of the likelihood standard deviation ($\sigma$) shows that the level of uncertainty in the presented system is low, indicating that the errors associated with the measured data and the model structural errors are low.

Results and Discussion
The posterior range of the parameters in Table 2 can be dynamically re-evaluated when a new measured data set is introduced. In this way, the current posterior range is supplied as the prior range to the Bayesian approach, and the updated posterior range is obtained by applying the new observed data.
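The idea of reusing a posterior as the next prior can be illustrated with a deliberately simplified case: one parameter with a normal prior, observed directly with Gaussian errors of known standard deviation, for which the posterior is available in closed form. This is a hedged sketch of the principle only; the study itself uses a multi-parameter model sampled by MCMC, and the function name and numbers below are assumptions made for the example.

```python
import numpy as np


def update_normal(prior_mean, prior_std, data, sigma):
    """Closed-form posterior (mean, std) for a normal prior and Gaussian data with known sigma."""
    data = np.asarray(data, dtype=float)
    prior_precision = 1.0 / prior_std**2
    data_precision = data.size / sigma**2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data.sum() / sigma**2)
    return post_mean, np.sqrt(post_var)


rng = np.random.default_rng(2)
first_batch = rng.normal(0.8, 0.05, size=10)   # synthetic "old" measurements
second_batch = rng.normal(0.8, 0.05, size=10)  # synthetic "new" measurements

# Sequential updating: the posterior after the first batch is the prior for the second.
mean_1, std_1 = update_normal(1.0, 0.5, first_batch, sigma=0.05)
mean_seq, std_seq = update_normal(mean_1, std_1, second_batch, sigma=0.05)

# Single update with all the data at once, for comparison.
mean_all, std_all = update_normal(1.0, 0.5, np.concatenate([first_batch, second_batch]), sigma=0.05)
```

In this conjugate case the sequential and the single-pass results coincide, and the posterior standard deviation shrinks as each batch is added.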
To evaluate the ability of the model to reproduce the observed data, a Monte Carlo simulation was performed in which 5000 parameter-set realizations were randomly sampled from the 95% posterior range. The model outputs were statistically analyzed, and the 95% confidence intervals of the weight fractions of the hydrocracking products were obtained and are illustrated in Figure 2.
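The propagation step described above can be sketched as follows. This is an illustrative Python example under assumptions: `output_interval` is a hypothetical helper, the posterior samples are synthetic, and a one-parameter decay curve stands in for the continuous lumping model. Parameter sets are first restricted to the central 95% range of every parameter, then 5000 of them are drawn at random and pushed through the model, and the 2.5th and 97.5th percentiles of the outputs form the band.

```python
import numpy as np


def output_interval(model, posterior_samples, inputs, n_draws=5000, level=0.95, rng=None):
    """Percentile band of model outputs for parameter sets drawn from the posterior range."""
    rng = np.random.default_rng() if rng is None else rng
    samples = np.asarray(posterior_samples, dtype=float)  # shape (n_samples, n_parameters)
    lo_q = 100.0 * (1.0 - level) / 2.0
    hi_q = 100.0 - lo_q
    # Keep only the parameter sets that lie inside the central range of every parameter.
    lower, upper = np.percentile(samples, [lo_q, hi_q], axis=0)
    inside = np.all((samples >= lower) & (samples <= upper), axis=1)
    pool = samples[inside]
    # Draw realizations with replacement and run the model for each one.
    draws = pool[rng.integers(0, pool.shape[0], size=n_draws)]
    outputs = np.array([model(p, inputs) for p in draws])
    return np.percentile(outputs, [lo_q, hi_q], axis=0)


# Synthetic posterior for a single rate constant (assumption for the example).
rng = np.random.default_rng(1)
fake_posterior = rng.normal(0.8, 0.02, size=(4000, 1))
times = np.array([0.5, 1.0, 2.0])
band = output_interval(lambda p, t: np.exp(-p[0] * t), fake_posterior, times, rng=rng)
```

Each column of `band` holds the lower and upper limit for one output point, which corresponds to one floating bar in a plot such as Figure 2.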

Conclusions
In this study, the parameter estimation of the hydrocracking process model is assessed. The Genetic Algorithm and the Bayesian parameter estimation approach are applied to obtain the point estimate and the credible interval of the model parameters. The following general conclusions are obtained from the presented study:
1) The continuous lumping kinetic model was able to simulate the hydrocracking of VGO in the range of 390˚C to 450˚C and for residence times of up to 2 hr.
2) The model parameters were estimated by maximizing the likelihood function between the model outputs and measured data points.
3) The temperature dependency of the model parameters was successfully embedded into the continuous lumping model and the temperature dependency coefficients were estimated.
4) The uncertainty associated with the parameter values was evaluated by applying Bayes' theorem and the MCMC technique, and the posterior range of the parameters was obtained.
5) Monte Carlo simulations were performed to evaluate the confidence interval of hydrocracking products' yield.
Balasubramanian et al. [11] considered a 5-lump model based on carbon number and ended up estimating 40 unknown parameters for their model. The continuous lumping kinetic model [12] [13] [14] [15] can overcome this drawback because its number of parameters is independent of the number of lumped fractions in the model. In the continuous model approach, a distribution of reaction products over a range of TBP temperatures represents the model's output. Integration over a discrete boiling point range represents the concentration of the desired lump fractions [6].
The model is defined based on a dimensionless temperature ($\theta$), valued between 0 and 1, that spans the range of TBPs from the lowest ($TBP_{low}$) to the highest ($TBP_{high}$) TBP value in the hydrocracking process:

$$\theta = \frac{TBP - TBP_{low}}{TBP_{high} - TBP_{low}}$$

The Arrhenius-type relationship links the value of each model parameter at temperature $T$ to its value at the reference temperature $T_r$; $\beta$ is the temperature dependency coefficient for any model parameter $\eta$. With this modification, the model can be applied to a broader range of operating temperatures with a unique set of parameter values. The continuous kinetic model is theoretically applicable to a broad range of hydrocracking products and operating conditions, provided the model outputs are in good agreement with the experimental results.
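A short Python sketch may help to make the two definitions in this paragraph concrete. The normalisation of the TBP follows the description of $\theta$ directly. The temperature correction, however, is an assumption: the exact Arrhenius-type expression is not recoverable from the text here, so the sketch uses one common form, $\eta_T = \eta_{T_r}\exp\left[\beta\,(1/T_r - 1/T)\right]$ with temperatures in kelvin. The function names, the TBP limits, and the numerical values of $\eta$ and $\beta$ are placeholders.

```python
import math


def dimensionless_tbp(tbp, tbp_low, tbp_high):
    """Map a true boiling point onto theta in [0, 1]."""
    return (tbp - tbp_low) / (tbp_high - tbp_low)


def temperature_adjusted(eta_ref, beta, temp_c, temp_ref_c=390.0):
    """ASSUMED Arrhenius-type form: eta_T = eta_ref * exp(beta * (1/T_ref - 1/T)), T in kelvin."""
    temp_k = temp_c + 273.15
    temp_ref_k = temp_ref_c + 273.15
    return eta_ref * math.exp(beta * (1.0 / temp_ref_k - 1.0 / temp_k))


# Placeholder values (assumptions): TBP limits in degrees Celsius, eta and beta arbitrary.
theta_diesel_cut = dimensionless_tbp(380.0, tbp_low=0.0, tbp_high=600.0)
k_max_390 = temperature_adjusted(2.0, 3000.0, 390.0)  # equals the reference value
k_max_450 = temperature_adjusted(2.0, 3000.0, 450.0)  # larger when beta > 0
```

With this form, a single pair ($\eta_{T_r}$, $\beta$) per parameter describes all operating temperatures, which matches the count of five kinetic parameters plus five temperature dependency coefficients.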

A genetic algorithm (GA) was used to obtain the point estimates of the modified continuous kinetic model parameters by maximizing the likelihood function (Equation (12)). In total, 10 model parameters plus the likelihood standard deviation were treated as unknown parameters to be estimated from the observed measured data. The results from the GA for two operating temperatures, 390˚C and 450˚C, are shown in Figure 1. The results from the GA optimization indicate that the continuous lumping model is reasonably able to capture the trend and magnitude of the measured data. Moreover, the results in Figure 1 show that introducing the temperature dependency of the parameters into the kinetic model works very well with the same set of parameter values at all four reactor temperatures. In this study, 390˚C is considered as the reference temperature, and the five main parameters of the continuous model ($k_{\max}$, $\alpha$, $a_0$, $a_1$, and $\delta$) are estimated at this temperature. The observed data points at the other temperatures are used to estimate the temperature dependency values of the parameters.

Figure 1. Modeled (lines) vs. measured (symbols) data for reactor temperatures of (left) 390˚C and (right) 450˚C. The GA point estimate of the model parameters is used to obtain the modeled data.

Figure 2. 95% model output confidence interval (floating bars) vs. measured data (circles) at four different operating temperatures with three different residence times.

The results in Figure 2 show that most of the measured data points fall inside the 95% model output range (shown as floating bars). The results also show that the continuous lumping model predicted the observed data for the lightest fraction (LPG) poorly. Nonetheless, the model predicts the intermediate and heavy fractions with good agreement. The model confidence intervals reflect the level of uncertainty associated with the parameter values. They can be used in the decision-making step, to obtain factors of safety in the design step, and to increase the level of accuracy in the data measuring (and sampling) step; in general, they can enhance the efficiency of modeling and simulation.
meters.A good agreement between the calibrated model output and the measured data is observed showing that the continuous lumping model is able to simulate the presented hydrocracking process.The results also show that the modified continuous lumping model considering the temperature dependency of the parameters extends the ability of the model on different operating temperature.Applying the Bayesian approach resulted in the 95% credible interval of model parameters reflecting the uncertainty associated with parameter values.A probabilistic simulation is also performed by using the posterior range of the parameters to obtain the confidence interval of model outputs.The results show that for the studied process unit, the uncertainty associated with the measured data and the model structural error is quantitatively low.The Bayesian approach based on the Markov Chain Monte Carlo simulation is shown to be efficient to quantify the uncertainty associated with the parameter values of the continuous lumping model.

Table 2. 95% Bayesian credible interval and the standard deviation of the modified continuous lumping kinetic model parameters.
One system consisting of five fractions (LPG, naphtha, kerosene, diesel, and VGO) is modeled by using the continuous lumping approach. The continuous model is modified by considering the effect of the reactor temperature on the parameter values.