
The determination of water saturation is a key step in reservoir characterization and in predicting future reservoir performance in terms of production. Water saturation is even more important for reservoirs with low porosity and permeability, such as shale and tight formations. In this communication, two advanced artificial intelligence strategies, least squares support vector machine (LSSVM) and gene expression programming (GEP), are applied to develop reliable predictive models for calculating the water saturation of shale and tight reservoirs. To this end, an extensive core and log data bank from 12 wells of a Mesaverde Group tight reservoir located in the Western US has been analysed. The results indicate that the water saturation values estimated by the models developed in this study are in satisfactory agreement with the actual log data. Furthermore, the new methods proposed in this study are useful for the characterization of shale and tight reservoirs and can be implemented in the relevant software.

Reservoir characterization is a fundamental task in determining future reservoir performance. Without accurate and appropriate reservoir characterization, a large number of errors may enter the calculations used to predict reservoir performance, which can cause the loss of important values such as reserves estimations and hydrocarbon production, etc. [

Over the years, many attempts have been made to calculate formation water saturation accurately. However, few studies have addressed the water saturation of shale and tight reservoirs. Fertl and Hammack [

Al-Blushi et al. [

The previously published models available in the literature therefore fail to cover a wide range of petrophysical properties when estimating water saturation. Furthermore, these models were not developed on the basis of large numbers of water saturation data points, and predicting water saturation with them requires time-consuming calculations, graph reading, optimization of coefficients, etc. Therefore, simple-to-use predictive models as well as empirically derived methods are needed. In this study, a large data bank of more than four thousand petrophysical data points is used to develop a reliable artificial-intelligence-based model built on the least squares support vector machine (LSSVM) and an empirically derived method based on the gene expression programming (GEP) algorithm. Additionally, the most important error parameters are calculated to quantify the accuracy of the proposed models, together with a graphical error analysis including scatter diagrams and contour maps.

A literature survey of previously published research showed that true formation resistivity (R_{t}), porosity from the neutron log (PHIN), porosity from the density log (PHID), effective porosity from the density log (PHIDE), effective porosity from the neutron log (PHINE), effective porosity from the density and neutron logs (PHIDNE), effective porosity (PHIE), total porosity (PHIX), bulk density (RHOB), photoelectric factor (PE), and volume of shale from gamma ray (V_{sh}) are the most effective parameters for the calculation of water saturation (S_{w}) [_{sh}, R_{t}, PHIN, PHIDE, PHIDNE. Therefore, all of these parameters are correlated collectively. The petrophysical properties in the data bank that have the strongest effect on water saturation are taken as input parameters for model development.

Parameter | R_{t}/(Ω∙m) | PHIN/1 | V_{sh}/1 | PHIDE/1 | PHIDNE/1 | PHIE/1 | S_{w}/1 |
---|---|---|---|---|---|---|---|
Min. | 6.463 | 0.037 | 0 | 0 | 0 | 0 | 0.095 |
Avg. | 29.983 | 0.158 | 0.562 | 0.038 | 0.035 | 0.029 | 0.801 |
Max. | 323.999 | 0.57 | 1 | 0.5 | 0.45 | 0.297 | 1 |
Type | Input | Input | Input | Input | Input | Input | Output |

Least squares support vector machines (LSSVM) are least squares versions of support vector machines (SVM), a set of related supervised learning methods that analyse data and recognize patterns and that are used for classification and regression analysis. The LSSVM was proposed by Suykens et al. [

In these equations, w is the regression weight vector, T denotes the transpose, e is the regression error on the training items, b is the intercept of the linear regression model, and φ denotes the feature map. The cost function of the LSSVM algorithm, Q_{LSSVM}, is calculated as below [

In which

With the assumption of linear regression between independent and dependent LSSVM variables, Equation (1) can be re-written as [

With the subsequent equation the Lagrange multipliers,

By means of a kernel function, the linear regression equation is transformed into a nonlinear form [

In the above equation,

The radial basis function (RBF) is the most widely used kernel function [

Here

where S is the water saturation, pred. and exp. stand for the predicted and the experimental (actual) data, respectively, and n_{s} is the initial population number [
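As a rough illustration of the LSSVM regression outlined above, the following Python sketch solves the standard LSSVM linear system for the intercept b and the Lagrange multipliers α, and then predicts with an RBF kernel. It is not the authors' implementation: the function names, the kernel form exp(-||x - x_k||^2 / σ^2), the cost-function convention (which gives the I/γ term), the synthetic data, and the values of γ and σ^2 are all assumptions chosen for the example.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """RBF kernel matrix with entries exp(-||a - b||^2 / sigma2) (assumed form)."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / sigma2)

def lssvm_fit(X, y, gamma, sigma2):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b, alpha]^T = [0, y]^T."""
    n = len(y)
    system = np.zeros((n + 1, n + 1))
    system[0, 1:] = 1.0   # constraint: sum of alpha equals zero
    system[1:, 0] = 1.0   # intercept column
    system[1:, 1:] = rbf_kernel(X, X, sigma2) + np.eye(n) / gamma
    solution = np.linalg.solve(system, np.concatenate(([0.0], y)))
    return solution[0], solution[1:]   # b, alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma2):
    """Nonlinear prediction: sum_k alpha_k K(x, x_k) + b."""
    return rbf_kernel(X_new, X_train, sigma2) @ alpha + b

# Synthetic one-dimensional data; gamma and sigma2 are illustrative only.
X = np.linspace(0.0, 3.0, 30).reshape(-1, 1)
y = np.sin(X[:, 0])
b, alpha = lssvm_fit(X, y, gamma=1000.0, sigma2=1.0)
y_hat = lssvm_predict(X, b, alpha, X, sigma2=1.0)
print("max training error:", np.max(np.abs(y_hat - y)))
```

Because training reduces to one linear solve, the only quantities left to tune are γ and σ^2, which is why an external optimizer is normally wrapped around a routine of this kind.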

Ferreira [

(u × v) + (f ÷ l)    (11)

where u, v, f and l are the input variables used to estimate the target variable (water saturation), and ÷, × and + are the functions (operators) that combine them.
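In GEP, an expression of this kind is stored as a linear string and read level by level into an expression tree (Karva notation in Ferreira's terminology). The sketch below decodes and evaluates one such string. The string "+*/uvfl", the function name `eval_karva` and the numeric inputs are assumptions made for illustration, and only binary operators are handled.

```python
import operator

# Binary functions available to the decoder (a small, assumed function set).
FUNCS = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "/": operator.truediv}

def eval_karva(expr, values):
    """Decode a Karva string breadth-first and evaluate the resulting tree."""
    symbols = list(expr)
    children = [[] for _ in symbols]
    next_free = 1  # index of the next symbol not yet attached to a parent
    for i, sym in enumerate(symbols):
        if i >= next_free:
            break  # remaining symbols are not part of the tree
        if sym in FUNCS:
            children[i] = [next_free, next_free + 1]
            next_free += 2

    def evaluate(i):
        sym = symbols[i]
        if sym in FUNCS:
            left, right = (evaluate(c) for c in children[i])
            return FUNCS[sym](left, right)
        return values[sym]  # terminal: look up the input variable

    return evaluate(0)

# "+*/uvfl" decodes to (u * v) + (f / l)
result = eval_karva("+*/uvfl", {"u": 2.0, "v": 3.0, "f": 8.0, "l": 4.0})
print(result)  # 2*3 + 8/4 = 8.0
```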

To develop predictive models for water saturation with the two modelling strategies, viz. the LSSVM and GEP algorithms, the same input variables, including PHIE, V_{sh}, GR, R_{t}, PHIN, PHIDE, PHIDNE, have been considered. The database was randomly divided into two sub-sets. The first sub-set, called the “Training” set, is used to develop the models, and the second, the “Test” set, is used to check their prediction performance. Around 80% of the data is assigned to the training set, and the rest is allocated to the test set. In this study, two important statistical error parameters have been used in a comprehensive error analysis to quantify the accuracy and performance of the developed models for water saturation prediction. These are the squared correlation coefficient (R-squared) and the average absolute relative deviation (AARD), as follows:
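The split and the two error parameters can be sketched in Python as below. The definitions used here are the commonly used ones and are an assumption: R-squared is taken as one minus the ratio of the residual to the total sum of squares, and AARD as the mean absolute relative deviation in percent. The function names and the seed are hypothetical.

```python
import numpy as np

def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Randomly split sample indices into training and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]

def aard_percent(pred, actual):
    """Average absolute relative deviation, in percent (assumed definition)."""
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return 100.0 * np.mean(np.abs(pred - actual) / np.abs(actual))

def r_squared(pred, actual):
    """Coefficient of determination (assumed definition of R-squared)."""
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    ss_res = np.sum((actual - pred) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

train_idx, test_idx = split_indices(100)
print(len(train_idx), len(test_idx))
print(aard_percent([0.9, 0.55], [1.0, 0.5]))
```

The relative deviation is well defined for this data bank because the smallest tabulated S_{w} is 0.095, so no actual value is zero.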

In the first stage, the LSSVM algorithm was coupled with an optimization strategy known as coupled simulated annealing (CSA) [ to tune the two model parameters (γ and σ^{2}). The values tuned by the CSA technique for the LSSVM model of water saturation are σ^{2} = 1.4181 and γ = 328.2432. To propose a new empirically derived equation based on the GEP algorithm, three genes with 30 chromosomes are applied as a starting condition. Additionally, the AARD is used as the accuracy function, so that the optimal form of the newly developed model has the lowest AARD. Furthermore, a function set including power, cube root, ×, ÷, - and + is selected when applying the GEP methodology. The final model obtained by the GEP algorithm in this study is a simple-to-use equation with as few coefficients as possible, as follows:

where S_{w} denotes the water saturation, PHIDNE is the effective porosity from the density and neutron logs, PHIDE is the effective porosity from the density log, PHIN is the porosity from the neutron log, PHIE is the effective porosity, V_{sh} is the volume of shale from gamma ray, and R_{t} is the true formation resistivity.

As a result, the equations above are best applied within the ranges of the petrophysical properties summarized in the table of input and output parameters above. V_{sh}, PHIDE, PHIDNE and PHIE have a minimum value of zero and therefore should not appear individually in a denominator; this condition was taken into account when developing the equations presented above. Although the proposed equations are also applicable to calculating the water saturation of conventional reservoirs, they were developed on the basis of data from shale and tight reservoirs.
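One simple way to respect this range of applicability is to check each sample against the minimum and maximum values tabulated earlier for the six inputs before applying either model. The sketch below does this; the dictionary keys and the function name `out_of_range` are hypothetical, while the numeric limits are the tabulated ones.

```python
# Minimum and maximum of each input, taken from the table of parameter ranges.
TRAINING_RANGES = {
    "Rt": (6.463, 323.999),   # true formation resistivity, ohm-m
    "PHIN": (0.037, 0.57),
    "Vsh": (0.0, 1.0),
    "PHIDE": (0.0, 0.5),
    "PHIDNE": (0.0, 0.45),
    "PHIE": (0.0, 0.297),
}

def out_of_range(sample):
    """Return the names of inputs that fall outside the tabulated ranges."""
    return [name for name, (low, high) in TRAINING_RANGES.items()
            if not low <= sample[name] <= high]

sample = {"Rt": 42.3458, "PHIN": 0.10567, "Vsh": 0.55381,
          "PHIDE": 0.03636, "PHIDNE": 0.0, "PHIE": 0.0}
print(out_of_range(sample))                  # inside every range
print(out_of_range({**sample, "Rt": 500.0}))  # resistivity above the maximum
```

A non-empty result means the model would be extrapolating for that sample, so its prediction should be treated with caution.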

The error parameters calculated for the LSSVM model are AARD = 3% and R-squared = 0.96, and those for the new method (Equation (14)) are AARD = 10.6% and R-squared = 0.77.

R_{t}/(Ω∙m) | PHIN/1 | V_{sh}/1 | PHIDE/1 | PHIDNE/1 | PHIE/1 | S_{w}/1 | LSSVM S_{w}/1 | LSSVM ARD/% | New Method S_{w}/1 | New Method ARD/% |
---|---|---|---|---|---|---|---|---|---|---|
11.2522 | 0.27511 | 1 | 0 | 0 | 0 | 1 | 1.00 | 0.27 | 1.00 | 0.39 |
10.2244 | 0.30458 | 1 | 0 | 0 | 0 | 1 | 1.00 | 0.00 | 1.00 | 0.47 |
10.0831 | 0.30795 | 1 | 0 | 0 | 0 | 1 | 1.00 | 0.16 | 1.00 | 0.43 |
12.6846 | 0.24391 | 1 | 0 | 0 | 0 | 1 | 1.00 | 0.00 | 1.00 | 0.38 |
13.7581 | 0.21807 | 1 | 0 | 0 | 0 | 1 | 1.00 | 0.13 | 1.00 | 0.07 |
16.0662 | 0.18945 | 1 | 0 | 0 | 0 | 1 | 1.00 | 0.46 | 1.00 | 0.14 |
20.0791 | 0.16955 | 1 | 0 | 0 | 0 | 1 | 1.00 | 0.34 | 0.98 | 1.68 |
42.3458 | 0.10567 | 0.55381 | 0.03636 | 0 | 0 | 0.78798 | 0.80 | 0.96 | 0.80 | 0.89 |
39.7585 | 0.11076 | 0.56332 | 0.04821 | 0.00595 | 0.00595 | 0.73624 | 0.72 | 2.77 | 0.73 | 0.78 |
37.3472 | 0.11451 | 0.51135 | 0.06638 | 0.02423 | 0.02423 | 0.6802 | 0.70 | 2.30 | 0.66 | 3.42 |
28.6273 | 0.12697 | 0.65267 | 0.09144 | 0.02602 | 0.02602 | 0.64359 | 0.63 | 1.76 | 0.65 | 1.17 |
28.5779 | 0.1255 | 0.66401 | 0.08862 | 0.02236 | 0.02236 | 0.65461 | 0.64 | 2.31 | 0.67 | 2.12 |
29.0527 | 0.12163 | 0.60497 | 0.1013 | 0.03547 | 0.03547 | 0.62585 | 0.63 | 0.59 | 0.60 | 3.49 |
29.8872 | 0.11417 | 0.59226 | 0.09137 | 0.02807 | 0.02807 | 0.66372 | 0.67 | 0.80 | 0.64 | 3.65 |
29.5548 | 0.11734 | 0.61305 | 0.08331 | 0.02222 | 0.02222 | 0.68381 | 0.66 | 2.79 | 0.67 | 1.86 |
29.2953 | 0.12034 | 0.58618 | 0.08827 | 0.02971 | 0.02971 | 0.66611 | 0.66 | 1.07 | 0.64 | 3.88 |
29.8534 | 0.11828 | 0.64503 | 0.13148 | 0.04734 | 0.04734 | 0.54482 | 0.51 | 7.00 | 0.51 | 5.66 |
30.621 | 0.12051 | 0.73746 | 0.12583 | 0.03331 | 0.03331 | 0.54123 | 0.57 | 5.38 | 0.57 | 5.12 |
32.8421 | 0.11708 | 0.67132 | 0.12041 | 0.03672 | 0.03672 | 0.54446 | 0.54 | 1.25 | 0.56 | 2.02 |
36.0467 | 0.10965 | 0.55381 | 0.10376 | 0.03815 | 0.03815 | 0.58221 | 0.62 | 6.05 | 0.57 | 2.73 |
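As a small worked example, the two ARD columns of the table above can be averaged to give a mean ARD for each method over these rows. The sketch below does only that arithmetic; the variable names are hypothetical. Because the rows are a small sample of the data bank, the resulting means are not expected to match the overall AARD values reported for the full data set.

```python
# ARD values (in percent) copied from the table above, row by row.
lssvm_ard = [0.27, 0.00, 0.16, 0.00, 0.13, 0.46, 0.34, 0.96, 2.77, 2.30,
             1.76, 2.31, 0.59, 0.80, 2.79, 1.07, 7.00, 5.38, 1.25, 6.05]
new_method_ard = [0.39, 0.47, 0.43, 0.38, 0.07, 0.14, 1.68, 0.89, 0.78, 3.42,
                  1.17, 2.12, 3.49, 3.65, 1.86, 3.88, 5.66, 5.12, 2.02, 2.73]

# Mean of the absolute relative deviations over the tabulated rows.
mean_lssvm = sum(lssvm_ard) / len(lssvm_ard)
mean_new_method = sum(new_method_ard) / len(new_method_ard)

print(f"LSSVM mean ARD over sample rows:      {mean_lssvm:.2f} %")
print(f"New method mean ARD over sample rows: {mean_new_method:.2f} %")
```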

R_{t} range of 0 - 100. From the results obtained, it can be concluded that the methods proposed in this study (LSSVM and GEP) are reliable alternatives to the previously published models, which may fail to cover a wide range of petrophysical properties and also require time-consuming calculations, graph reading, optimization of coefficients, etc. Accuracy and future applicability are therefore the two main advantages of the models developed in this study. The LSSVM model predicts the water saturation data with higher accuracy than the proposed new equation. On the other hand, the GEP-based equation is simpler to use, so it can readily be used in future calculations and in software related to water saturation and reservoir characterization. Therefore, a combined application of both the LSSVM and GEP algorithms is recommended in order to accurately predict the water saturation of shale and tight reservoirs.

The current study aimed to propose reliable models for predicting the water saturation of shale and tight gas reservoirs. The modelling approaches implemented were gene expression programming and the least squares support vector machine. The results indicate that the two methods developed in this study can be applied to characterize the water saturation of shale reservoirs. R-squared values of 0.96 and 0.77 (average absolute relative deviation (AARD) of 3% and 10.6%) were obtained for the LSSVM model and the newly proposed equation, respectively. As a result, the method proposed using gene expression programming in this study is a capable alternative to the previously published models, which require complex and time-consuming calculations.

Kamari, A. and Sheng, J.J. (2018) New Methods to Calculate Water Saturation in Shale and Tight Gas Reservoirs. Open Journal of Yangtze Gas and Oil, 3, 220-230. https://doi.org/10.4236/ojogas.2018.33019