Predictive Modeling of Gas Production, Utilization and Flaring in Nigeria using TSRM and TSNN: A Comparative Approach

Since the discovery of oil and gas in Nigeria in 1956, much gas has been flared because operators have paid little or no attention to its utilization, and as a result trillions of dollars have been lost. In this paper, the Time Series Regression Model (TSRM) and the Time Series Neural Network (TSNN) are used to model the production, utilization and flaring of natural gas in Nigeria, with the ultimate aim of observing the trend of each activity. The results show that TSNN has better predictive and forecasting capabilities than TSRM. It is also observed that the larger the number of hidden neurons, the lower the error generated by the TSNN.


Introduction
Natural gas was first discovered in Nigeria in 1956, at Afam, Rivers State, in association with oil during the drilling of the Oloibiri well (now in Bayelsa State), which was the first commercial oil discovery in the country. Nigeria's natural gas development is still in its infancy, but it has very high potential for growth. Several sources report that Nigeria is more endowed with natural gas reserves than with oil. Nigeria has long been considered an oil-rich nation in Africa, as shown in Figure A1; nevertheless, the country is currently Africa's largest natural gas holder, with a proven reserve of 186.99 tcf, and is ranked 7th as shown in Figure A2. It has been described as a gas province with oil pockets. Unfortunately, Shell BP, the multinational oil company operating at the time of discovery, paid little or no attention to the utilization of this resource, since it was not the primary drilling objective. At this early stage of development, there was no legislation governing the utilization of natural gas in the country and, at the same time, there was little or no market for the commodity. A recent study conducted by the World Bank revealed that developing countries account for more than 85% of gas flaring and venting worldwide, with Nigeria being the largest contributor [1]. The first utilization of gas can be traced to 1963, when Shell BP sold and supplied gas obtained from fields in Aba and Ughelli to industries around those areas. Later, the company started supplying the commodity to the then Electricity Corporation of Nigeria (ECN), later named National Electricity Power Authority (NEPA) and now known as Power Holding Corporation of Nigeria (PHCN), at its Afam plant in Rivers State. The commodity was also supplied to the River State Utility Board in Port Harcourt, capital of Rivers State.
Gas flaring refers to the burning of natural gas that is associated with crude oil when it is pumped up from the ground. It is a means of disposal used either because there is no market for the gas or because the operator does not elect to use (or cannot use) the gas for a non-wasteful purpose. Venting, on the other hand, is the release of natural gas that cannot be processed for sale or use for technical or economic reasons. Gas flaring in Nigeria dates back to the onset of oil production in Oloibiri in 1958, with the flaring of about 4.5 million scf/day of associated gas. An average of 1000 scf of associated gas is produced for every barrel of oil produced. The amount of associated gas flared increased in proportion to the volume of oil produced and rose progressively to about 2.6 billion scf/day in 1996, when crude oil production averaged 2.4 million barrels per day. The volume of associated gas flared decreased only slightly, to 2.3 billion scf/day, in subsequent years, despite various regulations and measures put in place to discourage the flaring of gas associated with oil production.
Natural gas flares cause various degrees of pollution, such as variations in the chemical, meteorological and biological parameters of the air and atmosphere, as well as in soil conditions in the immediate environment of the flare. Local farmers have complained about retarded growth and reduced productivity of farm crops around gas flares, as well as scarcity of animals in the gas flare environment.
The problem is that, in Nigeria, not much has been done to predict and forecast the production, utilization and flaring of natural gas. In this paper, we seek to address this problem using a combination of two statistical models, Time Series Regression and Artificial Neural Network.

Gases Associated with Gas Flaring
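When natural gas is flared, a combustion reaction takes place in the general form stated below [2]:

$$\mathrm{C}_x\mathrm{H}_y + \left(x + \tfrac{y}{4}\right)\mathrm{O_2} \rightarrow x\,\mathrm{CO_2} + \tfrac{y}{2}\,\mathrm{H_2O}$$

Presented below, as an example, is the combustion reaction of propane:

$$\mathrm{C_3H_8} + 5\,\mathrm{O_2} \rightarrow 3\,\mathrm{CO_2} + 4\,\mathrm{H_2O}$$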
During a combustion reaction, several intermediate products are formed, and eventually most are converted to CO$_2$ and water. Some quantities of stable intermediate products such as carbon monoxide, hydrogen and hydrocarbons will escape as emissions.
For a complete reaction, carbon (IV) oxide and water vapour are formed. However, when the reaction is incomplete, carbon (II) oxide is formed alongside carbon (IV) oxide and water vapour.
Depending on the location, impurities such as sulphur, nitrogen and hydrogen sulphide are also found with natural gas. These gases undergo a combustion reaction to form acid gases such as oxides of nitrogen (NO$_x$), oxides of sulphur (SO$_x$) and hydrogen sulphide (H$_2$S).
Complete combustion requires sufficient combustion air and proper mixing of air and gas. Smoke may result from combustion, depending upon the gas components and the quantity and distribution of combustion air. Gases containing methane, hydrogen, CO and ammonia usually burn without smoke. Gases containing heavy hydrocarbons, such as paraffins above methane, olefins and aromatics, cause smoke.

Application of Artificial Neural Network in Petroleum Engineering
Artificial Neural Networks (ANNs) have been used to address some of the fundamental problems in petroleum engineering that conventional predictive models have been unable to solve, especially when engineering data for design, interpretation and calculation have been less than adequate. Also, with recent advances in pattern recognition, classification of noisy data, nonlinear feature detection, market forecasting, recognition of disease in human blood in medicine, and process modeling, ANN technology is very well suited to solving problems in the petroleum industry. Several authors have developed ANN models for this purpose. Juniardi and Irashagi [3] developed an ANN model to predict the permeability and skin factor of a faulted reservoir. Arehart [4] developed a 3-layer back propagation neural network model to determine the grade of a drill bit while it is drilling. Ashenayi et al. [5] used a hybrid 3-layer back propagation neural network model to identify beam pump malfunctioning from down hole pump cards. Erahaghi et al. [6] used multiple ANNs to train and recognize patterns ($C_D$, $P_D$, $t_D$, $S$, $d_D$, …) for a specific conceptual reservoir model. Kumoluyi [7] discussed the general application of neural networks and their potential uses in some areas of petroleum engineering. That work found that one advantage of feed forward networks in pattern recognition is their ability to recognize patterns regardless of position, rotation and scaling. The application of pattern recognition is essential in well log interpretation of multiphase flow and in seismic data processing. Mohagheh et al. [8] used a 3-layered forward back propagation neural network model to estimate the heterogeneity of some reservoirs. Briones et al. [9] developed a 3-layer radial basis function neural network (RBFNN) model to relate gas-oil ratio (GOR) and API gravity to the corresponding molar composition ($C_1$, $C_2$, $C_3$, $C_4$, $C_5$, $C_6$, $C_7$ and CO$_2$). McVay et al. [10] used a feed forward back propagation neural network model trained on the actual refracture treatment design, basic well information and well performance in order to determine the sand volume, fluid type, injection rate and acid volume as the major factors that influence well deliverability during hydraulic fracturing. Manmath et al. [11] used an ANN model to predict fluid distribution, taking oil, water and gas production as input data. Wong et al. [12] developed a back propagation neural network (BPNN) model to estimate formation permeability in the RAVVA oil and gas field offshore India. Garrmouch and Smaoul [13] developed a 3-layered back propagation neural network model to estimate the formation permeability of a tight gas reservoir. Soto et al. [14] used a neural network model to predict the permeability and porosity of zone C of the Cantagallo field in Colombia. Shelley et al. [15] developed two separate neural network models for well completion analysis and optimization, to identify the factors that affect production and measure their contributions to the production result. Nikravesh et al. [16] developed several neural network models for water flood management in a fractured reservoir, to predict the wellhead pressure and future production on a quarterly basis.
Application of neural networks in time series forecasting [17]-[20] is based on the ability of neural networks to approximate nonlinear functions. The most popular treatment of input data is to feed the neural network with either the data at each observation or the data from several successive observations. Denoting the data at instant $k$ as $y(k)$, where $y$ may be a vector, this treatment can be described as

$$\hat{y}(k+1) = \mathrm{NN}\big(y(k), y(k-1), \ldots, y(k-l+1)\big),$$

where NN() stands for the neural network forecaster and $l$ is the number of successive observations. This treatment considers the time series as a nonlinear time series and tends to generate a nonlinear "autoregression" model to fit the series. So far, there have been few papers describing how to choose inputs for the neural network forecaster in order to achieve better forecasting performance. It is our belief that the performance of a neural network forecaster is strongly affected by the input data patterns.
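The input treatment described above can be illustrated with a short sketch. The function name `make_lagged_inputs` and the toy series are hypothetical, and the ordering of the lags within each row (most recent first) is an assumption made for illustration, not something specified in the text.

```python
import numpy as np

def make_lagged_inputs(series, n_lags):
    """Build (inputs, targets) pairs for a one-step-ahead forecaster.

    Row j of `inputs` holds n_lags successive observations, most recent
    first; targets[j] is the observation that immediately follows them.
    """
    y = np.asarray(series, dtype=float)
    if n_lags < 1 or n_lags >= len(y):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    rows = [y[k - n_lags:k][::-1] for k in range(n_lags, len(y))]
    return np.array(rows), y[n_lags:]

# Toy series (not the gas data): each target is predicted from 2 lags.
inputs, targets = make_lagged_inputs([1.0, 2.0, 3.0, 4.0, 5.0], n_lags=2)
```

Each row of `inputs` plays the role of the argument list of NN(), and the matching entry of `targets` is the value the forecaster is trained to reproduce.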
Autocorrelation analysis has often been used in time series forecasting with statistical approaches such as ARMA models. It is mainly used to detect autocorrelations between successive observations of a time series, and it underlies the well-known ARIMA models with Box-Jenkins methods, which are very efficient in forecasting linear time series [21].
Autocorrelation analysis can be used to determine the correct input patterns for nonlinear time series forecasting with a neural network. The scheme contains three phases: detection of input patterns, determination of the number of neurons in the hidden layer(s), and construction of the neural network forecaster. In the detection phase, autocorrelation analysis is used to identify input patterns of the time series for training. The number of neurons in the hidden layer(s) is determined with the Baum-Haussler rules [22]. The neural network forecaster is then constructed with the determined input patterns and the number of neurons in the hidden layer(s).
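As a rough illustration of the detection phase, the sketch below computes sample autocorrelations and keeps the lags that fall outside the usual approximate 95% white-noise band of ±1.96/√n. The function names and the use of this particular band as the selection rule are assumptions made for illustration; the text does not state which threshold was applied, and the hidden-layer sizing rule [22] is not reproduced here.

```python
import numpy as np

def sample_acf(series, max_lag):
    """Sample autocorrelations r_0..r_max_lag of a one-dimensional series."""
    y = np.asarray(series, dtype=float)
    if not 0 <= max_lag < len(y):
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(series)")
    d = y - y.mean()
    denom = np.sum(d * d)
    if denom == 0.0:
        raise ValueError("series is constant; autocorrelation is undefined")
    acf = [1.0]
    for k in range(1, max_lag + 1):
        acf.append(np.sum(d[k:] * d[:-k]) / denom)
    return np.array(acf)

def candidate_input_lags(series, max_lag):
    """Lags whose autocorrelation exceeds the approximate 95% band."""
    acf = sample_acf(series, max_lag)
    bound = 1.96 / np.sqrt(len(series))
    return [k for k in range(1, max_lag + 1) if abs(acf[k]) > bound]

# Toy trending series (not the gas data): all three lags are retained.
trend = np.arange(20, dtype=float)
lags = candidate_input_lags(trend, max_lag=3)
```

The retained lags would then define which past observations are fed to the forecaster.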

Time Series Regression Model (TSRM)
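We recall the linear regression model (LRM), given as

$$y_i = X_i \beta + e_i, \quad i = 1, 2, \ldots, n, \tag{1}$$

which is made up of the predicted part and the residual part. The residual is the difference between the observed and the predicted values, which is ascribed to unknown sources. Here $n$ is the number of observations, $y_i$ is the $i$th observation, $X_i = (x_{i1}, x_{i2}, \ldots, x_{ip})$ is the predictor variable vector related to $y_i$, $\beta = (\beta_1, \beta_2, \ldots, \beta_p)'$ is the parameter vector, and $e_i$ is the error associated with the $i$th observation.

Writing (1) in time series notation, we have

$$y_t = X_t \beta + e_t. \tag{2}$$

Explicitly, this is written as

$$y_t = \alpha + \beta x_t + e_t; \quad t = 1, 2, \ldots, n, \tag{3}$$

where $y_t$ is the dependent variable, $x_t$ is the independent variable (in this case, the "years"), $\alpha$ is the intercept, $\beta$ is the parameter associated with the independent variable $x_t$, and $e_t$ is the stochastic term or error associated with the model.

We minimize the error sum of squares of (3) with respect to $\alpha$ and $\beta$,

$$S(\alpha, \beta) = \sum_{t=1}^{n} e_t^2 = \sum_{t=1}^{n} (y_t - \alpha - \beta x_t)^2,$$

to obtain two normal equations. Solving the normal equations, we obtain the estimates of the parameters $\alpha$ and $\beta$:

$$\hat{\beta} = \frac{\sum_{t=1}^{n} (x_t - \bar{x})(y_t - \bar{y})}{\sum_{t=1}^{n} (x_t - \bar{x})^2}, \qquad \hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}.$$

The predicted model becomes

$$\hat{y}_t = \hat{\alpha} + \hat{\beta} x_t$$

and the residual is given as

$$\hat{e}_t = y_t - \hat{y}_t.$$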
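The least squares fit of a straight-line trend can be sketched as follows. This is a minimal illustration using the standard closed-form estimates for an intercept and a slope; the function name `fit_tsrm` and the synthetic numbers are hypothetical, and the authors' own TSRM results were produced in SPSS, not with this code.

```python
import numpy as np

def fit_tsrm(x, y):
    """Ordinary least squares fit of y_t = alpha + beta * x_t + e_t.

    Returns (alpha_hat, beta_hat, fitted values, residuals).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    beta = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    alpha = y_bar - beta * x_bar
    fitted = alpha + beta * x
    return alpha, beta, fitted, y - fitted

# Synthetic, noise-free illustration (not the Nigerian gas data).
years = np.arange(1958, 1963)
volumes = 3.0 + 2.0 * (years - 1958)
alpha, beta, fitted, residuals = fit_tsrm(years - 1958, volumes)
```

Here the time index is measured in years since 1958, so the intercept is the fitted value in the first year of the sample.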

The Time Series Neural Network (TSNN) Model
The statistical neural network (SNN) model is structurally composed of two parts, the predictive and the residual, as in classical regression, given as

$$y = f(X, w) + e_i, \tag{6}$$

where

$$f(X, w) = \alpha X + \sum_{h=1}^{H} \beta_h\, g\!\left(\sum_{i=0}^{I} \gamma_{hi} x_i\right).$$

Thus Equation (6) can be written as

$$y = \alpha X + \sum_{h=1}^{H} \beta_h\, g\!\left(\sum_{i=0}^{I} \gamma_{hi} x_i\right) + e_i,$$

where $X = (x_0 \equiv 1, x_1, \ldots, x_I)$ is the vector of the input variable, $g(\cdot)$ is the transfer (or activation) function and $w = (\alpha, \beta, \gamma)$ are the weights (or parameters) associated with the input vector, hidden neuron and the transfer function respectively, while $e_i$ is the error associated with the network. We note that when there is no hidden neuron, the SNN reduces to the ordinary regression model.
We propose a simple time series neural network model,

$$y_t = \alpha X_t + \sum_{h=1}^{H} \beta_h\, g\!\left(\sum_{i=0}^{I} \gamma_{hi} x_{ti}\right) + e_t.$$

The terms and symbols are as explained in the SNN model, except that $t$ refers to "time" or "period". The weights are estimated using Taylor's first order approximation of the network function, from which the least squares estimate $\hat{\theta}$ of the parameter $\theta$ (the vector of weights) is obtained. The estimated model is

$$\hat{y}_t = f(X_t, \hat{\theta}),$$

while the network error is given as

$$\hat{e}_t = y_t - \hat{y}_t.$$

In this paper, we used the symmetric saturated linear transfer function,

$$g(n) = \begin{cases} -1, & n < -1, \\ n, & -1 \le n \le 1, \\ 1, & n > 1. \end{cases}$$

The data used in this study are annual data on natural gas production, utilization and flaring in Nigeria from 1958 to 2006, with the 1995-2006 figures taken from NAPIMS. This makes a total of 49 observations. In each case (production, utilization and flaring), the TSNN model formulations are 1-2-1, 1-5-1 and 1-10-1, that is, one input, 2, 5 or 10 hidden neurons, and one output. All input variables were standardized, that is, converted to the range (0, 1), before being fed into the network. This avoids the application of extremely small weighting factors in the case of large input values.
Similarly, the output values are "destandardized" to provide meaningful results since all values leaving the network are automatically output in a standardized format.This is done by simply reversing the standardization algorithm used on the input nodes.
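The standardization and its reversal can be sketched as below. The text does not give the exact formula, so min-max scaling is assumed here; it maps the smallest value to 0 and the largest to 1. The function names are hypothetical.

```python
import numpy as np

def standardize(values):
    """Min-max scale values to [0, 1]; also return the bounds used."""
    v = np.asarray(values, dtype=float)
    lo, hi = v.min(), v.max()
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    return (v - lo) / (hi - lo), (lo, hi)

def destandardize(scaled, bounds):
    """Reverse standardize() using the stored (min, max) bounds."""
    lo, hi = bounds
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo

# Toy values (not the gas data).
scaled, bounds = standardize([10.0, 20.0, 40.0])
restored = destandardize(scaled, bounds)
```

Keeping the bounds from the training data is what allows network outputs to be mapped back to the original units.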
We used SPSS for the TSRM part of the analysis, while the TSNN analysis was carried out with custom neural network code written in MATLAB R2009a.
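For readers without MATLAB, the forward pass of a 1-H-1 network of this kind can be sketched in Python. This is not the authors' code. It assumes that the network output is a linear term in the input plus a weighted sum of hidden-neuron outputs, which is consistent with the remark that the model reduces to ordinary regression when there is no hidden neuron. The names `satlins` (borrowed from the MATLAB function of the same name), `tsnn_predict` and the weight values are illustrative only; no weight estimation is shown.

```python
import numpy as np

def satlins(n):
    """Symmetric saturating linear transfer function: clip to [-1, 1]."""
    return np.clip(n, -1.0, 1.0)

def tsnn_predict(x, alpha, beta, gamma):
    """Forward pass of a 1-H-1 network with a direct linear term.

    x     : scaled inputs, shape (T,)
    alpha : (intercept, slope) of the linear part, shape (2,)
    beta  : hidden-to-output weights, shape (H,)
    gamma : (bias, input weight) of each hidden neuron, shape (H, 2)
    """
    x = np.asarray(x, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    hidden_in = gamma[:, 0] + np.outer(x, gamma[:, 1])  # shape (T, H)
    return alpha[0] + alpha[1] * x + satlins(hidden_in) @ beta

x = np.array([0.0, 0.5, 1.0])
# With no hidden neuron the model is the straight line 3 + 2x.
linear_only = tsnn_predict(x, (3.0, 2.0), np.zeros(0), np.zeros((0, 2)))
# One hidden neuron with illustrative weights.
with_hidden = tsnn_predict(x, (3.0, 2.0), [1.0], [[0.0, 2.0]])
```

The 1-2-1, 1-5-1 and 1-10-1 formulations correspond to `beta` and `gamma` having 2, 5 or 10 rows.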

Model Selection Criteria
Here we discuss the criteria that have been used to choose between the two models: (i) $R^2$; (ii) adjusted $R^2$ ($\bar{R}^2$); (iii) Akaike information criterion (AIC); and (iv) Schwarz information criterion (SIC). All these criteria aim at minimizing the residual sum of squares (SSE). However, unlike the first criterion, criteria (ii), (iii) and (iv) impose a penalty for including an increasingly large number of predictors. Thus there is a tradeoff between the goodness of fit of the model and its complexity (as judged by the number of predictors).
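The four criteria can be computed from the observed and fitted values as sketched below. The paper does not state which versions of AIC and SIC were used; the logarithmic forms ln AIC = 2k/n + ln(SSE/n) and ln SIC = (k/n) ln n + ln(SSE/n), with k the number of estimated parameters, are assumed here. The function name and the toy numbers are hypothetical.

```python
import numpy as np

def selection_criteria(y, fitted, n_params):
    """R^2, adjusted R^2, ln AIC and ln SIC for a fitted model.

    n_params counts every estimated parameter, including the intercept.
    Assumes SSE > 0 so that the logarithm is defined.
    """
    y = np.asarray(y, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    n, k = len(y), n_params
    sse = np.sum((y - fitted) ** 2)      # residual sum of squares
    sst = np.sum((y - y.mean()) ** 2)    # total sum of squares
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k)
    aic = 2.0 * k / n + np.log(sse / n)
    sic = (k / n) * np.log(n) + np.log(sse / n)
    return {"r2": r2, "adj_r2": adj_r2, "aic": aic, "sic": sic, "mse": sse / n}

# Toy values (not the gas data), two estimated parameters.
crit = selection_criteria([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.1, 3.9], n_params=2)
```

Higher $R^2$ and $\bar{R}^2$ and lower AIC, SIC and MSE favour a model; only the last four depend on the parameter count or the error scale in a way that allows models of different size to be compared.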

Results and Discussions
Figure 1 is a time plot of natural gas production, utilization and flaring in the Nigerian oil and gas industry. The plot shows that gas production and utilization rose steadily from the base year to the end of the period. Flared gas also trended upward, but with oscillations: it rose and fell at times before resuming the upward trend. During the first ten years of production, no gas was utilized, while produced and flared gas increased geometrically. This period was followed by a steady increase in the amount of gas utilized, while a gradual decline was observed in produced and flared gas for about ten years. The plot flattens during the last ten years, showing that the volume flared remained roughly constant while the volume utilized rose sharply along with production. The spikes at points 16, 23 and 38, corresponding to 1974, 1981 and 1996, represent the highest volumes of gas flared. The amount of gas flared was higher than the amount utilized until 2004.
Figures A3-A5 show the prediction of natural gas in Nigeria. The graphs show that TSNN predicts better than TSRM, with correspondingly lower errors.

Time Plot of the Stationarized Variables
The correlogram of the data shows that the variables are non-stationary, since the P-values at their respective lags are zero and the autocorrelation values are large (Figure 2). This made it necessary to carry out a unit root test on each variable. The Augmented Dickey-Fuller (ADF) unit root test with trend and intercept confirms that the variables have a unit root in levels, since their respective P-values are greater than 5%. At first difference, the three variables appear to be stationary: their time plots seem to have constant means, none of the P-values in their correlograms is zero, and the autocorrelation values are smaller. Figure 2 shows the time plot of the stationarized variables. Furthermore, the ADF result below shows that the differenced variables can now be used for time series modeling, since their respective P-values are less than 5%.
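The effect of first differencing on autocorrelation can be illustrated with a small simulation. The series below is a synthetic random walk with drift, not the gas data, and the sketch only compares lag-1 autocorrelations before and after differencing; it does not perform the ADF test itself. The function name and the seed are arbitrary.

```python
import numpy as np

def lag1_autocorrelation(series):
    """Lag-1 sample autocorrelation of a one-dimensional series."""
    y = np.asarray(series, dtype=float)
    d = y - y.mean()
    return np.sum(d[1:] * d[:-1]) / np.sum(d * d)

# Synthetic trending series with 49 points, mimicking the sample length.
rng = np.random.default_rng(0)
levels = np.cumsum(rng.normal(loc=1.0, scale=1.0, size=49))

# First difference: y_t - y_{t-1}, which removes the stochastic trend.
differences = np.diff(levels)

r_levels = lag1_autocorrelation(levels)
r_diff = lag1_autocorrelation(differences)
```

For a series of this kind, `r_levels` is expected to be close to 1 while `r_diff` is expected to be close to 0, which is the pattern described for the correlograms before and after differencing.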
The table below shows the descriptive statistics of the variables. The differenced variables appear positively skewed in the sense that their respective mean values (436.0625, 1339.313 and 9903.2500) are larger than their median values (258.5000, 1058.000 and 116.0000), although flared and production have negative skewness coefficients. The three variables are leptokurtic, since their respective kurtosis values are greater than 3. The regression result below indicates a perfect fit, as the coefficient of determination equals 1. Furthermore, flared and utilized gas make a positive joint contribution to the production of gas, as the reported P-value is approximately zero (significant). The results of the analysis in Tables 1-3 show that the MSEs of the TSNN are far smaller than the MSE of the TSRM. Hence, from Table 4, all the TSNN models are preferred to the TSRM.

Regression Result
Table 5 and Table 6 summarize the results for model adequacy and selection. The percentages of $\bar{R}^2$ (as well as $R^2$) are higher in all models of the TSNN than in the TSRM. This confirms the better fit of the TSNN relative to the TSRM.

In the TSRM, minimizing the error sum of squares with respect to α and β gives two normal equations; solving the normal equations yields the estimates of the parameters α and β. The data cover 1958-2006 and were obtained from the Annual Abstract of Statistics of the Nigeria Bureau of Statistics (NBS), formerly Federal Office of Statistics (FOS) (1970-1990), the Ministry of Petroleum Resources (MPR) (1991-1994) and NAPIMS (1995-2006).

Figure 1. Time plot of Nigeria's natural gas.

Figure 2. Time plot of the stationarized variables.

Figure 3 is a combo chart that shows the relationship among produced, utilized and flared gas during the period of investigation. Tables 1-3 summarize the model adequacy results for the two models. The MSE compares the variation in the errors generated by the different models. The model with the smallest MSE is considered the better model.

Table 1. Estimated model adequacy for gas production.

Table 2. Estimated model adequacy for gas utilization.

Table 3. Estimated model adequacy for gas flared.

$R^2$ measures the fit of each of the models. Since more than one model is involved, $R^2$ alone is not adequate for comparison. Thus, for model fit, $\bar{R}^2$, AIC and SIC are also considered. In the case of $R^2$ and $\bar{R}^2$, the higher the value, the better the model. The AIC and SIC, like the MSE, judge a model to be better if its value is lower than that of the model it is compared with.

Table 4. Model selection based on MSE.

Table 5. (a) Model selection based on $R^2$ and $\bar{R}^2$; (b) Model selection based on AIC and SIC.