Modeling the Nigerian Bonny Light Crude Oil Price: The Power of Fuzzy Time Series

Desmond Chekwube Bartholomew; Ukamaka Cynthia Orumie; Chukwudi Paul Obite; Blessing Iheoma Duru; Felix Chikereuba Akanno

doi:10.4236/ojmsi.2021.94024

Open Journal of Modelling and Simulation > Vol.9 No.4, October 2021


Desmond Chekwube Bartholomew¹*, Ukamaka Cynthia Orumie², Chukwudi Paul Obite¹, Blessing Iheoma Duru¹, Felix Chikereuba Akanno¹
¹Department of Statistics, Federal University of Technology Owerri, Owerri, Nigeria.
²Department of Mathematics and Statistics, University of Port Harcourt, Port Harcourt, Nigeria.

Abstract

Several authors have used different classical statistical models to fit the Nigerian Bonny Light crude oil price, but the application of machine learning models and the Fuzzy Time Series model to the crude oil price has been grossly understudied. Therefore, in this study, a classical statistical model, the Autoregressive Integrated Moving Average (ARIMA); two machine learning models, the Artificial Neural Network (ANN) and Random Forest (RF); and the Fuzzy Time Series (FTS) model were compared in modeling the Nigerian Bonny Light crude oil price for the period from January 2006 to December 2020. The monthly secondary data were collected from the Nigerian National Petroleum Corporation (NNPC) and the Reuters website and divided into train (70%) and test (30%) sets. The train set was used to build the models, which were then validated using the test set. The performance measures used for the comparison were the modified Diebold-Mariano test, the Root Mean Square Error (RMSE), the Mean Absolute Percentage Error (MAPE) and the Nash-Sutcliffe Efficiency (NSE). Based on these measures, ANN (4, 1, 1) and RF performed better than the ARIMA (1, 1, 0) model, but the FTS model using Chen’s algorithm outperformed every other model. The results recommend the use of the FTS model for forecasting future values of the Nigerian Bonny Light crude oil price. However, a hybrid ARIMA-ANN or ARIMA-RF model should be built and compared with the Chen’s algorithm FTS model on the same data set to further verify the power of the FTS model using Chen’s algorithm.

Keywords

ARIMA, Artificial Neural Network, Chen’s Algorithm, Fuzzy Time Series, Random Forest

Share and Cite:

Bartholomew, D., Orumie, U., Obite, C., Duru, B. and Akanno, F. (2021) Modeling the Nigerian Bonny Light Crude Oil Price: The Power of Fuzzy Time Series. Open Journal of Modelling and Simulation, 9, 370-390. doi: 10.4236/ojmsi.2021.94024.

1. Introduction

Since 1907, several companies attempted to discover oil of commercial value but failed [1]. It was not until 1937 that British Petroleum and Shell were licensed and began searching for oil. According to [2], Shell-BP discovered crude oil at Oloibiri, in the Niger Delta region of Nigeria, in 1956, and the first commercial well was drilled in 1958. Bonny Light is a light-sweet crude oil produced in Nigeria. It is an important benchmark for all West African crude oil-producing countries because it yields good gasoline, which made it a popular crude for U.S. refiners. Other quality crude oils such as Odudu, Escravos, Forcados, and Bonnie are also extracted in Nigeria.

Prior to the discovery of crude oil, Nigeria relied strongly on agricultural exports such as palm produce, cocoa, cotton, timber, groundnut and rubber to sustain its economy [3] [4]. The agricultural sector contributed about 95% of Nigeria’s foreign exchange earnings, generated over 60% of its employment capacity and accounted for approximately 56% of gross domestic earnings [5]. Crude oil has since been the engine of the Nigerian economy for decades and has played a key role in its development and success. Nigeria is currently the top oil-producing country in Africa and depends heavily on the oil sector. In the year 2000, crude oil exports accounted for about 83% of the Federal Government’s revenue and about 98% of export earnings [4]. Crude oil also generated more than 14% of Nigeria’s Gross Domestic Product (GDP) and provided about 65% of government budgetary revenues and 95% of foreign exchange earnings. The United States Energy Information Administration (EIA) estimated Nigeria’s proven oil reserves at between 16 and 22 billion barrels in 1997 [6]. In 2010, Nigeria provided about 10% of overall United States (U.S.) oil imports and ranked as the fifth-largest source among all countries exporting oil to the U.S. In July 2014, however, the supply declined because shale production in America offered an alternative. Currently, the largest consumer of Nigerian oil is India [7].

Oil has been, and will remain, the world’s major commercial energy source [8]. Given the dependence of Nigeria’s economic development on crude oil and the recent plummet in oil prices around the world, there is a need to model oil prices critically in order to forecast future crude oil prices. This will help equip the country to adapt to the inevitable downtrend in the crude oil price caused by the emergence of hydraulic fracking. Hydraulic fracking is an environmentally friendly drilling technique that makes it possible to extract natural gas from shale.

Nigeria has become the eleventh largest oil-producing country in the world [9], and because Bonny Light is preferred over sour crudes, it has positively affected the Nigerian economy, with India as its largest buyer. Light oils generate high profits and are in high demand among refiners. Price direction, fluctuation and volatility have always been important to investors in the oil sector [10]. The COVID-19 pandemic, which reduced economic prosperity in Nigeria and the world at large, also affected the price of crude oil [11]. The price fell sharply during the pandemic as most countries struggled to contain its devastating effects within their borders. International and domestic flights were cancelled, and sea and land travel was restricted, which halted economic production within countries. The pandemic contracted Nigeria’s economy by 6.1% in the second quarter of 2020, its worst decline in the last 10 years [11]. Deniz [12], using panel data analysis for oil-importing and oil-exporting countries, noted that oil price volatility affects renewable energy consumption differently across countries. The relationship between energy consumption and oil price shocks has been discussed widely in the literature. Because the oil price changes and is exposed to both external and internal shocks, oil price shocks predict the wealth and energy consumption levels of countries, especially oil-dependent developing countries such as Nigeria. Odusami [13] posited that variation in oil prices predicts the consumption level (energy and wealth inclusive). Nigeria, although an oil-exporting country, has been negatively affected by oil price changes because it also imports refined products such as automotive gas oil, premium motor spirit and kerosene [14].

The price of crude oil has decreased significantly, and because Nigeria is a mono-product economy that relies heavily on the oil sector, this has had a negative impact on the Nigerian economy. This impact is indisputable, and many researchers, such as [15] [16], have used a variety of traditional statistical models to fit the Nigerian Bonny Light crude oil price, but the application of machine learning models and FTS to the crude oil price has been grossly understudied. This gap in knowledge about the ability of machine learning models to model the crude oil price necessitated this study. We therefore compare four models of three kinds: a traditional statistical model (ARIMA), a dynamic process with linguistic values as its observations (Fuzzy Time Series, FTS) and two machine learning models (Artificial Neural Network, ANN, and Random Forest, RF) in modeling the Nigerian Bonny Light crude oil price. The crude oil price will be estimated using the best-performing model.

Some researchers have studied the Nigerian Bonny Light crude oil price, and the relevant literature is as follows. Suleiman et al. [15] reported that the best ARIMA and GARCH models for forecasting the crude oil price in Nigeria are of order (3, 1, 1) and (2, 1) respectively. Wiri and Tuaneh [16] estimated 18 models and selected the best ARIMA model using the AIC; the ARIMA of order (1, 1, 1) was chosen because it had the least AIC value (4.578). Omekara et al. [17] proposed a multiplicative SARIMA (1, 1, 1) (0, 1, 1)$_{12}$ model. Aliyu [18] studied the demand function of gasoline, kerosene, diesel, fuel oil and liquefied petroleum gas using structural time series models (STSMs) and reported that demand for petroleum products in Nigeria is both price and income inelastic. Ajayi et al. [19] applied the NARDL approach, a VAR model and the Bai-Perron structural breaks test to model the impact of oil price shocks on energy consumption in Nigeria. Their study indicated that, despite changes in the oil price, oil-related energy consumption remains the same or considerable because of low investment in other energy sources. Usoro et al. [20] fitted ARCH (2), ARCH (3) and GARCH (3, 3) models to the variance of the crude oil series and to the variance of the ARIMA (0, 1, 1) errors. Their study suggested that GARCH and ARCH captured the fluctuation in the series, while the bilinear non-linear component of the model parameter did not show evidence of volatility clustering. Therefore, when fitting volatile series, GARCH and ARCH models are preferred to the bilinear model. Ojugo and Yoro [21] modeled the oil market price using an ARIMA model and forecast its direction while seeking the optimal solution. They found that the demand-supply curve rises despite the plummet in the trend and the policies in place at the time of the study. In Canada and the United States, Valadkhani [22], using a Markov regime-switching model and the Bai-Perron sequential technique, discovered a considerable upward shift since 1999 in the marginal impact of oil prices on consumer energy costs. As a result, a percentage increase in oil prices raises the price of energy used.

2. Methodology

The data on the Nigerian Bonny Light crude oil price used for this research are secondary and were extracted from the Nigerian National Petroleum Corporation (NNPC) and Reuters. The monthly data range from January 2006 to December 2020. The data are divided into two sets: the training set (in-sample period) and the test set (out-of-sample period). Seventy percent of the data points constitute the training set, while the remaining thirty percent are used as the test set. The training set is used to estimate the parameters of the models, while the test set is used to validate the models in order to assess their performance on new data. The rolling window estimation approach is used in this study: the fixed historical set of data (the training set) is used to predict future values continuously over a period of time (the test set).
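The 70/30 division described above is chronological rather than random. A minimal Python sketch of such a split is given below; the function name `chronological_split` and the use of (year, month) labels in place of the actual prices are illustrative assumptions, not part of the original analysis (which was carried out in R).

```python
def chronological_split(series, train_fraction=0.7):
    """Split a time-ordered sequence into train and test parts without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    n_train = round(len(series) * train_fraction)
    return series[:n_train], series[n_train:]


# 180 monthly observations, Jan. 2006 to Dec. 2020, labeled as (year, month).
months = [(2006 + i // 12, i % 12 + 1) for i in range(180)]
train, test = chronological_split(months)
print(len(train), train[-1])  # 126 (2016, 6)
print(len(test), test[0])     # 54 (2016, 7)
```

With 180 months, the split gives 126 training months ending in June 2016, which agrees with the training period reported in Section 3.1, and 54 test months.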

In this work, four different models are compared on the described data. The best model is selected using the following criteria: the modified Diebold-Mariano test, the Root Mean Square Error, the Mean Absolute Percentage Error and the Nash-Sutcliffe Efficiency. The explanatory variables for the machine learning models are time, lag 1 values of the crude oil price, crude oil production and crude oil exportation.

2.1. Autoregressive Integrated Moving Average

Box and Jenkins [23] first developed the Autoregressive Integrated Moving Average (ARIMA) model. As the name implies, it combines the Autoregressive (AR) and Moving Average (MA) models on stationary data. The order of the model is usually denoted ARIMA (p, d, q), where “d” is the number of times the data are differenced to make them stationary, “p” is the number of spikes that cross the significance line of the Partial Autocorrelation Function (PACF) plot and “q” is the number of spikes that cross the significance line of the Autocorrelation Function (ACF) plot. The process of identifying the model order is explained in [24]. The Augmented Dickey-Fuller test with a trend and an intercept is used to test for stationarity.

Given time series data $X_{t}$ , the ARIMA (p, d, q) model is given as:

$ϕ (B) {(1 - B)}^{d} X_{t} = θ (B) Z_{t}$ (1)

where

$ϕ (B)$ is the characteristic polynomial of order “p” for the autoregressive component of the model;

$θ (B)$ is the characteristic polynomial of order “q” for the moving average component of the model;

${(1 - B)}^{d}$ is the differencing of order “d” of the data;

$X_{t}$ is the observed value at time t;

$Z_{t}$ is the random error associated with the observation at time t.
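As a worked illustration, consider the order (1, 1, 0) that is eventually selected in Section 3.1. Assuming the usual sign convention $ϕ (B) = 1 - ϕ_{1} B$ and $θ (B) = 1$ (the convention is an assumption here, since it is not stated explicitly above), Equation (1) becomes:

$(1 - ϕ_{1} B) (1 - B) X_{t} = Z_{t}$

Expanding the product gives $(1 - (1 + ϕ_{1}) B + ϕ_{1} B^{2}) X_{t} = Z_{t}$ , so that

$X_{t} = (1 + ϕ_{1}) X_{t - 1} - ϕ_{1} X_{t - 2} + Z_{t}$

Under this convention, the current price depends on the two preceding prices through the single autoregressive coefficient $ϕ_{1}$ .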

The model obtained by inspecting the ACF and PACF plots is compared to models with parameters close to it in order to identify a better model using the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC). The model with the least values of both AIC and BIC is selected and subjected to residual checks using the ACF plot, the time series plot and the normality histogram plot of the residuals to test whether the residuals are white noise. White noise residuals are crucial in time series forecasting: if the residuals are not white noise, the model should be improved before it is used for prediction. The identification of white noise residuals is explained in Section 3.1 of this study.

2.2. Artificial Neural Network

The ANN model as invented by Frank Rosenblatt in [25] is a machine learning method used in modelling complex nonlinear relationships between the response and explanatory variables. ANN has been used and explained by [26].

The mathematical representation of the ANN model is given as:

$\hat{y} (x_{i}, w) = Φ_{0} (α + \sum_{h = 1}^{H} w_{h} Φ_{h} (α_{h} + \sum_{j = 1}^{J} w_{j h} x_{i j}))$ (2)

where

$\hat{y} (x_{i}, w)$ is the estimated response variable;

$w_{j h}$ is the weight from the input to hidden nodes;

$w_{h}$ is the weight from the hidden to output node;

$x_{i j}$ is the input node;

$α$ and $α_{h}$ are biases that can be interpreted as the intercept in a linear regression;

$Φ_{0}$ and $Φ_{h}$ are activation functions.

The transmission from the input layer to the hidden layer is done using the logistic activation function, while the linear activation function is used for the transmission from the hidden layer to the output layer. Each connection between nodes is assigned a weight. The quadratic error function given in Equation (3) is used in this study to determine the weights.

$E_{Q} = \sum_{k = 1}^{K} \sum_{i = 1}^{n} {(\hat{y}_{k} (x_{i}, w) - y_{i k})}^{2}$ (3)

where

$\hat{y}_{k} (x_{i}, w)$ is the estimated response variable for output node k;

$y_{i k}$ is the response variable.

The importance of each input node is estimated using the Olden method [27].

The explanatory variables constitute the input nodes and were normalized using the min-max normalization method to help the neural network converge quickly [24] [28].
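The pieces described in this subsection (min-max normalization, the forward pass of Equation (2) with a logistic hidden layer and a linear output, and the quadratic error of Equation (3) for a single output) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the function names and all weight values are hypothetical, and the sketch does not reproduce the R implementation used in the study.

```python
import math


def min_max_normalize(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max normalization needs at least two distinct values")
    return [(v - lo) / (hi - lo) for v in values]


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def ann_forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Equation (2) with a logistic hidden layer and a linear (identity) output."""
    hidden = [
        logistic(bias + sum(w * xj for w, xj in zip(weights, x)))
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return output_bias + sum(w * h for w, h in zip(output_weights, hidden))


def quadratic_error(predictions, targets):
    """Equation (3) for a single output node (K = 1)."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets))


# Hypothetical network with four inputs, one hidden node and one output node.
# The weights below are made up for illustration only.
x = [0.2, 0.5, 0.1, 0.9]  # already normalized inputs
y_hat = ann_forward(
    x,
    hidden_weights=[[0.4, -0.3, 0.8, 0.1]],
    hidden_biases=[0.05],
    output_weights=[1.5],
    output_bias=-0.2,
)
print(y_hat)
```

Training would consist of choosing the weights and biases that minimize `quadratic_error` over the training set; that optimization step is omitted here.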

2.3. Random Forest

Random forest is an easy-to-use machine learning method. It is an ensemble of decision trees for regression and classification. The output is the mode of the classes (for classification) or mean prediction (for regression) of the individual trees [29].

When splitting a node, random forest considers only a random subset of the features and searches for the best feature within that subset, which generally results in a better model. The steps used to grow each tree are given in [30]. The more correlated any two trees are, the higher the forest error rate. Figure 1 is a plot of a random forest.

Source: Analyticsvidhya.com.

Figure 1. A random forest plot.
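The two ideas above, a random subset of features at each split and averaging the predictions of the individual trees, can be illustrated with a deliberately simplified sketch. It assumes depth-1 regression trees (stumps), bootstrap resampling and toy data; the names `fit_stump`, `fit_forest`, `predict` and the parameter `mtry` are hypothetical choices for illustration and this is not the random forest implementation used in the study.

```python
import random
from statistics import mean


def fit_stump(rows, targets, feature_ids):
    """Fit a depth-1 regression tree using only the features in `feature_ids`."""
    best = None
    for f in feature_ids:
        # Candidate thresholds: every observed value except the largest,
        # so that both sides of the split are non-empty.
        for threshold in sorted({row[f] for row in rows})[:-1]:
            left = [t for row, t in zip(rows, targets) if row[f] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[f] > threshold]
            left_mean, right_mean = mean(left), mean(right)
            sse = (sum((t - left_mean) ** 2 for t in left)
                   + sum((t - right_mean) ** 2 for t in right))
            if best is None or sse < best[0]:
                best = (sse, f, threshold, left_mean, right_mean)
    if best is None:  # all candidate features are constant in this sample
        constant = mean(targets)
        return lambda row: constant
    _, f, threshold, left_mean, right_mean = best
    return lambda row: left_mean if row[f] <= threshold else right_mean


def fit_forest(rows, targets, n_trees=50, mtry=2, seed=0):
    """Grow `n_trees` stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]      # bootstrap sample
        feature_ids = rng.sample(range(n_features), mtry)  # random feature subset
        forest.append(fit_stump([rows[i] for i in sample],
                                [targets[i] for i in sample], feature_ids))
    return forest


def predict(forest, row):
    """Regression output: the mean prediction of the individual trees."""
    return mean(tree(row) for tree in forest)


# Toy data: the target jumps from 1 to 5 as the first feature grows.
rows = [[1, 10], [2, 9], [3, 8], [4, 7], [5, 6], [6, 5]]
targets = [1, 1, 1, 5, 5, 5]
forest = fit_forest(rows, targets)
print(predict(forest, [1, 10]), predict(forest, [6, 5]))
```

A full random forest grows deep trees and draws a fresh feature subset at every node; with stumps there is only one node per tree, so the two coincide in this sketch.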

2.4. Fuzzy Time Series (Ruey Chen Tsaur’s Algorithm)

Usually, time series and regression models are used for forecasting and prediction with statistical methods, but these models have many drawbacks in practice. Regression models require a number of assumptions that are often not satisfied, and time series models perform poorly when there are abnormal changes or when the series is non-stationary. To overcome these drawbacks, various models have been recommended, such as the random forest [30], artificial neural network [31], support vector regression [32], multivariate adaptive regression spline [33], adaptive spline threshold autoregressive model [34], etc.

All these models, both the traditional and the recommended ones, were developed primarily for solving forecasting problems. For forecasting problems in which the historical data are presented as linguistic values, the fuzzy time series (FTS) model was proposed by [32] and tested with enrollment data from the University of Alabama (EnrollmentUA). The model consists of two major processes: 1) fuzzification and 2) the establishment of fuzzy relationships and forecasting. Different lengths of intervals lead to different forecasting results during the fuzzification process; therefore, effective lengths of intervals should be used [35]. Forecasts based on effective lengths of intervals were found to outperform those based on arbitrary ones. Many authors have studied and even extended the FTS model (Chen [36]; Huarng [37]; Huarng and Yu [38]; Singh [39]; Teoh et al. [40]; Liu et al. [41]; Yu and Huarng [42]; Khashei et al. [43]; Bas et al. [44] and Egrioglu et al. [45]).

Chen’s [36] algorithm is simple, does not require complex matrix operations when establishing the fuzzy relationships, and produced better results on the same enrollment data [46]. These advantages encouraged the use of Chen’s algorithm in this research. The algorithm is summarized in Table 1.

Chen’s algorithm and the other models in this work were implemented using the R programming software, version 4.0.5.

Table 1. Chen’s algorithm.

2.5. Performance Measures

RMSE, MAPE and NSE were calculated as simple performance measures. The model with the least RMSE and MAPE values and the highest NSE value is chosen as the best model. The modified Diebold-Mariano test was also implemented to test the hypothesis that model 2 is a better model than model 1 at the 5% level of significance.
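The three measures are not written out in this paper. A sketch using their standard textbook definitions (an assumption here, not taken from the authors' code) is:

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def nse(actual, predicted):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of error to total variation."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - sse / sst


# Hypothetical prices and forecasts, for illustration only.
actual = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
print(rmse(actual, predicted), mape(actual, predicted), nse(actual, predicted))
```

Under these definitions, RMSE and MAPE are zero for a perfect forecast, while NSE equals one for a perfect forecast and zero for a forecast that is no better than the mean of the observed values.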

3. Results

This section contains the results of the four models. The highest and lowest prices within the study period were recorded in June 2008 (138.74 US$/barrel) and April 2020 (14.28 US$/barrel) respectively. The Nigerian Bonny Light crude oil price rose and fell repeatedly, with an average price of 77.02 US$/barrel over the 180-month study period.

3.1. The ARIMA Model

The time series plot of the Bonny Light crude oil price for the period Jan. 2006 to Dec. 2020 is displayed in Figure 2. As illustrated in [47], the parameters (p, d, q) of the ARIMA model were estimated using the train set (crude oil price between Jan. 2006 and Jun. 2016). The Augmented Dickey-Fuller test of lag order 4 shown in Table 2 suggests that the train set is not stationary (p-value (0.6387) > 0.05), which is also evident in the time series plot of Figure 3. Stationarity was obtained after first differencing, as shown in the time series plot in Figure 4, and the Augmented Dickey-Fuller test of lag order 4 is now significant (p-value (0.000) < 0.05). The first order of differencing suggests a value of “1” for the “d” parameter of the ARIMA model. The lags die out quickly in the ACF plot in Figure 5, while there is a sharp cutoff after the first lag in the PACF plot in Figure 6. This suggests orders of “1” and “0” for the “p” and “q” parameters respectively. To choose the optimal ARIMA model, the suggested ARIMA model of order (1, 1, 0) was compared to models with parameters similar to it, as indicated in [47], using the AIC and BIC. As shown in Table 3, the ARIMA (1, 1, 0) model was chosen as the best order of ARIMA model since it had the lowest AIC and BIC values. The parameter estimate of the model is displayed in Table 4. The time series plot of the residuals in Figure 7 shows no trend, no spikes cut through the significance line of the ACF plot, and the residuals are normally distributed. The ARIMA (1, 1, 0) residuals are therefore white noise, and the model can be used for prediction.

Figure 2. Time series plot of Bonny light crude oil price in Nigeria from Jan. 2006-Dec. 2020.

Table 2. Augmented Dickey-Fuller test.

Figure 3. Time series plot of Bonny light crude oil price in Nigeria from Jan. 2006-Jun. 2016.

Figure 4. Time series plot of first differenced series.

Table 3. The different ARIMA models.

Table 4. The estimates of the coefficient of ARIMA (1, 1, 0).

Figure 5. The ACF plot of the crude oil price.

Figure 6. The PACF plot of the crude oil price.

3.2. The Artificial Neural Network Model

Here, the best model is selected by trying different node sizes in the hidden layer of the ANN and choosing the model with the best trade-off among the RMSE, MAPE and NSE values. Node sizes ranging from 1 to 8 were tried in the hidden layer. Based on the comparison values in Table 5, the model with one node in the hidden layer was chosen. The ANN plot with four input nodes, one hidden node and one output node is shown in Figure 8. Thus, the ANN is of order ANN (4, 1, 1). The weights of the neural network were estimated by minimizing the quadratic loss function. Figure 9 shows that the lag 1 values of the crude oil price are the most important in estimating the current crude oil price in Nigeria, followed by quantity produced, time and quantity exported.

3.3. The Random Forest Model

Here, the best model is selected by trying different numbers of explanatory variables and, as with the ANN, choosing the model with the best trade-off among the RMSE, MAPE and NSE values. The performance measures for the different choices of explanatory variables are given in Table 6. It is evident that using the four explanatory variables (lag 1, production, time and export) at each node for splitting had the best trade-off in the RMSE, MAPE and NSE values in both the training and test sets. Again, the RF model shows that the first lag of the crude oil price is the most important variable in estimating the Nigerian Bonny Light crude oil price. The mean square error (%IncMSE) and the node impurity (%IncNodePurity) increase the most when the “Lag1” variable is randomly permuted, followed by the “Time” variable, as shown in Figure 10.

Figure 7. The plot of the residuals of ARIMA (1, 1, 0) model.

Table 5. Checking ANN node sizes in the hidden layer.

Figure 8. The Neural Network of the ANN (4, 1, 1) model.

Figure 9. Variable importance of the ANN (4, 1, 1) model.

Table 6. Checking the RF models with different variable numbers.

Figure 10. Variable importance of the RF model.

3.4. Fuzzy Time Series (Ruey Chen Tsaur’s Algorithm)

With the steps in Table 1, the following results for the test set are obtained:

1) The universal set is $U = [D' - D_{1}, D'' + D_{2}]$ , where $D'$ and $D''$ are the minimum and maximum values of the time series data and $D_{1}$ and $D_{2}$ are fixed positive numbers. For this paper, we have chosen $D_{1} = 8$ and $D_{2} = 10$ for the modeling of the test set. Given that the minimum and maximum values of the test set are 14.28 and 79.59 respectively, then

$U = [6.28, 89.59]$

2) The method of [48] was used to get the number of subintervals for the linguistic interval, and the size of each subinterval was determined by dividing the range by the number of subintervals.

3) The subintervals are hereby defined as:

$u_{1} = [6.28, 18.18]$ ; $u_{2} = [18.18, 30.08]$ ; $u_{3} = [30.08, 41.98]$ ; $u_{4} = [41.98, 53.88]$ ; $u_{5} = [53.88, 65.79]$ ; $u_{6} = [65.79, 77.68]$ ; $u_{7} = [77.68, 89.59]$ .

4) The fuzzification vector is given thus:

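$\text{fuzzify} = \left( \begin{array}{l} 4 \ 4 \ 4 \ 4 \ 4 \ 5 \ 4 \ 4 \ 4 \ 4 \ 4 \ 4 \ 4 \ 5 \ 5 \ 5 \ 5 \ 6 \ 6 \ 6 \ 6 \ 6 \ 6 \ 6 \ 6 \ 7 \ 7 \ 6 \\ 5 \ 5 \ 5 \ 6 \ 6 \ 6 \ 6 \ 6 \ 5 \ 5 \ 5 \ 5 \ 6 \ 6 \ 5 \ 3 \ 1 \ 2 \ 3 \ 3 \ 4 \ 4 \ 3 \ 3 \ 4 \ 4 \end{array} \right)$

where the nth value of the fuzzification vector is the index of the subinterval (linguistic value) assigned to the nth observation.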

5) The fuzzy relationship created based on the fuzzification vector is given in Table 7.

6) The transition matrix obtained from the fuzzy relationship table in Table 7 is given by:

$\text{Transition matrix} = \begin{pmatrix} 0.00 & 1.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 \\ 0.00 & 0.00 & 1.00 & 0.00 & 0.00 & 0.00 & 0.00 \\ 0.20 & 0.00 & 0.40 & 0.40 & 0.00 & 0.00 & 0.00 \\ 0.00 & 0.00 & 0.07 & 0.80 & 0.13 & 0.00 & 0.00 \\ 0.00 & 0.00 & 0.08 & 0.08 & 0.62 & 0.23 & 0.00 \\ 0.00 & 0.00 & 0.00 & 0.00 & 0.19 & 0.75 & 0.06 \\ 0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 0.50 & 0.50 \end{pmatrix}$

Table 7. Fuzzy relationship for the valid set.

It is important to note that the transition matrix is calculated as State matrix divided by sum of row values. The state matrix is given by:

$\text{State matrix} = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 2 & 2 & 0 & 0 & 0 \\ 0 & 0 & 1 & 12 & 2 & 0 & 0 \\ 0 & 0 & 1 & 1 & 8 & 3 & 0 \\ 0 & 0 & 0 & 0 & 3 & 12 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 \end{pmatrix}$

For instance, the value at row 4, column 4 of the transition matrix is obtained as $\frac{12}{0 + 0 + 1 + 12 + 2 + 0 + 0} = \frac{12}{15} = 0.80$ . The same method is applied to get the remaining values of the transition matrix.

7) The forecast fuzzy outputs are defuzzified into crisp values, which are the FTS predicted values of the Bonny Light crude oil price.
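The numerical steps above can be retraced with a short Python sketch. It takes the fuzzification vector as printed in step 4, since the underlying monthly prices are not reproduced in this section, and rebuilds the universal set, the state matrix and the transition matrix. The helper names (`fuzzify`, `state_matrix`, `transition_matrix`) and the equal-width fuzzification rule are assumptions made for illustration; the forecasting and defuzzification rule of Table 1 is not reproduced here.

```python
D_MIN, D_MAX = 14.28, 79.59  # minimum and maximum of the test set (from the text)
D1, D2 = 8, 10               # margins chosen in the text
N_INTERVALS = 7

lower, upper = D_MIN - D1, D_MAX + D2
width = (upper - lower) / N_INTERVALS


def fuzzify(price):
    """Return the 1-based index of the equal-width subinterval containing `price`."""
    index = int((price - lower) // width) + 1
    return min(max(index, 1), N_INTERVALS)


# Fuzzification vector as printed in step 4 (54 test-set months).
FUZZIFIED = [4, 4, 4, 4, 4, 5, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 6,
             6, 6, 6, 7, 7, 6, 5, 5, 5, 6, 6, 6, 6, 6, 5, 5, 5, 5, 6, 6, 5, 3,
             1, 2, 3, 3, 4, 4, 3, 3, 4, 4]


def state_matrix(labels, n_states):
    """Count transitions from each linguistic value to the next one."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(labels, labels[1:]):
        counts[current - 1][following - 1] += 1
    return counts


def transition_matrix(counts):
    """Divide each row of the state matrix by its row sum."""
    rows = []
    for row in counts:
        total = sum(row)
        rows.append([c / total if total else 0.0 for c in row])
    return rows


states = state_matrix(FUZZIFIED, N_INTERVALS)
transitions = transition_matrix(states)
print(round(lower, 2), round(upper, 2))   # 6.28 89.59
print(states[3])                          # [0, 0, 1, 12, 2, 0, 0]
print([round(p, 2) for p in transitions[3]])
```

Counting consecutive pairs in the printed fuzzification vector reproduces the state matrix shown above, and dividing row 4 by its sum of 15 gives 0.07, 0.80 and 0.13, matching the transition matrix.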

3.5. Comparison of the ARIMA, ANN, RF and Fuzzy Time Series Models

The ARIMA (1, 1, 0), ANN (4, 1, 1), RF and FTS models were compared in both the training and the test sets using the RMSE, MAPE and NSE performance measures to identify a high-performance model. As an additional comparison measure, the modified Diebold-Mariano test was used to test the null hypothesis that model 2 performs the same as model 1. When comparing ARIMA-ANN, for example, the ARIMA is model 1 and the ANN is model 2, and the alternative hypothesis is that the ANN is a better model than the ARIMA. A significant p-value therefore shows that model 2 is better than model 1. This test revealed that ANN, RF and FTS are better than ARIMA in the test set and that FTS is better than RF and ANN, while ANN and RF performed alike in the test set. This is also verified by examining the RMSE, MAPE and NSE values for these models. This result is in agreement with the works of [49] in China and [50] using New York Stock Exchange data.
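For readers unfamiliar with the test, the sketch below computes the Harvey-Leybourne-Newbold modified Diebold-Mariano statistic from two series of forecast errors. The squared-error loss, the default one-step horizon and the function name `modified_dm_statistic` are assumptions made for illustration, since the paper does not state which loss function or horizon was used; the error values are made up.

```python
import math


def modified_dm_statistic(errors_1, errors_2, horizon=1):
    """Modified Diebold-Mariano statistic under squared-error loss.

    Positive values indicate that model 2 has the smaller average loss.
    """
    n = len(errors_1)
    if n != len(errors_2) or n < 2:
        raise ValueError("need two error series of equal length >= 2")
    # Loss differential: loss of model 1 minus loss of model 2.
    d = [e1 ** 2 - e2 ** 2 for e1, e2 in zip(errors_1, errors_2)]
    d_bar = sum(d) / n

    def autocov(lag):
        return sum((d[t] - d_bar) * (d[t - lag] - d_bar) for t in range(lag, n)) / n

    # Long-run variance of the loss differential up to lag horizon - 1.
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, horizon))
    dm = d_bar / math.sqrt(long_run_var / n)
    # Small-sample correction of Harvey, Leybourne and Newbold.
    correction = math.sqrt((n + 1 - 2 * horizon + horizon * (horizon - 1) / n) / n)
    return correction * dm


# Hypothetical forecast errors of model 1 and model 2.
e1 = [2.0, -3.0, 1.0, -2.0, 4.0]
e2 = [1.0, -1.0, 0.5, -1.0, 2.0]
print(modified_dm_statistic(e1, e2))
```

The statistic is compared with a Student t distribution with n - 1 degrees of freedom; for the one-sided alternative that model 2 is better, large positive values lead to rejection. The p-value computation is left out because the Python standard library has no t distribution.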

4. Discussion

The ARIMA model that had been used in most articles [15] [16] in modeling the Nigerian Bonny Light crude oil price performed poorly when compared to the two machine learning models (ANN and RF) and the FTS model, as shown in Table 8 and Figure 11. The best ARIMA order revealed by this study differs from the ones identified by [15] and [16]. This suggests that the Bonny Light crude oil characteristics are unsteady and require close monitoring via remodeling. The RF model with four variables selected at each node for splitting has the best trade-off of RMSE, MAPE and NSE values in both the training and test sets when compared to the ARIMA model, but it performed the same way as ANN (4, 1, 1) in the test set, as shown in Table 8. The FTS model performed better than all the other models in the test set but had the same power as RF in the train set alone, as shown in Table 8. The previous price of crude oil plays a major role in estimating the current price of crude oil, as the “Lag 1” variable had the highest contribution in the model.

Table 8. Comparison of the ARIMA (1, 1, 0), ANN (4, 1, 1), RF and FTS models.

Footnote: *sig. at 5% and model 2 performs better than the first model.

5. Conclusions

In this study, we have attempted to compare the following models: ARIMA (1, 1, 0), ANN (4, 1, 1), RF and FTS in modeling the Nigerian Bonny Light crude oil price. The four models were built using the train set and the ability of the model obtained was validated using the test set. However, for FTS, the model used for the test set was based on the subintervals created using the said data set. The true values and the predicted values obtained by using the four models for the test set are shown in Table 9.

The modified Diebold-Mariano test, RMSE, MAPE and NSE were used to identify the best model. The FTS model fits the Bonny Light crude oil price better than ARIMA (1, 1, 0), ANN (4, 1, 1) and RF. This has also revealed the power of the FTS model with Chen’s algorithm, as shown by [46]. The FTS model is hereby recommended as the best model for estimating the price of the Nigerian Bonny Light crude oil.

Figure 11. Plot of the True Value (TV), RF, Fuzzy, ARIMA and ANN forecast values for the test set.

Table 9. The True Value (TV), ARIMA, ANN and RF forecast values for the test set.

These findings recommend that hybrid ARIMA-ANN models, as in (Zhang [51]; Alshaimaa [52]), and ARIMA-RF models be fitted to the Nigerian Bonny Light crude oil price data and that the results be compared with the FTS model to determine whether the hybrid models perform better than the FTS model.

Declaration

Availability of Data and Material

Yes (Figshare).

https://figshare.com/articles/dataset/Modeling_the_Price_of_Crude_Oil_in_Nigeria_Identification_of_a_reliable_model_for_application/14588679

Authors’ Contribution

Desmond Chekwube Bartholomew: conceptualization, methodology, software, writing original draft preparation. Chukwudi Paul Obite: methodology, formal analysis, writing original draft preparation. Ukamaka Cynthia Orumie: validation, data curation, writing, review and editing. Blessing Iheoma Duru: investigation, writing, reviewing and editing. Felix Chikereuba Akanno: visualization, writing, review and editing. All authors read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1]	Energy Insight (n.d.) Bonny Light Crude. https://www.mckinseyenergyinsights.com/resources/refinery-reference-desk/bonny-light-crude
[2]	Nigerian National Petroleum Corporation (n.d.) History of the Nigerian Petroleum Industry. https://nnpcgroup.com/NNPC-Business/Business-Information/Pages/Industry-History.aspx
[3]	Frynas, J.G. (1999) Oil in Nigeria: Conflict and Litigation between Oil Companies and Village Communities. Lit Verlag, Munster.
[4]	Nwanne, N.C. and Eze, O.R. (2019) Effect of Government Oil Revenue on Agricultural Sector Growth: Evidence from Nigerian Economy. European Journal of Social Sciences, 58, 164-177.
[5]	World Bank (2013) Nigeria Economic Report. No. 1, World Bank, Washington DC.
[6]	EIA (1997) Country Analysis Brief: Nigeria, Independent Statistics and Analysis, U.S. Energy Information Administration. https://www.eia.gov/international/content/analysis/countries_long/Nigeria/nigeria.pdf
[7]	Chika, I. (2014) US Shut Its Door on Nigeria’s Oil Exports. http://grandmotherafrica.com/u-s-shuts-its-door-on-nigerias-oil-exports/
[8]	OPEC (Organization of the Petroleum Exporting Countries) (2018) Annual Statistical Bulletin 2018. Organization of the Petroleum Exporting Countries.
[9]	William Carpenter, J. (2020) The Main Oil Producing Countries in Africa. https://www.investopedia.com/articles/investing/101515/biggest-oil-producers-africa.asp
[10]	Blanchard, O. and Gali, J. (2007) The Macroeconomic Effect of Oil Shock: Why Are the 2000s So Different from the 1970s? Working Paper No. 13368, National Bureau of Economic Research, Cambridge. https://doi.org/10.3386/w13368
[11]	Yomi, K. (2020) Africa’s Largest Economy Has Suffered Its Worst Contraction in over a Decade. https://www.yahoo.com/entertainment/africa-largest-economy-suffered-worst-154220869.html
[12]	Deniz, P. (2017) Oil Price and Renewable Energy: Oil Dependent Countries. Journal of Research in Economics, 3, 139-150. https://doi.org/10.35333/JORE.2019.52
[13]	Odusami, B.O. (2010) To Consume or Not: How Oil Prices Affect the Movement of Consumption and Aggregate Wealth. Energy Economics, 32, 857-867. https://doi.org/10.1016/j.eneco.2009.11.010
[14]	Wiri-Agri, E.M., Inusa, M.-L.D. and Kennedy, N.D. (2016) Impact of Oil Price Volatility on Macroeconomic Variables and Sustainable Development in Nigeria. International Journal of Economics and Financial Research, 2, 33-40.
[15]	Suleiman, S., Alabi, M.A., Issa, S., Usman, U. and Umar, A. (2015) Modeling and Forecasting the Crude Oil Price in Nigeria. International Journal of Novel Research in Marketing Management and Economics, 2, 1-13.
[16]	Wiri, L. and Tuaneh, G.L. (2019) Modeling the Nigerian Crude Oil Prices Using ARIMA, Pre-Intervention and Post-Intervention Model. Asian Journal of Probability and Statistics, 3, 1-12. https://doi.org/10.9734/ajpas/2019/v3i130083
[17]	Omekara, C.O., Okereke, O.E., Ire, K.I. and Okamgba, C.O. (2015) ARIMA Modeling of Nigerian Crude Oil Production. Journal of Energy Technologies and Policy, 5, 1-5.
[18]	Aliyu, B.A. (2014) Modeling Petroleum Product Demand in Nigeria Using Structural Time Series Model (STSM) Approach. International Journal of Energy Economics and Policy, 4, 427-441.
[19]	Ajayi, P.I., Longe, A.E., Omitogun, O.A. and Muhammed, S. (2019) Oil Price Shocks and Energy Consumption in Nigeria. Izvestiya Journal of Varna University of Economics, 63, 275-293.
[20]	Usoro, A.E., Ikpang, I.N. and George, E.U. (2020) Volatility Measures of Nigeria Crude Oil Production as a Tool to Investigate Production Variability. African Journal of Mathematics and Computer Science Research, 13, 1-16. https://doi.org/10.5897/AJMCSR2019.0785
[21]	Ojugo, A.A. and Yoro, R.E. (2020) Predicting Futures Price and Contract Portfolios Using the ARIMA Model: A Case of Nigeria’s Bonny Light and Forcados. Quantitative Economics and Management Studies, 1, 237-248. https://doi.org/10.35877/454RI.qems139
[22]	Valadkhani, A. (2014) Dynamic Effects of Rising Oil Prices on Consumer Energy Prices in Canada and the United States: Evidence from the Last Half Century. Energy Economics, 45, 33-44. https://doi.org/10.1016/j.eneco.2014.06.015
[23]	Box, G.E.P. and Jenkins, G.M. (1976) Time Series Analysis: Forecasting and Control. Revised Edition, Holden-Day, San Francisco.
[24]	Nwosu, U.I., Obite, C.P. and Bartholomew, D.C. (2021) Modeling US Dollar and Nigerian Naira Exchange Rates during COVID-19 Pandemic Period: Identification of a High-Performance Model for New Application. Journal of Mathematics and Statistics Studies, 2, 40-52. https://doi.org/10.32996/jmss.2021.2.1.5
[25]	Rosenblatt, F. (1958) The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain. Psychological Review, 65, 386-408. https://doi.org/10.1037/h0042519
[26]	Obite, C.P., Olewuezi, N.P., Ugwuanyim, G.U. and Bartholomew, D.C. (2020) Multicollinearity Effect in Regression Analysis: A Feed Forward Artificial Neural Network Approach. Asian Journal of Probability and Statistics, 6, 22-33. https://doi.org/10.9734/ajpas/2020/v6i130151
[27]	Olden, J.D., Michael, K.J. and Russell, G.D. (2004) An Accurate Comparison of Methods for Quantifying Variable Importance in Artificial Neural Networks Using Simulated Data. Ecological Modelling, 178, 389-397. https://doi.org/10.1016/j.ecolmodel.2004.03.013
[28]	Nwokike, C.C., Offorha, B.C., Obubu, M., Ugoala, C.B. and Ukomah, H.I. (2020) Comparing SANN and SARIMA for Forecasting Frequency of Monthly Rainfall in Umuahia. Scientific African, 10, Article ID: e00621. https://doi.org/10.1016/j.sciaf.2020.e00621
[29]	Ho, T.K. (1998) The Random Subspace Method for Constructing Decision Forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20, 832-844. https://doi.org/10.1109/34.709601
[30]	Obite, C.P., Chukwu, A., Bartholomew, D.C., Nwosu, U.I. and Esiaba, G.E. (2021) Classical and Machine Learning Modeling of Crude Oil Production in Nigeria: Identification of an Eminent Model for Application. Energy Reports, 7, 3497-3505. https://doi.org/10.1016/j.egyr.2021.06.005
[31]	Egbo, M.N. and Bartholomew, D.C. (2018) Forecasting Students’ Enrollment Using Neural Networks and Ordinary Least Squares Regression Models. Journal of Advanced Statistics, 3, 45-57. https://doi.org/10.22606/jas.2018.34001
[32]	Oliveira, D.J. and Ludermir, T.B. (2014) A Distributed PSO-ARIMA-SVR Hybrid System for Time Series Forecasting. 2014 IEEE International Conference on Systems, Man, and Cybernetics (SMC), San Diego, 5-8 October 2014, 3867-3872. https://doi.org/10.1109/SMC.2014.6974534
[33]	Friedman, J.H. (1991) Multivariate Adaptive Regression Splines. Annals of Statistics, 19, 1-67. https://doi.org/10.1214/aos/1176347963
[34]	Abreu, P.H., Silva, D.C., Mendes-Moreira, J., Reis, L.P. and Garganta, J. (2013) Using Multivariate Adaptive Regression Splines in the Construction of Simulated Soccer Team’s Behavior Models. International Journal of Computational Intelligence Systems, 6, 893-910. https://doi.org/10.1080/18756891.2013.808426
[35]	Zhang, G.P. and Qi, M. (2005) Neural Network Forecasting for Seasonal and Trend Time Series. European Journal of Operational Research, 160, 501-514. https://doi.org/10.1016/j.ejor.2003.08.037
[36]	Chen, S.M. (1996) Forecasting Enrollments Based on Fuzzy Time Series. Fuzzy Sets and Systems, 81, 311-319. https://doi.org/10.1016/0165-0114(95)00220-0
[37]	Huarng, K. (2001) Heuristic Models of Fuzzy Time Series for Forecasting. Fuzzy Sets and Systems, 123, 369-386. https://doi.org/10.1016/S0165-0114(00)00093-2
[38]	Huarng, H. and Yu, T.H.K. (2006) The Application of Neural Networks to Forecast Fuzzy Time Series. Physica A, 363, 481-491. https://doi.org/10.1016/j.physa.2005.08.014
[39]	Singh, S.R. (2007) A Simple Method of Forecasting Based on Fuzzy Time Series. Applied Mathematics and Computation, 186, 330-339. https://doi.org/10.1016/j.amc.2006.07.128
[40]	Teoh, H.J., Cheng, C.H., Chu, H.H. and Chen, J.S. (2008) Fuzzy Time Series Model Based on Probabilistic Approach and Rough Set Rule Induction for Empirical Research in Stock Markets. Data & Knowledge Engineering, 67, 103-117. https://doi.org/10.1016/j.datak.2008.06.002
[41]	Liu, H.T., Wei, N.C. and Yang, C.G. (2009) Improved Time-Variant Fuzzy Time Series Forecast. Fuzzy Optimization and Decision Making, 8, 45-65. https://doi.org/10.1007/s10700-009-9051-8
[42]	Yu, H.K. and Huarng, K. (2010) A Neural Network-Based Fuzzy Time Series Model to Improve Forecasting. Expert Systems with Applications, 37, 3366-3372. https://doi.org/10.1016/j.eswa.2009.10.013
[43]	Khashei, M., Bijari, M. and Hejazi, C.S.R. (2011) An Extended Fuzzy Artificial Neural Networks Model for Time Series Forecasting. Iranian Journal of Fuzzy Systems, 8, 45-66.
[44]	Bas, E., Uslu, V.R., Aladag, C., Yolcu, U. and Egrioglu, E. (2014) A Modified Genetic Algorithm for Forecasting Fuzzy Time Series. Applied Intelligence, 41, 453-463. https://doi.org/10.1007/s10489-014-0529-x
[45]	Egrioglu, S., Bas, E., Aladag, C.H. and Yolcu, U. (2016) Probabilistic Fuzzy Time Series Method Based on Artificial Neural Network. American Journal of Intelligent Systems, 62, 42-47.
[46]	Rana, A.K. (2020) Comparative Study on Fuzzy Models for Crop Production Forecasting. Mathematics and Statistics, 8, 451-457. https://doi.org/10.13189/ms.2020.080412
[47]	Nwosu, U.I. and Obite, C.P. (2021) Modeling Ivory Coast COVID-19 Cases: Identification of a High-Performance Model for Utilization. Results in Physics, 20, Article ID: 103763. https://doi.org/10.1016/j.rinp.2020.103763
[48]	Sturges, H. (1926) The Choice of a Class Interval. Journal of the American Statistical Association, 21, 65-66. https://doi.org/10.1080/01621459.1926.10502161
[49]	Qihang, M. (2020) Comparison of ARIMA, ANN and LSTM for Stock Price Prediction. 2020 International Symposium on Energy, Environmental Science and Engineering, Vol. 218, Chongqing, 20-22 November 2020, Article ID: 01026. https://doi.org/10.1051/e3sconf/202021801026
[50]	Ayodele, A.A., Aderemi, O.A. and Charles, K.A. (2014) Comparison of ARIMA and Artificial Neural Networks Models for Stock Price Prediction. Journal of Applied Mathematics, 2014, Article ID: 614342. https://doi.org/10.1155/2014/614342
[51]	Zhang, P.G. (2003) Time Series Forecasting Using a Hybrid ARIMA and Neural Network Model. Neurocomputing, 50, 159-175. https://doi.org/10.1016/S0925-2312(01)00702-0
[52]	Alshaimaa, I.E. (2015) A Combined Model between Artificial Neural Networks and ARIMA Models. International Journal of Recent Research in Commerce Economics and Management (IJRRCEM), 2, 134-140. http://www.paperpublications.org
