Comparison between the UECM Model and the Composite Model in Forecasting Malaysian Imports
1. Introduction
Prediction is a difficult art, especially when the future is involved. Forecasting is the process of making statements about events whose actual outcomes have (typically) not yet occurred. It is a vital exercise for assessing the economic performance of countries. Malaysian economists would like to anticipate future imports in order to formulate policy properly, and Malaysian analysts would like to anticipate the future performance of imports in order to understand the factors that influence them.
Many investigations have examined how Malaysian imports behave, including [1] . These studies estimated a traditional (classical) import demand function in which the level of real income and relative prices serve as the explanatory variables and the volume of imports is the response variable. Their fundamental assumption is that the data are stationary. These studies were carried out before co-integration analysis and error correction models (ECM) became standard practice in time series analysis. To estimate the import demand function, they employed conventional ordinary least squares (OLS) regression or partial adjustment techniques. These researchers presumed that the model's explanatory variables and import volume have an underlying equilibrium relationship [2] . If the stationarity assumption is violated, the regression may be spurious, and the standard statistical inference of the OLS method becomes unreliable. In a later study, [3] used the [4] multivariate co-integration method to determine the long-run elasticities of import demand. Using an error correction model (ECM), they showed how current income and relative prices affect import growth in the short run. The estimated ECM's error correction term, however, was not significant at the 10% level, indicating the absence of a long-run relationship. [5] show that, for data with a small sample size, no co-integration relationship can be established among variables that are integrated of order one, I(1). [6] states that the ECM, [4] and [7] methods are not reliable for studies with small sample sizes, such as the study in [3] . [8] reinvestigated the Malaysian import demand function over the sample period 1970 to 1998 using another estimation method, the Unrestricted Error Correction Model (UECM) bounds test analysis.
[9] chose the dynamic Vector Error Correction Model to estimate the long-run behaviour of Malaysian imports over the sample period 1980 to 2010 in order to overcome the limited number of observations. [10] examined the long-run relationship of Malaysian import demand using time series techniques that address the problem of non-stationarity. [11] used Johansen's co-integration analysis to study the long-run relationship (co-integration) between Malaysian imports and exports for the annual period 1959 to 2000. [12] applied two co-integration tests, namely the Engle-Granger and Johansen tests, together with stationarity tests such as the Augmented Dickey-Fuller (ADF) test, to the Malaysian economy [13] . The Dickey and Fuller (1979) and Phillips and Perron (1988) unit root test statistics show that all variables are integrated of the same order. The results of the Johansen (1988) co-integration method show that there is a long-run relationship between the trade balance and the commodity terms of trade, but no long-run relationship between the trade balance and the income terms of trade in Malaysia. [14] [15] showed that the composite model provides better forecasts than the regression equation or the time series model alone. [16] developed basic artificial neural network (ANN) models for forecasting the in-sample gross domestic product (GDP) of Malaysia. [17] used multiple linear regression to study the importance of the macroeconomic variables that affect the total volumes of Malaysia's imports and exports. [18] concluded that the artificial neural network is the most successful model for forecasting imports and exports.
The composite model (which combines regression and ARIMA) has been used to predict Malaysia's future imports, and most researchers believe that it gives better results than either method used separately and helps to solve regression problems such as autocorrelation and heteroscedasticity. However, the accuracy of predictions based on the composite model, ARIMA and ARDL methods should be investigated further. Almost all such predictions use accuracy measures to select a best-fit model; however, the forecast values will not necessarily equal the actual values observed for the same period. This can be due to several factors, such as the various restrictions imposed by the Malaysian authorities to limit imports and the degree to which suppliers comply with these restrictions. Therefore, this study primarily aims to reinvestigate Malaysia's imports by developing the composite model approach. Two models, namely UECM and ARIMA, are integrated into a composite model to increase import prediction accuracy and improve existing methods for forecasting imports. As far as we are aware, no previous research has applied the same combination of statistical techniques to this problem.
2. Material and Methods
2.1. Material
This section describes the case study used to examine and compare the proposed models. The case was selected as described below.
This study uses data on Malaysia's imports. The import relationship was analysed by considering time series properties. The quarterly series of the value of Malaysia's imports in million RM, GDP and the value of Malaysia's exports in million from the first quarter of 1991 to the third quarter of 2022 (total of 128 observations) were utilized in this study. The source of this information was Malaysia's Department of Statistics. The graphical plots of the series are presented in Figure 1.
Commonly used accuracy measures are applied to assess the performance of each model described below: the root mean square error (RMSE), mean absolute percentage error (MAPE) and mean absolute deviation (MAD) [19] .
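$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2} \qquad (1)$$
$$\mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right| \qquad (2)$$
$$\mathrm{MAD} = \frac{1}{n}\sum_{t=1}^{n}\left|y_t - \hat{y}_t\right| \qquad (3)$$
where $y_t$ is the actual value, $\hat{y}_t$ is the forecast value and $n$ is the number of observations.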
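As a minimal sketch, the three accuracy measures named above can be computed in a few lines of Python. The function names and the sample numbers are illustrative assumptions and do not come from the study. MAD is taken here as the mean absolute forecast error, which is an assumption about the definition used in [19].

```python
import math

def rmse(actual, forecast):
    """Root mean square error of the forecasts."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n

def mad(actual, forecast):
    """Mean absolute deviation, taken here as the mean absolute forecast error."""
    n = len(actual)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / n

# Illustrative numbers only (not data from the study).
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
print(rmse(actual, forecast), mape(actual, forecast), mad(actual, forecast))
```

All three measures are zero for a perfect forecast. MAPE is scale-free, whereas RMSE and MAD are expressed in the units of the series.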
Figure 1. Time series of Malaysian imports.
2.2. Methods
• Stationarity test:
A time series is a collection of observations on a variable taken regularly over time at predefined intervals. A time series is said to be covariance stationary (weakly or simply stationary) if its mean and variance are constant and its covariance depends only on the interval or lag between two periods rather than on the actual time at which the covariance is calculated [20] [21] [22] . To model a time series with ARIMA and exponential smoothing methods, the time series must be stationary. It is common practice to estimate the model coefficients using OLS regression, and the stochastic process must be stationary for OLS to be effective. Using OLS when the stochastic process is nonstationary can produce misleading estimates. Granger [23] referred to such estimates as "spurious regression" results, since they have high R² values and t-ratios but no discernible economic significance. The ADF and PP unit root tests are run in this study to test for stationarity while accounting for autocorrelation in the time series. Additionally, this study uses the autocorrelation function (ACF) and partial autocorrelation function (PACF) to assess the stationarity of the data. The ACF of a nonstationary series typically decays only gradually as the lag increases.
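The slow ACF decay of a nonstationary series can be illustrated with a small simulation. This is only a sketch: the series is a simulated random walk, not the import data, and the helper `acf` is a hypothetical name written for this example. It does not replace the ADF or PP tests.

```python
import random

def acf(series, max_lag):
    """Sample autocorrelation function for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

random.seed(0)
# Simulated random walk (a textbook nonstationary series), not the import data.
walk = [0.0]
for _ in range(499):
    walk.append(walk[-1] + random.gauss(0.0, 1.0))

# First differences of a random walk are white noise, hence stationary.
diff = [b - a for a, b in zip(walk, walk[1:])]

acf_level = acf(walk, 8)
acf_diff = acf(diff, 8)
print([round(r, 2) for r in acf_level])
print([round(r, 2) for r in acf_diff])
```

The autocorrelations of the level series stay close to one over many lags, whereas those of the differenced series drop to near zero immediately. This is the visual pattern used alongside the formal unit root tests.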
• Composite Model
The composite (combined regression-ARIMA) model has proven useful in many areas, such as economic and business forecasting. The method is well documented [25] and has been shown to be computationally efficient. The model is expressed as
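$$y_t = \beta_0 + \beta_1 x_{1t} + \beta_2 x_{2t} + \cdots + \beta_k x_{kt} + \phi^{-1}(B)\,\theta(B)\,\varepsilon_t, \qquad (4)$$
where $y_t$ is the dependent variable, $x_{1t}, \ldots, x_{kt}$ are the independent variables, $\beta_0, \ldots, \beta_k$ are the regression parameters, $\phi(B)$ and $\theta(B)$ are polynomials in the backshift operator $B$ whose coefficients are the AR and MA parameters, respectively, and $\varepsilon_t$ is the error random variable.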
This composite model can handle a high degree of autocorrelation in the residuals. Therefore, this study builds the composite model on the UECM (CO-UECM) to improve its performance.
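The idea behind the composite model can be sketched in three steps: fit the regression, fit a time series model to its residuals, and add the two fitted parts. The example below is a deliberately simplified illustration under stated assumptions. It uses a single regressor and an AR(1) residual model instead of a full ARIMA specification, the data are simulated, and the helper names are hypothetical.

```python
import random

def ols_simple(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def ar1_coefficient(u):
    """Least-squares estimate of rho in u_t = rho * u_{t-1} + a_t."""
    num = sum(u[t] * u[t - 1] for t in range(1, len(u)))
    den = sum(u[t - 1] ** 2 for t in range(1, len(u)))
    return num / den

random.seed(1)
n = 300
x = [random.gauss(0.0, 1.0) for _ in range(n)]
# Simulated AR(1) errors with rho = 0.8 (illustrative, not the import data).
e = [0.0]
for _ in range(n - 1):
    e.append(0.8 * e[-1] + random.gauss(0.0, 1.0))
y = [2.0 + 3.0 * xi + ei for xi, ei in zip(x, e)]

# Step 1: structural (regression) part.
b0, b1 = ols_simple(x, y)
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Step 2: time series model for the regression residuals.
rho = ar1_coefficient(resid)

# Step 3: composite one-step fit = regression fit + predicted residual.
sse_regression = sum(r * r for r in resid[1:])
sse_composite = sum((resid[t] - rho * resid[t - 1]) ** 2 for t in range(1, n))
print(round(b1, 2), round(rho, 2), sse_composite < sse_regression)
```

Because the AR(1) coefficient is chosen by least squares, the composite in-sample error over the same observations can never exceed that of the regression alone. The gain is large when the residuals are strongly autocorrelated.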
• ARDL Model
According to [26] , the ARDL modelling approach is particularly useful when the variables are integrated of different orders. This is the most important feature of the ARDL technique and what distinguishes it from the Johansen method. The ARDL approach can be applied to I(1) and/or I(0) regressors. ARDL therefore avoids the pretesting problems associated with standard co-integration methods, which require the variables to be pre-classified as I(1) or I(0).
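The ARDL model used in this study may be expressed as
$$y_t = c + \sum_{i=1}^{p} \alpha_i y_{t-i} + \sum_{j=0}^{q_1} \gamma_j x_{1,t-j} + \sum_{k=0}^{q_2} \eta_k x_{2,t-k} + \varepsilon_t, \qquad (5)$$
where $y_t$ denotes imports, $x_{1t}$ and $x_{2t}$ denote the explanatory variables (GDP and exports), and $p$, $q_1$ and $q_2$ are the lag orders.
The error correction version of the ARDL framework in Equation (5) can be written as
$$\Delta y_t = c + \delta_1 y_{t-1} + \delta_2 x_{1,t-1} + \delta_3 x_{2,t-1} + \sum_{i=1}^{p-1} a_i \Delta y_{t-i} + \sum_{j=0}^{q_1-1} b_j \Delta x_{1,t-j} + \sum_{k=0}^{q_2-1} d_k \Delta x_{2,t-k} + \varepsilon_t. \qquad (6)$$
The parameters $\delta_i$, $i = 1, 2, 3$, denote the corresponding long-run multipliers, whilst $a_i$, $b_j$ and $d_k$ denote the short-run dynamic coefficients of the ARDL model. $\varepsilon_t$ denotes a serially uncorrelated disturbance with zero mean and constant variance, whilst ∆ denotes the first difference operator.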
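After confirming a long-run relationship amongst the variables, the following long-run model for imports can be estimated:
$$y_t = \pi_0 + \pi_1 x_{1t} + \pi_2 x_{2t} + v_t, \qquad (7)$$
where $y_t$ denotes imports, $x_{1t}$ and $x_{2t}$ denote GDP and exports, $\pi_0$, $\pi_1$ and $\pi_2$ are the long-run coefficients and $v_t$ is the deviation from the long-run equilibrium.
The lag length of the ARDL model is usually chosen with reference to the literature and convention. Several selection criteria, such as the final prediction error (FPE), the Schwarz criterion (SC), the Hannan-Quinn criterion (HQ) and the Akaike information criterion (AIC), are commonly used to determine the order of the ARDL model. To estimate the short-run dynamics with the selected lag orders $p$, $q_1$ and $q_2$, the following error correction model is formulated:
$$\Delta y_t = \sum_{i=1}^{p-1} a_i \Delta y_{t-i} + \sum_{j=0}^{q_1-1} b_j \Delta x_{1,t-j} + \sum_{k=0}^{q_2-1} d_k \Delta x_{2,t-k} + \psi\,\mathrm{ECT}_{t-1} + \varepsilon_t, \qquad (8)$$
where $a_i$, $b_j$ and $d_k$ are the short-run parameters for $\Delta y_{t-i}$, $\Delta x_{1,t-j}$ and $\Delta x_{2,t-k}$, and $\mathrm{ECT}_{t-1} = v_{t-1}$ is the lagged error correction term obtained from the long-run equilibrium relationship, with $\psi$ as the adjustment coefficient. This coefficient must be negative, less than one in absolute value and statistically significant in order to confirm a co-integration relationship.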
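The link between the lagged-level terms of an unrestricted error correction equation and the error correction term can be made explicit with a short steady-state argument. The notation is an assumption introduced for this sketch: $c$ is the intercept, $\delta_1$, $\delta_2$ and $\delta_3$ are the coefficients on the lagged levels of imports ($y$), GDP ($x_1$) and exports ($x_2$), and $\pi_0$, $\pi_1$ and $\pi_2$ are the implied long-run coefficients. The sketch assumes $\delta_1 \neq 0$ and an intercept that belongs to the long-run relation.

```latex
% Steady state: all first differences and the disturbance are zero.
\begin{align*}
0 &= c + \delta_1 y^{*} + \delta_2 x_1^{*} + \delta_3 x_2^{*} \\
y^{*} &= \underbrace{-\frac{c}{\delta_1}}_{\pi_0}
        + \underbrace{\left(-\frac{\delta_2}{\delta_1}\right)}_{\pi_1} x_1^{*}
        + \underbrace{\left(-\frac{\delta_3}{\delta_1}\right)}_{\pi_2} x_2^{*} \\
c + \delta_1 y_{t-1} + \delta_2 x_{1,t-1} + \delta_3 x_{2,t-1}
  &= \delta_1 \left( y_{t-1} - \pi_0 - \pi_1 x_{1,t-1} - \pi_2 x_{2,t-1} \right)
   = \delta_1\, \mathrm{ECT}_{t-1}
\end{align*}
```

Under this reading the adjustment coefficient equals $\delta_1$. A negative value pulls imports back towards the long-run relation after a deviation, which is why the sign and significance of the error correction term are checked.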
3. Results
• Stationarity Tests
The following unit root tests were used: the ADF and PP tests (for which the null hypothesis is nonstationarity).
Table 1 and Table 2 show that the null hypothesis that each series has a unit root cannot be rejected at the 5% level of significance in either the ADF or the PP test. Therefore, all variables are non-stationary in their level form, and their mean and variance are not constant. However, all variables become stationary after first differencing.
• Lag Order Selection
Selecting the number of lags is crucial in the specification of a VAR model. The lag length is often selected using a statistical criterion such as LR, FPE, AIC, SC or HQ.
Table 1. Results of the ADF test for the linear variables.
* ADF statistic value, ** Critical value (5%), *** Prob.
Table 2. Results of the PP test for the linear variables.
* PP statistic value, ** Critical value (5%), *** Prob.
The LR, FPE, AIC and HQ results reported in the above table clearly indicate that the optimal number of lags in our model is 4, whereas the SC results indicate an optimal lag of 2. After comparing these lag lengths based on the accuracy of the model results, we find that the optimal number of lags in our model is 4.
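The disagreement between SC and the other criteria is expected, because SC penalises each extra parameter by log(n) rather than by 2. The sketch below uses the common least-squares forms of AIC and SC. The residual sums of squares are made-up numbers chosen only to reproduce the pattern described above, and the parameter count (one intercept plus one coefficient per lag) is an assumption.

```python
import math

def aic(sse, n, k):
    """Akaike criterion for a least-squares fit with k estimated parameters."""
    return n * math.log(sse / n) + 2 * k

def sc(sse, n, k):
    """Schwarz criterion; its penalty per parameter is log(n)."""
    return n * math.log(sse / n) + k * math.log(n)

n = 128
# Hypothetical residual sums of squares per lag order (NOT values from the study).
sse_by_lag = {1: 1.40, 2: 1.05, 3: 1.03, 4: 1.00}

# Assumption: one intercept plus one coefficient per lag.
best_aic = min(sse_by_lag, key=lambda p: aic(sse_by_lag[p], n, p + 1))
best_sc = min(sse_by_lag, key=lambda p: sc(sse_by_lag[p], n, p + 1))
print(best_aic, best_sc)
```

With these illustrative numbers AIC selects four lags and SC selects two. The small improvement in fit beyond lag 2 is enough to offset the AIC penalty, but not the heavier SC penalty.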
• ARDL Bounds Testing Approach:
Table 4 reports the calculated F-statistics when imports ($y_t$) is the dependent variable in the ARDL-OLS regressions.
The F-test results and the critical values from [27] are reported in Table 4. The F-statistic is 6.909 at lag 4 and is higher than the upper bound critical values at the 1%, 2.5%, 5% and 10% significance levels. Therefore, our variables are co-integrated. This result is in line with the findings of [8] in Malaysia, who found that import value and its determinants (i.e. GDP and relative prices) are co-integrated despite their small sample size (128 observations). Another study in Malaysia, conducted by [28] , revealed a long-run equilibrium relationship between imports and their determinants.
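The decision rule of the bounds test can be written as a small function. This is a sketch only. The F-statistic is the one reported above, but the lower and upper bounds in the example call are placeholders and are not the critical values from [27].

```python
def bounds_test_decision(f_stat, lower_bound, upper_bound):
    """Classify a bounds-test F-statistic against the I(0) and I(1) critical bounds."""
    if f_stat > upper_bound:
        return "co-integrated"
    if f_stat < lower_bound:
        return "not co-integrated"
    return "inconclusive"

# F-statistic reported in the text; the bounds are placeholders, not values from [27].
print(bounds_test_decision(6.909, lower_bound=3.0, upper_bound=4.0))
```

An F-statistic between the two bounds leaves the test inconclusive, in which case the order of integration of the regressors has to be established by other means.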
• Unrestricted Error Correction model (UECM):
We construct a UECM to identify the short-run relationships and check the stability of the long-run parameters. The results are reported in Table 5.
(9)
Panel A of Table 5 shows that the error correction term is statistically significant at the 1% level and bears a negative coefficient, which is desirable. Therefore, the model is reliable. The value of −0.64 suggests that the system returns to its long-run equilibrium after a shock, with about 64% of any disequilibrium corrected in each quarter. This moderate coefficient indicates that restoring the relationship to its steady state will not take long when the system faces some disturbance. This finding is consistent with that of [28] , who considered the same restrictions for Malaysia's imports. [29] used a UECM to examine the relationship between imports and their determinants and found that exchange rates do not have a significant influence on Turkey's imports in the short run. These findings are consistent with theoretical and empirical predictions.
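The speed of adjustment implied by an error correction coefficient can be illustrated with simple arithmetic. The sketch assumes pure geometric adjustment, that is, it ignores the short-run difference terms of the model, and the function names are hypothetical.

```python
import math

def remaining_disequilibrium(ect_coefficient, periods):
    """Share of an initial deviation still present after the given number of periods."""
    return (1.0 + ect_coefficient) ** periods

def half_life(ect_coefficient):
    """Number of periods needed for half of a deviation to be corrected."""
    return math.log(0.5) / math.log(1.0 + ect_coefficient)

ect = -0.64  # error correction coefficient reported in the text
for quarter in (1, 2, 4):
    print(quarter, round(remaining_disequilibrium(ect, quarter), 4))
print(round(half_life(ect), 2))
```

Under this simplification about 36% of a shock remains after one quarter and roughly 13% after two, so half of any deviation is corrected in under one quarter.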
Table 4. Co-integration test results.
Source: Critical values for the bounds test; restricted intercept and no trend [27] .
Table 5. UECM model results in the short run.
Diagnostic Tests: The significance of the variables is evaluated whilst the serial correlation, normality, heteroscedasticity and structural stability of the model are assessed by performing diagnostic tests. Table 6 and Figure 3 present the results of these tests.
The results show that the short-run model does not pass all diagnostic tests: no evidence of autocorrelation or heteroscedasticity is found at the 5% significance level, but the model fails the normality test.
Stability Checking: We test the stability of our model by performing recursive residuals tests. The results of these tests are graphically illustrated in Figure 3, which shows that the parameters are stable throughout the sample period.
• Composite Model
We develop a composite model that uses the UECM to obtain short-term forecasts.
(10)
We construct an ARIMA model for the random error variable in the UECM by performing a time series analysis. The residuals of this model, $e_t$, are analysed with the ARIMA model as follows.
The ARIMA model of the residual series is combined with UECM to develop the composite model (combined UECM-ARIMA) for forecasting Malaysia’s imports. The results are presented in Table 7.
(11)
We substitute the ARIMA(1, 1, 2) model for the implicit error term in the original regression equation. As shown in Table 7, the MARMA model is a combination of the regression model and the time series model: it relates the dependent variable $y_t$ to the independent variables, while the error term is partially "explained" by a time series model. Table 7 shows that the explanatory variables and the AR and MA parameters explain nearly 99% of the error term.
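For reference, the general form of an ARIMA(1, 1, 2) model for a residual series $e_t$ is written out below. The symbols $\phi_1$, $\theta_1$ and $\theta_2$ and the sign convention for the MA terms are assumptions made for this sketch. They are not the estimated values in Table 7, and some software reports MA coefficients with the opposite sign.

```latex
% ARIMA(1,1,2): one AR term, one difference, two MA terms; B is the backshift operator.
\begin{align*}
(1 - \phi_1 B)(1 - B)\, e_t &= (1 - \theta_1 B - \theta_2 B^2)\, \varepsilon_t \\
e_t &= (1 + \phi_1)\, e_{t-1} - \phi_1\, e_{t-2}
      + \varepsilon_t - \theta_1\, \varepsilon_{t-1} - \theta_2\, \varepsilon_{t-2}
\end{align*}
```

The second line follows from expanding $(1 - \phi_1 B)(1 - B) = 1 - (1 + \phi_1) B + \phi_1 B^2$. It shows that the one-step forecast of the residual depends on its two most recent values and on the two most recent shocks.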
Table 7. Results of the composite model.
Diagnostic Tests: We evaluate the serial correlation, normality, heteroscedasticity and predictive ability of the composite model by performing diagnostic tests.
Table 8 shows that the composite model passes all diagnostic tests. No autocorrelation is observed at the 5% significance level, and the mean and standard deviation of the residuals are 0.000419 and 0.036607, respectively. The error term is normally distributed based on the skewness and kurtosis values in the Jarque-Bera test.
We test for heteroscedasticity by calculating the coefficients of the residual ACF and PACF for a certain number of lags.
Figure 4 shows that all ACF and PACF coefficients lie within the confidence limits around zero or have values close to zero, thereby indicating the absence of correlation in the time series and of heteroscedasticity in the error variances.
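The limits used when reading a residual correlogram are commonly approximated by ±1.96/√n for a white noise series. The sketch below applies that approximation. The sample size is the number of observations stated in Section 2.1 (the effective residual sample is somewhat smaller after differencing and lagging), and the autocorrelation values are hypothetical, not those in Figure 4.

```python
import math

def white_noise_bound(n, z=1.96):
    """Approximate 95% bound for sample autocorrelations of white noise."""
    return z / math.sqrt(n)

def significant_lags(acf_values, n):
    """Return the lags (starting at 1) whose autocorrelation exceeds the bound."""
    bound = white_noise_bound(n)
    return [lag for lag, r in enumerate(acf_values, start=1) if abs(r) > bound]

n = 128  # number of observations stated in Section 2.1
# Hypothetical residual autocorrelations for lags 1..5 (NOT values from Figure 4).
example_acf = [0.05, -0.12, 0.21, 0.02, -0.08]
print(round(white_noise_bound(n), 4), significant_lags(example_acf, n))
```

In this made-up example only lag 3 falls outside the bound of about 0.17. A correlogram with no such lags is consistent with white noise residuals.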
Assessing Predictive Ability: As a rule of thumb, the difference between the adjusted R² and the predicted R² should lie between 0 and 0.200 for the model to have adequate predictive ability. In our calculations, the difference between these values is 0.001, indicating that the two values are in good agreement and that CO-UECM has a high predictive ability.
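The comparison of adjusted and predicted R² can be sketched for a one-regressor model. Predicted R² is taken here in its usual PRESS (leave-one-out) form, which is an assumption about the definition used in the study. The data are toy numbers and the function name is hypothetical.

```python
def r2_diagnostics(x, y):
    """Adjusted and predicted (PRESS-based) R-squared for y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    leverage = [1.0 / n + (xi - mx) ** 2 / sxx for xi in x]
    sst = sum((yi - my) ** 2 for yi in y)
    sse = sum(e * e for e in resid)
    # PRESS: squared leave-one-out prediction errors, via the leverage shortcut.
    press = sum((e / (1.0 - h)) ** 2 for e, h in zip(resid, leverage))
    r2 = 1.0 - sse / sst
    adjusted = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)  # one regressor plus intercept
    predicted = 1.0 - press / sst
    return adjusted, predicted

# Toy data (illustrative only, not from the study).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
adjusted, predicted = r2_diagnostics(x, y)
gap = adjusted - predicted
print(round(adjusted, 4), round(predicted, 4), gap < 0.2)
```

A large gap between the two statistics would suggest that the model fits the estimation sample much better than it predicts held-out observations, which is a sign of overfitting.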
• Analysis of the forecasting abilities of various models
The two models, the composite model and the UECM, are compared in Table 9 on a range of error metrics. Table 9 and Figure 5 summarise the forecasting performance of these two models.
Figure 4. ACF and PACF of the residuals.
Figure 5. The outcomes of comparing the forecasting abilities of the various models.
Table 9. Statistical measures of forecast error for Malaysia’s imports.
The results shown in Table 9 and Figure 5 were evaluated and analysed by the author in light of the pertinent problems.
The selected model demonstrates excellent performance as reflected in its explained variability and predictive power.
4. Discussion
The results presented in Table 9 and Figure 5 show that the MSE, RMSE and MAE of the composite model are 0.00135, 0.00134 and 0.02933, respectively, for the time series of Malaysia's imports. All three error measures are lower than those of the UECM, so the composite model fits the data best of the models considered. Table 10 displays the ACF and PACF of the residuals. For a forecasting model to be satisfactory, the residuals should contain only white noise after the model has been fitted, so these statistics are expected to be insignificant.
Table 10 illustrates that the ACF and PACF of the residuals are insignificant, which supports the composite model as the best choice for projecting Malaysia's imports.
Table 10. PACF and ACF of the residuals of Malaysia’s imports from the composite model.
The results of CO-UECM show that the dependent variable y (Malaysia's imports) and the independent variables (GDP and exports) are related, that the error term is partially "explained" by a time series model, and that the explanatory variables together with the AR and MA parameters explain nearly 99% of the error term. These findings are in line with those of [30] - [35] . The composite model provides better forecasts than the regression equation or the time series model alone because it provides a structural explanation for the part of the variance that can be explained structurally and a time series explanation for the part that cannot. This result supports the findings in [14] [15] .
5. Conclusion
This study proposed and assessed methods for predicting imports in Malaysia. The two proposed models, the composite model and the UECM, were assessed by comparing them with one another using Malaysia's import time series. To the best of the author's knowledge, this is the first empirical study in this field to compare composite models and UECM models. The findings demonstrate the value of such composite models as a potent forecasting technique that improves the precision of import value prediction and strengthens forecasting practice in the Malaysian context. Since the results show that the composite model is suitable for forecasting Malaysian imports, the author recommends it, bearing in mind that it is a linear model of how Malaysia's imports respond to their determinants. Future research should therefore examine non-linear models, such as neural network models. The same procedures described in this study can also be applied to these models, after which the forecasting performance of linear and non-linear models may be compared.