
A developing country like Rwanda relies heavily on international trade for several goods that are essential to the development of its economy. This study investigated the factors affecting import trade and used principal component analysis (PCA) to determine an empirical model for a comprehensive analysis of the factors influencing Rwanda’s import trade, using secondary data for the period 1980-2017. The PCA model showed that Rwanda’s import trade is principally influenced by investment fundamental factors, income consumption factors, price factors, inflation factors, and savings factors. The empirical results showed that Rwanda’s import trade is negatively correlated with the investment fundamental and savings factors and positively correlated with the income consumption, price, and inflation factors. The forecast for the period 2018-2025 indicated that the import trade of Rwanda may increase. The implication is that unstable prices and currency depreciation lead to high income consumption and increased import trade volume. The study advises policy makers on international trade, first, to pay attention to the accumulation of investment and savings, which supports import trade control and enhances economic security. Second, to stabilize prices and keep inflation low and stable. Third, to focus on improving domestic production by not permitting the Rwandan currency (Frw) to lose value, thus directly creating demand for foreign merchandise for investment purposes to raise the level of production for export, which might have a large positive impact on a saving culture linked to economic growth.

International trade has existed for a long time [

A landlocked country of 26,338 km^{2} with a dense population of about 12.2 million people, Rwanda recorded an economic growth rate of 6.1% in 2017. A tough economic transformation after the 1994 Genocide against the Tutsi [

Imports play a crucial role in the investment climate and therefore in the industrial development and economic progress of a country. A clearer picture of a country’s foreign trade regime can be obtained if imports are studied at some length.

Principal component analysis (PCA) is often used for analyzing data in the most diverse areas. Although macroeconomic variables have long been studied, many economists and researchers have often neglected the preliminary study of the variables regarding their similarities and differences [

PCA is used to reduce this redundancy: highly correlated data are reduced to a small number of uncorrelated principal components that account for most of the variance in the original data [

Despite the apparent simplicity of the technique, much research is still being done in the general area of PCA. It is very widely used, and an explosion of new applications and further theoretical developments has occurred. This growth reflects the general expansion of the statistical literature, but because PCA requires considerable computing power, the expansion of its use also coincided with the widespread introduction of electronic computers. Most applications of PCA in economics validate it as a method to explain socio-economic status differentiation among populations and to evaluate the financial development of nations or entities; the combination of features employed ought to be considered when generating and interpreting results [

This study contributes to the existing empirical literature on the factors influencing import trade using the Rwandan situation, and it benefits importers and exporters (whether companies or individuals) by letting them see how their activities contribute to the development of Rwanda.

This study is based on principal component analysis, which removes redundancy and compresses the factors influencing the import trade of Rwanda into a few components for use in principal component regression. The methods and data section covers the data source, variable selection and model estimation, while the empirical analysis presents the results and discusses them. The conclusion then summarizes the findings, and the recommendations give opinions supported by the findings for policy implications.

Principal component analysis is a popular data-processing and dimension-reduction technique. Its derived variables are linear combinations of the original variables that capture the maximum variability, which guarantees minimal information loss, and the principal components are uncorrelated, so each one can be interpreted independently [

Many economic variables have the property that they are correlated with each other. Given the natural links between almost all facets of economic activity within any given economy, the multicollinearity is severe. The PCA objectives are to extract an important part of the information from the data set, reducing the size of data with minimal damage to data and information. This is achieved by finding a new set of independent (uncorrelated) variables called principal components which are obtained as a linear combination of the original variables.

The calculation of PCs amounts to computing the eigenvalues and eigenvectors of a positive-semidefinite symmetric matrix. The first PC accounts for the largest proportion of the variance of the data, and the second accounts for the second largest proportion and is orthogonal to the first. The remaining PCs account for the remaining variance in descending order, and each PC is orthogonal to its predecessors. Once the PCs are computed, the first few PCs, which represent a large part of the variation, are selected for further analysis.
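The eigen-decomposition described above can be sketched in a few lines. This is a minimal illustration on synthetic data, assuming NumPy is available and that the columns are standardized (so the decomposed matrix is the correlation matrix); the array names are illustrative and the data are not the study's.

```python
import numpy as np

# Synthetic stand-in data: 38 rows (years) and 4 correlated columns.
rng = np.random.default_rng(0)
base = rng.normal(size=(38, 2))
X = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.1 * rng.normal(size=38),
    base[:, 1],
    base[:, 1] - base[:, 0] + 0.1 * rng.normal(size=38),
])

# Standardize columns so the covariance matrix equals the correlation matrix.
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.cov(Xs, rowvar=False)

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]          # reorder to descending variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = Xs @ eigvecs                      # principal component scores
explained = eigvals / eigvals.sum()        # share of variance per component
```

The covariance matrix of `scores` is diagonal with the eigenvalues on the diagonal, which is the orthogonality property stated in the text.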

This study has tried to incorporate in the import function all factors that can potentially play a meaningful role in determining the import trade model for Rwanda. The data were obtained from WDI and UNCTAD, both officially recognized international sources, and the selected variables are summarized in

The motivation for applying principal component analysis to these indicators is that most of these variables are likely to be highly related to one another, so the precision of estimation may be questionable if we later want to determine the impact of these indicators on a particular event. The outputs were obtained with R 4.0.2, IBM SPSS Statistics 20 and Microsoft Excel 2016.


The data consist of the following variables:

Variables | Names |
---|---|
Y | Import volume index |
X_{1} | Foreign direct investment, net inflows (% of GDP) |
X_{2} | Gross capital formation (% of GDP) |
X_{3} | Official exchange rate (LCU per US$, period average) |
X_{4} | Trade openness |
X_{5} | External debt stocks (% of exports of goods, services and primary income) |
X_{6} | Final consumption expenditure (% of GDP) |
X_{7} | Relative import prices |
X_{8} | GDP growth (annual %) |
X_{9} | Inflation GDP deflator (annual %) |
X_{10} | Population growth (annual %) |
X_{11} | Savings [Total reserves (% of total external debt)] |

Source: author’s own.

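Suppose $X = (X_1, X_2, \cdots, X_n)^T = (\tilde{X}_1, \tilde{X}_2, \cdots, \tilde{X}_p)$ is the design matrix of size $n \times p$, whose columns represent the $p$ explanatory variables and whose rows represent the $n$ samples.

$$X = [x_{ij}]_{n \times p} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix} \tag{2.1}$$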

We assume that all columns are centered and $\mu = E(X)$, $\Sigma = \mathrm{Var}(X)$. Principal components are derived from the linear transformation $Z = XV$, where $V \in \mathbb{R}^{p \times r}$, with $r = \mathrm{rank}(X)$, is the orthogonal matrix of principal component loadings. It is worth mentioning that the PC transformation preserves the total variance ($V_T$) of the initial space:

$$V_T = \sum_{i=1}^{p} \mathrm{Var}(x_i) = \sum_{i=1}^{p} \lambda_i \;\Leftrightarrow\; \mathrm{tr}(\Sigma) = \mathrm{tr}(\Lambda) \tag{2.2}$$

where $\Lambda$, the covariance matrix of the PCs, is a diagonal matrix:

$$\Lambda = \begin{pmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_p \end{pmatrix} \tag{2.3}$$

PCA also preserves the generalized variance ($V_G$):

$$V_G = |\Sigma| = |\Lambda| \tag{2.4}$$

$\Lambda = \mathrm{diag}(\lambda_1, \cdots, \lambda_p)$ contains the (non-negative) eigenvalues; the principal components $Z_1, Z_2, \cdots, Z_p$ should therefore capture as much information as possible.
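Both invariance properties follow from the spectral decomposition of the covariance matrix. The short derivation below is a sketch under the assumption that the loading matrix $V$ is a full $p \times p$ orthogonal matrix (that is, $r = p$):

$$
\begin{aligned}
\Sigma &= V \Lambda V^{T}, \qquad V^{T} V = V V^{T} = I_p \\
\mathrm{tr}(\Sigma) &= \mathrm{tr}(V \Lambda V^{T}) = \mathrm{tr}(\Lambda V^{T} V) = \mathrm{tr}(\Lambda) = \sum_{i=1}^{p} \lambda_i \\
|\Sigma| &= |V|\,|\Lambda|\,|V^{T}| = |\Lambda| = \prod_{i=1}^{p} \lambda_i
\end{aligned}
$$

The trace step uses the cyclic property of the trace, and the determinant step uses $|V|\,|V^{T}| = |V V^{T}| = 1$.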

The proportion of the total variance explained by the $i$-th principal component can be written as:

$$W_i = \frac{\lambda_i}{\sum_{j=1}^{p} \lambda_j} = \frac{\lambda_i}{\lambda_1 + \lambda_2 + \cdots + \lambda_p}, \quad (i = 1, 2, \cdots, p) \tag{2.5}$$

where $\lambda_i$ is the eigenvalue of the $i$-th principal component. For instance, the eigenvector with the highest eigenvalue is the first principal component of the data set. If the first $k$ principal components explain most of the variation in the sample covariance, these $k$ components can replace the original $p$ variables with little loss of information.

The cumulative percentage of eigenvalues gives the percentage of the variability of the data set captured by the first $k$ components:

$$V = \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{i=1}^{p} \lambda_i}, \quad (k = 1, 2, \cdots, p) \tag{2.6}$$

where $V$ is the accumulated variance contribution rate. The values of the eigenvectors (original dimensions) of each PC, which vary from −1 to +1, can be interpreted as an index of the combined action or contrast of the original dimensions. Thus, the principal components for the $i$-th sample are given by $Z_i = V^T X_i$ for $i = 1, 2, \cdots, n$ (here $V$ again denotes the loading matrix), where $Z_i$ is the $i$-th row of $Z = (Z_1, Z_2, \cdots, Z_n)^T = (\tilde{Z}_1, \tilde{Z}_2, \cdots, \tilde{Z}_r) \in \mathbb{R}^{n \times r}$.

The principal component loadings are:

$$a_{ij} = \sqrt{\lambda_i}\, e_{ij}, \quad (i, j = 1, 2, \cdots, p) \tag{2.7}$$

such that $Z = a^T X$. The first principal component is defined as the variable $Z_1 \equiv a_1^T X = \sum_{i=1}^{p} a_{1i} x_i = a_{11} x_1 + a_{12} x_2 + \cdots + a_{1p} x_p$, which has the maximum variance subject to the constraint $a^T a = 1$.

$$\begin{cases} Z_1 = a_1^T X \\ \quad \vdots \\ Z_i = a_i^T X \end{cases} \qquad \mathrm{Var}(Z_i) = a_i^T \Sigma a_i, \quad i = 1, 2, \cdots, 11 \tag{2.8}$$

and $\mathrm{Cov}(Z_i, Z_j) = a_i^T \Sigma a_j$, $i, j = 1, 2, \cdots, 11$; $i \neq j$.

In this representation, the columns of $Z$ are mutually orthogonal but not normalized. We will frequently re-express $Z = UD$, where $U$ is the orthogonal matrix of scaled principal component scores and $D$ is the diagonal matrix holding the square roots of the variances of the principal component scores. Thus, $U$ and $V$ are the matrices of left and right singular vectors, and $D = \mathrm{diag}(d_1, \cdots, d_r)$ is the diagonal matrix of ordered singular values of $X$.

With the response variable $Y = (y_1, y_2, \cdots, y_n)^T$, the linear regression model is typically assumed to be:

$$Y = \alpha + X\beta + \varepsilon \tag{2.9}$$

where $\alpha$ is the intercept, $\beta = (\beta_1, \cdots, \beta_p)^T$ is the vector of regression coefficients, and $\varepsilon = (\varepsilon_1, \cdots, \varepsilon_n)^T$ is a vector of random errors with mean 0 and variance $\sigma^2 I_n$, i.e. $\varepsilon \sim N_n(0, \sigma^2 I_n)$. Because $X$ is centred, the OLS estimator of $\alpha$ is $\bar{y}$; we therefore remove the intercept from the regression model by centring the response variable. The OLS estimate of $\beta$ is obtained by minimizing the sum of squared errors over $\beta$:

$$\hat{\beta}_{OLS} = \arg\min_{\beta} \| y - X\beta \|_2^2 = \arg\min_{\beta} \Big\| y - \sum_{j=1}^{p} \beta_j \tilde{X}_j \Big\|_2^2 \tag{2.10}$$

Principal component analysis belongs in the general framework of multivariate statistical analysis techniques. Principal component regression is a method to perform data reduction in the model as well as to solve problems of dependencies among variables (multicollinearity) [

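$$\hat{\gamma} = \arg\min_{\gamma} \| y - Z\gamma \|_2^2 = \arg\min_{\gamma} \Big\| y - \sum_{j=1}^{r} \gamma_j \tilde{Z}_j \Big\|_2^2 \tag{2.11}$$

where $\gamma = (\gamma_1, \cdots, \gamma_r)^T$. Since $Z = XV$ with $V$ orthogonal, $X\hat{\beta} = Z V^T \hat{\beta}$, so $\hat{\gamma} = V^T \hat{\beta}$ and we can obtain the OLS estimate of $\beta$ from the relationship $\hat{\beta}_{OLS} = V\hat{\gamma}$. Let $Z_k = (\tilde{Z}_1, \cdots, \tilde{Z}_k)$, where $\tilde{Z}_j$ is the $j$-th column of $Z$, i.e., the $j$-th principal component score of $X$. Common practice in PCR is to use the first $k$ principal component scores:

$$\hat{\gamma}_k = \arg\min_{\gamma_k} \| y - Z_k \gamma_k \|_2^2 = \arg\min_{\gamma_k} \Big\| y - \sum_{j=1}^{k} \gamma_j \tilde{Z}_j \Big\|_2^2 \tag{2.12}$$

where $\gamma_k = (\gamma_1, \cdots, \gamma_k)^T$. The PCR estimate of $\beta$ is given as $\hat{\beta}_{PCR,k} = V_k \hat{\gamma}_k$, where $V_k$ holds the first $k$ columns of $V$.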
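The PCR estimate above can be illustrated with a small sketch. It assumes NumPy, uses synthetic collinear data rather than the study's data, obtains the loadings from the singular value decomposition of the centred design matrix, and maps the coefficients back with the first `k` columns of `V` (named `Vk` here for illustration).

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, k = 38, 5, 2                      # illustrative sizes, not the study's data

# Synthetic predictors with strong collinearity, then centred.
latent = rng.normal(size=(n, 2))
X = latent @ rng.normal(size=(2, p)) + 0.05 * rng.normal(size=(n, p))
X = X - X.mean(axis=0)
y = latent[:, 0] - 2.0 * latent[:, 1] + 0.1 * rng.normal(size=n)
y = y - y.mean()

# Thin SVD: X = U D V^T, so the scores are Z = X V = U D.
U, d, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt.T
Z = X @ V

# Regress y on the first k score columns.
Zk, Vk = Z[:, :k], V[:, :k]
gamma_k, *_ = np.linalg.lstsq(Zk, y, rcond=None)

# Map the component coefficients back to the original predictors.
beta_pcr = Vk @ gamma_k
```

When all components are kept, the same mapping reproduces the ordinary least-squares coefficients; dropping the low-variance components is what stabilizes the estimate under multicollinearity.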

The number of principal components used in PCR, k, is typically chosen by considering the proportion of variance explained and prediction performance using cross validation [

The Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy tests whether the sample used for the study is adequate for principal component analysis, and Bartlett’s test of sphericity tests the hypothesis that the correlations between variables are greater than would be expected by chance. It compares the correlation matrix with a matrix of zero correlations, known as an identity matrix, which consists of all zeros except for the 1’s along the leading diagonal. The results are presented in

With high MSA values throughout, the overall KMO value of 0.85 indicates that the sample used for the study is adequate and that principal component analysis of the data is appropriate. The p-value of Bartlett’s test is approximately 0.000 (<2.22e−16), indicating that the correlation matrix differs significantly from an identity matrix and that there is strong correlation among the variable series.
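For readers who want to reproduce these two diagnostics, the sketch below computes them from a correlation matrix using their textbook formulas. It assumes NumPy; the function names `bartlett_sphericity` and `kmo` are hypothetical helpers, and the data are synthetic, so the numbers will not match the study's.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Chi-square statistic and degrees of freedom for H0: R is an identity matrix."""
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return chi2, df

def kmo(R):
    """Overall KMO and per-variable MSA from a correlation matrix."""
    inv = np.linalg.inv(R)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale                   # partial (anti-image) correlations
    off = ~np.eye(R.shape[0], dtype=bool)    # mask selecting off-diagonal cells
    r2 = np.where(off, R, 0.0) ** 2
    q2 = np.where(off, partial, 0.0) ** 2
    msa = r2.sum(axis=0) / (r2.sum(axis=0) + q2.sum(axis=0))
    overall = r2.sum() / (r2.sum() + q2.sum())
    return overall, msa

# Synthetic example: 12 correlated series of length 38.
rng = np.random.default_rng(2)
common = rng.normal(size=(38, 3))
data = common @ rng.normal(size=(3, 12)) + 0.5 * rng.normal(size=(38, 12))
R = np.corrcoef(data, rowvar=False)
chi2, df = bartlett_sphericity(R, n=38)
overall, msa = kmo(R)
```

With 12 series the degrees of freedom are 12 × 11 / 2 = 66, which is the value reported for Bartlett’s test in this study.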

We now face the problem of information redundancy due to strong correlations between the variables. It is important to understand how these variables affect a particular outcome without compromising estimation precision. The factors influencing the import trade of Rwanda, as shown in the table, are highly correlated, meaning that they vary together.

Kaiser-Meyer-Olkin Statistics | Value |
---|---|
KMO criterion | 0.85 |
MSA, Y | 0.85 |
MSA, X_{1} | 0.84 |
MSA, X_{2} | 0.88 |
MSA, X_{3} | 0.89 |
MSA, X_{4} | 0.91 |
MSA, X_{5} | 0.89 |
MSA, X_{6} | 0.92 |
MSA, X_{7} | 0.79 |
MSA, X_{8} | 0.82 |
MSA, X_{9} | 0.90 |
MSA, X_{10} | 0.86 |
MSA, X_{11} | 0.70 |

Bartlett’s Test of Sphericity | Value |
---|---|
Approx. Chi-square | 1530.18 |
Df | 66 |
p-value | <2.22e−16 |

Source: Author’s own.

This redundancy in the measurements allows us to build a PCA model that will retain most of the information in a few principal components.

The total variance explained, which indicates how much of the variability in the data is explained by each component, is shown in

It is necessary to decide on the number of components that have any practical significance. A simple but arbitrary rule of thumb, which has proved useful in practice, is to regard components with a standard deviation greater than or equal to one as having practical significance. In this study, the first four components, accounting for about 86.3% of the total variability, would be regarded as being of practical significance, although the next component, which brings the total variability explained to about 92.9%, was also considered.
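The retention rule and the percentages quoted above can be checked directly from the eigenvalues reported in the total-variance table. The snippet below is a plain-Python sketch of equations (2.5) and (2.6); variable names are illustrative.

```python
# Eigenvalues as reported in the total-variance table (PC1 to PC11).
eigenvalues = [4.475, 2.285, 1.486, 1.253, 0.725, 0.353,
               0.188, 0.112, 0.080, 0.029, 0.014]

total = sum(eigenvalues)

# Equation (2.5): share of total variance carried by each component.
proportion = [lam / total for lam in eigenvalues]

# Equation (2.6): cumulative share for the first k components.
cumulative = []
running = 0.0
for share in proportion:
    running += share
    cumulative.append(running)

# Rule of thumb from the text: keep components whose eigenvalue is at least one.
retained = sum(1 for lam in eigenvalues if lam >= 1.0)

print(retained, round(100 * cumulative[retained - 1], 2))
```

Because the eigenvalues come from a correlation matrix of 11 standardized variables, they sum to 11, so each percentage is simply the eigenvalue divided by 11.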

PCA models are typically very easy to interpret and when interrogated can provide fault diagnosis [

PCA gives new indicators that are linear combinations of the original ones, so each new indicator combines similar old indicators through their shared properties. Five main components were extracted to represent the 11 variables. The first PC is strongly correlated with four of the original variables (foreign direct investment, gross capital formation, trade openness and exchange rate) and recovers 40.679% of the variability; three variables correlate well with the second component, one variable with the third, two variables with the fourth, and one variable with the fifth.

Component | Eigenvalue | Percentage of variability (component) | Percentage of variability (cumulative) |
---|---|---|---|
PC1 | 4.475 | 40.679 | 40.679 |
PC2 | 2.285 | 20.776 | 61.455 |
PC3 | 1.486 | 13.509 | 74.964 |
PC4 | 1.253 | 11.389 | 86.353 |
PC5 | 0.725 | 6.590 | 92.943 |
PC6 | 0.353 | 3.205 | 96.148 |
PC7 | 0.188 | 1.710 | 97.858 |
PC8 | 0.112 | 1.022 | 98.881 |
PC9 | 0.080 | 0.730 | 99.611 |
PC10 | 0.029 | 0.266 | 99.877 |
PC11 | 0.014 | 0.123 | 100.000 |

As can be seen from the principal component loading matrix in

Variable | PC1 | PC2 | PC3 | PC4 | PC5 |
---|---|---|---|---|---|
Foreign direct investment | 0.789 | −0.458 | −0.006 | −0.105 | 0.264 |
Gross capital formation | 0.836 | −0.481 | 0.095 | −0.034 | 0.107 |
Official exchange rate | 0.924 | 0.167 | 0.260 | −0.042 | 0.025 |
Trade openness | 0.736 | 0.211 | −0.489 | 0.043 | 0.326 |
External debt stock | −0.254 | 0.801 | −0.225 | 0.179 | −0.403 |
Final consumption expenditure | −0.141 | 0.740 | −0.584 | 0.168 | −0.038 |
Relative import prices | 0.049 | 0.966 | 0.077 | 0.022 | −0.109 |
GDP growth | 0.096 | −0.079 | 0.946 | −0.002 | 0.075 |
Inflation | −0.089 | 0.251 | 0.185 | 0.892 | 0.119 |
Population growth | −0.017 | 0.091 | 0.430 | −0.781 | 0.286 |
Savings | 0.270 | −0.320 | 0.066 | −0.043 | 0.857 |

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization. Source: Author’s own.

Having determined the number of principal components retained, we are now concerned with the interpretation of the PCs:

The first component is positively correlated with foreign direct investment, gross capital formation, trade openness and exchange rate. This correlation suggests that the four variables vary together: when one decreases, the others decrease as well. The component can be considered primarily a measure of investment. The second component is most correlated with external debt, final consumption expenditure and relative import prices, all in a positive direction. Since prices signal the competitiveness of a country’s exchange power vis-à-vis the rest of the world, this component reflects income consumption information. The third component is positively correlated with GDP growth and not much else; it can be viewed as a monetary measure of the market value of all final goods and services produced in a specific time period, reflecting price information. The fourth principal component combines inflation and population growth; it is most correlated with inflation, at 0.892, and can be considered primarily a measure of inflation information. The fifth component, total reserves, reflects saving information.

Principal component regression (PCR) is a technique that combines principal component analysis with inverse least-squares regression [

Using the five named principal components as explanatory variables and the import volume index (Y) as the response variable, we ran a principal component regression to further analyze the relationship between import trade and the principal components.

Coefficients | Estimate | Std. Error | t-value | Pr (>\|t\|) |
---|---|---|---|---|
(Intercept) | 250.74 | 9.84 | 25.49 | <2e−16*** |
β_{1} | −79.56 | 4.71 | −16.88 | <2e−16*** |
β_{2} | 74.46 | 6.60 | 11.29 | 1.1e−12*** |
β_{3} | 13.34 | 8.18 | 1.63 | 0.11 |
β_{4} | 13.42 | 8.91 | 1.51 | 0.14 |
β_{5} | −99.50 | 11.71 | −8.50 | 1.0e−09*** |

Signif. codes: 0 “***” 0.001 “**” 0.01 “*” 0.05 “.” 0.1 “ ” 1. Residual standard error: 60.6 on 32 degrees of freedom. Multiple R-squared: 0.939, Adjusted R-squared: 0.929, F-statistic: 97.9 on 5 and 32 DF, p-value: <2e−16. Source: Author’s own.

The function reflects the relationship between the import volume index and each principal component. The estimated PCR that fits the data gathered is given as:

$$Y = 250.74 - 79.56 Z_1 + 74.46 Z_2 + 13.34 Z_3 + 13.42 Z_4 - 99.50 Z_5$$

Import trade has a significant negative relationship with the investment fundamental and saving factors, a significant positive relationship with the income consumption factor, and a positive but non-significant relationship with the price and inflation factors. The F-statistic was statistically significant at the 5% significance level (F = 97.9, p-value < 2e−16). The adjusted R-squared was approximately 92.9% (multiple R-squared 93.9%), showing how much of the variation in the dependent variable is explained by the components.
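The two R-squared figures in the regression output are consistent with each other. As a check, assuming $n = 38$ annual observations (1980-2017) and $k = 5$ regressors, which matches the 32 residual degrees of freedom reported, the standard adjustment formula gives:

$$\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1} = 1 - (1 - 0.939) \times \frac{37}{32} \approx 0.929$$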


Data splitting is the act of partitioning available data into two portions, usually for cross-validation purposes [

The 11 explanatory variables of the training set were compressed into 5 components and the import volume index was regressed on their scores; the prediction showed an accuracy of 0.89. The same technique was applied to the testing set, where the accuracy was 0.90. The model was trained and tested thoroughly. The evaluation of performance was based on MIN-MAX accuracy after PCA, and we combined the two databases created (actual and predicted values for the training set, and actual and predicted values for the test set) into one to form
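The text does not define MIN-MAX accuracy. A common definition, assumed here, is the mean over observations of the smaller of the actual and predicted values divided by the larger. The function name and the sample numbers below are hypothetical and only illustrate the calculation.

```python
def min_max_accuracy(actual, predicted):
    """Mean of min(a, p) / max(a, p) over paired positive values; 1.0 is a perfect fit."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [min(a, p) / max(a, p) for a, p in zip(actual, predicted)]
    return sum(ratios) / len(ratios)

# Hypothetical index values, for illustration only.
actual = [100.0, 120.0, 150.0, 200.0]
predicted = [110.0, 115.0, 160.0, 180.0]
accuracy = min_max_accuracy(actual, predicted)
```

Under this definition the measure lies between 0 and 1 for positive values, so the reported 0.89 and 0.90 would mean predictions were on average within roughly 10% of the actual values.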

Forecasting future observations is the most important task of time series analysis when there are many predictors. The data followed an approximate factor model; here the predictors were summarized by a small number of components estimated using principal components [

We selected Holt-Winters smoothing because of its predictive ability and because it can adapt to a changing trend.
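To show how such a smoother adapts to a trend, the sketch below implements the non-seasonal (Holt linear trend) form of the method in plain Python, which is the natural case for annual data. The smoothing parameters `alpha` and `beta`, the initialization, and the sample series are assumptions for illustration; the study's fitted parameters are not reported in the text.

```python
def holt_forecast(series, alpha, beta, horizon):
    """Holt's linear-trend smoothing: return `horizon` forecasts beyond the series."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        # Update the level from the new observation and the previous projection.
        level = alpha * value + (1 - alpha) * (previous_level + trend)
        # Update the trend from the change in level.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]

# Hypothetical annual index values; alpha and beta are illustrative choices.
history = [100.0, 110.0, 125.0, 135.0, 150.0, 170.0]
forecast = holt_forecast(history, alpha=0.8, beta=0.2, horizon=8)
```

The forecast extends the last smoothed level along the last smoothed trend, so an upward-trending history yields an increasing eight-step forecast.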

The black line represents the actual import volume index for 1980-2017, the red line represents the values predicted from the components influencing import trade for 1980-2017, and the blue line represents the forecast for the eight years 2018-2025. The shaded region represents the confidence intervals. The figure shows that the import trade of Rwanda is expected to increase over the following 8 years.

This result is in line with the World Bank database, which gives an import volume index of 776,339 for 2018, a value that falls within the confidence interval.

Based on the findings above, the eleven macroeconomic indicators showed a considerable degree of redundancy in the correlation matrix. Preliminary to PCA, we found that five PCs would suffice. PCA was then adopted to identify the variables that are likely to be classified together as one principal component. Five PCs explain roughly 92.9% of the total variation in the data. Four of the indicators (foreign direct investment, gross capital formation, trade openness, and exchange rate) showed a very high likeness and were classified into one factor, referred to as the investment fundamental factor. External debt, final consumption expenditure, and relative import prices exhibited great similarities and were classified into another factor, referred to as the income consumption factor. The third principal component is gross domestic product (GDP), a monetary measure of the market value of all the final goods and services produced in a specific time period, and can be named the price factor. The fourth principal component, inflation and population growth, reflects the inflation factor, while total reserves, being distinct from the others, were classified as the saving factor.

After the extraction of the five main components, the scores of the five components were used as regressors in the regression analysis, and the results show that there is at least one significant component in the model.

The empirical model showed that import volume has a positive relationship with the income consumption, price and inflation factors and a negative relationship with the investment fundamental and saving factors. It is generally accepted that import trade is increased by price instability and high consumption, and that current account deficits lead to currency problems (high inflation).

Investment fundamentals and savings factors had a big negative impact on import volume as an increase in domestic investment repatriated a huge amount of 1673 million USD from abroad [

An increase in the price of domestic products relative to imported products increased price instability and encouraged diversification toward imported products and higher spending to avoid future price increases, which raises import volume. Consumption increases when disposable income increases, and countries consume more imports as the income of their population rises, which explains the positive relationship between income consumption and import volume.

The forecasting model improves the accuracy of prediction and reduces the error between the actual value and the forecast value in import trade prediction. Using the PCA method, the best feature subset for forecasting can be found and the impact of redundant information between the factors influencing import trade can be reduced. As a result, the accuracy of forecasting is improved. Finally, the application of PCA served as a tool for effectively reducing and grouping variables into fewer components with little loss of valuable information. This is the essence of principal component analysis.

Rwanda is currently establishing a competitive knowledge-based economy and promoting industrial enterprise development and also the incentives of investment targeted at promoting exports and shifting from an agriculture-based economy to high-value goods and services [

Therefore, Rwanda’s economic policymakers should take into consideration the following three points to manage import trade promotion:

First, Rwanda is developing fast economically, technologically and mentally, and there is an enormous need for innovative products that make Rwanda’s economy higher in quality, more efficient, fairer and more sustainable. Rwanda’s economic policymakers should therefore control the accumulation of investment by encouraging higher private investment and savings to enrich public funding, industrial policy and human capital, thus supporting import trade control and economic security.

Second, price changes due to supply and demand affect import trade positively. To attain sustainable economic growth coupled with price stability, policy authorities should keep inflation low and stable by negotiating market access through new bilateral trade and transport agreements and by establishing regular monitoring of market prices.

Third, a rise in the exchange rate, which indicates depreciation of the national currency (Frw), will also have a negative impact on imports, independence, investment growth and economic resilience. With speedy economic development and the growth of import trade in Rwanda coming from a coffee and tea base, capital accumulation is the main driving force for economic growth. Therefore, the growth of investment and savings can be driven by weakening import trade, which will serve as a catalyst to attract more external financiers and investors as it reduces the value of the international flow of capital, thus directly creating the need for imported products for investment purposes to increase the level of product exportation. This would have a large positive impact on a saving culture linked to economic growth.

This manuscript is the authors’ original work, has not been published and is not under consideration for publication elsewhere.

The authors declare no conflicts of interest regarding the publication of this paper.

Nzayisenga, E. and Zhu, Y.Z. (2020) The Import Trade Forecasting Model Based on PCA: Evidence from Rwanda. Open Journal of Statistics, 10, 678-693. https://doi.org/10.4236/ojs.2020.104042