
The Relaxed Lasso method is used for variable selection among 15 main economic factors affecting foreign investment. The Relaxed Lasso method is compared with the least-squares method and stepwise regression, and the results further reveal the main problems facing foreign investment at the present stage.

Variable selection often arises when building statistical models. A model that contains more or fewer variables than the true model hinders the study of the problem. In optimizing a model, the most explanatory and influential subset of variables needs to be found, so that the model is more reasonable and forecasts more precisely. In traditional methods, variable selection and parameter estimation are carried out separately, as with the AIC criterion proposed by Akaike [

China is currently in a new stage of development, and the role of foreign direct investment in China’s economic development cannot be underestimated for a fairly long time to come. A thorough discussion of the main factors influencing foreign direct investment in China therefore still has great practical significance. Scholars have paid much attention to the causes of foreign direct investment and the decisions behind it. Xu Jinliang used the ordinary least squares method to study the factors influencing the attraction of foreign direct investment to Jiangxi Province [

Given a set of observed data $(x_i, y_i)$, $i = 1, 2, \cdots, n$, where $x_i = (x_{i1}, x_{i2}, \cdots, x_{ip})$ is a vector of explanatory variables and $y_i$ is the dependent variable, the linear regression model can be expressed as

$$y_i = x_i \beta + \varepsilon_i = x_{i1}\beta_1 + x_{i2}\beta_2 + \cdots + x_{ip}\beta_p + \varepsilon_i \quad (Y = X\beta + \varepsilon),$$

where $\beta = (\beta_1, \beta_2, \cdots, \beta_p)^T$ is the vector of unknown regression coefficients, $\varepsilon_i$ is a random error, $Y = (y_1, y_2, \cdots, y_n)^T$, $\varepsilon = (\varepsilon_1, \varepsilon_2, \cdots, \varepsilon_n)^T$, and $X$ is the $n \times p$ matrix whose $i$-th row is $x_i = (x_{i1}, x_{i2}, \cdots, x_{ip})$. The errors satisfy $E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2 I$, so that

$$E(Y \mid X) = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p.$$

We assume that the observations are independent, or that the $y_i$ are conditionally independent given the observed $x_{ij}$, and that the $x_{ij}$ are standardized, that is, $\frac{1}{n}\sum_i x_{ij} = 0$ and $\frac{1}{n}\sum_i x_{ij}^2 = 1$.

When many regression coefficients in the model are 0, the Relaxed Lasso method uses the observed data to identify the variables whose coefficients are 0 and to estimate the non-zero coefficients, thereby yielding a sparse model.

In fact, Relaxed Lasso variable selection for the linear model amounts to solving the following optimization problem:

$$\hat{\beta}^{\lambda,\phi} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} \left( y_i - x_i \{\beta \cdot 1_{M_\lambda}\} \right)^2 + \phi\lambda \sum_{j=1}^{p} |\beta_j| \right\} \qquad (1)$$

Here the parameters are $\lambda \in [0, \infty)$ and $\phi \in (0, 1]$, and $1_{M_\lambda}$ is the indicator function of the set of variable indices $M_\lambda \subseteq \{1, 2, \cdots, p\}$, so that for all $k \in \{1, 2, \cdots, p\}$,

$$\{\beta \cdot 1_{M_\lambda}\}_k = \begin{cases} 0, & k \notin M_\lambda, \\ \beta_k, & k \in M_\lambda. \end{cases} \qquad (2)$$

It is easy to see that the Relaxed Lasso estimate considers only the variables whose indices lie in the set $M_\lambda$, the set of variables selected by the ordinary Lasso with penalty $\lambda$. As in Lasso estimation, the parameter $\lambda$ controls the variable selection part, while the second parameter $\phi$ controls the shrinkage of the coefficients. We therefore have the following conclusions:

1) When $\phi = 1$, Relaxed Lasso and Lasso are completely equivalent.

2) When $\phi < 1$, the shrinkage of the coefficients in Relaxed Lasso is weaker than in the Lasso estimates. This prevents the coefficients of some significant variables in the model from becoming 0 because of excessive shrinkage.

3) The case $\phi = 0$ needs special consideration, as the definition above would produce a degenerate solution. In general we define the relaxed Lasso estimator for $\phi = 0$ as the limit of the above definition for $\phi \to 0$. In this case, all coefficients in the selected model are estimated by the OLS solution.

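Under an orthogonal design, the Relaxed Lasso estimator is:

$$\hat{\beta}_k^{\lambda,\phi} = \begin{cases} \hat{\beta}_k^{0} - \phi\lambda, & \hat{\beta}_k^{0} > \lambda, \\ 0, & |\hat{\beta}_k^{0}| \le \lambda, \\ \hat{\beta}_k^{0} + \phi\lambda, & \hat{\beta}_k^{0} < -\lambda, \end{cases} \qquad (3)$$

where $\hat{\beta}^{0}$ is the OLS solution.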
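Equation (3) can be made concrete with a small numerical sketch. The function name `relaxed_lasso_orthogonal` and the OLS coefficients used here are hypothetical illustrations, not values from this study; the code simply applies the three cases of (3) coordinate by coordinate.

```python
def relaxed_lasso_orthogonal(beta_ols, lam, phi):
    """Apply equation (3) to each OLS coefficient (orthogonal design only)."""
    if lam < 0 or not 0.0 <= phi <= 1.0:
        raise ValueError("need lam >= 0 and 0 <= phi <= 1")
    result = []
    for b in beta_ols:
        if b > lam:            # selected, positive: shrink by phi * lam
            result.append(b - phi * lam)
        elif b < -lam:         # selected, negative: shrink by phi * lam
            result.append(b + phi * lam)
        else:                  # |b| <= lam: variable is dropped
            result.append(0.0)
    return result


# Hypothetical OLS coefficients, chosen only for illustration.
beta_ols = [2.0, -1.5, 0.3]
lasso = relaxed_lasso_orthogonal(beta_ols, lam=0.5, phi=1.0)    # ordinary Lasso
relaxed = relaxed_lasso_orthogonal(beta_ols, lam=0.5, phi=0.5)  # weaker shrinkage
refit = relaxed_lasso_orthogonal(beta_ols, lam=0.5, phi=0.0)    # OLS on selected set
print(lasso, relaxed, refit)
```

In all three calls the third coefficient is set to 0 because selection depends only on $\lambda$, while $\phi$ changes only how far the surviving coefficients are pulled toward 0.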

For the general linear model, the Relaxed Lasso algorithm is based on the LARS algorithm and is in fact a two-stage approach. The algorithm can be described as follows:

1) Compute all ordinary Lasso solutions with the LARS algorithm. Let $M_1, M_2, \cdots, M_m$ be the resulting set of final models. Let $\lambda_1 > \lambda_2 > \cdots > \lambda_m = 0$ be a sequence of penalty values such that $M_\lambda = M_k$ if and only if $\lambda \in (\lambda_k, \lambda_{k-1}]$, where $k = 1, 2, \cdots, m$ and $\lambda_0 := \infty$. (The models are not necessarily distinct, so it is always possible to obtain such a sequence of penalty parameters.)

2) For each $k = 1, 2, \cdots, m$, compute all Lasso solutions on the set $M_k$ of variables, varying the penalty parameter over $[0, \lambda_k]$. The obtained set of solutions is identical to the set of relaxed Lasso solutions $\hat{\beta}^{\lambda,\phi}$ for $\lambda \in (\lambda_k, \lambda_{k-1}]$. The Relaxed Lasso solutions for all penalty parameters are given by the union of these sets.
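The two stages can be sketched for a single pair $(\lambda, \phi)$ as follows. This is a minimal illustration under stated assumptions, not the LARS-based path algorithm described above: it swaps in plain cyclic coordinate descent, it solves for one penalty value rather than the whole path, and the function names and toy data are hypothetical.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t, returning exactly 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, penalty, active=None, n_iter=200):
    """Minimise sum_i (y_i - x_i beta)^2 + penalty * sum_j |beta_j|
    by cyclic coordinate descent; coefficients outside `active` stay at 0.
    Assumes every active column of X is non-zero."""
    n, p = len(X), len(X[0])
    active = list(range(p)) if active is None else list(active)
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta for the current beta
    for _ in range(n_iter):
        for j in active:
            col = [X[i][j] for i in range(n)]
            z = sum(c * c for c in col)
            # inner product of column j with the residual that excludes x_j's fit
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid))
            new = soft_threshold(rho, penalty / 2.0) / z
            delta = new - beta[j]
            resid = [r - c * delta for c, r in zip(col, resid)]
            beta[j] = new
    return beta


def relaxed_lasso(X, y, lam, phi):
    """Stage 1: Lasso with penalty lam selects M_lambda.
    Stage 2: Lasso restricted to M_lambda with penalty phi * lam."""
    stage1 = lasso_cd(X, y, lam)
    selected = [j for j, b in enumerate(stage1) if b != 0.0]
    return lasso_cd(X, y, phi * lam, active=selected), selected


# Toy data (hypothetical): three mutually orthogonal columns, y = 2*x1 + 0.1*x2.
X = [[1.0, 1.0, 1.0], [1.0, -1.0, -1.0], [-1.0, 1.0, -1.0], [-1.0, -1.0, 1.0]]
y = [2.1, 1.9, -1.9, -2.1]
beta_lasso, selected = relaxed_lasso(X, y, lam=2.0, phi=1.0)
beta_relaxed, _ = relaxed_lasso(X, y, lam=2.0, phi=0.5)
beta_refit, _ = relaxed_lasso(X, y, lam=2.0, phi=0.0)
print(selected, beta_lasso, beta_relaxed, beta_refit)
```

On this toy design only the first variable is selected, and lowering $\phi$ from 1 to 0 moves its coefficient from the Lasso value toward the OLS refit on the selected set.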

In theory, this method yields all Relaxed Lasso solutions for $\lambda > 0$ and $\phi \in [0, 1]$. This simple algorithm is not optimal, however, and can be further improved [

According to economic theory and previous research findings, 14 variables were selected, covering infrastructure, human resources, labor cost, market size, exchange rates, labor productivity, concentration factor, trade openness and trade barriers.

The selected variables are as follows: highway mileage ($x_{1}$), freight turnover ($x_{2}$) and throughput of post and telecommunications ($x_{3}$) reflect infrastructure; the number of students in colleges and universities ($x_{4}$) reflects the human resource situation; the average wage of workers ($x_{5}$) reflects labor costs; GDP ($x_{6}$), the GDP growth rate ($x_{7}$), total investment in fixed assets ($x_{8}$) and total retail sales of consumer goods ($x_{9}$) reflect the size of the market; the dollar-yuan exchange rate ($x_{10}$) reflects the exchange rate; the ratio of GDP to employment ($x_{11}$) reflects the level of labor productivity; the share of tertiary industry in GDP ($x_{12}$) reflects the agglomeration effect; the ratio of total imports and exports to GDP ($x_{13}$) reflects the degree of trade openness; and the tariff ($x_{14}$) reflects the degree of trade barriers. We expect wages and tariffs to have a negative impact on foreign investment, and the other variables to have a positive effect.

The data used in this article cover 1995 to 2014. Exchange rate data were obtained from the State Administration of Foreign Exchange; data for the other variables were obtained from the China Statistical Yearbook, 1995-2015. The article uses the exchange rate of the dollar against the RMB on the last day of each year from 1995 to 2014. To eliminate the effect of differing units among the variables, and to obtain a smoother series more easily, we take the natural logarithm of the time series data and then center and standardize them, which does not affect the relationships between the variables.
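The preprocessing just described can be sketched as below. This assumes that the standardization matches the convention stated for the model (zero mean and unit mean square for each variable); the function name `log_standardize` and the input series are hypothetical and are not the study's data.

```python
import math


def log_standardize(series):
    """Take natural logs, then center to mean 0 and scale to mean square 1."""
    logs = [math.log(v) for v in series]
    n = len(logs)
    mean = sum(logs) / n
    centered = [v - mean for v in logs]
    scale = math.sqrt(sum(c * c for c in centered) / n)
    return [c / scale for c in centered]


# Hypothetical positive annual series, for illustration only.
raw = [100.0, 120.0, 150.0, 200.0]
z = log_standardize(raw)
print(z)
```

Because the logarithm, centering and positive scaling are all order-preserving, the transformed series keeps the ordering of the original observations while becoming unit-free.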

Relaxed Lasso was then used to select among the 14 variables. The R implementation of the Relaxed Lasso algorithm found all solutions in just 20 steps. The solution path is shown in

As the results for each step along the path of model parameter estimates show, the optimal solution was reached after only 15 steps. The results show that the Relaxed Lasso method achieves both estimation of the model parameters and variable selection.

The Relaxed Lasso variable selection results show that the number of students in colleges and universities, GDP, and the GDP growth rate have significant positive effects on foreign direct investment, while the tariff has a significant negative effect. GDP has the most significant positive effect on foreign investment, which shows that foreign investment in China is attracted mainly by the huge domestic market. The effects of freight turnover, throughput of post and telecommunications, the average wage of workers, total retail sales of consumer goods, and the ratio of GDP to employment are not significant, so these variables were not selected into the model. To show the advantage of Relaxed Lasso in variable selection, we compare its results with those of the least-squares method and the stepwise regression method; the parameter estimates are shown in the table below.

As the table of parameter estimates shows, the Relaxed Lasso method neither selects too many variables nor deletes too many. The final model is therefore easier to interpret.
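The comparison can be checked directly from the reported estimates. The sketch below copies the three columns of the parameter-estimate table in row order and counts the non-zero coefficients and the sum of absolute coefficients for each method. It assumes that the 14 rows correspond to $x_{1}, \cdots, x_{14}$ in that order, which is consistent with the five unselected variables named in the text; the names `ESTIMATES` and `summarize` are hypothetical.

```python
# Coefficients copied from the parameter-estimate table, in row order.
# Assumption: the 14 rows correspond to x_1, ..., x_14 in that order.
ESTIMATES = {
    "least_squares": [-0.4077, 0.4636, -0.09055, 0.5249, 0.3295, 15.2, 0.06574,
                      -1.459, 0.6041, -0.5728, 13.83, -0.3028, -0.04689, -0.6188],
    "stepwise": [-0.4457, 0.0, 0.0, 0.7081, 0.0, 2.677, 0.0,
                 -1.616, 0.0, -0.3663, 0.0, -0.2101, 0.0, -0.4868],
    "relaxed_lasso": [-0.3756, 0.0, 0.0, 0.6728, 0.0, 2.18, 0.0406,
                      -1.078, 0.0, -0.2346, 0.0, -0.185, -0.0864, -0.4175],
}


def summarize(coefs):
    """Return (number of non-zero coefficients, sum of absolute coefficients)."""
    n_selected = sum(1 for c in coefs if c != 0.0)
    l1_norm = sum(abs(c) for c in coefs)
    return n_selected, l1_norm


summary = {name: summarize(coefs) for name, coefs in ESTIMATES.items()}
for name, (k, l1) in summary.items():
    print(f"{name:14s} selected={k:2d}  sum|coef|={l1:.4f}")
```

Least squares keeps all 14 variables with a total absolute coefficient of about 34.52, stepwise regression keeps 7 (about 6.51), and Relaxed Lasso keeps 9 (about 5.27), so Relaxed Lasso sits between the two in model size while shrinking the coefficients the most.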

Parameters | Least-squares regression | Stepwise regression | Relaxed Lasso |
---|---|---|---|
$\beta_{1}$ | −0.4077 | −0.4457 | −0.3756 |
$\beta_{2}$ | 0.4636 | 0 | 0 |
$\beta_{3}$ | −0.09055 | 0 | 0 |
$\beta_{4}$ | 0.5249 | 0.7081 | 0.6728 |
$\beta_{5}$ | 0.3295 | 0 | 0 |
$\beta_{6}$ | 15.2 | 2.677 | 2.18 |
$\beta_{7}$ | 0.06574 | 0 | 0.0406 |
$\beta_{8}$ | −1.459 | −1.616 | −1.078 |
$\beta_{9}$ | 0.6041 | 0 | 0 |
$\beta_{10}$ | −0.5728 | −0.3663 | −0.2346 |
$\beta_{11}$ | 13.83 | 0 | 0 |
$\beta_{12}$ | −0.3028 | −0.2101 | −0.185 |
$\beta_{13}$ | −0.04689 | 0 | −0.0864 |
$\beta_{14}$ | −0.6188 | −0.4868 | −0.4175 |

Based on the above analysis, the major conclusions are as follows:

First, it can be proved theoretically that the least-squares estimate of the parameter vector is too long on average when the data exhibit serious multicollinearity. The parameter estimates obtained by Relaxed Lasso are markedly smaller in absolute value than the least-squares estimates. Relaxed Lasso shrinks the least-squares estimates, which can largely eliminate the adverse effects of multicollinearity in the model. At the same time, Relaxed Lasso has obvious advantages in selecting among a large number of variables: it neither retains too many variables, as the least-squares method does, nor eliminates too many, as stepwise regression does. The variables it deletes are not significant for the model, so removing them improves the accuracy of the model.

Second, foreign investment is greatly affected by the size of the domestic market. Foreign investment increases by about 1 percentage point for every increase of two percentage points in GDP. At the same time, the GDP growth rate plays a certain role in promoting foreign investment.

Third, human resources also have a certain impact on foreign investment. The state of human resources, represented by the number of students in colleges and universities, reflects the level of education to a certain extent. In fact, areas with higher levels of education have obtained more foreign investment than other places, and the technological content of the foreign capital they attract is also higher. This is most evident in the eastern region. An increase in the number of students in colleges and universities provides more technical talent for foreign investment and further enhances China's competitiveness against other countries in attracting foreign investment.

Fourth, the research shows that lower tariffs will help China attract more foreign investment. China should therefore further deepen reform and opening up and accelerate the pace of negotiations on free trade areas, in order to create the conditions for full participation in international competition.

He, Y.Q. (2017) The Analysis of Impact Factors of Foreign Investment Based on Relaxed Lasso. Journal of Applied Mathematics and Physics, 5, 693-699. https://doi.org/10.4236/jamp.2017.53058