Study on the Macro-Level Risk Assessment and Intelligent Line Selection for Overseas Railway Construction

In recent years, China has made overseas railway construction a key investment priority, and the primary task of such investment is route selection. Taking some sections of the Belt and Road as an example, 15 representative risk indicators are established based on survey data. Principal component analysis is then used to assess risk in 63 countries along the Belt and Road, the resulting risk scores are ranked, and reasonable high-speed rail lines are planned according to this ranking.


Introduction
Since "the Belt and Road" cooperation was established in 2013, China has been committed to overseas railway investment and construction, which can not only drive the economic development of China and neighboring countries, but also demonstrate China's economic strength and the development of its high-speed rail technology.
The primary task of the construction of overseas high-speed railway is railway route selection. The design of railway route selection is the overall design of a railway line, which directly affects the railway transportation capacity, transportation quality and economic benefits of investment. Because the work load of risk assessment [1].
In this paper, principal component analysis is used for the evaluation. Its advantages are as follows:
1) Principal component analysis eliminates the correlation between evaluation indicators, because transforming the original indicator variables yields principal components that are uncorrelated with each other. The more strongly the indicators are correlated, the more effective principal component analysis is.
2) Principal component analysis reduces the workload of indicator selection. With other evaluation methods it is difficult to eliminate the correlation between evaluation indicators, so selecting the indicators takes a lot of effort. Because principal component analysis removes this correlation, selecting the indicators is relatively easy.
3) When there are many evaluation indicators, a few comprehensive indicators can replace the original indicators while retaining most of the information. The principal components are arranged in descending order of variance, so the components with small variance can be discarded and only the leading components with larger variance are used to represent the original variables, which reduces the computational workload.
4) In the comprehensive evaluation function, the weight of each principal component is its contribution rate, which reflects the proportion of the total information in the original data that is carried by that principal component. The weights are therefore determined objectively, which overcomes the defect of artificially determined weights in some evaluation methods.
5) The calculation is relatively standardized, so it can be easily implemented on a computer with specialized software.

Macro-Level Risk Indicators
From a macro perspective, the construction of overseas railways is closely related to the political, economic, and social development factors of each country. Therefore, the risk assessment indicators for overseas railway line selection are chosen according to the principles of data availability and authority. Based on data from the World Bank and the National Bureau of Statistics of China, three general categories of indicators are selected, namely political, economic, and social development-related factors, to reflect the specific situation of each country [2].
American Journal of Industrial and Business Management
In consultation with Dr. Tong Xinhao, Dr. Zeng Hailin and other experts (both professors in the railway industry from Southwest Jiaotong University), and based on the actual situation of the countries along the railway, the macro-level risk assessment indicators are selected from various authoritative databases on railway line selection and from data on China's foreign project contracting and imports and exports. The indicators are shown in Table 1.

Macro-Level Risk Assessment Principle [3] [4]
Table 1. Macro-level risk assessment indicators.
Inflation rate: The inflation rate, measured by the consumer price index, reflects the annual percentage change in the cost to the average consumer of purchasing a basket of goods and services.
3. China's foreign contracted project completed turnover: The monetary value of the work completed during the reporting period by Chinese enterprises or other units contracting overseas construction projects.
4. Exchange rate change: The rate of change of each country's average annual official exchange rate compared with the average exchange rate of the previous year.
5. Power coverage: The percentage of the total population with electricity supply.
6. Difficulty of company registration: Measures the complexity of the registration process for starting a company, which includes obtaining the necessary permits, certifications, and other procedures.
7. Traffic accident rate: The number of deaths caused by road traffic injuries per 100,000 population.
8. Tax burden: Total tax as a percentage of commercial profits. The total tax rate refers to the amount of taxes and mandatory contributions payable by the enterprise after allowable deductions and exemptions, as a share of commercial profits.
9. Population density: The mid-year population divided by the land area in square kilometers.
10. Population growth rate: The population growth rate for year t is the growth rate of the mid-year population from year t-1 to year t.
11. Total population: Counted according to the actual number of people; all residents are counted, regardless of their legal status or nationality.
12. Proportion of urban population: The urban population as a percentage of the total population, as collected and collated by the United Nations Population Division.
13. Cooperative relations with China: Graded from low to high as diplomatic relations, partners, comprehensive partners, strategic partners, strategic partners, and comprehensive strategic partners, scored from 1 to 6.
14. Political stability: Measures perceptions of the likelihood of political instability and politically motivated violence or terrorism.
15. Government efficiency: Reflects the public's perceptions of public service quality, the civil service system and its degree of neutrality, the quality of policy formulation and implementation, and the extent to which the government implements these policies.

Principal component analysis is a multivariate statistical technique that transforms a set of possibly correlated variables into a set of linearly uncorrelated variables by an orthogonal transformation. The transformed variables are called principal components. The basic idea is to reduce the dimensionality of the original data, replacing a large number of original variables with a few mutually uncorrelated comprehensive variables that carry most of the information in the original variables [3]. The first comprehensive variable selected is denoted $F_1$. It has the largest variance $\mathrm{Var}(F_1)$, which means that it contains the largest amount of information, and it is called the first principal component. If the first principal component is insufficient to represent the information carried by the original $p$ variables, a second principal component $F_2$ is selected that is linearly uncorrelated with $F_1$, that is, $\mathrm{Cov}(F_1, F_2) = 0$. By analogy, the third, fourth, and subsequent principal components can be constructed, up to the $p$-th. Each principal component is a linear combination of the standardized indicators $ZX_1, \ldots, ZX_p$, where $ZX_i$ represents the $i$-th column of the standardized data matrix $ZX$.
The first $m$ principal components, whose cumulative contribution rate reaches 60%, are selected, and their variance contribution rates $\alpha_k$ ($k = 1, 2, \ldots, m$) are taken as weights to construct a linear combination as the comprehensive evaluation function:

$$R = \alpha_1 F_1 + \alpha_2 F_2 + \cdots + \alpha_m F_m$$

With the above formula, the comprehensive score $R$ of each evaluation object is calculated from its data, and the $R$ values are then ranked to obtain a comprehensive evaluation of each object [5].
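The procedure described in this section can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the software used in the paper: the function name `pca_composite_score`, the use of the correlation matrix, and the sample standard deviation (`ddof=1`) are assumptions made here for the example.

```python
import numpy as np

def pca_composite_score(X, threshold=0.60):
    """Return (R, alpha, F): composite scores, weights, component scores.

    X is an (n_objects, p_indicators) array of raw indicator values.
    The function name and defaults are illustrative assumptions.
    """
    X = np.asarray(X, dtype=float)
    # Standardize each indicator (column) to zero mean and unit variance.
    ZX = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Correlation matrix of the indicators.
    corr = np.corrcoef(ZX, rowvar=False)
    # Eigendecomposition of the symmetric matrix, sorted by decreasing variance.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Variance contribution rate of every principal component.
    contrib = eigvals / eigvals.sum()
    # Smallest m whose cumulative contribution rate reaches the threshold.
    m = int(np.searchsorted(np.cumsum(contrib), threshold) + 1)
    m = min(m, X.shape[1])
    # Principal component scores F_1..F_m and their weights alpha_1..alpha_m.
    F = ZX @ eigvecs[:, :m]
    alpha = contrib[:m]
    # Comprehensive evaluation function R = sum_k alpha_k * F_k.
    R = F @ alpha
    return R, alpha, F
```

Ranking the objects is then a matter of sorting `R`, for example with `np.argsort(R)[::-1]` for a descending order.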

Instance Application
The overseas railway macro-level risk assessment indicators established in Section 2 comprise 15 indicators in total, with data from the World Bank (https://www.worldbank.org/) and the China National Bureau of Statistics. To increase the reliability of the data, this paper selects 63 countries (all the Belt and Road project routes and participating countries) as samples. Because there is no direct relationship between the risk indicators and their dimensions differ greatly, the raw data must be standardized so that they can be judged on a unified scale. The original data are therefore standardized first, and the standardized data are shown in Table 2 [6].
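As a minimal sketch of the standardization step, assuming the usual z-score transformation (the paper does not state which standardization formula was used), each indicator column can be centered and scaled as follows. The function name `standardize` and the sample values are hypothetical.

```python
import numpy as np

def standardize(X):
    """Z-score standardization of each indicator (column) of X."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)  # sample standard deviation (an assumption)
    if np.any(std == 0):
        raise ValueError("an indicator with zero variance cannot be standardized")
    return (X - mean) / std

# Hypothetical values for three countries and two indicators on very
# different scales (a population count and a percentage rate).
raw = np.array([[1.4e9, 2.1],
                [5.1e7, 0.7],
                [2.7e8, 3.5]])
Z = standardize(raw)  # every column now has mean 0 and unit variance
```

After this step, indicators measured in different units contribute on a comparable scale.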
The standardized data are used in the dimension-reduction factor analysis and are first subjected to the KMO and Bartlett tests. If the KMO value is greater than 0.6 and the significance of the Bartlett test is less than 0.01, principal component analysis or factor analysis can be performed. Because the sample is large, the factors are rotated on the basis of the principal component analysis. Factor rotation is in fact a rotation of the factor loading matrix (a coordinate rotation) that simplifies its structure, so that the squared elements in each column or row of the loading matrix are polarized toward 0 and 1. As a result, each original variable is closely related to as few factors as possible, and the actual meaning of the factor solution is easier to explain [7].
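The two suitability checks can be sketched as follows, using the textbook definitions of the Kaiser-Meyer-Olkin measure and the Bartlett sphericity statistic. The function name `kmo_and_bartlett` is an assumption for this example; the paper's own results presumably came from a statistics package.

```python
import numpy as np

def kmo_and_bartlett(X):
    """Return (kmo, chi2, dof) for an (n_samples, p_indicators) array."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    corr = np.corrcoef(X, rowvar=False)
    # Partial correlations come from the inverse correlation matrix.
    inv = np.linalg.inv(corr)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off = ~np.eye(p, dtype=bool)  # mask selecting off-diagonal entries
    r2 = np.sum(corr[off] ** 2)
    p2 = np.sum(partial[off] ** 2)
    # KMO: share of squared correlations among squared (partial) correlations.
    kmo = r2 / (r2 + p2)
    # Bartlett: chi2 = -(n - 1 - (2p + 5) / 6) * ln|R|, dof = p(p - 1) / 2.
    _, logdet = np.linalg.slogdet(corr)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    dof = p * (p - 1) // 2
    return kmo, chi2, dof
```

The significance of the Bartlett test is the upper-tail probability of `chi2` under a chi-square distribution with `dof` degrees of freedom; if SciPy is available, it can be obtained with `scipy.stats.chi2.sf(chi2, dof)`.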
Then, the factor analysis tool is used for dimensionality reduction. Based on the principal component analysis, the maximum variance (varimax) method is used to perform the factor rotation, and the results in Table 3 are obtained.
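For readers who want to see what the maximum variance rotation does, the following is a sketch of a commonly used SVD-based varimax iteration. It is not necessarily the algorithm inside the tool used by the authors, and the loading values in the demonstration are hypothetical.

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Rotate a (p, k) loading matrix; return (rotated loadings, rotation R)."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        # Gradient-like matrix of the varimax criterion at the current rotation.
        grad = L.T @ (Lr ** 3 - (gamma / p) * Lr @ np.diag(np.sum(Lr ** 2, axis=0)))
        u, s, vt = np.linalg.svd(grad)
        R = u @ vt  # nearest orthogonal matrix to the gradient
        d_new = s.sum()
        if d != 0.0 and d_new / d < 1.0 + tol:
            break
        d = d_new
    return L @ R, R

def varimax_criterion(L):
    """Sum over factors of the variance of the squared loadings."""
    return float(np.sum(np.var(np.asarray(L, dtype=float) ** 2, axis=0)))

# Hypothetical loadings: a clean two-factor structure turned by 30 degrees.
theta = np.deg2rad(30)
rot = np.array([[np.cos(theta), np.sin(theta)],
                [-np.sin(theta), np.cos(theta)]])
simple = np.array([[0.9, 0.0], [0.8, 0.0], [0.0, 0.7], [0.0, 0.6]])
unrotated = simple @ rot
rotated, R = varimax(unrotated)
```

Because `R` is orthogonal, the rotation leaves the communality of every indicator (the row sum of squared loadings) unchanged and only redistributes it among the factors.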
It can be seen from the above test results that the KMO value is greater than 0.6 and the significance is less than 0.01, so it is suitable for principal component analysis or factor analysis.
As can be seen from Table 4, the first five principal components contain nearly 66% of the information, and it can be considered that these five principal components contain most of the information of the original elements.
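The selection of the number of components from the cumulative contribution rate can be illustrated with a short sketch. The eigenvalues below are hypothetical and are not the values of Table 4; the function names are likewise assumptions.

```python
import numpy as np

def cumulative_contribution(eigenvalues):
    """Return (contribution rates, cumulative rates) in descending order."""
    ev = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    rate = ev / ev.sum()
    return rate, np.cumsum(rate)

def components_needed(eigenvalues, threshold=0.60):
    """Smallest number of components whose cumulative rate reaches threshold."""
    _, cum = cumulative_contribution(eigenvalues)
    return int(np.argmax(cum >= threshold - 1e-12) + 1)

# Hypothetical eigenvalues of a 15 x 15 correlation matrix (they sum to 15).
eigenvalues = [3.6, 2.4, 1.6, 1.3, 1.0, 0.9, 0.8, 0.7,
               0.6, 0.5, 0.5, 0.4, 0.3, 0.2, 0.2]
rate, cum = cumulative_contribution(eigenvalues)
m = components_needed(eigenvalues, threshold=0.60)
```

With these illustrative eigenvalues, four components explain slightly less than 60% of the variance, so the fifth component is the first to bring the cumulative rate above the threshold.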
The loading matrix of these five principal components after rotation is shown in Table 5. A risk indicator is generally attributed to a component when its coefficient on that component is greater than 0.5 to 0.6.
The above data were processed to obtain Table 6, in which gray shading marks the coefficients greater than 0.58. Each principal component is named according to the indicators marked in this table.
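The marking of coefficients above the 0.58 cutoff can be sketched as a simple thresholding of the rotated loading matrix. The sketch assumes that the absolute value of the coefficient is compared with the cutoff, which the text does not state; the function name, indicator names, and loading values are hypothetical.

```python
import numpy as np

def assign_indicators(rotated_loadings, names, threshold=0.58):
    """Group indicator names by the components on which they load strongly."""
    L = np.asarray(rotated_loadings, dtype=float)
    groups = {}
    for j in range(L.shape[1]):
        # Assumption: compare |loading| with the cutoff.
        rows = np.flatnonzero(np.abs(L[:, j]) > threshold)
        groups[f"F{j + 1}"] = [names[i] for i in rows]
    return groups

# Hypothetical rotated loadings for four indicators on two components.
names = ["indicator_1", "indicator_2", "indicator_3", "indicator_4"]
loadings = [[0.85, 0.10],
            [-0.72, 0.20],
            [0.15, 0.90],
            [0.40, 0.57]]
groups = assign_indicators(loadings, names)
```

In this illustration `indicator_4` is not attributed to any component because neither of its coefficients exceeds the cutoff.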
The fifth item (power coverage rate) and the 12th item (urbanization rate) in ...
The coefficient of item 10 (population growth rate) in $F_4$ is relatively large, and $F_4$ is called "the main component of population development trend".
The coefficient of item 7 (traffic accident rate) in $F_5$ is relatively large, and $F_5$ is called "main component of traffic safety index".
In summary, the five principal components of national line selection risk are shown in Table 7.
Next, each column of the loading matrix is divided by the square root of the variance of the corresponding principal component to obtain the coefficients of each principal component, and the resulting matrix is denoted $A$. The variances of the principal components are then normalized to give the weight of each principal component, and these weights form matrix $B$. Let the standardized data be matrix $X$. The composite score of the sample countries is then

$$S = X \times A \times B$$
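The construction of $A$, $B$ and the composite score $S$ can be sketched as follows. This is a minimal illustration of the matrix operations described above; the function name `composite_scores` and the argument layout are assumptions, and the component variances are supplied by the caller.

```python
import numpy as np

def composite_scores(X_std, loadings, variances):
    """Compute S = X A B.

    X_std:     (n, p) standardized data matrix X
    loadings:  (p, m) loading matrix of the m retained components
    variances: (m,)   variance of each retained component
    """
    X_std = np.asarray(X_std, dtype=float)
    loadings = np.asarray(loadings, dtype=float)
    variances = np.asarray(variances, dtype=float)
    # A: each loading column divided by the square root of its variance.
    A = loadings / np.sqrt(variances)
    # B: variances normalized to sum to 1, used as component weights.
    B = variances / variances.sum()
    # One composite score per row (country) of X_std.
    return X_std @ A @ B
```

Because the standardized data have zero column means, the scores `S` average to zero, which is why a negative score simply means a value below the sample average.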
The resulting scores are not on an intuitive scale: for example, the score of South Korea is 1.6243, and some scores are negative, which only indicates that they are lower than the average. Since such a score does not allow a country's risk to be judged intuitively, it is converted into a familiar percentage-system score [8]. The score is first mapped into the interval (0, 1) and then converted to a percentage score, as shown in Table 8. In this way, the principal component scores of each country are calculated and combined into a risk score, and by ranking these scores the risk level of a candidate line can be determined quickly and clearly and the corresponding line selection result obtained.

Conclusions
The principal component analysis method determines the weights from the amount of information in the sample and the systematic effect of the sample contained in the indicators, which avoids the arbitrariness of expert scoring and subjective judgment, and the risk ranking of each country can be seen directly from the results.
All these indicate that this is a more practical method for railway risk assessment and route selection.
This article lacks a control experiment. Subsequent studies will apply other risk assessment methods and compare their results with those of principal component analysis to further verify the accuracy of the method.