Spatial Statistical Analysis and Comprehensive Evaluation of High-Tech Industry Development

After 30 years of economic development, the high-tech industry has come to play an important role in China’s national economy. Its development leads the transformation of China’s economy from “investment-driven” to “technology-driven”. The high-tech industry represents the future direction of industrial development and promotes the transformation of traditional industries, and its rapid development is key to social progress. In this paper, traditional statistical models are combined with principal component analysis and spatial analysis, and the R language is used to display the analytical results intuitively on maps. Finally, a comprehensive evaluation is established.


Research Purpose
High-tech industries are playing an increasingly important role in the optimization of regional industrial structure. Spatial statistics analyzes statistical information through spatial position. Compared with traditional statistical methods, spatial statistical analysis takes spatial position and neighborhood relationships into account in correlation analysis, which gives it an advantage in revealing geographical characteristics and spatial patterns. This paper combines principal component analysis with spatial analysis and uses the R language [1] to display the analysis results intuitively on spatial maps.

Research Status
Spatial statistical analysis was originally proposed by the South African geologist Krige, and many foreign scholars have since carried out extensive and in-depth research on it.
From the 1970s to the 1990s, Cliff and Griffith [2] conducted many meaningful explorations on testing spatial autocorrelation and also established spatial models. Anselin and Getis [3] [4] studied local spatial statistical indicators, including ESDA (Exploratory Spatial Data Analysis), LISA (Local Indicators of Spatial Association) and the Moran scatterplot. J.R. Friedman [5], based on the theory of unbalanced regional economic development, divided the spatial structure of the economic layout into two systems, center and periphery, emphasizing that innovation elements gradually spread to the periphery while strengthening the development capacity and vitality of the "center". Since the 1990s, regional economic research has entered the stage of new spatial economics, in which the new economic geography theories of P.R. Krugman [6], Masahisa Fujita [7] and A. J. Venables [8] have become mainstream. In 2000, Okabe et al. [9] discussed spatial interpolation, spatial models and spatial point pattern analysis from the perspective of spatial tessellation, and Lloyd et al. [10] systematically summarized the statistical analysis of local models and spatial point patterns. Wu Yuming [11] [12] used the constant-coefficient spatial autoregressive model, the spatial error model and the geographically weighted regression model to conduct a quantitative analysis of overall R&D and innovation in China's provinces. Chen Hongchuan [13] built an evaluation index system for the technological innovation capability of high-tech industries and evaluated that capability with a data mining method (K-means clustering).
Zheng Shuwang and Xu Zhenlei [14] used a partial least squares regression model to analyze the factors influencing the technological innovation capability of high-tech industries in the three northeastern provinces, and established a comprehensive evaluation index system for technological innovation capability. Fan Decheng and Du Mingyue [15] used the TOPSIS grey relational projection method to calculate the closeness degree of the grey relational projection, introduced a coordination degree model to evaluate the level of coordinated regional development, and carried out a quadratic weighted calculation. Wu Yanxia and Zhou Chunguang [16] used a spatial panel Durbin model to test the relationship between financial agglomeration, its spatial spillover, and the development of high-tech industries. The results show that financial agglomeration in the Guangdong-Hong Kong-Macao Greater Bay Area promotes the development of high-tech industries.
2) The research results of this paper will be applied in the management and decision-making of the high-tech industry in China. Through research on industrial structure and regional distribution, some feasible countermeasures and suggestions are put forward.

Innovation of This Paper
3) This paper introduces spatial correlation effect analysis into domestic industrial statistical analysis for the first time, using the theory and methods of spatial statistical analysis together with an integrated statistical model. We conduct a systematic, multi-dimensional empirical test and analysis of China's high-tech industry as a whole.

Construction of Evaluation Index System and Data Collection
To conduct comprehensive analysis, it is necessary to construct the evaluation index system. Generally speaking, the principles to be followed in constructing the comprehensive evaluation index system are as follows: 1) Principle of system comprehensiveness.

4) Principle of flexibility and operability.
To fully reflect the state of the national high-tech industry and to understand the impact of regional differences on industrial development, regions need to be classified according to the similarity of their high-tech industry development. After collecting and collating the relevant officially released national data, we selected 10 evaluation indexes in 4 categories, including production and operation conditions, as shown in Table 1.

Basic Analysis and Spatial Display of Data
According to the four types of statistical indicators, we draw a star plot of the 2019 development data for each province; the data, with monetary values in billion yuan, are shown in Table 2 and Table 3. The classification of regions into four classes is based on the traditional quartile method.
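The quartile classification can be sketched as follows. This is only an illustration in Python (the paper itself works in R): the function name, the region names and the values are hypothetical, and the choice of the "inclusive" quartile definition is an assumption, since the text does not say which quartile convention was used.

```python
from statistics import quantiles

def quartile_classes(values):
    """Assign each value to one of four classes using the sample quartiles."""
    # Assumption: "inclusive" quartiles; the paper does not state its convention.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    labels = []
    for v in values:
        if v <= q1:
            labels.append("low")
        elif v <= q2:
            labels.append("relatively low")
        elif v <= q3:
            labels.append("relatively high")
        else:
            labels.append("high")
    return labels

# Hypothetical indicator values for eight made-up regions (not the paper's data).
demo = {"A": 3, "B": 12, "C": 45, "D": 7, "E": 150, "F": 28, "G": 60, "H": 1}
classes = dict(zip(demo, quartile_classes(list(demo.values()))))

for region, label in classes.items():
    print(region, label)
```

Each class can then be mapped to one fill color on the spatial map, which is how the four value levels discussed in the following subsections would be displayed.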

1) Spatial maps of high-tech enterprises
It can be seen from Figure 1 that Tibet, Qinghai and Ningxia are low-value areas. In terms of regional distribution, except for Sichuan, the high-value areas are mainly distributed along the eastern coast and gradually give way to low-value areas toward the western inland.

4) Spatial map of total profit
As the spatial map of total profit shows, except for Sichuan, the high-value areas are mainly distributed along the eastern coast and gradually give way to low-value areas toward the western inland.

5) Spatial map of export delivery value

8) Spatial map of new product sales
As the map (Figure 8) shows, Guangdong is the high-value area for new product sales. Fujian, Zhejiang, Jiangsu, Shandong, Beijing, Shanghai, Tianjin, Hunan, Jiangxi, Liaoning and Sichuan are relatively high-value areas. Heilongjiang, Jilin, Hebei, Henan, Guizhou and Guangxi are relatively low. Xinjiang, Tibet, Qinghai, Ningxia, Inner Mongolia, Gansu and Yunnan are low. In terms of regional distribution, the high-value areas are mainly distributed in eastern China and gradually give way to low-value areas toward the western inland.

9) Spatial map of the number of patent applications
As the spatial map (Figure 9) shows, Guangdong has the highest number of patent applications. Fujian, Zhejiang, Jiangsu, Shandong, Sichuan, Shanghai, Beijing, Hubei, Hunan, Anhui, Henan and Liaoning are relatively high-value areas. Heilongjiang, Jilin, Hebei, Guizhou, Chongqing and Shanxi are relatively low. Xinjiang, Tibet, Qinghai, Ningxia, Inner Mongolia and Gansu are low. In terms of regional distribution, except for Sichuan, the high-value areas are mainly distributed in eastern China and gradually give way to low-value areas toward the western inland.

10) Spatial map of valid invention patents
It can be seen from Figure 10 that Guangdong has the highest number of valid invention patents. Zhejiang, Jiangsu, Shandong, Sichuan, Beijing, Shanghai, Tianjin, Hunan, Hubei, Anhui, Fujian and Liaoning are relatively high-value areas. Jilin, Heilongjiang, Henan, Guizhou, Guangxi and Yunnan are relatively low. Xinjiang, Tibet, Qinghai, Ningxia, Inner Mongolia and Gansu are low. In terms of regional distribution, the high-value areas are mainly distributed in eastern China and gradually give way to low-value areas toward the western inland. Sichuan province in the central part belongs to the high-value area and stands out on the map.

Principal Component Analysis Method
Principal component analysis (PCA) is a powerful tool for reducing the dimensionality of variables. Its basic idea is to recombine correlated indicators into a new, smaller set of mutually uncorrelated comprehensive indicators that replace the original ones. Mathematically, each new index is a linear combination of the original p indices. The first linear combination, namely the first composite index, is denoted y1. To make this linear combination unique, the variance of y1 is required to be the largest among all linear combinations with a unit-length coefficient vector. If the first principal component is not enough to represent all the information in the original p indices, a second principal component y2 is selected, with the requirement that the information already in y1 does not appear in y2, that is, Cov(y1, y2) = 0. Figure 11 shows a schematic diagram of principal component analysis. As shown in Figure 11, the data points plotted on the x1-x2 axes have a large spread along both x1 and x2, so both indices carry essential information about the data. However, if x1 and x2 are strongly correlated, so that the data are distributed mainly along the direction y1, a simple coordinate transformation can be carried out.
After this transformation, the information in the original data lies mainly along the y1 axis of the new coordinate system, while the projection onto the y2 axis is very small and can be ignored. If only y1 is selected for study, the dimensionality reduction is achieved.
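As a worked illustration of the two-variable case, assume (this is an assumption, not something stated in the text) that x_1 and x_2 are standardized to unit variance and have correlation r > 0. The correlation matrix, its eigenvalues and the two principal components are then:

```latex
\[
R = \begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix}, \qquad
\lambda_1 = 1 + r, \quad \lambda_2 = 1 - r,
\]
\[
y_1 = \frac{x_1 + x_2}{\sqrt{2}}, \qquad
y_2 = \frac{x_1 - x_2}{\sqrt{2}},
\]
\[
\operatorname{Var}(y_1) = 1 + r, \qquad
\operatorname{Var}(y_2) = 1 - r, \qquad
\operatorname{Cov}(y_1, y_2) = \tfrac{1}{2}\bigl(\operatorname{Var}(x_1) - \operatorname{Var}(x_2)\bigr) = 0 .
\]
```

The share of total variance carried by y_1 is (1 + r)/2. With r = 0.9, for example, y_1 alone accounts for 95% of the variance, which is why y_2 can be dropped when the correlation is strong.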
Figure 11. Schematic diagram of principal component analysis.

The case of multidimensional variables is similar to the two-dimensional case, except that we can only picture it in an abstract space, for example as a multidimensional ellipsoid. To reduce the dimension, principal component analysis first finds the principal axes of the ellipsoid and then takes the few axes that represent most of the information as new variables. As in the two-dimensional case, the principal axes of the higher-dimensional ellipsoid are perpendicular to each other; these mutually orthogonal axes are linear combinations of the original variables and are the principal components. The fewer principal components that need to be selected, the better the dimensionality reduction. Most researchers hold that the total length of the principal axes represented by the selected components should account for most of the total length of all principal axes, and some scholars suggest a share of more than 80%. This can be used as a basic rule.

Principal component analysis process:
1) Find the eigenvalues and eigenvectors of the correlation matrix;
2) Calculate the variance contribution rate and the cumulative variance contribution rate: the contribution rate of each principal component is the percentage of the total information in the original data that it carries;
3) Determine the principal components: let comp.1, comp.2 ... comp.p be the p principal components. If the total information contained in the first m principal components (i.e. their cumulative variance contribution rate) is not less than 80%, the first m principal components are used to reflect the original evaluation object.
A principal component analysis function is used to analyze the data, where X is the data frame, m is the number of principal components (default 2), and plot controls the principal component plot. When a logical parameter such as plot is TRUE, the function automatically calculates and outputs the corresponding result; when it is FALSE, that result is not calculated.
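The three-step process described in this section can be sketched in a few lines. The sketch is written in plain Python for illustration only and is not the R function the paper uses. All function names and the demo table are hypothetical. The eigenvalues are obtained by power iteration with deflation, a simple stand-in for a library eigen-solver that assumes distinct eigenvalues.

```python
import math

def correlation_matrix(rows):
    """Correlation matrix of a table given as a list of rows (one row per region)."""
    n, p = len(rows), len(rows[0])
    z = []
    for j in range(p):
        col = [row[j] for row in rows]
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        z.append([(v - mean) / sd for v in col])  # standardized column
    return [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1) for j in range(p)]
            for i in range(p)]

def eigen_symmetric(mat, iters=1000):
    """Step 1: (eigenvalue, eigenvector) pairs, largest eigenvalue first.

    Power iteration with deflation; adequate for a small symmetric positive
    semidefinite matrix with distinct eigenvalues, not a general solver.
    """
    p = len(mat)
    a = [row[:] for row in mat]
    pairs = []
    for _component in range(p):
        v = [1.0 + 0.1 * i for i in range(p)]  # arbitrary start vector
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _step in range(iters):
            w = [sum(a[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # remaining matrix is numerically zero
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * a[i][j] * v[j] for i in range(p) for j in range(p))
        pairs.append((lam, v))
        # Deflate: remove the component just found before the next pass.
        a = [[a[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    return pairs

def choose_components(eigenvalues, threshold=0.80):
    """Steps 2 and 3: contribution rates and the smallest m reaching the threshold."""
    total = sum(eigenvalues)
    rates = [lam / total for lam in eigenvalues]
    cumulative = 0.0
    for m, rate in enumerate(rates, start=1):
        cumulative += rate
        if cumulative >= threshold:
            return m, rates
    return len(rates), rates

# Hypothetical table: 6 regions x 3 strongly correlated indicators (not the paper's data).
rows = [[1, 2.0, 1.5], [2, 4.1, 2.9], [3, 6.2, 4.4],
        [4, 7.9, 6.1], [5, 10.1, 7.4], [6, 12.0, 9.2]]
eigs = [lam for lam, _vec in eigen_symmetric(correlation_matrix(rows))]
m, rates = choose_components(eigs)
print("components kept:", m)
```

Because the three demo indicators move almost in lockstep, a single component passes the 80% threshold here. With real indicator data, the eigenvectors returned in step 1 give the coefficients of each principal component.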

Example for Principal Component Analysis
Because the data cover several years, we take the data for 2019 as an example of principal component analysis.

1) find the principal components of the correlation matrix
2) determine the principal component
From Table 4, according to the principle that the cumulative variance contribution rate is greater than 80% and the variance (eigenvalue) of each retained component is greater than 1, two principal components are selected, and their cumulative variance contribution rate is 97.72%. In this case, m = 2.

3) principal component coefficient
Obtained from

Space Comprehensive Evaluation Based on Principal Components
Finally, the comprehensive score is computed by weighting: the weight of each principal component is its variance contribution as a proportion of the total variance contribution of the two selected principal components. The weighted sum of the component scores gives the comprehensive score of each province, from which the ranking is obtained.
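The weighting scheme just described amounts to F = (c1 F1 + c2 F2) / (c1 + c2), where F1 and F2 are a province's scores on the two principal components and c1, c2 are their variance contribution rates. A minimal Python sketch follows. The function names, region names, component scores and contribution rates are hypothetical and are not taken from the paper's tables.

```python
def comprehensive_scores(component_scores, contributions):
    """Weighted sum of component scores; weights are normalized contribution rates."""
    total = sum(contributions)
    weights = [c / total for c in contributions]
    return {name: sum(w * s for w, s in zip(weights, scores))
            for name, scores in component_scores.items()}

def rank(scores):
    """Rank 1 is the highest comprehensive score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}

# Hypothetical contribution rates of two components and made-up component scores.
contributions = [0.75, 0.25]
component_scores = {
    "Region A": (2.0, -0.4),
    "Region B": (0.5, 1.2),
    "Region C": (-1.0, 0.2),
}

scores = comprehensive_scores(component_scores, contributions)
ranking = rank(scores)
for name in sorted(ranking, key=ranking.get):
    print(ranking[name], name, round(scores[name], 3))
```

Normalizing by the sum of the selected contribution rates makes the weights add up to one, even though the selected components do not explain all of the variance.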
The comprehensive scores are listed in Table 7, and Figure 12 displays them on the map using the spatial statistical functions.

Comprehensive Evaluation of Principal Component Space from 2016 to 2019
The comprehensive evaluation results for 2016 to 2019 are shown in Table 8.
Based on the statistical research on the high-tech industry, this paper puts forward the following suggestions. First, establish a set of comprehensive indexes for the development of high-tech industries, making sure that the indicators of production and business operation, research and development, patents and fixed assets are feasible and effective. A comprehensive index has advantages over a single index and shows a region's level of industrial development more clearly. We therefore suggest establishing a comprehensive index for the development of the high-tech industry for the government's reference when making decisions.
Second, guide development by category according to regional differences. China can implement regionally classified development. Since Guangdong and Jiangsu lead the development of high-tech industries, the two provinces should serve as examples for other provinces. Guangdong and Jiangsu should have the courage to innovate and vigorously develop high-tech industries by extending their existing advantages. Northeast China should further accelerate the pace of industrial catch-up. In central China, industrial upgrading should be accelerated with Sichuan province as the center. The western region, represented by Tibet, Qinghai and Xinjiang, is sparsely populated and lacks an industrial foundation; it should therefore give up the development of high-tech industries and develop basic industries instead.
In addition, the development of the high-tech industry in Guangdong province has made great contributions to the country. However, the surrounding provinces remain relatively backward because their development is only weakly correlated with that of Guangdong. It is suggested that Guangdong, through industrial policy and other means, encourage and promote the transfer of its high-tech industry to surrounding areas, so as to strengthen the support for Guangdong's own development.
Moreover, financial agglomeration in the Guangdong-Hong Kong-Macao Greater Bay Area has the ability to promote the development of the high-tech industry.
Lastly, the western region should take the opportunity of the "One Belt and One Road" construction to expand its economic openness to the outside world and promote the coordinated development of domestic and foreign markets. The vast western region, including Inner Mongolia, Xinjiang, Tibet and Yunnan, should not only seize this construction opportunity but also actively participate in high-tech industry cooperation at home and abroad. Less developed areas in China should strengthen their innovation ability to raise the level of scientific and technological research for high-tech industry development.