Comprehensive Evaluation of Tourism Development Potential in Anhui Province Based on Cluster Analysis and Factor Analysis

To help improve the potential of urban tourism in Anhui Province, this paper carries out a comprehensive evaluation of the development level of urban tourism in the province. First, a comprehensive evaluation index system for the level of tourism development is constructed. Second, the 16 cities in Anhui Province are divided into four categories by cluster analysis, with Hefei showing the best tourism development. Factor analysis is then carried out to rank the prefecture-level cities; the top two cities in tourism development level are Hefei and Huangshan. Finally, the paper concludes that the tourism development level of the 16 cities in Anhui Province is unbalanced and that the tourism industry still has room to develop.


Introduction
Nowadays, with rising living standards, travel has become a way to relax and experience life. Tourism can drive economic development. The tourism potential of a city is related to its transportation, accommodation, scenic spots, food, shopping and so on, and the tourism economy of a city is an important part of the city's economy (Dong & Wu, 2016). In terms of the quantitative measurement of tourism competitiveness, ShiMin Fang et al. constructed an evaluation index system of tourism competitiveness, calculated the tourism competitiveness of eight cities and prefectures (forest regions) in the western Hubei circle using factor analysis, and then analyzed the changes in tourism competitiveness over the preceding 10 years (Fang & Lan, 2017). Ning Cao et al. analyzed the key factors affecting tourism competitiveness, established an evaluation model of urban tourism competitiveness on this basis, and concluded that the grouping of the 14 cities in Liaoning Province by tourism competitiveness differed from the common-sense understanding of it (Cao & Guo, 2005). Nina Qu established an index system for the evaluation of tourism competitiveness, weighted the evaluation indexes by means of the coefficient of variation and the analytic hierarchy process, and evaluated the tourism competitiveness of the region using a weighted-sum multi-index comprehensive evaluation model (Qu, 2016). ShiCheng Deng conducted a comprehensive quantitative evaluation and comparative analysis of the tourism destination competitiveness of 38 districts and counties in Chongqing based on principal component analysis and K-means clustering (Deng, 2020). In the analysis of tourism competitiveness, however, few studies combine cluster analysis with factor analysis.
Based on this, this paper first clusters the 16 cities in Anhui Province according to their tourism development level, and then conducts factor analysis to comprehensively evaluate the tourism development potential of Anhui Province.

The Construction of Index System
Many factors affect tourism, and the level of tourism development involves multiple dimensions such as economy, society and environment, so there are numerous candidate indicators of tourism industry potential. After consulting the relevant literature, this article selects 21 indicators covering four aspects (tourism demand potential, tourism supply potential, tourism support potential and tourism guarantee potential) and constructs a comprehensive evaluation index system of tourism industry potential. The evaluation indicators are shown in Table 1.

Data Source and Preprocessing
The author selects 21 indicators from four aspects, namely tourism demand potential, tourism supply potential, tourism support potential and tourism guarantee potential, to reflect the tourism industry potential of Anhui Province. The data come from the Statistical Yearbook of Anhui Province in 2019 and are taken to be accurate and valid. To eliminate the effect of differing dimensions (units), the data are standardized. Suppose there are $n$ objects and $p$ indexes. The value of the $j$th index of the $i$th object is $X_{ij}$, and the standardized index value is
$$X_{ij}^{*} = \frac{X_{ij} - \bar{X}_j}{\sqrt{\operatorname{var}(X_j)}}, \quad i = 1, \dots, n, \; j = 1, \dots, p,$$
where $\bar{X}_j$ is the mean of the $j$th index and $\operatorname{var}(X_j)$ is the variance of the $j$th index.
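The standardization step can be illustrated with a minimal Python sketch. The numbers below are hypothetical values for three cities and two indicators, not data from the yearbook, and the use of the sample variance (divisor n - 1) is an assumption, since the text does not state which variance estimator was used.

```python
from statistics import mean, stdev


def standardize(rows):
    """Column-wise z-scores: (value - column mean) / column standard deviation."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Assumption: sample standard deviation (divisor n - 1).
    stds = [stdev(col) for col in columns]
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical raw values: rows are cities, columns are indicators.
raw = [
    [120.0, 3.5],
    [80.0, 2.0],
    [100.0, 5.0],
]
z = standardize(raw)
```

After this step every indicator has mean 0 and unit standard deviation, so indicators measured in different units contribute on a comparable scale to the distance calculations used in clustering.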

Research Thought
Based on the comprehensive evaluation index data of tourism industry potential of 16 cities in Anhui Province, the 16 cities were divided into four categories according to the level of tourism development by systematic clustering method, and the tourism industry potential of the four categories of regions was analyzed.

Research Method
Systematic (hierarchical) clustering method: based on the distances between samples, systematic clustering merges samples that are close together into one category first and samples that are far apart later, until all the samples are merged into a single cluster. In cluster analysis, the most commonly used distance is the squared Euclidean distance, so it is selected here to measure the distance between samples. For samples $i$ and $k$ it is calculated as
$$d_{ik}^{2} = \sum_{j=1}^{p} \left( X_{ij}^{*} - X_{kj}^{*} \right)^{2},$$
where $X_{ij}^{*}$ denotes the standardized value of the $j$th index for sample $i$ and $p$ is the number of indexes.
Ward's sum of squared deviations method: first, each of the $n$ samples forms its own class, so that the total sum of squared deviations is $S = 0$; then, at each step, the two classes whose merger yields the smallest increase in the total sum of squared deviations are merged, and the cycle continues until all samples are grouped into one class.
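The clustering procedure described above can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not the SPSS routine used in the paper: the function names and the six toy points are hypothetical, and the merge cost uses the standard identity that merging two classes raises the total sum of squared deviations by n_a * n_b / (n_a + n_b) times the squared Euclidean distance between their centroids.

```python
from itertools import combinations


def centroid(points):
    """Mean vector of a list of equal-length points."""
    return [sum(coord) / len(points) for coord in zip(*points)]


def sq_euclidean(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ward_increase(data, a, b):
    """Increase in the total sum of squared deviations if classes a and b merge."""
    ca = centroid([data[i] for i in a])
    cb = centroid([data[i] for i in b])
    return len(a) * len(b) / (len(a) + len(b)) * sq_euclidean(ca, cb)


def ward_clusters(data, k):
    """Start from singletons and merge by Ward's criterion until k classes remain."""
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > k:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: ward_increase(data, pair[0], pair[1]),
        )
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


# Hypothetical standardized scores for six objects on two indicators.
points = [
    [0.0, 0.1], [0.1, 0.0],   # a tight pair
    [2.0, 2.1], [2.1, 2.0],   # a second tight pair
    [5.0, 5.0],               # an isolated object
    [-3.0, 4.0],              # another isolated object
]
groups = ward_clusters(points, 4)
```

Each class is a list of row indices into `points`. Stopping at `k = 4` mirrors the four-category cut used in this paper; continuing to `k = 1` reproduces the full merge sequence that a dendrogram displays.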

Analysis of Clustering Results
According to the selected indicators of tourism demand potential, tourism supply potential and the other aspects, the 2019 data are clustered using SPSS software, and the clustering results are shown in Figure 1. According to Figure 1, the 16 cities in Anhui Province can be divided into four categories. As shown in Table 2, Hefei forms a category by itself, as does Huangshan; Ma'anshan, Wuhu, Lu'an, Xuancheng, Anqing and Chizhou are grouped together; and Huaibei, Tongling, Bengbu, Huainan, Bozhou, Suzhou, Fuyang and Chuzhou form the fourth group.

Research Thought
Factor analysis groups highly correlated variables together and places weakly correlated variables in different groups. Each group of variables is represented by a latent composite variable, the common factor. In this way a large number of variables is reduced to a few principal factors, which reflect the correlation between the original variables and the principal factors.

Research Method
The idea of R-type factor analysis is to decompose each original variable into two parts: one part is composed of a few factors common to all variables, namely the so-called common factor part; the other part is the factor that each variable has alone, that is, the so-called unique factor part.
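The model can be written as
$$X_i = u_{i1} F_1 + u_{i2} F_2 + \cdots + u_{im} F_m + \varepsilon_i, \quad i = 1, 2, \dots, p,$$
where $F_1, F_2, \dots, F_m$ are the common factors and $\varepsilon_i$ is the unique (special) factor of $X_i$.
It can also be expressed in matrix form as
$$X = UF + \varepsilon,$$
where $U = (u_{ij})_{p \times m}$ is the factor loading matrix and $u_{ij}$ is the loading of the $i$th variable on the $j$th factor. The smaller the absolute value of $u_{ij}$, the weaker the correlation between the original variable and the common factor, and vice versa.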

1) KMO and Bartlett's sphericity test results
As shown in Table 3, the KMO value computed in SPSS is 0.792, and the p-value of Bartlett's test of sphericity is reported as 0.000 (< 0.01). Therefore, the null hypothesis that the variables are uncorrelated is rejected; the correlation between the variables is strong, and factor analysis is appropriate.
2) Factor extraction
In general, the number of common factors is determined by the principle that the cumulative variance contribution rate should exceed 85%. As shown in Table 4, the cumulative variance contribution rate of the first four principal components reaches 88.041%, with individual variance contribution rates of 54.05%, 18.22%, 10.04% and 5.73%, respectively. It is therefore appropriate to extract four common factors, denoted $F_1$, $F_2$, $F_3$ and $F_4$. These common factors account for 88.041% of the information in the original indicators of tourism development level for the 16 cities in Anhui Province, so a comprehensive evaluation based on the extracted common factors performs well.
3) Factor rotation and naming
To make the loadings of the principal factors easier to interpret, the factors are rotated using orthogonal rotation, the most commonly used rotation method. The rotated matrix is shown in Table 5.
As can be seen from Table 5, domestic tourism income, highway passenger volume, number of travel agencies, GDP and other indicators have relatively high loadings on $F_1$, so the common factor $F_1$ is named the tourism demand potential factor. The number of inbound tourists, the number of star-rated hotels, the number of tourist rooms and the number of hotel beds have higher loadings on $F_2$, so the common factor $F_2$ is named the tourism supply potential factor. Evaluation indexes such as per capita GDP and per capita disposable income of rural residents have higher loadings on $F_3$, so the common factor $F_3$ is named the tourism support potential factor. Evaluation indexes such as the green coverage rate of built-up areas and per capita park green area have higher loadings on $F_4$, so the common factor $F_4$ is named the tourism guarantee potential factor.
4) Calculating factor scores
The regression estimation method is then used to calculate the factor scores; the factor score coefficient matrix is shown in Table 6.
According to the factor score coefficients in Table 6, the factor score functions are obtained, and the comprehensive score of each of the 16 cities in Anhui Province can then be calculated as a weighted sum of its factor scores. The weight of each factor is the ratio of its variance contribution rate to the total variance contribution rate of the extracted factors. The scores on the common factors and the comprehensive scores of the 16 cities in Anhui Province are shown in Table 7. The lower the comprehensive score, the lower the level of tourism development and the weaker the potential of the city; the higher the score, the stronger the potential.
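The 85% rule and the weighting scheme can be made concrete with a short Python sketch. The variance contribution rates are the four values quoted in the text; the function names and the factor scores of the example city are hypothetical. Whether the paper weights by these rates or by the contribution rates after rotation is not stated, so the weights below are an assumption.

```python
def choose_n_factors(rates, threshold=85.0):
    """Smallest number of leading factors whose cumulative variance
    contribution rate (%) exceeds the threshold, or None if never reached."""
    cumulative = 0.0
    for count, rate in enumerate(rates, start=1):
        cumulative += rate
        if cumulative > threshold:
            return count
    return None


def factor_weights(rates):
    """Weight of each factor: its rate divided by the total rate of all kept factors."""
    total = sum(rates)
    return [rate / total for rate in rates]


def composite_score(factor_scores, weights):
    """Comprehensive score as the weighted sum of factor scores."""
    return sum(w * f for w, f in zip(weights, factor_scores))


# Variance contribution rates (%) of the first four factors, as quoted in the text.
RATES = [54.05, 18.22, 10.04, 5.73]

n_factors = choose_n_factors(RATES)
weights = factor_weights(RATES[:n_factors])

# Hypothetical factor scores (F1..F4) for one city; not taken from Table 7.
hypothetical_city = [1.2, -0.3, 0.5, -0.8]
score = composite_score(hypothetical_city, weights)
```

Because factor scores obtained by regression are centered on zero, a composite score built this way is also centered on zero, so a negative comprehensive score indicates a city below the provincial average rather than a negative quantity of anything.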
In terms of the overall development level of tourism in Anhui Province, the scores are low, indicating that the development potential of tourism in Anhui is currently low. In terms of individual cities, the level of tourism development is uneven, with large gaps between cities.
According to Table 7 and the categories obtained from the cluster analysis, Hefei is the area with strong tourism industry potential. Hefei has an absolute competitive advantage in tourism development: its comprehensive score is far higher than that of the other 15 cities, leaving a clear gap between it and the rest of the province.
Among the 16 cities, Hefei ranks first, third and third on the factors $F_1$, $F_2$ and $F_3$ (tourism demand potential, tourism supply potential and tourism support potential, respectively), and first in comprehensive strength. Hefei, known as "the throat of Huaiyou and the lips of Jiangnan", has a history of more than 2000 years and is rich in tourism resources, with 59 tourist attractions rated A-level or above. These include Sanhe Town, a 5A scenic spot, as well as Hui Garden, temples and other notable scenic and historic sites such as the Laoshan Island scenic area. Together with its rapid economic growth in recent years and its reputation as a "garden city" and "green city", this places Hefei in a leading position in tourism development (Luo et al., 2020). Its tourism guarantee potential factor $F_4$ ranks only 10th, mainly because Hefei, as the capital of Anhui Province, has a large population influx but limited land resources and only an average green coverage rate; the large population leaves fewer tourism and living-environment resources per capita.
The area with relatively strong tourism industry potential is Huangshan, whose tourist attractions are well known at home and abroad and whose tourism facilities are complete and services good; it is a key area supporting the development of tourism in Anhui (Sun, 2014). Its most famous tourism resource is Mount Huangshan, a UNESCO World Cultural and Natural Heritage site that is renowned at home and abroad and reputed to be first in the world. It has become the symbol of tourism in Anhui Province and lays the foundation for the city's tourism development.
Ma'anshan, Wuhu, Lu'an, Xuancheng, Anqing and Chizhou are the areas with moderate tourism industry potential; the tourism industry potential of these six cities ranks in the upper-middle range in Anhui Province. Ma'anshan ranks first in $F_3$ factor score, with a good economy and high per capita disposable income; however, its development is restricted by scarce tourism resources. Residents of Wuhu enjoy a good quality of life and a sound urban economic environment, but the city is in a transition stage and faces considerable challenges (Li & Chen, 2018). Lu'an and Xuancheng have ample tourism resources, but their overall economic strength is limited, which leads to only average tourism industry potential. Anqing has a large number of scenic spots rated A-level or above, rich tourism resources and a deep tourism culture, but no single indicator stands out, and the development potential of its tourism industry is moderate. Chizhou likewise has no outstanding single indicator, and its tourism development potential is average.
The regions with weak tourism industry potential are Huaibei, Tongling, Bengbu, Huainan, Bozhou, Suzhou, Fuyang and Chuzhou, whose comprehensive scores of tourism industry potential are all negative; they are the regions with a low level of tourism development. These eight regions are characterized by a lack of tourism resources, weak economic competitiveness and imperfect tourism infrastructure, which restrict their tourism development and lead to low tourism potential.

Epilogue
In this paper, from the perspective of the tourism potential of Anhui Province, a comprehensive evaluation system of tourism potential is established using systematic cluster analysis and factor analysis (Gu et al., 2020), and the development of tourism in Anhui Province is evaluated. The result is that tourism development in Anhui Province is unbalanced and still needs to be improved. Hefei has the best tourism development, whereas Huaibei, Tongling, Bengbu, Huainan, Bozhou, Suzhou, Fuyang and Chuzhou are relatively backward, so development at the regional level is unbalanced. The unbalanced development of regional tourism will not only affect the overall development of Anhui Province but also further widen regional differences in the long run and put environmental pressure on the regions with developed tourism, which is not conducive to the effective allocation of environmental resources. In the future, more attention should be paid to balancing tourism development across the 16 cities in Anhui Province. The government should introduce corresponding policies to encourage urban areas with fewer tourism resources and insufficient driving force to increase investment in tourism, fully tap the tourism development potential of Anhui Province, focus on cultivating areas with greater development potential, increase investment in tourism in backward areas, and build an optimized and balanced regional tour-