Estimation of Poverty Based on Remote Sensing Image and Convolutional Neural Network

Poverty has long been a concern of governments and researchers all over the world, especially in developing countries. Remote sensing images are widely used in poverty estimation because they offer large-area observation, timeliness and periodicity. In this study, we explore the applicability of convolutional neural networks (CNNs) combined with remote sensing images to regional poverty estimation. In estimating the 2016 economic indicators of Guizhou Province, China, the Pearson coefficient (R²) of per capita GDP (PCGDP) reached 0.76, which means that the image features extracted by the CNN can explain up to 76% of the variation in county-level PCGDP. Compared with other methods, our method still achieves high precision. Based on these results, we conclude that a convolutional neural network combined with remote sensing images can be used for regional poverty estimation.


Introduction
Poverty has long troubled governments all over the world, especially in developing countries. Analyzing the conditions and causes of poverty is important for policy makers and researchers, because it helps to reduce poverty. Traditional poverty measurement depends mainly on ground survey data [1]. However, such data are time-consuming and expensive to obtain [2], and some countries have not even collected them [3].
Because remote sensing data provide large-scale, multi-temporal and multi-resolution surface information, they are widely used in regional poverty estimation. As a new kind of remote sensing data, nighttime light remote sensing data [...] [14]. DMSP-OLS data have some shortcomings, such as strong light saturation and coarse spatial resolution [15]. Compared with DMSP-OLS data, NPP-VIIRS data are calibrated and of better quality. In terms of spatial resolution, NPP-VIIRS data at 15 arc seconds (about 500 meters) are finer than DMSP-OLS data at 30 arc seconds (about 1000 meters) and can provide more information on artificial lighting at night in human settlements [16] [17]. Using DMSP-OLS data, Noor et al. [18] calculated the Pearson coefficient between the data and a household asset index; the best result was 0.64, which confirmed the correlation between DMSP-OLS data and socio-economic indicators. Li et al. [19] explored the potential of NPP-VIIRS night light images for regional economic modeling in China. Yu et al. [20] used NPP-VIIRS data to estimate poverty at the county level in China.
In a recent paper [21], features extracted from remote sensing images by trained convolutional neural networks (CNNs) were used to estimate poverty, explaining up to 75% of the variation in local economic and living indicators.
All the satellite image data used in this study are open and free, which is an important step toward using such satellite image data to estimate economic indicators without expensive and time-consuming ground statistical surveys. There is no assessment of transfer learning methods for predicting changes in economic well-being over time in specific regions. Perez et al. [22]

Dataset
The economic indicators are per capita gross domestic product (PCGDP), total retail sales of consumer goods (TRSCG) and general public financial budget revenue (GPFBR). Table 1 shows the data used in the experiment.

Methods
Our

Experiments and Results
In Section 3, ResNet-50 and FPN are used for the classification tasks. The classifier after the P2 feature layer is used to classify nighttime light intensity (NLI). Similarly, the classifiers after the P3 to P5 feature layers are used to classify NDVI, MNDWI and NDBI, respectively. The classification categories of the four kinds of data are determined by a Gaussian mixture model and are shown in Table 2. The input of the convolutional neural network is a Landsat 8 image and the output is the category of NLI, NDVI, MNDWI and NDBI. Table 3 shows the training accuracy and test accuracy of the four classification tasks.
After the model is trained, the output of the global average pooling layer in the convolutional neural network is extracted as the feature, and the 256-dimensional feature vectors of the four classification tasks are concatenated into a 1024-dimensional feature vector, which serves as the image feature extracted from the remote sensing image by the CNN. Finally, we use the economic indicator data from the statistical yearbook and the corresponding image features to train a ridge regression model that estimates the economic indicators. In the ridge regression model, we use principal component analysis (PCA) to reduce the feature dimension and avoid overfitting: the 1024-dimensional feature vectors are reduced to 100-dimensional feature vectors. 10-fold cross-validation was used to estimate the 2016 economic indicators of Guizhou Province, and the process was repeated 20 times. We took the average R² over the 20 repetitions as our final result.
In this study, we estimate three economic indicators: PCGDP, TRSCG and GPFBR. The Pearson coefficients (R²) of the three economic indicators are 0.76, 0.72 and 0.65, respectively, as shown in Figure 3, Figure 4 and Figure 5.
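The regression step described above (PCA to 100 dimensions, ridge regression, 10-fold cross-validation repeated 20 times, averaged score) can be sketched as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the ridge penalty `alpha`, the random seeds, the choice to fit PCA inside each fold, and the use of the squared Pearson correlation of cross-validated predictions as the score are all assumptions, since the text does not specify them. The helper `make_synthetic_features` is hypothetical and only stands in for the CNN features.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.pipeline import make_pipeline


def repeated_cv_r2(features, target, n_components=100, n_splits=10,
                   n_repeats=20, alpha=1.0, seed=0):
    """Average squared Pearson correlation of PCA + ridge over repeated K-fold CV.

    alpha, seed and per-fold PCA fitting are assumptions, not values from the text.
    n_components must not exceed the number of training samples in a fold.
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    scores = []
    for repeat in range(n_repeats):
        # PCA sits inside the pipeline, so it is refitted on each training fold.
        model = make_pipeline(PCA(n_components=n_components), Ridge(alpha=alpha))
        # A different shuffle per repetition gives a different 10-fold split.
        folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed + repeat)
        predicted = cross_val_predict(model, features, target, cv=folds)
        r = np.corrcoef(target, predicted)[0, 1]
        scores.append(r ** 2)
    return float(np.mean(scores))


def make_synthetic_features(n_samples=300, n_features=1024, n_latent=20, seed=0):
    """Hypothetical stand-in for CNN features: low-rank signal plus small noise."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n_samples, n_latent))
    mixing = rng.normal(size=(n_latent, n_features))
    features = latent @ mixing + 0.1 * rng.normal(size=(n_samples, n_features))
    target = latent @ rng.normal(size=n_latent) + 0.1 * rng.normal(size=n_samples)
    return features, target
```

Calling `repeated_cv_r2(features, target)` on a 1024-column feature matrix returns a single averaged score. Note that PCA with 100 components needs at least 100 training samples per fold, so the sketch applies only when the regression is run on enough samples.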
Compared with previous experimental results [21] [30], although the estimated economic indicators differ, the Pearson coefficient (R²) of our economic indicators falls in a similar range, indicating that our method can be applied to the estimation of economic indicators in Guizhou Province and that the estimated results are reasonable.
To test whether our method is better than other methods at estimating economic indicators, we also estimate the economic indicators directly from the nighttime light (NTL) data. The sum of night light of the cities, districts and special zones in Guizhou Province is used to estimate the three economic indicators with a linear regression model; the resulting R² values (Table 4) for PCGDP, TRSCG and GPFBR are 0.31, 0.58 and 0.5, respectively.
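The nighttime-light baseline can be illustrated with a short sketch: sum the light values within each region, then regress an indicator on that sum. The function names, the use of a label raster to define regions, and the in-sample R² are assumptions made for illustration; the text does not say how the sums were computed or whether the baseline R² was cross-validated.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def sum_light_by_region(light, region_ids):
    """Sum nighttime-light pixel values within each region of a label raster."""
    light = np.asarray(light, dtype=float)
    region_ids = np.asarray(region_ids)
    labels = np.unique(region_ids)
    totals = np.array([light[region_ids == label].sum() for label in labels])
    return labels, totals


def baseline_r2(total_light, indicator):
    """In-sample R^2 of a one-variable linear regression of an indicator on total light."""
    x = np.asarray(total_light, dtype=float).reshape(-1, 1)
    y = np.asarray(indicator, dtype=float)
    return LinearRegression().fit(x, y).score(x, y)
```

For a single predictor with an intercept, this in-sample R² equals the squared Pearson correlation between total light and the indicator.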

Conclusions
The present study demonstrates that CNNs combined with remote sensing images can be used to estimate poverty and identify regional poverty, especially the estimation of [...]. Although this study shows that the combination of CNN and remote sensing images achieves high accuracy in regional poverty estimation, further study is still needed on the applicability of deep learning to poverty estimation in different regions.