Prediction of Meteorological Drought in the Lower Nu River by Statistical Model

Global climate change, temperature rise and some kinds of extreme meteorological disaster, such as the drought, threaten the development of the natural ecosystem and human society. Forecasting in drought is an important step toward developing a disaster mitigation system. In this study, we utilized the statistical, autoregressive integrated moving average (ARIMA) model to predict drought conditions based on the standardized precipitation index (SPI) and standardized precipitation evapotranspiration index (SPEI) in a major tributary in the lower reaches of Nu River. We employed data from 2001 to 2010 to fit the model and data from 2011 to 2013 for model validation. The results showed that the coefficients of determination (R 2 ) was over 0.85 in each index series, and the root-mean-square error and mean absolute error were low, implying that the ARIMA model is effective and adequate for this region.

moisture in a region for a long time. Insufficient long-term precipitation in meteorological factors is usually the main reason for the formation of drought. Determining when droughts occur and end is often difficult because the effects of droughts can sometimes last for months (Li et al., 2015), and sometimes for years. Another challenge and the fundamental aspect of drought research are determining its occurrence and severity. To quantify the extent of drought, various drought indices have been developed. For example, research on meteorological drought includes relative humidity index, percentage of precipitation anomaly, standardized precipitation index (SPI), and Palmer index (PDSI), etc. (Dai, 2013;Li & Zhou, 2014). Among them, there are some algorithms for improving and perfecting these drought indexes. Vicente-Serrano et al. (2010) proposed a standardized precipitation evapotranspiration index (SPEI) based on the SPI index and added consideration of evaporation effects. In addition, the indexes that use river runoff as indicators to describe drought include runoff drought index (SDI), standard runoff index (SRI), standard hydrological index (SHI), etc. (Sharma & Panu, 2014). Tsakiris et al. (2013) proposed a method that can systematically assess the interactions between meteorology, runoff drought, and economic and environmental impacts on a river basin scale.
In the past ten years, with the increasing trend of drought on a global scale, southwest China and surrounding countries have also been affected (He et al., 2011). In particular, since the fall of 2009, the effects of successive droughts have led to large-scale reductions in crop yields and even crop failure. Many reservoirs and small lakes are on the brink of dryness. Living difficulties of mountainous people and agricultural water are further exacerbated, and the ecological environment is greatly affected. It is located in the lower reaches of the Nu River basin in the mountains of southwestern China, and it is not only an important food production base in western Yunnan province, but also a large part of it is a hotspot for international ecological protection. The terrain in this area is dominated by river valleys and mountains distributed north-south direction. Due to the "channel" of the longitudinal valleys and the "barrier" of the mountains, the climate and environment change are complex, and it is an ecologically sensitive and fragile area in the plateau and mountains (Becker & Bugmann, 2001). At present, the research on Lancang-Mekong River, Red River and other river basins near the Nu River shows that under the interactive influence of climate change and human activities, the hydrological process of the basin has changed, causing a series of ecological and hydrological problems, which have attracted widespread attention from the international community (Giang et al., 2013;He et al., 2014). At present, there are some studies on the characteristics of changes in precipitation, runoff and other factors in the Nu River basin (Fan & He, 2012;Liu & He, 2013), but few study done on the drought (Xu, 2017), A few related studies show that the meteorological drought in the lower reach of the Nu River showed an increasing trend from 1966 to 2013, but it reflects multi-scale periodic characteristics (Chen el al., 2019). In particular, the meteorological drought at Province. Try to predict future drought conditions. Provide scientific support for regional drought monitoring, industrial restructuring, and researches on the transboundary water resources changes in international basin.

Study Area
The study area, the Minbo River Basin, is located in Yunnan Province, China, and is a primary tributary to the lower reaches of the Nu River ( and walnut are also grown on the mountains (Li & Liu, 2004;Jiang, 2006 2012, due to the control of westerly airflow, the rainfall in the whole region was significantly less during the rainy season, and the range affected by drought was wider and the severity was higher. Among them, the Longyang District of Baoshan City in the upper reaches of the Nu River Basin had a rainfall of 100mm in 2012, which was 91 mm less than the same period of previous years. Due to the uneven distribution of rainfall areas and rainfall periods, and the cumulative effect of the effects of less precipitation for three years, the drought situation is more severe, reaching the severe drought level.

SPI and SPEI
The SPI is one of the main indicators recommended by the World Meteorologi-

ARIMA and Seasonal ARIMA
The commonly used forecast tools for the time series is a data-driven methods, namely Autoregressive Integrated Moving Average (ARIMA) model (Hao et al., 2018). For a certain time series (such as drought indicator series), ARIMA emphasizing the stochastic properties of the series rather than its constructing single. Each variable in the model is represented by the lag of the series and the stochastic error. The common ARIMA model consists of a p-order AR (autoregressive) model, a q-order MA (moving average) model, and a d-order difference operator on the time series (Box et al., 2015). This can be expressed as ARIMA (p, d, q). Box  ARIMA model has low requirements for input data, few model parameters, and fast optimization speed. It is well adapt to non-stationary data (Alam et al., 2014). Based on these advantages, this model is more suitable for hydrological prediction in the area with lacking data. However, because the parameters of the model are based on the change rule of historical data, when the climate of the study area changes significantly, the prediction accuracy will decrease, also the prediction period should not be set too long (Yeh & Hsu, 2019).

Model Development
The time series modeling method is used to fit the SPI and SPEI sequences. The

Model Recognition and Parameter Estimation
The construction of ARIMA/SARIMA model requires the time series to be a stationary process. Therefore, the first step in model identification is to perform a stationary test on the original time series. Take the 2001-2013 time series and use the ADF test method to check the sequence stability. If the sequence is unstable, perform differential processing and repeat the test process until the sequence is stable (Table 2).
In the model recognition stage, ARIMA/SARIMA models were selected as candidate models, and the grid search method was used for parameter optimization. In the optimization process, AIC (Akaike information criterion) and SBC (Schwarz-Bayes information criterion) are used for judgment: models with lower AIC and SBC values are considered to have better fitting effects. Calculations for AIC and SBC may refer to related literature (Akaike, 1974;Schwarz, 1978). Finally, after identifying the model, the parameters of the model need to be estimated. In this study, the recommended maximum likelihood method (Box et al., 2015), which is often used for statistical model parameter estimation, is used to estimate model parameters.

Diagnostic Checking
After the parameter estimation is completed, a diagnostic check is performed.
The statistical check of the residuals is mainly performed to verify whether the model is suitable for time series. For a well-performing model, the residuals must satisfy independence, obey normal distribution, and homogeneity of variance. That is, the residual must be a white noise process. Some statistical tests and residual plots are used for diagnostic tests. The residual autocorrelation function (RACF) and residual partial autocorrelation function (RPACF) of the time series can be used to determine whether the series is independent. If the ACF and PACF of the residuals are significant within the confidence interval, it indicates that there is no significant correlation between the residuals. Another method is to use Ljung-Box-Pierce (LBQ) to test. If the Q statistic is less than (α = 0.05 level) critical value, the residual error is white noise. This paper uses Ljung-Box-Pierce (LBQ) to test the independence of residuals. Normality of residuals is verified by histograms and residual probability plots; the homogeneity of residuals is verified by scatter plots of residuals and predicted values. If there is no obvious pattern in the scatter plot and the residuals are randomly distributed around 0. It indicates that the residuals are homogeneous.

Drought Change Characteristics
According to previous studies, the SPI and SPEI time series have dropped significantly after 2000 (Chen el al., 2019). Here, we take the year 2000 as the breakpoint, and counted the frequency of occurrence of moderate or higher droughts (value < −1) in each month, which are represented by SPI and SPEI indicators, in the study area from 1966 to 2000 and from 2001 to 2013. There are some differences in the frequency of drought reflected by different indicators. In general, the drought frequency of SPEI is higher than that of SPI. Since the beginning of the 21st century, the frequency of droughts reflected by the two indicators has increased significantly from 1966 to 2000 (Figure 2). During the period from 1966 to 2000, the average frequency of moderate and higher droughts characterized by SPI and SPEI was 8.6% and 10.9%. However, after 2000, the average frequency of moderate and above droughts represented by SPI and SPEI was 24.8% and 32.4%, especially in the dry season months such as January-April and November-December. The frequency of moderate and above droughts even reached 29.1% (SPI) and 39.2% (SPEI). Figure 3 shows that the ACF curves of the SPI and SPEI index training samples are gradually attenuated in the form of a sine wave, indicating that the sequence contains periodic information. It is more appropriate to consider fitting with the SARIMA model. According to the AIC and BIC quasi-sides, the grid search method is used to set the search range to 0 -4. The SARIMA (p, 1, q) (sp, 1, sq) 12 model parameters are automatically optimized to obtain the minimum AIC The best model for BIC is shown in Table 3. The one with the fewest parameters is selected as the best model. The analysis of model coefficients, related standard errors, and standard error probabilities is shown in Table 4. It can be seen from the table that the selected SARIMA model performs well in both series.

Diagnostic Checking
The Ljung-Box-Pierce statistic was used to test the independence of the residuals.    The Q value is compared with the critical value of a significant level of α = 0.05 under the corresponding degrees of freedom. In all cases, the test results are not significant, indicating that the model's residuals are white noise (Table 5). Figure 5 depict the histograms (Histogram plus estimated density of standardized residuals, along with a normal (0,1) density) and normal QQ plots of the SPI and SPEI simulation residuals, respectively. The histogram

Figure 4 and
shows that the residuals are roughly concentrated around zero and are approximately normally distributed. The normal QQ plot of the residuals shows that the residuals are basically located on a diagonal, and also indicates the normal distribution of the residuals.
The residual homogeneity is tested by the residual scatter plot and predicted values. The results are shown in Figure 6. There are no specific patterns of scatter, and the residuals are randomly distributed. That is, the residuals are evenly distributed around the zero mean, which means that the model fits well.
W. H. Chen et al.    Here, we also compare the observations with the optimal simulation values of the series from 2008 to 2010 (intra-sample data). Figure 7 shows the comparison between the predictions and observations of SPI and SPEI series. It can be seen that the prediction and the observation series have similar characteristics and also, the 95% prediction uncertainties (95 ppu) shows a relatively narrower band. The performance indicators for the model are shown in Table 6. In general, the higher R 2 , the better the performance of the model. The results show that American Journal of Climate Change

Model Validation
The calculation was performed using the best-fit SARIMA model. The SPI and SPEI sequence values from 2011 to 2013 (out-of-sample data) were simulated.
See Figure 6 and Figure 7 for the comparison between observed and predicted series. After test the correlations between observation and prediction value, and checked the root mean square error (RMSE) for time span of 12, 24, and 36 months in advance, it was found that with the extension of the prediction period, the accuracy and uncertainty of the model decrease (Table 7) and increase ( Figure 7) significantly respectively.
Although the value of the hydrological drought index estimated using the SARIMA model is still uncertain, the intensity of the hydrological drought is mainly expressed by the index level rather than the absolute value of the index. Therefore, the model is useful in the prediction of short-term change and long-term trend of drought. Using these SPI or SPEI-based drought indices and the established SARIMA model for drought forecasting has certain application value for mitigating drought in the study area.

Conclusion
This paper compares the occurrence frequency of moderate and above-average droughts (values < −1) in each month as represented by the SPI and SPEI indicators in the lower reaches of the Nu River. It is found that since the beginning of the 21st century, the frequency of droughts reflected by the two indicators has increased significantly from 1966 to 2000, and the SPEI drought frequency is  higher than the SPI drought frequency. This provides further evidence for the fact that the increase in meteorological drought in the region was dominated by previous studies (Chen el al., 2019). Based on this, this paper uses the time series stochastic model (ARIMA) to simulate the drought process in this area, and can draw the following conclusions: 1) The temporal characteristics of meteorological drought in the lower reaches of the Nu River indicate that the region may experience drought in almost all months of the year (i.e., SPI < 0).
2) The stochastic model developed to predict drought has achieved good simulation results in predicting medium and short-term drought (within 12 months).
3) The linear stochastic model can be used to quickly predict watershed droughts with scarce data, analyze the severity of future droughts, and also be used by local governments and resource planners to predict drought severity in advance.
The stochastic model predicts the SPI and SPEI time series, which provide a convenient tool for forecasting meteorological drought in the lower reaches of the Nu River. However, due to many factors, changes in natural phenomena related to drought are complex. Stochastic models do not consider physical processes, so it is difficult to understand climate change from the physical mechanism behind it. In addition, the model assumptions may lead to higher uncertainty when we extend the prediction period. However, the model can be used to predict the short-term change and the future trends of the drought, and it may also be used as a reference for further research on other prediction models.

Conflicts of Interest
The authors declare no conflicts of interest regarding the publication of this paper.