NDVI-Derived Vegetation Trends and Driving Factors in West African Sudanian Savanna
1. Introduction
Vegetation is among the first elements to be altered when terrestrial ecosystems degrade [1]. It is sensitive to environmental impacts and is an indicator of global change [2]. Monitoring and assessing vegetation change is therefore a topic of high interest worldwide [3] [4] [5]. This is of particular importance in the Sudanian savanna of West Africa, one of the key vegetation biomes in the region. The savanna provides food and vital ecosystem services, contributes to the protection of soil resources and plays a paramount role in the energy, water and carbon balance [6]. Local populations depend on this ecosystem for their livelihoods through activities such as subsistence farming, livestock grazing and wood harvesting [7], which, coupled with climate vagaries, have led to severe vegetation degradation and habitat fragmentation [8].
Vegetation change can take the form of seasonal responses, inter-annual variability or directional change [9]. Directional change (e.g., an increasing or decreasing trend), the focus of this study, is perceived as the response of vegetation to anthropogenic and natural stressors. In West Africa, the severe droughts of 1972-1973 and 1983-1984, which dramatically affected vegetation cover, raised political interest in the efficient management of natural resources and drew the attention of scientists to the monitoring and analysis of vegetation change. The NDVI (Normalized Difference Vegetation Index), owing to its strong link with vegetation cover and production [10] [11], has been widely used to investigate change in West Africa’s vegetation cover, especially over large areas [10] [12] [13]. Despite numerous studies, vegetation trends in West Africa are still a matter of discussion [14] [15]. While some studies support greening trends [13], others indicate no trend [16] or browning [17]. Moreover, one of the key current challenges is to document consistently the underlying driving factors of vegetation change in this part of the world [18].
Driver analysis has relied on numerous methods, including correlation analysis. For example, [19] used NDVI trends and Spearman correlation to analyze the driving factors of vegetation degradation among topographic, climatic and accessibility variables in the Sudanian savanna of Burkina Faso. However, correlation analysis did not provide detailed information on the contribution of each factor. To address this deficiency, authors have opted for more sophisticated modelling (e.g., multiple regression). For example, based on a binary logistic regression model, [20] assessed land cover change drivers among biophysical and socioeconomic independent variables in Northern Ghana. [21] also used a logistic regression model to establish the relationship between socio-economic drivers and land cover change in the total wildlife reserve of Bontioli in Burkina Faso. However, logistic regression, as a parametric method, requires the data to meet specific assumptions (e.g., independence of observations), which are often difficult to satisfy. Recent studies have demonstrated the strength of non-parametric machine learning algorithms (MLAs) for modelling vegetation change and relating it to drivers [18]. MLAs have even been found to outperform traditional parametric algorithms, especially for complex data with many predictors [22]. MLAs are flexible and can process large volumes and different types of data. They can learn and approximate complex non-linear mappings by exploiting the information in reference data [23]. Moreover, they make no assumptions about the data distribution, i.e., they are non-parametric methods [24]. Random Forest (RF) is among the most common and effective MLAs [25] [26], and its implementation provides an opportunity to identify key drivers based on the relative contribution of predictors to model performance.
The present investigation focused on the Sudanian savanna of Burkina Faso, and its objective was to assess vegetation trends and their driving factors from 2000 to 2022. NDVI was adopted as a proxy indicator of vegetation state and used for trend analysis. Key driving factors of vegetation trends were determined through machine learning modelling and non-parametric correlation analysis.
2. Materials and Methods
2.1. Study Area
The study deals with the Sudanian phytogeographical zone of Burkina Faso (Figure 1). Two phytogeographical zones are encountered in the study area: the North and South Sudanian savannas [27]. The climate is characterized by a rainy season extending from May to October and a dry season from November to April [28]. The vegetation of the study area is mainly characterized by the savanna biome (tree and shrub savanna), with dominant species such as Vitellaria paradoxa, Pterocarpus erinaceus, Parkia biglobosa, Terminalia laxiflora, Afzelia africana, Anogeissus leiocarpa and Adansonia digitata.
The population is dynamic, as in West Africa as a whole, which has experienced a high population growth rate in recent years owing to high fertility rates [29]. Agriculture is the main source of livelihood in the study area, as in the entire country, where 86% of the workforce is primarily devoted to agriculture. However, agriculture is rudimentary and practiced by small-scale farmers who make limited use of inputs such as fertilizers or pesticides. Cereals (e.g., sorghum, millet, maize and rice) are the main crops cultivated. In rural areas, population growth and cropland expansion lead to excessive pressure on vegetation and to the depletion of the natural resource stock in general [30].
2.2. Data Collection
2.2.1. Remotely Sensed Vegetation Data
The Normalized Difference Vegetation Index (NDVI) was employed as a proxy to assess vegetation trends. NDVI data from the Moderate Resolution Imaging Spectroradiometer (MODIS) MOD13Q1 product, accessed through the Google Earth Engine (GEE) cloud platform, were analysed in this investigation. The MOD13Q1 dataset was chosen because it has already been widely and successfully used in previous investigations of vegetation trends [31] [32]. Time series of 16-day NDVI composites with a spatial resolution of 250 m were gathered for the 2000-2022 period. The MOD13Q1 NDVI product is delivered with pixel-level data quality indicators, which can be used to filter the time series and interpolate bad values (e.g., cloud-induced noise). The quality assurance (QA) mask was therefore applied to the dataset, and only the best-quality pixels (QA = 0) were retained in order to produce a high-quality NDVI time series. Finally, a time series of annual mean NDVI was built for the period 2000-2022 and projected to UTM WGS 84 zone 30. The yearly NDVI time series were downloaded from the GEE platform for trend analysis with the statistical platform R.
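The quality filtering and annual averaging described above can be illustrated with a minimal Python sketch. It is not the GEE workflow used in the study: the function name `annual_mean_ndvi`, the tuple layout and the sample values are assumptions made for illustration, and the only rule taken from the text is that composites with QA = 0 are retained.

```python
from collections import defaultdict

def annual_mean_ndvi(composites):
    """Average the best-quality NDVI composites per year for one pixel.

    composites: iterable of (year, ndvi, qa) tuples, one per 16-day composite.
    Only composites flagged qa == 0 contribute to the annual mean.
    Years without any usable composite map to None.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    years = set()
    for year, ndvi, qa in composites:
        years.add(year)
        if qa == 0:
            sums[year] += ndvi
            counts[year] += 1
    return {y: (sums[y] / counts[y] if counts[y] else None) for y in sorted(years)}

# Hypothetical composites for a single pixel (values are illustrative only).
pixel = [
    (2000, 0.30, 0), (2000, 0.50, 0), (2000, 0.10, 2),  # qa != 0 is discarded
    (2001, 0.40, 0), (2001, 0.60, 0),
    (2002, 0.20, 3),                                     # no usable composite
]
series = annual_mean_ndvi(pixel)
```

Returning None for a year without usable composites keeps gaps explicit, so that a later interpolation or exclusion step is a deliberate choice rather than a silent default.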
2.2.2. Environmental Dataset
Table 1. Ancillary data used in this investigation.
Various geospatial data (biophysical, accessibility and demographic) were gathered for the analysis of the driving factors of vegetation trends (Table 1). Gridded rainfall data were collected from TAMSAT for the period 2000-2022. These datasets are based on Meteosat thermal infra-red (TIR) imagery provided by EUMETSAT, and the TIR is calibrated against an extensive ground-based rain gauge data archive. Three indicators of rainfall variability were computed: the mean annual rainfall, the coefficient of variation of annual rainfall, and the rainfall trend derived from the Mann-Kendall trend test. Elevation above mean sea level was derived from the 30 m SRTM (Shuttle Radar Topography Mission) (https://earthexplorer.usgs.gov/), and soil type data were obtained from the national soil office of Burkina Faso (BUNASOL). Accessibility data were collected from the Geographical Institute of Burkina Faso (IGB) and computed as the Euclidean distance of each MODIS pixel to the nearest river. Population data were obtained from the Gridded Population of the World version 4 (GPWv4) dataset. These data are based on counts consistent with national censuses and population registers. Population growth between 2000 and 2020 was computed as in Equation (1).
(1)
where,
indicates population growth.
In addition, land cover data were obtained from the European Space Agency (ESA). All ancillary data were projected to UTM WGS 84 zone 30 at a spatial resolution of 250 m to match the pixel size of MODIS NDVI.
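Two of the rainfall indicators mentioned in this subsection, mean annual rainfall and its coefficient of variation, can be sketched per pixel as follows. The text does not state whether the sample or the population standard deviation was used, nor whether the coefficient was expressed as a percentage, so the sample standard deviation and a percentage are assumptions here, as are the function name and the rainfall totals.

```python
import statistics

def rainfall_indicators(annual_totals):
    """Return (mean annual rainfall, coefficient of variation in %).

    annual_totals: yearly rainfall totals (mm) for one pixel.
    Assumption: CV = 100 * sample standard deviation / mean.
    """
    mean = statistics.fmean(annual_totals)
    sd = statistics.stdev(annual_totals)  # sample standard deviation (n - 1)
    return mean, 100.0 * sd / mean

# Hypothetical annual totals in mm (illustrative values only).
mean_mm, cv_pct = rainfall_indicators([800, 900, 1000, 1100, 1200])
```

For these illustrative totals the mean is 1000 mm and the coefficient of variation is about 15.8%.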
2.3. Data Analysis
2.3.1. Vegetation Trend Detection
The non-parametric Mann-Kendall monotonic trend test (MK test) was used to detect trends in the annual NDVI time series (2000-2022). The MK test does not require the data to meet specific criteria, such as a normal distribution, and is less affected by outliers; it can therefore be used effectively to analyze trends in NDVI time series [33]. It measures the correlation between the observations (here NDVI) and time. The test outputs the significance value (p-value) of the trend and a correlation coefficient, Kendall’s tau (τ), which provides information on the direction and strength of the trend. A trend was considered statistically significant if the p-value was less than 0.05 (p < 0.05), while a Kendall’s tau above 0.5 (or below −0.5) was used as the criterion for a strong trend. The MK test statistic is given by the following equation:
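$$S = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \operatorname{sign}(x_j - x_i) \quad (2)$$
where n is the number of data points, $x_i$ and $x_j$ are the annual values in years i and j (j > i), and $\operatorname{sign}(x_j - x_i)$ is calculated using the equation:
$$\operatorname{sign}(x_j - x_i) = \begin{cases} 1 & \text{if } x_j - x_i > 0 \\ 0 & \text{if } x_j - x_i = 0 \\ -1 & \text{if } x_j - x_i < 0 \end{cases} \quad (3)$$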
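The computation of the Mann-Kendall significance produces a standardized Z (Equation (4)) and a corresponding probability p (Equation (5)).
$$Z = \begin{cases} \dfrac{S - 1}{\sqrt{\operatorname{Var}(S)}} & \text{if } S > 0 \\ 0 & \text{if } S = 0 \\ \dfrac{S + 1}{\sqrt{\operatorname{Var}(S)}} & \text{if } S < 0 \end{cases} \quad (4)$$
and
$$p = 2\left[1 - \Phi(|Z|)\right] \quad (5)$$
where $\Phi$ is the cumulative distribution function of the standard normal distribution and, in the absence of tied values,
$$\operatorname{Var}(S) = \frac{n(n-1)(2n+5)}{18} \quad (6)$$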
Based on the two MK-derived indicators (p-value and Kendall’s tau), five trend classes were defined (Table 2) and analysed according to land use/cover types (e.g., natural vegetation types, cropland and agglomeration) and their spatial distribution. The MK test was performed on the time series of annual mean NDVI with the statistical software R.
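As an illustration of this subsection, the sketch below computes the standard Mann-Kendall statistics for one pixel's annual series and maps them to a trend class. It is not the R code used in the study. The variance formula ignores ties, tau is computed as S divided by the number of pairs, and the class rules are an assumption inferred from the thresholds stated in the text (p < 0.05 for significance, |τ| > 0.5 for a strong trend); the exact definitions in Table 2 may differ.

```python
import math

def mann_kendall(x):
    """Return (tau, p, z, s) for a series x using the standard MK test.

    Assumptions: no correction for tied values; two-sided p-value from
    the standard normal distribution; tau = S / (n(n-1)/2).
    """
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = x[j] - x[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    tau = s / (0.5 * n * (n - 1))
    return tau, p, z, s

def trend_class(tau, p, alpha=0.05, strong=0.5):
    """Map (tau, p) to one of five classes (assumed rules, see lead-in)."""
    if p >= alpha or tau == 0:
        return "no trend"
    if tau > 0:
        return "strong greening" if tau > strong else "weak greening"
    return "strong browning" if tau < -strong else "weak browning"

# A steadily increasing 23-value series, as for 2000-2022 (illustrative).
tau_up, p_up, z_up, s_up = mann_kendall([0.30 + 0.01 * k for k in range(23)])
label_up = trend_class(tau_up, p_up)

# A short noisy series: positive tau but not significant.
tau_small, p_small, _, s_small = mann_kendall([1, 3, 2, 4])
label_small = trend_class(tau_small, p_small)
```

The second example shows why both indicators are needed: a tau of about 0.67 on only four values is not significant, so the pixel falls in the no trend class.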
2.3.2. Driving Forces Analysis
Table 2. NDVI-derived vegetation trend classes defined in this study.
Driving factors were first analysed through the variable importance scores of Random Forest (RF) modelling. RF was selected owing to its successful performance in predicting changes in vegetation cover [34]. RF is an ensemble machine learning algorithm developed by [35] for classification and regression. It is based on bagging, a technique that creates training data by randomly resampling the original dataset with replacement. Here, RF is used in classification mode: it builds several trees, each from a random sample of observations and a random sample of variables; the outputs of the classification trees are then aggregated, and a class is assigned by majority voting [35]. RF has a built-in mechanism for computing variable importance, which characterizes the effect of each predictor on the model. For each tree, the prediction accuracy on the out-of-bag (OOB) portion of the data is recorded, and the same is done after permuting each variable. The difference between the two accuracies is then averaged over all trees and normalized by the standard error to estimate variable importance [36].
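The permutation idea behind the RF importance score can be sketched in a simplified, model-agnostic form. Unlike the per-tree OOB procedure described above, this sketch permutes each predictor on a single evaluation set for any fitted classifier and reports the mean drop in accuracy without normalizing by the standard error. The toy classifier, feature layout and data are hypothetical.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted class equals the reference label."""
    hits = sum(1 for r, y in zip(rows, labels) if predict(r) == y)
    return hits / len(labels)

def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Mean accuracy drop after shuffling each predictor column in turn.

    rows: list of tuples of predictor values; predict: row -> class label.
    Simplification: one evaluation set instead of per-tree OOB samples.
    """
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for k in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[k] for r in rows]
            rng.shuffle(column)  # break the link between predictor k and labels
            permuted = [r[:k] + (v,) + r[k + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances

# Toy setting: the class depends on predictor 0 only (e.g., a rainfall trend);
# predictor 1 is pure noise. Both are hypothetical.
def toy_model(row):
    return "greening" if row[0] > 0 else "browning"

data_rng = random.Random(1)
rows = [(data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)) for _ in range(200)]
labels = [toy_model(r) for r in rows]
baseline, importances = permutation_importance(toy_model, rows, labels)
```

In this toy setting, shuffling the noise predictor leaves accuracy unchanged, so its importance is zero, while shuffling the informative predictor produces a clear accuracy drop.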
Based on the trend map, sample points were derived for the strong greening, strong browning and no trend classes. To avoid high spatial autocorrelation in the dataset, which can lead to overestimation of model accuracy, a Moran’s I-based correlogram was produced and used to select a minimum distance between sample points. In this study, the minimum distance was set to 3000 m (3 km), corresponding to a Moran’s I value of 0.4. This is a compromise between collecting more sample data and reducing spatial autocorrelation. In all, 2000 reference samples (pixels) were randomly collected and split into training data (50%) and testing data (50%) for external validation. RF classification was implemented with the caret package of the statistical software R. Vegetation trend classes were set as the response variable, and climatic, topographic, edaphic, accessibility and demographic variables were used as model predictors. RF accuracy was evaluated with the overall accuracy and the Kappa index. The RF permutation-based importance score was used to determine the most important variables governing the occurrence of vegetation trends [37].
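One possible way to enforce a minimum spacing between sample points and to split them in half is sketched below. The text does not describe how the 3000 m constraint was applied, so the greedy thinning strategy, the function names and the random candidate coordinates (in metres, as in a projected UTM grid) are all assumptions for illustration.

```python
import math
import random

def thin_points(points, min_dist):
    """Greedily keep points at least min_dist away from every kept point."""
    kept = []
    for x, y in points:
        if all(math.hypot(x - kx, y - ky) >= min_dist for kx, ky in kept):
            kept.append((x, y))
    return kept

def split_half(samples, seed=0):
    """Shuffle the samples and split them 50/50 into training and testing."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical candidate pixels scattered over a 100 km x 100 km window.
rng = random.Random(42)
candidates = [(rng.uniform(0, 100_000), rng.uniform(0, 100_000)) for _ in range(2000)]
samples = thin_points(candidates, min_dist=3000.0)
training, testing = split_half(samples)
```

Greedy thinning depends on the order of the candidates, so drawing them at random first, as here, avoids favouring one part of the study area.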
Furthermore, the Spearman’s correlation was also applied for driving factors analysis. It was performed between drivers and NDVI trend (Kendall tau) to determine the direction of influence of drivers on vegetation trend. Spearman’s correlation is a non-parametric method and is computed as:
$$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} \quad (7)$$
where $d_i$ is the difference between the two ranks of observation i, and n is the number of observations.
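The Spearman coefficient can be illustrated with a short sketch based on the rank-difference formula. That formula holds when there are no tied values, which is assumed here; the driver and tau values are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n, assuming there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman_rho(x, y):
    """Spearman's rho from squared rank differences (no-ties formula)."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))

# Hypothetical sample pixels: population growth versus Kendall's tau of NDVI.
growth = [10, 20, 30, 40, 50]
ndvi_tau = [0.6, 0.4, 0.5, -0.1, -0.3]
rho = spearman_rho(growth, ndvi_tau)
```

Here the squared rank differences sum to 38, giving a rho of −0.9, i.e. a strong negative monotonic association in this made-up sample.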
3. Results
3.1. Vegetation Trends during 2000-2022
Figure 2(a) shows the spatial distribution of mean NDVI (2000-2022) in the study area, with a maximum value of 0.73 and a minimum of −0.12. The NDVI distribution follows the south-north precipitation gradient, with higher values found particularly in the southern part of the study area and low values in the northern zone. The results of the MK test are illustrated in Figure 2(b) and Figure 2(c). Patterns of decreasing NDVI trends (red colour) as well as increasing trends (green colour) are observed in the study area (Figure 2(b)). The integration of the MK tau value with the significance test produced the NDVI trend classes illustrated in Figure 2(c). More than half of the vegetation cover area (50.5%) was characterised by no trend (Table 3), distributed throughout the study area. Greening trends are detected mainly in the western half of the analysed area, with 19.1% and 12.8% of the area exhibiting weak and strong greening, respectively. Large patterns of vegetation degradation are also observed: weak browning affected 10.6% and strong browning 7.1% of the study area, essentially in the eastern, south-western and southern parts. In general, protected areas were mostly dominated by no trend and greening trends, whereas declining trends were found particularly outside protected areas.
Figure 2. Vegetation trend analysis during 2000-2022: (a) mean NDVI for 2000-2022; (b) MK tau value; (c) NDVI-derived vegetation trend classes; (d) Land use/cover classes in the study area.
3.2. Observed NDVI Trends in Land Use/Cover Types
The results of the MK test show a predominance of no trend in areas covered by natural vegetation types (tree cover, shrub cover and grassland), cropland and agglomeration (Figure 3). The natural vegetation types were more affected by greening than by browning trends. For example, 17.7% of tree cover and 15% of shrub cover exhibited a strong greening tendency, while only 5% and 6.4%, respectively, were affected by strong browning. The opposite dynamics were found for land use/cover (LULC) types with a high human footprint, such as agglomeration, in which browning trends were more common than greening trends. Figure 4 highlights the proportion of each LULC type in the areas affected by each trend class. It reveals that areas under greening trends are mainly found in tree cover and shrub cover, while those exhibiting browning dynamics are observed particularly under cropland.
Table 3. NDVI-derived vegetation trend classes.
Figure 3. Distribution of NDVI-derived trend classes per land use/cover type.
Figure 4. Proportion of each LULC type per NDVI-derived trend class.
3.3. Driving Factors of Vegetation Trends
The RF predicted the vegetation trend classes with an overall classification accuracy of 0.82 and a kappa value of 0.76. The relative importance of the predictors is shown in Table 4. According to the RF modelling, climatic variables (rainfall trend and mean annual rainfall) and the demographic variable (population growth) were, in order of importance, the key contributing variables. Soil type appeared as the least important driving factor. Table 5 shows the results of the Spearman’s correlation between NDVI trend (expressed as the trend correlation coefficient) and the set of climatic, topographic, accessibility and demographic variables. The Spearman’s correlation revealed significant (at the 0.05 level) negative associations of NDVI trends with population growth, mean annual rainfall and elevation. In contrast, non-significant relationships were observed between NDVI trends and distance to river as well as the coefficient of variation of annual rainfall.
Table 4. Variable importance derived from RF.
Table 5. Spearman’s correlation between NDVI trend (expressed as trend correlation coefficient) and set of potential drivers.
**. Correlation is significant at the 0.01 level (2-tailed). *. Correlation is significant at the 0.05 level (2-tailed).
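For readers less familiar with the two accuracy measures reported above, the following sketch shows how overall accuracy and Cohen's kappa are obtained from reference and predicted classes. The labels are made up for illustration and do not reproduce the study's confusion matrix.

```python
from collections import Counter

def overall_accuracy_and_kappa(reference, predicted):
    """Return (overall accuracy, Cohen's kappa) for two label sequences."""
    n = len(reference)
    observed = sum(r == p for r, p in zip(reference, predicted)) / n
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    # Agreement expected by chance, from the marginal class frequencies.
    expected = sum(ref_counts[c] * pred_counts[c] for c in ref_counts) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical test samples: G = strong greening, B = strong browning, N = no trend.
reference = ["G"] * 4 + ["B"] * 3 + ["N"] * 3
predicted = ["G", "G", "G", "N", "B", "B", "G", "N", "N", "N"]
oa, kappa = overall_accuracy_and_kappa(reference, predicted)
```

In this example 8 of 10 samples agree (overall accuracy 0.8), chance agreement is 0.34, and kappa is therefore about 0.70. Kappa is lower than overall accuracy because it discounts the agreement expected by chance.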
4. Discussion
The predominance of non-significant trends (no trend class) in the study area during 2000-2022 is in line with previous reports that used NDVI trends to monitor vegetation dynamics in the West African Sudanian savanna [16] [19]. The observed vegetation changes were mainly the result of the combined effect of climatic and anthropogenic factors. The improvement of rainfall since the beginning of the 2000s, as noticed by [38], and reforestation activities have probably contributed to the large greening patterns. Improved rainfall conditions have also favoured the natural regeneration of plant species in this zone, as found by [39]. This role of climatic conditions was evidenced by the positive correlation between NDVI trend and rainfall trend in the study area. The variable importance scores of the RF modelling also identified rainfall trend and mean annual rainfall as key drivers of vegetation change. These findings agree with the work of [18], who found average rainfall over all growing seasons to be the most important driver for the classification of vegetation production trends in their study area in Niger. Rainfall constitutes a key factor of vegetation growth in semi-arid regions [40]. However, the negative relationship between the spatial patterns of mean annual rainfall and NDVI trends suggests that rainfall also acts as a catalyst for human pressure on vegetation [19]. For example, in our study area, the large patterns of degradation found in the southern part were likely due to favourable conditions for agricultural activity, which increases human pressure on vegetation cover [41]. Indeed, anthropogenic pressure seems to be an important threat to vegetation in the Sudanian savanna of Burkina Faso, as supported by the negative association of NDVI trend with population growth, which also emerged as the third most important variable in the RF modelling.
In Burkina Faso, for example, [42] attributed cropland expansion at the expense of natural vegetation during the 2001-2014 period to the rapid population growth noted in most provinces of the country. Several works have attributed the degradation of vegetation cover in the Sudanian savanna to anthropogenic activities such as agriculture, livestock and wood harvesting [20] [21] [43] [44]. In our study area, agriculture appeared to be the human activity that most affected vegetation during the 2000-2022 period; indeed, cropland dominated the areas affected by browning trends.
Studies carried out at large scale have found rainfall driving vegetation variation in West Africa [13] [40] . The present study noticed that, even at in-country vegetation biome scale, rainfall still plays an important role in vegetation dynamics. However, as observed by previous studies [16] , local vegetation trends (NDVI trend-derived) are not fully explained by rainfall. This is confirmed by our investigation that found population growth as an additional key driving factor of vegetation trend in the Sudanian savanna of Burkina Faso.
It appeared in the modelling that RF has good predictive capacity of vegetation change with climatic, topographic, edaphic, accessibility and demographic variables in the Sudanian savanna, at least in Burkina Faso. The predictive performance of RF was also noticed by [18] that used RF to model local vegetation production trends in southwestern Niger and achieved an overall accuracy of 80%. Our results accord with previous studies that found RF efficient in the mapping of vegetation cover change in the African savannas [45] [46] [47] . This highlights that RF can be preferably used with biophysical variables to predict vegetation trends and anticipate future change. It also showed that RF model offers an opportunity to distinguish the influence of environmental variables operating simultaneously on vegetation change [48] .
The present study revealed a large predominance of non-significant trends over the vegetation cover of the study area between 2000 and 2022. However, contrasting significant changes, mainly driven by the coupled effect of climatic and anthropogenic factors, were detected. The spots of vegetation degradation are mainly due to unsustainable land use practices, particularly in agricultural zones. This calls for more measures, as well as rigorous implementation of the existing land degradation policies, to promote sustainable land use management practices in this part of Burkina Faso. The expansion of cropland constitutes a threat to protected areas, whose existing monitoring and protective measures should be reinforced.
The study also showed the capacity of the RF algorithm to predict vegetation change, which is useful in the context of population growth and climate change. Indeed, RF could play a key role in predicting the impact of future demographic and climatic changes on vegetation cover. However, one weakness of this study remains the shortness of the NDVI time series (23 years); time series spanning more than 30 years might allow stronger conclusions on vegetation trends. Moreover, the 250 m MODIS NDVI data might be too coarse for the heterogeneous Sudanian savanna, especially for capturing local-scale change. Therefore, Earth Observation (EO) data of high spatiotemporal resolution would be a great asset for vegetation change monitoring in this area. The recent Sentinel optical product, with 10 m spatial resolution and a 5-day revisit frequency, coupled with Sentinel radar imagery, may be a solution, but its time series is currently too short for consistent vegetation trend analysis.
5. Conclusion
Monitoring and understanding vegetation dynamics is of great interest in the context of global environmental change and of REDD+ (reducing emissions from deforestation and forest degradation), which is a viable climate change mitigation strategy. This is of paramount importance in the Sudanian savanna, a key biome in West Africa whose vegetation cover is threatened by anthropogenic and climatic pressures. In addition to the need for a reliable response to the debate on vegetation trends in the Sudanian savanna, another challenge is to consistently document the underlying driving factors of vegetation change in this biome. The present study contributed to filling this gap through an investigation in Burkina Faso that aimed at assessing vegetation trends and driving factors from 2000 to 2022. The vegetation of the study area was largely characterised by non-significant trends. Nevertheless, important patterns of greening and browning trends were detected. The driving factors analysis indicated rainfall dynamics (trend and mean annual rainfall) and population growth, as well as anthropogenic activities, as the key underlying driving factors of the observed trends. The study provided sound information to improve the understanding of vegetation change and its underlying driving factors in the Sudanian savanna, especially in Burkina Faso. The observed greening tendency is a sign of hope in the fight against land degradation and climate change, but the patterns of browning trends call for more action towards sustainable land use. Agricultural practices, and anthropogenic activities in general, should be adapted to the context of climate change and land degradation, and afforestation activities should be reinforced across the country.
The study highlighted the good predictive capacity of Random Forest (RF) algorithm which appeared as a valuable tool that can be used to predict vegetation dynamics with environmental variables and anticipate future change. RF could play a key role in the prediction of the impact of future demographic and climate change on vegetation. This study can help to better manage vegetation cover and efficiently tackle land degradation in the Sudanian savanna.