Soil Salinity Detection in Semi-Arid Region Using Spectral Unmixing, Remote Sensing and Ground Truth Measurements

Soil salinity is one of the most serious environmental problems affecting the soils of arid and semi-arid regions; it reduces crop productivity, harms livestock, increases poverty and accelerates land degradation. Hyperspectral remote sensing is an important technique for monitoring, analysing and estimating the extent and severity of soil salinity at regional to local scales. In this study we develop a model for detecting salt-affected soils in arid and semi-arid regions, using Ghannouch, Gabes as the case study. We used fourteen spectral indices and six spectral bands extracted from Hyperion data. The Linear Spectral Unmixing (LSU) technique was used to improve the correlation between electrical conductivity and the spectral indices, and thereby the prediction of soil salinity and the reliability of the model. To build the model, a multiple linear regression analysis was applied using the best-correlated indices. The standard error of the estimate is about 1.57 mS/cm. The results of this study show that Hyperion data are accurate and suitable for differentiating between categories of salt-affected soils. The generated model can be used to support management strategies in the future.


Introduction
Soil degradation resulting from the accumulation of salt in the soil is one of the major environmental problems in the arid and semi-arid regions of the world ([1] [2]). The impact of soil salinity is mostly adverse, especially on agricultural land, where it causes large agricultural losses [3] and a low standard of living for local inhabitants whose livelihood depends mainly on farming ([4] [5]). Apart from human-induced salinization caused by improper irrigation practices and poor drainage systems ([6] [7] [8]), climatic factors such as low precipitation exacerbate soil salinity ([9]-[15]). The phenomenon of salinization is increasingly worrying. Although estimates differ from one author to another, the affected areas are generally estimated at one billion hectares, which represents 7% of the total surface area of the continents [16]. Of these, 77 million hectares are saline soils induced by human activity, 58% of them in irrigated areas [17]. Tunisia, a country with an arid to semi-arid climate, is also affected by soil salinization; about 10% of its area is already affected to varying degrees ([18] [19]).
Several studies have assessed soil salinity in the semi-arid region of Tunisia ([20] [21] [22] [23]). However, some of these studies rely on in-situ measurements and laboratory analysis alone, which makes spatial and temporal monitoring of the extent and severity of soil salinity imperative in order to take protective measures against further deterioration of the soil.
Other authors used traditional methods to assess, monitor and predict soil salinity. However, these methods are laborious and limited to small sample areas, and are therefore not representative of large areas ([9] [19] [24]). Owing to the complexity of monitoring soil salinity with traditional methods, remote sensing data (multispectral and hyperspectral) coupled with geo-statistical techniques have proved to be an appropriate means of monitoring soil salinity at different spatial and temporal resolutions, from national to regional scale ([1] [4] [25]). Many authors have demonstrated the utility of combining remote sensing data with ground-truth measurements to detect soil salinity ([26] [27] [28] [29]). The Linear Spectral Unmixing (LSU) method is one of the most reliable techniques for monitoring soil salinity ([30] [31]). LSU estimates the abundance fractions of the materials present in an image pixel (using endmembers), which yields the abundance maps used in the rest of this work.
To fill this gap in knowledge, the present work aims to valorise remote sensing techniques for detecting areas affected by soil salinization in the arid and semi-arid zones of Tunisia, specifically in the region of Ghannouch, Gabes. Our goal is to combine in-situ measurements with remotely sensed data (a hyperspectral satellite image) in order to better understand the severity of soil salinization, using an integrated approach capable of delineating and mapping affected areas for proper land management, so that this fragile ecosystem is not completely degraded in the future.

Investigation Area
Ghannouch, Gabes lies between the Mediterranean and Saharan regions. It is located in the South-East of Tunisia, at approximately latitude 33˚56' N and longitude 10˚03' E, as shown in Figure 1. The study area was chosen because of its agricultural importance and its soil-related environmental problems, such as salinization. Geomorphologically, the study area belongs to the plain of Jeffara, and more precisely to the Jeffara coast (Gulf of Gabes) [33]. Despite its maritime position and its opening on the Mediterranean, Ghannouch, Gabes is characterized by an arid climate. There is virtually no rainfall all year long. The climate is classified as BWh under the Köppen-Geiger climate classification (B refers to a dry climate where annual evaporation exceeds annual precipitation; W denotes a desert climate with annual precipitation < 50% of the threshold; h denotes a dry, hot climate with an average annual temperature > 18˚C). The mean annual temperature in Ghannouch is 19.3˚C. August is the hottest month of the year, with a mean temperature of 27.6˚C, and January is the coldest, with a mean temperature of 10.9˚C. The mean annual rainfall is 176 mm. Evaporation in this region is very high (between 1500 mm and 2000 mm) owing to the dry climate conditions [33]. Therefore, the salt left on the topsoil after water evaporates accumulates rapidly and accelerates the soil salinization process. This leads to salt accumulation in the upper layers of the Chott sediments and to crust formation [34].
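The BWh classification can be checked against the climate figures quoted above. The sketch below assumes the commonly stated Köppen-Geiger dryness threshold of 20 × T (mm), plus 140 or 280 mm depending on rainfall seasonality; these threshold formulas and the function name `koppen_arid_class` are assumptions brought in for illustration and are not taken from this study. Because the seasonal distribution of rainfall is not given here, all three seasonality cases are evaluated.

```python
def koppen_arid_class(mean_annual_temp_c, annual_precip_mm, seasonality):
    """Return 'BWh', 'BWk', 'BSh' or 'BSk', or None if the climate is not arid (B)."""
    # Assumed offsets (mm) added to 20*T, depending on when most rain falls.
    offsets = {"winter": 0.0, "mixed": 140.0, "summer": 280.0}
    threshold = 20.0 * mean_annual_temp_c + offsets[seasonality]
    if annual_precip_mm >= threshold:
        return None
    # Desert (W) if precipitation is below half the threshold, otherwise steppe (S).
    moisture = "W" if annual_precip_mm < 0.5 * threshold else "S"
    # Hot (h) if the mean annual temperature reaches 18 degrees C, otherwise cold (k).
    heat = "h" if mean_annual_temp_c >= 18.0 else "k"
    return "B" + moisture + heat


# Values quoted in the text: mean annual temperature 19.3 C, rainfall 176 mm.
results = {
    season: koppen_arid_class(19.3, 176.0, season)
    for season in ("winter", "mixed", "summer")
}
print(results)
```

Under these assumed thresholds the result does not depend on the seasonality case: 176 mm is below half of the threshold in all three, and 19.3˚C is above 18˚C.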

Data Pre-Processing
To build the model we used 14 spectral indices generated from different remote sensing indicators (intensity, colour, salinity), as shown in Table 1, together with spectral bands from Hyperion data and ground truth EC measurements of 102 samples, which reveal the quantity of salt in the top layer of soil (0 to 10 cm depth).

Hyperion Data
Hyperion is a high-resolution hyperspectral imaging instrument. It images the earth's surface in 240 contiguous spectral bands with high radiometric accuracy, covering the region from 400 nm to 2.5 µm at a spatial resolution of 30 m. Hyperspectral imagery provides opportunities to extract more detailed information than is possible using traditional multispectral data [35]. The importance of hyperspectral data in various studies, as mentioned by [36], makes it suitable for monitoring soil salt content in semi-arid regions.

Proposed Integrated Approach
Given the complexity of the salinization process, identifying salt-affected regions remains challenging. Our approach is an attempt to predict salt-affected areas in the South-East of Tunisia through several remote sensing and geo-statistical techniques. The flow chart in Figure 2 is a simplified description of the successive steps followed in this research.
Soil samples were collected in May and June 2010, which corresponds to the hyperspectral data acquisition date. The dry season was chosen for sample collection to enhance the detection of the spectral characteristics of salt at the surface, since soil salt rises by capillarity and accumulates there during the dry season. All the samples used in this study were taken at least 30 m away from objects that are not defined as soil (e.g. trees, houses, streets) to minimize any noise that could affect the spectral signature.
At all sample locations, a specific procedure was used to collect the soil. Each sample analysed in this work is a mix of four soil samples. These four samples are collected from the four corners of a 30 m × 30 m square whose center is taken as the location of the sample (Figure 3). The mix of the four corner samples is the soil sample used for chemical analysis (Figure 4). These steps were applied to all samples in order to optimize the representation of the samples within the pixel of the Hyperion image [37].
Salinity at the topsoil is determined by measuring electrical conductivity (EC). The 1/5 soil/water diluted extract is a convenient method and was used in this study to estimate soil salt content. To measure the EC of our samples, the following steps were conducted: 1) drying the samples, 2) sieving (soil particle size < 2 mm), 3) agitation, 4) measuring the pH and then, after a rest of 30 min, the EC value. EC is usually expressed in decisiemens per meter at 25˚C (dS/m; 1 dS/m = 1 mS/cm).
We used the twenty spectral variables derived from the Hyperion data (fourteen indices and six bands) to map salt-affected soil. A Pearson correlation between the remote sensing indices and the electrical conductivity measurements from the field was used to assess the efficiency of each index in characterising soil salinity.
The LSU method was applied to determine the abundance of materials in the hyperspectral image based on three abundance maps (vegetation, soil and urbanism). LSU is used to determine whether the pixel of each sample is representative of soil or not. We examined all the pixels at the sample locations and retained only those containing more than 50% soil (computed from the abundance maps, as explained in the results, Section 3.1). This step was taken only to make sure that the spectral signature of the sample originates mainly from soil. Subsequently, the correlations between the EC and the remote sensing indices were recomputed. The impact of the LSU method is discussed in Section 3.1.
Multiple Linear Regression (MLR) is a multivariate statistical technique and one of the most widely used for determining the relationship between a response variable and some combination of two or more predictor variables. Several MLR models are explored in this study to predict soil salinity. All the statistical operations (computing correlations, conducting MLR and random sample selection) were done using the XLSTAT software.
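The filtering and correlation steps described above can be illustrated with a minimal sketch. The sample values, the dictionary keys and the function names are hypothetical and chosen only to show the mechanics; the study itself used XLSTAT, not this code.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def keep_soil_dominated(samples, min_soil_fraction=0.5):
    """Keep samples whose pixel contains more than min_soil_fraction soil."""
    return [s for s in samples if s["soil_fraction"] > min_soil_fraction]


# Made-up illustration values (NOT measurements from this study).
samples = [
    {"ec": 1.2, "index": 0.11, "soil_fraction": 0.85},
    {"ec": 3.4, "index": 0.19, "soil_fraction": 0.72},
    {"ec": 7.9, "index": 0.35, "soil_fraction": 0.64},
    {"ec": 2.5, "index": 0.40, "soil_fraction": 0.30},  # vegetation-dominated pixel
    {"ec": 9.8, "index": 0.15, "soil_fraction": 0.20},  # urban-dominated pixel
    {"ec": 5.6, "index": 0.27, "soil_fraction": 0.91},
]

soil_samples = keep_soil_dominated(samples)
r_all = pearson_r([s["ec"] for s in samples], [s["index"] for s in samples])
r_soil = pearson_r(
    [s["ec"] for s in soil_samples], [s["index"] for s in soil_samples]
)
print(f"r (all samples) = {r_all:.2f}, r (soil > 50%) = {r_soil:.2f}")
```

In this toy data set the two mixed pixels break the relationship between EC and the index, so removing them raises the correlation. Whether the same happens with real data depends on the scene.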

Hyperion Data and Its Preprocessing
The Hyperion data cover a spectral range of 356-2576 nm at a 10 nm bandwidth. Atmospheric correction was performed with the FLAASH module. The parameters required by FLAASH were determined from the metadata of the image files. In the FLAASH module, the atmospheric model was set to "Mid-Latitude Summer" and the aerosol model to "Rural".
The dimensionality of the atmospherically corrected Hyperion data was reduced using the Minimum Noise Fraction (MNF) technique [38]. The MNF transform estimates the noise in the data and orders the output bands accordingly, so that image quality decreases steadily from the first MNF band to the last. Based on the eigenvalues, the first thirteen of the 178 MNF bands were selected; the remaining bands, with low (<1) eigenvalues, were eliminated from further processing. The selected bands were then inverse-transformed to reconstruct the MNF-corrected Hyperion data.
The atmospherically corrected Hyperion data with reduced dimensionality were used in the subsequent analysis.

Endmember Extraction
Endmember extraction is one of the most fundamental and crucial tasks in hyperspectral data exploitation; its ultimate goal is to find the purest pixels. In this study we applied the Pixel Purity Index (PPI) function to the selected MNF bands to identify the most spectrally pure pixels in our hyperspectral data. PPI is computed by repeatedly projecting the n-dimensional scatter plot onto random unit vectors and recording the pixels that fall at the extremes of each projection. The n-D Visualizer is then used to locate, identify and cluster the purest pixels and the most extreme endmembers of the dataset in n-dimensional space, and from these clusters we extract our endmembers.
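The idea behind PPI can be sketched in a few lines. This is a simplified illustration written for this text, not the implementation used in the study: the function names, the number of projections and the pixel coordinates are all assumptions. Each pixel receives a count of how often it is the minimum or maximum of a random projection; pure pixels sit at the corners of the data cloud and collect the counts, while mixed pixels lie inside it.

```python
import math
import random


def pixel_purity_index(pixels, n_projections=1000, seed=0):
    """Count how often each pixel is an extreme of a random projection."""
    rng = random.Random(seed)
    n_dims = len(pixels[0])
    counts = [0] * len(pixels)
    for _ in range(n_projections):
        # Random direction: Gaussian components normalised to unit length.
        v = [rng.gauss(0.0, 1.0) for _ in range(n_dims)]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
        projections = [sum(p_i * v_i for p_i, v_i in zip(p, v)) for p in pixels]
        counts[projections.index(min(projections))] += 1
        counts[projections.index(max(projections))] += 1
    return counts


def mix(spectra, weights):
    """Linear mixture of the given spectra with the given weights."""
    n_dims = len(spectra[0])
    return [sum(w * s[d] for w, s in zip(weights, spectra)) for d in range(n_dims)]


# Made-up coordinates in a 3-component (MNF-like) space, for illustration only.
pure = [[0.9, 0.1, 0.2], [0.1, 0.8, 0.3], [0.3, 0.2, 0.9]]
mixed = [mix(pure, w) for w in ([0.5, 0.3, 0.2], [0.2, 0.6, 0.2], [0.3, 0.3, 0.4])]
pixels = pure + mixed

counts = pixel_purity_index(pixels)
print(counts)  # the first three entries (pure pixels) carry all the counts
```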

LSU Technique
Linear spectral mixture analysis (LSMA) is a widely used technique in remote sensing to estimate abundance fractions of materials present in an image pixel.
For an LSMA-based estimator to produce accurate material abundances, two constraints generally have to be imposed on the linear mixture model: the abundance sum-to-one constraint and the abundance nonnegativity constraint. The first requires the abundance fractions of the materials present in an image pixel to sum to one, and the second requires these abundance fractions to be nonnegative.
While the first constraint is easy to deal with, the second constraint is difficult to implement since it results in a set of inequalities and can only be solved by numerical methods. Consequently, most LSMA-based methods are unconstrained and produce solutions that do not necessarily reflect the true abundance fractions of materials. In this case, they can only be used for the purposes of material detection, discrimination, and classification, but not for material quantification [39].
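To make the two constraints concrete, the sketch below solves a small fully constrained unmixing problem numerically. It uses projected gradient descent with a Euclidean projection onto the probability simplex, which is one of several possible numerical methods and is not claimed to be the solver used in this study. The 4-band endmember spectra, the function names and the iteration count are assumptions made for illustration.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) == 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        t = (cumulative - 1.0) / j
        if u_j - t > 0:
            theta = t  # keep the shift from the last index that satisfies the test
    return [max(x - theta, 0.0) for x in v]


def unmix_pixel(endmembers, pixel, n_iter=500):
    """Estimate abundances minimising ||sum_k a_k * e_k - pixel||^2 on the simplex."""
    m = len(endmembers)
    n_bands = len(pixel)
    # Safe step size: 1 / trace(E E^T), which is at most 1 / (largest eigenvalue).
    step = 1.0 / sum(value * value for e in endmembers for value in e)
    a = [1.0 / m] * m
    for _ in range(n_iter):
        model = [sum(a[k] * endmembers[k][b] for k in range(m)) for b in range(n_bands)]
        residual = [model[b] - pixel[b] for b in range(n_bands)]
        grad = [
            sum(endmembers[k][b] * residual[b] for b in range(n_bands))
            for k in range(m)
        ]
        a = project_to_simplex([a[k] - step * grad[k] for k in range(m)])
    return a


# Made-up 4-band endmember spectra (illustrative, NOT Hyperion values).
names = ["soil", "vegetation", "urban"]
spectra = [
    [0.8, 0.2, 0.1, 0.1],
    [0.1, 0.9, 0.2, 0.1],
    [0.1, 0.1, 0.3, 0.9],
]

# Synthetic mixed pixel: 60% soil, 30% vegetation, 10% urban.
true_abundances = [0.6, 0.3, 0.1]
pixel = [sum(w * s[b] for w, s in zip(true_abundances, spectra)) for b in range(4)]

abundances = unmix_pixel(spectra, pixel)
print({name: round(value, 3) for name, value in zip(names, abundances)})
is_soil_pixel = abundances[names.index("soil")] > 0.5
```

The last line shows how a soil abundance obtained this way could feed a "more than 50% soil" test on a sample pixel.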
The detected signal is always a combination of the signals produced by the different cover types within the pixel. For this reason we used the LSU method to identify the number of endmembers and then estimate the abundance fractions of the materials present in each pixel, as shown in Figure 5.

Spectral Characteristics of Ground Features
Figure 6 shows the spectral behaviour of salt-affected soil in Ghannouch, Gabes for 4 samples with varying EC measurements. Visible, near-infrared (NIR) and shortwave infrared (SWIR) are the investigated spectral regions provided by the Hyperion data. The reflectance profiles in Figure 6 show a ranking of the salinity classes. Within the visible and NIR range, the four samples in Figure 6 show a good distinction between the different categories of salt-affected soils from the Hyperion data. Reflectance in all intervals shows that slightly saline soil (low electrical conductivity) has a higher spectral response than salt-affected soil (high electrical conductivity).

Remote Detection of Soil Salinity from Spectral Indices
The generated abundance maps presented in Figure 7 show the fractional amount of material present at each pixel. The three abundance maps (soil, urbanism and vegetation) were generated by applying the LSU method. These maps show the spatial density distribution of the three main components of the investigated region. The soil abundance map shows the predominance of bare soil in the study area. The vegetation abundance map reveals a high vegetation density to the north-east, where the Oasis is found. The urbanism abundance map was helpful for delineating areas with man-made structures, so samples from these areas were avoided when building the MLR relationship.
In this study, we decided to work only with samples located in pixels that contain more than 50% soil. We eliminated every sample located in a pixel containing less than 50% soil and ended up with 32 samples.
Of the 240 contiguous spectral bands in which Hyperion images the earth's surface, six (B7: Blue; B14: Green; B24: Red; B42: Infrared; B117: SWIR1; B162: SWIR2) were considered as indicators of soil salinity. A Pearson correlation between the electrical conductivity values and the Hyperion spectral bands was computed to evaluate which spectral interval reveals most about the salt-affected area.
According to Figure 8, among the Hyperion spectral bands, the blue band gives the highest correlation (r = 0.31). The use of LSU improves the correlation for most of the remote sensing indices applied to the Hyperion data. There is a clear improvement in the blue band, where the correlation increases by 25%. Nevertheless, the SWIR1 band showed a weaker correlation after LSU.
Intensity indices show a low correlation with the EC, varying between 0.13 and 0.14 even after the improvement generated by the LSU. Hence, the three intensity indices used in this study do not show potential for discriminating salt-affected soil. When the salinity indices are correlated with the EC of the soil samples, salinity index 11 (SI11) and ASTER_SI provide the highest correlation, not only among the salinity indices but among all the spectral indices computed in our work. SI11 and ASTER_SI belong to a cluster of salinity indices in which only the SWIR1 and SWIR2 bands are combined. These indices show a higher correlation than the indices that use the VNIR bands. This is due to the high performance of the SWIR1 and SWIR2 bands in retrieving patterns and features of soil salinity in the investigated area. The Hyperion VNIR bands showed a low correlation with EC; therefore, the salinity indices computed from these bands have limited potential for detecting soil salinity. The low spatial resolution of the Hyperion data is one of the main reasons for such a weak correlation. Furthermore, the collected samples cannot be completely representative of the pixels because each sample represents only one point in the relevant 30 × 30 m pixel.

Multiple Linear Regression to Predict Salt-Affected Areas
A multiple linear regression model was applied to estimate the spatial distribution of EC and predict salt-affected areas. Multiple linear regression (MLR) generates an equation in which one or more independent variables (here, the spectral indices with the best correlation) are combined with estimated coefficients to predict the dependent variable, which is EC.
The model is based on the spectral salinity indices that gave the best correlation (as predictor variables) and the EC from ground truth measurements (as the response variable). The best MLR approach found involves a combination of the salinity predictors SI11 and ASTER_SI and was used to model the empirical relationship between electrical conductivity (EC) and soil salinity as expressed by the spectral indices. To create the MLR relationship, 80% of the samples were selected randomly by the software. The remaining samples were used for validation. The choice of the best model was based on the coefficient of multiple determination (R²) computed by the model ([40] [41]).
The best R² value in the regression output indicates that only 58% of the total variation of the EC values can be explained by the predictor variables used in this model, as shown in Figure 9.
The empirical regression relationship is given by Equation (1), which is the best MLR relation found and is based on the spectral indices SI11 and ASTER_SI. These two indices show the highest correlation with the EC from the ground truth. Combining these salinity indices helps to create a more reliable MLR empirical relationship for predicting soil salinity. The standard error of the estimate (also known as the root mean square error) is the square root of the residual mean square. Predicted values of electrical conductivity at points within the salinity range of healthy soils are often higher than the values from the ground truth measurements, as shown in Figure 9.
The standard error of the estimate is about 1.57 mS/cm, in part because the empirical relationship between measured and estimated EC values overestimates the electrical conductivity at low salinity. This slight overestimation at low values of electrical conductivity can be explained by interference between salinity and other soil properties, which disturbs the prediction. However, for samples with high electrical conductivity, often taken from the Sebkha, salinity is the main factor controlling the shape of the spectrum, which explains why the RMSE decreases as the electrical conductivity increases.
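The regression workflow described in this section (random 80/20 split, a two-predictor linear fit, R² and the standard error of the estimate) can be sketched as follows. The data are synthetic and the generating coefficients are arbitrary; they are not the coefficients of Equation (1), and the variable names `si11` and `aster_si` are only placeholders for the two predictors. The fit solves the normal equations directly, which is adequate for a small, well-conditioned example.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_mlr(predictors, response):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, predictors):
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], p)) for p in predictors]


def r_squared(observed, predicted):
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def standard_error(observed, predicted, n_predictors):
    """Square root of the residual mean square."""
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(ss_res / (len(observed) - n_predictors - 1))


# Synthetic data with arbitrary coefficients (NOT values from this study).
rng = random.Random(42)
data = []
for _ in range(32):
    si11 = rng.uniform(0.8, 1.2)
    aster_si = rng.uniform(-0.1, 0.1)
    ec = 5.0 + 10.0 * (si11 - 0.8) + 20.0 * aster_si + rng.gauss(0.0, 1.0)
    data.append(((si11, aster_si), ec))

rng.shuffle(data)
n_train = int(0.8 * len(data))  # 80% for fitting, the rest for validation
train, valid = data[:n_train], data[n_train:]

coefs = fit_mlr([p for p, _ in train], [y for _, y in train])
train_pred = predict(coefs, [p for p, _ in train])
r2_train = r_squared([y for _, y in train], train_pred)
se_train = standard_error([y for _, y in train], train_pred, n_predictors=2)
valid_pred = predict(coefs, [p for p, _ in valid])
print(f"R2 (train) = {r2_train:.2f}, standard error = {se_train:.2f}")
```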

Conclusions
This study focused on the potential of the LSU technique, combined with remote sensing indices extracted from Hyperion data, for improving the detection of salt-affected areas in Ghannouch, Gabes. The results showed that the correlations improve markedly after applying the LSU method, even if they remain moderate. This suggests that the LSU technique has an important role in recovering more exact information regarding soil salinity.
A moderate coefficient of multiple determination (R² = 0.58) was found after applying the multiple linear regression model, which makes it suitable for soil salinity assessment in the study area. However, several factors and indicators of soil salinization, such as land use and land cover change, climate parameters, species abundance and anthropogenic activities, among others, together with moderate to high resolution satellite imagery and an improved geostatistical model, are needed to improve the results of the study for management strategies and early warning measures in the future.