This research compares five spatial prediction methods, radial basis functions (RBF), inverse distance weighting (IDW), ordinary Kriging (OK), universal Kriging (UK), and simple Kriging (SK), for producing a groundwater level map and a prediction error map of the study area. It also sets out the foundations and criteria for choosing the most appropriate mathematical method for constructing statistical surfaces that represent the groundwater level in the study area. The methods were used to predict the spatial distribution of the groundwater level from data measured at 764 wells in May 2016. The study shows that comparing the spatial interpolation models and evaluating their accuracy through statistical indicators and cross-validation is the best way to choose the optimal model for representing the data at any site. The statistical comparison of the five models, validated by cross-validation, found that universal Kriging (UK) best represents the groundwater level in Salman district, because this model has the lowest root mean square error (RMSE), the lowest mean error (ME), and the highest coefficient of determination (R^{2}). The groundwater level and prediction standard error maps produced in the geographic information system (GIS) provide additional information describing the aquifer system in the study area and will ultimately improve sustainable groundwater management.

The sound application of hydrological methods is the basis for the management and development of water resources. The natural increase in the population of Iraq and the decline in the discharge of the Tigris and Euphrates, caused by the control exercised by the upper-basin countries, have increased the demand for water. Meeting this challenge requires developing methods of searching for water and finding alternatives to surface water, particularly since the scarcity of usable water is one of the greatest challenges of the twenty-first century.

The groundwater aquifer in AL-Salman district (a dry zone) is the only source of water for household and agricultural needs in the study area. It is therefore necessary to rationalize water use in these areas through proper planning, supported by mathematical models that help decision-makers take correct steps in planning water projects and optimizing investment in them. Proper planning requires analyzing the behavior of the aquifer and developing accurate digital maps that show the groundwater level and its change over time.

Every phenomenon with a spatial extent occupies a volume in space. To perceive this volume, we must see the outer surface that bounds it, which is called the statistical surface. The spatial variation of this surface can be represented on maps using a spatial interpolation model, so spatial statistical techniques are the main means of creating these maps. Since no single method is universally accepted, this study compares different spatial interpolation methods: radial basis functions (RBF), inverse distance weighting (IDW), ordinary Kriging (OK), universal Kriging (UK) and simple Kriging (SK). Based on a set of statistical criteria, the best spatial interpolation model can be determined to represent the actual groundwater level in the study area. The research aims to:

1) Set the foundations and criteria for choosing the most appropriate mathematical method for constructing statistical surfaces that represent the groundwater level in the study area.

2) Compare the prediction methods IDW, RBF, UK, OK, and SK for producing the groundwater level map and the prediction standard error map of the study area.

The importance of this research stems from the need to base groundwater-related projects on models that keep pace with the development of GIS applications.

To reach the research objectives, the work was divided into three stages:

1) Collection, input, processing and analysis of the data using ArcGIS 10.5.

2) Comparison and evaluation of the spatial interpolation models and selection of the best one.

3) Production of the groundwater level map and the prediction standard error map to present and generalize the interpolation results.

The study area is located in AL-Salman district in southwest Iraq and constitutes 4% of Iraq's total area of 435,052 km^{2}. It is part of the southern desert in the AL-Salman basin, which is characterized by limestone and dolomite beds. To a depth of 400 to 500 meters, the Salman basin contains three main aquifers: the Dammam Carbonate aquifer, the Umm Er-Radhuma aquifer and the Tayarat aquifer. The study area is generally flat, rocky terrain with structural ridges, isolated hills and karst depressions. Structurally, it lies within the eastern part of the stable shelf of the Nubio-Arabian platform, which is characterized by large horst-like anticlines and graben-like synclines with local structures of short extent. The regional strike of the beds trends northwest-southeast, with an estimated dip of 2˚ to 3˚ towards the northeast. The rocks exposed in the study area are of Paleocene, Lower Miocene, Pliocene-Pleistocene and Quaternary age, and the Eocene sediments are the most prevalent. The rock sequence is generally composed of carbonates with marl intercalations and lesser amounts of clastics.

The climate of the study area is hot and dry in summer and mild in winter. Temperatures range between 16˚C and 32˚C, annual rainfall is less than 80 mm, relative humidity is 39%, and total annual evaporation is 3484 mm.

The spatial interpolation methods assessed here, as implemented in ArcGIS 10.5, are the following:

・ Inverse distance weighted (IDW)

Inverse distance weighting derives the statistical surface from values measured at a number of points belonging to that surface. A grid of points is then laid over the area, and the value of the phenomenon at each grid point is calculated with Equation (1). The predicted value at any grid point is a weighted average of the measured values, in which the weight given to each measured point is inversely proportional to its distance from the grid point.

This method is based on spatial correlation: data measured at specific points in the region are used to estimate values at points where no measurements are available.

$$z_j = \frac{\sum_i \dfrac{z_i}{d_{ij}^{\,n}}}{\sum_i \dfrac{1}{d_{ij}^{\,n}}} \qquad (1)$$

where $z_j$ is the estimated value for the unknown point at location $j$, $d_{ij}$ is the distance between known point $i$ and unknown point $j$, $z_i$ is the value at known point $i$, and $n$ is the user-defined exponent for weighting.
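Equation (1) can be sketched in a few lines of Python. This is a minimal illustration, not the ArcGIS implementation: the function name `idw_predict`, the default exponent of 2 and the two sample wells are assumptions made for the example.

```python
import math

def idw_predict(known, target, power=2.0):
    """Inverse distance weighted estimate at `target` (Equation 1).

    known  : list of (x, y, z) tuples for measured points (hypothetical data)
    target : (x, y) location of the unknown point
    power  : the weighting exponent n (2.0 is an assumed default)
    """
    numerator = 0.0
    denominator = 0.0
    for x, y, z in known:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            # The target coincides with a measured point: return its value.
            return z
        weight = 1.0 / d ** power
        numerator += weight * z
        denominator += weight
    return numerator / denominator

# Two hypothetical wells; the midpoint is equidistant from both,
# so both receive the same weight.
wells = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw_predict(wells, (1.0, 0.0))
```

A larger exponent concentrates the weight on the nearest wells, while a smaller one produces a smoother surface.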

・ Radial basis functions (RBF)

This technique preserves the sample values: it fits a flexible predictive surface that passes through all of them rather than ignoring or smoothing them. In this respect it is similar to IDW, but unlike IDW it can also predict values that lie above the maximum or below the minimum measured value. The radial basis function is given by:

$$\phi(r) = \sqrt{r^2 + c^2} \qquad (2)$$

where $r$ is the distance from the sample to the estimation point and $c$ is the smoothing factor.
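The following sketch shows how a surface built from the basis function of Equation (2) passes exactly through the samples. It assumes the plain formulation in which one weight per sample is found by solving a linear system. The ArcGIS tool may add further terms, and all names and data here are hypothetical.

```python
import math

def multiquadric(r, c):
    """Basis function of Equation (2): sqrt(r^2 + c^2)."""
    return math.sqrt(r * r + c * c)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def rbf_fit(points, values, c=1.0):
    """Find one weight per sample so the surface reproduces every sample."""
    a = [[multiquadric(math.dist(p, q), c) for q in points] for p in points]
    return solve(a, values)

def rbf_predict(points, weights, target, c=1.0):
    """Evaluate the fitted surface at `target`."""
    return sum(w * multiquadric(math.dist(p, target), c)
               for p, w in zip(points, weights))

# Hypothetical sample locations and groundwater levels.
sample_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
sample_levels = [10.0, 12.0, 11.0, 15.0]
rbf_weights = rbf_fit(sample_points, sample_levels)
center_estimate = rbf_predict(sample_points, rbf_weights, (0.5, 0.5))
```

Because the weights are obtained from an exact fit, evaluating the surface at any sample location returns the measured value, up to rounding.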

・ Kriging

The Kriging model is one of the most complex and robust methods. It applies advanced statistical methods and requires knowledge of spatial statistics, because the data must be examined statistically before it is applied. It predicts unknown values from the distances and the relationships between known values. It can predict values above or below the range of the known values, although the surface does not necessarily pass through the measured points as a spline does. Kriging is commonly regarded as an optimal linear interpolation procedure. Its general form is:

$$z(s_o) = \sum_{i=1}^{n} \gamma_i \, z(s_i) \qquad (3)$$

where $s_o$ is the prediction location, $\gamma_i$ is the unknown weight for the measured value at the $i$-th location, $z(s_i)$ is the measured value at the $i$-th location, and $n$ is the number of samples.
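To make Equation (3) concrete, the sketch below computes the weights for ordinary Kriging, one of the three variants compared here. It assumes an exponential semivariogram with no nugget and arbitrary sill and range values. In practice these would be fitted to the data, and the function names and sample wells are hypothetical.

```python
import math

def semivariogram(h, sill=1.0, rng=10.0):
    """Exponential semivariogram without nugget (assumed model and parameters)."""
    return sill * (1.0 - math.exp(-h / rng))

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ordinary_kriging(points, values, target):
    """Return (prediction, weights) for Equation (3) under ordinary Kriging."""
    n = len(points)
    # Semivariances between all sample pairs, plus one extra row and column
    # that force the weights to sum to one (unbiasedness constraint).
    a = [[semivariogram(math.dist(p, q)) for q in points] + [1.0]
         for p in points]
    a.append([1.0] * n + [0.0])
    b = [semivariogram(math.dist(p, target)) for p in points] + [1.0]
    solution = solve(a, b)
    weights = solution[:n]  # the last entry is the Lagrange multiplier
    prediction = sum(w * z for w, z in zip(weights, values))
    return prediction, weights

# Hypothetical well locations (km) and groundwater levels (m).
well_points = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (5.0, 5.0)]
well_levels = [100.0, 104.0, 98.0, 110.0]
ok_prediction, ok_weights = ordinary_kriging(well_points, well_levels, (2.0, 2.0))
```

Unlike the IDW weights, these weights depend on the spatial structure described by the semivariogram and not only on the distance to the prediction location.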

The accuracy of the spatial interpolation methods was assessed on the basis of three statistical criteria: root mean square error (RMSE), mean error (ME), and coefficient of determination (R^{2}). A model is considered valid when the RMSE is as low as possible and the ME is near zero, and the closer R^{2} is to one, the better the model.

1) RMSE is an important parameter that indicates the accuracy of spatial analysis in geographic information systems and remote sensing. It is computed as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left[ \hat{z}(x_i) - z(x_i) \right]^2} \qquad (4)$$

where $z(x_i)$ is the observed value at point $x_i$, $\hat{z}(x_i)$ is the predicted value at point $x_i$, and $n$ is the number of samples, that is, the number of observed and predicted value pairs.

2) The mean error (ME) is used to determine the degree of bias in the estimates and provides a measure of the size of the error. Large values indicate larger discrepancies between predicted and observed values. It is computed as:

$$\mathrm{ME} = \frac{1}{n} \sum_{i=1}^{n} \left[ \hat{z}(x_i) - z(x_i) \right] \qquad (5)$$

where $z(x_i)$ is the observed value at point $x_i$, $\hat{z}(x_i)$ is the predicted value at point $x_i$, and $n$ is the number of samples.

3) The coefficient of determination (R^{2}) is the square of the linear correlation coefficient. It expresses the ratio of the regression sum of squares to the total sum of squares. Its value ranges between zero and one and is calculated with the following equation:

$$R^2 = \frac{\left[ \sum_{i=1}^{n} (P_i - P_{ave})(Q_i - Q_{ave}) \right]^2}{\sum_{i=1}^{n} (P_i - P_{ave})^2 \, \sum_{i=1}^{n} (Q_i - Q_{ave})^2} \qquad (6)$$

where $P_i$ and $Q_i$ are the predicted and measured values at sample $i$, $Q_{ave}$ is the mean of the measured values, $P_{ave}$ is the mean of the predicted values, and $n$ is the number of samples used for prediction.
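The three criteria in Equations (4) to (6) translate directly into code. The sketch below uses hypothetical observed and predicted levels only to illustrate the arithmetic. The function names are assumptions, and the numbers are not taken from the study.

```python
import math

def rmse(predicted, observed):
    """Root mean square error, Equation (4)."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)

def mean_error(predicted, observed):
    """Mean error, Equation (5); signed, so it measures bias."""
    n = len(observed)
    return sum(p - o for p, o in zip(predicted, observed)) / n

def r_squared(predicted, observed):
    """Coefficient of determination, Equation (6)."""
    p_ave = sum(predicted) / len(predicted)
    q_ave = sum(observed) / len(observed)
    cross = sum((p - p_ave) * (q - q_ave) for p, q in zip(predicted, observed))
    p_ss = sum((p - p_ave) ** 2 for p in predicted)
    q_ss = sum((q - q_ave) ** 2 for q in observed)
    return cross ** 2 / (p_ss * q_ss)

# Hypothetical groundwater levels (m) at four wells.
observed_levels = [100.0, 110.0, 120.0, 130.0]
predicted_levels = [102.0, 108.0, 123.0, 129.0]

model_rmse = rmse(predicted_levels, observed_levels)
model_me = mean_error(predicted_levels, observed_levels)
model_r2 = r_squared(predicted_levels, observed_levels)
```

In this example the errors are 2, -2, 3 and -1, so the ME is 0.5 and the RMSE is the square root of 4.5. Positive and negative errors partly cancel in the ME, which is why it is read together with the RMSE.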

Before applying the spatial interpolation models, an exploratory analysis of the data should be performed using the geostatistical techniques supported by GIS programs. Spatial models give more representative results if the data are normally distributed and may give distorted results if they are not. The geostatistical extension contains a set of tools for examining the distribution of the data, such as the histogram, the trend analysis tool, the semivariogram/covariance cloud, and several statistical indicators. Normally distributed data follow the bell-shaped normal curve, in which the measures of central tendency (mean, median and mode) coincide, the coefficient of skewness equals zero, and the coefficient of kurtosis equals 3. A skewness coefficient close to zero together with a kurtosis coefficient close to 3 therefore indicates a normal distribution of the data.
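The skewness and kurtosis check described above can be sketched as follows. The code assumes the population (moment-based) definitions, under which a normal distribution has skewness 0 and kurtosis 3. The function name and the sample values are hypothetical.

```python
def skewness_and_kurtosis(data):
    """Return (skewness, kurtosis) from the central moments of `data`.

    Kurtosis is the non-excess form, so a normal distribution gives 3.
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # variance
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

# A small symmetric sample: skewness is zero, but the kurtosis is well
# below 3, so this flat sample is not close to a normal distribution.
sample_skew, sample_kurt = skewness_and_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0])
```

When the indicators depart strongly from 0 and 3, a transformation of the data (for example a logarithmic one) is a common remedy before interpolation.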

In the trend analysis, one curve indicates a gradual rise of the groundwater level from south to north, and the green curve indicates a decline of the data from west to east. This means that groundwater in the study area flows from the south and southwest to the north and northeast, which is consistent with the general topographic slope of the study area.

Spatial autocorrelation is used to detect and measure the similarity of neighboring observations by comparing each value of the phenomenon with the overall mean. If the differences between neighboring values are smaller than the differences among all values, adjacent values are similar, typically because the surrounding conditions are similar, and the spatial autocorrelation is positive. If adjacent values differ markedly, the spatial autocorrelation is negative. Moran's Index is one of the most important measures for detecting spatial autocorrelation among the elements of the studied phenomenon and for deciding whether its pattern is clustered, dispersed (regular) or random. The index ranges between −1 and +1. A value close to +1 indicates a clustered pattern, a value close to −1 indicates a dispersed pattern, and a value near zero indicates a random pattern.
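A minimal sketch of the global Moran's Index is given below. It assumes binary weights, in which two points are neighbors when they lie within a chosen distance of each other. ArcGIS offers several other weighting schemes, and the function name, threshold and data are hypothetical.

```python
import math

def morans_i(points, values, max_dist):
    """Global Moran's I with binary distance-band weights (assumed scheme)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross_sum = 0.0   # sum of w_ij * dev_i * dev_j over neighbor pairs
    weight_sum = 0.0  # sum of all weights w_ij
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= max_dist:
                cross_sum += dev[i] * dev[j]
                weight_sum += 1.0
    return (n / weight_sum) * cross_sum / sum(d * d for d in dev)

# Five hypothetical wells on a line, 1 km apart.
line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]

# Smoothly increasing levels: neighbors are similar, so I is positive.
i_smooth = morans_i(line, [1.0, 2.0, 3.0, 4.0, 5.0], max_dist=1.0)

# Alternating levels: neighbors are dissimilar, so I is negative.
i_alternating = morans_i(line, [1.0, 5.0, 1.0, 5.0, 1.0], max_dist=1.0)
```

For these two test patterns the index evaluates to 0.5 and −1.0 respectively, matching the interpretation of positive and negative spatial autocorrelation given above.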

In addition, the spatial autocorrelation relationship was examined with the semivariogram/covariance cloud tool, which shows how the values of data pairs diverge with distance. The preceding analyses allow us to conclude that the data used in this study are spatially autocorrelated and do not contain anomalous values.
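The idea behind a semivariogram cloud can be sketched as follows: every pair of samples contributes one point whose horizontal coordinate is the separation distance and whose vertical coordinate is half the squared difference of the two values. The function name and the three wells are hypothetical.

```python
import math
from itertools import combinations

def semivariogram_cloud(samples):
    """Return sorted (distance, semivariance) pairs for every pair of samples.

    samples : list of (x, y, z) tuples (hypothetical data)
    """
    cloud = []
    for (x1, y1, z1), (x2, y2, z2) in combinations(samples, 2):
        distance = math.hypot(x2 - x1, y2 - y1)
        semivariance = 0.5 * (z1 - z2) ** 2
        cloud.append((distance, semivariance))
    return sorted(cloud)

# Three hypothetical wells on a line: values differ more as distance grows.
cloud = semivariogram_cloud([(0.0, 0.0, 10.0), (1.0, 0.0, 12.0), (3.0, 0.0, 20.0)])
```

With spatially autocorrelated data, the semivariance tends to increase with distance, and isolated pairs with very high semivariance at short distances point to possible outliers.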

After confirming that the groundwater data are normally distributed and contain no anomalous values, and before producing the interpolated map of groundwater levels, it is necessary to compare and assess the spatial interpolation methods and choose the best model for representing the groundwater level in the study area.

In this study, 64 tests were conducted to find the model that most closely represents the actual groundwater level in the study area. The best configuration was first chosen for each of the five models: 1) inverse distance weighted (IDW), 2) radial basis functions (RBF), 3) ordinary Kriging (OK), 4) simple Kriging (SK) and 5) universal Kriging (UK). These final optimal models were then compared, and the best one was selected to represent the actual groundwater level.

To compare the spatial interpolation models and select the best one, the cross-validation technique provided by ArcGIS 10.5 was used to judge statistically the accuracy of these models in representing groundwater levels in the study area, as shown in Figures 6-9. The statistical comparison of the five spatial interpolation models, validated by cross-validation, found that the UK method best represents the groundwater level in Salman district, because this model has the lowest root mean square error (RMSE), the lowest mean error (ME), and the highest coefficient of determination (R^{2}), as summarized in the table that follows.
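Cross-validation of this kind can be sketched as a leave-one-out loop: each well is removed in turn, its level is predicted from the remaining wells, and the prediction is compared with the measurement. The sketch below uses IDW as the interpolator for brevity. The function names and the three collinear wells are hypothetical, and this is not the ArcGIS routine itself.

```python
import math

def idw_predict(known, target, power):
    """Inverse distance weighted estimate at `target` from (x, y, z) samples."""
    numerator = 0.0
    denominator = 0.0
    for x, y, z in known:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return z
        weight = 1.0 / d ** power
        numerator += weight * z
        denominator += weight
    return numerator / denominator

def leave_one_out(samples, power=2.0):
    """Return (ME, RMSE) from leave-one-out cross-validation of IDW."""
    errors = []
    for i, (x, y, z) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]  # hold out sample i
        errors.append(idw_predict(others, (x, y), power) - z)
    n = len(errors)
    me = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return me, rmse

# Three hypothetical wells on a line with linearly increasing levels.
wells = [(0.0, 0.0, 10.0), (1.0, 0.0, 20.0), (2.0, 0.0, 30.0)]
cv_me, cv_rmse = leave_one_out(wells, power=2.0)

# Candidate settings can then be ranked by their cross-validation RMSE.
rmse_by_power = {p: leave_one_out(wells, power=p)[1] for p in (1.0, 2.0, 3.0)}
best_power = min(rmse_by_power, key=rmse_by_power.get)
```

Running the same loop for each candidate model and comparing the resulting ME and RMSE values follows the selection logic described above.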

Criteria | UK | IDW | OK | SK | RBF |
---|---|---|---|---|---|
RMSE | 10.65 | 11.032 | 11.366 | 14.798 | 16.597 |
ME | 5.365 | 5.504 | 5.711 | 7.072 | 12.544 |
R^{2} | 0.981 | 0.979 | 0.978 | 0.963 | 0.954 |

The geostatistical techniques made it possible to find the best spatial interpolation method and to produce the groundwater level map and the prediction standard error map.

1) Groundwater levels largely follow the surface topography in the study area. The water table lies deeper below the surface in the highlands, as in the southwestern part of the study area, and approaches the surface in lowlands such as valleys and depressions.

2) The presence of the AL-Salman depression in the middle of the study area brings the groundwater closer to the surface there than in the areas around the depression.

3) Where the groundwater level intersects the ground surface, groundwater emerges as seepage or springs, as in the far north of the study area and near the AL-Sulibat depression.

4) The hydraulic gradient of the groundwater is largely consistent with the topographic slope in the study area, running from the south-southwest to the north and northeast.

5) The hydraulic gradient was 1.411 m/km, while the topographic slope was 2.248 m/km.

These maps can be used as a basis for finding the best locations for drilling wells in the study area. Subtracting the groundwater surface layer from the digital elevation model in ArcGIS produces a map of the depth to groundwater, which can be used in the planning, development and decision-making processes of the relevant actors in Iraq.
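The subtraction described above, together with the hydraulic gradient quoted in the list, can be sketched on small grids. The elevations, water levels and distance below are hypothetical values chosen for the example. They are not data from the study, and the function names are assumptions.

```python
def depth_to_groundwater(dem, water_level):
    """Cell-by-cell depth to groundwater: ground elevation minus water level."""
    return [[ground - level for ground, level in zip(dem_row, level_row)]
            for dem_row, level_row in zip(dem, water_level)]

def hydraulic_gradient(head_upstream, head_downstream, distance_km):
    """Head loss per kilometer between two points along the flow direction."""
    return (head_upstream - head_downstream) / distance_km

# Hypothetical 2 x 2 grids (m above sea level).
dem_grid = [[250.0, 240.0],
            [230.0, 220.0]]
level_grid = [[200.0, 195.0],
              [190.0, 185.0]]

depth_grid = depth_to_groundwater(dem_grid, level_grid)

# Gradient between the highest and lowest heads, assumed to be 10 km apart.
gradient = hydraulic_gradient(200.0, 185.0, 10.0)
```

Cells with small depth values are candidate locations where drilling needs to reach less far, subject to the prediction standard error at those cells.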

The contribution of this research is to determine the best spatial interpolation model for representing groundwater levels in the Salman area, after comparing and evaluating the models against several statistical criteria, and then to produce a digital map describing the spatial distribution of these levels together with a prediction standard error map. Geostatistical techniques were applied to model the surfaces and compare them in terms of accuracy. The study showed that the universal Kriging (UK) method is the best way to represent the groundwater level in the study area, because it gave the lowest root mean square error (RMSE) of 10.64, the lowest mean error (ME) of 5.36 and the highest coefficient of determination (R^{2}) of 0.98 between the measured and predicted values, compared with the other spatial interpolation models. This required more than 64 tests of deterministic and geostatistical interpolation methods. Cross-validation was used to compare and evaluate the accuracy of each model, and it proved an effective tool that gave access to the most accurate model for representing the groundwater level with the least time and effort. The study reveals that comparing the spatial interpolation models and evaluating their accuracy through statistical indicators and cross-validation is the best way to choose the optimal model for representing the data at any site.

This optimal spatial interpolation method generated the groundwater level map of the study area as well as the associated uncertainty map. These maps help to identify the groundwater surface, or piezometric level, and the direction of the hydraulic gradient in the study area, and they can be used in the planning, development and decision-making processes of the relevant actors in Iraq.

The authors declare no conflicts of interest regarding the publication of this paper.

Njeban, H.S. (2018) Comparison and Evaluation of GIS-Based Spatial Interpolation Methods for Estimation Groundwater Level in AL-Salman District―Southwest Iraq. Journal of Geographic Information System, 10, 362-380. https://doi.org/10.4236/jgis.2018.104019