The geostatistical technique of Kriging has been used extensively for the investigation and delineation of soil heavy metal pollution. In practice, however, Kriging is rarely applied, because suitable parameter values are difficult to choose and relatively optimal locations for further sampling are difficult to find. In this study, we used large numbers of assumed actual polluted fields (AAPFs), randomly generated by unconditional simulation (US), to assess the adjusted total fee (ATF), an assessment standard developed to balance the correct treatment rate (CTR) against the total fee (TF), under a traditional strategy of systematic (or uniform) grid sampling (SGS) combined with Kriging. We found that a strategy using both SGS and Kriging was more cost-effective than a strategy using only SGS. Next, we used a genetic algorithm (GA) to find optimal locations for additional sampling. The optimized locations lay at the boundaries between polluted and unpolluted areas, where many SGS data values were close to the threshold value. This strategy was less helpful, however, when the pollution in a field showed no spatial correlation.

Soil heavy metal pollution is becoming an increasingly severe global environmental problem, especially in developing countries such as China, due to the toxic, non-biodegradable, and persistent nature of heavy metals (HMs) [^{2} (Finland) and 16/100 m^{2} (Switzerland) [

Kriging interpolation has been used to model and map spatial distributions of soil heavy metal contamination for more than three decades [

The sampling at the beginning of a contaminated site investigation is mainly designed not for geostatistical interpolations, but for risk assessment based on legal regulations or intuition (e.g. the deviating color of soil) [

The commonly used validation methods such as cross-validation tend to waste some of the existing sampling data or require more independent samples to evaluate the effectiveness of the variogram model. Cross-validation only validates the prediction accuracy at sampling sites and cannot reflect the accuracy at unsampled sites. [

In this study we randomly generated a series of assumed actual polluted fields (AAPFs) by US and then used them to validate a cost-effective strategy based on the main two performance criteria: the CTR and TF. We then developed a cost-effective strategy using SGS and Kriging and determined relatively optimal locations for additional sampling through GA optimization.

Methods based on conditional simulation (CS) generate a series of realizations (equally probable solutions) of pollution distribution through original sampling data from one polluted field. In the present research we developed another approach based on US. First, we randomly generated a series of AAPFs. Next, we used the sampling data (logarithm) on each AAPF to generate the realization by Kriging. US honors the overall mean, variance, and spatial correlation while disregarding observations at specific sampling locations [^{2}. The threshold value is changeable, and different threshold values can generate AAPFs with different pollution rates (PRs), as shown in
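The idea of an unconditional simulation can be illustrated with a minimal sketch. It assumes an exponential covariance model, a lognormal concentration field, and a Cholesky-based generator; none of these choices, nor the names `simulate_aapf` and `pollution_rate` or the default parameter values, are taken from the study, which was implemented in VBA.

```python
import numpy as np

def simulate_aapf(n_cells=20, cell_size=1.0, mean_log=0.0, sill=1.0,
                  corr_scale=5.0, seed=0):
    """Draw one unconditional realization on an n_cells x n_cells grid.

    Assumptions (illustrative only): exponential covariance with the given
    sill and correlation scale, and a lognormal concentration field.
    """
    rng = np.random.default_rng(seed)
    # Coordinates of the cell centres.
    axis = (np.arange(n_cells) + 0.5) * cell_size
    xx, yy = np.meshgrid(axis, axis)
    pts = np.column_stack([xx.ravel(), yy.ravel()])
    # Pairwise distances and the covariance matrix of the log field.
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    cov = sill * np.exp(-dist / corr_scale)
    # A small diagonal jitter keeps the Cholesky factorization stable.
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(len(pts)))
    log_field = mean_log + chol @ rng.standard_normal(len(pts))
    return np.exp(log_field).reshape(n_cells, n_cells)

def pollution_rate(field, threshold):
    """Percentage of cells whose concentration exceeds the threshold."""
    return 100.0 * np.mean(field > threshold)

field = simulate_aapf(seed=1)
print(pollution_rate(field, threshold=1.5))
```

Raising or lowering `threshold` on the same field changes the pollution rate, which mirrors the statement that different threshold values yield AAPFs with different PRs.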

Each AAPF is regularly or uniformly divided into N = n × n square meshes, and a sample is taken at the center of each mesh. The total number of samples is equal to the total number of the square meshes (N), depending on the number of samples, or number of meshes (n) of each side of the AAPF (
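The mesh-centre sampling layout described here can be sketched as follows. The function name `grid_sample_centers` and the side length of 110 are hypothetical; only the rule of one sample at the centre of each of the N = n × n meshes comes from the text.

```python
import numpy as np

def grid_sample_centers(side_length, n):
    """Return the (n*n, 2) coordinates of the centres of an n x n square mesh."""
    mesh = side_length / n                      # edge length of one mesh
    axis = (np.arange(n) + 0.5) * mesh          # centre positions along one side
    xx, yy = np.meshgrid(axis, axis)
    return np.column_stack([xx.ravel(), yy.ravel()])

# n = 11 meshes per side gives N = 121 samples.
points = grid_sample_centers(side_length=110.0, n=11)
print(points.shape)
```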

Excavation, solidification/stabilization, soil washing, electro-remediation, and phytoremediation have all been used as in-situ and ex-situ techniques to reduce the impact of metals in soil. Nevertheless, the remediation of soils contaminated by HMs is still recognized as one of the most difficult problems to solve. A few technologies are available for this purpose, but all are costly [

To apply the optimal strategy to real-world remediation projects, the parameters used in the strategy should be as close as possible to real-world conditions and reflect the issues that most directly concern stakeholders or decision-makers: TF and CTR. TF consists of a sampling fee (SF: includes the preliminary sampling fee and additional sampling fee) and remediation fee (RF: includes the analysis fee, transport fee, treatment fee, etc.). One remediation scenario is assumed before the calculation of TF and CTR shown in

where 17.5 and 4.5 are the unit prices of sampling and remediation, respectively, chosen from actual market prices. We had two reasons for selecting a soil depth of 0.5. First, HMs generally accumulate more in soil between 0 and 5 cm than at other depths, and most of the studies performed so far have focused on the sampling of surface soil (≤20 cm) [

To reduce the number of performance criteria in the GA optimization procedure for additional samples and optimally operate the trade-off between TF and CTR, the adjusted total fee (ATF) is introduced with the following function:

where the value of _{1} = ATF_{2}), whereas the TFs and CTRs may differ (such as TF_{1} < TF_{2}, CTR_{1} < CTR_{2}). In this case, the ATF (e.g. ATF_{2}) with a high CTR (CTR_{2}) is selected as the relatively better strategy in this study.

We performed this study using ordinary Kriging (OK), a method illustrated in detail in earlier literature [
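For readers unfamiliar with OK, the following is a minimal sketch of the standard ordinary Kriging system written in terms of a variogram. The exponential variogram model, its parameter values, and the function names are assumptions for illustration and are not the study's VBA implementation.

```python
import numpy as np

def exp_variogram(h, nugget, sill, corr_scale):
    """Exponential variogram (an assumed model); zero at zero distance."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0,
                    nugget + (sill - nugget) * (1.0 - np.exp(-h / corr_scale)),
                    0.0)

def ordinary_kriging(xy, values, targets, variogram):
    """Return OK predictions and Kriging variances at the target points."""
    n = len(xy)
    # Left-hand side: variogram between samples, bordered by the
    # unbiasedness constraint (weights sum to one).
    d = np.linalg.norm(xy[:, None] - xy[None, :], axis=-1)
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = variogram(d)
    lhs[n, n] = 0.0
    # Right-hand side: variogram between samples and each target.
    d0 = np.linalg.norm(xy[:, None] - targets[None, :], axis=-1)
    rhs = np.ones((n + 1, len(targets)))
    rhs[:n] = variogram(d0)
    sol = np.linalg.solve(lhs, rhs)
    weights, lagrange = sol[:n], sol[n]
    prediction = weights.T @ values
    variance = np.sum(weights * rhs[:n], axis=0) + lagrange
    return prediction, variance

xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
values = np.array([1.0, 2.0, 3.0, 4.0])
model = lambda h: exp_variogram(h, nugget=0.0, sill=1.0, corr_scale=5.0)
pred, var = ordinary_kriging(xy, values, np.array([[0.5, 0.5]]), model)
print(pred, var)
```

The Kriging variance returned alongside each prediction is the quantity from which confidence intervals around the estimate can be built.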

Genetic algorithms (GAs) are a subset of heuristic evolutionary algorithms that generate globally optimal or near-optimal solutions for diverse, complex, and globally distributed problems by mimicking the process of natural evolution. While heuristic optimization methods cannot ensure global optimal solutions, they have the advantage of searching discrete solution-spaces globally. Gradient-based search methods lack this capability, as they require continuous solution-spaces and run the risk of getting stuck in local optimal solutions [
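A minimal sketch of a GA over a discrete solution space is given below: each individual is a set of k candidate locations, and a user-supplied fitness function is minimized. The operators (elitist truncation selection, union-based crossover, single-gene mutation), all parameter values, and the toy cost function are assumptions for illustration; in the study the fitness would be the ATF, which is not reproduced here.

```python
import numpy as np

def genetic_search(fitness, n_candidates, k, pop_size=30, generations=40,
                   mutation_rate=0.2, seed=0):
    """Pick k of n_candidates locations minimizing `fitness` (lower is better)."""
    rng = np.random.default_rng(seed)
    pop = [rng.choice(n_candidates, size=k, replace=False)
           for _ in range(pop_size)]
    history = []                                  # best score per generation
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        order = np.argsort(scores)
        pop = [pop[i] for i in order]
        history.append(float(scores[order[0]]))
        elite = pop[: pop_size // 2]              # survivors, kept unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            i, j = rng.choice(len(elite), size=2, replace=False)
            # Crossover: draw k distinct genes from the parents' union.
            pool = np.union1d(elite[i], elite[j])
            child = rng.choice(pool, size=k, replace=False)
            # Mutation: swap one gene for a location not yet in the child.
            if rng.random() < mutation_rate:
                unused = np.setdiff1d(np.arange(n_candidates), child)
                child[rng.integers(k)] = rng.choice(unused)
            children.append(child)
        pop = elite + children
    scores = np.array([fitness(ind) for ind in pop])
    best = int(np.argmin(scores))
    history.append(float(scores[best]))
    return np.sort(pop[best]), history

# Toy stand-in for the objective: each candidate location has a fixed cost.
costs = np.arange(20.0)
best, history = genetic_search(lambda ind: costs[ind].sum(), n_candidates=20, k=3)
print(best, history[-1])
```

Because the elite half of each generation survives unchanged, the best score never gets worse from one generation to the next, although a global optimum is not guaranteed.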

All of the statistical and optimization analyses in this study were performed with programs created and developed in VBA for Microsoft Excel (2010).

When sill values vary in the manner shown in

When the values of correlation scale differ in the manner shown in

As SP changes, the optimal number of samples with the relatively lowest ATF varies. As

Based on the result of preliminary grid sampling analysis (optimal number of samples, 121; assumed threshold value, 1.5), one AAPF (shown in

A histogram of the sampling data on this randomly selected AAPF is illustrated in

The sample semivariogram calculated to account for the spatial structure of concentrations (see
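A sample semivariogram of the kind referred to here can be computed with the classical estimator, half the mean squared difference of all data pairs within each distance class. The sketch below assumes equal-width lag bins; the function name and the bin settings are illustrative and are not taken from the study.

```python
import numpy as np

def empirical_semivariogram(xy, values, lag_width, n_lags):
    """Return mean lag distance, semivariance and pair count per non-empty bin."""
    d = np.linalg.norm(xy[:, None] - xy[None, :], axis=-1)
    half_sq = 0.5 * (values[:, None] - values[None, :]) ** 2
    upper = np.triu_indices(len(xy), k=1)         # count each pair once
    d, half_sq = d[upper], half_sq[upper]
    bins = np.floor(d / lag_width).astype(int)
    lags, gamma, counts = [], [], []
    for b in range(n_lags):
        mask = bins == b
        if mask.any():
            lags.append(d[mask].mean())
            gamma.append(half_sq[mask].mean())
            counts.append(int(mask.sum()))
    return np.array(lags), np.array(gamma), np.array(counts)

xy = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
values = np.array([0.0, 1.0, 2.0])
print(empirical_semivariogram(xy, values, lag_width=1.0, n_lags=3))
```

A variogram model (for example, one with a nugget, sill, and correlation scale) is then fitted to these points before Kriging.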

By successively applying different confidence intervals (CIs; 50% to 95% in 5% increments, plus 99%) to this AAPF after the structural analysis, Kriging generates solutions (realizations) for the treatment areas with different corresponding ATFs, until the solution with the lowest ATF is obtained. The best CI values, however, differ between AAPFs. By analyzing the best CI values for different AAPFs, we found a relation between the best CI values and pollution rates (PR: APA/total area of AAPF × 100%, as shown in

The best CI range and best average CI value (calculated from the best CI range) can be suggested for AAPFs with different SDPR ranges, which are generally summarized in

SDPR ranges (%) | 0 - 4^{a} | 4 - 14 | 14 - 24 | 24 - 44 | 44 - 64 | 64 - 84 | 84 - 100^{a} |
---|---|---|---|---|---|---|---|
Best CI range for Kriging (%) | 95 - 99^{a} | 85 - 95 | 80 - 90 | 70 - 80 | 65 - 75 | 55 - 70 | 50 - 60^{a} |
Best average CI value for Kriging (%) | 95^{a} | 90 | 85 | 75 | 70 | 60 | 55^{a} |
Best CI range for additional samples (%) | 95 - 99^{a} | 90 - 95 | 90 - 95 | 80 - 85 | 75 - 80 | 65 - 70 | 55 - 60^{a} |

^{a}These values are extrapolated based on the trend of best CI values suggested in the other SDPR ranges.
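The table can be applied as a simple lookup from an SDPR value to the suggested CI settings, as in the sketch below. The names `CISuggestion` and `suggest_ci` are hypothetical, and because adjacent ranges share an endpoint (for example, 4%), the sketch assumes that each range includes its lower bound and excludes its upper bound, except for the last range, which includes 100%.

```python
from typing import NamedTuple, Tuple

class CISuggestion(NamedTuple):
    kriging_range: Tuple[int, int]      # best CI range for Kriging (%)
    kriging_average: int                # best average CI value for Kriging (%)
    additional_range: Tuple[int, int]   # best CI range for additional samples (%)
    extrapolated: bool                  # True for values marked with footnote a

# (SDPR lower bound, SDPR upper bound, suggestion), copied from the table.
_ROWS = [
    (0, 4, CISuggestion((95, 99), 95, (95, 99), True)),
    (4, 14, CISuggestion((85, 95), 90, (90, 95), False)),
    (14, 24, CISuggestion((80, 90), 85, (90, 95), False)),
    (24, 44, CISuggestion((70, 80), 75, (80, 85), False)),
    (44, 64, CISuggestion((65, 75), 70, (75, 80), False)),
    (64, 84, CISuggestion((55, 70), 60, (65, 70), False)),
    (84, 100, CISuggestion((50, 60), 55, (55, 60), True)),
]

def suggest_ci(sdpr_percent):
    """Return the tabulated CI suggestion for an SDPR value in percent."""
    if not 0 <= sdpr_percent <= 100:
        raise ValueError("SDPR must lie between 0 and 100 percent")
    for low, high, suggestion in _ROWS:
        # Assumed boundary rule: lower bound inclusive, upper bound exclusive.
        if low <= sdpr_percent < high:
            return suggestion
    return _ROWS[-1][2]                 # sdpr_percent == 100

print(suggest_ci(10))
```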

Based on the analysis above, the combination of preliminary grid sampling and the Kriging method can result in more accurate estimations of pollution distribution with relatively higher CTRs, which in turn can result in more cost-effective strategies with relatively lower ATFs for decision-makers for remediation. As a result,

Ten AAPFs were randomly selected in each SDPR range (except the first range (0% - 4%) and last range (84% - 100%)) based on

Here we use additional samples taken in five candidate areas (contoured by CI ranges: 75% - 80%, 80% - 85%, 85% - 90%, 90% - 95%, and 95% - 99%) of 10 randomly selected AAPFs (SDPR range: 4% - 14%) as an example to show how the ATFs change after GA optimization.

As a result, the corresponding ATFs and CTRs for the 50 AAPFs were compared with the ATFs and CTRs calculated by the other three approaches, after GA optimization using the best CI range suggested for additional samples in

higher SDPR always offers relatively less room for optimization. Even though the ATFs (blue, green, and purple lines) are closer to each other, the structures of the ATFs (the ratios or weights of the TFs and CTRs) vary. From

In a study on multi-phase sampling for soil remediation surveys by Marchant et al., the additional samples were generally dispersed on the boundaries between areas which, according to the first phase, either required or did not require remediation [

bles inside the red circles) are always selected at the joint locations of polluted and unpolluted areas, where there are abundant sampling data near the threshold value, whereas the additional samples are generally dispersed over the full scale of the AAPF rather than concentrated in any one area.

In conclusion, our study combined the grid sampling method, Kriging estimation, and GA optimization for additional samples into an overall strategy for investigating and remediating polluted fields. Many different AAPFs generated by US were used in place of real-world polluted sites to give relatively optimal suggestions to engineers or decision-makers, such as parameters to use in Kriging estimation (as shown in

We thank Shweta Yadav, Ryota Gomi and Nguyen Thi Thuong for their assistance with the Figures used in this study.

Yong-Qiang Cui, Minoru Yoneda, Yoko Shimada, Yasuto Matsui (2016) Cost-Effective Strategy for the Investigation and Remediation of Polluted Soil Using Geostatistics and a Genetic Algorithm Approach. Journal of Environmental Protection, 07, 99-115. doi: 10.4236/jep.2016.71010