Investigation of the Potential of Near Infrared Spectroscopy for the Detection and Quantification of Pesticides in Aqueous Solution

This research investigates the potential of near infrared spectroscopy (NIRS) for the detection and quantification of pesticides in aqueous solution. Standard solutions of Alachlor and Atrazine (ranging in concentration from 1.25 to 100 ppm) were prepared by dilution in a 1:1 (v/v) methanol/water solvent. Near infrared transmission spectra were obtained in the wavelength region 400-2500 nm; however, the wavelength regions below 1300 nm and above 1900 nm were omitted from subsequent analysis due to poor signal repeatability in these regions. Partial least squares analysis was applied for discrimination between pesticide and solvent and for prediction of pesticide concentration. Limits of detection of 12.6 ppm for Alachlor and 46.4 ppm for Atrazine were obtained.


Introduction
In order to enhance monitoring of pesticides, it is necessary to develop low cost, rapid methods for their detection which can be integrated into water flow systems [1]. Vibrational spectroscopy comprises a group of methods that may be applied for monitoring water quality. Among the broad spectrum of techniques belonging to this family, Fourier Transform Infrared (FT-IR) and Attenuated Total Reflectance (ATR) spectroscopy in the mid infrared (MIR) wavelength range (2500-16,000 nm) have to date been developed for contaminant detection in water [2]. Due to the high absorption of MIR light by water, these techniques have depended on the use of pre-enrichment steps such as solid phase microextraction. Methods based on coating the ATR crystals with polymer films that have an affinity for certain contaminants have also been demonstrated. One example is a method developed for pesticide detection employing PVC coated ATR crystals; in that study, detection limits of around 2 ppm were reported for Atrazine and Alachlor [3]. However, a 15 minute enrichment time followed by a 5 minute water wash was required for each measurement. Such relatively lengthy measurement times rule out the possibility of online monitoring.
In the lower wavelength near infrared (NIR) range (750-2500 nm), the absorption coefficient of water is around 100 to 1000 times less than that in the MIR. This facilitates greater sample thickness and direct measurement of water samples. In addition, NIR spectroscopy (NIRS) is well suited to rapid online measurements. However, NIR spectra are more complicated to analyse than IR spectra due to the combination and overlapping of the vibrational modes present. In order to extract useful information, it is necessary to apply multivariate techniques such as principal components analysis (PCA) and partial least squares regression (PLSR) [4]. Nevertheless, the detection of low concentrations of contaminants in aqueous solution has been demonstrated using NIRS; researchers recently reported the use of NIR transmission spectroscopy for prediction of metal concentration in aqueous solutions, with reported limits of detection ranging from 10 to 40 ppm [5]. Although metals do not absorb light in the NIR, their presence is detectable due to the interaction of metal ions with OH bonds in water. Aquaphotomics aims to exploit such interactions between water and NIR light to extract information on the state of aqueous systems [6]. Water is the main component in numerous biological (and many non-biological) systems; however, its structure is perturbed by the presence of various components such as salts, proteins, sugars and other biomolecules. With this in mind, Aquaphotomics aims to characterize the effect of various perturbations on water structure using NIR. Should changes in the water absorbance patterns arising from various contaminants be sufficiently distinguishable, the framework of Aquaphotomics shows potential for contaminant detection in aqueous systems.
The objective of this work is to evaluate the potential of NIRS and Aquaphotomics for the detection of pesticides directly in aqueous solution. Although researchers have demonstrated the potential of NIRS for the detection of pesticide residues on foods [7,8], to the best of our knowledge there are no previous studies reporting the use of NIRS for detecting pesticides directly in aqueous solution. Alachlor and Atrazine were selected as test analytes for this study. During the 1980s, Alachlor was introduced as a substitute for Atrazine. These two herbicides have subsequently become important in the monitoring of large scale water bodies and are commonly used in studies on the development of pesticide contamination detectors [3]. Atrazine, one of the most frequently applied herbicides in the USA, is a triazine pesticide, and Alachlor, a major corn herbicide, is an acetanilide (Figure 1). The maximum contaminant levels of Atrazine and Alachlor under the US EPA Safe Drinking Water Act (SDWA) are 3 and 2 μg/L (3 and 2 parts per billion (ppb)), respectively [9].

Sample Preparation
Due to the low solubility of the selected pesticides in water, working stock solutions at 100 mg·L⁻¹ were prepared by direct dilution in a solvent of 1:1 methanol/water (v/v) using deionized water from a Milli-Q water purification system (Millipore, Molsheim, France) [10]. Further dilutions were made by serial dilution in this solvent to create a series with the following concentrations: 50, 20, 10, 5, 2.5, 1 mg·L⁻¹ (ppm). The dilutions were made with the same solvent in order to ensure that changes in the absorbance signal were due to the pesticide and not due to the changing concentration of solvent. Methanol and standard quantities of Alachlor (catalogue number: P-102NM-250) and Atrazine (catalogue number: P-005NM-250) were purchased from Wako Pure Chemical Industries (Tokyo, Japan).
The experimental work was carried out in three stages. In the first stage (carried out between December 2010 and April 2011), relatively high pesticide concentrations were tested (5, 10, 50 and 100 ppm). This high range of concentrations was examined in order to test the feasibility of the method. In the second stage (carried out between June and August 2011), lower pesticide concentrations were employed to further test the detection limit of the proposed method (1.25, 2.5, 5, 10 and 20 ppm). In the third stage (carried out in November 2011), intermediate pesticide concentrations were tested (1.25, 2.5, 5, 10, 25 and 50 ppm). For the first two stages, each experiment was repeated six times (twice per day on three different days), while for the final stage, each experiment was repeated four times (twice per day on two different days). The second experimental day for each stage was chosen as an independent test set, and the calibration dataset was composed of the remaining data.
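The day-based hold-out described above can be illustrated with a minimal sketch. The record layout (dictionaries with `day`, `repeat` and `ppm` keys) and the function name `split_by_day` are hypothetical and chosen only for illustration; they are not taken from the in-house code used in this study.

```python
def split_by_day(records, test_day=2):
    """Hold out one experimental day as the test set; the rest is calibration."""
    calibration = [r for r in records if r["day"] != test_day]
    test = [r for r in records if r["day"] == test_day]
    return calibration, test


# Hypothetical bookkeeping for the first stage:
# 3 days x 2 repeats per day x 4 concentrations.
concentrations_ppm = [5, 10, 50, 100]
records = [
    {"day": day, "repeat": rep, "ppm": ppm}
    for day in (1, 2, 3)
    for rep in (1, 2)
    for ppm in concentrations_ppm
]

calibration, test = split_by_day(records, test_day=2)
print(len(calibration), "calibration records,", len(test), "test records")
```

Splitting by whole days, rather than at random, keeps day-to-day instrument and sample variation out of the calibration set, so the test error reflects performance on an unseen measurement session.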

NIR Spectra Collection
Transmittance spectra were acquired using an NIR System 6500 spectrophotometer (Foss NIR-System, Laurel, USA), fitted with a quartz cuvette with a 1 mm optical path length. Spectra were measured over the wavelength region of 400-2500 nm, in 2 nm steps. The spectral data were transformed to pseudo-absorbance units (log(1/T), where T = transmittance). Transmittance spectra of the samples of different pesticide concentrations were collected in random order at a temperature of 28 °C ± 1 °C. The temperature of the sample holder was measured after each spectral acquisition. Five consecutive spectra were acquired from each sample. In order to monitor any potentially interfering signals, two control measurements were taken during each experiment. The first control was a sample of the solvent, while the second consisted of measuring the empty space (air) between the light source and detector. These controls were measured at the beginning, middle and end of each experiment. The duration of each experiment was approximately two hours.
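The transformation to pseudo-absorbance can be sketched as follows. The base-10 logarithm is assumed here, as is conventional for absorbance units; the function name and the example transmittance values are illustrative only.

```python
import math


def pseudo_absorbance(transmittance):
    """Convert transmittance values (0 < T <= 1) to log10(1/T)."""
    absorbance = []
    for t in transmittance:
        if t <= 0:
            raise ValueError("transmittance must be positive")
        absorbance.append(math.log10(1.0 / t))
    return absorbance


# Illustrative transmittance values (fractions, not percentages).
spectrum_T = [1.0, 0.5, 0.1, 0.01]
spectrum_A = pseudo_absorbance(spectrum_T)
print(spectrum_A)
```

Under this assumption, an absorbance of 2 units corresponds to only 1% of the incident light reaching the detector.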

Data Analysis
All data analysis was carried out in Matlab (The MathWorks, Inc., Natick, MA) using in-house functions. A number of data pretreatments were applied to the spectra, as follows: mean centering, multiplicative scatter correction (MSC), extended multiplicative signal correction (EMSC), first and second derivative Savitzky-Golay (SG) pretreatments and standard normal variate (SNV) pretreatment [4]. In order to improve model robustness, calibration models were built using all five consecutive spectra. These models were then applied to the mean of the five consecutive spectra in the test set.
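Two of the simpler pretreatments, SNV and mean centering, together with the averaging of replicate spectra, can be sketched as below. This is a minimal Python illustration rather than the in-house Matlab functions used in the study; the function names and toy spectra are assumptions, and SNV is written with the sample standard deviation.

```python
import math


def snv(spectrum):
    """Standard normal variate: centre and scale each spectrum individually."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / sd for x in spectrum]


def mean_center(spectra):
    """Subtract the mean spectrum (column means) from every spectrum."""
    n = len(spectra)
    col_means = [sum(col) / n for col in zip(*spectra)]
    centered = [[x - m for x, m in zip(row, col_means)] for row in spectra]
    return centered, col_means


def average_replicates(spectra):
    """Average consecutive replicate spectra of one sample."""
    n = len(spectra)
    return [sum(col) / n for col in zip(*spectra)]


# Toy data: rows are spectra, columns are wavelengths.
snv_example = snv([1.0, 2.0, 3.0])
centered, means = mean_center([[1.0, 2.0], [3.0, 4.0]])
replicate_mean = average_replicates([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```

When a calibration model is applied to test spectra, the column means estimated on the calibration set would normally be reused, not re-estimated on the test set.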
The EMSC model can be described as follows:

$$X = b_0 + b_1 \bar{X} + b_2 I + \varepsilon \qquad (1)$$

where $X$ represents an observed spectrum, $b_0$, $b_1$ and $b_2$ are constants, $I$ is the spectrum of an interferent (in practice, multiple interferent terms can be included in the model), $\bar{X}$ is a reference spectrum (usually the mean) and $\varepsilon$ is the residual. The constant terms can be estimated by multiple linear regression, and a corrected spectrum $X_{corr}$ may be calculated by rearranging Equation (1):

$$X_{corr} = \frac{X - b_0 - b_2 I}{b_1} \qquad (2)$$

The resultant EMSC corrected spectra are orthogonal to those of the interferents. In this study, the first principal component (PC1) spectra of the controls were used as interferent spectra.
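A minimal sketch of a single-interferent EMSC correction is given below. It assumes the standard form of the model (observed spectrum regressed on a constant, a reference spectrum and one interferent spectrum, followed by removal of the offset and interferent terms and division by the reference coefficient). The helper names `solve_linear` and `emsc_correct` and the toy spectra are illustrative, and the regression is solved through the normal equations only to keep the example free of external libraries.

```python
def solve_linear(a, b):
    """Solve a small square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def emsc_correct(x, reference, interferent):
    """Fit x = b0 + b1*reference + b2*interferent + residual by least squares,
    then return (x - b0 - b2*interferent) / b1 and the fitted constants."""
    columns = [[1.0] * len(x), reference, interferent]
    gram = [[sum(p * q for p, q in zip(ci, cj)) for cj in columns]
            for ci in columns]
    rhs = [sum(p * q for p, q in zip(ci, x)) for ci in columns]
    b0, b1, b2 = solve_linear(gram, rhs)
    corrected = [(xi - b0 - b2 * ii) / b1 for xi, ii in zip(x, interferent)]
    return corrected, (b0, b1, b2)


# Toy spectra: the observed spectrum is built from known constants so that
# the correction should recover the reference spectrum.
reference = [1.0, 2.0, 4.0, 3.0, 5.0]
interferent = [0.0, 1.0, 0.0, -1.0, 2.0]
observed = [0.5 + 2.0 * r + 3.0 * i for r, i in zip(reference, interferent)]

corrected, constants = emsc_correct(observed, reference, interferent)
```

For real spectra with many wavelengths and several interferent terms, a QR or SVD based least squares routine would be a more numerically stable choice than the normal equations.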

Exploratory Analysis
Principal component analysis (PCA) [4] was used for exploratory analysis and to examine the wavelength ranges at which the experiments were most repeatable.

Classification
Partial least squares discriminant analysis (PLSDA) [4] was employed to discriminate between the solvent and pesticide-containing solutions. The spectra of the solvent were assigned a dummy index of 0, while those of the pesticide solutions were assigned a value of 1. PLS regression was applied to the data and a threshold was applied to the resulting PLS predictions; any predicted value above the threshold was assigned to class 1, while all other predictions were assigned to class 0. In order to avoid model overfitting, the method proposed by Gowen et al. was employed [11]. The percentage of correct classifications for each model on the independent test set was calculated.
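The thresholding step and the percentage of correct classifications can be sketched as follows. The threshold of 0.5 (midway between the two dummy values) is an assumption, since the threshold used in the study is not stated here; the predicted values are invented for illustration and do not come from a fitted PLS model.

```python
def classify(predictions, threshold=0.5):
    """Assign class 1 to predictions above the threshold, otherwise class 0."""
    return [1 if p > threshold else 0 for p in predictions]


def percent_correct(predicted_classes, true_classes):
    """Percentage of samples whose predicted class matches the true class."""
    hits = sum(p == t for p, t in zip(predicted_classes, true_classes))
    return 100.0 * hits / len(true_classes)


# Invented PLS predictions for eight test spectra
# (true class: 0 = solvent, 1 = pesticide solution).
pls_predictions = [0.1, 0.4, 0.6, 0.9, 0.45, 0.8, 0.55, 0.2]
true_classes = [0, 0, 1, 1, 1, 1, 0, 0]

predicted_classes = classify(pls_predictions)
accuracy = percent_correct(predicted_classes, true_classes)
```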

Predictive Modelling
Calibration models were built to predict pesticide concentration using PLS regression (PLSR) [4]. In order to avoid model overfitting, the method proposed by Gowen et al. was employed [11]. After selecting the optimal number of latent variables for inclusion, the root mean squared error of prediction (RMSEP) was calculated based on the predictive performance of the model on the test set. Due to the nonlinear distribution of pesticide concentrations, a log transformation was also applied; however, this did not improve the predictive ability of the models. Therefore, only results for predictive models built using the original units of concentration are reported here.
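The RMSEP on a test set can be computed as in the sketch below, which assumes the plain (not bias-corrected) definition: the square root of the mean squared difference between predicted and reference concentrations. The function name and the example values are illustrative only.

```python
import math


def rmsep(predicted, reference):
    """Root mean squared error of prediction on a test set."""
    n = len(reference)
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / n)


# Invented example: reference and predicted concentrations in ppm.
reference_ppm = [5.0, 10.0, 50.0, 100.0]
predicted_ppm = [8.0, 6.0, 53.0, 96.0]

error_ppm = rmsep(predicted_ppm, reference_ppm)
```

Because RMSEP is expressed in the units of the reference values, it can be compared directly with the concentration range of the samples.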

Limit of Detection Calculation
The limit of detection of the procedure was calculated using Equation (3) [12], in which the subscript blank refers to a sample not containing the pesticide and the subscript low refers to a sample containing a low concentration of the pesticide. In this study, the spectra of the solvent were used as blank samples, while spectra of pesticide solutions containing 1.25-2.5 ppm were used as low samples.
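A blank/low limit of detection calculation can be sketched as follows. The sketch assumes the widely used limit-of-blank formulation, LOD = mean(blank) + 1.645 SD(blank) + 1.645 SD(low), which is consistent with the blank and low subscripts described above; whether this is exactly the form of Equation (3) is an assumption. It is also assumed that the formula is applied to model-predicted concentrations, and the values used are invented.

```python
import statistics


def limit_of_detection(blank_values, low_values, z=1.645):
    """Assumed formulation: limit of blank plus z times the SD of low samples.

    limit of blank = mean(blank) + z * SD(blank)
    limit of detection = limit of blank + z * SD(low)
    """
    limit_of_blank = statistics.mean(blank_values) + z * statistics.stdev(blank_values)
    return limit_of_blank + z * statistics.stdev(low_values)


# Invented predicted concentrations (ppm) for blank and low samples.
blank_predictions = [0.0, 1.0, 2.0]
low_predictions = [2.0, 4.0, 6.0]

lod_ppm = limit_of_detection(blank_predictions, low_predictions)
```

The factor 1.645 corresponds to the 95th percentile of a normal distribution, so the result depends on the blank and low predictions being approximately normally distributed.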

Safety Considerations
Alachlor and Atrazine are hazardous materials and were handled under standard laboratory safety conditions.

Spectra of Pesticide Solutions
The mean log(1/T) spectra of the Atrazine and Alachlor solutions are plotted in Figure 2. The main features of these spectra are major absorbance peaks at 1450, 1940 and 2270 nm, and a significant baseline effect, which increases with wavelength, is evident (Figure 2(a)). The mean spectra of the Atrazine and Alachlor solutions appear indistinguishable. However, when they are subtracted from each other (Figure 2(c)), it is evident that the main regions of difference occur around 1420 and 1900 nm. In order to further investigate the spectral changes occurring due to the addition of Atrazine or Alachlor to the water/methanol solvent, the average spectrum of the solvent was subtracted from the average spectrum of the 100 ppm pesticide solutions (Figure 2(b)). The root square of the difference spectra was scaled to the 0-1 range to improve clarity and enable comparison of the spectral regions affected by the addition of pesticides. The wavelength regions most affected by the addition of pesticide, for both Alachlor and Atrazine, occurred at 1450, 1908, 1974 and 2274 nm. The regions around 1450 and 1908 nm may be attributed to the first overtone and combination region of OH stretching and bending vibrations (for pure water, these occur within the ranges 1455-1476 nm and 1875-1910 nm [13]), the 1974 nm region corresponds to the combination of NH stretching and bending vibrations, and the 2274 nm region is probably due to CH combination vibrations [14]. When these spectra are subtracted from each other (Figure 2(d)), it is evident that the main regions of difference between Atrazine and Alachlor occur around 1420 and 1900 nm. These wavelength regions are related to the perturbation of the OH stretching and bending combination vibrations in the solvent.
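The scaled difference spectra can be sketched as below. The sketch assumes that the "root square" of the difference means the square root of the squared difference at each wavelength (its magnitude), followed by min-max scaling to the 0-1 range; the function name and toy spectra are illustrative only.

```python
import math


def scaled_root_square_difference(sample, solvent):
    """Magnitude of the sample-minus-solvent difference, scaled to 0-1."""
    diff = [math.sqrt((s - b) ** 2) for s, b in zip(sample, solvent)]
    lowest, highest = min(diff), max(diff)
    if highest == lowest:
        raise ValueError("difference spectrum is constant; cannot scale")
    return [(d - lowest) / (highest - lowest) for d in diff]


# Toy mean spectra at four wavelengths (pseudo-absorbance units).
pesticide_mean = [1.0, 1.2, 0.9, 1.5]
solvent_mean = [1.0, 1.0, 1.0, 1.0]

scaled = scaled_root_square_difference(pesticide_mean, solvent_mean)
```

Taking the magnitude discards the sign of the change, so positive and negative deviations from the solvent spectrum appear as peaks of equal standing, and the scaling allows the two pesticides to be compared on a common axis.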

Repeatability of Experiments
In order to investigate which wavelength regions would be most suitable for data modelling, the data were split into different wavelength ranges, from 700-2500 nm in steps of 300 nm. Principal component analysis (PCA) was applied to the data for each day and wavelength range, and the first PC loadings for each day were compared, as plotted in Figure 3. The root square value at each wavelength range is shown to avoid any confusion caused by sign ambiguity in PC loadings. It can be observed from the PC1 loadings that the data from the wavelength regions below 1200 nm and above 1900 nm are far noisier than those in the regions between these wavelengths. The noise evident at the spectral edges can be related to the performance of the detector, which is generally of lower efficiency in those wavelength regions. However, these noise features also arise due to the characteristics of the sample: the absorbance of the solvent exceeded 2 absorbance units at wavelengths greater than 1900 nm, due to the high absorption of light in this region. This indicates that the response of the detector is nonlinear in this region. It may also be observed that the Alachlor data showed greater repeatability (top line, Figure 3) than the Atrazine data (bottom line, Figure 3), especially in the 1300-1600 nm region. The wavelength region that appeared least noisy and most repeatable was 1300-1900 nm. For this reason, subsequent analysis was carried out in the following wavelength ranges: 1300-1600,

Discrimination of Pesticide and Solvent
The first task of the analysis was to investigate the potential for NIRS to discriminate between the solvent and samples of solvent containing pesticide. The data in the 1300-1900 nm wavelength region were subjected to a range of spectral pretreatments, and calibration models were constructed as described in Section 2.3.2. In spite of applying numerous spectral pretreatments to the data, it was found that raw log(1/T) data were optimal for discrimination between pesticide solutions and solvent (Table 1). For the Alachlor dataset, 100% correct classification (CC) was achieved by mean centering the raw log(1/T) data and building the model in the 1600-1900 nm wavelength range, while for the Atrazine dataset, 85% CC was achieved by mean centering the raw log(1/T) data and building the model in the 1300-1600 nm wavelength range (although the same classification performance was achieved by application of SNV or EMSC pretreatment in the 1600-1900 nm range).

Prediction of Pesticide Concentration
After discriminating the samples according to the presence or absence of pesticide, the next objective was to predict the amount of pesticide present. For this purpose, PLSR was applied. The model performance in terms of RMSEP on the independent test set for the range of pretreatments tested is shown in Table 2. For the Alachlor dataset, the model resulting in the lowest prediction error (11.3 ppm) was one built on SNV pretreated and mean centered data in the 1300-1900 nm range. For the Atrazine data, the best performing model (RMSEP = 15.7 ppm) resulted from the application of second derivative Savitzky-Golay pretreatment (SG2) to data in the 1300-1600 nm wavelength range, followed by mean centering. This is the same wavelength range that was optimal for the classification of Atrazine, as discussed in the previous section. The poorer performance of the models for Atrazine as compared to Alachlor, for both classification (see previous section) and quantification, is notable. This may be related to the lower repeatability of the Atrazine data, as observed in the PC loading plots (Figure 3).

Discrimination of Pesticide and Solvent
Analysis of the high concentration experiments revealed that RMSEP values of 10-15 ppm could be obtained, indicating the feasibility of the proposed method. Subsequent experiments were carried out to examine the potential of NIRS for detection of pesticides at lower concentrations. Similar to the results for the high concentration dataset, the best results for discrimination between pesticide and solvent were achieved using raw log(1/T) data (Table 3). 100% correct classification was achieved for the Alachlor dataset with a model built on mean centered log(1/T) data in the 1300-1900 nm wavelength range. Models built on the 1600-1900 nm range, which was the optimal range for the high concentration Alachlor experiment, performed poorly in this case, achieving no more than 71% correct classification. This indicates that different mechanisms underlie the classification models for the high and low concentration datasets, and that the first overtone of the OH stretching and bending vibrations (1300-1600 nm) is important for the prediction of lower concentrations of pesticides. The best model for the Atrazine dataset was achieved for mean centered log(1/T) data in the 1300-1600 nm range, with a classification accuracy of 83.3%. In the case of Atrazine, the 1300-1600 nm wavelength region was optimal for discrimination between pesticide-containing solutions and solvent for both the high and low concentration datasets.

Prediction of Pesticide Concentration
The optimal calibration model for the prediction of Alachlor concentration was built on EMSC pretreated data in the 1300-1600 nm wavelength range, resulting in an RMSEP of 4.4 ppm, while that for Atrazine was built on EMSC pretreated data in the wavelength range 1300-1900 nm, resulting in an RMSEP of 15 ppm (Table 4). These low prediction errors indicate the potential of NIRS for prediction of low concentrations of pesticide in aqueous solution. In order to examine the potential of a   first overtone of the OH stretching and bending modes of the solvent was important for their identification and quantification. The proposed method shows potential for direct measurement of low concentrations of pesticides in aqueous solution. However, the limits of detection achieved by analysis of the combined low and high concentration experiments (12.6 ppm for Alachlor and 46.4 ppm for Atrazine) are roughly four orders of magnitude higher than the maximum contaminant level of each allowed under the SDWA (2 and 3 ppb, respectively). It is also important to note that these experiments were carried out under artificial laboratory conditions. It is well known that the NIR spectrum of aqueous samples is susceptible to changes in the environment (e.g. temperature, humidity) and sample (e.g. pH, turbidity). Therefore, further experiments to test the effect of such perturbations on predictive ability should be carried out.
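The gap between the reported limits of detection and the regulatory limits can be made explicit with a short calculation. It uses only the values quoted in this paper and the conversion 1 ppm = 1000 ppb; the function name is illustrative.

```python
PPB_PER_PPM = 1000.0


def fold_above_limit(lod_ppm, mcl_ppb):
    """How many times the limit of detection exceeds the contaminant limit."""
    return lod_ppm * PPB_PER_PPM / mcl_ppb


# Limits of detection (ppm) and maximum contaminant levels (ppb) from the text.
alachlor_ratio = fold_above_limit(12.6, 2.0)
atrazine_ratio = fold_above_limit(46.4, 3.0)

print(f"Alachlor: about {alachlor_ratio:.0f} times the maximum contaminant level")
print(f"Atrazine: about {atrazine_ratio:.0f} times the maximum contaminant level")
```

The ratios are about 6300 for Alachlor and about 15,500 for Atrazine, which indicates the sensitivity gain, or pre-concentration factor, that would be needed for regulatory monitoring.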

Figure 1. Chemical structures of Atrazine and Alachlor.

Figure 2. (a) Mean spectra (mean over the 5-100 ppm concentrations) of Atrazine and Alachlor solutions; (b) main regions of difference between pesticide and solvent, shown in normalised (root square difference scaled between 0 and 1) difference spectra of 100 ppm pesticide minus solvent; (c) main regions of difference between Atrazine and Alachlor, shown in the difference between the mean Atrazine and Alachlor spectra in (a); (d) main regions of difference between Atrazine and Alachlor, shown by subtracting the spectra in (b).

Figure 3. Root squared first PC loading for PCA applied to raw absorbance data from (a) high concentration Alachlor (top line) and Atrazine (bottom line) experiments; (b) low concentration Alachlor (top line) and Atrazine (bottom line) experiments, where the wavelength range is indicated above each plot. Experimental day is represented by colour (red = day 1, blue = day 2, green = day 3).