Near-Infrared Spectroscopy Combined with Absorbance Upper Optimization Partial Least Squares Applied to Rapid Analysis of Polysaccharide for Proprietary Chinese Medicine Oral Solution

Near-infrared (NIR) spectroscopy was applied to the reagent-free quantitative analysis of polysaccharide in samples of a brand-name proprietary Chinese medicine (PCM) oral solution. A novel method, absorbance upper optimization partial least squares (AUO-PLS), was proposed and successfully applied to wavelength selection. Model parameters were optimized over varied partitionings of the calibration and prediction sample sets to achieve stability. With AUO-PLS, the selected upper bound of absorbance was 1.53 and the corresponding wavebands combination was 400–1880 & 2088–2346 nm. For random validation samples excluded from the modeling process, the root-mean-square error and correlation coefficient of prediction for polysaccharide were 27.09 mg·L⁻¹ and 0.888, respectively. The results indicate that the NIR-predicted values are close to the measured values. NIR spectroscopy combined with AUO-PLS is a promising tool for quantifying polysaccharide in PCM oral solution, and it is rapid and simple compared with conventional methods.


Introduction
Proprietary Chinese medicine (PCM) oral solution is a health-care nourishing product that is convenient to take. Guided by the theory of traditional Chinese medicine, modern research results, and practical experience, it is made by extracting active components from a variety of Chinese herbal medicines. The compound polysaccharide, as the main active ingredient of PCM oral solution, can effectively regulate and enhance human immunity, prevent diseases, and improve physical fitness. In the production of PCM oral solution, real-time determination of the polysaccharide content is necessary for monitoring product quality. The conventional method [1] requires sample pretreatment and chemical reagents, which makes real-time monitoring of production quality difficult. Therefore, a rapid, simple, and reagent-free method is of significant practical value.
Near-infrared (NIR) spectroscopy primarily reflects the absorption of overtones and combinations of vibrations of X-H functional groups (such as C-H, O-H, and N-H). Because the absorption is weak, most samples can be measured directly without preprocessing. This rapid, simple, and non-destructive technique has obvious advantages and is commonly used in many areas, including agriculture [2]-[6], food [7] [8], environment [9], biomedicine [10]-[13], and pharmaceuticals [14] [15]. However, to the best of our knowledge, no NIR-based method for quantifying polysaccharide in PCM oral solution has been developed yet. Because NIR spectra overlap severely and show no distinct absorption bands, especially for a multi-component PCM oral solution, appropriate chemometric methods must be employed for wavelength optimization and to obtain quantitative analysis models with a high signal-to-noise ratio (SNR). Such methods extract informative variables and remove noise interference. Partial least squares (PLS) regression is recognized as an effective multivariate analysis method and has been widely applied in spectral analysis [2]-[13].
Zengjian oral solution is a well-known brand of PCM health oral solution, produced by refining polysaccharide from natural plants such as tremella, enoki, and Chinese wolfberry. In this study, absorbance upper optimization PLS (AUO-PLS) was proposed, and NIR spectroscopy combined with AUO-PLS was successfully applied to the rapid, reagent-free quantification of polysaccharide in Zengjian oral solution.
The stability of a spectral analysis model is very important in practice. Numerous experiments show that differences in the partitioning of calibration and prediction sample sets can cause fluctuations in predictions and parameters (e.g., the number of PLS factors), leading to unstable results [3] [5] [8] [9] [12]. In the current study, a rigorous process of calibration, prediction, and validation based on randomness and stability was performed to achieve the goal of spectroscopic analysis.

Experimental Materials, Instruments, and Measurement Methods
A total of 1533 Zengjian oral solution samples were collected from Infinitus (China) Company Ltd. The polysaccharide concentrations of these samples were measured with a UV-2300 UV-Vis spectrophotometer (Shanghai Tianmei, China) using the mineral chameleon titration method. Mineral chameleon titration is a volumetric analysis method that uses potassium permanganate solution as the titrant. It requires chemical reagents and relies on a color reaction to quantify the polysaccharide concentration of a sample accurately. The measured values ranged from 330.26 mg·L⁻¹ to 679.99 mg·L⁻¹, with a mean of 484.67 mg·L⁻¹ and a standard deviation of 52.53 mg·L⁻¹; these values were used as the reference values for calibration modeling in the NIR spectroscopic analysis. Based on the resulting calibration model, a reagent-free NIR method for rapid determination of the polysaccharide concentration of PCM oral solution samples can be established.
An XDS Rapid Content™ Solution Grating Spectrometer (FOSS, Denmark) equipped with a transmission accessory and a 2-mm cuvette was used for spectral measurement. The scanning range spanned 400 nm to 2498 nm at a 2-nm wavelength interval, covering the entire NIR region and part of the visible region. Silicon and lead sulfide (PbS) detectors were used for the 400–1100 nm and 1100–2498 nm wavebands, respectively. Each sample was scanned three times, and the mean of the three measurements was used for modeling. The spectra were obtained at 25 °C ± 1 °C and a relative humidity of 45% ± 1%.

Calibration, Prediction, and Validation Process with Stability
First, 693 samples were randomly selected from the total of 1533 samples as the validation sample set, which was not subjected to the modeling optimization process. The remaining 840 samples were used as the modeling sample set and were randomly divided into calibration (420 samples) and prediction (420 samples) sample sets 100 times. Calibration and prediction models were established for all 100 divisions, and the model parameters were optimized according to the mean prediction effects over all divisions to obtain objective and stable models.
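The partitioning scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `partition_samples`, the use of sample indices in place of spectra, and the fixed random seed are assumptions introduced here, not details given in the study.

```python
import random

def partition_samples(n_total=1533, n_validation=693, n_calibration=420,
                      n_divisions=100, seed=0):
    """Return validation indices and a list of (calibration, prediction) splits."""
    rng = random.Random(seed)  # the seed is an assumption, used only for reproducibility
    indices = list(range(n_total))
    rng.shuffle(indices)
    validation = sorted(indices[:n_validation])  # held out of all optimization
    modeling = indices[n_validation:]            # the remaining 840 samples
    divisions = []
    for _ in range(n_divisions):
        shuffled = modeling[:]
        rng.shuffle(shuffled)
        calibration = sorted(shuffled[:n_calibration])
        prediction = sorted(shuffled[n_calibration:])
        divisions.append((calibration, prediction))
    return validation, divisions

validation, divisions = partition_samples()
print(len(validation), len(divisions), len(divisions[0][0]), len(divisions[0][1]))
# prints: 693 100 420 420
```

Each of the 100 divisions reuses the same 840 modeling samples, so the validation set never influences parameter selection.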
The root-mean-square errors (SEC, SEP) and correlation coefficients ($R_C$, $R_P$) for calibration and prediction in the modeling set were calculated. For each division $i$ of the calibration and prediction sets ($i = 1, 2, \ldots, 100$), they were denoted as $\mathrm{SEC}_i$, $\mathrm{SEP}_i$, $R_{C,i}$, and $R_{P,i}$, respectively. The mean values ($\mathrm{SEP}_{\mathrm{Ave}}$, $R_{P,\mathrm{Ave}}$) and standard deviations ($\mathrm{SEP}_{\mathrm{SD}}$, $R_{P,\mathrm{SD}}$) of $\mathrm{SEP}_i$ and $R_{P,i}$ over all divisions were then calculated. These values were used to analyze model prediction accuracy and stability. The quantity $\mathrm{SEP}^{+} = \mathrm{SEP}_{\mathrm{Ave}} + \mathrm{SEP}_{\mathrm{SD}}$ was used as a comprehensive indicator of the prediction accuracy and stability of a model; a smaller $\mathrm{SEP}^{+}$ indicates higher accuracy and stability. The model parameters were selected to achieve the minimum $\mathrm{SEP}^{+}$. The selected model was then revalidated against the validation sample set. The root-mean-square error and correlation coefficient of prediction in the validation sample set were calculated and denoted as SEP and $R_P$, respectively. The calculation formulas are as follows:

$$\mathrm{SEP} = \sqrt{\frac{1}{m}\sum_{k=1}^{m}\left(\tilde{C}_k - C_k\right)^2} \tag{1}$$

$$R_P = \frac{\sum_{k=1}^{m}\left(C_k - C_{\mathrm{Ave}}\right)\left(\tilde{C}_k - \tilde{C}_{\mathrm{Ave}}\right)}{\sqrt{\sum_{k=1}^{m}\left(C_k - C_{\mathrm{Ave}}\right)^2 \sum_{k=1}^{m}\left(\tilde{C}_k - \tilde{C}_{\mathrm{Ave}}\right)^2}} \tag{2}$$

where $m$ is the number of validation samples; $C_k$ and $\tilde{C}_k$ are the measured and predicted polysaccharide concentrations of the $k$th validation sample, respectively; and $C_{\mathrm{Ave}}$ and $\tilde{C}_{\mathrm{Ave}}$ are the mean measured and mean predicted polysaccharide values of all the validation samples, respectively.
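As an illustration of the two evaluation quantities defined above, the following sketch computes a root-mean-square error of prediction and a correlation coefficient for one sample set. It assumes that the root-mean-square error divides by the number of samples m and that the correlation coefficient is the usual Pearson form; the function names and the four concentration values are made up for the example.

```python
import math

def sep(measured, predicted):
    """Root-mean-square error of prediction (assumes division by m)."""
    m = len(measured)
    return math.sqrt(sum((p - c) ** 2 for c, p in zip(measured, predicted)) / m)

def r_p(measured, predicted):
    """Pearson correlation coefficient between measured and predicted values."""
    m = len(measured)
    c_ave = sum(measured) / m
    p_ave = sum(predicted) / m
    cov = sum((c - c_ave) * (p - p_ave) for c, p in zip(measured, predicted))
    var_c = sum((c - c_ave) ** 2 for c in measured)
    var_p = sum((p - p_ave) ** 2 for p in predicted)
    return cov / math.sqrt(var_c * var_p)

# Hypothetical concentrations in mg/L, for illustration only.
measured = [450.0, 480.0, 510.0, 540.0]
predicted = [460.0, 470.0, 520.0, 530.0]
print(sep(measured, predicted))            # 10.0
print(round(r_p(measured, predicted), 3))  # 0.956
```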

Selection of Number of PLS Factors with Stability
The number of PLS factors (F) is an important parameter of the PLS method; it is the number of spectral latent variables used to represent the sample information. Selecting a reasonable F is both necessary and difficult. If F is too small, the sample information in the spectra cannot be fully captured; if F is too large, extra noise is introduced into the model. Prediction ability declines in both cases. In the present study, F was selected according to the minimum $\mathrm{SEP}^{+}$ over all divisions of the calibration and prediction sample sets. Thus, the selected number of PLS factors was stable and practical.
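The selection rule can be illustrated with a short sketch that picks the parameter value with the smallest SEP+ (mean plus standard deviation of the per-division SEP values). The SEP values below are invented, only three divisions are shown instead of 100, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the text does not state which form was used.

```python
import statistics

def sep_plus(sep_values):
    """SEP+ = mean + standard deviation of the per-division SEP values."""
    # statistics.stdev is the sample (n - 1) form; this choice is an assumption.
    return statistics.mean(sep_values) + statistics.stdev(sep_values)

def select_by_sep_plus(sep_by_parameter):
    """Return the parameter value with minimum SEP+ and all SEP+ scores."""
    scores = {param: sep_plus(values) for param, values in sep_by_parameter.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Hypothetical per-division SEP values (mg/L) for three candidate values of F.
sep_by_f = {
    4: [31.0, 33.0, 35.0],
    5: [28.0, 29.0, 30.0],
    6: [27.0, 30.0, 33.0],
}
best_f, scores = select_by_sep_plus(sep_by_f)
print(best_f, scores[best_f])  # 5 30.0
```

In this toy example F = 6 reaches the lowest single SEP (27.0) but fluctuates more between divisions, so the criterion prefers F = 5, which is the intent of adding the standard deviation to the mean.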

AUO-PLS Method
The Lambert-Beer law is described by the following equation:

$$A(\lambda) = \log_{10}\frac{I_0(\lambda)}{I_1(\lambda)} = -\log_{10} T(\lambda) \tag{3}$$

where $\lambda$ is the wavelength; $A(\lambda)$ is the absorbance; $I_0(\lambda)$ and $I_1(\lambda)$ are the intensity of the incident light and the intensity of the light transmitted through the sample, respectively; and $T(\lambda) = I_1(\lambda)/I_0(\lambda)$ is the transmittance, i.e., the ratio of transmitted to incident light intensity. Conversely, Equation (3) can be expressed as follows:

$$I_1(\lambda) = I_0(\lambda)\cdot 10^{-A(\lambda)} \tag{4}$$

According to Equation (4), when $A(\lambda) = 4$, for example, the transmitted light intensity is merely one ten-thousandth of the incident light intensity, i.e., 99.99% of the incident light is absorbed by the sample. In this case, the transmitted light is very weak and difficult to detect, and is thus likely to cause noise in the spectrum. Therefore, it is necessary to select wavelengths with appropriate absorbance values, which correspond to high-quality sample information and low noise. In this study, a novel PLS-based wavelength selection method, named absorbance upper optimization PLS (AUO-PLS), was proposed. It is based on selecting the upper bound of absorbance, which can appropriately exclude noise bands. The specific steps are as follows:
Step 1: A wavelength screening region (Δ) was set in advance within the entire scanning region according to the physical and chemical characteristics of the measured objects and the instrument properties. In the average spectrum of all samples within the region Δ, the minimum and maximum absorbance values were denoted as $A_{\min}$ and $A_{\max}$, respectively. An appropriate absorbance step (ε) was set.
Step 2: A starting value $A^{*}$ with $A^{*} \ge A_{\min}$ was chosen, and the upper bound of absorbance $A_{\mathrm{upper}}$ was varied from $A^{*}$ to $A_{\max}$ in steps of ε. According to the relationship between wavelength and absorbance within the region Δ, each $A_{\mathrm{upper}}$ defines an absorbance interval $(A_{\min}, A_{\mathrm{upper}})$ that corresponds to a wavebands combination.
Step 3: Every wavebands combination obtained was used to establish PLS calibration and prediction models. The corresponding $\mathrm{SEP}_{\mathrm{Ave}}$, $R_{P,\mathrm{Ave}}$, $\mathrm{SEP}_{\mathrm{SD}}$, $R_{P,\mathrm{SD}}$, and $\mathrm{SEP}^{+}$ values were then calculated.
Step 4: The optimal $A_{\mathrm{upper}}$ was determined according to the minimum $\mathrm{SEP}^{+}$, and the wavebands combination corresponding to $(A_{\min}, A_{\mathrm{upper}})$ was selected.
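Steps 1 and 2 above can be sketched as follows: a grid of candidate upper bounds is generated, and for a given upper bound the wavelengths whose mean absorbance falls inside the interval are grouped into contiguous wavebands. This is a minimal sketch; the function names, the inclusive treatment of the interval endpoints, and the seven-point toy spectrum are assumptions made for illustration, and the PLS modeling of Step 3 is not included.

```python
def upper_bound_grid(a_star, a_max, step):
    """Candidate upper bounds A_upper from a_star to a_max in increments of step."""
    n = round((a_max - a_star) / step)
    return [round(a_star + k * step, 10) for k in range(n + 1)]

def select_wavebands(wavelengths, mean_absorbance, a_min, a_upper):
    """Group wavelengths with a_min <= absorbance <= a_upper into (start, end) bands."""
    bands, current = [], []
    for wl, a in zip(wavelengths, mean_absorbance):
        if a_min <= a <= a_upper:       # endpoint handling is an assumption
            current.append(wl)
        elif current:                    # a rejected wavelength closes the open band
            bands.append((current[0], current[-1]))
            current = []
    if current:
        bands.append((current[0], current[-1]))
    return bands

# Toy mean spectrum (hypothetical values) with a high-absorbance region in the middle.
wavelengths = [400, 402, 404, 406, 408, 410, 412]
mean_absorbance = [0.5, 0.9, 1.6, 3.8, 1.5, 1.2, 0.8]
print(select_wavebands(wavelengths, mean_absorbance, 0.0, 1.53))
# [(400, 402), (408, 412)]
print(len(upper_bound_grid(1.40, 5.00, 0.01)))  # 361
```

Each candidate upper bound yields one wavebands combination; Steps 3 and 4 would then score every combination by SEP+ and keep the minimum.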
In this study, the region Δ was set to the entire scanning region (400–2498 nm), which contains 1050 wavelengths. The minimum absorbance was greater than or close to zero and the maximum absorbance was less than or close to five; therefore, $A_{\min}$ and $A_{\max}$ were set to 0 and 5, respectively. Note that another obvious absorption peak, with an absorbance of 1.40, appears around 1450 nm. To retain the relevant information of that region, $A^{*}$ was set to 1.40 (i.e., $A_{\mathrm{upper}} > 1.40$), because the main purpose here is to remove the noise bands with saturated absorption. The absorbance step ε was set to 0.01, and the number of PLS factors (F) was set to F = 1, 2, …. Figure 1 shows a sketch map of the relationship between wavelength and absorbance for the case in which $A_{\mathrm{upper}} = 1.53$ and the corresponding wavebands combination is 400–1880 & 2088–2346 nm.
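To make the link between absorbance and detectable light concrete, the sketch below evaluates the transmitted fraction implied by the Lambert-Beer relation, T = 10^(-A), at the absorbance values mentioned in this section. The function name is introduced here for illustration; only the relation itself comes from the text.

```python
def transmitted_fraction(absorbance):
    """Fraction of incident light transmitted, T = 10 ** (-A)."""
    return 10.0 ** (-absorbance)

for a in (1.40, 1.53, 4.0, 5.0):
    print(f"A = {a:.2f}: T = {transmitted_fraction(a):.6f}")
```

At A = 1.53 roughly 3% of the incident light still reaches the detector, whereas at A = 4 only 0.01% does, which is why bands near saturation contribute mostly noise.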

Wavebands Combination Selection with AUO-PLS
The NIR spectra of the 1533 samples of Zengjian oral solution over the entire scanning region (400–2498 nm) are shown in Figure 2. As indicated in the figure, a saturated absorption region appears at about 1900–2000 nm. This saturated region was caused by the strong absorption of water molecules and by scattering from particulate components in the oral solution samples. The AUO-PLS method described in Section 2.4 was applied to avoid the noisy wavebands with high absorption.
The $\mathrm{SEP}^{+}$ values for each upper bound of absorbance $A_{\mathrm{upper}}$ are shown in Figure 3. The minimum $\mathrm{SEP}^{+}$ for polysaccharide prediction was achieved at about $A_{\mathrm{upper}} = 1.53$. The corresponding wavebands combination was 400–1880 & 2088–2346 nm, containing 871 wavelengths; the prediction accuracy and stability results ($\mathrm{SEP}_{\mathrm{Ave}}$, $R_{P,\mathrm{Ave}}$, $\mathrm{SEP}_{\mathrm{SD}}$, $R_{P,\mathrm{SD}}$, and $\mathrm{SEP}^{+}$) are summarized in Table 1. For comparison, the full PLS model based on the entire scanning region was also established, and its prediction effects are also summarized in Table 1. The $\mathrm{SEP}^{+}$ value of the optimal AUO-PLS model was 27.81 mg·L⁻¹, which was clearly better than that of the full PLS model. The relative SEP value (RSEP) for the optimal AUO-PLS model was 5.6%. The results show that avoiding the noisy wavebands with high absorption improved the prediction ability and reduced model complexity.
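The wavelength counts quoted in this section follow directly from the 2-nm sampling interval, as the short check below shows. The helper name `count_wavelengths` is introduced here only for the illustration.

```python
def count_wavelengths(bands, step_nm=2):
    """Number of sampled wavelengths in inclusive (start, end) bands at a fixed step."""
    return sum((end - start) // step_nm + 1 for start, end in bands)

full_region = [(400, 2498)]
selected = [(400, 1880), (2088, 2346)]
print(count_wavelengths(full_region))  # 1050
print(count_wavelengths(selected))     # 871
```

The selected combination therefore keeps 871 of the 1050 sampled wavelengths and drops the region around the saturated water absorption together with the long-wavelength end.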

Model Validation
The randomly selected validation samples, which were excluded from the modeling optimization process, were used to validate the selected AUO-PLS model. The PLS regression coefficients were calculated from the spectral data and measured polysaccharide concentrations of all modeling samples, using the selected parameter F. The predicted polysaccharide concentrations of the validation samples were then calculated from the obtained regression coefficients and the spectra of the validation samples.
Figure 4 shows the relationship between the NIR-predicted and measured values of the 693 validation samples. The evaluation values (SEP and $R_P$) for validation were 27.09 mg·L⁻¹ and 0.888, respectively. The results indicate that the NIR-predicted values of the validation samples are close to the measured values. Satisfactory validation effects were achieved for the random samples because stability was considered in the modeling optimization process.
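Once regression coefficients are available, predicting a validation sample reduces to a weighted sum of its absorbances at the selected wavelengths. The sketch below assumes the fitted PLS model is written in its linear form (an intercept plus one coefficient per wavelength); the function name and the three-wavelength numbers are hypothetical and far smaller than the 871-wavelength model.

```python
def predict_concentration(spectrum, coefficients, intercept):
    """Predicted concentration = intercept + sum of coefficient * absorbance."""
    if len(spectrum) != len(coefficients):
        raise ValueError("spectrum and coefficients must have the same length")
    return intercept + sum(b * a for b, a in zip(coefficients, spectrum))

# Hypothetical coefficients and absorbances at three selected wavelengths.
coefficients = [100.0, -50.0, 200.0]
intercept = 300.0
spectrum = [0.5, 1.0, 0.75]
print(predict_concentration(spectrum, coefficients, intercept))  # 450.0
```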

Conclusion
Wavelength selection is crucial for spectroscopic analysis, as it improves the effectiveness of prediction, reduces model complexity, and aids in the design of a specialized spectrometer with a high signal-to-noise ratio. The proposed AUO-PLS method focuses on optimizing the upper bound of absorbance to avoid noise interference caused by high absorbance. Based on the relationship between wavelength and absorbance, the appropriate wavebands combination was selected. NIR spectroscopy combined with the proposed AUO-PLS method was successfully employed for the reagent-free and rapid quantitative analysis of polysaccharide in Zengjian oral solution. A rigorous process of calibration, prediction, and validation based on randomness and stability was performed to produce objective and stable models. We believe that AUO-PLS is broadly applicable and can also be applied to other brand products of PCM health oral solution.

Figure 1. Sketch map of the relationship between wavelength and absorbance.

Figure 3. $\mathrm{SEP}^{+}$ values for each upper bound of absorbance with the AUO-PLS method.

Figure 4. Relationship between the predicted and measured values of the validation samples with AUO-PLS method.

Table 1. Prediction effects of full PLS and AUO-PLS models for polysaccharide.