Wavelength Selection for Near-Infrared Spectroscopic Analysis of Teicoplanin

Teicoplanin (TCP) is a lipoglycopeptide antibiotic, active against multidrug-resistant Gram-positive bacteria, that is produced by fermentation of Actinoplanes teichomyceticus. In this study, mixtures of TCP with Tris-HCl buffer (TCP-Tris-HCl) were used to simulate TCP fermentation broth. Reagent-free, rapid, and simultaneous quantitative analysis models for TCP and Tris in the TCP-Tris-HCl mixtures were established using near-infrared (NIR) spectroscopy. The equidistant combination partial least squares (EC-PLS) method and equivalence model sets were proposed, and the simplest equivalent model, that is, the one with the smallest number of wavelengths, was then selected. The initial wavelength, number of wavelengths, number of wavelength gaps, and number of PLS factors were 1520 nm, 28, 5, and 5 for TCP and 1084 nm, 13, 6, and 4 for Tris, respectively. Compared with the optimal EC-PLS models, the simplest equivalent models used fewer wavelengths, so redundant wavelengths were removed and the models were further simplified. The root-mean-square errors (SEP) and correlation coefficients (R_P) for prediction were 0.043 mg·mL−1 and 0.9998 for TCP, and 0.222 mg·mL−1 and 0.9989 for Tris, respectively. The results indicate that the NIR method can be applied to highly accurate quantitative analysis of TCP and provide a valuable reference for further application to TCP fermentation broth.


Introduction
Teicoplanin (TCP) is a novel lipoglycopeptide antibiotic, active against multidrug-resistant bacteria, that is produced by fermentation of Actinoplanes teichomyceticus [1]. TCP has strong bacteriostatic activity against Gram-positive aerobic and anaerobic bacteria. In particular, it is effective against infections caused by methicillin-resistant Staphylococcus aureus [2]. Compared with vancomycin, another antibiotic with good clinical effect, TCP has equal or better antibacterial efficacy, lower toxicity, and a more convenient and efficient administration route [3]. Accordingly, TCP has significant economic value and good application prospects. During TCP fermentation, real-time measurement of TCP and other components (such as biomass, nutrients, and metabolites) is indispensable for quality control. In addition, the measurement of blood TCP concentration is very important in pharmacokinetic studies [4] [5]. Therefore, it is necessary to establish a reagent-free and rapid method for measuring TCP concentration in a mixed system (e.g., fermentation broth or blood). At present, the existing methods for TCP measurement are high-performance liquid chromatography (HPLC), liquid chromatography-mass spectrometry, and microbiological assay [6]-[8]. These methods require large amounts of reagents and lengthy run times, which makes them unsuitable for rapid measurement of TCP.
As a rapid, non-destructive, eco-friendly, and cost-effective analytical technique, near-infrared (NIR) spectroscopy has been extensively used in agriculture [9]-[11], food [12], environment [13] [14], medicine and pharmaceuticals [15]-[18], and many other fields. It primarily reflects the absorption of overtones and combinations of the vibrations of X-H functional groups (such as C-H, O-H, and N-H). The TCP molecule contains C-H, O-H, and N-H functional groups, which have significant NIR absorption. In this study, we aimed to explore the feasibility of reagent-free and rapid quantitative analysis of TCP with NIR spectroscopy.
The Tris-HCl buffer, which is often used in biochemistry and molecular biology experiments because of its stability, is suitable for simulating the physiological environment of a living body, such as the enzyme reaction in cell sap [19] [20]. In this study, a quantitative analysis of a mixture of TCP with the Tris-HCl buffer (denoted as the TCP-Tris-HCl mixture) was performed. The Tris-HCl buffer was used to simulate the background liquid of TCP, which provides a reference for further application to the quantitative analysis of TCP. The mixture samples were prepared with different concentrations of Tris to make them representative of different backgrounds. The simultaneous quantification of TCP and Tris was achieved.
NIR spectra are generally composed of relatively weak and highly overlapping bands, so a multivariate calibration method must be used for their quantitative analysis. In parallel with chemometric developments, the reagent-free NIR analysis method shows substantial potential in drug monitoring. Partial least squares (PLS) has proven to be an effective method for extracting information and overcoming spectral collinearity. However, the prediction effect of PLS is difficult to improve when the signal-to-noise ratio of a waveband is not adequately high [9] [11] [12] [14] [16]. Appropriate wavelength selection is a key, albeit difficult, technical aspect of the rapid, reagent-free measurement of a complex system with NIR spectroscopy.
Moving window PLS (MW-PLS) is a PLS-based wavelength selection method that has performed well for many analytes [9] [11] [12] [14] [16] [21]. However, continuous-mode models based on MW-PLS typically have high model complexity. Based on the achievements of MW-PLS and multiple linear regression (MLR), the equidistant combination MLR (EC-MLR) was proposed in our previous study [11]. The EC-MLR method has fewer degrees of freedom and lower computational complexity, and it inherits the merits of both the continuous and discrete modes. Because the PLS method is widely and easily used and gives a better prediction effect than the MLR method for the same wavelength combinations, EC-MLR was extended in this study to equidistant combination PLS (EC-PLS) to achieve appropriate wavelength selections for TCP and Tris. From an algorithmic perspective, EC-PLS includes MW-PLS as a special case in the wavelength screening. In addition, a framework of equivalence model sets was established in a statistical sense, and simpler effective models were then obtained.

Experimental Materials, Instruments, and Measurement Methods
TCP standard products were purchased from the National Institutes for Food and Drug Control (Beijing, China). Tris was of analytical reagent grade. TCP-Tris-HCl mixtures were prepared to simulate TCP fermentation broth.
Given that the concentration of TCP can reach 3.2 mg·mL−1 at pH values of approximately 7.0 to 7.5 [22] [23], the samples were prepared in the following steps:
1) Ten Tris-HCl buffer solutions (100 mL), all at pH 7.3, were prepared using Tris, double-distilled water, and 1 N hydrochloric acid. These solutions, with Tris concentrations of 13-40 mg·mL−1, were numbered 1-10 in ascending order of Tris concentration.
2) Seventy-two TCP aqueous solutions were prepared using TCP standard products and double-distilled water, with TCP concentrations of 0.681-19.626 mg·mL−1, and were numbered 1-72 in descending order of TCP concentration.
3) Proceeding in order from the 1st sample to the 70th, one sample was taken and the following nine were passed over, so that 7 samples were extracted to form the 1st group. The 2nd to 9th groups were formed by the same systematic sampling, giving 9 groups in total. The remaining 9 samples formed the 10th group. Within each group, the TCP concentrations were therefore in descending order.
4) Seventy-two TCP-Tris-HCl mixtures were prepared by mixing each TCP aqueous solution (0.4 mL) with the Tris-HCl buffer solution (0.4 mL) whose number corresponded to the group number of the TCP solution.
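The grouping in steps 3) and 4) can be sketched in Python. This is only an illustration under one stated assumption: the sampling is read as "take one sample, skip the next nine" (a step of 10) within samples 1-70, because that reading reproduces the group sizes given in the text (nine groups of 7 and one group of 9). The function name `cluster_groups` and its parameters are hypothetical.

```python
def cluster_groups(n_samples=72, n_groups=10, step=10, last_sampled=70):
    """Split sample numbers 1..n_samples into groups by systematic sampling.

    Assumption: a step of 10 (take one sample, skip nine) within samples
    1..last_sampled, chosen because it yields 7 samples per group.
    """
    groups = []
    for start in range(1, n_groups):  # groups 1..9
        groups.append(list(range(start, last_sampled + 1, step)))
    assigned = {s for g in groups for s in g}
    # The last group collects every sample that was not assigned above.
    groups.append([s for s in range(1, n_samples + 1) if s not in assigned])
    return groups


groups = cluster_groups()

# Step 4): each TCP solution is paired with the buffer whose number
# equals the number of the group the TCP solution belongs to.
pairs = [(sample, group_no)
         for group_no, group in enumerate(groups, start=1)
         for sample in group]

print([len(g) for g in groups])  # sizes of the ten groups
print(groups[0])                 # sample numbers in the 1st group
print(groups[-1])                # sample numbers in the 10th group
```

Because the TCP solutions are numbered in descending order of concentration, each group produced this way spans nearly the whole TCP range, so every Tris level is combined with low, medium, and high TCP concentrations.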
The concentrations of TCP and Tris were found to be uniformly distributed across the 72 mixture samples. The concentrations ranged from 0.338 to 9.805 mg·mL−1 for TCP and from 6.272 to 20.561 mg·mL−1 for Tris; the mean value and standard deviation were 4.114 and 2.225 mg·mL−1 for TCP and 13.438 and 4.505 mg·mL−1 for Tris, respectively. These concentrations were used as the reference values for the calibration modeling of the NIR spectroscopic analysis.
All samples were used for spectral measurement. Spectra were collected using an XDS Rapid Content™ Liquid Grating Spectrometer (FOSS, Denmark) equipped with a transmission accessory and a 2 mm cuvette. The scanning range spanned 400-2498 nm with a 2 nm wavelength interval, covering the entire NIR region and a large part of the visible region. The wavebands of 400-1100 nm and 1100-2498 nm were measured with Si and PbS detectors, respectively. Each sample was scanned in triplicate, and the mean of the three measurements was used for modeling. Spectra were recorded at 25 °C ± 1 °C and 46% ± 1% relative humidity.

Calibration and Prediction Process
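A calibration and prediction process was performed to optimize the models. All samples were divided into a calibration set (40 samples) and a prediction set (32 samples). To ensure that the modeling was representative and complete, the calibration and prediction sets must both cover the concentration ranges of the two indicators, and the distributions must be uniform. The root-mean-square error (SEP) and correlation coefficient (R_P) for prediction were calculated as follows:

$$\mathrm{SEP} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\tilde{C}_i - C_i\right)^2} \tag{1}$$

$$R_P = \frac{\sum_{i=1}^{n}\left(C_i - C_{\mathrm{Ave}}\right)\left(\tilde{C}_i - \tilde{C}_{\mathrm{Ave}}\right)}{\sqrt{\sum_{i=1}^{n}\left(C_i - C_{\mathrm{Ave}}\right)^2 \sum_{i=1}^{n}\left(\tilde{C}_i - \tilde{C}_{\mathrm{Ave}}\right)^2}} \tag{2}$$

where $n$ is the number of prediction samples; $C_i$ and $\tilde{C}_i$ are the actual and predicted values of the $i$-th sample, $i = 1, 2, \ldots, n$; and $C_{\mathrm{Ave}}$ and $\tilde{C}_{\mathrm{Ave}}$ are the mean actual and predicted values of all $n$ samples, respectively. SEP and R_P were used to evaluate the prediction accuracy of a PLS model. A smaller SEP value indicates higher prediction accuracy, and a larger R_P value indicates higher prediction correlation. The model parameters were selected to achieve the minimum SEP.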
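The two evaluation metrics can be computed with a few lines of Python. The sketch below assumes that SEP is the root-mean-square error with denominator n, as the text describes it, and that R_P is the Pearson correlation between actual and predicted values. The function names and the four sample values are made up for illustration and are not data from this study.

```python
import math


def sep(actual, predicted):
    """Root-mean-square error of prediction (denominator n is an assumption)."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)


def r_p(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    n = len(actual)
    a_ave = sum(actual) / n
    p_ave = sum(predicted) / n
    cov = sum((a - a_ave) * (p - p_ave) for a, p in zip(actual, predicted))
    var_a = sum((a - a_ave) ** 2 for a in actual)
    var_p = sum((p - p_ave) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Made-up illustrative concentrations in mg/mL (not measured data).
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print(round(sep(actual, predicted), 4))
print(round(r_p(actual, predicted), 4))
```

A model with a smaller `sep` value and an `r_p` value closer to 1 would be preferred, which matches the selection rule stated above.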

EC-PLS Method
The EC-PLS method uses a moving-window mode to select an appropriate combination of equidistant wavelengths for the PLS model. The search parameters are: 1) initial wavelength (I), 2) number of wavelengths (N), 3) number of wavelength gaps (G), and 4) number of PLS factors (F). The search range covered the entire scanning region, but it can be narrowed according to the actual conditions. Let N* denote the total number of wavelengths in the search range. N can then be set within 1 ≤ N ≤ N*, or within a smaller range according to the actual algorithm running time, and the range of G can be limited in the same way. For arbitrary I, N, and G, the selected wavelengths must remain within the search range; with the 2 nm sampling interval and the 2498 nm upper limit used here, this requires

$$I + 2(N-1)G \le 2498 \tag{3}$$

When G = 1, the combination of parameters (I, N, G) corresponds to a continuous waveband, that is, to the MW-PLS method. EC-PLS is therefore a generalization of MW-PLS in terms of the algorithm.
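To make the roles of I, N, and G concrete, the following Python sketch generates the equidistant wavelength combination for a given parameter triple. It assumes that G counts gaps in units of the 2 nm sampling interval, which is consistent with the wavelength lists reported in this paper for the simplest equivalent models. The function names `ec_wavelengths` and `is_within_range` are hypothetical.

```python
def ec_wavelengths(initial_nm, n_wavelengths, gap, step_nm=2):
    """Equidistant wavelengths: I, I + G*step, ..., I + (N-1)*G*step."""
    return [initial_nm + k * gap * step_nm for k in range(n_wavelengths)]


def is_within_range(initial_nm, n_wavelengths, gap, last_nm=2498, step_nm=2):
    """True if the last selected wavelength stays inside the scanning region."""
    return initial_nm + (n_wavelengths - 1) * gap * step_nm <= last_nm


# Full scanning grid: 400-2498 nm at a 2 nm interval.
grid = list(range(400, 2498 + 1, 2))
print(len(grid))  # 1050 wavelengths, the N of the full PLS models

# Simplest equivalent models reported in this paper: (I, N, G).
tcp = ec_wavelengths(1520, 28, 5)
tris = ec_wavelengths(1084, 13, 6)
print(tcp[0], tcp[1], tcp[-1])     # 1520 1530 1790
print(tris[0], tris[1], tris[-1])  # 1084 1096 1228

# G = 1 gives a continuous waveband, i.e. the MW-PLS special case.
print(ec_wavelengths(1520, 4, 1))  # [1520, 1522, 1524, 1526]
```

Each (I, N, G) triple therefore describes N wavelengths with only three numbers, which keeps the search space far smaller than an unrestricted choice of N wavelengths from the grid.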
In this study, the initial wavelength was searched over $I \in \{400, 402, \ldots, 2498\}$ nm, and N, G, and F were each searched over a finite set of positive integers. A PLS model was established for each combination (I, N, G, F). The global optimal PLS model was selected according to the following expression:

$$\mathrm{SEP}^{*} = \min_{I,\,N,\,G,\,F} \mathrm{SEP}(I, N, G, F) \tag{4}$$

The number of PLS factors (F) is an important parameter that corresponds to the number of integrated spectral variables. The selection of a reasonable F is necessary but difficult [9] [11] [12] [14]-[16]. For any fixed combination of parameters $(I, N, G) = (I_0, N_0, G_0)$, the optimal F was determined according to the following expression:

$$\mathrm{SEP}(I_0, N_0, G_0) = \min_{F} \mathrm{SEP}(I_0, N_0, G_0, F) \tag{5}$$

On the other hand, because of cost and material properties, instrument design typically involves certain limitations on the position and number of wavelengths. In some instances, the global optimal waveband does not meet the demands of the actual conditions. Therefore, local optimal wavebands that correspond to different positions and numbers of wavelengths are significant. For any fixed $I = I_0$, the local optimal model was selected according to

$$\mathrm{SEP}(I_0) = \min_{N,\,G,\,F} \mathrm{SEP}(I_0, N, G, F) \tag{6}$$

For any fixed $N = N_0$, the local optimal model was selected according to

$$\mathrm{SEP}(N_0) = \min_{I,\,G,\,F} \mathrm{SEP}(I, N_0, G, F) \tag{7}$$

And for any fixed $G = G_0$, the local optimal model was selected according to

$$\mathrm{SEP}(G_0) = \min_{I,\,N,\,F} \mathrm{SEP}(I, N, G_0, F) \tag{8}$$
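Once SEP has been computed for every searched combination, the global and local optima reduce to minimizations over a table. The Python sketch below illustrates this with a small dictionary mapping (I, N, G, F) to SEP. Only the two entries marked "reported" use values given in this paper for TCP; the other entries are made up for illustration, and the function names are hypothetical.

```python
def global_optimum(sep_table):
    """Return ((I, N, G, F), SEP) with the smallest SEP over the whole table."""
    return min(sep_table.items(), key=lambda item: item[1])


def optimal_factor(sep_table, i0, n0, g0):
    """For fixed (I0, N0, G0), return the entry whose F gives the smallest SEP."""
    candidates = {p: s for p, s in sep_table.items() if p[:3] == (i0, n0, g0)}
    return min(candidates.items(), key=lambda item: item[1])


def local_optima(sep_table, position):
    """Best model for each value of one parameter (0: I, 1: N, 2: G)."""
    best = {}
    for params, value in sep_table.items():
        fixed = params[position]
        if fixed not in best or value < best[fixed][1]:
            best[fixed] = (params, value)
    return best


# (I, N, G, F) -> SEP in mg/mL for TCP.
sep_table = {
    (1508, 62, 3, 7): 0.041,  # reported: optimal EC-PLS model
    (1520, 28, 5, 5): 0.043,  # reported: simplest equivalent model
    (1520, 28, 5, 4): 0.050,  # made-up value for illustration
    (1508, 62, 3, 5): 0.047,  # made-up value for illustration
    (1600, 28, 3, 6): 0.060,  # made-up value for illustration
}

print(global_optimum(sep_table))
print(optimal_factor(sep_table, 1520, 28, 5))
print(local_optima(sep_table, 0))  # local optimum for each initial wavelength
```

In a real search the table would hold one entry per valid (I, N, G, F) combination, so the same three functions would return the global optimum and the local optimal models for each I, N, or G.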

Equivalence Model Set
As mentioned, the global optimal wavelength combination can be selected according to the minimum SEP. However, models whose prediction accuracy differs only insignificantly are statistically equivalent, because the samples are random and limited in number. Therefore, SEP values slightly above the optimal value can also be accepted. The equivalence model set that corresponds to a certain percentage (α) was expressed as follows:

$$\Omega_{\alpha} = \left\{ (I, N, G, F) \;\middle|\; \mathrm{SEP}(I, N, G, F) \le (1 + \alpha)\,\mathrm{SEP}^{*} \right\} \tag{9}$$

The α value was set through simulation experiments based on the actual data. From the obtained equivalence model set $\Omega_{\alpha}$, the simplest equivalent model, that is, the one with the smallest number of wavelengths, was then selected. The computing platform in this study was developed with Matlab 7.6.
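The selection of the equivalence model set and of the simplest equivalent model can be sketched in Python as a filter followed by a minimization over N. The sketch uses α = 0.08, the value adopted later in this paper. Only the two entries marked "reported" are TCP values from this paper; the others are made up. Breaking ties in N by the smaller SEP is an assumption of this sketch, not a rule stated in the text.

```python
def equivalence_set(sep_table, alpha):
    """Models whose SEP is within (1 + alpha) times the global optimal SEP."""
    sep_star = min(sep_table.values())
    limit = (1 + alpha) * sep_star
    return {p: s for p, s in sep_table.items() if s <= limit}


def simplest_model(eq_set):
    """Model with the fewest wavelengths N (ties broken by smaller SEP)."""
    return min(eq_set.items(), key=lambda item: (item[0][1], item[1]))


# (I, N, G, F) -> SEP in mg/mL for TCP.
sep_table = {
    (1508, 62, 3, 7): 0.041,  # reported: optimal EC-PLS model
    (1520, 28, 5, 5): 0.043,  # reported: simplest equivalent model
    (1508, 62, 3, 5): 0.047,  # made-up value for illustration
    (1600, 28, 3, 6): 0.060,  # made-up value for illustration
}

eq = equivalence_set(sep_table, alpha=0.08)
print(sorted(eq))          # parameter tuples inside the equivalence set
print(simplest_model(eq))  # the member with the smallest N
```

The model at (1600, 28, 3, 6) also has N = 28 but is excluded because its SEP exceeds the threshold, which shows why the filter must be applied before minimizing N.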

Full PLS Models
The NIR spectra of the 72 TCP-Tris-HCl mixture samples in the entire scanning region (400-2498 nm) are illustrated in Figure 1. As shown in Figure 1, the peaks located around 1400 nm and 1900 nm were caused by the absorption of water molecules [14] [15]. The other peak, which appeared at 2350 to 2500 nm, could be caused by saturated absorption of the samples, which could lead to strong spectral noise interference.
Based on the entire scanning region, the full PLS models for the analyses of TCP and Tris were first established. The model parameters and prediction effects (SEP, R_P) are summarized in Table 1. The R_P values were both above 0.9723. However, the number of adopted wavelengths (N) was 1050, which led to high model complexity.

Table 1. Parameters and prediction effects of the full PLS models, the optimal EC-PLS models and the simpler equivalent models for TCP and Tris.

Optimal EC-PLS Models
EC-PLS was performed to improve the prediction effect and reduce model complexity. The optimal parameters I, N, G, and F obtained were 1508 nm, 62, 3, and 7 for TCP and 1106 nm, 27, 4, and 6 for Tris, respectively. The parameters and prediction effects are also summarized in Table 1. The results showed that the optimal EC-PLS models were clearly better than the full PLS models and that the numbers of adopted wavelengths (N) were significantly reduced.
The SEP values of the local optimal models for each I, N, and G are shown in Figure 2. The local optimal models whose prediction effects are close to those of the global optimal models remain good choices, and they could address restrictions on the position and number of wavelengths in instrument design.

Equivalence Model Sets for the Two Indicators
The method described above was used to select the equivalence model sets for the two indicators. The global optimal SEP values (SEP*) were 0.041 and 0.206 mg·mL−1 for TCP and Tris, respectively. The value of α was set to 0.08 for both indicators through several simulation experiments, and the corresponding equivalence model sets were expressed as follows:

$$\Omega_{0.08}^{\mathrm{TCP}} = \left\{ (I, N, G, F) \;\middle|\; \mathrm{SEP}(I, N, G, F) \le 0.044 \right\} \tag{10}$$

$$\Omega_{0.08}^{\mathrm{Tris}} = \left\{ (I, N, G, F) \;\middle|\; \mathrm{SEP}(I, N, G, F) \le 0.222 \right\} \tag{11}$$

The equivalence model set (10) included 21 models for TCP, with wavelength combinations ranging from 1470 to 1890 nm, while the equivalence model set (11) included 27 models for Tris, with wavelength combinations ranging from 988 to 1338 nm. The positions of the wavelength combinations are shown in Figure 3, ordered from left to right by initial wavelength. The simplest equivalent models were then selected. The parameters I, N, G, and F were 1520 nm, 28, 5, and 5 for TCP and 1084 nm, 13, 6, and 4 for Tris, respectively. The corresponding wavelength combinations were 1520, 1530, 1540, 1550, 1560, 1570, 1580, 1590, 1600, 1610, 1620, 1630, 1640, 1650, 1660, 1670, 1680, 1690, 1700, 1710, 1720, 1730, 1740, 1750, 1760, 1770, 1780, 1790 (nm) for TCP and 1084, 1096, 1108, 1120, 1132, 1144, 1156, 1168, 1180, 1192, 1204, 1216, 1228 (nm) for Tris, respectively. The parameters and prediction effects are also summarized in Table 1. Compared with the optimal EC-PLS models (N = 62 for TCP and N = 27 for Tris), the simplest equivalent models used fewer wavelengths (N = 28 for TCP and N = 13 for Tris). Thus, the redundant wavelengths were removed and the models were further simplified. In addition, the wavelength selections for TCP and Tris avoided the noisy wavebands with strong absorption by water molecules and saturated absorption of the samples.
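As a check on the numbers above, the equivalence thresholds follow directly from α = 0.08 and the global optimal SEP values. The worked arithmetic below assumes that the thresholds are rounded to three decimals, and it uses the SEP values of the simplest equivalent models stated in the abstract (0.043 and 0.222 mg·mL−1):

$$(1+\alpha)\,\mathrm{SEP}^{*}_{\mathrm{TCP}} = 1.08 \times 0.041 = 0.04428 \approx 0.044, \qquad 0.043 \le 0.04428$$

$$(1+\alpha)\,\mathrm{SEP}^{*}_{\mathrm{Tris}} = 1.08 \times 0.206 = 0.22248 \approx 0.222, \qquad 0.222 \le 0.22248$$

Both simplest equivalent models therefore lie inside their equivalence sets, while the number of wavelengths drops from 62 to 28 for TCP (about 55% fewer) and from 27 to 13 for Tris (about 52% fewer).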

Conclusions
A rapid measurement method for TCP is of great significance and practical value in drug production monitoring and pharmacokinetic measurement. In this study, a simultaneous quantitative analysis method for TCP and Tris in the TCP-Tris-HCl mixture was established with reagent-free NIR spectroscopy.
The EC-PLS method and the equivalence model sets were proposed to select appropriate wavelength combinations, and the simplest equivalent models were then obtained for the NIR analysis of TCP and Tris. Compared with the optimal EC-PLS models, the simplest equivalent models used fewer wavelengths, so redundant wavelengths were removed and the models were further simplified. The results showed that the predicted values were highly correlated and in good agreement with the actual values. NIR spectroscopy combined with the proposed wavelength selection method was successfully applied to the reagent-free, accurate, and simultaneous quantitative analysis of TCP and Tris. The wavelength selection could provide a valuable reference for further application to TCP fermentation broth. We believe that the methodological framework is generally applicable and can be extended to other fields of spectroscopic analysis.

Figure 1. Spectra of 72 samples of TCP-Tris-HCl mixture in the entire scanning region.

Figure 2. SEP of the local optimal models for each single parameter: (a) initial wavelength; (b) number of wavelengths; and (c) number of wavelength gaps.

Figure 3. Positions of the wavelength combinations of each equivalence model set for: (a) TCP and (b) Tris.

Figure 4. Relationship between the predicted and actual concentrations of (a) TCP and (b) Tris.

Taking the simplest equivalent models as examples, the relationships between the predicted and actual values are illustrated in Figure 4. The figures show that the predicted and actual values had very high correlations and very low errors for the two indicators. The results also indicate the feasibility of highly accurate and simultaneous quantitative analysis of TCP and Tris with reagent-free NIR spectroscopy.