Raman Spectroscopy for Forensic Identification of Body Fluid Traces: Method Validation for Potential False Negatives Caused by Blood-Affecting Diseases

Two critical issues in forensic science are identifying body fluid traces found at crime scenes and preserving them for DNA analysis. However, the majority of current biochemical tests for body fluid identification, which are applicable at the crime scene, are presumptive and destructive to the sample. Raman spectroscopy provides a suitable alternative to current methods as a nondestructive, confirmatory, and potentially in-field method. Our laboratory has developed a chemometric model for the identification of five main body fluids using Raman spectroscopy. This model was developed using samples obtained from healthy donors. Thus, it is of utmost importance for the forensic application of the method to validate its performance for donors with diseases that might affect the biochemical composition of body fluids. In this study, the developed method was validated using peripheral blood samples acquired from donors with Celiac Disease, Sickle Cell Anemia, and Type 2 Diabetes. The method correctly identified all samples as peripheral blood, indicating that no false negatives occurred as a result of the blood traces originating from donors suffering from these diseases.

thus, the approximate time of the crime [4]. Considering the wealth of information that can be extracted from a single body fluid stain, it is important that suitable body fluid identification tests are accessible to analysts. Ideally, such a test could be universally applied to all body fluids, could be used in the field, and would not destroy the often small amount of sample found at the crime scene. Further, such a test should be confirmatory, meaning it does not yield false negatives (a negative result for a sample that is actually positive) or false positives (a positive result for a sample that is actually negative).
Methods that forensic analysts currently use have many of these characteristics and offer a robust analysis of body fluids; however, they often fall short of the ideal. Presumptive chemical tests that can be used in the field, such as the Kastle-Meyer test, are quite sensitive. However, they are not highly specific and can yield false positive results for a variety of common household substances and forensically relevant materials like denim, as well as false negatives [5] [6]. Thus, confirmatory analysis is required in the lab to ensure that blood is present. Confirmatory tests for blood, like the Takayama and Teichmann tests, are effective in their selective identification of blood but are destructive in nature [7]. More often, forensic analysts may go directly to DNA analysis to confirm the presence of human blood. While this method is direct and confirmatory, it must be conducted in a lab setting rather than in the field and can be time-consuming [3].
Raman spectroscopy can potentially serve as an alternative to these methods and remedy these shortcomings. It has a number of uses in forensic science [8].
Pairing Raman spectroscopy with chemometric analysis can allow for the confirmatory identification of body fluids, even in the presence of some contaminants [9] [10]. There are further advantages. Raman spectroscopy is non-destructive; thus, it does not consume the trace amount of sample being analyzed. Further, this method can be universally applied to several body fluids, as opposed to using a specific test to search for each type of body fluid [9] [11]. This universal approach can save both money and time. Finally, recent advancements have allowed for the creation of portable Raman spectrometers, meaning that there is a possibility for use in the field.
Having acknowledged the many advantages provided by the use of Raman spectroscopy for the analysis of body fluid stains, our laboratory has developed a novel method that can identify the five main body fluids (peripheral blood, semen, saliva, sweat, and vaginal fluid) from their Raman spectra using a machine learning model [9]. This method is outlined in Muro et al., 2016. While this method has already been shown to have remarkable promise for the more efficient analysis of body fluid stains, it still needs thorough validation before practical application by law enforcement agencies. Notably, the model used in this method was constructed using body fluid spectra from samples that came from healthy donors. A host of diseases can affect the biochemical composition of body fluids, which in turn could affect their Raman spectra and the ability of the model to identify the body fluid correctly. Thus, to ensure the forensic relevance of this method, potential false negatives arising from diseases that could affect the biochemical composition of a body fluid must be evaluated.
In this paper, the effect of blood-affecting diseases on the ability of the model to identify peripheral blood was evaluated. Each of the chosen diseases has already been shown to alter the Raman spectrum of blood and therefore poses a potential risk for false negatives. While this dependence is useful for disease diagnostics, it can complicate forensic analysis. The first blood-affecting disease, Celiac Disease (CD), is an autoimmune disorder in which an individual's immune system damages the small intestine unless gluten is removed from the diet. This disorder affects 1 in 100 people worldwide [12]. Ralbovsky & Lednev have shown that blood from donors with Celiac Disease differs from that from healthy donors on a gluten-free diet [12].
The Raman spectra of blood have also been shown to be affected by whether or not an individual has Sickle Cell Anemia (SCA). This genetic disorder is characterized by abnormal hemoglobin, such as Hemoglobin S, which arises from the replacement of glutamic acid with valine in the hemoglobin β amino acid chain [13]. As a result of the abnormal hemoglobin, red blood cells take on a sickle shape. This can cause a variety of complications such as high blood pressure or even heart failure [14]. The changes seen in the Raman spectra of Sickle Cell Anemia blood can be explained by differences in hemoglobin. Because hemoglobin dominates the Raman spectrum of blood, differences in hemoglobin can affect the overall spectrum [13] [14].
Type 2 Diabetes (D2) is a condition characterized by abnormally high glucose levels in the blood due to scarce or ineffective insulin, which can lead to severe complications such as organ failure [15]. It is the most prevalent form of diabetes, with 8.6% of American adults having Type 2 Diabetes compared to 0.55% of the same population having Type 1 Diabetes [16]. The Raman spectra of blood from donors with diabetes have also been shown to differ from those of healthy controls. This could be due to several biomarkers of the disease, such as Hemoglobin A1c [15].
In this study, the validation of the Raman spectroscopic method for the iden-

Samples
Whole blood samples from 10 donors with CD, 3 donors with SCA, 4 donors with D2, and 1 donor with both SCA and D2 were purchased from BioIVT, Inc. (Westbury, NY). The donors were chosen such that there was variety in age, sex, and race, with the exception of the SCA donors, who were all African American. The samples were stored in a −80 °C freezer until use. Upon use, the samples were thawed, and 5 μL of the whole blood was deposited onto an aluminum-covered glass slide. The slide was stored in a petri dish to prevent contamination and allowed to dry overnight.

Instrumentation
The Raman spectra of the blood samples were collected using an inVia Raman spectrometer (Renishaw, Inc., Hoffman Estates, IL) with WiRE 3.2 software and a Leica research-grade microscope. Prior to sample collection, the spectrum of a silicon standard was collected to ensure the instrument was properly calibrated. A 20× objective was used to focus the laser beam onto the samples. A 785 nm excitation wavelength was used at a laser power of 5%. For each spectrum, 20 ten-second accumulations were collected over the 300-1800 cm⁻¹ range. To account for intrasample variability, 10-20 spectra were collected from different points within the same bloodstain using automatic mapping. These parameters matched those used to record the Raman spectra of healthy blood used to construct the model developed in Muro et al., 2016.

Statistical Analysis
Cosmic rays were first removed from the Raman spectra of the blood samples in the WiRE 3.2 software. Afterward, the spectra were preprocessed in MATLAB (MathWorks, Inc., Natick, MA) using the PLS Toolbox extension (Eigenvector Research, Wenatchee, WA). In accordance with the preprocessing used in Muro et al., 2016, the spectra were first baseline corrected using automatic weighted least squares with a 5th-order polynomial. Then the spectra were normalized by total area. After preprocessing, each group of spectra was analyzed using the Support
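As a rough illustration of the two preprocessing steps named above (polynomial baseline correction followed by normalization to total area), the following Python sketch may be helpful. It is not the PLS Toolbox implementation used in this study. The function names, the simple iteratively reweighted polynomial fit, the weight of 0.001 assigned to points lying above the baseline, and the use of the sum of absolute intensities as the "total area" are all assumptions made for illustration only.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def estimate_baseline(wavenumber, intensity, order=5, n_iter=50, peak_weight=1e-3):
    """Asymmetric, iteratively reweighted polynomial baseline (illustrative only)."""
    wavenumber = np.asarray(wavenumber, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    # Rescale the axis to [-1, 1] so the polynomial fit stays well conditioned.
    span = wavenumber.max() - wavenumber.min()
    x = 2.0 * (wavenumber - wavenumber.min()) / span - 1.0
    weights = np.ones_like(intensity)
    baseline = np.zeros_like(intensity)
    for _ in range(n_iter):
        coeffs = P.polyfit(x, intensity, order, w=weights)
        new_baseline = P.polyval(x, coeffs)
        converged = np.allclose(new_baseline, baseline, rtol=0.0, atol=1e-9)
        baseline = new_baseline
        if converged:
            break
        # Points above the current baseline are treated as peaks and down-weighted.
        weights = np.where(intensity > baseline, peak_weight, 1.0)
    return baseline


def preprocess(wavenumber, intensity, order=5):
    """Baseline-correct a spectrum, then normalize it to unit total area."""
    intensity = np.asarray(intensity, dtype=float)
    corrected = intensity - estimate_baseline(wavenumber, intensity, order=order)
    # Assumption: "total area" is taken as the sum of absolute intensities.
    total_area = np.sum(np.abs(corrected))
    return corrected / total_area


# Synthetic example: a sloping background plus one Gaussian band at 1000 cm^-1.
wn = np.linspace(300, 1800, 500)
raw = 100 + 0.05 * (wn - 300) + 50 * np.exp(-((wn - 1000) / 15) ** 2)
processed = preprocess(wn, raw)
```

Down-weighting points above the fitted curve keeps Raman bands from pulling the baseline upward, which is the general idea behind weighted least squares baselining; the exact weighting scheme in the toolbox may differ.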

Results & Discussion
This study sought to further validate a previously developed body fluid identification model based on Raman spectroscopy by investigating the potential for false negatives caused by blood-affecting diseases. In this study, 213 Raman spectra were collected from 18 peripheral bloodstains obtained from donors with various blood-affecting diseases. Peripheral blood from donors with CD, SCA, and D2 was evaluated. The average preprocessed Raman spectra for each disease are displayed in Figure 1, alongside the average preprocessed Raman spectra collected from healthy controls that were used to test this study's protocols. The spectra exhibit remarkable similarity upon visual examination. Still, given the subjective and non-computational nature of visually evaluating spectra, analysis using the statistical model is necessary to ensure that the blood-affecting diseases will not contribute to false negatives.
Prior to analysis using the SVMDA model, the Raman spectra were preprocessed in the same manner as the spectra used to build the model. Then, each disease group was evaluated by the model. Of the 100 CD spectra collected, 5 were excluded from the data set based on visual inspection; these are shown in Figure 2. The excluded spectra exhibited excessive noise or an excessive number of cosmic rays. The obstruction by the noise and cosmic rays was evident from visual examination and justified the removal of these spectra from further consideration, leaving 95 spectra to be preprocessed and analyzed by the model.
Of the 95 CD spectra analyzed, all 95 were correctly identified as peripheral blood.
For the other two diseases evaluated, SCA and D2, no spectra were excluded from preprocessing and analysis with the model. Thus, 52 SCA spectra and 75 D2 spectra were analyzed. It should be noted that one sample, for which 14 spectra were collected, is double counted for these two disease groups, as the donor had both diseases. All of the SCA and D2 spectra were correctly identified by the model as peripheral blood.
The model correctly identified all 208 analyzed spectra, resulting in 100% accuracy. The correct classification of all the spectra means that the model exhibited no false negatives despite analyzing samples from donors with blood-affecting diseases. These findings suggest that the analyzed diseases do not pose a risk for causing false negatives. This could be attributed to the fact that at the excitation wavelength used for analysis, 785 nm, the Raman spectrum of blood is dominated by hemoglobin. Some of the diseases do influence hemoglobin, such as SCA, which is caused by the atypical hemoglobin S molecule. However, the results suggest that the composition of these hemoglobin molecules is largely the same as that of healthy hemoglobin. While the difference between the two may be enough to result in functional differences, and in differentiation when a model is trained to do so, their Raman spectra are similar enough for blood from donors with the analyzed diseases to still be correctly identified.
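The spectrum counts reported above can be reconciled with a short tally. The sketch below uses only the numbers stated in this section; the variable names are illustrative.

```python
# Spectra collected per disease group, as reported in the text.
collected = {"CD": 100, "SCA": 52, "D2": 75}
# Spectra from the one donor with both SCA and D2, counted in both groups.
shared_spectra = 14
# Spectra excluded after visual inspection (noise or cosmic rays).
excluded = {"CD": 5}

total_collected = sum(collected.values()) - shared_spectra
total_analyzed = total_collected - sum(excluded.values())

# Every analyzed spectrum was classified as peripheral blood.
correctly_identified = total_analyzed
accuracy = correctly_identified / total_analyzed
false_negatives = total_analyzed - correctly_identified

print(total_collected, total_analyzed, accuracy, false_negatives)
# prints: 213 208 1.0 0
```

Subtracting the 14 shared spectra once gives the 213 unique spectra collected, and removing the 5 excluded CD spectra gives the 208 spectra on which the 100% accuracy is based.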

Conclusion
Body fluid traces remain a crucial piece of evidence for forensic investigations. While current methods for the identification of body fluids often require further testing and destroy the evidence in question, our laboratory has recently developed a universal, non-destructive, and confirmatory body fluid identification method based on Raman spectroscopy paired with chemometrics. Samples from healthy donors have been used to construct and validate the method. A 2018 study by Fikiet et al. examined the potential for azoospermia to cause false negatives for semen identification [17]. In this study, the potential for false negatives caused by blood-affecting diseases was evaluated. It was investigated whether the method could correctly identify peripheral bloodstains that came from donors who have CD, SCA, and D2. The method correctly classified all the analyzed peripheral blood spectra, thus achieving 100% accuracy. This finding suggests that the presence of these blood-affecting diseases does not result in false negatives for this method.
N. A. Nichols, I. K. Lednev American Journal of Analytical Chemistry