A Novel Next-Generation Sequencing Approach without Donor-Derived Material for Acute Rejection and Infection Monitoring in Solid Organ Transplantation

Background: Donor-derived cell-free DNA (ddcfDNA) has been reported as a universal noninvasive biomarker for rejection monitoring in heart, kidney, liver, and lung transplantation. Current approaches based on next-generation sequencing for quantification of ddcfDNA, although promising, may be restricted by the requirement for donor material, as donor samples may not be available. Methods: We proposed a novel next-generation sequencing approach without donor-derived material and compared the non-donor-derived approach and the donor-derived approach using simulation testing and 69 clinical specimens. We also evaluated the performance for acute rejection and infection monitoring in lung transplantation. Results: The non-donor-derived approach reached similar efficacy to the donor-derived approach, with a significant linear correlation of R² = 0.98. Subsequent validation in clinical specimens demonstrated significant difference between the acute rejection

*Bing Wei, Liuhong Zeng and Di Shao contributed equally to the work, and all should be considered as first authors.

How to cite this paper: Wei, B., Zeng, L.H., Shao, D., Zheng, C.T., Yang, Q., Zhang, J.B., Xiao, D., Deng, Q.H., Lin, Y.P., Huang, D.X., Liu, L.P., Xu, X., Liang, W.H., Ju, C.R., Wang, J., Kristiansen, K., He, J.X. and Ye, M.Z. (2018) A Novel Next-Generation Sequencing Approach without Donor-Derived Material for Acute Rejection and Infection Monitoring in Solid Organ Transplantation. Journal of Cancer Therapy, 9, 623-638. https://doi.org/10.4236/jct.2018.99054

Received: July 23, 2018; Accepted: September 2, 2018; Published: September 5, 2018


Introduction
As the respiratory centre, the lungs require strong abilities for environmental adaptation and immuno-protection against microbial infections. For patients with end-stage lung disease, lung transplantation may constitute the only effective approach and may largely increase life expectancy and substantially improve quality of life [1]. However, despite considerable advances and the wide use of immunosuppressant drugs, acute rejection (AR) remains a highly prevalent major complication of transplantation, especially in the first post-operative year, affecting 50% to 90% of patients [2]. It is also recognized as one of the risk factors for the development of bronchiolitis obliterans syndrome, which ultimately leads to long-term morbidity and mortality after lung transplantation [3]. However, no reliable serum marker is available to monitor AR after lung transplantation [4]. Transbronchial biopsy, the gold standard for diagnosis, is an invasive procedure that may cause side effects and is limited by inter-observer variability in grading [3] [4] [5]. Apart from rejection, lung transplant recipients are also at risk of infections owing to hypoimmunity and susceptibility to immunosuppressants, poor clearance of airway secretions, impaired cough reflex, and impaired blood flow to the lung graft [6]. Differential diagnosis between rejection and infection after lung transplantation has always been difficult for clinicians, as the symptoms are generally too similar to distinguish. Therefore, there is considerable need for simple and noninvasive methods that test for lung allograft rejection and/or infectious pathogens early and accurately.
In 1998, Lo et al. found that cell-free donor-derived DNA (ddcfDNA) tags exist in the plasma samples of transplant recipients and that these tags might be used for monitoring graft rejection [7]. Since then, methods based on donor-specific chromosome Y, HLA markers, and single nucleotide polymorphism (SNP) sites from plasma DNA [8] [9] [10], with the aid of techniques such as digital droplet PCR coamplification at lower denaturation temperature-PCR [11] [12], quantitative PCR, and next-generation sequencing (NGS), have been used for transplantation rejection monitoring of the liver, kidney, heart, and lung [13] [14] [15] [16] [17]. Current approaches for quantification of ddcfDNA that do not obtain massively parallel signatures and do not use donor-derived material, such as digital droplet PCR, may lead to instability and inconclusive results. Approaches based on SNPs by plasma sequencing could avoid this shortcoming and have shown great potential for application in solid organ transplantation. One of these methods is the genome transplant dynamics (GTD) approach [10] [14] [17], which uses a bead-based system to genotype the genomic DNA of pre-transplant donors and recipients in order to distinguish heterologous SNPs, and whole genome sequencing (WGS) of cell-free DNA (cfDNA) from post-transplant recipient plasma to calculate the donor fraction by a weighted formula. Because genotyping is an essential step of the GTD approach, large-scale use of genotyping arrays would significantly increase the cost of rejection detection [18]. The use of this approach may also be restricted by the need for a donor's genomic DNA information for genotyping, as the policy of privacy protection for donors is strictly enforced, and such donor samples may be lacking in the clinic treating the recipient. Therefore, an NGS-based approach not requiring donor-derived material would greatly enhance transplantation monitoring.
Here, we introduce a non-donor-derived cfDNA transplant dynamics (NDTD) approach to monitor AR and infection. It genotypes only genomic DNA from the pre-transplant recipient by targeted capture NGS in a mini-screen SNP array, and calculates the donor fraction from cell-free DNA in post-transplant recipient samples that contain cfDNA, such as plasma and urine, by extra-low depth WGS. The scheme of the workflow used to monitor AR by the NDTD approach is shown in Figure 1. In the current study, ddcfDNA was first used as a biomarker of transplantation by the NGS approach, genotyping without donor-derived materials, to address the differentiation between rejection and infection. A specific cut-off value algorithm was established to distinguish acute rejection from non-rejection. Then, clinical specimens were brought in for validation. However, large cohorts should be examined for further validation and study.

nONE array (BGI, Shenzhen, China) with a target region size of approximately 180 megabases (Mb), including the whole exome (44 Mb), a population-representative tagSNP region (132 Mb), and the major histocompatibility complex region (4.9 Mb) for a specific population, was performed to select the complete set of heterologous SNPs. Then, we selected random subsets of those SNPs, with 100 repetitions each, increasing the ratio from 0.01 to 0.05 in steps of 0.01, from 0.05 to 0.3 in steps of 0.05, and from 0.3 to 0.9 in steps of 0.1, and re-calculated the average donor fraction (Figure 2). The value clearly decreased when the number of heterologous SNPs was below 10,000, and especially when it was below 5000. This indicated that a mini-screen array used for target capture NGS genotyping should contain no fewer than 10,000 heterologous SNPs.

A sliding window of 50 kb across each chromosome was applied. 54,571 (97.36%) SNPs were selected as one SNP per window from the HumanOmniZhongHua-8 Beadchip (Illumina, San Diego, CA). 235 (0.42%) windows with no SNP located on that chip were filled with locations from the HumanCNV370-Duo Beadchip (Illumina, San Diego, CA). The remaining 1243 windows (2.22%) were filled with SNPs with the highest minor allele frequency from the dbSNP [19] database. Finally, 56,049 target SNPs were selected and extended by 100 bp on both sides for oligonucleotide probe design and capture.
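The window-based selection described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the (chromosome, position, MAF) tuple layout, and the rule of keeping the highest-MAF SNP within every source are hypothetical (the text states the highest-MAF rule only for dbSNP).

```python
from typing import Dict, List, Tuple

WINDOW = 50_000  # 50-kb window size stated in the text

# Hypothetical SNP record: (chromosome, 0-based position, minor allele frequency)
Snp = Tuple[str, int, float]


def index_by_window(snps: List[Snp]) -> Dict[Tuple[str, int], Snp]:
    """Keep one SNP per (chromosome, window index): the one with the highest MAF."""
    best: Dict[Tuple[str, int], Snp] = {}
    for snp in snps:
        chrom, pos, maf = snp
        key = (chrom, pos // WINDOW)
        if key not in best or maf > best[key][2]:
            best[key] = snp
    return best


def select_targets(chrom_lengths: Dict[str, int],
                   sources: List[List[Snp]]) -> List[Snp]:
    """Fill each window from the first source (in priority order) that covers it."""
    indexed = [index_by_window(source) for source in sources]
    chosen: List[Snp] = []
    for chrom, length in chrom_lengths.items():
        n_windows = (length + WINDOW - 1) // WINDOW  # ceiling division
        for window in range(n_windows):
            for index in indexed:
                snp = index.get((chrom, window))
                if snp is not None:
                    chosen.append(snp)
                    break  # window filled; lower-priority sources are skipped
    return chosen
```

Passing the sources in priority order (primary chip, secondary chip, then a database) mirrors the three-stage filling described in the text; windows covered by no source are simply left empty.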

Quantification of ddcfDNA
High-quality reads were first aligned to the human reference genome (UCSC hg19) using BWA or TAMPtools (for BGISEQ-100 sequencing data) with default parameters, and PCR duplicates were then removed using SAMtools rmdup or BamDuplicates tools with default parameters. Next, genomic DNA sequencing reads from pre-transplantation recipient samples were genotyped by Caller TM tools with the target-seq germline low stringency parameters. SNP locations that were not genotyped by the above tools were genotyped autonomously from the base-pair information at each chromosomal position reported by SAMtools mpileup (total depth cut-off 6), based on variant allele frequency (25% to 95% as heterozygote, >95% as homozygote). For sequencing reads of plasma samples, only uniquely mapped reads were retained; sequencing information at SNP positions that corresponded to features in genotyping was collected using SAMtools mpileup.
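The fallback genotyping rule based on variant allele frequency can be illustrated with a short Python sketch. The function name and the allele-count inputs are hypothetical; treating a depth of at least 6 as passing the cut-off, treating the 25% and 95% boundaries as inclusive for heterozygotes, and calling sites below 25% as reference homozygous are assumptions not spelled out in the text.

```python
from typing import Optional


def genotype_from_counts(ref_count: int, alt_count: int,
                         min_depth: int = 6) -> Optional[str]:
    """Call a genotype from allele counts at one SNP position.

    Returns "AA" (reference homozygous), "AB" (heterozygous),
    "BB" (variant homozygous), or None when the depth is too low.
    """
    depth = ref_count + alt_count
    if depth < min_depth:
        return None  # assumed: sites below the depth cut-off are not called
    # Integer comparisons avoid floating-point edge cases at the boundaries.
    if alt_count * 100 > 95 * depth:
        return "BB"  # variant allele frequency > 95%
    if alt_count * 100 >= 25 * depth:
        return "AB"  # variant allele frequency between 25% and 95%
    return "AA"      # assumed: below 25% is called reference homozygous
```

In practice the counts would come from parsing SAMtools mpileup output, which is outside the scope of this sketch.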
Without the requirement of genotyping the pre-transplant donor genomic DNA, the predicted population genotype frequencies, namely reference homozygous $P_{db}(AA)$, alternate-allele homozygous $P_{db}(BB)$, and heterozygous $P_{db}(AB)$, were calculated for the East Asian population from the 1000 Genomes Project database [20], assuming Hardy-Weinberg equilibrium. For example, if the reference allele A frequency is 0.6 and the other allele B frequency is 0.4, then $P_{db}(AA)$, $P_{db}(BB)$, and $P_{db}(AB)$ are 0.36, 0.16, and 0.48, respectively. For post-transplant recipient plasma reads at SNP locations (recipient = AA/BB), four conditions of recipient-donor genotype combinations usable for calculating donor signal are considered (assuming no sequencing errors). In conditions 1 and 4, the predicted probability of the donor genotype equals the population probability: $P(AA) = P_{db}(AA)$; $P(AB) = P_{db}(AB)$; $P(BB) = P_{db}(BB)$. In conditions 2 and 3, it is greater than the population probability, because the homozygous genotype shared with the recipient is excluded: $P(AB) = P_{db}(AB)/(P_{db}(AB) + P_{db}(BB))$; $P(BB) = P_{db}(BB)/(P_{db}(AB) + P_{db}(BB))$; $P(AA) = P_{db}(AA)/(P_{db}(AB) + P_{db}(AA))$ (Table 1). To account for sequencing errors in practice, the predicted base call accuracy of each plasma read at the target sites, $Q$, is calculated from the Phred score $Q_s$: $Q = 1 - 10^{-(Q_s - 33)/10}$. Finally, the ddcfDNA fraction could be calculated using the weighted formula that summarized the particular probability of the heterozygous and homozygous alleles differing between donor and recipient per read at SNP locations (recipient = AA/BB) in each plasma sample [Donor Frac-

Table 1. Four conditions of recipient-donor genotype combinations in post-transplant recipient plasma.
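The Hardy-Weinberg example, the renormalised donor genotype probabilities, and the base call accuracy described in this section can be checked numerically with the sketch below. It is illustrative only: the function names are hypothetical, and reading $Q_s$ as the ASCII code of a Phred+33 quality character is an assumption suggested by the "− 33" term. The weighted donor fraction formula itself is not reproduced here.

```python
from typing import Dict


def hwe_genotype_freqs(ref_allele_freq: float) -> Dict[str, float]:
    """Population genotype frequencies under Hardy-Weinberg equilibrium."""
    p = ref_allele_freq
    q = 1.0 - p
    return {"AA": p * p, "AB": 2.0 * p * q, "BB": q * q}


def donor_probs_excluding_recipient(p_db: Dict[str, float],
                                    recipient: str) -> Dict[str, float]:
    """Donor genotype probabilities when the donor cannot share the
    recipient's homozygous genotype ("AA" or "BB"): the remaining
    population frequencies are renormalised to sum to 1."""
    kept = {g: p for g, p in p_db.items() if g != recipient}
    total = sum(kept.values())
    return {g: p / total for g, p in kept.items()}


def base_call_accuracy(quality_char: str) -> float:
    """Q = 1 - 10^(-(Qs - 33) / 10), with Qs taken as the ASCII code
    of the quality character (assumed Phred+33 encoding)."""
    qs = ord(quality_char)
    return 1.0 - 10.0 ** (-(qs - 33) / 10.0)


if __name__ == "__main__":
    p_db = hwe_genotype_freqs(0.6)  # the example allele frequency from the text
    print(p_db)
    print(donor_probs_excluding_recipient(p_db, "AA"))
    print(base_call_accuracy("I"))
```

With the text's example (A = 0.6, B = 0.4), excluding a donor AA genotype rescales the AB and BB frequencies by their sum, 0.48 + 0.16 = 0.64.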

Data Analysis of Pathogenic Agents
High-quality reads of the sequencing data were first aligned to the human reference genome (UCSC hg19) using BWA mem (-k 32 -M -t 10). The remaining reads (usually less than 5%) that could not be mapped to the human genome were then aligned autonomously, again using BWA mem (-k 32 -M -t 10), to a human-related microbial genome database encompassing viruses, bacteria, fungi, and protozoa, collected mainly from the National Center for Biotechnology Information (NCBI) genome database. The normalized abundance of one pathogenic agent, abu, was calculated according to the formula [abu = total reads of one pathogenic agent/(millions of mapped reads of all pathogenic agents in the same kingdom × kilobases of pathogenic agent genomic sequence)]. Then, the species taxonomy and gene information identifiers were annotated from the NCBI database. Finally, an infection event for each recipient was determined from elevated levels of relative abundance, abu, in dynamic monitoring across time points, rather than from pathogen-specific thresholds for discriminating between colonization, infection, and disease.
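The abundance normalization above has the same shape as a reads-per-kilobase-per-million measure, restricted to reads mapped within one kingdom. A minimal Python sketch follows; the function and parameter names are hypothetical, and the input validation is an addition for safety rather than part of the described method.

```python
def normalized_abundance(agent_reads: int,
                         kingdom_mapped_reads: int,
                         genome_length_bp: int) -> float:
    """abu = reads of one agent /
             (millions of mapped reads in the same kingdom
              x kilobases of the agent's genomic sequence)."""
    if kingdom_mapped_reads <= 0 or genome_length_bp <= 0:
        raise ValueError("read total and genome length must be positive")
    millions_of_reads = kingdom_mapped_reads / 1_000_000
    kilobases = genome_length_bp / 1_000
    return agent_reads / (millions_of_reads * kilobases)
```

For instance, 500 reads assigned to an agent with a 250-kb genome, out of 2 million reads mapped to that kingdom, would give abu = 500 / (2 × 250) = 1.0. These numbers are purely illustrative and are not taken from the study.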

Statistical Analysis
Coefficients of determination (R squared) were calculated using Excel (Microsoft). The Kolmogorov-Smirnov test and Welch's t test were performed in R 2.15.1. A P value of < 0.05 was considered statistically significant. ROC analyses were performed using GraphPad Prism 5.
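For readers who wish to reproduce a coefficient of determination outside a spreadsheet, an ordinary least-squares fit can be written in a few lines of Python. This is a generic sketch, not the authors' analysis; the function name is hypothetical and no study data are included.

```python
from typing import Sequence, Tuple


def linear_fit(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least-squares line y = slope * x + intercept, plus R squared."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared
```

Here x would hold the theoretical donor percentages of the mock libraries and y the calculated donor fractions.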

Evaluation Testing of the NDTD Approach
To check the suitability of the mini-screen target capture array, genomic DNA from two healthy volunteers (to simulate pre-transplant recipient and donor, respectively) was extracted and sequenced by targeted capture NGS in the selected mini-screen SNP array, with mean depth = 1.7 gigabases (Gb), representing 110-fold coverage per sample (see online Supplemental Table S2); 14,804 heterologous SNPs (more than the 10,000 required) were then detected by the GTD approach. In parallel, we defined "0% donor" as the negative control and mixed cell-free DNA of the volunteer "donor" into that of the "recipient", with the donor DNA fraction varying from 0.5% to 10%, to simulate post-transplant recipient plasma samples. We sequenced eight simulation samples by extra-low depth WGS: mean depth = 1.59 Gb, 0.5-fold mean coverage per sample (see online Supplemental Table S3), with 8192 reads on average located at heterologous SNP sites. Finally, the donor fraction was calculated, showing a significant linear correlation (R² = 0.99, Figure 3

0.001) approach compared with % donor DNA in theory from the mock sequencing libraries in the evaluation assay. "Total", "hete" and "homo" represent the calculated % donor DNA in testing using all donor SNPs including both heterozygous and homozygous donor SNPs, using just the heterozygous donor SNPs, and using just the homozygous donor SNPs, respectively.
donor DNA information is likely particularly lacking in long-term and severely affected patients. This demonstrated a significant linear correlation between % Donor in the library and % Donor DNA (R² = 0.98, Figure 3(b); y = 0.6967x + 1.7955, R² = 0.9816, P < 0.001). Next, the ddcfDNA levels from 69 plasma samples of lung transplant recipients who had undergone genotyping by targeted capture NGS in the mini-screen SNP array were calculated in the same sample cohort both by the GTD approach, which requires donor genotyping information, and by the NDTD approach, which does not need a donor genomic DNA sample (Figure 4). Overall, % donor cfDNA calculated by the NDTD approach was consistent with that calculated by the previous GTD approach (P = 0.2477, Kolmogorov-Smirnov test), which may indicate that ddcfDNA levels can be quantified by our NDTD approach. Moreover, the slight variation between the two approaches, especially in the group with signs of acute rejection and in plasma samples collected within 14 days, may imply a possible deviation in SNPs between the public allele database and one particular individual, amplified in the signal at higher ddcfDNA concentrations. Levels of ddcfDNA increased during the first 14 days post-transplant, although this period is free of rejection.

Differentiation of Lung Transplant Rejection by the NDTD Approach
We quantified ddcfDNA to monitor acute rejection and detected infectious agents simultaneously by the NDTD approach, and compared the results with clinical examination in a cohort of 69 recipient plasma samples collected from 16 lung transplantation patients. For rejection surveillance, samples collected during the first 14 days post-transplant, a period that is free of rejection but may exhibit elevated levels of ddcfDNA, were excluded. The ddcfDNA levels (n = 62) differed significantly between the AR group (4.83 ± 2.11%, mean ± SD) and the non-rejection group (1.61 ± 0.63%, mean ± SD) (P < 0.0001, Welch's t test). Findings were validated by biopsies (n = 17) and clinical indications (n = 45) (Figure 5). With a cut-off value of 2.999, this method exhibited 90.48% sensitivity (95% CI, 69.62% to 98.83%), 100% specificity (95% CI, 91.59% to 100%), and an AUC of 0.9266 (95% CI, 0.8277 to 1.026) in lung transplantation (see online Supplemental Table S4). However, the difference between the non-rejection group and the chronic rejection group was not significant (P = 0.9340, Welch's t test), implying that additional chronic rejection events should be observed in further studies (Figure 5). These results suggest that ddcfDNA levels in lung allograft recipients increase when rejection events occur, especially during acute rejection.
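How sensitivity and specificity follow from a single cut-off can be sketched as below. The function name is hypothetical, calling a sample positive when its % ddcfDNA is strictly above the cut-off is an assumption, and the values in the usage example are invented for illustration; they are not study data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(values: Sequence[float],
                            is_rejection: Sequence[bool],
                            cutoff: float) -> Tuple[float, float]:
    """Classify each sample as positive when value > cutoff, then compare
    with the reference labels (True = acute rejection)."""
    tp = fn = tn = fp = 0
    for value, rejected in zip(values, is_rejection):
        positive = value > cutoff
        if rejected and positive:
            tp += 1
        elif rejected:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)  # fraction of rejection samples detected
    specificity = tn / (tn + fp)  # fraction of non-rejection samples cleared
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical % ddcfDNA values, for illustration only.
    toy_values = [1.2, 1.9, 2.5, 3.4, 5.1, 6.0]
    toy_labels = [False, False, True, True, True, True]
    print(sensitivity_specificity(toy_values, toy_labels, cutoff=2.999))
```

For reference, 19 of 21 corresponds to 90.48%, which matches the reported sensitivity for the AR group (n = 21), although the text does not state the underlying counts.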
For detection of infectious agents, whole genome sequencing reads were used to evaluate the virus, bacteria, and fungus infection concurrently after removing host reads of human sequence. We found positive infection status that was vali-  S4).

Discussion
The distinction between rejection and infection after solid organ transplantation has always presented a problem for clinical therapy because the clinical symptoms are sometimes similar. There is no reliable marker for AR monitoring, and infection testing in the clinic is limited to detecting a restricted set of pathogen species; detecting rejection and infection using only the same data from blood samples thus presents an exciting prospect. Our results demonstrate that the NDTD approach without donor-derived material has the ability to monitor acute rejection by quan- typing procedure implemented by targeted-capture NGS in a mini-screen SNP array requires less sequencing data and would reduce cost. The whole procedure including genotyping required 3 days, or only 1.5 days if the genotyping step had been done in advance. Our approach showed high consistency with the previous GTD approach in the validation step, which included both simulation tests and detection of events in lung transplantation. Additionally, the slight variation between the two approaches may imply individual differences; thus, a more comprehensive and convincing public allele frequency database such as dbSNP may be required in the future. Subsequently, verification in the lung transplant cohort indicated that the difference in ddcfDNA between the no-rejection and rejection groups was clear, especially in the case of acute rejection.
Notably, the subsequent sequencing data annotation was also indicative of pathogenic agents such as viruses, bacteria, and fungi. Among all the screened infectious agents, this approach showed an advantage in virus testing, especially for CMV infection, which poses the most common threat of infectious complications after lung transplantation. To build a more reasonable model for differentiating rejection from infection, potential modifications include: 1) increasing the sequencing depth of plasma samples to capture more pathogen material, although the current sequencing depth is sufficient for rejection monitoring; 2) building pathogen-specific thresholds to discriminate between colonization, infection, and disease; and 3) expanding sample types such as sputum, bronchoalveolar lavage fluid, nasopharyngeal swabs, and plasma samples.

Conclusion
In conclusion, these findings suggest that the NDTD approach has the ability of

Supplemental
Table S1. Lung transplant recipient demographic characteristics.

Figure 1. Scheme of the workflow used to monitor acute rejection by the NDTD approach.

Figure 2. Simulation tests by randomly decreasing numbers of informative and control SNPs for sample 04_1, 20_1, and 20_2. Donor percentages remained stable when the number of informative SNPs was no less than 10,000 (red dashed line). The x-axis indicates the number of informative SNPs; the y-axis indicates the value of the donor percentage.
(a)) between the calculated donor fraction in the test and the donor percentage in theory, indicating sufficiency to measure organ transplant rejection by the mini-screen target capture array. Next, the donor fraction was re-calculated from the simulation data abandoning donor genotyping information by the NDTD approach as pre-transplant


Figure 4. ddcfDNA levels from 69 plasma samples in lung transplant recipients. The line graph shows cumulative sample numbers (x-axis) against % ddcfDNA sorted in ascending order (y-axis) for the GTD and NDTD approaches. The 69 samples were divided into 4 groups (red dashed line): 1) No rejection group of plasma samples collected > 14 days, including biopsy score equal to A0 (quiescence), no treatment for rejection, and no clinical signs of rejection (n = 37). 2) Chronic rejection group of plasma samples collected > 14 days and biopsy proven (n = 4), which excluded the AR group. 3) AR group of plasma samples collected > 14 days, including biopsy score ≥ A1 (minimal-to-severe rejection), treatment for AR (steroid pulse therapy), and clinical signs of AR (n = 21). 4) Plasma samples collected during the first 14 days (n = 7).


Figure 5. Scatter plot for % ddcfDNA of lung post-transplantation plasma samples. Plasma samples were collected > 14 days (n = 62) post-transplant and divided into 3 groups: 1) No rejection (n = 37). 2) AR group (n = 21). 3) Chronic rejection (n = 4). Generally, the mean % ddcfDNA in the acute rejection group was higher than that in the non-rejection group and that in the chronic rejection group.
tification of ddcfDNA and to detect the infectious agents simultaneously. The approach was composed of two processes: genotyping of recipient pre-transplantation and ddcfDNA detection of recipient post-transplantation, which can be carried out on different sequencing platforms with automated data analysis. The geno-

diagnosis and discrimination between rejection and infection post-transplant in lung transplantation and may be applied to other types of solid organ transplantation (such as heart, kidney, and liver) where ddcfDNA may also exist in the recipient's plasma. It demonstrates a cost-effective and noninvasive sequencing approach without the requirement of donor-derived genotyping, which will better satisfy the needs of clinical situations and show a wider range of clinical application to accelerate the development of precautionary molecular diagnosis in solid organ transplantation.

Table S2. Performance of the target capture sequencing quality.

Table S3. Statistics of the simulation plasma cell-free DNA libraries.

Table S4. Sensitivity and specificity performance by the NDTD approach in lung transplantation in ROC curve analysis.