Assessing Chemical Mixtures and Human Health: Use of Bayesian Belief Net Analysis

Background: Although humans are exposed to complex chemical mixtures, much of the available research continues to focus on a single compound or metabolite, or on a select subgroup of compounds, which is inconsistent with the nature of human exposure. Uncertainty about how best to model chemical mixtures, coupled with the scarcity of analytic approaches, remains a formidable challenge and served as the impetus for this study. Objectives: To identify the polychlorinated biphenyl (PCB) congener(s) within a chemical mixture most strongly associated with an endometriosis diagnosis using novel graphical modeling techniques. Methods: Bayesian Belief Network (BBN) models were developed and empirically assessed in a cohort comprising 84 women aged 18-40 years who underwent a laparoscopy or laparotomy between 1999 and 2000; 79 (94%) women had serum concentrations for 68 PCB congeners quantified. Adjusted odds ratios (AOR) for endometriosis were estimated for individual PCB congeners using BBN models. Results: PCB congeners #114 (AOR = 3.01; 95% CI = 2.25, 3.77) and #136 (AOR = 1.79; 95% CI = 1.03, 2.55) were associated with an endometriosis diagnosis. Combinations of mixtures inclusive of PCB #114 were all associated with higher odds of endometriosis, underscoring its potential relation with endometriosis. Conclusions: BBN models identified PCB congener #114 as the most influential congener for the odds of an endometriosis diagnosis in the context of a 68-congener chemical mixture. BBN models offer investigators the opportunity to assess which compounds within a mixture may drive a human health effect.


Introduction
Humans are exposed to diverse environmental chemical mixtures largely as a result of diet, occupation and modern life. However, much of the published literature focuses on a single compound or a select subset of compounds in relation to health outcomes, making it exceedingly difficult to synthesize the cumulative effects of chemical mixtures on human health, or to be sure the selected compound(s) is truly exerting the effect as presumed. As laboratory capabilities become even more sophisticated, allowing for the quantification of existing and emerging compounds at environmentally relevant concentrations, it is imperative to have analytical tools capable of modeling chemical mixtures in relation to health outcomes if we are to have a more complete understanding of the effects or modes of action.
Despite longstanding recognition of the importance of considering chemical mixtures, few analytic options are available to researchers when assessing possible health effects, including those believed to adversely impact human reproduction and development, as recently summarized [1,2]. Past analytic approaches for studying chemicals and endometriosis include: restricting to a particular compound such as dioxin [3], the a priori selection of more prevalent PCB congeners [4] or those with similar structural activity [5,6], the simple summing of individual congeners with or without a weighting as in the case of toxic equivalency quotients (TEQ) for dioxin [7], summing then categorizing by purported biologic activity such as estrogenic or anti-estrogenic congener groupings [8,9], or using differentially weighted linear combinations of congeners to identify subsets that are most associated with the outcome [10].
The purported relation between exposure to hormonally active persistent environmental chemicals and endometriosis is an example of a suggestive albeit equivocal literature that has utilized varying analytic approaches for assessing this relation. For example, plasma dioxin was reported to be associated with endometriosis [3], while no association was observed using either a TEQ [7] or simple sum [4] approach for persistent organochlorine compounds. However, when PCB congeners were categorized by purported biologic activity, elevated odds of laparoscopically confirmed endometriosis were observed for anti-estrogenic PCB congeners [1]. While chemical profiles may vary across populations, it is plausible to assume that each population is exposed to a mixture of chemicals, underscoring the need for investigators to attempt to utilize all available chemical profiling in assessing health effects to ensure our statistical models are indeed biologically plausible and reflective of contemporary geographic exposures.
We present an approach for assessing the effects of individual PCB congeners in the context of a mixture of congeners in relation to endometriosis, including congeners that may have opposing biologic activity, especially for an estrogen-dependent disease such as endometriosis [11]. The goal of our research is to utilize novel methods for identifying which, if any, compound within a chemical mixture is most associated with an adverse health effect such as endometriosis, thereby allowing investigators to evaluate mixtures more closely reflecting human exposures.

Study Cohort and Data Collection
The study cohort comprised 84 (84% participation) women aged 18-40 years undergoing incident laparoscopy or laparotomy at one of two university-affiliated hospitals between 1999 and 2000. Standardized interviews were conducted primarily in the women's homes prior to surgery for the ascertainment of medical and reproductive history and current lifestyle. Upon completion of the interview, 20 cc of blood was obtained from 79 (94%) women using venipuncture equipment free of the contaminants under study. Surgeons were instructed to fully inspect the pelvis of all women regardless of preoperative diagnosis or symptoms, and to complete standardized operative reports regarding all observed gynecologic disorders. Thus, an endometriosis diagnosis was based upon laparoscopic visualization, which is considered the gold standard [12,13]. A more complete description of the study protocol is provided elsewhere [9].

Toxicological Analysis
Gas chromatography with electron capture (GC-EC) was utilized for quantifying concentrations for 68 serum PCB congeners. Briefly, serum specimens were run in batches of 10 plus four quality control samples (i.e., reagent blank, matrix blank, matrix blank containing a mixed standard of 15 specific congeners at known values, and a duplicate participant sample). Concentrations were calculated from standard curves for the 15 calibration standards; the remaining congener concentrations were calculated from response factors generated for all remaining laboratory-measured congeners. Each congener concentration was adjusted for surrogate recovery and subtraction of reagent blanks [14]. The limit of detection was determined as three standard deviations from the mean of at least ten matrix blanks. Serum lipids were determined according to Phillips and colleagues' methods [15].
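The limit-of-detection rule stated above can be sketched in a few lines of Python. This is only an illustration under two assumptions that the text does not confirm: that "three standard deviations from the mean" means the blank mean plus three standard deviations, and that the sample standard deviation is used. The function name and the blank readings are hypothetical.

```python
from statistics import mean, stdev

def limit_of_detection(blank_values, k=3.0):
    """Return mean + k * sample SD of the matrix-blank readings.

    Assumes the limit lies above the blank mean; the source only says
    'three standard deviations from the mean'.
    """
    if len(blank_values) < 10:
        raise ValueError("at least ten matrix blanks are expected")
    return mean(blank_values) + k * stdev(blank_values)

# Hypothetical blank readings (ng/g serum), for illustration only.
blanks = [0.002, 0.001, 0.003, 0.002, 0.002, 0.001, 0.003, 0.002, 0.002, 0.002]
lod = limit_of_detection(blanks)
print(f"illustrative LOD: {lod:.4f} ng/g serum")
```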

Statistical Analysis
In the descriptive phase, we first assessed the frequency distribution for each PCB congener to develop a categorization required for specifying the purported etiological models. There were 68 PCB congeners in the dataset, though six congeners (#6, #30, #47, #49, #50, #204) had only one nonzero observation (quantified in <2% of samples) and were deleted given that our statistical methodology required each congener to have at least three distinct observed percentiles. The remaining 62 congeners were used to build a joint probabilistic model for assessing the odds of an endometriosis diagnosis adjusting for a priori confounders. The goal was to establish a broad model that can answer various questions and possibly generate testable hypotheses about the effect of individual or groups of chemicals on human health outcomes such as endometriosis in the context of a chemical mixture.
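The screening step described above can be illustrated with a short Python sketch. It approximates "at least three distinct observed percentiles" by requiring at least three distinct measured values per congener, which is an assumption; the function name, the congener labels and the toy concentrations are hypothetical and are not the study data.

```python
def usable_congeners(concentrations, min_distinct=3):
    """Keep congeners whose measurements take at least `min_distinct` distinct values."""
    return {
        name: values
        for name, values in concentrations.items()
        if len(set(values)) >= min_distinct
    }

# Hypothetical toy data (ng/g serum), for illustration only.
toy = {
    # A single nonzero observation gives only two distinct values, so it is dropped.
    "PCB_6": [0.0, 0.0, 0.0, 0.004, 0.0],
    "PCB_114": [0.001, 0.013, 0.006, 0.0, 0.009],
}
kept = usable_congeners(toy)
print(sorted(kept))
```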
Specifically, we used a Bayesian Belief Network (BBN) [16-19] to assess the impact of individual or groups of PCB congeners on the odds of an endometriosis diagnosis controlling for confounders. The BBN describes the joint probability distribution of a set of variables (e.g., PCB congeners and endometriosis) via a series of conditional independence statements that are represented in the form of directed acyclic graphs. The graphs are representations of purported causal relations between exposures and an outcome [20]. Our specified BBN utilized a non-parametric formulation that eliminated the normality of data assumption [18], which is important given that many chemicals are not normally distributed. The BBN analysis was implemented using the Uninet 2.74 software [21], developed by the Risk and Environmental Modeling Group at the Delft University of Technology, to estimate the model using empirical (conditional) rank correlations as computed from the data. Thus, the BBN was modeled to identify which, if any, congeners within the mixture were most associated with an elevated odds ratio for endometriosis.
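To make the notion of an empirical rank correlation concrete, the sketch below computes an unconditional Spearman rank correlation in plain Python. It is not a reproduction of how Uninet estimates conditional rank correlations, which the text does not describe; the helper names and the toy data are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical concentrations (ng/g serum) and diagnoses (1 = endometriosis).
conc = [0.001, 0.013, 0.006, 0.000, 0.009, 0.002]
diag = [0, 1, 1, 0, 1, 0]
rho = spearman(conc, diag)
print(f"illustrative rank correlation: {rho:.2f}")
```

Because only ranks enter the calculation, the result is unchanged by any monotone transformation of the concentrations, which is why no normality assumption is needed at this step.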
A schematic diagram of the final BBN model is given in Figure 1. Each variable in the figure is shown as an elliptical node that may be influenced by other nodes via the directed arcs. The BBN in Figure 1 has essentially three types of nodes: confounders (orange), PCB congeners (white) and the outcome or diagnosis of endometriosis (blue). The a priori confounders were: age (continuous), parity (nulliparous versus parous) and cigarette smoking status (yes/no) [9]. In the BBN, confounders may influence PCB congeners and the outcome, age may influence parity, and PCB congeners may influence the outcome. The BBN is used to estimate the adjusted odds ratios (AORs) and corresponding 95% confidence intervals (CIs) for endometriosis when the BBN is conditioned on specific concentration levels of individual PCB congeners or groups of congeners. Each conditional structure can then be interpreted as being representative of the population of women in the cohort with serum PCB concentrations approximately equal to the conditional values. Each arc is associated with weights representing the strength of the relation between the nodes joined by the arc. The weights can be set a priori by the user or can be empirically estimated from the data. Given the limited amount of information available about the effect of each PCB congener on endometriosis, we estimated the weights directly from the data using the default feature in the software. Given a set of weights, at each node one can then compute the conditional distributions of the current node given specific values of all other parent nodes. Once established, these distributions can then be used to evaluate a multitude of relations between PCB congeners, confounders and endometriosis within the specified framework. For simplicity, we estimated AORs and CIs for PCB congeners for a BBN conditioned on a relatively high value of the congener (75th percentile of the observed concentrations) and that for a low value (25th percentile of the observed concentrations).
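The arc structure described in this paragraph can be written down as a parent map and checked for acyclicity. The Python sketch below (standard library `graphlib`, Python 3.9 or later) encodes only the arcs stated in the text: confounders to congeners, confounders to the outcome, age to parity, and congeners to the outcome. The node names are hypothetical labels, only five congeners are shown instead of 62, and no arc weights are estimated.

```python
from graphlib import TopologicalSorter

def build_parents(congeners):
    """Map each node to the set of its parents, following the arcs described in the text."""
    confounders = {"age", "parity", "smoking"}
    parents = {"age": set(), "smoking": set(), "parity": {"age"}}
    for congener in congeners:
        parents[congener] = set(confounders)  # confounders -> congener
    # confounders -> outcome and congeners -> outcome
    parents["endometriosis"] = confounders | set(congeners)
    return parents

parents = build_parents(["PCB_99", "PCB_101", "PCB_114", "PCB_136", "PCB_153"])

# static_order() raises graphlib.CycleError if the graph contains a cycle,
# so obtaining an ordering confirms that the structure is a valid DAG.
order = list(TopologicalSorter(parents).static_order())
print(order)
```

A topological order of this kind is also the order in which conditional distributions would be evaluated, since every node appears after all of its parents.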

Results
Thirty-two (38%) women were diagnosed with endometriosis, while 52 (62%) were not. As previously reported for this operative cohort, women diagnosed with endometriosis were older and more likely to be nulliparous and nonsmokers in comparison to unaffected women [9], underscoring the importance of controlling for these factors.
Use of the BBN identified PCB congener #114 as being highly associated with an elevated odds ratio of endometriosis even after adjusting for relevant covariates. Using the framework in Figure 1, the conditional structure led to a 0.24 probability of an endometriosis diagnosis for a woman with a 0.001 ng/g serum concentration of PCB #114, and a 0.48 probability of diagnosis for women with a higher concentration (0.013 ng/g serum). Specifically, the odds for endometriosis were about 3 times higher (AOR = 3.01; 95% CI = 2.25, 3.77) for women with elevated levels (75th percentile) of PCB #114 in comparison to women with relatively lower concentrations (the reference 25th percentile), representing a two times higher probability of an endometriosis diagnosis among women in the elevated group. Another significant PCB congener identified for endometriosis was PCB #136 (AOR = 1.79; 95% CI = 1.03, 2.55). Other suggestive congeners were PCB #99 (AOR = 1.56; 95% CI = 0.81, 2.31) and PCB #153 (AOR = 1.45; 95% CI = 0.77, 2.13), though CIs for both included one. A key observation is the larger observed effect for PCB #114 when grouped with other PCB congeners (i.e., PCB #99, 101, 136, and 153) observed to be influential given their rank order correlations with endometriosis (Table 1). Specifically, we estimated AORs for PCB #114 in the context of the other influential PCB congeners (i.e., #99, 101, 136, 153) and observed a near doubling of the AOR with the simple addition of PCB #136. The odds were about eleven times higher (AOR = 11.03; 95% CI = 9.88, 12.18) for women who had all five PCB congeners in the 75th percentile in comparison to women with all PCB congeners in the 25th percentile. Of particular note is the lack of overlapping CIs when various PCB congeners are considered in the context of PCB #114. To evaluate the incremental effect of the remaining influential congeners in a mixture with elevated levels of PCB #114, we estimated the AOR between two mixtures, one including all five PCBs (PCB #114, 99, 101, 136 and 153) at higher concentration levels and the other with only PCB #114 at the higher level and the remaining four PCBs at their 25th percentile level. The AOR for PCB #114 and other congeners was higher than that for PCB #114 alone, though the CI included one (AOR = 1.65; 95% CI = 0.97, 2.33). Based on the findings from the BBN analysis, PCB #99, 101, 114, 136, and 153 were identified as the most influential agents in the chemical mixture. The significant influence of the five selected congeners on endometriosis outcome is depicted in the BBN in Figure 2. The five congeners are identified in shades of purple, with larger and darker nodes representing increasing influence.
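The relation between the two conditional probabilities quoted for PCB #114 (0.24 and 0.48) and an odds ratio can be checked with simple arithmetic. The Python sketch below uses only those two rounded probabilities; the function names are hypothetical. The value it gives is close to, but not identical with, the reported AOR of 3.01, and the gap is presumably due to rounding of the published probabilities, which is an assumption rather than something the text states.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)

def odds_ratio(p_high, p_low):
    """Odds ratio comparing a high-exposure probability with a low-exposure one."""
    return odds(p_high) / odds(p_low)

p_low, p_high = 0.24, 0.48  # conditional probabilities quoted in the text

or_114 = odds_ratio(p_high, p_low)   # about 2.92
probability_ratio = p_high / p_low   # 2.0, the "two times higher probability"

print(f"odds ratio from rounded probabilities: {or_114:.2f}")
print(f"probability ratio: {probability_ratio:.1f}")
```

The example also shows why the odds ratio (about 3) and the ratio of probabilities (2) differ: odds grow faster than probabilities as the probability approaches 0.5 and beyond.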

Discussion
While persistent environmental chemicals such as dioxin and PCBs have been associated with endometriosis, the weight of evidence has relied upon statistical analyses that focus either on an individual chemical or on a combination of selected congeners that have been simply summed and analyzed. We are unaware of any previous efforts to analyze measured chemical concentrations in a way that more closely captures individual exposure scenarios (irrespective of limits of detection) in order to explore potential mixtures in the context of other relevant biologic covariates and their effect on reproductive outcomes such as endometriosis. We utilized a BBN framework to explore the individual effects of PCB congeners and the odds of an endometriosis diagnosis so that the joint distribution of PCB concentrations and endometriosis could be assessed in the context of potential confounding among the measured concentrations. To this end, the BBN approach combines a data-driven reduction step, which identifies the particular chemicals driving the effect conditional on all other exposures, with biologically relevant covariates for endometriosis. Unlike other modeling approaches that may fail to reach convergence when sample size is limited relative to the number of chemicals being studied, the BBN approach is robust to this concern.
Table 1 footnote a: Odds ratios adjusted for age (years), parity (nulliparous/parous) and current cigarette smoking (yes/no). In each row, comparisons are restricted to women in the 75th percentile for each congener relative to women in the 25th percentile.
Our BBN identified a range of AORs for individual PCB congeners, from <1 to >3. PCB #114 conferred the largest effect on the odds of an endometriosis diagnosis in this study cohort among the 62 PCB congeners assessed. In fact, high (75th percentile) levels of PCB #114 conferred three times the odds of endometriosis in comparison to low (25th percentile) levels. More interestingly, the effect of PCB #114 was greatly enhanced in mixtures with elevated levels of PCB congeners #99, #101, #136, and #153. In the context of the 75th percentile for PCB congeners #99, #101, #136, #153, the AOR for PCB #114 increased nearly 11-fold in comparison to women with all concentrations in the 25th percentile. Moreover, conditional on mixtures with high levels of PCB #114, the AOR for the other four influential congeners was 1.65. Even though the associated CI included one, the relatively large nominal value of the AOR points toward a possible incremental effect of the PCBs when evaluated in the context of a mixture. Combined, these findings underscore the need to consider chemical mixtures, particularly since there may be geographic differences in the types of mixtures to which study populations are exposed. We believe that a more thorough analysis of all measured compounds (and the presentation of such data) is informative not only for the evaluation of potential health risks, but in delineating how mixtures may vary and whether an individual compound is etiologic irrespective of study population.
There are important limitations underlying our work that need to be considered when interpreting the results. Our intent was to demonstrate the feasibility of the BBN approach for assessing chemical mixtures to identify signals that may inform etiologic or mechanistic research. To this end, the BBN approach can be viewed as an empirically based data reduction approach and not one for determining etiology, per se. This is important given our cohort size, though it is comparable to many published studies on this topic. Other data-reduction methods such as principal component analysis (PCA), canonical correlation analysis, factor analysis and structural equation techniques often require normality assumptions for optimal performance. The utility of the BBN approach is its lack of assumptions required for parametric models, including the normality assumption among others. Thus, findings from the BBN are more robust compared to those obtained from methods that rely on parametric assumptions.
Previous approaches that sum congeners essentially assign equal weights to all PCBs and, thereby, may not fully account for influential compounds. Gennings et al. [10] developed a novel method in which differential weights are assigned to chemicals in a linear combination according to an optimization procedure, thereby allowing subsets of congeners to be differentially associated with the outcome. Methods such as PCA produce weighted sums of congeners on the basis of observed variability of the congeners in the mixture without accounting for the specific effect of each congener on the outcome under study. Moreover, the weights in the linear combination need not be directly related to individual influences. In the BBN approach, the relative contribution of congeners in the mixture can be evaluated with regard to the analytic model as they are derived from the data, resulting in added flexibility. The BBN approach can serve as a tool for determining the relative influence of chemicals in a mixture when limited information is available to advise the investigator, and in generating testable hypotheses for future research.
Of all the PCB congeners assessed, #114 was most informative for endometriosis in this cohort of women. The extent to which #114 is etiologically associated with endometriosis awaits corroboration, but hopefully illustrates the utility of the BBN approach for identifying possible etiologic signals driving observed health effects for a particular cohort or study sample. Still, the biological interpretation of congeners identified as relevant to human health or disease outcomes through BBN approaches requires toxicologic and other biologic input. As with any modeling approach, our findings are dependent upon valid and reproducible laboratory measurements in the context of biologically plausible specified models. Clearly, a team science approach is needed in building the BBN and in interpreting the results. However, compounds identified as being important for disease outcomes may warrant subsequent experimental research aimed at determining mechanistic pathways to aid in the interpretation of findings, including possible statistical artifacts.

Conclusion
PCB congener #114 was identified by the BBN approach as the most influential compound within a mixture of 68 PCB congeners, conferring a roughly 200% increase in the odds of an endometriosis diagnosis following laparoscopy (AOR = 3.01). Thus, the BBN may be an approach for corroborating results across study populations, or in developing weighting schemes aimed at estimating the magnitude of effects exerted by individual compounds in keeping more closely with the manner in which human exposure arises.

Figure 1. Proposed Bayesian belief network displaying the influence of 62 PCB congeners on endometriosis. The BBN depicts the proposed causal structure for PCB congeners, confounders and endometriosis as designated by the white, orange and blue nodes, respectively.

Figure 2. Bayesian belief network modeled using the Uninet 2.74 software with influential nodes highlighted in purple. An increased influence of a chemical mixture including high versus low levels is indicated by darker and larger nodes, respectively.