Discovery of Key Molecular Pathways of C1 Metabolism and Formaldehyde Detoxification in Maize through a Systematic Bioinformatics Literature Review

Computational systems biology approaches provide insights into complex molecular phenomena in living systems. Such understanding requires systematically interrogating and reviewing the existing literature to refine and distil key molecular pathways. This paper explores a methodological process to identify key molecular pathways from a systematic bioinformatics literature review. This process is used to identify molecular pathways for a molecular process that is ubiquitous in plant biological systems: C1 metabolism and formaldehyde detoxification, specific to maize. C1 metabolism is essential for all organisms to provide one-carbon units for methylation and other types of modifications, as well as for nucleic acid, amino acid, and other biomolecule syntheses. Formaldehyde is a toxic one-carbon molecule which is produced endogenously and found in the environment, and whose detoxification is an important part of C1 metabolism. This systematic review involves a five-part process: 1) framing of the research question; 2) literature collection based on a parallel search strategy; 3) relevant study selection based on search refinement; 4) molecular pathway identification; and 5) integration of key molecular pathway mechanisms to yield a well-defined set of molecular systems associated with a particular biochemical function. Findings from this systematic review produced three main molecular systems: a) methionine biosynthesis; b) the methylation cycle; and c) formaldehyde detoxification. Specific insights from the resulting molecular pathways indicate that normal C1 metabolism involves the transfer of a


Introduction
Computational systems biology approaches provide insights into complex molecular phenomena in living systems. Such understanding requires systematically interrogating and reviewing the existing literature to refine and distil key molecular pathways. Systematic literature review is a powerful way to summarize the available scientific literature on a question [1]. In such a review, a comprehensive search is followed by a careful filtering of results, providing a quality set of scientific studies, which are used as evidence in the review. In this paper, this process is further refined to not just review the literature but to identify molecular pathways.
In agricultural research, there is a growing need to utilize such a process towards bridging hands-on research in the field with a molecular systems understanding. The support for this research from the Rodale Institute, which has supported independent agricultural research for 30 years to provide farmers with the tools and knowledge to increase soil health, crop quality, and yields while simplifying farm management, exemplifies this trend and growing need. In this research, as a use case, this process is tested to identify molecular pathways for a molecular process that is ubiquitous in plant biological systems: C1 metabolism and formaldehyde detoxification, specific to maize. Results from this use case yield a valuable and coherent set of molecular pathways that may be used in future efforts in the computational modelling of C1 metabolism and formaldehyde detoxification.
The systematic bioinformatics literature review process herein involves five parts: 1) framing of the research question; 2) literature collection based on a parallel search strategy; 3) relevant study selection based on search refinement; 4) molecular pathway identification; and 5) integration of key molecular pathway mechanisms to yield a well-defined set of molecular systems associated with a particular biochemical function. This process involves the use of the CytoSolve® Collaboratory™ and bioinformatics platform [2] for managing the entire systematic review process from literature review to molecular pathway identification. Findings from this systematic review produced three main molecular systems: a) methionine biosynthesis; b) the methylation cycle; and c) formaldehyde detoxification.

C1 Metabolism
C1 metabolism is perhaps one of the most important molecular processes in living systems. C1 metabolism provides one-carbon units for proteins, nucleic acids, methylated compounds, and other biomolecules and is found in plants, bacteria, yeast, and mammals [3] [4]. In higher plants this pathway serves to synthesize many important molecules, including methionine, formylmethionine-tRNA, pantothenate, thymidylate, adenosine, and serine, while also providing one-carbon units for important modifications such as DNA methylation.

Formaldehyde in C1 Metabolism
Formaldehyde has been classified as a mutagen, suspected carcinogen, and a highly toxic compound because of its ability to react with proteins, nucleic acids, and lipids [5]. Formaldehyde detoxification, therefore, is vital to the functioning and survival of living systems, and is an important part of C1 metabolism by which higher plants are able to metabolize formaldehyde from the environment [6] [7]. Detoxification of formaldehyde in C1 metabolism occurs via its conversion to formate and eventually to carbon dioxide and water [5].
The major sources of formaldehyde generation in plants are the dissociation of 5,10-CH2-tetrahydrofolate (THF) and the oxidation of methanol, derived mainly from pectin demethylation. Other potential sources of endogenous formaldehyde are other oxidative demethylation reactions, glyoxylate decarboxylation, and cytochrome P-450-dependent oxidation of herbicides such as glyphosate [8] [9]. Since there is significant crosstalk between C1 metabolism and other essential pathways such as serine biosynthesis, adenosine metabolism, and tetrahydrofolate biosynthesis, plants have established complex regulatory mechanisms [10]. The formation and detoxification reactions of formaldehyde are tightly regulated in order to prevent its accumulation [5].

Perturbation of C1 Metabolism in Maize
Maize is an important staple crop whose C1 metabolism has not been fully characterized. The maize kernel is mostly made up of the endosperm and embryo, tissues which are known to express C1 metabolism genes during their development [11]. If formaldehyde detoxification, which is an important part of C1 metabolism, is perturbed, then it is possible that formaldehyde may accumulate in the kernel, which is bound for human or animal consumption.

Importance of Systematic Literature Review to Understand C1 Metabolism and Formaldehyde Detoxification in Maize
A systematic literature review that identifies core molecular systems can be of immense value to an integrative understanding of C1 metabolism and formaldehyde detoxification. For example, formaldehyde levels about 200 times higher than accepted were observed in wild-type corn plants grown with a conventional herbicide [5].
However, in such empirical observations, there is a lack of clarity on the foundational molecular systems that could give rise to such perturbations of formaldehyde. A systematic review of the existing literature and the identification of molecular systems may provide the basis for future computational modelling to yield a detailed, molecular-level understanding of the effect of any such perturbations on C1 metabolism and eventually on the detoxification of formaldehyde in plants.
In this research, therefore, there are four key practical systems biology objectives for conducting this systematic review. These are outlined below.

Review the Pathways and Compartments Involved in C1 Metabolism in Plants
C1 metabolism is essential to all organisms and it provides necessary one-carbon units for proteins, nucleic acids, methylated compounds, and other molecules. In plants C1 metabolism also plays a role in detoxification of harmful one-carbon molecules. This review will reveal the pathways and enzymes involved in one-carbon metabolism and the compartments in which it occurs.

Describe the Regulation of C1 Metabolism in Plants
Such vital functions as protein synthesis and nucleic acid synthesis are highly regulated, and since C1 metabolism plays a role in these processes it too must be regulated. This review will describe how the plant cell controls one-carbon metabolism and the signaling involved in the regulation.

Summarize Formaldehyde's Role in C1 Metabolism in Plants
Formaldehyde is a toxic substance, yet it has been shown to be an intermediate in one-carbon metabolism. This review will determine the role of formaldehyde in this essential process and describe the mechanisms involved in formaldehyde detoxification.

Elucidate Mechanisms of Formaldehyde Detoxification and Accumulation in Maize
Maize is a very important staple crop which requires C1 metabolism, as do all plants, and presumably uses formaldehyde as an intermediate in this process. This review will summarize the formaldehyde detoxification and accumulation mechanisms specific to maize, and also explore tissue-specific differences within maize.

Methods
The systematic bioinformatics literature review involves five steps: 1) framing of the research question; 2) literature collection of an initial set based on a parallel search strategy; 3) search refinement to discover the relevant set; 4) detection of papers for the study set with molecular pathway information; and 5) aggregation to organize key molecular pathway systems associated with a particular biochemical function.
This process was enabled through the use of the CytoSolve® Collaboratory™, a bioinformatics platform [2], which provides a cloud-based infrastructure for: 1) conducting and archiving search results from disparate data sources including PubMed, Google Scholar, and multiple online databases; 2) managing and annotating the identification of the molecular pathway diagrams; 3) integrating molecular pathway diagrams to create large scale molecular systems; 4) managing and identifying modeling parameters such as rate constants, initial conditions, etc.; 5) creating and simulating component molecular pathway models; and 6) integrating component models to create large scale functional and predictive models of biological phenomena. For the purpose of this research, features 1 through 3 were critical.

Framing of the Research Question
This systematic review paper frames the research question as follows: "What are the characteristics of C1 metabolism in plants, especially maize, and what is the role and fate of formaldehyde in this biological process?" This framing motivates the selection of critical search criteria for literature collection.

Literature Collection of Initial Set Based on a Parallel Search Strategy
Literature collection, from an informatics standpoint, was executed to ensure high recall in acquiring the initial set. Based on the research question posed, 24 search criteria were developed, as detailed in Appendix A. The PubMed and Google Scholar databases were searched using the search criteria. This resulted in the execution of 24 parallel, independent searches to produce the initial set.
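The parallel search strategy described above can be sketched in Python as follows. This is only an illustrative sketch, not the CytoSolve implementation: the function names, the example criteria, and the in-memory index standing in for a database query are all hypothetical, and pooling the hits into a de-duplicated set is an assumption about how the initial set was assembled.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel_searches(criteria, search_fn, max_workers=8):
    """Run one independent search per criterion and pool the hits.

    Assumption: hits are pooled as a set, so a paper returned by
    several criteria is counted once in the initial set.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        result_lists = list(pool.map(search_fn, criteria))
    initial_set = set()
    for hits in result_lists:
        initial_set.update(hits)
    return initial_set


# Hypothetical stand-in for a real database query (e.g. PubMed);
# the criteria and paper identifiers below are invented for illustration.
EXAMPLE_INDEX = {
    "C1 metabolism plants": ["paper-1", "paper-2"],
    "formaldehyde detoxification maize": ["paper-2", "paper-3"],
}


def example_search(criterion):
    return EXAMPLE_INDEX.get(criterion, [])


example_initial_set = run_parallel_searches(list(EXAMPLE_INDEX), example_search)
```

Running each criterion independently favors recall: a paper needs to match only one of the criteria to enter the initial set.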

Search Refinement to Discover Relevant Set
Search refinement was executed to increase precision in finding the relevant set of papers. A precision search was performed, by constraining the initial set to C1 metabolism and/or formaldehyde within Titles or Abstracts, to acquire the relevant set.
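A minimal sketch of such a title-or-abstract constraint is given below. The exact matching rules used in the review are not stated, so the case-insensitive pattern, the record layout (`title` and `abstract` keys), and the sample records are assumptions made for illustration only.

```python
import re

# Assumed matching rule: either phrase, case-insensitive, on word boundaries.
TOPIC_PATTERN = re.compile(r"\b(?:C1 metabolism|formaldehyde)\b", re.IGNORECASE)


def refine(papers):
    """Keep papers whose title or abstract mentions either topic."""
    return [
        paper
        for paper in papers
        if TOPIC_PATTERN.search(paper["title"])
        or TOPIC_PATTERN.search(paper["abstract"])
    ]


# Invented records, used only to exercise the filter.
sample_papers = [
    {"title": "Formaldehyde dehydrogenase in Arabidopsis", "abstract": "Enzyme study."},
    {"title": "Folate synthesis", "abstract": "Links to C1 metabolism are discussed."},
    {"title": "Maize kernel development", "abstract": "Starch accumulation."},
]

relevant_titles = [paper["title"] for paper in refine(sample_papers)]
```

Restricting the match to titles and abstracts trades recall for precision, which is the purpose of this step.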

Detection of Papers for Study Set with Molecular Pathway Information
The relevant papers were reviewed by domain experts through the CytoSolve user interface, as shown in Figure 1, to determine the study set, that is, the papers from the relevant set that contained molecular pathway information such as: 1) descriptions of species and reactions of C1 metabolism; 2) cellular compartments containing species and reactions; 3) relevant enzymes; 4) flux through C1 metabolism; 5) perturbations of C1 metabolism; 6) molecular pathways in C1 metabolism; and 7) accumulation of formaldehyde. In this detection process, priority was given to those articles which were the most recent and which contained information and/or studies on maize or closely related grasses.

Aggregation to Organize Key Molecular Pathways
In this systematic review, the final step was the aggregation of the molecular pathway information from the study set to create a final set of key molecular pathway systems associated with the dynamics of C1 metabolism and formaldehyde detoxification. Figure 2 shows an example, from the CytoSolve user interface, of the methionine biosynthesis molecular pathway system (one of the three molecular pathway systems), aggregated from the study set.

Results
The systematic review of literature yielded three significant results, which are described in Sections 3.1, 3.2, and 3.3, respectively. In Section 3.1, the processing results from the five-step process described in the Methods are summarized. In Section 3.2, the three major molecular pathway systems aggregated from the study set for C1 metabolism are described and discussed in detail. In Section 3.3, additional critical insights concerning the regulation of C1 metabolism and the role of formaldehyde in C1 metabolism are presented.

Results from the Systematic Review of Literature
The final results of the systematic review are summarized in Figure 3. Based on the framing of the research question and the application of the search criteria through a parallel strategy, an initial set of 11,597 papers was collected from online databases such as PubMed and Google Scholar. Based on this parallel search strategy, a comprehensive look at C1 metabolism and formaldehyde detoxification in plants was conducted.
The precision search performed on the initial set was constrained to C1 metabolism and/or formaldehyde within Titles or Abstracts of the papers from the initial set and yielded the relevant set, consisting of 216 papers. The relevant set was reviewed by the domain experts using the CytoSolve user interface to identify 64 papers that formed the study set, which forms the basis of this systematic review. A final set of three molecular pathway systems in C1 metabolism was identified from the study set. Figure 3 summarizes the results from the search, collection, and relevance determination.
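The narrowing at each stage can be made explicit with a short calculation. The sketch below uses only the set sizes reported above; the stage labels and the function name are illustrative choices.

```python
# Set sizes reported for the three stages of the review.
STAGES = [("initial set", 11597), ("relevant set", 216), ("study set", 64)]


def retention_by_stage(stages):
    """Return (stage name, size, percent retained from the previous stage)."""
    rows = []
    for (_, previous_size), (name, size) in zip(stages, stages[1:]):
        rows.append((name, size, round(100.0 * size / previous_size, 2)))
    return rows


retention = retention_by_stage(STAGES)
# The share of the initial set that reached the study set.
overall_percent = round(100.0 * STAGES[-1][1] / STAGES[0][1], 2)
```

With these figures, about 1.86% of the initial set entered the relevant set, about 29.63% of the relevant set entered the study set, and about 0.55% of the initial set reached the study set overall.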

Molecular Pathway Systems Derived from the Systematic Review of C1 Metabolism
C1 metabolism is necessary to provide one-carbon units for biosynthetic reactions. The main sources of the one-carbon units are formate, glycine, and serine. Methylation reactions involving S-adenosylmethionine (AdoMet) appear to be the destination for most one-carbon units passed through this pathway. These reactions provide a methyl group to nucleic acids, proteins, lipids, and secondary metabolites. Other common products include pantothenates, adenosine, adenylates, and formylmethionine tRNA [3] [4]. The systematic literature review of the study set yielded three major molecular systems in the C1 metabolism of plants: 1) methionine biosynthesis; 2) the activated methyl cycle; and 3) formaldehyde detoxification. In total, these pathways consisted of 98 biochemical species, 111 reaction kinetic parameters, and 68 biochemical reactions, and are discussed in detail below.

Methionine Biosynthesis
One-carbon units are transferred through the pathway mostly by tetrahydrofolate (THF), a complex molecule present mostly in the cytosol [12]-[14], but there is evidence that THF-mediated C1 metabolism also occurs in the mitochondria and chloroplast [15]. The THF molecule goes through many interconversions, with the typical starting point being the addition of a carbon unit from serine or formate and the typical ending point being the synthesis of methionine. Studies on sycamore reveal that serine is the most common source of C1 units and it typically donates one C1 group [16]. A study in Petunia appeared to support the assertion that methionine is the top destination for the C1 group: moderate levels of exogenous labeled formaldehyde mostly ended up in methionine [17]. Formate is another possible source of C1 units (Hourton-Cabassa et al., 1998), although it can be oxidized to yield carbon dioxide and NADH [18], or two molecules of formate can combine to form glyoxylate [19].
Serine hydroxymethyltransferase (SHMT) catalyzes the conversion of serine and THF to glycine and 5,10-methylene-THF, effectively passing a carbon group from serine to THF [20] [21]. This reaction is very important since serine is the most common source of carbon groups in C1 metabolism. There are multiple SHMT isoforms in higher plants and these genes are active in different compartments [22]. For example, in Arabidopsis there are two active isoforms targeted to the mitochondria, and they work together to handle an increased workload in non-photosynthetic tissues during times of increased demand for C1 metabolism [23]. Glycine decarboxylase (GDC) is an important regulatory enzyme for photosynthesis and photorespiration [24] as well as being involved in C1 metabolism by donating a carbon group from glycine to the pathway. GDC often works in concert with SHMT, with which it has a complex relationship as both are involved in several pathways [25] [26].
Overall, the folate-mediated reactions in C1 metabolism (shown in Figure 4) normally move from formate or serine to methionine. The synthesis of methionine is an important final step in this part of C1 metabolism. Methionine is made from homocysteine and a THF derivative in a reaction catalyzed by a methionine synthase in either the cytosol or chloroplast [27]. C1 metabolism does not always lead to methionine synthesis. Many branches of the pathway include conversion of a THF derivative to formylmethionine tRNA, formylglycinamide ribonucleotide (FGAR), formamidoimidazole carboxamide ribonucleotide (FAICAR), or pantothenate.
One branching of the pathway occurs when 5,10-methylene-THF is used to synthesize thymidylate and the remaining dihydrofolate (DHF) is then reduced to THF [28] [29]. These reactions are catalyzed by the same enzyme, which has both thymidylate synthase and dihydrofolate reductase activity [30]-[33]. This branching is one way that methionine biosynthesis can be avoided and THF recycled.
Since THF is such an important molecule, it is important to know how it is synthesized. THF synthesis involves the chloroplast, cytosol, and mitochondrion. The precursors dihydropterin and p-aminobenzoic acid are synthesized in the cytosol and chloroplast, respectively, and then are imported into the mitochondrion, where THF synthesis is completed [34] [35].
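The one-carbon transfer performed by SHMT can be checked with a simple carbon count. The sketch below is illustrative only: the carbon numbers come from the standard molecular formulas of the free (monoglutamate) compounds, water is included as the additional product of the textbook reaction, and the data layout and names are assumptions rather than part of the reviewed pathway models.

```python
# Carbon atoms per molecule, from standard molecular formulas
# (monoglutamate folates assumed).
CARBONS = {
    "serine": 3,               # C3H7NO3
    "glycine": 2,              # C2H5NO2
    "THF": 19,                 # C19H23N7O6
    "5,10-methylene-THF": 20,  # C20H23N7O6
    "H2O": 0,
}

# serine + THF <=> glycine + 5,10-methylene-THF + H2O
SHMT_REACTION = {
    "substrates": ["serine", "THF"],
    "products": ["glycine", "5,10-methylene-THF", "H2O"],
}


def carbon_count(species):
    """Total number of carbon atoms over a list of species names."""
    return sum(CARBONS[name] for name in species)


def is_carbon_balanced(reaction):
    """True if substrates and products carry the same number of carbons."""
    return carbon_count(reaction["substrates"]) == carbon_count(reaction["products"])


# Serine loses exactly the carbon that THF gains.
carbons_from_serine = CARBONS["serine"] - CARBONS["glycine"]
carbons_to_thf = CARBONS["5,10-methylene-THF"] - CARBONS["THF"]
```

The single carbon lost by serine is the one gained by THF, which is what "passing a carbon group from serine to THF" means in practice.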

Activated Methyl Cycle
Methionine is the starting point for the Activated Methyl Cycle (shown in Figure 5), although not all methionine is dedicated to this cycle. In Lemna paucicostata (common duckweed), Giovanelli et al. (1985) [36] showed that 20% of methionine becomes incorporated in proteins. If methionine remains in the pathway, it is converted to S-adenosylmethionine (SAM) in the cytosol [27] [37] [38] [62]. There is no such conversion in the chloroplast, so SAM must be imported from the cytosol to function in the chloroplast [27].
SAM can then bind to methyltransferase enzymes [39], and the SAM-bound enzymes can then go on to methylate DNA, RNA, proteins, and other biomolecules. After the methylation, the resulting S-adenosylhomocysteine is converted to homocysteine (with adenosine as a byproduct), which is then converted back to methionine [40].
Another part of the Activated Methyl Cycle is the S-methylmethionine (SMM) cycle. Studies in angiosperms show that the SMM cycle short-circuits the Activated Methyl Cycle by converting methionine and SAM into SMM and S-adenosylhomocysteine (SAH) [41] [42]. This cycle appears to consume half the SAM, and its purpose is to limit SAM levels to avoid over-methylation [42] [43].
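The main loop of the Activated Methyl Cycle described above can be written as a small successor map and traced to confirm that it closes on methionine. This is a deliberately simplified sketch: it covers only the main loop, omits the SMM cycle, enzymes, and cofactors, and the function and variable names are hypothetical.

```python
# Each metabolite maps to the next metabolite in the main loop.
ACTIVATED_METHYL_CYCLE = {
    "methionine": "S-adenosylmethionine",
    "S-adenosylmethionine": "S-adenosylhomocysteine",  # methyl group donated here
    "S-adenosylhomocysteine": "homocysteine",          # adenosine released
    "homocysteine": "methionine",
}


def trace_cycle(start, next_step):
    """Follow successors from `start` until the path returns to it."""
    path = [start]
    current = next_step[start]
    while current != start:
        if current in path:
            raise ValueError("path loops without returning to the start")
        path.append(current)
        current = next_step[current]
    return path


cycle_path = trace_cycle("methionine", ACTIVATED_METHYL_CYCLE)
```

Tracing from any member of the loop visits all four metabolites once before returning, reflecting that the carrier is regenerated after each methyl transfer.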

Formaldehyde Detoxification
The folate-independent reactions are mainly concerned with the detoxification of formaldehyde (shown in Figure 6). Endogenous formaldehyde may come from methanol or sarcosine [44]. The detoxification of formaldehyde results in either formate or a THF derivative [4]. Formate is normally oxidized into carbon dioxide in Arabidopsis [45], but it is also known to be a carbon source in C1 metabolism.

Critical Insights in Regulation of C1 Metabolism and Formaldehyde's Role in C1 Metabolism
In addition to the identification of the three major molecular pathway systems in C1 metabolism, the systematic review also revealed critical insights into: 1) regulation of C1 metabolism in plants; 2) formaldehyde's role in C1 metabolism; and 3) the effect of perturbation of C1 metabolism on formaldehyde detoxification and its accumulation in maize.

Regulation of C1 Metabolism in Plants
Since C1 metabolism is so important, plants have established a reversible mechanism, and there are situations in which this pathway appears to proceed in reverse. Li et al. (2003) [45] showed in Arabidopsis that although serine is not normally synthesized through C1 metabolism, if its normal synthesis, via glycolate and glycine, is blocked, then it can be a product of one-carbon metabolism. Also in Arabidopsis, Loizeau et al. (2007) [51] showed that during folate starvation, the C1 flux to nucleotides is increased and the flux to methionine synthesis is decreased. When THF biosynthesis was inhibited in Arabidopsis, serine levels decreased, even though serine could be synthesized without THF [52].
During photorespiration the C1 metabolic pathway appears to go backwards, with the oxidation of THF and formate [53]. Also, during photorespiration it was shown in pea leaf mitochondria that THF is often heavily polyglutamated, lowering its affinity for SHMT and thus favoring the formation of serine, rather than the conversion of serine to glycine, which is expected to be the normal course in C1 metabolism [22].
Finally, in a study of the Arabidopsis S-formylglutathione hydrolase (SFGH), the enzyme was inactivated, via enzyme modification, under oxidizing conditions [54]. This is an interesting feature given the fact that SFGH is important in formaldehyde detoxification.

Formaldehyde's Role in C1 Metabolism in Plants
C1 metabolism plays a key role in the metabolism of formaldehyde [6] [7]. We have identified the molecular pathways that detoxify formaldehyde and lead to its removal from the plant. We also analyzed the consequences of a possible perturbation in the formaldehyde detoxification pathway and the chance of accumulation of formaldehyde in corn. The details are as follows.

Effect of Perturbation of C1 Metabolism on Formaldehyde Detoxification and Its Accumulation in Maize
Formaldehyde detoxification appears to have many paths, but the oxidation to formate is the best characterized. Glutathione-dependent formaldehyde dehydrogenase (FDH) is the most important enzyme in this path of formaldehyde detoxification. A study of the Arabidopsis FDH showed that the capacity of formaldehyde detoxification was proportional to FDH activity [8]. The next enzyme on this formaldehyde detoxification path is SFGH, which is known to be regulated [54]. Perturbations which hindered the functions of either of these enzymes would certainly hinder formaldehyde detoxification.
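The reasoning about enzyme perturbation can be illustrated with a toy model of the two-step path to formate. This is a qualitative sketch under stated assumptions, not a kinetic model: the S-formylglutathione intermediate follows the textbook description of the glutathione-dependent route, and the data layout and function name are hypothetical.

```python
# Ordered steps on the glutathione-dependent path: (substrate, enzyme, product).
DETOX_PATH = [
    ("formaldehyde", "FDH", "S-formylglutathione"),
    ("S-formylglutathione", "SFGH", "formate"),
]


def first_blocked_step(path, inactive_enzymes):
    """Return (enzyme, substrate) for the first step whose enzyme is inactive.

    Returns None when every enzyme on the path is active. The substrate is
    the metabolite that the blocked enzyme would otherwise consume.
    """
    for substrate, enzyme, _product in path:
        if enzyme in inactive_enzymes:
            return enzyme, substrate
    return None


blocked_at_sfgh = first_blocked_step(DETOX_PATH, {"SFGH"})
blocked_at_fdh = first_blocked_step(DETOX_PATH, {"FDH", "SFGH"})
unperturbed = first_blocked_step(DETOX_PATH, set())
```

In this toy view, inactivating either enzyme interrupts the path to formate; whether formaldehyde itself then accumulates depends on alternative detoxification routes, which the sketch does not model.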
Although C1 metabolism is not a high priority in the developing embryo and endosperm in maize [57], C1 metabolism genes are expressed and their proteins are present in these cell types [11] [58]-[60], so a perturbation in this development may affect C1 metabolism, including formaldehyde detoxification and possible accumulation of formaldehyde.

Discussion
A systematic review of literature is a crucial step towards the identification of critical molecular pathway systems for developing predictive computational systems biology models of biological processes. In this paper, a systematic bioinformatics literature review process is used to identify the critical molecular pathway systems involved in C1 metabolism and its role in formaldehyde detoxification.
C1 metabolism in plants is an essential biological process. This review has identified three molecular pathway systems in C1 metabolism: methionine biosynthesis, the methylation cycle, and formaldehyde detoxification. Two major insights have also emerged from this systematic review. One insight is that C1 metabolism normally proceeds from serine to methionine, and then the carbon group is donated to a biomolecule in a methylation reaction. However, in photosynthetic tissues C1 metabolism appears to proceed in reverse, synthesizing serine and oxidizing formate. The second major insight is that the formaldehyde detoxification pathway can be blocked by a modification to S-formylglutathione hydrolase, which may cause the accumulation of formaldehyde if there is no alternative detoxification path.

Future Directions
The molecular pathway systems identified in this systematic review can be used to develop computational systems biology models of C1 metabolism. Such computational models may be valuable in understanding complex biomolecular phenomena such as perturbation of formaldehyde detoxification, the effect of increased oxidative stress, the effect of increased activity of anti-oxidative enzymes, and others. Such modelling may prove valuable in expanding our knowledge of formaldehyde detoxification in maize. Formaldehyde is a toxic molecule, but this review confirms that C1 metabolism is active in maize embryo and endosperm to detoxify formaldehyde and provide carbon for the synthesis of important compounds, such as proteins, nucleic acids, and amino acids.

Figure 1. CytoSolve user interface to identify literature study set containing molecular pathway information on C1 metabolism and/or formaldehyde pertinent to maize and grasses.

Figure 2. The methionine biosynthesis molecular pathway system aggregated from the study set of papers relevant to C1 metabolism and formaldehyde detoxification.

Figure 3. Systematic Review Results. There were 11,597 scientific papers (initial set), which met our search criteria. Of those, 216 (relevant set) appeared to be interesting based on the title and abstract. Upon further review, 64 papers (study set) were chosen as the quality studies upon which this systematic review is based. We identified three major molecular pathway systems (final set) from the study set.

Figure 5. Methylation Cycle. A one-carbon molecule is passed from methionine (cyan) to a methyl group acceptor (CH3 Acceptor; gray) via a methyltransferase enzyme. The carrier molecule is eventually converted to homocysteine, which is converted to methionine (reaction not shown). The S-methylmethionine (SMM) cycle is shown as the conversion of S-adenosylmethionine (SAM) to SMM to S-adenosylhomocysteine (SAH).