Partial Formalization: An Approach for Critical Analysis of Definitions and Methods Used in Bulk Extraction-Based Molecular Microbial Ecology

Partial formalization, which involves the development of deductive connections among statements, can be used to examine assumptions, definitions and related methodologies that are used in science. This approach has been applied to the study of nucleic acids recovered from natural microbial assemblages (NMA) by the use of bulk extraction. Six pools of bulk-extractable nucleic acids (BENA) are suggested to be present in an NMA: (pool 1) inactive microbes (abiotic-limited); (pool 2) inactive microbes (abiotic permissive, biotic-limited); (pool 3) dormant microbes (abiotic permissive, biotic-limited, but can become biotic permissive); (pool 4) in situ active microbes (the microbial community); (pool 5) viruses (virocells/virions/cryptic viral genomes); and (pool 6) extracellular nucleic acids including extracellular DNA (eDNA). Definitions for cells, the microbial community (in situ active cells), the rare biosphere, dormant cells (the microbial seed bank), viruses (virocells/virions/cryptic viral genomes), and diversity are presented, together with methodology suggested to allow their study. The word diversity will require at least four definitions, each involving a different methodology. These suggested definitions and methodologies should make it possible to make further advances in bulk extraction-based molecular microbial ecology.


Introduction
Partial formalization promotes the critical examination of assumptions made in scientific inquiry. This involves developing deductive connections among statements to facilitate the critical examination of their relationships. This approach appears to have the potential to improve the precision of the terminology and corresponding methodologies used in bulk extraction-based molecular microbial ecology. This terminology includes cells, the community (in situ active cells), the rare biosphere, dormant cells (the microbial seed bank), viruses (virocells/virions/cryptic viral genomes), and microbial diversity. The objective was to provide definitions and, with these definitions in place, to suggest methodologies that would be defensible when examined in a partial formalization context.
To approach this problem, it is necessary 1) to provide information concerning sources of nucleic acids that can be recovered from natural microbial assemblages (NMA) by the use of bulk extraction, 2) to present suggested definitions of words and terms used in bulk-extractable nucleic acids (BENA)-based molecular microbial ecology, and 3) to suggest defensible methodologies that will allow their study. The need for precision in language use in science has been discussed by Fang and Casadevall [1]. Partial formalization is suggested here as a source of guidance in developing relationships among words, their definitions, and the methods used for their study.

Partial Formalization
Rudner [2] defined partial formalization in the following way: "In discussing elementary aspects of the subject, we shall count as partial formalizations those negligibly formalized systems exhibiting even one supposedly deductive connection among its statements, or which determine explicitly the usage of even one constituent concept. On the other hand, we shall still count as only a partial formalization an almost complete elaboration of a theory as a deductive system". Partial formalization thus can begin with scientific statements, including definitions, which subsequently can lead to the development of methods that can be used for their study. R. W. Lewis [3] suggested that papers by Lederberg [4], where postulates of a theory for antibody formation were discussed, and a paper by Monod et al. [5] concerned with the nature of allosteric transitions, were valuable examples of the use of partial formalization. In the latter paper [5], Monod and his co-authors provided descriptions of predictive experiments that would prove or disprove possible suggested mechanisms.
Suppes [6], as discussed by R. W. Lewis [3], provided a valuable summary of reasons for using formalization in science that were relevant to concerns for definitions and methodology used in bulk extraction-based molecular microbial ecology: a) Great scientists have done it. b) Clarifies concepts and makes them explicit. c) Makes explicit the fundamental assumptions. d) Clarifies the total structure of a discipline. e) Clarifies the relations of the parts. f) Makes possible the recognition of common aspects of the intellectual enterprise. g) Gives the kind of order that permits one to see both the forest and the trees. h) Enhances objectivity by increasing clarity and reducing foggy meanings. i) Forces completeness of thought by eliminating implicit assumptions. j) Simplifies by reducing the number of fundamental assumptions. k) Provides the best way to convince an opponent of your theory. l) Admits of efficient critical examination by others.
As noted in Figure 1, microbes can be inactive because their abiotic envelopes lie outside of the abiotic envelope of the environment (pool 1), or because their abiotic envelopes occur in the environment but their biotic requirements (nutritional, energetic, symbiotic) cannot be met (pool 2). Other microbes can be considered to be dormant (pool 3). These microbes are inactive when first examined, as their biotic requirements (nutritional, energetic, symbiotic) are not being met at the time of examination; however, their biotic requirements may be met at some time in the future. These inactive, dormant microbes are the "microbial seed bank".
The microbial community comprises only the microbes that are active in situ (pool 4). The viruses, as virions, virocells and/or cryptic viral genomes [7], also can be present in the NMA (pool 5). In addition, significant amounts of extracellular DNA and other nucleic acids can be present in the NMA (pool 6). This eDNA can be protected from degradation by association with biological particles, or by association with minerals [7].
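The distinctions among the four cellular pools can be read as a short chain of deductions. The following is a minimal sketch of that chain, not a laboratory procedure; the function name and the three yes/no inputs are hypothetical labels introduced here for illustration, and establishing each answer for a real cell is the methodological problem discussed in this paper.

```python
from enum import Enum


class Pool(Enum):
    """Cellular nucleic acid pools, numbered as in Figure 1."""
    INACTIVE_ABIOTIC_LIMITED = 1
    INACTIVE_BIOTIC_LIMITED = 2
    DORMANT = 3
    IN_SITU_ACTIVE = 4


def classify_cell(abiotic_envelope_within_environment: bool,
                  biotic_requirements_met_now: bool,
                  biotic_requirements_can_be_met_later: bool) -> Pool:
    """Assign a cell to pool 1-4 from three yes/no observations (hypothetical inputs)."""
    if not abiotic_envelope_within_environment:
        # Abiotic envelope lies outside that of the environment.
        return Pool.INACTIVE_ABIOTIC_LIMITED
    if biotic_requirements_met_now:
        # Active in situ: a member of the microbial community.
        return Pool.IN_SITU_ACTIVE
    if biotic_requirements_can_be_met_later:
        # Inactive now, but able to become active: the microbial seed bank.
        return Pool.DORMANT
    return Pool.INACTIVE_BIOTIC_LIMITED


print(classify_cell(True, False, True).name)
```

Pools 5 (viruses) and 6 (extracellular nucleic acids) are not cells and are therefore left out of this sketch.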

Definitions and Methodologies
Brief definitions for words and terms used in bulk extraction-based molecular microbial ecology are presented together with suggested methodologies.

Cells
Definition: Cells, in a biological sense, can be defined as "a small, usually microscopic mass of protoplasm bounded externally by a semi-permeable membrane … forming the smallest structural unit of living matter capable of functioning independently" [8]. There is no need to prove that these cells are active in situ or capable of growth on any particular medium, or even "alive". The cells must not be compromised by the presence of eDNA in the sample matrix or sorbed on the cell surface.
Suggested methodology: The cells can be derived from pools 1, 2, 3 and 4 noted in Figure 1. Cells can be recovered using a range of physical approaches including differential centrifugation, micromanipulation, filtration, flow cytometry and laser capture-based approaches. The most critical part of this analysis is the need to inactivate or remove viruses or eDNA that may be in the sample matrix or associated with cell surfaces.

The Microbial Community
Definition: A community, in a biological sense, can be defined as "an interacting population of various kinds of individuals (or species) in a common location" [8]. In a microbial context, this has been defined as the in situ active cells [9]. No eDNA should be associated with the sample matrix or sorbed to these cells. These in situ active microbes have abiotic envelopes that occur within the abiotic envelope of the environment being sampled, and their energetic, nutritional and/or symbiotic requirements are being met at the time of sampling of the NMA (Figure 1).
Suggested methodology: The in situ active microbes should be separated from inactive cells before molecular analysis. A variety of techniques can be used, including single-cell micro-manipulation [10]-[12], flow cytometry [13] [14], fluorescence-activated cell sorting, and magneto-FISH [15]. Single-cell genomics [12] [16]-[18] can be used to link microbes that show the desired in situ active phenotype to their molecular information. At the computational stage, molecular information from eDNA can be removed from the analysis [12] [19]. Rinke et al. [20] and Hatzenpichler et al. [21] have applied a combination of these techniques to the single cell-based analysis of microbes in NMA.

Rare Biosphere
Definition: The "rare biosphere" is defined as "a numerically rare subset of the in situ active microbes that occur in natural microbial assemblages". This is based on a dictionary definition of the biosphere as "living beings" together with their environment [8]. For plants and most animals, describing them as "living" is based on their being documented as active in situ. In a microbial context, the word "biosphere", to be consistent, is centered on in situ active microbes.
Suggested methodology: The decision as to what constitutes a numerically "rare" member of the biosphere, the in situ active microbes, is arbitrary; this decision needs to be made and justified by the particular investigator(s).
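Once counts are available for the in situ active microbes, applying a rarity criterion is a simple calculation. The sketch below is illustrative only: the function name, the taxon labels, the counts, and the cutoff of 0.5% relative abundance are hypothetical, and the cutoff in particular stands in for the arbitrary value that each investigator must choose and justify.

```python
def rare_members(active_counts: dict[str, int],
                 max_relative_abundance: float) -> list[str]:
    """Return in situ active taxa at or below an investigator-chosen abundance cutoff.

    `active_counts` must contain counts for in situ active microbes only;
    inactive cells, dormant cells, viruses and eDNA are assumed to have been
    excluded before counting.
    """
    total = sum(active_counts.values())
    if total == 0:
        return []
    return sorted(
        taxon
        for taxon, count in active_counts.items()
        if count > 0 and count / total <= max_relative_abundance
    )


# Hypothetical counts of in situ active cells per taxon.
counts = {"taxon_A": 990, "taxon_B": 9, "taxon_C": 1}
print(rare_members(counts, max_relative_abundance=0.005))
```

Changing the cutoff changes the membership of the rare biosphere, which is why the chosen value needs to be reported together with its justification.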

Dormant Cells (Microbial Seed Bank)
Definition: Dormant microbes (the microbial seed bank) are microbes that have abiotic envelopes that are within the abiotic envelope of the environment being sampled, but their biotic requirements (energetic, nutritional and/or symbiotic) are not being met at the time of NMA sampling (Figure 1, pool 3). This is based on dormancy being defined as "temporarily in abeyance but capable of being activated" [8]. Spores, cysts, heterocysts and other resting forms also may be included in this category.
Suggested methodology: Dormant (microbial seed bank) cells need to be differentiated from microbes that are active in situ, the microbial community (Figure 1, pool 4), from those present in the NMA that are inactive (Figure 1, pools 1 and 2), and from molecular information derived from viruses and eDNA (Figure 1, pools 5 and 6). This will require knowledge of the range of abiotic and biotic changes that can occur in a particular environment. With this information available, individual microbes can be observed, either directly in situ or after recovery by single-cell techniques, and determined to be able to respond and become active in situ under these changed abiotic and biotic conditions.
Environmental changes that could lead to dormant microbes becoming active include changes in oxygen availability (presence/absence/microaerophilic conditions), temperature and pH, as well as changes in nutrients, oxidants, and reductants that have been documented to occur in the particular environment. Single-cell analyses of in situ activity, with changes in environmental conditions, are suggested as the most effective approach for dealing with this question.

Viruses (Virocells/Virions/Cryptic Viral Genomes)
Definition: Defining a virus will require consideration of three states that can occur: inactive virions, virocells (virions interacting with the ribosome-based microbe, forming the ribovirocell), and the cryptic viral genome, the temperate state [7]. Inactive virions can be distributed throughout the abiotic envelope for the NMA (Figure 1, pool 5); they are simply particles. Virions thus do not comprise a community. The genetic information also can become integrated into the host genome in the temperate state, as occurs with bacteriophage [22] [23].
Suggested methodology: Virions constitute nucleic acid pool 5 (Figure 1). Extensive literature is available for the recovery of virions from different environmental matrices. Simple enumeration provides information only on the standing stock of virions. It will be necessary also to document the viruses being replicated in the virocell state, and the occurrence of the viral genome as the cryptic viral genome, the temperate state. Physical recovery procedures should make it possible to recover virions, virocells or temperate forms from the NMA before molecular analyses are completed. This problem has been discussed by Forterre [7].

Molecular Microbial Diversity
Diversity definition 1: Molecular sequence diversity. All molecular sequences can be recovered from nucleic acid pools 1-6 (Figure 1) by the use of bulk extraction.
Diversity definition 1: Methodology. Recovery of nucleic acids from all of the available pools should be quantitative; this may involve repeated extraction from the same sample, until all bulk-extractable nucleic acids have been recovered [24].
Diversity definition 2: Methodology. The molecular sequences, OTU or phylotypes are derived from pools 1-4 (Figure 1), without the requirement that they are proven to be active in situ. These cells should be free of eDNA, either in the matrix or sorbed to the cell surface.
Diversity definition 3: Acellular molecular diversity. Acellular molecular diversity can include virions, virocells and/or the cryptic viral genome.
Diversity definition 3: Methodology. Depending on the research question, these sequences can be recovered as separate nucleic acid pools or as a single pooled sample. Guidance in development of these procedures has been provided by Forterre [7].
Diversity definition 4: In situ active microbe molecular diversity. This involves molecular sequences, OTU or phylotypes derived solely from in situ active microbes, the microbial community.
Diversity definition 4: Methodology. Pre-selection should be used to assure, ideally, that only cells with the desired in situ active characteristics are being analyzed. No eDNA should be present in the sample matrix or sorbed to the cells.
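The relationship between the four diversity definitions and the six nucleic acid pools can be written as a small lookup. The sketch below is one possible reading rather than part of the definitions themselves: the names are hypothetical, assigning definition 3 to pool 5 alone is an assumption made here, and the rule used (a definition is consistent with a sample only if its pools cover every pool the nucleic acids might have come from) is an interpretation of the text.

```python
# Pools (Figure 1 numbering) that each diversity definition draws upon.
DEFINITION_POOLS = {
    1: frozenset({1, 2, 3, 4, 5, 6}),  # molecular sequence diversity
    2: frozenset({1, 2, 3, 4}),        # cells, in situ activity not required
    3: frozenset({5}),                 # acellular; ASSUMPTION: mapped to pool 5 only
    4: frozenset({4}),                 # in situ active microbes (the community)
}


def consistent_definitions(possible_source_pools) -> list[int]:
    """Return the definition numbers whose pools cover all possible source pools."""
    sources = frozenset(possible_source_pools)
    if not sources or not sources <= DEFINITION_POOLS[1]:
        raise ValueError("expected a non-empty set of pool numbers from 1 to 6")
    return [
        number
        for number, pools in sorted(DEFINITION_POOLS.items())
        if sources <= pools
    ]


# Bulk extract of unknown origin: any of the six pools may have contributed.
print(consistent_definitions({1, 2, 3, 4, 5, 6}))
# Cells pre-selected for in situ activity and freed of eDNA.
print(consistent_definitions({4}))
```

Under this reading, a bulk extract whose sources cannot be proven is consistent only with definition 1, molecular sequence diversity.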

Discussion
Partial formalization is based upon developing deductive connections between statements used in science, to make them available for continuing critical analyses. In this context, definitions of cells, viruses (virocells/virions/cryptic viral genomes), in situ active cells (the microbial community), the rare biosphere, dormant cells (the microbial seed bank), and microbial diversity have been developed, together with suggested methodologies that will allow their study, when bulk extraction is used to recover nucleic acids from natural microbial assemblages.
In many cases, the methods suggested as able to provide defensible information related to these definitions present significant methodological challenges. The methodologies suggested for these words and terms, as defined, reflect this reality.
The microbial community presents special challenges. These are the in situ active microbes, usually a minor part of the observable assemblages, whether for filamentous fungi, based on active cytoplasm occurrence [25], or for bacteria, where direct examination for activity has been used [26] [27]. The algae and protozoa present special challenges for activity assessment; direct observation of physiological processes or molecular techniques can be used to provide estimates of their activity [28] [29].
Recovering microbes that represent the rare biosphere is a straightforward process, based on the definition that has been suggested to be defensible. If one has information on the in situ active microbes (phenotypic, molecular), it is necessary only to set numerical criteria for rarity. If the rare biosphere is defined as in situ active microbes that are present at low frequency, the terms "inactive biosphere" [30] and "dormant biosphere" [31] become oxymorons. Dormant cells (the microbial seed bank) also present special problems. As shown in Figure 1, it is necessary to recover molecular sequences only from pool 3, the inactive cells that can become active in situ as the conditions in the abiotic envelope vary.
The term microbial diversity also presents unique challenges. Gest [32] discussed "the meaning of diversity" and he stated that "unfortunately, the word diversity can have several meanings, and the one in mind is frequently not specified". The most informative nucleic acid pool for describing microbial diversity, from an ecological standpoint, will be pool 4, the in situ active microbes. In most studies, however, when BENA are used to describe diversity, the specific source of the nucleic acids is not known and cannot be proven; in this case molecular sequence diversity is the appropriate descriptive term, as discussed in the context of an analysis of the QIIME approach to molecular microbial ecology [33].
All of these definitions, and their methodologies, are impacted by extracellular DNA (eDNA). As discussed by Böckelmann et al. [34], it is possible to amplify 16S rRNA genes from eDNA, and the potential exists for eDNA to be analyzed in other sequencing approaches, such as massively parallel sequencing. In single-cell genomic analyses, sequence information for eDNA can be removed by computational procedures [12] [19]. This is not a trivial problem. Significant amounts of eDNA occur in soils. Pietramellara et al. [35] have noted that "eDNA represents a significant portion of the entire soil metagenome". eDNA is also important in marine environments [36] and in biofilms associated with the human body and disease processes [37], indicating the importance of eDNA as a factor that can influence bulk extraction-based molecular microbial ecology.

Summary
With the diverse sources of bulk-extractable DNA present in natural microbial assemblages, methods must be in place that make it possible to prove the sources of bulk-extracted nucleic acids before conclusions are drawn concerning the source. Baas Becking [38], as translated by de Wit and Bouvier [39], noted that "everything is everywhere, but, the environment selects". In the context of bulk extraction-based molecular microbial ecology, this statement can be paraphrased as "DNA is everywhere, but, the method selects". Partial formalization is suggested to be able to provide defensible definitions and corresponding methodologies for use in bulk extraction-based molecular microbial ecology. The "but" in the paraphrased dictum of Baas Becking clearly places the onus on the scientist; defensible methods must be used that will provide information on terms as defined. If the definitions and methods presented in this communication are determined to be indefensible, new definitions and methodologies should be formulated and subjected to further critical analysis by using partial formalization.