Genome-Wide Identification and Characterization of the Dof Transcription Factor Gene Family in Phaseolus vulgaris L.

The Dof (DNA-binding with one finger) proteins are a class of plant-specific transcription factors that can trigger several processes involved in plant growth and development, as well as in stress responses. Here, we performed a systematic bioinformatics analysis to characterize all Dof genes in common bean, which included analysis of the genome sequence, conserved protein domains, chromosomal locations, subcellular locations, phylogenetic relationships, gene duplications, and gene expression profiles in different tissues. Bioinformatics analysis revealed 36 putative PvDof genes that were classified into seven subfamilies (A, B1, B2, C1, C2, D1, and D2) by comparative phylogenetic analysis. Based on our genome duplication analysis, the 36 genes were found to be distributed on 10 of the 11 chromosomes, and the family expanded through tandem and, predominantly, segmental duplication events during its evolution. Synteny analysis and phylogenetic comparisons of the Dof proteins of common bean with those of A. thaliana, O. sativa, and G. max L. led to the identification of several orthologous and paralogous genes, which provided further insight into the diversity of the evolutionary characteristics of genes of this family in other plant species. Expression profiles revealed that most of the PvDof genes were expressed in different tissues, indicating that PvDof genes may be involved in various physiological functions during plant development. The results of this study provide additional information and potential biotechnological resources for further understanding the molecular basis of this gene family and consequently improvement of common bean crops.

How to cite this paper: Ito, T.M., Trevizan, C.B., dos Santos, T.B. and de Souza, S.G.H. (2017) Genome-Wide Identification and Characterization of the Dof Transcription Factor Gene Family in Phaseolus vulgaris L. American Journal of Plant Sciences, 8, 3233-3257. https://doi.org/10.4236/ajps.2017.812218
Received: October 11, 2017. Accepted: November 26, 2017. Published: November 29, 2017.
Copyright © 2017 by authors and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY 4.0). http://creativecommons.org/licenses/by/4.0/ Open Access

The Dof (DNA-binding with one finger) protein is a plant-specific TF that contains 200-400 amino acids and a single C2C2-type (CX2CX21CX2C-type) zinc-finger-like motif composed of 52 amino acid residues at the N-terminus, which specifically binds to a 5'-(A/T)AAAG-3' element [16] [17]. Dof TFs are involved in several important functions [18], such as root light signaling [19], germination [20], regulation of stomatal development [21], development of the vascular system [22], and responses to biotic [23] and abiotic [24] [25] stress. Identifying and classifying the Dof family in common bean is therefore useful for future research on plant gene expression, particularly because no study to date has identified the members of this family in common bean.
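The two sequence patterns named above can be illustrated with a minimal Python sketch. It searches a DNA string for the 5'-(A/T)AAAG-3' core element and a protein string for the CX2CX21CX2C cysteine spacing. The function names and the example sequences are hypothetical and were made up for illustration; they are not taken from the P. vulgaris data, and this is not the pipeline used in the study.

```python
import re

# Core cis-element bound by Dof proteins: 5'-(A/T)AAAG-3'.
# A lookahead is used so that overlapping sites are also reported.
DOF_SITE = re.compile(r"(?=([AT]AAAG))")

# C2C2 zinc-finger spacing: C-X2-C-X21-C-X2-C.
DOF_FINGER = re.compile(r"C.{2}C.{21}C.{2}C")


def find_dof_sites(dna):
    """Return 0-based start positions of (A/T)AAAG on the given strand."""
    return [m.start() for m in DOF_SITE.finditer(dna.upper())]


def has_dof_finger(protein):
    """Return True if the protein contains the CX2CX21CX2C spacing."""
    return DOF_FINGER.search(protein.upper()) is not None


# Made-up sequences, for illustration only.
example_dna = "GGTAAAGCCAAAAGTT"
example_protein = "MA" + "C" + "PR" + "C" + "A" * 21 + "C" + "RR" + "C" + "KL"

print(find_dof_sites(example_dna))      # start positions of TAAAG and AAAAG
print(has_dof_finger(example_protein))  # cysteine spacing present or not
```

The sketch scans one strand only; a real promoter scan would also need to check the reverse complement.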
Common bean (Phaseolus vulgaris L.) is one of the most important legume crops for human consumption, and is an exceptional source of protein, carbohydrates, and other nutrients [26] [27]. Although Brazil is the world's largest producer of common bean, with an average annual production of 3.5 million tons [28], its common bean productivity is still considered to be low due to several factors, such as the adverse effects of climatic conditions and the occurrence of pests and diseases [29]. Therefore, considering the importance of Dof TFs and the lack of information about this gene family in P. vulgaris, we identified and characterized this gene family in P. vulgaris L. using a computational approach. We identified Dof-coding sequences and characterized them at both phylogenetic and structural levels in order to gain a better understanding of the genetic determinants of tolerance to abiotic and biotic stresses in this

Phylogenetic and Conserved Domain Analysis of Dof Proteins in P. vulgaris
A Maximum Likelihood tree was generated from the aligned amino acid sequences of the Dof genes in order to assess evolutionary relationships. Our analysis revealed a distinct clustering of Dof proteins, and further analysis of the tree topology allowed us to classify the PvDof gene family into four major classes (A, B, C, D) and seven orthologous subclasses (A, B1, B2, C1, C2, D1 and D2, which contained 8, 7, 6, 7, 5, 2, and 1 genes, respectively) (Figure 2). Phylogenetic relationships within multigenic families may provide additional information about the evolution of the Dof genes [9]. We present detailed information about the 25 putative motifs of the Dof gene sequences in P. vulgaris, including names, widths, and best possible matches, in Table 3. Each of these motifs is also illustrated in Figure 2, in which motif 1 represents the Dof domain that is uniformly found in all bean protein sequences.

Gene Structure, Chromosomal Location, and Gene Duplication Events of PvDof Genes
The structural diversity and exon/intron organization of each Dof gene were evaluated (Figure 3). Genes in the subclasses A and D2 contained no introns, whereas genes in the subclasses B1, B2, C1, C2, and D1 all had one or two introns. The structural analyses of the PvDof genes were organized according to the clades of the phylogenetic tree and suggest that, as in other plants, members of the same subclass have similar structures and thus likely perform similar functions.
Chromosomal location analyses revealed that the PvDof genes were located on 10 of the 11 chromosomes (Figure 4) and were unevenly distributed among them. The largest number of PvDof genes occurred on chromosome 2 (six PvDof genes), followed by five each on chromosomes 3 and 6 (Figure 4). In addition, four genes were found on chromosome 9, chromosomes 1, 5, 10, and 11 each possessed three PvDof genes, and one gene was detected on chromosome 7 (Figure 4). We then examined the expansion of the Dof gene family in the P. vulgaris genome. Based on their chromosomal distribution and the high rate of sequence similarity, we determined that 26 duplication pairs arose from segmental and tandem duplication events; the lines in Figure 4 show the connections among these paralogs. Twenty-four of the paralog pairs were the result of putative segmental duplication events. Two pairs of paralogous genes occurred on the same chromosome, separated by only a short distance (<0.2 kb), which suggests that the gene pairs PvDof24/PvDof25 and PvDof34/PvDof35 represent tandem duplications (Figure 4 and Table 4). Our results indicate that segmental duplication predominated in the expansion of the PvDof gene family in common bean, but that tandem duplication was also involved.
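The rule of thumb used above (paralogs lying close together on the same chromosome are treated as tandem duplicates, the rest as segmental) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the distance cutoff `max_tandem_gap_bp`, the function name, the gene labels, and the coordinates are all hypothetical, and treating distant same-chromosome pairs as segmental is a simplification rather than the criterion used in the study.

```python
from collections import Counter


def classify_pair(chrom_a, start_a, chrom_b, start_b, max_tandem_gap_bp=100_000):
    """Label a paralog pair as 'tandem' or 'segmental'.

    Assumption: a pair is tandem when both genes sit on the same chromosome
    within max_tandem_gap_bp of each other; every other pair is segmental.
    The cutoff is a hypothetical default, not a value from the paper.
    """
    same_chrom = chrom_a == chrom_b
    if same_chrom and abs(start_a - start_b) <= max_tandem_gap_bp:
        return "tandem"
    return "segmental"


# Hypothetical pairs: (name, chromosome A, start A, chromosome B, start B).
toy_pairs = [
    ("geneA/geneB", "Chr02", 1_000_000, "Chr02", 1_000_150),
    ("geneC/geneD", "Chr03", 5_200_000, "Chr09", 880_000),
    ("geneE/geneF", "Chr06", 300_000, "Chr06", 24_000_000),
]

labels = {name: classify_pair(ca, sa, cb, sb) for name, ca, sa, cb, sb in toy_pairs}
print(labels)
print(Counter(labels.values()))
```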
We calculated Ka and Ks values, as well as the Ka/Ks ratio, in order to estimate the date of the duplication events (Table 4). Segmental duplication events of the Dof genes in common bean occurred from 2.13 mya (million years ago) (Ks = 0.04) to 26.06 mya (Ks = 0.44), with a mean of 11.54 mya. However, the dates of the tandem duplication events could not be estimated because these gene pairs (PvDof24/PvDof25 and PvDof34/PvDof35) differed only in their intron sequences. The Ka/Ks ratio of all duplication events was >0.3, which implies that significant functional divergence could have occurred after duplication. The Ka/Ks ratios of six duplicate pairs were <1.0, indicating that the PvDof genes evolved under negative selection acting against protein-coding changes. These results suggest that segmental/tandem expansion of the Dof gene family in common bean could be dated to relatively recent duplication events.
Synteny analysis was performed between the P. vulgaris Dof genes and those of two other plants, one a dicot (A. thaliana) and the other a monocot (O. sativa). In addition, synteny analysis was performed on G. max, a legume closely related to P. vulgaris [37]. As such, three comparative synteny maps were constructed, consisting of P. vulgaris against A. thaliana, O. sativa, and G. max (Figure 6). A total of 123 pairs of orthologous genes with synteny relationships were identified. Seven pairs of Dof genes were found with synteny relationships between Arabidopsis and common bean, involving five AtDof genes and five PvDof genes (Supplementary Table S1). Only two pairs of syntenic Dof genes were found between common bean and rice, involving two OsDof genes and one PvDof gene (Supplementary Table S2). A total of 114 pairs of synteny relationships were found between soybean and common bean, involving 62 GmDof genes and 33 PvDof genes (Supplementary Table S3). However, no synteny was observed for the PvDof03, PvDof31, and PvDof35 genes, suggesting that these genes were formed following the divergence of P. vulgaris and G. max. It would appear that the Dof genes in P. vulgaris share an origin with those in A. thaliana, O. sativa, and G. max, but that subsequent expansion of the PvDof genes occurred following the monocot/dicot divergence. In addition, we observed clear losses and/or duplications of several of the Dof genes in the genomes of these plants.
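The duplication dates reported earlier in this section follow from the Ks values through the standard relation T = Ks / (2λ), where λ is the synonymous substitution rate per site per year. The sketch below shows the arithmetic. The rate λ = 8.46 × 10^-9 is an assumption: it is not stated in this section and was chosen here only because it reproduces the reported upper bound closely. The function name is likewise hypothetical.

```python
def duplication_age_mya(ks, rate=8.46e-9):
    """Estimate the age of a duplication in millions of years.

    Uses T = Ks / (2 * rate). The default rate (substitutions per
    synonymous site per year) is an assumed value, not one given
    in the text.
    """
    return ks / (2.0 * rate) / 1e6


# Rounded Ks values quoted in the text.
for ks in (0.04, 0.44):
    print(f"Ks = {ks:.2f} -> {duplication_age_mya(ks):.2f} mya")
```

Under this assumed rate, Ks = 0.44 gives about 26.0 mya, close to the reported 26.06 mya. The rounded Ks = 0.04 gives a somewhat older date than the reported 2.13 mya, which would correspond to an unrounded Ks near 0.036.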

Transcription Profiling of PvDof Genes in Different Tissues
We analyzed the transcriptional profiles of all 36 PvDof genes in 11 different plant tissues (young pods, stem_10, stem_19, flower buds, flowers, root_10, nodules, root_19, green mature buds, leaves, and young trifoliates) (Figure 7). The expression patterns indicated that the PvDof10, PvDof30, PvDof36, PvDof12, and PvDof27 genes, which belong to classes A and C, were preferentially expressed in young pod and stem tissues. We then examined the response of the PvDof23 and PvDof03 genes in subclass C1, as these were expressed only at very low levels in almost all of the tissues and organs of common bean (Figure 7).
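The kind of tissue-level summary described above can be sketched as follows: FPKM values are log-transformed for display, the tissue with the highest expression is reported for each gene, and genes that stay below a cutoff in every tissue are flagged as lowly expressed. Everything specific here is an assumption made for illustration: the log2(FPKM + 1) transform, the 1.0 FPKM cutoff, the function names, and the toy values, which are invented and are not the expression data behind Figure 7.

```python
import math


def log2_fpkm(fpkm):
    """Log-transform an FPKM value; the +1 pseudocount avoids log(0)."""
    return math.log2(fpkm + 1.0)


def top_tissue(profile):
    """Return the tissue with the highest FPKM in a {tissue: FPKM} mapping."""
    return max(profile, key=profile.get)


def is_low_everywhere(profile, cutoff=1.0):
    """True if the gene stays below the (assumed) FPKM cutoff in all tissues."""
    return all(value < cutoff for value in profile.values())


# Invented FPKM values for two hypothetical genes.
toy_profiles = {
    "geneX": {"young pods": 40.0, "stem_10": 35.0, "leaves": 2.0, "nodules": 0.5},
    "geneY": {"young pods": 0.2, "stem_10": 0.1, "leaves": 0.3, "nodules": 0.0},
}

for gene, profile in toy_profiles.items():
    scaled = {tissue: round(log2_fpkm(v), 2) for tissue, v in profile.items()}
    print(gene, top_tissue(profile), is_low_everywhere(profile), scaled)
```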

Discussion
The Dof gene family, which is found in many plant species, is responsible for numerous transcription regulation functions associated with various biotic and abiotic stress responses. This gene family is especially prominent in such plants as Arabidopsis spp. and O. sativa [5], G. max [40], S. lycopersicum [9], S. officinarum [43], and P. heterocycla [12]. In this study, we identified a total of 36 PvDof genes in P. vulgaris (Table 1). The number of PvDof homologs identified in this study was similar to that found previously in Arabidopsis, rice, sorghum, and poplar [5] [7] [15]. Our results indicated that the Dof genes in P. vulgaris are highly similar to those in other species. Our results also revealed that the conserved C2C2-Dof domain was uniformly observed in all PvDof proteins. The presence of this domain is the criterion for considering a protein a functional TF of the Dof gene family [40] [44]. Although the same number of Dof genes was found in Arabidopsis (36) and common bean (36), the common bean genome, at 650 Mb [45], is considerably larger than the Arabidopsis genome, at 145 Mb [46]. As shown in Table 1, Cai et al. [9] found 34 genes in tomato (with a genome size of 950 Mb), indicating that the number of Dof genes is not proportional to genome size.
Accurate classification was important for understanding the structures, functions, and evolution of the PvDof genes. In order to gain further insight into the evolutionary relationships between PvDof genes in common bean, we evaluated the exon/intron structural organization of all gene sequences. There were between zero and two introns in each gene, and most members of the same class/subclass shared similar intron/exon organization (Figure 3). Our results corroborate those found in other species, such as Arabidopsis [5], Cucumis sativus [47], and S. lycopersicum [9]. Divergence in the intron/exon structure can provide important information on evolutionary factors when processing the phylogenetic relationships of several multigenic families found in plants [48]. In addition, the MEME motif search tool was employed to identify and understand the diversity of the motifs in the PvDof genes, and we identified 25 different conserved motifs across the Dof protein sequences in P. vulgaris. The majority of PvDof genes within the same subclass shared similar motifs, suggesting that these conserved motifs are closely related and implying functional similarities between the proteins (Figure 2). Analysis of gene structure and conserved motif position provides additional information about the evolutionary relationships of this family in P. vulgaris [11].
Gene family expansion in plants is primarily the result of segmental/tandem duplication and transposition events. Gene duplication on different chromosomes is often due to segmental duplication events, whereas the presence of two or more genes on the same chromosome indicates a tandem duplication event [49]. Thus, we analyzed the chromosomal distributions of the PvDof genes, which are shown in Table 4. We identified 24 pairs of paralogous genes randomly scattered throughout the genome, which we considered to be evidence of segmental duplication, whereas two pairs of genes found on the same chromosome were considered to be evidence of a tandem duplication event. Gene duplication plays an important role in gene family expansion and functional diversification [50]. Comparing the ratio of non-synonymous (Ka) to synonymous (Ks) mutations provides a means of analyzing positive and negative selection of specific amino acid sites within the total length of Dof protein sequences between the different groups [11]. Analysis of the Ka/Ks ratio indicated that, despite differences between the Ka/Ks values, most were substantially less than or equal to one, which suggests that the sequences within each class are under strong purifying selection pressure, although positive selection may also have acted.
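The conventional reading of the Ka/Ks ratio discussed above (below one suggests purifying selection, near one neutral evolution, above one positive selection) can be written as a small helper. This is a minimal sketch: the function name, the tolerance used to call a ratio "neutral", and the example values are hypothetical and are not taken from Table 4.

```python
def selection_mode(ka, ks, tol=1e-9):
    """Classify a paralog pair by its Ka/Ks ratio.

    ka: non-synonymous substitutions per non-synonymous site.
    ks: synonymous substitutions per synonymous site (must be > 0).
    tol: assumed tolerance for treating the ratio as exactly 1.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Made-up (Ka, Ks) values for illustration.
for ka, ks in [(0.05, 0.25), (0.30, 0.20)]:
    print(f"Ka/Ks = {ka / ks:.2f} -> {selection_mode(ka, ks)}")
```

Pairs with Ks = 0, such as duplicates that differ only outside the coding sequence, cannot be classified this way, which is why the helper raises an error for them.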
Phylogenetic comparison and the construction of synteny maps of common bean Dof proteins showed that they were most similar to soybean Dof proteins, which reflects the similarity between the genomes of the two species. We found extensive synteny between P. vulgaris and G. max: most of the Dof genes identified in common bean (33 of 36, or 91.67%) were in synteny with Dof genes in G. max. Previous studies have shown that P. vulgaris and G. max shared a whole-genome duplication (WGD) event ~56.5 mya and diverged from a common ancestor ~19.2 mya [37] [51]. In addition, G. max experienced an independent WGD ~10 mya [37] [52]. This became evident when we compared the number of orthologous genes between these two species, in which 33 syntenic PvDof genes from the common bean genome exhibited a 1:2 mapping to 62 syntenic GmDof orthologs in soybean. The PvDof05, PvDof25, and PvDof35 proteins appear to be unique to common bean, suggesting that these genes may have specific regulatory functions in this species, and may be involved in different physiological processes, although confirmation of this hypothesis requires further research.
Expression profiles were analyzed to determine the specificity of the Dof genes in common bean, which revealed that most of the PvDof genes were expressed in different tissues; moreover, detailed analysis of the expression patterns indicated that most genes grouped in the same subgroup had similar expression profiles. As shown in Figure 7, the expression levels of the PvDof10, PvDof30, PvDof36, PvDof12 and PvDof27 genes belonging to classes A and C were relatively higher in young pod and stem tissues, indicating that they may play important roles in the development of these tissues in bean. Wang et al. [14] reported that CaDofs28, CaDofs10, CaDofs14, and CaDof16 were primarily expressed in the stems of Capsicum annuum, which is perhaps unsurprising given that the stem contains abundant vascular tissue; Kim et al. [53] also observed, in Arabidopsis, that the AtDof5.1 gene was highly expressed in vascular tissues. These expression profiles suggest that PvDof genes may be involved in various physiological functions during plant development.

Conclusions
Here, we examined the genome sequence, classification, chromosomal locations, and conserved motifs of the 36 Dof genes in common bean via genome-wide analysis. The PvDof genes were distributed on 10 chromosomes, and the high degree of variation in their sequences provided potential evidence for diversifying functions. Multiple alignment of the PvDof sequences revealed highly conserved cysteine residues, which are considered to be a unique feature of Dof TFs.
In addition, extensive in silico characterization of these proteins will provide insight into the diversity of their genetic structures in terms of numbers and intron/exon positions, as well as in terms of their functional diversity. Finally, phylogenetic comparisons of common bean Dof proteins with those found in Arabidopsis, rice, and soybean led to the identification of several orthologous and paralogous genes, which furthers our understanding of the evolutionary characteristics of this family of genes in P. vulgaris and other plant species. The results of this study provide additional information and potential biotechnological resources for further understanding the molecular basis of this gene family and consequently improvement of common bean crops.

Figure 1. Multiple sequence alignment of Dof domain sequences from the proteins of P. vulgaris. The typical features of Dof proteins showing four cysteine residues are indicated. Below the alignment, the conserved residues of amino acids are represented in blue in the upper boxes.

Figure 2. Phylogenetic relationships and organization of conserved motifs of Dof gene sequences in common bean. The phylogenetic tree of 36 PvDof proteins was constructed using MEGA; the motifs identified by MEME software are represented by colored boxes and their consensus sequences are shown in Table 3.

Figure 3. Schematic diagram of exon, intron, and untranslated region (UTR) organization, as indicated by yellow rectangles, gray lines, and blue rectangles, respectively.

Figure 4. Physical map of PvDof genes showing their chromosomal locations. Vertical bars represent the chromosomes and numbers at the left indicate gene positions (the scale on the left is in megabases, Mb). The chromosome number is indicated on the top of each chromosome (vertical bar). Red and green lines reflect segmental and tandem duplications, respectively. Data extracted from Table 4.
To evaluate the evolutionary relationship of the Dof gene family among different plants, a phylogenetic tree was generated from the amino acid sequences of P. vulgaris, A. thaliana, O. sativa, and G. max. Maximum Likelihood analysis revealed a distinct clustering pattern of Dof proteins, and the tree topology allowed us to classify the Dof gene family into four major classes (A, B, C, and D) and nine orthologous subclasses (A, B1, B2, C1, C2, C3, D1, D2 and D3) (Figure 5). Of these, classes C and B were the largest, containing 63 and 41 orthologs and accounting for 36% and 23% of the total predicted number of Dof genes, respectively, whereas class A, the smallest class, contained only 35 members and accounted for 19% of predicted Dof genes. The number of clusters found here was similar to the results of previous research [5] [41]. Distribution among the subclasses was interwoven for the majority of the Dof members, indicating that Dof gene family expansion occurred prior to the divergence of common bean, Arabidopsis, soybean, and rice. The subclasses C3 and D3, which were species-specific to Arabidopsis and rice, respectively, may be the result of a gene loss event during dicot-monocot divergence [41] [42].

Figure 5. Phylogenetic tree of the amino acid sequences of Dof genes generated from 36 sequences of P. vulgaris, 36 sequences of A. thaliana, 30 sequences of O. sativa, and 78 sequences of G. max, using 1000 bootstrap replicates. Individual PvDof subgroups are identified by the different colors on the tree.

Figure 6. Genome-wide synteny analysis of Dof genes. (a) Comparative map between P. vulgaris and A. thaliana. (b) Comparative map between P. vulgaris and O. sativa. (c) Comparative map between P. vulgaris and G. max.

Figure 7. Heatmap showing the expression profiles of common bean PvDof genes across different tissues based on specific libraries. FPKM average values were used, and hierarchical clustering in the different tissues is represented by the color scale. Tissues included in the analysis consisted of young pods, stem_10, stem_19, flower buds, flowers, root_10, nodules, root_19, green mature buds, leaves, and young trifoliates.

Table 1. Comparison of the number of Dof genes of each class among P. vulgaris, A. thaliana, O. sativa, G. max, and S. lycopersicum.

Table 2. General physical and chemical characteristics of the 36 PvDof genes identified in P. vulgaris.

Table 3. The MEME motif sequences and lengths in PvDof proteins.

Table 4. Date of duplication of the pairs of paralogous genes of the PvDof gene family. Ka represents the number of non-synonymous substitutions per non-synonymous site, Ks the number of synonymous substitutions per synonymous site, and Ka/Ks the ratio of non-synonymous (Ka) to synonymous (Ks) substitutions.