1. Introduction
The Lilium L., belonging to Liliaceae, includes approximately 120 species, with the central distribution area in Hengduan Mountains and the Himalayas, many species of Lilium are cultivated as ornamental [1] or medicinal plants [2]. In 1949, Comber used the classic taxonomic methods to establish the most widely accepted classification of this genus [3]. However, in the genus Lilium, it is difficult to determine the phylogenetic position of some species based on only distinct morphological characters, especially those having great morphological variations across the distribution areas, because some characters, especially flower shape, are possessed jointly by distantly related species. For example, morphologically, L. nepalense D. Don could be placed in the section Sinomartagon for its revolute tepals, but the characters of its slightly funnelform were somewhat similar to those of the plants within the section Liriotypus. Moreover, the results of phylogenetic analyses using rDNA ITS sequences showed that L. nepalense was included in section Sinomartagon [4]. Other results using the ITS and psbA-trnH markers showed that L. nepalense was nested within Nomocharis Clade [5] or with Nomocharis formed a single branch and sister to section Liriotypus, but distant from section Sinomartagon in phylogenetic tree constructed by ITS and trnA-petH region [6]. There are two main reasons for these inconsistent results: firstly, L. nepalense inhabits a wide distribution range accompanied by heterogeneous ecological environments [7], perhaps resulting in high levels of genetic diversity in natural populations; secondly, it is likely that limited data sampling (e.g., partial chloroplast or nuclear gene fragments) were often insufficient to provide high resolutions for resolving the complicated phylogenetic position of L. nepalense within Lilium. Therefore, the phylogenetic position of L. nepalense varies according to different datasets or methodologies.
The chloroplast genome owns a conservative genome structure [8], but there are significant differences in length, and gene sequence between species, which has great advantages in phylogeny, and species delimitation in closely related species [9]. In recent years, with the rapid development of next-generation sequencing technology, an increasing number of complete chloroplast genomes were reported, and provided a promising method for the study of phylogeny and population genomics [10].
In this work, the chloroplast genome of L. nepalense collected from the wet forest in Lushui County, Yunnan Province in Hengduan Mountains was newly sequenced. Combining our data with previously published chloroplast genome data, the phylogenetic tree was constructed to reevaluate the status of L. nepalense. Species used to build the phylogenetic tree include 2 samples of L. nepalense (one from the dry valley in Yuanyang County of Southeast Yunnan [11], and one from an unknown location), 2 species of section Sinomartagon, 3 species of section Liriotypus and 3 species of Nomocharis Clade, and Fritillaria cirrhosa D. Don was used as the outgroup.
2. Materials and Methods
2.1. Plant Material, DNA Extraction, and Sequencing
Samples of L. nepalense were collected from Lushui county in Yunan, China (27˚01'42.85"N, 100˚11'49.87"E) (Table 1). Specimens of they were kept in Herbarium of Kunming University under the collection No. YGS1907020. Genomic DNA was isolated using Super Plant Genomic DNA Kit (DP360) from 50 mg silica gel dried leaves by a modified method and DNA sample quality was examined with agarose-gel electrophoresis, and the concentration was measured by ultra-micro nucleic acid analyzer. The qualified samples were sequenced by the IIinuina Hiseq X platform from Novogene Inc.
2.2. Assembly and Annotation of the Chloroplast Genome
GetOrganelle [12] and CLC Genomics Workbench v8.0 were used to assemble the sequence from raw reads. Gene annotations were carried out by using the web program GeSeq [13] in MPI-MP CHLOROBOX (https://chlorobox.mpimp-golm.mpg.de/index.html) with BLAST search for protein coding genes through the references by cultivated species L. nepalense (GenBank no.: MW136391), tRNAscan-SE server for tRNAs [14], HEMMER for rRNAs [15]. The physical structure of plastomes was drawn by OGDRAW version 1.3.1 [16].
2.3. Phylogenetic Tree Constuction
In addition to two published chloroplast genome data of L. nepalense (MK493301, MW136391), genomes of 9 Sequenced species in section Sinomartagon [L. taliense Franchet (KY009938) and L. primulinum var. ochraceum (Franchet) Stearn (KY748298)], section Liriotypus [L. bakerianum Collett & Hemsley (KY748301), L. amoenum E.H. Wilson ex Sealy (MT880912) and L. souliei (Franchet) Sealy (MW085076)], and Nomocharis Clade [L. gongshanense (Y.D. Gao & X.J. He) Y.D. Gao (MK493297), L. meleagrinum (Franchet) Y.D. Gao (MK493299) and L. pardalinum (Franchet) Y.D. Gao (MH029495)] in Lilium were also downloaded from NCBI used to build phylogenetic tree, Fritillaria cirrhosa (MH593346) was outgroup.
Table 1. Comparison of chloroplast genome among two datas of the L. nepalense D. Don.
Maximum likelihood (ML) estimation with 1000 bootstrap replications was conducted using IQ-TREE 2 [17]. For Bayesian Inference (BI) analysis, MrBayes 3.2.6 [18] was used to BI tree.
3. Results and Discussion
3.1. Genome Content and Organization
The chloroplast genome of L. nepalense was a circular molecule of double-stranded DNA and possessed a total length of 152,206 bp (Figure 1, Table 1). The whole genome sequence contained a large single copy (LSC) of 81,854 bp
Figure 1. The circular map of the cp genome of L. nepalense D. Don.
Figure 2. Phylogenetic tree of L. nepalense D. Don with other species in section Sinomartagon, section Liriotypus and Nomocharis Clade in the genus Lilium constructed on the basis of the whole chloroplast genome. The bootstrap percentage values and Bayesian posterior probability were listed at each node.
and a small single copy (SSC) of 17,563 bp, which were separated by a pair of inverted repeats (IRa/IRb) of 26,399 bp (Table 1). The average GC content of whole chloroplast genome sequence was 37%, and the GC content in LSC, SSC, IRs regions were 34.8%, 30.6%, and 42.5%, respectively (Table 1). There were 135 genes found from L. nepalense plastid genome sequence, which included 89 protein-coding genes, 38 tRNAs, and 8 rRNAs. Among those genes, 20 duplicate genes (8 protein coding genes, 8 tRNA genes, 4 rRNA genes) and 23 introns contained genes were also detected (Figure 1, Table 1). The complete chloroplast genome of L. nepalense was submitted to the genomics into the GenBank with accession number of MW853784. The data obtained in this study are slightly different from those of Wu et al. [11], except that GC content, rRNA gene number and tRNA gene (Table 1).
3.2. Phylogenetic Analysis
The phylogenetic relationships among 10 species, along with sequenced cp genomes were further investigated using IQ-TREE and MrBayes through the likelihood and Bayesian methods with a bootstrap of 1000 replicates to assess the reliability. The phylogenetic results showed that L. nepalense sampled from different habitats (dry valley and wet forest, respectively) was more closely related to each other, this suggests that genetic variations between different populations of the same species were smaller than in different species. Besides, all of the three L. nepalense samples showed more close relationship with L. taliense, belonging to the section Sinomartagon, than to any other Lilium species with a strong bootstrap value (Figure 2), indicating that our results were in accordance with that of traditional taxonomy (Comber, 1949). Which revealed that L. nepalense should be included in section Sinomartagon but then section Liriotypus (Figure 2).
4. Conclusions
Chloroplasts are the sites of photosynthesis in plants and the cp genome usually be used for systematic analysis of their highly conservative genome structure but significantly different gene sequence. We studied the characteristics of the cp genome of L. nepalense, an economic plant with great value as horticultural crops and medicine. Our tree result indicated that L. nepalense should be placed in the section Sinomartagon but the section Liriotypus with highly bootstrap value. This study provides a basis for solving a long-standing puzzle for the status of L. nepalense in Lilium.
Habitat type may play an important role in the genetic divergence of species, which has been proved by studies on Phragmites australis [19] , Blanus sp. [20], Dendrobates pumilio [21] and so on. L. nepalense presents an excellent opportunity for this kind of research. Unlike other Lilium species, L. nepalense has a wide range and is distributed across different habitats. Our study showed that there were some differences in the genetic structure between different habitats (dry valley and wet forest, respectively).
This characterized cp genome will also provide essential data for the development of molecular markers and the further population genetics research of L. nepalense.
Data Availability Statement
The genome sequence data that support the findings of this study are openly available in GenBank under the accession no. MW853784 (https://www.ncbi.nlm.nih.gov/nuccore/MW853784.1/).
Authors’ Contribution
Genshen Yin conceived the idea and wrote the draft of the paper. Shilong Wang performed the data analyses. Shuangshuang Zhang and Wenlei Cheng performed the experiment. Minghua Dong helped perform the analysis with constructive discussions. All the authors contributed to the final version of the manuscript.
Acknowledgements
We thank Mingzhong Mo (Forestry and Grassland Bureau of Honghe Hani and Yi Autonomous Prefecture, Yunnan, China) for his assistance with field sampling. This work was supported by the National Natural Science Foundation of China [Grant No.31860107] and the Basic Research Joint Special Fund project of Local undergraduate universities in Yunnan Province [Grant No.202001BA070001-163].