Cloning and Bioinformatics Analysis of Rosa rugosa S Locus F-Box Gene (RrSLF)

To help explain pollination incompatibility in Rosa rugosa, the full-length cDNA sequence of an S-locus F-box gene was cloned for the first time from the pollen of Rosa rugosa "Zilong Wochi" using RT-PCR and RACE, and was named RrSLF. The full-length cDNA is 1236 bp with an open reading frame of 1122 bp, encoding 343 amino acids. The deduced protein has a molecular weight of 43.7 kD and a calculated pI of 6.24, contains an F-box conserved domain at positions 343–741, and belongs to the F-box family. It is predicted to be a hydrophobic protein located in the cytoplasm. It has no transmembrane domain and no signal peptide cleavage site, and is predicted to have twenty-one Ser phosphorylation sites, seven Thr phosphorylation sites, seven Tyr phosphorylation sites, two N-glycosylation sites, and no O-glycosylation sites. Its predicted secondary structure comprises 22.25% α-helix, 31.37% random coil, 32.17% extended strand, and 14.21% β-turn. This protein shares 59%–61% sequence homology with the SFB/SLF proteins of Prunus fruit species of the Rosaceae, including Prunus speciosa; all of these proteins contain an F-box conserved domain, two hypervariable regions (HVa, HVb), and two variable regions (V1, V2). Furthermore, their phylogenetic relationships are consistent with their traditional classification. These results help to reveal the molecular mechanism of Rosa rugosa pollination incompatibility and to improve the theory and techniques of breeding ornamental Rosa rugosa.


Introduction
Rosa rugosa is a famous traditional Chinese flower of the genus Rosa. It is not only an important flowering plant; its fruits also have considerable ornamental, edible, and medicinal value. Ornamental-fruit R. rugosa is an emerging ornamental plant and has become a new favorite in landscape greening because of its attractive color and appearance, high fruit set, and long fruiting period. However, Rosa rugosa exhibits gametophytic self-incompatibility (GSI), and self-pollination does not produce fruit [1] [2] [3]. Therefore, pollinizer plants must be deliberately selected when Rosa rugosa is used in gardening. Special attention must be given to the selection of varieties and to plant spacing in order to avoid fruitlessness and low fruit set, which would otherwise impair the ornamental and application value of Rosa rugosa. It is therefore important to overcome the self-incompatibility of Rosa rugosa and to breed new self-compatible varieties of ornamental-fruit R. rugosa [4] [5]. So far, however, the mechanism of self-incompatibility in Rosa rugosa has not been reported.
Previous studies have shown that, like other species of the family Rosaceae such as apricot and pear, Rosa rugosa displays S-RNase-mediated gametophytic self-incompatibility, which is regulated by the S-RNase gene expressed in the style and the SFB/SLF gene expressed in the pollen [6] [7] [8]. Ushijima et al. first cloned a pollen-specific SFB/SLF gene from Prunus dulcis in 2003 [9]. Pollen-specific SFB/SLF genes were later cloned from Prunus mume, Cerasus avium, European apricot, Prunus salicina and Prunus armeniaca. Many researchers believe that the pollen-specific SFB/SLF gene encodes F-box proteins that specifically recognize and ubiquitinate heterologous (non-self) S-RNases, resulting in a compatible reaction between pistil and pollen [10] [11] [12]. The discovery of the pollen-specific SFB/SLF gene facilitates investigation of the mechanism of the GSI reaction, but further experiments are needed to prove that the F-box proteins are involved in the interaction between the SCF complex and its substrate protein [13] [14] [15] [16]. Most plant species identified as self-incompatible show only partial self-incompatibility, to varying degrees. Rosa rugosa, however, is completely self-incompatible and therefore serves as an ideal test material for understanding the mechanism of gametophytic self-incompatibility [17] [18] [19].
In this study, we attempted to clone the SFB gene from Rosa rugosa pollen and to analyze it with bioinformatics tools. The purpose was to provide clues for understanding the mechanism of GSI at the molecular level, not only in Rosa rugosa but also in other plant species more broadly.

Plant Material
The plant material, R. rugosa "Zilong Wochi", the most representative traditional rose in China, was obtained from the rose germplasm resources garden at Shandong Agricultural College.

Pollen Preservation
Between May 2016 and June 2016, anthers of robust "Zilong Wochi" plants were collected at 5:00-6:00 pm on the day before blooming. The anthers were taken to the lab and dried so that the pollen could be collected; the styles were also collected. The samples were flash-frozen in liquid nitrogen and then stored in a −80 °C freezer.

Total RNA Extraction and cDNA Synthesis
An EASYspin plant RNA Rapid Extraction Kit from Adlai Biotechnology Co., Ltd. was used to extract total RNA from the R. rugosa pollen tissue. Agarose gel electrophoresis and spectrophotometry were used to determine the quality and concentration of the RNA. An EasyScript First-Strand cDNA Synthesis SuperMix Kit from Beijing TransGen Biotech Co., Ltd. was used to synthesize the first-strand cDNA.

Cloning of the Middle Fragment
Based on the reported SLF sequences of Rosaceae, the degenerate primers F1 (5'-CATCTACTCTGCCTCCACCA-3') and R1 (5'-GAAAGAAAGACCATTGAAGAGC-3') were designed with Primer Premier 5.0. PCR amplification was conducted using the cDNA synthesized in Section 2.2.2 as a template and F1 and R1 as the primers. The reaction system included 1 µL cDNA, 1 µL F1 primer (10 µmol/L), 1 µL R1 primer (10 µmol/L), and 12.5 µL PCR MIX, with ddH2O added to a total volume of 25 µL. The reaction conditions were: 94 °C for 3 min; 36 cycles of 94 °C for 30 s, 55 °C for 30 s, and 72 °C for 30 s; and a final extension at 72 °C for 10 min. Next, 1% agarose gel electrophoresis was used to detect the PCR products. The target PCR fragment was recovered with the MiniBEST Agarose Gel DNA Extraction Kit Ver. 3.0 (TaKaRa). The recovered fragment was ligated into the pMD18-T vector and then transformed into E. coli DH5α. Positive clones were selected and sent to BGI for sequencing.
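As a quick sanity check on the reaction described above, the following sketch computes the GC content and a rough melting temperature for F1 and R1, together with the volume of ddH2O needed to reach 25 µL. The Wallace rule, Tm = 2(A+T) + 4(G+C), is an assumption introduced here for illustration and is only a crude estimate for primers of this length; the function names are hypothetical and are not part of the original protocol.

```python
# Primer sequences as given in the text, written 5'->3' without separators.
F1 = "CATCTACTCTGCCTCCACCA"
R1 = "GAAAGAAAGACCATTGAAGAGC"


def gc_content(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Rough Tm in deg C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


# Reaction components in microlitres, as listed in the text.
components_ul = {"cDNA": 1.0, "F1": 1.0, "R1": 1.0, "PCR MIX": 12.5}
total_ul = 25.0
water_ul = total_ul - sum(components_ul.values())

for name, seq in (("F1", F1), ("R1", R1)):
    print(f"{name}: {len(seq)} nt, GC {gc_content(seq):.1f}%, "
          f"Tm ~{wallace_tm(seq)} C")
print(f"ddH2O to add: {water_ul} uL")
```

Under this rough estimate both primers come out at about 62 °C, a few degrees above the 55 °C annealing step, and 9.5 µL of ddH2O completes the 25 µL reaction.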

Full-Length Gene Sequence Splicing and Verification
DNAstar software was used to splice the middle fragment, the 5'-terminal sequence, and the 3'-terminal sequence in order to obtain the full-length cDNA sequence of the gene. Primers for the 5' and 3' ends of the spliced sequence were designed with Primer Premier 5 as follows: F2 (5'-ATGACGTCCACAATTTGTAAGAA-3') and R2 (5'-TTAATTCGGTAATACCAAACTTTC-3'). The spliced sequence was amplified using the reverse-transcribed cDNA as a template, and the amplified product was then verified.
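The placement of F2 and R2 can be illustrated with a short sketch. It assumes, as the primer sequences suggest but the text does not state explicitly, that F2 begins at the start codon and that R2 is the reverse complement of the 3' end of the coding sequence; the helper name is hypothetical.

```python
# Verification primers as given in the text, written 5'->3'.
F2 = "ATGACGTCCACAATTTGTAAGAA"
R2 = "TTAATTCGGTAATACCAAACTTTC"

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Sense-strand sequence that R2 would anneal against.
sense_3prime_end = reverse_complement(R2)

print("F2 starts with ATG:", F2.startswith("ATG"))
print("Sense strand at R2:", sense_3prime_end)
print("Ends with stop codon TAA:", sense_3prime_end.endswith("TAA"))
```

The forward primer begins with ATG and the sense-strand sequence matching R2 ends in TAA, which is consistent with a primer pair designed to span the open reading frame from start codon to stop codon.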

Cloning of the Rosa rugosa SLF Gene
The cloned middle fragment is 401 bp (Figure 1(a)), the cloned 3'-terminal fragment is 751 bp (Figure 1(b)), and the cloned 5'-terminal fragment is 267 bp (Figure 1(c)). These three fragments were spliced together with DNAstar to obtain a 1236 bp cDNA sequence. The spliced sequence was then validated by PCR amplification (Figure 1(d)). In addition, BLAST analysis showed that all of its homologous genes are SFB/SLF genes, and the gene was therefore named RrSLF (GenBank accession number: KY446808).
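The relationship between the fragment sizes and the assembled length can be checked with simple arithmetic. The sketch below assumes that the reported sizes are exact and that the three fragments cover the assembled cDNA with nothing extra (for example, no adapter or vector sequence); under those assumptions, the excess of the summed fragment lengths over 1236 bp is the total length shared at the two junctions.

```python
# Fragment sizes in bp, as reported in the text.
fragments_bp = {"5'-terminal": 267, "middle": 401, "3'-terminal": 751}
assembled_bp = 1236

# Summed length of the fragments before merging.
summed_bp = sum(fragments_bp.values())

# Sequence counted twice in the sum, i.e. the overlap at the two junctions
# (5'/middle and middle/3'), assuming the fragments contain only cDNA.
total_overlap_bp = summed_bp - assembled_bp

print(f"Sum of fragments: {summed_bp} bp")
print(f"Total overlap across both junctions: {total_overlap_bp} bp")
```

This gives 1419 bp of raw fragment sequence and 183 bp of combined overlap; how that overlap is split between the two junctions cannot be recovered from the reported sizes alone.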

Bioinformatics Analysis of the RrSLF Gene
The RrSLF gene has a full length of 1236 bp and an open reading frame of 1122 bp, encoding 343 amino acids. The deduced protein has a molecular weight of 43.7 kD, a calculated pI of 6.24, and an F-box conserved domain at positions 343–741. The signal peptide prediction result demonstrated that there is no signal peptide cleavage site, thus indicating a non-secretory protein. The transmembrane domain analysis showed that no transmembrane domain exists. The phosphorylation site prediction results demonstrated that there are twenty-one Ser phosphorylation sites, seven Thr phosphorylation sites, and seven Tyr phosphorylation sites, thereby providing a reference for the future study of the regulation of gene expression and protein modification. The glycosylation site prediction results showed that there are two N-glycosylation sites and no O-glycosylation sites. The secondary structure prediction result demonstrated that there is 22.25% α-helix, 31.37% random coil, 32.17% extended strand, and 14.21% β-turn. The BLAST results showed that the protein shares 59%–61% homology with the SFB/SLF amino acid sequences of Prunus fruit species of the Rosaceae, including Prunus speciosa (ADZ76515.1), Prunus armeniaca (AAT69249.1), Prunus pseudocerasus (ADZ74124.1), and Prunus salicina (BAF91849.1), and 22%–30% homology with the SFB/SLF amino acid sequences of non-Rosaceae plants, including Petunia x hybrida (ADD21613.1), Solanum lycopersicum (NP_001316390.1), Populus trichocarpa (RP65220.1), and Antirrhinum hispanicum (CAD56853.1). The multiple sequence alignment result demonstrated that the RrSLF protein and the above plant SFB/SLF amino acid sequences all have an F-box conserved domain, two hypervariable regions (HVa, HVb), and two variable regions (V1, V2) (Figure 2). Furthermore, the constructed phylogenetic tree revealed that RrSLF is closely related to the SFB/SLF of Prunus pseudocerasus, Prunus avium, and Prunus speciosa, which belong to the same family, whereas it is relatively distant from those of Petunia x hybrida, Solanum lycopersicum and Populus trichocarpa, which belong to different families, consistent with the traditional classification results (Figure 3).
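The homology percentages reported above are the kind of figure obtained by counting identical residues in a pairwise alignment. The sketch below is a minimal illustration, not the BLAST procedure itself: the two aligned sequences are hypothetical toy strings, the function name is invented here, and the denominator (columns in which neither sequence has a gap) is one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over ungapped alignment columns.

    Both inputs must come from the same alignment and so have equal length;
    '-' marks a gap. Columns with a gap in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # ignore gapped columns
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment (not real SLF sequences).
toy_a = "MKR-LPEDV"
toy_b = "MKRSLPDDV"
print(f"Identity: {percent_identity(toy_a, toy_b):.1f}%")
```

For the toy pair, seven of the eight ungapped columns match. Tools that divide by the full alignment length, gaps included, would report a slightly lower value for the same alignment.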

Relationship between the S-Locus Gene and GSI
GSI is controlled by the S-locus, which has allelic variants. The S-locus consists of at least two genes: one is specifically expressed in the styles and termed the style-specific S-gene; the other is specifically expressed in the pollen grains and termed grain-

Bioinformatics Analysis of the RrSLF Gene
Major breakthroughs have been made in the study of the pollen-specific S-gene in the families Rosaceae, Scrophulariaceae and Solanaceae. The SFB/SLF gene has been identified as the most likely candidate for the pollen-specific S-gene. In these families, the pollen-specific SFB/SLF gene is located downstream of the style-specific S-RNase gene and is transcribed in the reverse direction. The protein encoded by the pollen-specific SFB/SLF gene consists of one F-box domain, two hypervariable regions and two variable regions. The F-box domain and one variable region are located at the N-terminus of the amino acid sequence; the two hypervariable regions and the other variable region are located at the C-terminus. We found that in RrSLF, the N-terminal amino acid sequence consists of one F-box domain and one variable region (V1), and the C-terminal amino acid sequence consists of two hypervariable regions (HVa, HVb) and one variable region (V2), which agrees with the previous findings. Ushijima et al. proposed that, as in the S-RNase gene, the hypervariable regions of the SFB/SLF gene are the sites at which self-incompatibility acts. The ability of SFB/SLF and S-RNase to recognize each other can be decreased by site-directed mutation. The RrSLF gene obtained in this study contains two hypervariable regions at the C-terminus, with HVa exhibiting higher polymorphism than HVb. Self-compatible mutants can be generated by interfering with the expression of the hypervariable regions and thereby altering the specific recognition of the S-gene. This approach can serve as a new strategy for breeding self-compatible Rosa rugosa varieties by genetic transformation.

Homology Analysis of the RrSLF Gene
The majority of studies on the pollen-specific S-gene have been conducted in the families Rosaceae, Scrophulariaceae and Solanaceae, especially in the genus Prunus. BLAST alignment indicated that the amino acid sequence homology of SFB/SLF between species of the genus Prunus (e.g., Prunus speciosa) is about 80%; the homology between Rosa rugosa and species of the genus Prunus is only about 60%; and the homology between Rosa rugosa or species of the genus Prunus and Petunia x hybrida, which belongs to another family, is less than 30%. These results indicate high phylogenetic variability in the amino acid sequence of SFB/SLF. We further constructed a phylogenetic tree based on the SFB/SLF gene and found that the RrSLF gene had the smallest phylogenetic distance from SFB/SLF genes of species in the same family and the largest phylogenetic distance from SFB/SLF genes of species in different families. This agrees with conventional plant classification. It is inferred that the evolution of the SFB/SLF gene corresponds to the phylogenetic relationships among the plant species from which the gene is derived.
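The link between the homology levels and the tree distances described above can be made concrete with the p-distance, the proportion of differing sites, which equals one minus the fractional identity. The sketch below uses only the approximate levels quoted in the text (about 80%, about 60%, and less than 30%); treating them as exact values, and the choice of the p-distance as the measure, are assumptions made here for illustration rather than the settings used to build the tree.

```python
# Approximate amino acid identity levels quoted in the text, as fractions.
approx_identity = {
    ("Prunus", "Prunus"): 0.80,
    ("Rosa rugosa", "Prunus"): 0.60,
    # Reported as "less than 30%", so 0.30 is an upper bound on identity.
    ("Rosa rugosa", "Petunia x hybrida"): 0.30,
}

# p-distance = proportion of differing sites = 1 - identity.
p_distance = {
    pair: round(1.0 - identity, 2)
    for pair, identity in approx_identity.items()
}

# Rank comparisons from most similar to most divergent.
ranked = sorted(p_distance, key=p_distance.get)

for pair in ranked:
    print(f"{pair[0]} vs {pair[1]}: p-distance ~{p_distance[pair]}")
```

The resulting order (within Prunus, then Rosa versus Prunus, then Rosa versus Petunia) mirrors the within-genus, within-family, and between-family ordering described for the tree.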

Limitation of Our Study
Whether the differences between the Rosa rugosa SLF and those of other species reflect evolutionary origin or different ecological groups remains unclear, and the specific mechanism needs further study. The pollen S-gene and its relationship to the self-incompatibility mechanism of Rosa rugosa also need to be isolated and identified, which would provide valuable experience for further study of the mechanism of self-incompatibility.
BLASTX (NCBI) was used to study the homology of the nucleotide sequence and the deduced amino acid sequence. The ORF finder (NCBI) was used to search for an open reading frame, and the Conserved Domains database (NCBI) was used to analyze the conserved domains. The ProtParam Tool was used to analyze the physical and chemical properties of the protein. Post Prediction, WoLF PSORT, and SubLocv were used to predict protein subcellular localization. Furthermore, ProtScale was used to predict the hydrophilic or hydrophobic properties of the protein. The SignalP 4.0 Server was used to predict the protein signal peptide. The TMHMM Server v2.0 was used to predict the protein transmembrane domain. The NetPhos 2.0 Server was used to predict potential protein phosphorylation sites, and the NetNGlyc 1.0 Server and NetOGlyc 3.1 Server were used to predict potential protein glycosylation sites. ExPASy-SOPMA was used to predict protein secondary structure. DNAMAN 5.2.2 was used to conduct multiple sequence alignment. The Neighbor-Joining method in MEGA5 was used to create the phylogenetic tree.
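The open reading frame search mentioned above can be illustrated with a small sketch. It is a simplified stand-in and not the NCBI tool: it scans only the three forward frames, uses ATG as the sole start codon, and is run on a hypothetical toy sequence; the function name is invented here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_forward_orf(seq: str) -> str:
    """Return the longest ATG-to-stop ORF in the three forward frames.

    The returned string includes the stop codon; an empty string means
    that no complete ORF was found.
    """
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None  # index of the first ATG of the ORF currently open
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None  # look for the next ORF in this frame
    return best


# Hypothetical toy sequence (not the RrSLF cDNA).
toy_cdna = "CCATGGCTTGGAAATAAGG"
orf = longest_forward_orf(toy_cdna)
codons_before_stop = len(orf) // 3 - 1
print(f"ORF: {orf} ({len(orf)} bp, {codons_before_stop} codons before the stop)")
```

A full search would also scan the reverse complement and may allow alternative start codons; those options are omitted to keep the sketch short.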

Figure 2. Multiple alignment of RrSLF with other SLF proteins. Note: The color represents the homology of the sequences; the deeper the color, the stronger the homology.

Figure 3. Phylogenetic tree derived from the alignment of the amino acid sequences of RrSLF and other SLF proteins.