1. Introduction
Mucins are a part of the glycoprotein family and the main proteinaceous component of mucus, produced by epithelial cells. [1] [2] [3] synthesized most mucins form a thick mucus layer, while others form a glycocalyx coating [4] [5] . Mucin’s general structure and biochemical properties such as, ysozymes, immunoglobulins, complex mixture of proteins, ions, lipids, and other components [6] , protect cell surfaces, and their specific molecular structures regulate the local molecular microenvironment near the cell surface [7] . Additionally, they act as a protective barrier; block the passage of bacteria, large molecules, and other infectious organisms [8] . Mucins also play roles in several other biological functions, including the regulation of gene expression, cell proliferation, cell differentiation, embryogenesis, immunity, apoptosis, lubricator for internal organs in the body by forming mucus, and key role between epithelial cells [9] . Moreover, mucins play a role in carcinogenesis [10] and they are involved in the innate immune system [11] [12] . Consequently, mucins are multifunctional proteins and likely undertake multiple roles also in cnidarian species. In previous cnidarian mucins study by [13] a repertoire of mucin genes was identified in A. veratra, but the study did not investigate the mucin expression in this species.
The sea anemone A. veratra belong to phylum Cnidaria, within class Anthozoa and order Actinaria [14] [15] is also known as green shore anemone [15] . A. veratra is a native to Australian and New Zealand coasts [15] [16] . In the intertidal zone, A. veratra can be found to be covered with coarse sand and shell grit (Figure 1) and is exposed to air during the tidal cycle. To cope with this periodic exposure, it produces large quantities of mucus as an external covering. The production of large amounts of mucus has been proposed as one potential adaptation to aerial exposure in this environment, but gene expression patterns of mucin genes have not been examined in this species in response to temporal and spatial environmental stresses.
Figure 1. Sea anemone A. veratra. The figure shows in (a) the sea anemone A. veratra covered with coarse sand and shell grit (Image from Silva, 2013). (b) A. veratra at high tide. The A. veratra size up to 6 cm in length and up to 4 cm in diameter (Image by Alaa Haridi and Physiological Genomics Lab group in the Queensland University of Technology, 2014, 2015).
Previous studies in cnidarians have examined the response of genes to different environmental stresses. These studies have included the following stresses: temperature in Montastraea faveolata, A. pallida [17] and Porites astreoides larvae [18] ; thermal stress in A. millepora [19] ; sunlight stress in A. tenebrosa [20] ; and air exposure in Palythoa caribaeorum [21] ; and Veretillum cynomorium [22] . In fact, these studies have investigated the influence of several environmental stress factors on different stress response genes in cnidarians, but no studies have examined the expression of mucin genes under any stress treatment.
To date, sea anemone molecular studies have suggested the potential role of the mucin genes in wound healing and in the development stages in N. vectensis [23] [24] . None of these studies has examined the response of mucin genes, in particular, not in the green snakelock A. veratra. What the molecular role of the mucin genes is under aerial exposure stress is still obscure. This deficiency highlights the need to study the expression of mucin genes under simulated environmental stresses in cnidarians, such as intertidal sea anemones.
This current study addressed this question by examining and analysing the response of A. veratra mucin genes in response to an experiment involving three hours of aerial exposure. The high-throughput next-generation sequencing technology and bioinformatics analyses were used to complete this study. This information will help researchers to understand mucins potential significant role in protecting this species during low tides.
2. Materials and Methods
2.1. Sample Collection and Air Exposure Experiment Design
Two samples of A. veratra were collected from Port Cartwright, Queensland, Australia, in February 2016 and maintained at the marine laboratory at the Queensland University of Technology. In this facility, they were housed in 50 L aquarium glass tanks under controlled conditions that reflected their natural environment as detailed in [25] for details. The first A. veratra individual was aerially exposed for three hours before RNA extraction, while the second sample was kept immersed in the tank as a control. In this study, the air exposed sample was named (Air) and the control sample, named (Water) (Figure 2).
Figure 2. Experiment samples. Aulactinia veratra, water sample fully submerged in water (immersed as a control) and air sample (3 hours of exposure to air). (Images by Alaa Haridi and Physiological Genomics Lab group in the Queensland University of Technology, 2014, 2015.)
2.2. RNA Extraction, Library Preparation, Sequencing, Assembly and Quality Assessment
Individual sea anemones (air and water samples) were snap frozen in liquid nitrogen and stored at −80˚C until RNA extraction. The individuals were homogenised in liquid nitrogen and total RNA was extracted from the whole organisms using a Trizol/chloroform RNA extraction protocol [26] [27] . RNA quality and integrity were tested using a bioanalyzer 2100 RNA nano chip following the protocol of [28] . Library preparation was undertaken using an Illumina True-Seq stranded mRNA sample preparation kit (Illumina) and following the manufacturer’s instructions. Sequencing was performed on the Illumina NextSeq 500 platform using 150 bp paired end reads. The non-biological sequences as well as low quality reads (Q < 20) were removed using Trimmomatic [29] [30] . High-quality sequence reads with thresholds of Q > 20 were used for assembly and other downstream analyses. The raw RNA sequence reads were deposited to GenBank® at the National centre of biotechnology information (NCBI), the accession numbers are available in the supplementary file: Table S1.
The two sets of clean reads were assembled using the Trinity v2.0.6 short read de novo assembler using default settings and the stranded tag [31] . CD-Hit v.4.6.1 [32] [33] was used to remove redundant and chimeric sequences from all assemblies [28] . In addition, the core eukaryotic genes mapping approach (CEGMA v.2.5) [34] was used to assess the assembly completeness by determining the percentage of full-length sequences corresponding to 248 highly conserved eukaryotic proteins [26] .
2.3. Functional Annotation and Gene Ontology
Transcriptome annotation was conducted using the Trinotate pipeline V3.0 following method detailed as per [13] for details. Gene Ontology (GO) terms were assigned to contigs that received BLAST hits and had functional annotation information. The distribution of GO terms across Molecular Function (MF), Biological Process (BP) and Cellular Component (CC) categories were visualised in WEGO [35] as per [36] .
2.4. Read Mapping and Differential Gene Expression Analysis
BOWTIE2 software v.2.2.5 was used to map reads back to the reference assembly [37] for both species in which differential gene expression analysis was undertaken. Transcript abundance was estimated using RSEM v1.2.19 [38] . Fragments per kilobase of transcript per millions (FPKM) was calculated for all transcripts. Differential gene expression (DGEs) analysis was performed using the EdgeR Bioconductor package [39] in the trinity pipeline following the scripts at (https://github.com/trinityrnaseq/trinityrnaseq/wiki/Trinity-Differential-Expression). The gene list with statistically significant differential expression based on a false discovery rate (FDR) of <0.001 and log fold change (FC) ≥ 2 were then clustered using hierarchical clustering to produce a heatmap at (https://github.com/trinityrnaseq/trinityrnaseq/wiki/Trinity-Differential-Expression). The list of differentially expressed genes was filtered to find if any of the mucin candidates were in this list.
2.5. Gene Set Enrichment Analysis (GSEA)
Gene set enrichment analysis was conducted on the list of DGEs using trinity scripts at (https://github.com/trinityrnaseq/trinityrnaseq/wiki/Trinity-Differential-Expression). The results were filtered to determine if specific ontologies were over or under-represented in comparison to the overall assembly based on FDR of <0.01.
3. Results
3.1. Sequencing, Assembly, Annotation, and Gene Ontology
A total of 69,513,111 clean reads were assembled into 118,019 contigs. The assembly was highly complete and contained a high proportion of full-length transcripts (98.4%). More statistics are available in the supplementary file: Table S2.
Overall, 47,513 contigs returned significant BLASTx hits with a stringency of 1E × 10−5, and gene ontology terms were assigned to 35,574 contigs. The distribution of GO terms showed that the highest number of GO terms was assigned to BP, then MF and finally the least terms were assigned to CC. In the BP category, cellular process (20,728 gene) (74.5%), metabolic process (14,716 genes) (52.9%) and biological regulation (12,237 genes) (44%) were the most assigned GO terms. Binding (18,438 genes) (66.3%) and catalytic activity (11,099 genes) (39.9%) were the top assigned GO terms under the MF category. The greatest number of GO terms assigned under the CC category were; cell (22,942 genes) (82.4%), cell part (22,940 genes) (82.4%) and organelle (15,978 genes) (57.4%). WEGO plots was presented in the supplementary: Figure S1.
3.2. The Effect of Air Exposure on Mucus Production in A. veratra
At the beginning of the experiment, the A. veratra oral disc and tentacles were completely closed. Then, during the first hour, the oral disc opened slightly while the tentacles were still drawn, and there was some mucus production. During the second hour, there was a clear increase of mucus covering the oral disc. During the third hour, mucus production increased again.
3.3. Comparative Different Gene Expression Profiles, and Cluster Analysis across Water and Air Treatments
Overall 5686 differentially expressed sequences were found in A. veratra across water and air treatments. The 5685 sequences included 3243 genes significantly up-regulated in the water treatment, while the remaining 2443 genes were up-regulated in the air treatment. The expression level across the two samples based on (FC ≥ 2 and FDR of ≤0.001) is shown in a hierarchically clustered heat map (Figure 3). Contigs with up-regulation are indicated in yellow, the contigs with down-regulation indicated, in purple. The contigs expression pattern similarity across the samples is indicated by the three generated sub-cultures, as shown in Figure 4.
Figure 3. Cluster analysis heat map. Hierarchically clustered heat map showing the RNA-sequence expression levels of 5686 differentially expressed contigs across the water and air samples based on (FC ≥ 2 and FDR of ≤0.001). The log2-transformed median-centred FPKM was the expression values. The up-regulation and down-regulation contigs are indicated by the yellow and purple colour intensities, respectively. The similarity of expression patterns across the samples is represented by the three levels of clustering as indicated by the blue, red and green coloured bars. These three levels correspond to the sub-clusters shown in Figure 4.
Figure 4. The three levels of clustering. Cluster plots show the contigs with similar expression patterns across the two samples (air and water) resulting into three sub-clusters. The grey lines show the expression of the contigs, which was measured in log2-transformed median-centred FPKM, while the blue lines show the expression profile for each cluster.
3.4. Aulactinia veratra Mucin Genes Differentially Expressed across Water and Air Treatments
Overall, mucin5B-like, mucin4-like and mucin-like genes were differentially expressed. Specifically, the mucin5B-like and mucin4-like were differentially expressed under the aerial exposure treatment, while the mucin-like was differentially expressed in the water treatment.
3.5. Gene Set Enrichment and Functional Analysis
Functional enrichment analysis was performed on the differentially expressed contigs, and it revealed a number of over-represented GO terms in the air and water treatments. The enrichment results based on FDR < 0.01 identified 42 statistically over-represented and 29 under-represented genes in the water sample; while in the air sample, 28 were over-represented and 7 were under-represented genes. The highest number of over-represented genes in the water and air samples was under the BP category, followed by CC in the water treatment and MF in the air treatment sections, with the lowest being the CC in the air treatment option, and MF in the water sample (Figure 5). The set of over-represented genes in the air treatment is available in the supplementary file: Table S3. Specifically, the majority of the gene ontology terms under the BP in the air treatment belonged to the defence and immune response functions. Additionally, genes belonging to “binding”, “transport”, and “developments” were over-represented in response to air exposure. On other hand, a group of extracellular genes constituted the majority of over-represented genes under the CC category in the water treatment, available in supplementary file: Table S5. However, the under-represented genes in the air treatment included those which referred to cellular activities or to the receptor pathway and activity are shown in supplementary file: Table S4. This result showed a reduction in the gene activity inside the cell when the animal body was exposed to environmental stress. In contrast, the majority of under-represented genes in the water treatment were rich in receptor activity, signal transduction and transporter genes are presented in supplementary file: Table S6. Among the over represented genes from the air treatment, the extracellular region ontology was enriched by mucin5B-like, while the calcium ion binding ontology was enriched by the mucin4-like.
Figure 5. Genes enrichment results based on FDR < 0.01. The bars show over and under-represented genes among the three gene ontology categories, Cellular Component (CC), Molecular Function (MF) and Biological Process (BP), in air and water samples based on FDR < 0.01. The highest number of over-represented genes across water and air samples was under the BP category.
4. Discussion
To examining the expression of A. veratra mucins which were identified by [13] , the RNA sequencing and bioinformatics approaches were used. Current study reported that A. veratra mucin4-like and mucin5B-like genes were differentially expressed in response to air exposure; and a single mucin-like gene was differentially expressed in the water treatment.
4.1. Mucin Expression
The identified mucin5B-like and mucin4-like genes were the only responsive mucins to the three hours of air exposure, indicating that they may play an important role when the animal is out of the water and confronting conditions of stress such as air exposure. From this data we hypothesise that mucin5B-like and mucin4-like are the mucins that respond to this type of stress in the sea anemone A. veratra. In a previous study, a change in mucin5B-like expression was observed during development of the sea anemone species N. vectensis [24] . This expression data indicates that mucin5B-like may have different roles across ecologically divergent sea anemone species such as A. veratra and N. vectensis, or that it undertakes multiple roles in different metabolic and developmental processes. This indicates that mucin5B-like and mucin4-like may have a role in the production of mucus and protection of this sea anemone during aerial exposure.
The function of the mucin-like gene that was differently expressed in the water treatment is not clearly understood, but it may have a protective under submerged conditions. For example, in fish natural mucus secretions occur to safeguard the skin against pathogen invasion [40] . The expression of the mucin-like in the water sample could also be interpreted as mechanism to increase mucus production for a similar role as that observed in fish.
4.2. Genes Functionally Associated with Stress Response
Most of the over-represented genes under the aerial treatment were related to defence and immune response functions. This indicated that the response to air exposure in A. veratra induces stress related genes, which has been observed in a number of invertebrate species under environmental stress [41] . An earlier study on N. vectensis identified three different groups of genes related to stress-response in sea anemones, including genes belonging to wound healing, genes belonging to immune response and genes belonging to chemical stresses [42] . [43] examined genome wide patterns of gene expression in Acropora hyacinthus under thermal stress and found genes belonging to these classes to induced in thermally sensitive individuals. Overall, this data indicates that aerial exposure in A. veratra is stressful, and that the induction of stress response genes may serve a protective role during this time.
The gene expression data from A. veratra is indicative of the cellular stress response (CSR), a universal cellular defence mechanism, which is initiated in response to changes in the extracellular environment [44] . Depending on the duration and severity of stress, cells either reinstate the process of cellular homeostasis to the previous state or take on a distorted state in the new and changed environment [45] . The response includes control of the cell cycle, DNA and chromatin stabilization and repair, removal of damaged proteins, protein chaperoning and repair, as well as various metabolism functions [44] . Consequently, the induction of stress response genes seen in A. veratra may be associated with the CSR and allow cells of this animal to re-establish normal function when returned to water.
5. Conclusion
This investigation of mucin genes expression in A. vertara provided a first look at how mucin genes in the intertidal species respond to aireal exposure during low tide. Overall, the experiment has shown that the expression of A. veratra mucin4-like and mucin5B-like genes is induced by aireal exposure. This indicates that they may have an important role in defence against environmental stresses. The study data contributed to our understanding of mucin4-like and mucin5B-like genes and genes related to stress in sea anemones. Future mucin gene studies in cnidarians can use these findings as baseline for future comparison so as to provide a greater understanding of mucin gene expression in cnidarians in general.
Acknowledgements
I would like to express my thankful to Dr. Peter Prentis for his guidance and engagement during the conduction of this research. Also, I extend my thanks to the Evolutionary and Physiological Genomics Lab group in the Queensland University of Technology for their help and support during the experimental procedures. I would like also to thanks the Marine lab group at Queensland University of Technology for their advice and caring for the marine animals.
Supplementary Tables and Figures
Table S1. Sequences read accession numbers. The raw RNA sequence reads accession numbers at the National centre of biotechnology information for the air and water A. veratra transcriptomes.
Table S2. Assembly statistics. The assembly statistics metrics generated from A. veratra transcriptomes (air and water treatment reads were merged to a single reads file, then assembled).
Table S3. Gene set enrichment. List of 28 over-represented genes generated from the aerial exposure sample based on (FDR < 0.01), GO terms in bold refer to the ontologies enriched by mucin5B-like and mucin4-like.
Table S4. Gene set enrichment. List of 7 under-represented genes generated from the aerial exposure sample based on (FDR < 0.01).
Table S5. Gene set enrichment. List of 42 over-represented genes generated from the water sample based on (FDR<0.01).
Table S6. Gene set enrichment. List of 29 under-represented genes generated from the water sample based on (FDR < 0.01).
Figure S1. WEGOplots. The distributions of A. veratra gene ontology termsacross GO three main categories, cellular component, molecular function, and biological process.