GIFeGSH: A New Genomic Island Might Explain the Differences in Brucella Virulence

An imported dog was confirmed positive for canine brucellosis in Sweden in 2010. The whole genome sequence of Brucella canis strain SVA10 was subjected to whole genome sequencing-phage analysis (WGS-PA) and was assigned to the Asian B. canis cluster. Further analysis indicated that the genome of B. canis SVA10 is smaller than the genomes of other strains of the same species. Mapping the paired reads to the genome of B. canis ATCC 23365 showed that a 35,781 bp genomic island (GI) is absent in strain SVA10. The genes of this genomic island, GIFeGSH, mainly code for iron uptake enzymes and parts of the glutathione pathway. A screening of all available whole genome sequences of Brucella strains confirmed that GIFeGSH is also missing in four more strains of B. canis but present in several strains of B. abortus, B. melitensis, B. suis, B. ovis, B. microti, B. pinnipedialis, and B. ceti. Parts of the GI were present, but scattered, in two other B. canis strains. The aim of this study was to find differences in the genomes of Brucella which might explain previously described differences in virulence. The analysis was extended to all available Brucella genomes after the genomic island was found to be missing in strain SVA10.


Introduction
The genus Brucella currently consists of 11 species, of which B. melitensis, B. abortus, and B. suis are further classified into different biovars. The biovar concept is useful, especially for epidemiological source tracing. However, the genetic divergence within the whole genus Brucella is very low, which makes genotyping challenging. Phylogenetic analyses based on 16S rDNA, commonly used for bacterial speciation, are not possible with the Brucella species since they share 100% identical 16S rRNA genes. Even when the whole genomes of all Brucella species are compared by DNA-DNA hybridization, the similarity is between 87% and 100% [1]. Due to the high genetic homology, the genomospecies concept was recommended for the genus Brucella [2]. However, to avoid confusion, not least in medical diagnostics, the International Committee on Systematic Bacteriology, Subcommittee on the Taxonomy of Brucella, recommended continuing to apply the former vernacular names for the nomenspecies [3]. These taxonomic issues illustrate the difficulties that arise from the high genetic homology between the Brucella species. Consequently, the subdivision of Brucella species into distinguishable strains is even more challenging.
Brucella canis infects dogs and humans and is, together with B. melitensis, B. abortus, and B. suis, one of the more dangerous Brucella species with regard to zoonotic potential, infectious dose and global distribution. Sweden is officially free of brucellosis and there has only been one case [4] and one outbreak of canine brucellosis [5], both caused by imported dogs. B. canis strains can be assigned either to the Africa, America, and Europe group (AAE group) or to the Asian cluster (A group) by whole genome sequencing-phage analysis (WGS-PA) [5]. The WGS-PA cluster assignment is based on the type and number of remaining prophage fragments that were not completely removed during the evolution of the bacteria. The causative strain of the Swedish brucellosis outbreak in 2013 was assigned by WGS-PA to a cluster of strains that are mainly distributed in the AAE group. Brucella strains sometimes contain genomic islands (GIs), which range in size from 7 to 44 kilobases (kb) [6]. The same GIs can be present in different Brucella species, and known differences in pathogenicity and virulence might be explained by genes located on the GIs [6] [7]. In theory, any virulence gene can be located on a GI, and because many GIs are not stable, even strains of the same species might differ in their virulence properties. An example of a gene that leads to higher virulence and more efficient host colonization is the horizontally acquired gene coding for γ-glutamyltranspeptidase [8]. The gene product regulates the glutathione pathway and thus the antioxidant regulation in the host cells.
The aim of this study was a detailed investigation of the genome of B. canis strain SVA10, including screening for genomic variations and comparison of the SVA10 genome with the genomes of other Brucella species to identify genes that might explain differences in Brucella virulence.

Brucella canis Strain SVA10
Brucella canis strain SVA10 was isolated at the National Veterinary Institute of Sweden from an American Staffordshire terrier imported from Poland [4]. An aliquot of the original frozen stock was cultivated on Farrell agar [9].

Whole Genome Analysis
A validated workflow for sequencing and analysis of bacterial samples was used for whole genome sequencing (WGS). The DNA of B. canis strain SVA10 was extracted from cultivated colonies using an EZ-1 extraction robot and the EZ-1 DNA tissue kit (Qiagen, Hilden, Germany). The libraries were prepared with a Nextera XT sample preparation kit (Illumina, San Diego, CA, USA), which allows samples with a DNA concentration of 0.5 ng/µl to be processed. WGS was performed using a 2 × 300 paired-end run on an Illumina MiSeq platform (Illumina, San Diego, CA, USA).
The reads were assembled de novo using the Mira plugin version 1.0.1 in Geneious version 8.1.7 [10]. The average nucleotide identity (ANI) and the mol% G+C content were determined using the Gegenees software version 2.0 with a score threshold of 20% [11]. Screening and annotation of phage genes were done using PHAST [12]. Genes were annotated and assigned to pathways using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database [13] [14] in addition to the annotation pipeline of the National Center for Biotechnology Information (NCBI). The Mauve version 2.3.1 plugin [15] was used in Geneious for the alignment and positioning of sequence features. The whole genome sequences of all other B. canis strains with available WGS data were downloaded from the NCBI database (Table 1).
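The mol% G+C content reported here is a simple base-composition ratio. The following minimal Python sketch only illustrates the quantity and is not the Gegenees implementation used in this study. The function names are hypothetical, and it assumes that ambiguous bases such as N are excluded from the denominator.

```python
from typing import Iterable


def gc_counts(sequence: str) -> tuple[int, int]:
    """Return (G+C count, A+C+G+T count) for one sequence, ignoring ambiguous bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc, gc + at


def gc_content_percent(contigs: Iterable[str]) -> float:
    """Pooled mol% G+C over all contigs, so longer contigs weigh more (assumption)."""
    gc_total = 0
    base_total = 0
    for contig in contigs:
        gc, bases = gc_counts(contig)
        gc_total += gc
        base_total += bases
    if base_total == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc_total / base_total
```

Pooling the counts over all contigs before dividing gives a length-weighted value for a draft assembly, which is why the sketch does not average per-contig percentages.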

Results and Discussion
The reads were assembled into 12 contigs, which were deposited in the NCBI database under accession number MAXW00000000 (BioProject PRJNA328097, BioSample SAMN05363686).
The genome sizes of B. canis range from 3,217,060 bp (strain F7/05A) to 3,318,660 bp (strain Oliveri). The genome of B. canis strain SVA10 consists of 3,264,482 bases with a G+C content of 57.24%. The examined strain thus has one of the smallest genomes of all sequenced B. canis strains. In addition to the de novo assembly, the sequence reads of strain SVA10 were mapped against the whole genome sequence of the type strain of the species (ATCC 23365T; 3,312,769 bases) with the aim of identifying the genes within the missing 48,287 bases. The main differences between the two strains regarding lacking genes were observed on chromosome II (Table 2). A genomic island, called GIFeGSH, beginning at genome position 625,001 of strain ATCC 23365T, was missing in strain SVA10. The size of GIFeGSH is 35.8 kb, with 36 coding sequences (CDS) and a DNA G+C content of 57.6 mol%. The sizes of the GI and of the annotations in Table 2 were calculated in Geneious version 8.1.7 [10]. A BLAST search of the predicted tCDS of GIFeGSH against the predicted tCDS of SVA10 confirmed that strain SVA10 has no alternative amino acid sequences that could compensate for the proteins encoded on the missing GIFeGSH.
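A region that is present in the reference but absent in the sequenced strain shows up in a read mapping as a long stretch of the reference without read coverage. The Python sketch below illustrates that idea only; it is not the Geneious procedure used in this study. It assumes that a per-base read depth list for one reference chromosome has already been exported from a mapping tool, and the function name, the `min_length` cut-off and the `max_depth` threshold are hypothetical choices.

```python
from typing import List, Tuple


def uncovered_regions(depth: List[int], min_length: int = 1000,
                      max_depth: int = 0) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) runs with depth <= max_depth
    that are at least min_length bases long."""
    regions = []
    run_start = None  # 0-based index where the current low-coverage run began
    for i, d in enumerate(depth):
        if d <= max_depth:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_length:
                regions.append((run_start + 1, i))
            run_start = None
    # Close a run that extends to the end of the reference sequence.
    if run_start is not None and len(depth) - run_start >= min_length:
        regions.append((run_start + 1, len(depth)))
    return regions


# Genome sizes reported in the text (bases).
reference_size = 3_312_769   # B. canis ATCC 23365
sva10_size = 3_264_482       # B. canis SVA10
size_difference = reference_size - sva10_size

# Toy depth profile: a 5-base gap and a 2-base gap at the end.
toy_depth = [30] * 10 + [0] * 5 + [25] * 10 + [0] * 2
toy_gaps = uncovered_regions(toy_depth, min_length=3)
```

With the toy profile, only the 5-base gap passes the length filter, and `size_difference` reproduces the 48,287 bases quoted above. For real data, a length cut-off in the kilobase range would be needed to separate island-sized gaps from short mapping artefacts.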
Two major systems, one for Fe³⁺ uptake and one for the glutathione pathway, are the striking features of GIFeGSH (Table 2). In general, iron is taken up either as water-soluble Fe²⁺ or, in the case of Fe³⁺, via siderophores, small molecules that bind iron outside the cell. The siderophore is either actively transported across the cell membrane or binds to receptors on the outside of the cell, followed by reduction of the iron to Fe²⁺. In this way Fe³⁺ becomes available in environments where no dissolved Fe²⁺ is present.
However, because Fe²⁺ is available in the host cells, there is no need to make Fe³⁺ available, which means that Fe³⁺ reduction and uptake are not necessary as long as the bacteria are in the host. In general, redundancy between iron utilization systems is found in different bacteria, and the absence of one system may therefore not impair the organism's ability to colonize or infect its host. The lack of these genes does, however, become relevant for survival outside the host, where their presence could allow for extended survival times and increase the chance of infection. The second predominant cluster of genes missing from the genome of B. canis strain SVA10 comprises genes that regulate the glutathione pathway (EC:1.8.1.12, EC:3.4.11.2, EC:6.3.1.9, EC:3.5.2.9). Glutathione is an antioxidant, and uptake and utilization systems are present in several bacterial species [16].
Horizontal acquisition of glutathione utilization has been described before in other bacteria; for example, Campylobacter strains are able to use glutathione by means of the horizontally acquired gene coding for γ-glutamyltranspeptidase, and such strains are more efficient colonizers of the mouse intestine [8]. The ability to use glutathione could be an advantage in Brucella and may increase colonization and infection potential. Therefore, the presence of the glutathione pathway in a few strains might explain why some B. canis strains are more virulent than others and why they may differ in zoonotic potential. Additional genes related to bacterial virulence that were found on GIFeGSH encode hemolysin III and a peptidase.
The necessity of additional genomic content in B. canis is debatable, since obligate as well as facultative intracellular pathogens tend to reduce their genomes during evolution [17]. Therefore, features such as iron and glutathione utilization may be important during environmental survival or for hypervirulent potential. Future characterization studies can shed light on the importance of the features described here and complement virulence and epidemiological typing of B. canis. Not much is known about the epidemiology of human infections caused by B. canis [18] [19] [20]. Thus, the strains with and without GIFeGSH were clustered and analysed with regard to phylogeny and epidemiological parameters. The analysis of all available whole genome sequences of B. canis strains showed no evidence of epidemiological links between the strains lacking GIFeGSH (Table 1).
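Grouping strains by GI status requires a rule for deciding whether the island counts as present in a genome. The text does not state such a rule, so the following Python sketch is purely illustrative. It assumes that alignment hits of a genome against the 35,781 bp island are available as intervals in island coordinates, and the function names as well as the 95% and 5% thresholds are hypothetical.

```python
from typing import List, Tuple

GI_LENGTH = 35_781  # size of the genomic island in bp, as reported in this study


def covered_fraction(hits: List[Tuple[int, int]], gi_length: int = GI_LENGTH) -> float:
    """Fraction of island positions covered by at least one hit.
    Hits are 1-based inclusive (start, end) intervals in island coordinates."""
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(hits):
        start, end = max(1, start), min(gi_length, end)
        if start > end:
            continue  # hit lies completely outside the island
        if cur_end is None or start > cur_end + 1:
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)  # overlapping or adjacent: extend the run
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return covered / gi_length


def classify_gi(hits: List[Tuple[int, int]], gi_length: int = GI_LENGTH,
                present_min: float = 0.95, absent_max: float = 0.05) -> str:
    """Label a genome 'present', 'absent' or 'partial' (illustrative thresholds)."""
    fraction = covered_fraction(hits, gi_length)
    if fraction >= present_min:
        return "present"
    if fraction <= absent_max:
        return "absent"
    return "partial"
```

Merging overlapping hits before summing avoids counting the same island positions twice. A "partial" label would only flag a genome for closer inspection; whether the detected parts are contiguous or scattered across the genome has to be checked separately from the hit positions in the genome itself.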
The detection of GIFeGSH, which contains potential virulence genes, makes it possible to assess the effect on the host, the possible transmission and the consequences of a Brucella infection. Curing brucellosis caused by bacteria containing GIFeGSH is expected to be more challenging, and such bacteria might affect the host cells with higher efficiency. GIFeGSH might be applicable for source tracing if a strain contains unique scattered parts of GIFeGSH in its genome, as do strains HSK A52141 and SCL (Table 1, Figure 1).
Due to the high genetic homology, it is difficult to find epidemiological markers within the genomes of several Brucella species. A newly whole genome sequenced strain of Brucella canis (SVA10), derived from an infected dog imported from Poland to Sweden, was analysed and assigned to the Asian WGS-PA cluster [5]. A genomic island called GIFeGSH was detected in strain ATCC 23365T by comparative genomics. GIFeGSH, which encodes genes related to iron uptake and the glutathione pathway, was also detected in other Brucella species and B. canis strains, but it was found to be absent in the examined strain SVA10 as well as in a few other B. canis strains. GIFeGSH may therefore be useful as an epidemiological marker in B. canis if the GI is stable.

Table 1. Summary of all Brucella canis strains with available WGS data (2016), origin of the strains, assignment to the WGS-PA cluster and presence of the genomic island GIFeGSH.

Table 2. Genes of the genomic island GIFeGSH, which is present in Brucella canis strain ATCC 23365T and absent in Brucella canis strain SVA10.