Multilocus Sequence Typing of the Guangdong Isolates of Riemerella anatipestifer from Ducks in China

Rapid and accurate bacterial typing methods, together with an understanding of the genetic relationships between isolates, are important for controlling disease effectively and preventing the spread of the pathogen. Changes in nucleotide sequences drive the variation and evolution of bacteria, and advances in molecular biology and genomics have moved bacterial typing from phenotypic to molecular methods. Riemerella anatipestifer is the causative agent of polyserositis in ducks and geese. We studied 54 isolates of Riemerella anatipestifer from Guangdong, China, by multilocus sequence typing (MLST). The 54 isolates were divided into 14 sequence types (STs); three isolates, E3, b12xiao and Cb1, each had their own unique ST. The RA isolates from this area showed high homology and low genetic variability. The ratio of nonsynonymous to synonymous substitution rates (dN/dS) in the seven housekeeping genes was lower than 0.25. Cluster analysis grouped the 54 isolates into 5 groups, and eBURST analysis clustered the isolates into a single group. These results indicate that the isolates of this duck pathogen in the area are very closely related and may share a common ancestor.


Introduction
Riemerella anatipestifer (RA) is a Gram-negative, non-motile, non-spore-forming rod. Based on 16S rRNA gene sequence analysis, it belongs to the family Flavobacteriaceae in rRNA superfamily V [1]. The disease caused by RA infection is highly pathogenic and contagious [2]. It is one of the main bacterial diseases of ducks and causes serious economic losses to duck farms worldwide. RA has numerous serotypes; a total of 21 accepted serotypes, named serotype 1 to serotype 21, have been reported. Within a single farm, the serotypes infecting ducks have been observed to change from year to year [3]-[9]. An analysis of 80 RA strains by Hae III digestion of PCR-amplified 16S rRNA (rrn) gene fragments [10] further indicated that the serotype 20 reference strain (670/89) does not belong to RA. In 2003 in China, Cheng Anchun obtained four isolates that did not belong to serotypes 1-21 from dead ducks, aged 5 to 90 days and of different generations, with typical duck infectious serositis. The four isolates were designated new serotypes 22-25 [3] [4]. Because new serotypes keep emerging and standard antisera and reference strains are not available to most laboratories, the use of serological typing in RA epidemiology and in serotype-specific protection against RA disease has been limited.
Controlling RA infection requires determining the main epidemic serotypes of RA in a region. However, RA serotypes are numerous and complex: a single duck can be co-infected with several serotypes, and the prevalent serotype on a farm changes frequently [11]. In agglutination tests, some RA strains react with two or more standard antisera, and many RA strains still cannot be classified serologically [12]; this may increase the number of new serotypes and make RA serotyping more complicated. Serotyping of RA is also often limited to a few specialised laboratories. Because reference strains and the full set of standard antisera are unavailable, most laboratories and farms cannot perform serotyping. It is therefore necessary to investigate other effective genotyping methods for typing RA and for monitoring RA infection quickly and accurately [13]. Multilocus sequence typing (MLST), which is based on determining the sequences of seven housekeeping genes of each isolate, has become a reliable technique for epidemiological studies of infection. The technique reveals relationships between isolates by analysing small differences between alleles of the seven housekeeping genes; it thus provides a way to define population structure within a species and reveals closely related isolates that differ in geographic origin, anatomical source and other properties [14]. RA infection is a very serious problem for duck farming in China. RA has been difficult to control and to prevent by immunisation because standard serotype strains are lacking in each area and because the main epidemic serotypes are not monitored promptly and effectively during outbreaks. In this study, we genotyped 54 RA isolates from Guangdong, China, using MLST and established the first sequence database for RA.

Materials and Methods
The Riemerella anatipestifer isolates were provided by the Joint Laboratory of the Animal Disease Diagnostic Center of Harbin Veterinary Research Institute, Chinese Academy of Agricultural Sciences-Shaoguan University. All 54 RA isolates were isolated from the brain tissue of diseased or dead ducks in the Guangdong area of China (Table 1). The isolates were identified as virulent bacteria by physiological and biochemical tests. The seven housekeeping genes used were mdh, gluD, rpoB, dnaB, gyrA, groEL and gpi (Table 2).

The Reaction Conditions
The reaction conditions were optimised so that all PCR reactions could be run under the same cycling conditions. The cycling conditions were: initial denaturation at 94˚C for 5 min; 32 cycles of denaturation at 94˚C for 40 s, annealing at 52˚C for 40 s and extension at 72˚C for 1 min; and a final extension at 72˚C for 7 min. After amplification, 3 μL of each PCR product was analysed by 1% agarose gel electrophoresis.
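As a quick check on the cycling program described above, the following minimal sketch adds up the programmed hold times. The step names and the helper function are illustrative assumptions; ramp times between temperatures are not modelled, so the true run time on a thermocycler would be longer.

```python
# PCR program as stated in the text; only hold times are counted,
# ramp times between temperatures are ignored (assumption).
INITIAL_DENATURATION_S = 5 * 60          # 94 C for 5 min
CYCLE_STEPS_S = {
    "denaturation_94C": 40,
    "annealing_52C": 40,
    "extension_72C": 60,
}
CYCLES = 32
FINAL_EXTENSION_S = 7 * 60               # 72 C for 7 min


def total_hold_time_s():
    """Sum of all programmed hold times, in seconds."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


total = total_hold_time_s()
print(f"{total} s = {total // 60} min {total % 60} s")
```

Under these assumptions the programmed hold time is 5200 s, a little under an hour and a half.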

Primers
Seven housekeeping genes, mdh, gluD, rpoB, dnaB, gyrA, groEL and gpi, were used for MLST. The primers for groEL were taken from [15]; the primers for the other six housekeeping genes were designed in this study. The sequences of the six genes were downloaded from the GenBank genome sequences of R. anatipestifer strains ATCC11845, DSM15868, RA-CH-1 and RA-GD and aligned with DNAStar software. Primers were designed with Primer Premier 5 at both ends of a 500 to 900 bp fragment of each gene, and the candidate primers were analysed with Oligo 7. The PCR primers for the six genes were selected according to the evaluation results (see Table 2).

Establishment of Riemerella anatipestifer MLST Database
We established the RA sequence database (Riemerella anatipestifer sequence/profile database) in collaboration with Keith Jolley of the Department of Zoology, University of Oxford, UK [16]. Using the MLST database platform (http://pubmlst.org/) set up by Professor Keith Jolley, we uploaded the sequence alignment files for the seven loci of each strain, together with each strain's name, location, country, year of isolation and sequences at the seven loci. This established both the Riemerella anatipestifer sequence/profile database and the Riemerella anatipestifer isolates database. At each locus, every unique allele sequence was assigned a unique allele number, so that each gene fragment carries a precise code number and each isolate is identified by a profile of seven numbers. The data are stored in the online database (http://pubmlst.org/ranatipestifer/).

Statistical Analysis of MLST Data
Sequence, allele, clustering and recombination analyses of the seven loci in the 54 isolates were first carried out with MEGA4.0 and START2. MLST analysis of the seven housekeeping genes was performed with START2, which requires as input the allelic profiles, the isolate identification file and the allele sequence files. The data were analysed with the allele frequency, profile frequency, polymorphism frequency, codon usage and GC content functions. To evaluate the relatedness of isolates, isolates with identical sequences at a locus were assigned the same allele. For each strain, the combination of alleles across the seven loci defined its multilocus sequence type (ST). The relationships between STs were analysed by the unweighted pair group method with arithmetic mean (UPGMA), and a dendrogram was constructed from the matrix of allelic mismatches between STs.
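The ST assignment and the UPGMA clustering on allelic mismatches can be illustrated with a minimal sketch. It is not the START2 implementation: the isolate names and allelic profiles below are hypothetical, the ST numbers are assigned in order of first appearance rather than taken from the database, and ties between equally close clusters are broken by input order.

```python
from itertools import combinations

# Hypothetical allelic profiles in the locus order
# mdh, gluD, rpoB, dnaB, gyrA, groEL, gpi (illustration only, not study data).
PROFILES = {
    "iso_a": (1, 1, 1, 2, 1, 1, 2),
    "iso_b": (1, 1, 1, 2, 1, 1, 2),
    "iso_c": (1, 1, 4, 2, 1, 1, 2),
    "iso_d": (1, 1, 4, 1, 1, 1, 1),
}


def assign_sts(profiles):
    """Number each distinct allelic profile as an ST, in order of first appearance."""
    st_by_profile = {}
    st_by_isolate = {}
    for isolate, profile in profiles.items():
        st_by_isolate[isolate] = st_by_profile.setdefault(profile, len(st_by_profile) + 1)
    return st_by_isolate, st_by_profile


def mismatch_matrix(st_by_profile):
    """Number of loci at which each pair of STs carries different alleles."""
    return {
        (s, t): sum(a != b for a, b in zip(p, q))
        for (p, s), (q, t) in combinations(st_by_profile.items(), 2)
    }


def upgma(labels, dist):
    """Cluster labels by UPGMA; returns the tree as nested tuples."""
    sizes = {label: 1 for label in labels}   # cluster -> number of members
    d = dict(dist)                           # (cluster, cluster) -> distance

    def get(x, y):
        return d[(x, y)] if (x, y) in d else d[(y, x)]

    while len(sizes) > 1:
        # Merge the closest pair of clusters.
        a, b = min(combinations(sizes, 2), key=lambda pair: get(*pair))
        merged = (a, b)
        # Distance to every other cluster is the size-weighted average.
        for c in sizes:
            if c not in (a, b):
                d[(merged, c)] = (
                    sizes[a] * get(a, c) + sizes[b] * get(b, c)
                ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))


st_by_isolate, st_by_profile = assign_sts(PROFILES)
dist = mismatch_matrix(st_by_profile)
tree = upgma(list(st_by_profile.values()), dist)
print(st_by_isolate)
print(dist)
print(tree)
```

In this toy data set the two isolates with identical profiles receive the same ST, and the two STs that differ at a single locus are joined first.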

Amplification of Seven Housekeeping Genes
Under the optimised reaction system and cycling conditions, the seven loci were amplified as single specific bands in the isolates, apart from a few reactions that showed a second band caused by primer dimers. Figure 1 shows the PCR products for isolates JS6, JS8 and Cn3.

MLST Analysis
The Riemerella anatipestifer MLST database was established on the database platform developed by Professor Keith Jolley at the University of Oxford, UK. The 54 RA isolates had 14 allelic profiles and were therefore divided into 14 STs, which shows that most isolates share their ST with other isolates. Three isolates, E3, b12xiao and Cb1, each had their own unique ST. The frequency of ST-1 among the 14 STs was 11, or 20.37% of all isolates. ST-3 had a frequency of 8 (14.81%). ST-2 and ST-11 each had a frequency of 7 (12.96%). ST-5 and ST-7 each had a frequency of 4 (7.41%). The remaining five STs had frequencies of 2 or 1, accounting for 3.70% and 1.85% of all isolates, respectively. Considering the seven loci together, the STs of the isolates were highly homologous and differed very little. Among the 54 isolates, any two different STs shared four or more identical alleles and so differed at only one to three alleles; homology was therefore very high and the degree of variation low.
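The percentages quoted for the ST frequencies follow directly from the isolate counts and the total of 54 isolates. A minimal sketch of the arithmetic (the function name is an illustrative assumption):

```python
TOTAL_ISOLATES = 54


def percent_of_isolates(count, total=TOTAL_ISOLATES):
    """Share of all isolates, in percent, rounded to two decimals."""
    return round(100 * count / total, 2)


# Isolate counts per ST as reported in the text.
for count in (11, 8, 7, 4, 2, 1):
    print(count, percent_of_isolates(count))
```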

Polymorphism Analysis of Gene Locus
The alleles of the seven housekeeping genes were analysed. Table 3 shows that rpoB had the most alleles of the seven housekeeping genes, with 13. Apart from alleles 1 and 4, which occurred 11 and 10 times respectively, the other rpoB alleles were fairly evenly distributed. The dnaB locus had three alleles; allele 2 occurred 33 times (61.1% of all isolates) and allele 1 occurred 19 times (35.2%). The groEL locus had two alleles, and allele 1 occurred 52 times (96.3%). The mdh locus had three alleles; allele 1 occurred 52 times (96.3%), and alleles 25 and 54 each occurred only once. The gluD locus had two alleles, of which allele 1 occurred 53 times (98.1%). The gpi locus had three alleles; allele 2 occurred 36 times and allele 1 occurred 11 times, accounting for 66.7% and 20.4% of all isolates, respectively. The gyrA sequence was identical in all 54 isolates, which shows that this gene fragment is highly conserved. At each locus the 54 isolates were thus concentrated in one or two alleles; the seven housekeeping genes selected were highly conserved in the 54 isolates and showed little genetic variability.
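A per-locus allele tally of this kind can be derived from the allelic profiles with a simple counter. The sketch below uses hypothetical profiles, not the study data, and only illustrates how the number of alleles and the share of the most common allele at each locus could be computed.

```python
from collections import Counter

LOCI = ("mdh", "gluD", "rpoB", "dnaB", "gyrA", "groEL", "gpi")

# Hypothetical allelic profiles in LOCI order (illustration only).
PROFILES = [
    (1, 1, 1, 2, 1, 1, 2),
    (1, 1, 1, 2, 1, 1, 2),
    (1, 1, 4, 2, 1, 1, 2),
    (1, 1, 4, 1, 1, 1, 1),
]


def allele_counts(profiles, loci=LOCI):
    """Count how often each allele number occurs at each locus."""
    return {locus: Counter(p[i] for p in profiles) for i, locus in enumerate(loci)}


counts = allele_counts(PROFILES)
for locus, counter in counts.items():
    allele, n = counter.most_common(1)[0]
    share = 100 * n / len(PROFILES)
    print(f"{locus}: {len(counter)} allele(s); allele {allele} in {n} isolates ({share:.1f}%)")
```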

Analysis of Polymorphic Loci
The polymorphic sites in the seven genes were analysed with START2 (Table 4). The gyrA fragment was completely conserved in all isolates and did not contribute to typing. gpi had the most polymorphic sites, 46, and mdh the fewest, only 14. Judging by the ratio of the number of alleles to the number of polymorphic sites, rpoB had the highest typing efficiency of the gene fragments, but overall the typing efficiency of these six genes was very low for this set of isolates. This is because the 54 isolates came from similar sources and were isolated over a similar period. The ratio of nonsynonymous to synonymous substitution rates (dN/dS) of the six genes other than gyrA was less than 0.25. This indicates that the six loci are under purifying selection and show no evidence of positive selection. The 54 isolates are therefore genetically very close, most likely lie on the same branch of the phylogenetic tree and most likely descend from a common ancestor. The main features of the seven gene fragments are summarised in Table 5.
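Counting polymorphic sites, and relating the number of alleles to them, can be sketched as follows. This is not START2's code: the allele sequences are hypothetical and much shorter than real fragments, the sequences are assumed to be aligned without gaps, and dN/dS itself is not computed here.

```python
def polymorphic_sites(alleles):
    """Return 0-based alignment columns at which the allele sequences are not all identical."""
    if len({len(seq) for seq in alleles}) != 1:
        raise ValueError("allele sequences must be aligned to the same length")
    return [i for i, column in enumerate(zip(*alleles)) if len(set(column)) > 1]


def alleles_per_polymorphic_site(alleles):
    """Ratio of allele count to polymorphic-site count; None if no site varies."""
    sites = polymorphic_sites(alleles)
    return len(alleles) / len(sites) if sites else None


# Hypothetical allele sequences for one locus (illustration only).
ALLELES = ["ATGGCTAAAG", "ATGGCCAAAG", "ATGACTAAAG"]

print(polymorphic_sites(ALLELES))
print(alleles_per_polymorphic_site(ALLELES))
```

A locus whose fragment is identical in every isolate, as reported for gyrA, has no polymorphic sites, which is why the ratio is left undefined in that case.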

NJ Tree Analysis of 54 Isolates of Riemerella anatipestifer
The phylogenetic tree was constructed with START2 using the p-distance nucleotide substitution model, based on the concatenated sequences of the seven housekeeping gene fragments from the 54 isolates; the resulting neighbor-joining (NJ) trees are shown in Figure 2 and Figure 3. The 54 isolates were divided into six branches. Figure 2 and Figure 3 show clearly that isolates with the same ST were placed on the same branch of the tree, which indicates that isolates with the same ST have high homology and low variation.
The isolates with ST-2 were all placed on the same branch: RA09 and R06 were from Guangdong A; WS5 and WS2 were from the same farm; and ZL4, JL2 and 2 were from three different farms, Guangdong H, Guangdong G and Guangdong F, respectively. Isolates S1, S5, SK6 and SK1, with ST-5, were isolated from Guangdong E. Isolates ZL9, ZL12 and ZL2, with ST-7, were isolated from the same area. The ST-10 branch contained the Guangdong C isolates H8-1 and H6-1. The ST-11 branch contained the Guangdong M isolates tnda2, tnda1 and tnxiao4 and the Guangdong K isolates DN1, DN2, DD2 and DD4. A small ST-6 branch contained the Guangdong B isolates Cn3 and Cn4, and a small branch of the Guangdong I1 isolates HC3 and HC4 belonged to the same branch. A small branch within the ST-3 branch contained the Guangdong C isolates H2, H6, Hy5 and Hy4, the Guangdong I1 isolates HR2 and HR3, and the Guangdong D isolates JS6 and JS8. Within ST-1, isolates 1 and 13 formed another small branch and came from the same source, and the remaining isolates were from Guangdong A. HN6 and HN4, and HP3 and HP5, came from two different parts of Guangdong I and belonged to the ST-13 and ST-4 branches, respectively, as inner branches. Isolates E3, Cb1 and b12xiao each formed a separate branch.
Overall, isolates from the same area fell on the same branches of the tree, which indicates that sequence differences between isolates of the same origin were smaller than those between isolates from different sources. This is consistent with the principles of genetic evolution: isolates from the same area share a gene pool and exchange genes frequently, so their evolutionary divergence is small and they are very closely related. However, differences also existed between isolates from the same farm or area, which were then distributed on different branches. For example, RA09, RA06 and RA1, all from Guangdong A, fell on different branches and had different STs. These differences between isolates were clearly associated with subregion: isolates within a subregion differed less than isolates from different subregions. The smallest branches also included isolates from different sources; for example, the small branch within the ST-3 branch included isolates from three different sources. This suggests that the same epidemic strain may be present in different areas, that the areas are adjacent and gene flow occurs between them, or that isolates from different sources have undergone convergent evolution.
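The p-distance underlying the tree is simply the proportion of aligned sites at which two sequences differ. A minimal sketch follows, assuming gap-free concatenated sequences of equal length; the sequences and isolate names are hypothetical, and the NJ tree-building step itself is not shown.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


# Hypothetical concatenated sequences (real ones span seven gene fragments).
CONCATENATED = {
    "iso_a": "ATGGCTAAAGTT",
    "iso_b": "ATGGCCAAAGTT",
    "iso_c": "ATGACTAAAGTA",
}

distances = {
    (x, y): p_distance(CONCATENATED[x], CONCATENATED[y])
    for x, y in combinations(CONCATENATED, 2)
}
for pair, value in distances.items():
    print(pair, round(value, 4))
```

A pairwise matrix of this kind is the input that a neighbor-joining algorithm would then turn into a tree.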

The Analysis Based on ST (eBURST Analysis)
eBURST provides another way to analyse the relationships between STs within a population, using the allelic profile data. The 54 isolates were analysed with the default group definition, and the results are shown in Figure 4. The relationships between the STs are shown in Figure 4(a) and Figure 4(b), which display the single-locus variant (SLV) and double-locus variant (DLV) links between STs across the clonal complex. The analysis identified a single eBURST group among the fourteen STs, with ST-10 as the predicted founder. The singletons were ST-9, ST-8, ST-6, ST-14, ST-11 and ST-1. These six STs may be phylogenetically very close to some members of the group, but because the conservative group definition links only SLVs, an ST is not included in the group unless it is an SLV of an ST already in the group.
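The grouping rule described above can be illustrated with a simplified sketch. It is not the eBURST program: STs are linked only when they are SLVs, groups are the resulting connected components, and the founder is taken as the ST with the most SLVs, without eBURST's bootstrap support or tie-breaking rules. The ST labels and profiles are hypothetical.

```python
from itertools import combinations

# Hypothetical ST profiles in the locus order
# mdh, gluD, rpoB, dnaB, gyrA, groEL, gpi (illustration only).
ST_PROFILES = {
    "ST-a": (1, 1, 1, 2, 1, 1, 2),
    "ST-b": (1, 1, 4, 2, 1, 1, 2),   # SLV of ST-a
    "ST-c": (1, 1, 1, 1, 1, 1, 2),   # SLV of ST-a
    "ST-d": (1, 1, 4, 1, 1, 1, 1),   # DLV of ST-b and ST-c, SLV of none
    "ST-e": (2, 1, 7, 3, 1, 2, 3),   # unrelated
}


def locus_differences(p, q):
    """Number of loci at which two profiles carry different alleles."""
    return sum(a != b for a, b in zip(p, q))


def slv_links(profiles):
    """Map each ST to the set of STs that differ from it at exactly one locus."""
    links = {st: set() for st in profiles}
    for s, t in combinations(profiles, 2):
        if locus_differences(profiles[s], profiles[t]) == 1:
            links[s].add(t)
            links[t].add(s)
    return links


def groups_and_singletons(links):
    """Connected components of the SLV graph; lone STs are singletons."""
    seen, groups, singletons = set(), [], []
    for start in links:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            st = stack.pop()
            if st not in component:
                component.add(st)
                stack.extend(links[st] - component)
        seen |= component
        if len(component) > 1:
            groups.append(component)
        else:
            singletons.append(start)
    return groups, singletons


def predicted_founder(group, links):
    """ST with the most SLVs in the group (ties broken alphabetically)."""
    return max(sorted(group), key=lambda st: len(links[st]))


links = slv_links(ST_PROFILES)
groups, singletons = groups_and_singletons(links)
print(groups, singletons)
print(predicted_founder(groups[0], links))
```

In this toy example ST-d is a DLV of two group members but an SLV of none, so it is reported as a singleton, which mirrors the point made in the text about closely related STs being left outside the group.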

Discussion
MLST analysis of 54 isolates from the Guangdong area of China showed that MLST can be used to identify and type Riemerella anatipestifer isolates. It can reveal relationships between isolates of different geographical origin and isolation time, and it provides a reference for the prevention and treatment of RA. The 54 isolates yielded a total of 14 STs. Most isolates shared their ST with other isolates, and isolates from the same source had the same ST and were placed on the same branch of the phylogenetic tree. Isolates E3, b12xiao and Cb1 each had a unique ST. These three isolates came from Guangdong J, L and O, and E3 and b12xiao were isolated in the same year. Because only one isolate was selected from each of these three localities, it is difficult to determine how population, region and isolation time are related. Nevertheless, the cluster analysis shows clearly that isolates of different geographical origin and isolation time differ more than isolates from the same source and isolation time.
A key point of MLST is the selection of housekeeping genes. So far, however, no single set of housekeeping genes has been shown to be widely applicable for typing bacterial pathogens. Although some MLST schemes share some loci, scheme-specific gene sets are required by the wide diversity of bacteria and the variability of MLST loci [17]. Seven housekeeping genes, dnaB, groEL, gyrA, mdh, gluD, gpi and rpoB, were chosen in this study because they have been used in genetic studies of Riemerella anatipestifer and are often used in MLST schemes. There were 13 alleles of rpoB and two to three alleles of each of five other genes. The gyrA sequence was identical in all 54 isolates, and one allele each of mdh, gluD and rpoB was found only in isolate Cb1. These results indicate that the seven genes were highly conserved in the 54 isolates and lacked genetic heterogeneity. The seven housekeeping genes were compared with the genome of the Riemerella anatipestifer reference strain ATCC11845 released by NCBI. The seven genes span positions 503,160 bp to 1,815,473 bp of the 2,164,087 bp genome and lie in its central part, and the region they span covers more than 60% of the whole genome. The seven housekeeping genes selected should therefore be suitable for MLST of Riemerella anatipestifer. The very small differences among the isolates may have the following reasons: the farms from which the isolates came were of similar type, all isolates were from the Guangdong area, and they were isolated within a few years of each other. Isolates from adjacent regions may share a gene pool with frequent gene flow between them, and they have not had enough time or spatial separation to diverge. Consequently, the ST analysis and the eBURST analysis gave the same result: the majority of isolates shared STs, and STs differed at only one to three alleles. eBURST analysis placed the 54 isolates in a single group.
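The statement that the seven loci span more than 60% of the genome can be checked from the three coordinates given in the text. A minimal sketch of the arithmetic (the variable names are illustrative; the span is taken as the simple difference between the outermost positions):

```python
GENOME_LENGTH_BP = 2_164_087   # genome length as stated in the text
FIRST_LOCUS_BP = 503_160       # leftmost position of the seven loci
LAST_LOCUS_BP = 1_815_473      # rightmost position of the seven loci

span_bp = LAST_LOCUS_BP - FIRST_LOCUS_BP
fraction = span_bp / GENOME_LENGTH_BP
print(span_bp, f"{100 * fraction:.1f}%")
```

The span measures the distance between the outermost loci, not the summed length of the sequenced fragments, which is far smaller.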
gyrA encodes subunit A of DNA gyrase. Mutations in the quinolone resistance-determining region (QRDR) of gyrA have been reported to be associated with quinolone resistance in a variety of bacteria [18]. Ying L amplified and analysed the QRDR fragment of gyrA in 26 isolates and found that gyrA mutations in Riemerella anatipestifer were associated with resistance to quinolone antibiotics [19], and speculated that the amino acid at position 87 is a hot spot for quinolone resistance-associated mutation in this bacterium. Zhong Chongyue proposed that quinolone tolerance in RA is mediated mainly by efflux pumps: when strains artificially induced to antibiotic resistance were examined by single-strand conformation polymorphism (SSCP), no mutation was detected in gyrA. In this study, a BLAST search showed that the sequenced gyrA fragment did not contain the QRDR. The 650 bp gyrA fragment showed no variation in any of the 54 strains. This indicates that the 54 isolates are very closely related, but also that this fragment does not contribute to the MLST scheme for RA.
Patrick [20] analysed MLST schemes for 36 different bacteria using the concatenated sequences of seven housekeeping genes. In 10 schemes, analysis of six genes gave the same resolution as analysis of seven genes, while in the other 26 schemes the resolution of seven gene fragments was higher than that of six gene fragments by 0.0004. This result shows that loci contributing nothing or little to resolution can be replaced. It does not mean, however, that the locus with the fewest alleles in an MLST scheme contributes the least resolution. Diancourt [21] found in an MLST analysis of 52 isolates of Lactobacillus casei that 31 STs were obtained with rplB included as a housekeeping gene and still 31 STs when it was excluded, so rplB could be eliminated. Therefore, gyrA should not affect the results of MLST analysis of Riemerella anatipestifer isolates in this study.

Table 2. List of gene fragments and primers for RA MLST.

Table 4. Alleles of seven housekeeping genes.

Table 5. Main parameters of seven housekeeping genes from 54 isolates of R. anatipestifer.