The Locus pgaABCD of Acinetobacter junii Putatively Responsible for Poly-β-(1,6)-N-acetylglucosamine Biosynthesis Might Be Related to Biofilm Formation: A Computational Analysis

Poly-β-(1,6)-N-acetylglucosamine (PNAG), the chief mediator of intercellular adhesion in many bacteria, plays an important role in biofilm formation. The pgaABCD locus was identified in the whole genome sequence of A. junii SH205. The glycosyltransferase PgaC catalyzes the production of PNAG from N-acetyl-D-glucosamine monomers. In this study, the possibility of PNAG biosynthesis in A. junii SH205 by its own PgaC was explored with the aid of bioinformatics. Multiple alignments of PgaC sequences from different bacteria were used to identify conserved amino acid residues that might be critical for the function of the protein. A three-dimensional model of A. junii SH205 PgaC was generated for spatial visualization of the amino acid residues. The analyses showed that PgaC has five conserved amino acids, Asp140, Asp233, Gln269, Arg272 and Trp273, that are critical for the activity of the enzyme. The interaction of UDP-N-acetylglucosamine with the conserved pocket of the glycosyltransferase was explored through molecular docking studies.


Introduction
The genus Acinetobacter belongs to the subclass γ-Proteobacteria, family Moraxellaceae, and comprises Gram-negative, strictly aerobic, catalase-positive, oxidase-negative, non-motile, glucose non-fermenting bacteria with a guanine plus cytosine content of 39% to 47%. They are ubiquitous in nature and are found in soil and water [1]. At present this genus comprises 34 validly published species and 11 species with provisional designations (http://www.bacterio.net/-allnamesac.html). The majority of Acinetobacter species are metabolically versatile and grow readily on simple microbiological media [2]. Most species of the genus grow optimally in the mesophilic temperature range, while the clinically important species grow optimally at 37 °C [2]. Among the disease-causing species, Acinetobacter baumannii is an important causative agent of outbreaks of a variety of nosocomial infections, such as bacteremia, hospital-acquired pneumonia, and urinary tract infections [3]-[5]. The ability of Acinetobacter species to colonize and spread among immunocompromised patients has been recognised worldwide [6] [7]. Species other than A. baumannii that have also been associated with human infection, A. lwoffii and A. junii, are less frequently isolated and studied. In recent years a marked incidence of A. junii infections, such as bacteremia, septicemia, meningitis and corneal perforation, has been reported from different parts of the world [8]-[14]. However, the actual occurrence of infection caused by A. junii might be underestimated in the absence of an effective detectable phenotype [14]. Several factors make Acinetobacter species successful pathogens, including multiple antibiotic resistance [15], resistance to desiccation [16], and the ability to form biofilms on medical devices and to colonize the skin and mucosal surfaces of vulnerable hosts [17]-[19]. Adherence of bacteria to host cells is generally considered an essential first step in the colonization process [20]. Once the bacteria attach to a surface, they colonize it and may secrete exopolysaccharides, resulting in a highly structured sessile microbial community within the biofilm [20] [21]. Biofilm formation has been well documented in A. baumannii [22], but only a few reports are available on A. junii [23] [24]. Biofilms are complex biological matrices that contain proteins, ions, nucleic acids and polysaccharide polymers. Poly-β-(1-6)-N-acetylglucosamine (PNAG) has been confirmed as the major component of biofilms in Staphylococcus epidermidis and Staphylococcus aureus [25] [26]. Synthesis of PNAG in staphylococci is controlled by the icaADBC operon. An analogous operon, pgaABCD, which controls PNAG biosynthesis, has been found not only in A. baumannii but also in the genomes of several other Gram-negative bacteria, including Yersinia pestis, Y. enterocolitica, Escherichia coli, Bordetella pertussis, Bordetella parapertussis, Bordetella bronchiseptica, Burkholderia cepacia, Pseudomonas fluorescens, Actinobacillus pleuropneumoniae, and Aggregatibacter actinomycetemcomitans [27]-[32]. It was demonstrated earlier that deletion of the pga locus resulted in an A. baumannii mutant strain (S1Δpga) incapable of producing PNAG, while complementation with the pgaABCD genes fully restored the wild-type PNAG phenotype. It was also shown that heterologous expression of the A. baumannii pga locus in E. coli led to synthesis of significant amounts of PNAG, while no polysaccharide was detected in E. coli cells harboring an empty vector [22]. Besides mediating cell-to-cell adherence, PNAG also acts as an important virulence factor and protects bacteria against innate host defences [33]. Closer analysis of conserved protein domains revealed that PgaC is an N-glycosyltransferase homologous to IcaA; PgaB is a lipoprotein with putative polysaccharide N-deacetylase domains similar to those of IcaB, while PgaA and PgaD show no functional homologies [33].
The whole genome sequence of A. junii SH205 [34] revealed that the pgaABCD locus is present in A. junii; however, no report is available on its relation to virulence. In the present work, the genetic potential of A. junii SH205 to synthesize PNAG, for its own survival under stress and for virulence, has been studied using comparative sequence analysis. Structural and functional analysis of the N-glycosyltransferase (the product of pgaC), based on sequence homology, was carried out along with molecular docking studies to understand its role in PNAG synthesis and thus its relation to biofilm formation, which may be further analyzed for designing strategies to control virulence.

Comparative Sequence Analysis of PgaC of A. junii
The sequences of the translated products of genetic loci similar to pgaC from 12 virulent species of different taxa, such as Acinetobacter, Escherichia, Staphylococcus, Yersinia, Klebsiella and Chromobacterium, were compared with that of A. junii SH205 using the EMBOSS Needle program, which performs global pairwise sequence alignment based on the Needleman-Wunsch (N-W) dynamic programming algorithm (www.ebi.ac.uk/Tools/psa/emboss_needle/). All thirteen protein sequences used for the comparison (Table 1) were retrieved from UniProtKB. These organisms were selected because of their proven link with biofilm formation and/or production of poly-β-1,6-N-acetyl-D-glucosamine. The conserved amino acids within the translated region were identified from a multiple sequence alignment of these 13 sequences using ClustalW (www.ebi.ac.uk/Tools/msa/clustalw2/).
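The global alignment step can be illustrated with a minimal sketch of the Needleman-Wunsch recurrence and the identity percentage derived from it. The scoring values (match +1, mismatch -1, linear gap -2) and the toy sequences are assumptions chosen for readability; EMBOSS Needle itself uses a substitution matrix with affine gap penalties, so this is not a reimplementation of that program.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return the two gapped strings.

    Simplified scoring (assumed for illustration): fixed match/mismatch
    scores and a linear gap penalty.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic programming matrix.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    matches = sum(1 for p, q in zip(aligned_a, aligned_b) if p == q and p != "-")
    return 100.0 * matches / len(aligned_a)


# Toy sequences, not taken from the study.
x, y = needleman_wunsch("ACGT", "ACT")
print(x, y, percent_identity(x, y))
```

Identity is computed here over the full alignment length, gaps included, so the choice of gap penalties affects the reported percentage.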

Structure Prediction of N-Glycosyltransferase and Molecular Docking with UDP-GlcNAc
The location and conformation of the conserved amino acids in the glycosyltransferase (the product of pgaC) are important for understanding the role of these residues in the synthesis of PNAG, which is needed for biofilm formation. In the absence of an experimental model for the enzyme from A. junii, its structure had to be predicted. The I-TASSER server [35] was used to predict the 3D structure of the enzyme (http://zhanglab.ccmb.med.umich.edu/I-TASSER/). The quality of the predicted structure was checked using the ERRAT server [36] (http://nihserver.mbi.ucla.edu/ERRAT/), and refinement was carried out using the 3Drefine protein structure refinement server [37] (http://sysbio.rnet.missouri.edu/3Drefine/). Molecular docking of the glycosyltransferase (homologous to the PgaC protein) from A. junii SH205 was performed with the ligand UDP-GlcNAc (UDP-N-acetylglucosamine). The GUI program of the AutoDock 4 suite [38] was used to prepare, run, and analyze the docking simulations. The molecular structure of the ligand was drawn in ACD/Labs ChemSketch 12.0 and optimized using a UFF calculation in ArgusLab. The energy-optimized model was then used as input to AutoDock for the docking simulation. Gasteiger charges were assigned and non-polar hydrogens were then merged. The grid box size was set at 36, 32 and 32 Å for x, y and z respectively, and the grid center was set to 69.08, 68.426 and 66.339 Å for x, y and z respectively. The Lamarckian Genetic Algorithm (LGA) was chosen to search for the best conformers.
LGA is a flexible ligand-receptor docking genetic algorithm that can handle a large number of degrees of freedom and uses an empirical binding free energy force field, which permits the prediction of binding free energies, and hence binding constants, for docked ligands. During the docking process, a maximum of 10 conformers was considered. Based on the binding free energy data, the model with the lowest binding energy among the 10 results was selected for analysis of its interactions.
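The selection of the best conformer, and the link between binding free energy and binding constant, can be sketched as follows. This is only an illustration: the list of ten energies is hypothetical apart from the −4.9 kcal/mol reported later in the text, the temperature of 298.15 K is an assumption, and the conversion uses the standard thermodynamic relation ΔG = RT ln K_d rather than anything read from the docking output.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def best_conformer(energies):
    """Return (index, energy) of the conformer with the lowest binding energy."""
    index = min(range(len(energies)), key=energies.__getitem__)
    return index, energies[index]


def dissociation_constant(delta_g, temperature=298.15):
    """Estimate K_d in mol/L from a binding free energy in kcal/mol.

    Uses delta_g = R * T * ln(K_d); the temperature is an assumed value.
    """
    return math.exp(delta_g / (R_KCAL * temperature))


# Hypothetical energies for 10 conformers (kcal/mol); only -4.9 is from the text.
energies = [-3.1, -4.9, -2.7, -4.2, -3.8, -1.9, -4.4, -3.3, -2.5, -4.0]
index, energy = best_conformer(energies)
print(index, energy, dissociation_constant(energy))
```

Under these assumptions a binding energy of −4.9 kcal/mol corresponds to a dissociation constant in the sub-millimolar range, which shows how a small change in binding energy translates into a large change in the estimated constant.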

In Silico Site Directed Mutagenesis
To strengthen the accuracy of the binding site prediction for UDP-GlcNAc in the active site of GT, the following steps were carried out: (1) all amino acid substitutions that can arise by point mutation at each base position of the existing codons for Asp140, Asp233, Gln269, Arg272 and Trp273 in the translated pgaC sequence of A. junii SH205 were enumerated; (2) from the substitutions obtained in step 1, one similar (D140E; D233E; Q269N; R272H; W273L) and one dissimilar (D140H; D233H; Q269L; R272L; W273C) amino acid substitution was selected for each of the five target codons; (3) finally, the 3D structures of the mutated proteins were predicted using the same methodology as for the wild-type protein and redocked using AutoDock 4.0. The same grid and docking parameters were used for the docking analysis, and the effect of mutagenesis on binding affinity was analysed.
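Step (1) above, enumerating the substitutions reachable from a codon by a single base change, can be sketched with the standard genetic code. The codons used in the example (GAT for Asp, TGG for Trp) are assumptions for illustration; the actual codons in the A. junii SH205 pgaC sequence are not given in the text.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def single_base_substitutions(codon):
    """Map each new amino acid reachable by one base change to its mutant codons.

    Synonymous changes and stop codons are left out.
    """
    original = CODON_TABLE[codon]
    reachable = {}
    for position in range(3):
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            amino_acid = CODON_TABLE[mutant]
            if amino_acid != original and amino_acid != "*":
                reachable.setdefault(amino_acid, []).append(mutant)
    return reachable


# Assumed codon for an aspartate residue; the real codon may be GAC instead.
print(single_base_substitutions("GAT"))
```

For the assumed Asp codon GAT, the reachable set contains both Glu and His, matching the kind of similar and dissimilar candidates chosen in step (2).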

Sequence Analysis of Pga Proteins from A. junii
The pgaABCD operon is present in diverse bacterial species and has been found to be responsible for the synthesis of the polysaccharide PNAG. Gram-negative bacteria produce this polysaccharide through expression of pgaABCD-homologous loci in their genomes. The Pga proteins (PgaA, PgaB, PgaC, and PgaD) from diverse bacterial genera have been studied with respect to their role in biofilm formation and regulation. Although the pgaABCD locus has been explored in A. baumannii, little is known about its phenotypic expression in A. junii. Since a few studies have reported that the risk factors for A. junii infection are similar to those for the most clinically important Acinetobacter species, A. baumannii, we looked in A. junii for similar proteins or genes for the biosynthesis of PNAG, the polysaccharide responsible for biofilm formation, integrity and pathogenicity. BLAST (blastp suite) analysis at the NCBI website enabled the identification of a four-gene locus in A. junii SH205 (the strain whose protein database is available). Sequence analysis of the pga locus in A. junii SH205 revealed that the predicted proteins encoded by this locus share 41%, 23.7%, 68.4%, and 42.9% identity with the A. baumannii AYE PgaA, PgaB, PgaC, and PgaD proteins, respectively (Table 2). Therefore, we hypothesized that the locus might be responsible for the synthesis of PNAG in A. junii.

Comparative Sequence Analysis of PgaC of A. junii SH205
pgaC is predicted to encode a 424-amino-acid N-glycosyltransferase (PgaC) that belongs to the glycosyltransferase 2 family. It is a cytoplasmic protein that is required for the synthesis of PNAG. On the basis of the homology between PgaC of A. junii SH205 and similar proteins from certain known virulent pathogenic bacteria, we hypothesized that the pgaC locus in A. junii SH205, coding for PNAG synthesis, might also be associated with its virulence. The ability of several pathogens to adhere to human tissues and medical devices by producing biofilms is a major virulence factor, and it corresponds with broad protection against several antibiotics, phagocytosis, and nutrient stress. Of the different molecules identified as biofilm components in diverse species of eubacteria, PNAG is an important molecule that is widely conserved [39].
The multiple alignments of PgaC sequences from different bacteria revealed that PgaC of A. junii SH205 retains the amino acids that are crucial to the function of the enzyme responsible for building up the extracellular matrix in biofilms. On the basis of ClustalW and WebLogo results for the sequences of PgaC of A. junii SH205, E. coli and some strains of Acinetobacter baumannii, HmsR of Y. pestis and C. violaceum, PgaC of K. pneumoniae, and IcaA of S. epidermidis, certain amino acids were found to be evolutionarily conserved. The results showed conservation of 18 amino acids (Gly, Asn, Glu, Thr, Val, Ile, Asp, Ser, Lys, Ala, Pro, Arg, Gln, Phe, Trp, Cys, Tyr and Leu) at 73 positions spread throughout the sequence at different frequencies, as shown in Figure 1. The five amino acids that have been shown to be critical for the activity of the glycosyltransferase [22] [31] are all conserved in PgaC of A. junii SH205 (Asp140, Asp233, Gln269, Arg272, Trp273), indicating functional similarity of this protein.
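Reading conserved positions off a multiple alignment can be sketched as a scan for columns in which every sequence carries the same residue, reported in the numbering of a reference sequence. The three short aligned strings below are invented for illustration and are not the PgaC alignment; the function name is likewise hypothetical.

```python
def conserved_positions(aligned, reference=0):
    """Return (position, residue) pairs for fully conserved alignment columns.

    Positions are 1-based and counted along the ungapped reference sequence,
    so they can be compared with residue numbers such as Asp140.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    conserved = []
    ref_position = 0
    for column in range(length):
        residues = [seq[column] for seq in aligned]
        if aligned[reference][column] != "-":
            ref_position += 1
        # A column counts as conserved only if it is gap-free and uniform.
        if residues[0] != "-" and len(set(residues)) == 1:
            conserved.append((ref_position, residues[0]))
    return conserved


# Invented toy alignment; the first sequence is the reference.
toy_alignment = [
    "MD-GQRW",
    "MDAGQKW",
    "ME-GQRW",
]
print(conserved_positions(toy_alignment))
```

Only strict identity is counted here; a conservation display such as WebLogo also conveys partial conservation through residue frequencies.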

Structure Prediction of N-Glycosyltransferase (GT) and Molecular Docking with UDP-GlcNAc
To investigate the role of these amino acid residues, their conformation and location in the structure of the enzyme were explored. The predicted 3D structure generated by I-TASSER, with C-score -0.17, had an ERRAT Overall Quality Score of 74.760 when checked for quality. The PROCHECK result showed 98.2% of residues in the allowed region of the Ramachandran plot [40]. Structure refinement using 3Drefine generated 5 models. One of these models had an ERRAT Overall Quality Score of 90.625, with 98.4% of residues in the allowed region. Comparison of the refined structure with the annotation of the glycosyltransferase of E. coli (UniProtKB/Swiss-Prot accession no. P75905) further validated the presence of two distinct structural regions. The transmembrane helical region has a role in anchoring the protein to the plasma membrane, whereas the periplasmic region contains most of the conserved residues (Figure 2). The RMSD of the refined model with reference to the initial predicted structure was calculated to be 0.259 Å, as shown in Figure 3(a). The largest cavity, with a surface area of 3329.2 Å², was calculated using CASTp [41] (http://sts-fw.bioengr.uic.edu/castp/calculation.php) and found to contain 44 of the 73 conserved amino acids, all located in the periplasmic region. This suggests the involvement of these amino acids in the active site for the synthesis of PNAG, which was examined further using the molecular docking results. The rest of the conserved amino acids might have a role in maintaining the correct structure of the active site, which can be confirmed using site-directed mutagenesis and simulation analysis.
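The RMSD figure quoted above measures the average displacement between equivalent atoms of two structures. A minimal sketch of the calculation is given below, assuming the two coordinate sets are already superimposed and list the same atoms in the same order; the coordinates are invented and the optimal superposition step performed by structure comparison tools is not included.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists.

    Each coordinate is an (x, y, z) tuple in angstroms. No superposition is
    performed; the inputs are assumed to be pre-aligned.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Invented backbone coordinates for two small fragments.
initial = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
refined = [(0.0, 0.0, 0.3), (1.5, 0.0, 0.3), (3.0, 0.0, 0.3)]
print(rmsd(initial, refined))
```

A value well below 1 Å, as reported for the refined model, indicates that refinement changed the backbone only slightly.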
PgaC, a cytosolic glycosyltransferase (GT), uses UDP-N-acetylglucosamine (UDP-GlcNAc) to synthesize the polymer PNAG. A docking study was carried out using A. junii SH205 PgaC as the receptor and UDP-N-acetylglucosamine as the ligand. The best model, with a binding energy of −4.9 kcal/mol, was selected for analysing the result of the docking experiment. This analysis suggested the involvement of Pro170, Ile190, Lys195, Thr206, Ser208, Ile234, Gln269, Arg270, Arg272 and Trp273 in the interaction with the substrate UDP-GlcNAc (Figure 3(b)).
An in silico mutagenesis approach was adopted to confirm the role of the five critical amino acids that enable binding of UDP-GlcNAc. The mutated 3D protein models (5 models with substitution by similar amino acids and 5 with dissimilar amino acids) were superimposed, and their root-mean-square deviation (RMSD) values indicated a good overall structural alignment, with the backbone RMSD of the whole structures ranging from 0.84 to 1.05 Å (Figure 4). The mutant proteins displayed 3D structures with identical β-sheets and α-helices in an arrangement and distribution similar to those of the wild type, as shown in Figure 5. The molecular docking results showed that residues D140, D233, Q269, R272 and W273 were crucial for UDP-GlcNAc binding. Mutating these residues one at a time resulted in a decrease in binding affinity (higher binding energy), except at position 140, where the point mutation resulted in the lowest binding energy. It was also evident that substitution at any one of the five crucial positions with a similar amino acid (the other four remaining unaltered) produced no significant change in the binding energy, while substitution with a dissimilar amino acid had the greatest effect in all cases (Figure 6). The normality of the data was checked and the level of significance was set at 0.05. Mutations resulting in substitution with similar amino acids did not show a significant difference in binding energy (significance value: 0.398 > 0.05); however, mutations to dissimilar amino acids showed a significant increase (significance value: 0.001 < 0.05). The higher binding energy in the case of dissimilar amino acid substitutions would presumably indicate reduced association of GT with UDP-GlcNAc. The analyses also suggested that two positions, D140 and D233, although not directly involved in the interaction with the ligand, have a definite role in maintaining the structure of the binding cavity. These analyses suggested that PgaC is a polysaccharide polymerase that uses UDP-GlcNAc as a substrate. The above-mentioned residues are among the evolutionarily conserved residues that are found in the periplasmic domain and are involved in the enzymatic activity. However, definite evidence can only be provided by additional experiments such as substrate-enzyme reaction kinetics.

Conclusion
The homology search revealed a four-gene locus in A. junii SH205 that is homologous to various genetic loci encoding proteins for poly-β-(1-6)-N-acetylglucosamine biosynthesis. The possibility of PNAG synthesis in A. junii SH205 with the aid of its own PgaC was ascertained using bioinformatics. Based on this study, one can test the virulence potential of A. junii SH205 in cell culture and devise means of control by blocking the synthesis of PNAG.

Figure 1. Conserved amino acids (highlighted in green letters) along with the 5 critical amino acids (highlighted in red letters) in the translated product of pgaC loci among 13 different species, represented in the sequence of A. junii SH205.

Figure 2. Representation of the two distinct structural regions (transmembrane helical region and periplasmic region) of the glycosyltransferase of A. junii SH205, as mapped by local alignment with the E. coli protein (UniProtKB/Swiss-Prot accession no. P75905).

Figure 3. Homology modeling and molecular docking. (a) Backbone representation of the predicted structure of the glycosyltransferase from A. junii SH205; the blue backbone represents the I-TASSER-generated structure before refinement and the red backbone represents the structure after refinement. (b) Interacting residues are shown in yellow and the substrate UDP-GlcNAc is shown in green. The magenta region represents the rest of the periplasmic domain.

Figure 5. Superimposed 3D structure of wild-type and mutant proteins.

Figure 6. Binding energy for in silico glycosyltransferase mutants of A. junii SH205. The X-axis represents the glycosyltransferase mutants generated by point mutations of the codons at the critical amino acids, and the Y-axis gives the respective binding energies for UDP-GlcNAc docked with the individual mutants.

Table 1. Identity and similarity percentage of the translated product of pgaC loci of A. junii SH205 with 12 different species having pgaC loci associated with their virulence.

Table 2. Maximum identity percentage between proteins from A. junii and A. baumannii.
This family includes PgaC, BpsC, HmsR, and IcaA from E. coli, B. pertussis, Y. pestis, and S. aureus, respectively. A BLASTP search of the homologous loci of other Gram-negative bacteria and of the Gram-positive Staphylococcus epidermidis against the NCBI nonredundant protein database sequences of A. junii SH205 enabled us to identify a pgaC that shares a high degree of similarity with pgaC loci encoding PNAG synthesis. Similarities between PgaC of A. junii SH205 and PgaC of E. coli and some strains of Acinetobacter baumannii, HmsR of Y. pestis and C. violaceum, PgaC of K. pneumoniae, and IcaA of S. epidermidis are shown in Table 1. A. junii SH205 PgaC shares 68.4%, 52.6%, 51.9%, 50.9%, 50.7% and 38.7% identity with PgaC of A. baumannii AYE, HmsR of C. violaceum, PgaC of K. pneumoniae, HmsR of Y. pestis, PgaC of E. coli, and IcaA of S. epidermidis, respectively (Table 1). The IcaA sequence of S. epidermidis yielded low identity (below 50%) with all the Gram-negative bacteria studied.