Homology Modelling and Structural Comparisons of Capsid-Associated Proteins from Circoviruses Reveal Important Virus-Specific Surface Antigens

Circoviridae represent a growing family of small animal viruses. Some of these viruses have veterinary and medical importance, although many of the newly discovered viruses have unknown effects on their hosts. The capsid-associated protein (Cap) of circoviruses is of interest because of its role in viral structure, immune evasion, host cell entry, and nuclear shuttling of viral components. The structure of the porcine circovirus 2 (PCV2) Cap has been solved and has offered insight into these functions. Based on the crystallographic PCV2 Cap structure, models of Cap from circoviruses isolated from avian, fish, and mammalian hosts have been constructed and analyzed to better understand the roles of these proteins in the virus family. A high degree of conservation is observed in the models; however, the surface antigens differ among viruses. This is likely a reflection of the small genome harbored by circoviruses, and therefore the requirement of their few proteins to carry out specific vital functions while maintaining enough variation to successfully infect their hosts. Here we describe the putative structures of a range of Cap proteins from circoviruses based on the crystallographic determination of porcine Cap, identifying key regions for function and for inhibition of crystal formation.

As few proteins are encoded in the viral genomes, these viruses are a model of efficiency and must use host cell machinery to replicate [13]. Two major proteins are characterized in circoviruses and circovirus-like viruses: the replication-associated protein (Rep), which is involved in the replication of the virus, and the capsid-associated protein (Cap), which is the structural component of the viral capsid. Cap is the main antigenic protein of circoviruses because the protein uses repeated subunits to compose the entire capsid structure of the virus. In PCV and BFDV, 60 repeating Cap subunits make up the capsid [1].
The Cap sequence is highly variable, which is likely important for evasion of the host immune system. Amino acid identities in the Cap sequence of BFDV range from 73% to 99% [14,15]. Cap is a single-domain protein, but it has several features that pertain to its functions. The N-terminus of the protein is positively charged and may contain several conspicuous nuclear localizing signals (NLS). A second significant function of Cap is its ability to bind DNA [16]. Affinity of Cap for DNA is a vital function because this interaction allows the viral genome to gain access to the host cell nucleus. Rep-binding capability may also be an important role for Cap. This interaction is particularly interesting because of discrepancies between circoviruses. Rep proteins of PCV possess NLSs and can enter the nucleus without Cap [17]; however, in BFDV, Rep must bind to Cap for nuclear transport [16].
In 2011, the crystal structure of PCV2 Cap was determined [18]. To crystallize PCV2 Cap, 40 amino acids were omitted from the N-terminus, which contains the NLS; however, the structure placed the NLS inside the capsid [18]. The structural data provided important information with respect to the antigenic characteristics of this virus, as previously predicted PCV epitopes were either located on the internalized surface and, therefore, not immunogenically relevant, or were composed of consecutive amino acids [19,20]. The structure of Cap has illuminated epitopes for monoclonal antibody (MAb) binding and has been useful for comparing variation in binding of MAbs to PCV2 and PCV1. Small mutations, sometimes involving only a single amino acid substitution, were attributed to specific differences in MAb binding between the two porcine viruses [18]. This highlights the selective pressure to employ mutations as a means to elude the host immune system, and can be related to the publication from Kundu et al. [21], showing that viruses infecting the same population may have diverse biological fitness due to genetic mutations that translate into slight amino acid variation. Specifically, Cap must successfully evade the host immune system, attach to a surface receptor on a cell that is appropriate for infection, and bind to its own genome (and viral proteins) for localization into the nucleus where replication can occur, all while undergoing additional mutations, which select for optimal antigenicity and must not diminish critical function. These characteristics lead to the hypothesis that although circoviruses and related viruses must be sufficiently different to infect their respective hosts and escape immune detection, they must have the same underlying structures, which allow them to carry out a core and critical set of functions.
Similarities between the amino acid sequences of circovirus Cap proteins and the solved PCV2 Cap also allow models to be constructed. The differences among strains and between viruses may have different effects when comparing their structural attributes. The rapidly growing number of circoviruses and circovirus-like viruses being discovered in a diverse range of hosts raises the question of their structural relatedness [22]. Conserved regions among these viruses would be helpful for the development of diagnostic and treatment strategies, and for the recognition of new viral discoveries. Models for circoviruses infecting a variety of hosts were analyzed based on the structural data from PCV2 Cap. It was observed that changes in Cap sequences among the analyzed circovirus isolates do not translate into major structural differences in the proteins and, likely, cause no major structural changes in the viral capsid; however, key differences were observed in antigenic regions and are discussed within.

Modelling of Circovirus Capsid-Associated Proteins
The structures of the PCV2 Cap monomer and viral capsid were retrieved from the RCSB Protein Data Bank (PDB ID: 3R0R). The recombinant PCV2 Cap protein that was crystallized did not contain the NLS; rather, it lacked the first 40 N-terminal residues and contained 193 PCV2 Cap residues [18]. Several circovirus Cap sequences from a range of hosts were retrieved from the National Center for Biotechnology Information (NCBI, U.S. National Library of Medicine). These sequences are outlined in Table 1. Models were created using SWISS-MODEL (Swiss Institute of Bioinformatics) [23] and viewed with The PyMOL Molecular Graphics System, Version 1.3 (Schrödinger, LLC). Alignment of the circovirus Cap sequences was performed with Clustal W2 (EMBL-EBI) [24]. A phylogenetic tree was made using MAFFT (CBRC) [25] and viewed using Archaeopteryx (phylosoft) [26].
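As an illustration of how identity and similarity percentages of the kind reported for aligned Cap sequences can be derived, the following minimal Python sketch scores two pre-aligned sequences column by column. It is not the procedure used in this study: the function name, the residue groups treated as "similar", and the choice to ignore gapped columns are all assumptions made for the example, and alignment tools typically derive similarity from a substitution matrix instead.

```python
# Hypothetical grouping of residues treated as "similar"; real tools derive
# similarity from a substitution matrix, so these groups are only illustrative.
SIMILAR_GROUPS = [
    set("ILVM"), set("FWY"), set("KRH"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]


def identity_and_similarity(aligned_a: str, aligned_b: str, gap: str = "-") -> tuple[float, float]:
    """Return (percent identity, percent similarity) over columns without a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        return 0.0, 0.0
    return 100.0 * identical / compared, 100.0 * similar / compared


if __name__ == "__main__":
    # Toy aligned fragments, not real Cap sequences.
    ident, simil = identity_and_similarity("MKV-LDA", "MRVALEG")
    print(f"identity {ident:.1f}%, similarity {simil:.1f}%")
```

Different programs use different denominators (alignment length, shorter sequence length, or ungapped columns), so values from a sketch like this are only comparable when the same convention is applied to every pair.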

Molecular Replacement Models of Circovirus Caps Using PCV2 Capsid Structure
The Cap sequences of related circoviruses varied in length from 214 to 273 amino acids, and shared 23% to 32% identity and 35% to 46% similarity with the PCV2 Cap sequence (Table 1). The canine circovirus (CaCV-1), the most recently discovered circovirus sequence, is the least similar to PCV2 Cap, a surprising divergence given that both PCV2 and CaCV-1 infect mammalian hosts.
The barbel (BaCV) and duck circovirus (DuCV) Cap sequences share the closest similarity to PCV2 Cap. Interestingly, the two circovirus sequences isolated from fish occupy opposite ends of the spectrum, with the Silurus glanis circovirus (CfCV) being only marginally more similar to PCV2 than CaCV-1 is. The isolates from mammalian hosts, CaCV-1 and Chimpanzee Stool avian-like circovirus-chimp17 (CsaCV-chimp17), also have major differences, although the CsaCV-chimp17 sequence has been likened to the raven circovirus (RaCV), so it is not surprising that its similarity to PCV2 is in line with that of the other avian isolates. All avian circoviruses compared in this study have over 40% similarity with PCV2 Cap.
The internal jelly roll structure described in PCV2 Cap is conserved among all models constructed of related circoviruses (Figure 1). This jelly roll is formed by seven to eight β-sheets, depending on the model.
Although some surface characteristics are altered in each data set, there are still several β-sheets making up the core of all the selected viral Caps. All N-termini appear to be facing internally, and therefore, the NLS portions of all Cap structures likely remain inside the viral capsid.
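The text does not state how secondary-structure elements were tallied for each model. One simple way to do so, assuming a per-residue secondary-structure string is available (for example with the common codes H for helix, E for strand and C for coil, which is an assumption here), is to count contiguous runs of each code. The function name and the toy string below are hypothetical.

```python
from itertools import groupby


def count_elements(ss: str) -> dict[str, int]:
    """Count contiguous runs of each secondary-structure code in a per-residue string."""
    counts: dict[str, int] = {}
    for code, _run in groupby(ss):
        # Each group is one uninterrupted stretch of the same code.
        counts[code] = counts.get(code, 0) + 1
    return counts


if __name__ == "__main__":
    # Toy assignment string, not derived from any Cap model.
    toy = "CCEEEECCHHHHHCCEEECCEEEEC"
    counts = count_elements(toy)
    print("helices:", counts.get("H", 0), "strands:", counts.get("E", 0))
```

Counting runs rather than residues keeps the tally comparable between models of different lengths, although very short runs may need a minimum-length filter before being treated as genuine elements.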

Duck Circovirus
The DuCV Cap model has few deviations from the PCV2 Cap structure, with even less structural variability on the external surface (Figure 3). The model contains two α-helices and eight β-sheets. The most striking difference appears in the region homologous to the PCV2 Cap residues between surface epitopes A and B. Several additional residues in the DuCV Cap sequence are organized into an α-helix. This landmark protrudes from the structure; however, it corresponds to the internal surface of the capsid, negating its antigenic relevance. Other differences resemble features of the BFDV Cap model. These include the lack of a β-sheet corresponding to the latter portion of epitope A on PCV2 Cap, and a more modest protrusion prior to epitope E, consisting of seven additional residues rather than the 13 seen in the BFDV Cap sequence. The DuCV Cap is the only avian-related virus among those analyzed that does not contain a significant stretch of sequential identical residues in epitope E. A pore, due to a tunnel in the DuCV Cap protein, would extend from the external surface to the internal surface of the viral capsid.

Columbid Circovirus
The CoCV Cap model displays a structure intermediate between those of PCV2 Cap and BFDV Cap. The similarity to PCV2 Cap leaves very little of the CoCV Cap to extend beyond the surface of the porcine virus structure. Not surprisingly, the differences are most pronounced prior to the region homologous to epitope E. Here, the additional residues are split into two protrusions, with the median portion returning to similarity with the PCV2 Cap structure. Three sequences containing several consecutive amino acids that are present in most of the avian-related circoviruses in this study occur in this area: WIPL 183-186, which is part of the median portion that rejoins the PCV2 Cap strand's trajectory, and HYGLAFS 200-206 and PQP 210-212, which are within epitope E. Interestingly, the WIPL sequence in the other models remains part of the protrusion and does not follow the PCV2 Cap structure. The β-sheet corresponding to epitope A is also not present in CoCV Cap, along with other minor β-sheets seen in PCV2 Cap. CoCV Cap shows some small transprotein tunnels, although it is not clear whether these would allow the passage of particles into and out of the viral capsid.

Barbel Circovirus
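The short sequence of the BaCV Cap leaves few extraneous residues to form protruding or additional structures beyond those seen in the PCV2 Cap. The α-helix at the N-terminus of the PCV2 Cap structure is maintained in the BaCV Cap model; however, no other α-helices are present. Eight β-sheets are seen in the BaCV Cap model. The region corresponding to epitope A shows neither α-helix nor β-sheet. The most prominent protrusion from BaCV Cap appears to occur on the internal surface of the viral capsid, where it takes the place of another α-helix, although a divergence is also present in the region corresponding to epitopes B and C due to the inclusion of additional residues. Several tunnels exist in the protein, at least one of which would create a pore through to the internal environment of the capsid structure.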

Gull Circovirus
A set of eight -sheets make up the core of the gull circovirus (GuCV) Cap model.The single -helix present in the model is an extension of an existing, shorter, -helix from PCV2 Cap.This -helix appears on the lateral surface of the protein, and would interact with an adjoining Cap protein in the viral capsid.Not surprisingly, the portion of GuCV Cap prior to epitope E from PCV2 Cap contains additional residues, causing a minor protrusion.The region within epitope E also contains a sequence which matches with several avain circoviruses analyzed, HYGLAFS 180-186 .The GuCV Cap contains transprotein tunnels, creating pores that would extend from the external surface to the internal surface of the viral capsid.

Raven Circovirus
The RaCV shares some striking similarities to the BFDV Cap structures and sequences. The model contains an α-helix and eight β-sheets. The surface structure homologous to epitope E from PCV2 Cap includes a preceding protrusion consisting of 17 additional residues. This protrusion possesses the WIPL 185-188 sequence at the beginning of the additional region, and the VKHYGLA 203-209 and PQP 213-215 sequences in epitope E, all of which are identical or similar to the corresponding regions of the BFDV Cap sequence. The region homologous to epitope A also lacks the α-helix and β-sheet. The protein has pores, although these may not extend from the external to the internal surface of the viral capsid.

Silurus Glanis Circovirus
The CfCV Cap sequence, like the BaCV Cap sequence, is much shorter than PCV2 Cap at 227 amino acids. This allows the PCV2 Cap model to extend beyond the borders of the CfCV Cap model in several places. The first turn in the PCV2 Cap model extends for 12 residues longer than the CfCV Cap, immediately prior to epitope A. CfCV Cap contains two α-helices that lie adjacent to each other, one in epitope A and the other prior to epitope E, and seven β-sheets. The α-helix prior to epitope E is located on the surface and creates a small protrusion in the same position as the protrusions seen in the other models.

Chimpanzee Stool Avian-Like Circovirus-Chimp17
Although the CsaCV-chimp17 virus was isolated from a mammalian host and its residue count is the closest to that of the full-length PCV2 Cap sequence, its name asserts its likeness to the avian circoviruses. The relation to avian circoviruses is evident in the structure and sequence of the model. The overall structure maintains a shape consistent with the avian virus models; however, it is the only model that does not contain an α-helix. The CsaCV-chimp17 Cap model contains eight β-sheets, like the avian Cap models, and has a large protrusion prior to epitope E. The protrusion begins with the WIPL 149-152 sequence seen in the BFDV, CoCV and RaCV Caps, and the sequence HYGLAFS 166-172 forms part of epitope E, as seen in the CoCV Cap, GuCV Cap and BFDV Cap sequences that were analyzed.

Canine Circovirus
The CaCV-1 Cap is the least like PCV2 Cap in sequence identity and similarity; this is evident in the structure as well (Figure 4). The model includes one α-helix and seven β-sheets. Several stretches of non-homologous residues in CaCV-1 Cap result in additional features. The most conspicuous feature is a very large protrusion that extends from the external surface following the portion of the protein homologous with epitope C. This protrusion extends for 26 amino acids. Other protrusions from CaCV-1 Cap also occur on the external surface or on interfaces that would articulate with other Cap proteins to form the viral capsid. No additional protrusions are present on the internal capsid surface. The model predicts the viral capsid structure to be porous, with multiple tunnels extending through the protein, which would reach from the external surface to the internal surface of the viral capsid.

Discussion
Models were made and evaluated for eight related circoviruses, four from avian hosts, two from fishes, and two isolated from mammalian hosts. The structures of the models, along with sequence similarities, show the relatedness of the avian circoviruses. Classifications between the other viruses, however, do not necessarily follow the same lines as their hosts. The publications reporting the viruses from non-avian hosts provide insight into the differences seen in these isolates. The discovery of BaCV and CfCV resulted in a new group of fish circoviruses, which are separated from circoviruses of other hosts [8,9]. Although BaCV is the closest related virus to CfCV, their Cap sequences share only 28% identity and 46% similarity, so their structures need not be similar. Their difference in similarity to PCV2 Cap is not surprising either, as phylogenetic analysis of amino acid sequence places BaCV closer than CfCV to PCV2 (Figure 1) [9]. The viruses isolated from mammalian hosts also differ greatly; however, this is evident from their nomenclature. The CsaCV-chimp17 isolate was shown to relate to avian circoviruses more closely than other clusters (Figure 1) [12]. Meanwhile, the little data that exist on the newly discovered CaCV-1 classify the canine virus as a separate species, as it shows minimal identity to other circoviruses [11]. Although the structures show distinct characteristics for the two mammalian-isolated circoviruses, it can be seen that CsaCV-chimp17 resembles the avian models with its eight β-sheets.
The avian circoviruses share many characteristics, including the previously mentioned eight β-sheets and distinct surface amino acid sequences. The sequences WIPL, HYGLAFS, and PQP occur in several of the analyzed avian circovirus sequences, corresponding to residues of epitope E of PCV2 Cap and the residues immediately preceding it. Since these conserved regions are on the surface and are found in protrusions, the sequences may be suitable antigens for MAbs that can be used to target a wide range of strains among a species of circoviruses, or possibly as a general diagnostic tool to cross-react with a range of viruses that infect different hosts. MAbs that cross-react with related viruses have been shown to have neutralizing activity, and such cross-reactivity is of great interest as a diagnostic and characterization tool for many viral pathogens [27-29]. The location of the conserved sequences on the surface, as well as their close proximity, may also select for MAbs that recognize linear epitopes, making them more robust for experimentation and diagnostics.
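Locating short conserved motifs such as WIPL, HYGLAFS and PQP in a Cap sequence, and reporting their residue numbers, can be sketched with a plain substring search. The example below is only an illustration: the function name is hypothetical and the input is a toy string rather than a real circovirus sequence. Positions are reported as 1-based, inclusive residue numbers, matching the convention used for ranges in the text.

```python
def find_motifs(sequence: str, motifs: list[str]) -> dict[str, list[tuple[int, int]]]:
    """Map each motif to the 1-based (start, end) positions of every occurrence."""
    seq = sequence.upper()
    hits: dict[str, list[tuple[int, int]]] = {}
    for motif in motifs:
        positions = []
        start = seq.find(motif)
        while start != -1:
            # Convert the 0-based index to 1-based, inclusive residue numbers.
            positions.append((start + 1, start + len(motif)))
            start = seq.find(motif, start + 1)
        hits[motif] = positions
    return hits


if __name__ == "__main__":
    # Toy sequence containing the three motifs; not a real Cap sequence.
    toy = "AAWIPLGGGHYGLAFSTTPQPAA"
    for motif, positions in find_motifs(toy, ["WIPL", "HYGLAFS", "PQP"]).items():
        print(motif, positions)
```

An exact search of this kind would miss near matches such as VKHYGLA, so a tolerant comparison against an alignment would be needed to capture similar but non-identical motifs.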
The overall structure of all circoviruses examined in this paper showed a high degree of similarity. Several studies have noted the high variation in circovirus sequences, especially those contributing to the capsid protein [14,15,30-37]. With the high diversity, even among isolates and strains from the same virus, structural changes may be anticipated. Many surface antigens will change; however, despite the diversity, the configuration of the β-sheets at the centre of the structure was consistent throughout all models, creating a jelly roll formation as reported for PCV2 Cap and seen in other icosahedral viruses [18,38,39]. Even with PCV2 Cap exhibiting nearly twice as many β-sheets in the structure as CfCV and CaCV-1, the internal structure is still maintained.
The presence of α-helices in the viruses may not be as significant, due to their variation among closely related isolates and their generally small size. Long strands containing the NLS would be seen at the N-terminus of all Cap proteins.
It was noted that all N-termini of the circoviruses viewed were directed internally into the viral capsid. The location of the NLS may also signify a function within the viral capsid. Citing proposed functions for similar N-termini found in PCV and CAV, which are used in packaging the viral genome into the capsid [13], the NLS on BFDV Cap was long assumed to be confined to the inside of the capsid in order to carry out this role [40]. Although the high concentration of positively charged arginines prevents a definitive structure from being determined without direct structural analysis of the full-length protein, PCV2 Cap structural data, even with a truncated NLS, are enough to show that the N-terminus of the protein is inward facing for all Cap models viewed. The NLS region was also observed binding DNA [16], giving credence to the claim that the N-terminus of Cap is responsible for viral DNA packaging in the capsid.
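The positive charge of the N-terminal region can be given a rough numerical form by measuring the fraction of basic residues in the first stretch of a sequence. The sketch below is illustrative only: the function name is hypothetical, the default window of 40 residues is borrowed from the PCV2 truncation described earlier rather than from a defined NLS boundary, and lysine is counted alongside arginine as an assumption.

```python
def basic_fraction(sequence: str, n_residues: int = 40) -> float:
    """Fraction of arginine (R) and lysine (K) among the first n_residues."""
    n_term = sequence.upper()[:n_residues]
    if not n_term:
        return 0.0
    basic = sum(1 for residue in n_term if residue in "RK")
    return basic / len(n_term)


if __name__ == "__main__":
    # Toy N-terminal fragment, not a real Cap sequence.
    print(f"{basic_fraction('MRRKAAAA', n_residues=8):.3f}")
```

Comparing this fraction between the N-terminal window and the rest of a sequence would indicate how strongly basic residues are concentrated at the N-terminus.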
Since many changes occur on the external surface and do not affect the interaction with other Cap proteins, the viral capsids of all circoviruses likely share a high level of structural similarity. To continue to expand the structural knowledge of circovirus Caps, crystallographic or NMR data will need to be obtained. It would be preferable to use the full-length protein; however, the NLS region has been shown to inhibit expression in E. coli [40]. Additionally, the long, positively charged NLS characteristic of circovirus Caps is predicted to extend outward from the tertiary structure of the protein, creating a flexible region that would likely inhibit packing of a crystal for crystallographic determination. Although the size of these proteins is suitable for NMR determination, difficulties arise with the high salt concentration required to purify and solubilize circovirus Caps, as well as their lack of solubility in the pH range compatible with NMR. Nevertheless, additional sequences of circovirus Cap, full-length and fragmented, should be expressed in an attempt to optimize for crystallization or NMR, to continue to add knowledge about this rapidly growing family of viruses.

A full-length BFDV Cap sequence isolated from a red-fronted parakeet (Cyanoramphus novaezelandiae) was superimposed on the PCV2 Cap structure to view the putative positioning of the NLS in the virus. The model of the full-length BFDV Cap structure contained one α-helix and eight β-sheets. Of the four epitopes, B and C showed very little change in external positioning. Epitope A shows differences, as the PCV2 Cap structure has a small β-sheet while the putative full-length BFDV Cap has another turn on its strand. Epitope E shows major changes, with many additional amino acids prior to the PCV2 Cap epitope causing the strand to protrude. The NLS was expected to extend into the inside of the viral capsid (Figure 2).

Figure 2. Cross-section of the viral capsid of PCV2 composed of 60 units of PCV2 Cap (green). Full-length BFDV Cap is overlain on one of the PCV2 Cap molecules to view the position of the NLS inside the viral capsid. The NLS extends well into the capsid cavity, likely interacting with the viral genome.

Figure 3. PCV2 Cap (green and cyan) overlain with DuCV (red). The cyan portions on the PCV2 Cap indicate amino acids presumed to be epitopes on the external capsid surface for MAb binding and are labelled A, B, C, and E according to Khayat et al. [18]. The protrusion on DuCV that includes residues prior to epitope E has been labeled pre-E-DuCV, and appears on the surface of the protein. The DuCV Cap sequence shares the highest amount of similarity to the PCV2 Cap sequence. The jelly roll structure and several other features remain conserved.

Figure 4. PCV2 Cap (green and cyan) overlain with CaCV-1 Cap model (pink). The CaCV-1 Cap model has the most discrepancies with PCV2 Cap. The regions pertaining to epitopes B and E from PCV2 Cap are labeled B-CaCV-1 and E-CaCV-1, respectively, on the CaCV-1 Cap model. A large protrusion on the surface of CaCV-1 can also be seen in the post-epitope C region and is labeled pC-CaCV-1. The internal jelly roll structure, however, remains conserved.