Identification of a Morphogenic Intermediate of the Bacteriophage Mu Baseplate

Bacteriophage morphogenesis is a model system for investigating sequential molecular assembly. The Mu phage is one of the classical members of the Myoviridae. Although it is well known as a mobile genetic element, the details of its morphogenesis remain unclear. Analysis of conditional lethal mutants and genome analysis of the Mu phage have suggested that genes 42, 43, 44, 45, 46, 47, and 48 are essential for its baseplate assembly. Since we have already reported X-ray structures of the products of genes 44 (gp44) and 45 (gp45), we attempted here to purify the remaining Mu phage baseplate subunits, gp42, gp43, gp46, gp47, and gp48, to investigate the baseplate assembly process. When gp42 expression was induced, the transformed E. coli cells showed growth inhibition and no gp42 was detected. In contrast, gp43, gp46, gp47, and gp48 were successfully expressed and purified, although gp48 could not be analyzed further because the amount of soluble protein was very low. Based on analytical ultracentrifugation, we concluded that gp43 and gp46 were monomers, and that gp47 occurred as both a monomer and a dimer in solution. Moreover, we found that gp43 and gp45 formed an intermediate complex in the baseplate assembly process.


Introduction
Morphologically, the tailed bacteriophages can be assigned to three families, namely Podoviridae, Myoviridae, and Siphoviridae. The structural proteins of these tailed phages and their genes have classically been investigated using conditional lethal mutants [1]. These techniques have also been employed in reconstitution experiments to study the self-assembly pathways of bacteriophages [2]-[4]. In the earlier understanding of phage morphogenesis, the assembly of the phage subunits was considered a sequential process directed by a strictly ordered pathway. One of the most important discoveries in bacteriophage research was that such an assembly pathway did not reflect the temporal regulation of gene expression, but was promoted by a specific sequence of protein-protein interactions [5]. Thus, bacteriophages have been used as a model system for investigating sequential molecular assembly [6].
It is currently possible to reproduce the self-assembly pathway by using recombinant subunits [7]. To apply this strategy of reconstitution with recombinant subunits to the investigation of the macromolecular assembly mechanism, the expression and characterization of individual subunit proteins is clearly an important first step. Recombinant subunits are also useful for structural studies of bacteriophages and for preparing subunit crystals for X-ray experiments [8]. However, the purification and characterization of individual phage subunits can be difficult, since the subunits sometimes aggregate easily. Specifically, the individual subunits may have hydrophobic surfaces that normally attach to other subunits and that cause aggregation when a single subunit is purified alone. In contrast with these difficulties of subunit isolation and characterization, remarkable progress has been made in the structural investigation of whole bacteriophage particles by means of three-dimensional image reconstruction from electron micrographs [9] [10]. Thus, the overall appearance of the phages has been well clarified. Moreover, by superimposing the X-ray structure of a subunit onto the three-dimensional image obtained from electron microscopy, it has recently become possible to characterize the locations of, and interactions between, component subunits [9] [11]-[14].
Despite the diversity of morphology and infection mechanisms in Podoviridae, Myoviridae, and Siphoviridae, each of their tails consists of several subunits and is a "nano-machine" responsible for efficient injection of phage DNA into the host cell. The distal tail end is characterized by a baseplate that functions in specific host cell recognition and as the adsorption apparatus. The details of the baseplate assembly pathway are not well known, except in the case of a few trailblazing phages such as T4 and λ, although several image reconstruction studies have been reported for other phage baseplates [15]-[17]. The Mu phage is another trailblazing member of the Myoviridae, and is composed of an icosahedral head filled with double-stranded DNA, a contractile tail, a baseplate located at the distal end of the tail, and tail fibers. Analysis of over 300 conditional lethal mutants indicated that 27 genes are essential for growth of the Mu phage [18] [19]. The products of genes Y, N, P, Q, V, W, and R (gpY, gpN, gpP, gpQ, gpV, gpW, and gpR) are necessary for baseplate assembly [20]. Genome analysis of Mu suggested that the genes Y, N, P, Q, V, W, and R corresponded to the genes 42, 43, 44, 45, 46, 47, and 48, respectively, and that many of the genes exhibited only a low degree of sequence homology to proteins registered in the databases [21]. Thus, little is known about the Mu phage gene products that comprise its virion.
Recently, we reported the purification and structural determination of Mu gp44 and of the C-terminal domain of gp45 [22] [23]. Both are trimers and are structurally homologous with other Myoviridae subunits in spite of the low primary structure homology. In this paper, we purified the remaining Mu phage baseplate subunits, gp43, gp46, gp47, and gp48, and report on their stoichiometry in solution. Moreover, our analysis revealed that gp43 and gp45 form an intermediate complex in the baseplate assembly process.

Protein Expression and Purification
The genes 42, 43, 46, 47, and 48 of the bacteriophage Mu were amplified by polymerase chain reaction from lysogenic host genomic DNA and cloned into the expression vector pET21a (Novagen) [23] [24]. For all these constructs, the DNA sequences were designed so that the genes were expressed with or without a 6× histidine-tag fused at the C-terminal end. The expression constructs were used to transform E. coli BL21(DE3)pLysS. The transformed cells were grown at 18°C in Luria broth containing 100 µg/ml of ampicillin. Isopropyl β-D-1-thiogalactopyranoside (IPTG) was added to a concentration of 1 mM when the optical density at 600 nm reached 0.6. The cells were cultivated for a further 12 h, and then harvested by centrifugation at 6000 rpm for 10 min. The resulting cell pellet was re-suspended in 10 mM Tris-HCl (pH 8.0), 1 mM EDTA. After sonication (three 5-min cycles at 60 W) of this suspension in the presence of 1 mM phenylmethanesulfonyl fluoride (PMSF) and centrifugation (6000 rpm, 10 min), the recovered supernatant was dialyzed twice against 20 mM phosphate buffer, pH 7.5, 300 mM NaCl. The protein was loaded onto a nickel affinity column (Qiagen), which was equilibrated with the dialysis buffer, and eluted with an imidazole-containing buffer (20 mM sodium phosphate buffer, pH 7.4, 300 mM NaCl, 500 mM imidazole). The protein specimens were further purified by gel filtration on a Sephacryl S-300 (Bio-Rad) column (400 ml) equilibrated with 20 mM sodium phosphate (pH 7.4), 150 mM NaCl. The gel filtration was performed with 20 mM sodium phosphate (pH 7.4), 150 mM NaCl and, in the case of purification of the gp43 and gp45 complexes, 500 mM imidazole. All protein fractions were monitored by absorption at 280 nm and subjected to 10% or 12.5% SDS-PAGE. The purification procedures described above were carried out at 4°C. The purified proteins were stored at −80°C.
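The centrifugation steps above are given in rpm, which fixes the force on the sample only once the rotor radius is known. As a minimal sketch, the conversion to relative centrifugal force can be written as follows. The 10 cm radius is a hypothetical value chosen for illustration, since the rotor is not stated in the text, and the function name is likewise illustrative.

```python
import math

G0 = 9.80665  # standard gravity, m/s^2


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (multiples of g) at a given speed and radius."""
    omega = 2.0 * math.pi * rpm / 60.0  # angular velocity in rad/s
    return omega ** 2 * (radius_cm / 100.0) / G0


# Hypothetical rotor radius; the actual rotor is not given in the text.
ASSUMED_RADIUS_CM = 10.0

rcf = rcf_from_rpm(6000, ASSUMED_RADIUS_CM)
print(f"6000 rpm at {ASSUMED_RADIUS_CM} cm corresponds to about {rcf:.0f} x g")
```

Because the force scales with the square of the rotor speed and linearly with the radius, the same rpm setting on a different rotor can give a noticeably different force.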

Identification of a Subunit Complex
The IPTG-induced cells expressing the individual subunits were mixed in several combinations of histidine-tag fused and non-tagged subunits. After each mixture was lysed, the cell lysate was applied to a nickel affinity column as described above, and any histidine-tag fused subunit bound to a non-tagged subunit was recovered.

Analytical Ultracentrifugation (AUC)
AUC was performed with an Optima XL-I (Beckman, CA) with an eight-hole An50Ti rotor at 20°C with standard double-sector centerpieces and quartz windows. The sedimentation velocity experiment was carried out in 20 mM sodium phosphate (pH 7.4) and 150 mM NaCl at a rotor speed of 40,000 rpm. The acquired data were analyzed using the SEDFIT program [25] [26] to determine the molecular weight. For these calculations, the partial specific volume was estimated from the amino acid composition using the program SEDNTERP [27].
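For orientation, the relation between a sedimentation coefficient and a molar mass can be sketched with the classical Svedberg equation, M = sRT / (D(1 − v̄ρ)). This is a simplified illustration and not a reproduction of the SEDFIT analysis used here. The numerical inputs below (s, D, v̄, ρ) are hypothetical placeholders, not values measured in this study.

```python
R_GAS = 8.314462618  # molar gas constant, J/(mol K)


def svedberg_molar_mass(s, d, vbar, rho, temp_k):
    """Molar mass in g/mol from the Svedberg equation.

    s      : sedimentation coefficient in seconds (1 S = 1e-13 s)
    d      : diffusion coefficient in m^2/s
    vbar   : partial specific volume in m^3/kg
    rho    : solvent density in kg/m^3
    temp_k : absolute temperature in K
    """
    buoyancy = 1.0 - vbar * rho
    if buoyancy <= 0:
        raise ValueError("1 - vbar*rho must be positive for sedimentation")
    # s*R*T / (D * buoyancy) is in kg/mol; multiply by 1000 for g/mol.
    return 1000.0 * s * R_GAS * temp_k / (d * buoyancy)


# Hypothetical inputs for illustration only (20 degrees C, as in the experiment).
example = svedberg_molar_mass(
    s=4.0e-13, d=6.5e-11, vbar=0.73e-3, rho=1000.0, temp_k=293.15
)
print(f"Estimated molar mass: {example:.0f} g/mol")
```

The buoyancy term (1 − v̄ρ) is where the partial specific volume estimated from the amino acid composition enters the calculation.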

Circular Dichroism (CD) Spectroscopy
CD spectra were recorded on a J-820 spectropolarimeter (Jasco) at 20°C. Each spectrum was calculated as an average of twenty scans from 200 to 250 nm in a 1-mm path-length quartz cuvette. The CD spectra were analyzed with the K2D2 server (http://k2d2.ogic.ca/) in order to estimate the secondary structure.

Results and Discussion
Previously, we reported on the purification and structural determination of gp44 and gp45. The gp44 forms a trimer exhibiting a central hub-like structure with an inner diameter of 25 Å through which genomic DNA presumably passes during infection [22] [24]. The gp45 is a central tail spike that shows irreversible binding activity to the host cell membrane during infection [23] [28]. We tried to investigate the Mu baseplate assembly pathway, but our preliminary experiments showed that gp44 and gp45 did not form a complex directly. Classical electron microscopic experiments using conditional lethal mutants and genome analysis suggested that the remaining Mu baseplate subunits were gp42, gp43, gp46, gp47, and gp48. Therefore, we next prepared recombinant expression and purification systems for each subunit and tried to determine their stoichiometries using AUC. Although the Mu phage is usually cultivated at 30°C, recombinant expression of the baseplate subunits at 30°C or 37°C often resulted in insoluble subunit fractions. Finally, we found that low-temperature cultivation at 18°C yielded soluble gp43, gp46, gp47, and a small amount of gp48. We also investigated subunit complexes using a nickel affinity column and a histidine-tag to identify the binding counterpart of each subunit. Various possible combinations of the Mu baseplate components were examined to determine whether a subunit complex was present.

Expression of gp42 Was Unsuccessful
From genome analysis and sequence homology, gp42 was annotated as a tape measure protein that is important for the assembly of phage tails and is involved in tail length determination [21]. For our gp42 expression experiments, we constructed full-length and several truncated gp42 expression vectors. In all of the expression experiments conducted with these vectors, the transformed E. coli cells showed growth inhibition after induction with 1 mM IPTG, and no gp42 expression was observed on SDS-PAGE (data not shown). Since there have been reports of tape measure proteins with a lysozyme domain [29] [30], we considered that Mu gp42 might also have such a domain, and that its lysozyme activity inhibited host cell growth. Two lysozyme-like enzymes in the public database, Haemophilus influenzae peptidoglycan transglycosylase and Pasteurella multocida peptidoglycan transglycosylase, showed significant sequence homology with gp42, with amino acid identities of 23% (99 a.a./432 a.a.) and 26% (113 a.a./433 a.a.), respectively. In fact, we sometimes found that the host E. coli cells were lysed when we attempted to induce expression of the C-terminal portion of gp42 (Y. Shimamori et al., unpublished results). Our additional data on the gp42 lysozyme activity will be reported elsewhere.
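The identity percentages quoted above follow directly from the residue counts given in parentheses. A minimal sketch of that arithmetic is shown below; the function and variable names are illustrative only, and the counts are taken from the text rather than recomputed from an alignment.

```python
def percent_identity(identical: int, aligned: int) -> float:
    """Percentage of identical residues over the compared length."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    return 100.0 * identical / aligned


# (identical residues, compared residues) as quoted in the text.
hits = {
    "Haemophilus influenzae transglycosylase": (99, 432),
    "Pasteurella multocida transglycosylase": (113, 433),
}

for name, (identical, aligned) in hits.items():
    print(f"{name}: {percent_identity(identical, aligned):.1f}% identity")
```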

Purification and AUC of gp43
We successfully expressed and purified gp43-His by using a nickel affinity column and gel filtration chromatography. The expression yield was approximately 3 mg from 1 L of culture and the solubility was at least 2 mg/ml. The molecular weight determined by AUC was 53,431 ± 3312 (Figure 1). By comparing this value with the molecular weight of 52,863 calculated from the primary sequence, we concluded that gp43 was a monomer in solution. The secondary structure estimated from the CD spectrum was 46% α-helix and 38% β-strand.

Purification and AUC of gp46
We purified gp46-His by using a nickel affinity column and gel filtration chromatography (Figure 2), but the solubility of gp46-His was low and clearly depended on the solution pH. When extracts of gp46-His-expressing cells were prepared, a higher pH resulted in a greater amount of gp46 in the supernatant. A comparison of the extraction experiments performed at pH 8.5 and pH 7.4 revealed that the yield was approximately 5-fold greater at pH 8.5. The expression yield was approximately 0.5 mg from 1 L of culture and the solubility was at least 1 mg/ml at pH 8.5. The molecular weight determined by AUC was 16,824 ± 1737 and that calculated from the primary sequence was 18,093. This result indicated that gp46 existed as a monomer in solution. The secondary structure estimated from the CD spectrum was 39% α-helix and 32% β-strand.

Purification and AUC of gp47
In the purification of gp47-His, we obtained two peaks in gel filtration chromatography. The lower molecular weight peak gave a single band in SDS-PAGE and a single component in AUC with a molecular weight of 38,724 ± 2881 (Figure 3). Although the higher molecular weight peak also gave a single band in SDS-PAGE (data not shown), we observed two molecular weight species in AUC, one with a molecular weight of 42,541 ± 2867 and the other with a molecular weight of 83,972 ± 4393. Judging from the molecular weight of 39,512 calculated from the primary sequence, gp47 formed both a monomer and a dimer in solution. The expression yield was approximately 2 mg from 1 L of culture and the solubility was at least 2 mg/ml. The secondary structure estimated from the CD spectrum was 36% α-helix and 51% β-strand.
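The monomer and dimer assignments for gp43, gp46, and gp47 rest on comparing each AUC molecular weight with the monomer weight calculated from the primary sequence. The sketch below restates that comparison using the values reported in the text. Rounding the ratio to the nearest integer is an assumption made here for illustration and is not necessarily the criterion the authors applied.

```python
# Monomer molecular weights calculated from the primary sequence (from the text).
CALCULATED = {"gp43": 52863, "gp46": 18093, "gp47": 39512}

# (label, subunit, molecular weight determined by AUC), as reported in the text.
MEASURED = [
    ("gp43", "gp43", 53431),
    ("gp46", "gp46", 16824),
    ("gp47, lower peak", "gp47", 38724),
    ("gp47, higher peak, species 1", "gp47", 42541),
    ("gp47, higher peak, species 2", "gp47", 83972),
]


def oligomer_number(measured: float, monomer: float) -> int:
    """Nearest whole number of monomers (illustrative rounding rule)."""
    return max(1, round(measured / monomer))


for label, subunit, measured in MEASURED:
    ratio = measured / CALCULATED[subunit]
    n = oligomer_number(measured, CALCULATED[subunit])
    print(f"{label}: ratio {ratio:.2f}, nearest oligomer n = {n}")
```

With these inputs, only the heavier species in the higher molecular weight peak of gp47 rounds to two monomers, which matches the dimer assignment in the text.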

Purified gp48 Formed Aggregates
We purified gp48-His by using a nickel affinity column and gel filtration chromatography (Figure 4). However, the solubility of gp48-His was very low and the protein readily formed precipitated aggregates. The expression yield was 0.5 mg from 1 L of culture and the solubility was less than 0.2 mg/ml. In AUC, we observed only aggregates of gp48-His and, therefore, could not determine its stoichiometry in solution. The secondary structure estimated from the CD spectrum was 43% α-helix and 41% β-strand.

Identification of a Subunit Complex
Next, to investigate the molecular assembly of the Mu phage baseplate, we mixed the purified baseplate subunits and attempted to identify a subunit complex. However, we often lost proteins during the gel filtration chromatography used to identify the complex. Since aggregation of the purified subunits was frequently observed, we considered that the aggregates formed might be trapped in the column. This possibility led us to change our strategy. To produce a subunit complex before aggregates and precipitates could form, different cells each containing an individual overexpressed subunit were mixed and then lysed [7]. These cell extracts were applied to a nickel affinity column to isolate complexes of a histidine-tag fused subunit and a non-tagged subunit.
Unfortunately, gp48 was an exception and was excluded from this experiment, because the recoveries of soluble gp48 and gp48-His were very low.

Screening of gp43-His-Binding Subunits with gp44, gp45, and gp47
The assembly counterpart of gp43-His was identified by mixing cells expressing gp43-His with cells expressing non-tagged gp44, gp45, and gp47 at the same time. The bound fraction of the nickel affinity column was analyzed by SDS-PAGE and shown to contain gp43-His and gp45. However, when this protein fraction was purified on a gel filtration column with 20 mM sodium phosphate (pH 7.4) and 150 mM NaCl, we retrieved only gp43-His, and the recovery of gp45 was very low. Since the gp43-His + gp45 complex was stable in the imidazole-containing elution solution for the nickel affinity column, we added imidazole to the gel filtration running buffer. Finally, the gp43-His + gp45 complex was purified using gel filtration chromatography with 20 mM sodium phosphate (pH 7.4), 150 mM NaCl, and 500 mM imidazole (Figure 5(a)). We have reported that gp45 was purified with 50 mM Tris-HCl (pH 7.4) and 500 mM arginine [28]. Considering these results together with our present findings, we concluded that basic compounds were effective for stabilizing gp45. However, even when we used these basic compounds, the gp43-His + gp45 complex aggregated during AUC. Thus, we could not measure its stoichiometry in solution. An intermediate complex in a molecular assembly process is generally unstable, and this instability drives the assembly toward the completed, stable structure. This property made the assembly intermediate difficult to investigate.

Screening of gp45-His Binding Subunits with gp43, gp44, and gp47
The transformed E. coli cells expressing gp45-His were mixed with other cells producing the baseplate subunits gp43, gp44, and gp47 at the same time. The extracted fraction of the cell mixture was applied to a nickel affinity column. The elution fractions were examined by SDS-PAGE, and the results indicated that gp45-His bound to gp43. These fractions were further purified using gel filtration chromatography in the presence of 500 mM imidazole. Although contaminating host E. coli proteins were still present, as in our previous preparation of recombinant gp45 [28], gp45-His and gp43 remained in complex after the gel filtration chromatography (Figure 5(b)). Together with the results for gp45 and gp43-His, we thus observed full-length gp43 + gp45 complexes using both gp43-His + gp45 and gp43 + gp45-His. Moreover, we sometimes detected complexes of gp43 and truncated gp45-His during gel filtration chromatography when imidazole was not added and gp45-His was degraded (data not shown). N-terminal sequencing showed that this truncated gp45-His lacked its N-terminal domain and was similar to the gp45 C-terminal domain that was used for X-ray crystallography [23] [28]. We are now planning to use a set of systematic N-terminal deletion mutants of gp45 to investigate the binding region for gp43.

Screening of gp46-His Binding Subunits with gp43, gp44, gp45, and gp47
Using the strategy described above, we tried to identify an intermediate complex including gp46-His from among gp43, gp44, gp45, and gp47. Unfortunately, we found only gp46-His in the SDS-PAGE results for the elution fractions of the nickel affinity column chromatography. We therefore considered that gp46 participated in a later part of the assembly sequence of the baseplate.

Screening of gp47-His Binding Subunits with gp43, gp44, and gp45
We also used the same strategy for the isolation of gp47-His-containing intermediate complexes using gp43, gp44, and gp45. Since we did not observe any subunit bound to gp47-His in the nickel column fractions, we considered that gp47 was incorporated into the baseplate in a later part of the sequential process, just as gp46 was.

The Mu Baseplate Assembly Is Assumed to Start from a gp43 + gp45 Complex
The previous T4 phage reconstitution studies [7] concluded that the expressed protein that associates at the last step of the assembly is an appropriate choice for fusion with the histidine-tag to facilitate isolation of the intermediate complexes. According to this consideration and our observation that both a gp43-His + gp45 complex and a gp45-His + gp43 complex were found, a gp43 + gp45 complex would be the initial complex of the Mu baseplate assembly pathway. In other words, if a gp43 + gp45 complex was not the initial complex, we would expect to observe either a gp43-His + gp45 or a gp45-His + gp43 complex, but not both. By the same reasoning, gp46 and gp47 were not thought to be the third subunit, because we observed neither a gp46-His + gp43 + gp45 complex nor a gp47-His + gp43 + gp45 complex. Finally, we considered that the probable third subunit would be gp42, gp44, or gp48. To confirm these expectations, we are now constructing co-expression vectors for various combinations of several subunits. These investigations are expected to provide new details of the Mu baseplate assembly sequence.
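The reasoning above can be restated as simple bookkeeping over the four pull-down screens: a pair is treated as a candidate intermediate when each partner, used as the histidine-tagged bait, recovers the other. The sketch below encodes the outcomes reported in the text. The data structure and function names are hypothetical and serve only to illustrate the logic, not to describe software used in the study.

```python
# Outcome of each nickel-affinity screen as reported in the text:
# tagged bait -> non-tagged subunits offered and subunits recovered with it.
PULL_DOWNS = {
    "gp43": {"offered": {"gp44", "gp45", "gp47"}, "recovered": {"gp45"}},
    "gp45": {"offered": {"gp43", "gp44", "gp47"}, "recovered": {"gp43"}},
    "gp46": {"offered": {"gp43", "gp44", "gp45", "gp47"}, "recovered": set()},
    "gp47": {"offered": {"gp43", "gp44", "gp45"}, "recovered": set()},
}


def reciprocal_pairs(results):
    """Pairs in which each subunit, as bait, recovered the other."""
    pairs = set()
    for bait, outcome in results.items():
        for prey in outcome["recovered"]:
            if prey in results and bait in results[prey]["recovered"]:
                pairs.add(frozenset((bait, prey)))
    return pairs


def unbound_baits(results):
    """Baits that recovered no partner in their screen."""
    return sorted(b for b, outcome in results.items() if not outcome["recovered"])


for pair in reciprocal_pairs(PULL_DOWNS):
    print("Reciprocally confirmed pair:", " + ".join(sorted(pair)))
print("Baits with no recovered partner:", ", ".join(unbound_baits(PULL_DOWNS)))
```

Only gp43 and gp45 satisfy the reciprocal criterion, while gp46 and gp47 recover no partner, which is consistent with their proposed later entry into the assembly sequence.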

Figure 1. (a) SDS-PAGE (12.5%) analysis of the gp43-His prepared by gel filtration chromatography showed that the protein specimen was a single fraction in the electrophoresis; (b) SEDFIT analysis of AUC of the purified gp43-His. The rotor speed was 40,000 rpm. Moving boundaries were measured at 280 nm, 20°C. The molecular weight determined by AUC was 53,431 ± 3312 and the molecular weight expected from the primary sequence was 52,863; (c) CD spectrum of the gp43-His.

Figure 2. (a) SDS-PAGE (12.5%) analysis of the gp46-His prepared by gel filtration chromatography showed that the protein specimen was a single fraction in the electrophoresis; (b) SEDFIT analysis of AUC of the purified gp46-His. The rotor speed was 40,000 rpm. Moving boundaries were measured at 280 nm, 20°C. The molecular weight determined by AUC was 16,824 ± 1737 and that expected from the primary sequence was 18,093; (c) CD spectrum of the gp46-His.

Figure 3. (a) SDS-PAGE (12.5%) analysis of the lower molecular weight peak of gp47-His prepared by gel filtration chromatography. The results showed that the protein specimen was a single fraction in the electrophoresis; (b) SEDFIT analysis of AUC of gp47-His in the lower molecular weight peak of the gel filtration chromatography. The molecular weight determined by AUC was 38,724 ± 2881 and that expected from the primary sequence was 39,512; (c) SEDFIT analysis of AUC of gp47-His in the higher molecular weight peak of the gel filtration chromatography. The molecular weights determined by AUC were 42,541 ± 2867 and 83,972 ± 4393; (d) CD spectrum of the gp47-His.

Figure 4. (a) SDS-PAGE (12.5%) analysis of the gp48-His prepared by gel filtration chromatography showed that the protein specimen was a single fraction in the electrophoresis. The molecular weight calculated from the primary sequence was 21,290; (b) CD spectrum of the gp48-His.

Figure 5. (a) SDS-PAGE (10%) analysis of the gp43-His + gp45 complex prepared by gel filtration chromatography. The arrows correspond to gp43-His (upper) and gp45 (lower); (b) SDS-PAGE (12.5%) analysis of the gp45-His + gp43 complex prepared by gel filtration chromatography. The arrows correspond to gp45-His (lower) and gp43 (upper). In addition to gp45-His and gp43, several contaminating proteins originating from host E. coli were present. This problem also occurred when recombinant gp45-His was prepared previously [28].