Source Anomalous Scattering Using Cr Kα Radiation for Its Potential Application in Determining Macromolecular Structures

Obtaining phase information is a bottleneck in solving macromolecular structures by X-ray crystallography. Anomalous dispersion has long been recognized as a powerful tool for phasing macromolecular structures, although it was used mainly to supplement isomorphous replacement or to locate the anomalous scatterer itself. The first step in solving a macromolecular structure by SAD (single-wavelength anomalous diffraction) is to locate the anomalous scatterers. The SAD method for experimental phasing has evolved substantially in recent years. A phasing tool, 5-amino-2,4,6-triiodoisophthalic acid (I3C, the "magic triangle"), was incorporated into three proteins (lysozyme, glucose isomerase and thermolysin) by quick-soaking and co-crystallization in order to understand the binding of metal ions to proteins. The high quality of the diffraction data, the use of chromium-anode X-ray radiation and a sufficient anomalous signal paved the way for successful structure determination and automated model building. The sulfur and iodine anomalous signals at the Cr Kα wavelength are analyzed and compared.


Introduction
Current structural genomics projects aim to solve a large number of selected protein structures as fast as possible. A high degree of automation and standardization is required at every step of the process to speed up protein structure determination. It is not easy to obtain, in an automated way, crystal derivatives that are appropriate for phasing. New ideas have been put forward that aim to make the phasing of novel structures easier and more amenable to routine and automatic treatment [1]. The phase problem is a bottleneck in macromolecular structure determination, as is model building, which is a time-consuming task. Phases can be derived from some knowledge of the molecular structure. Structures of small proteins (molecular weight less than 10 kDa) can be determined in solution using nuclear magnetic resonance (NMR) spectroscopy, and the assembly of proteins in a complex can be studied using electron microscopy, but only X-ray diffraction allows the three-dimensional structure of both small and large proteins to be determined with a precision of about 0.1-0.2 Å. In macromolecular crystallography, the phases are derived either by the Molecular Replacement (MR) method, using the atomic coordinates of a structurally similar protein, or by locating the positions of heavy atoms that are intrinsic to the protein or that have been added (MIR, MIRAS, SIR, SIRAS, MAD and SAD) [2][3][4][5].
MIR is a classic method of solving novel crystal structures of macromolecules and has been responsible for much of the success of structural biology since the early days of protein crystallography. MR has also been widely used when appropriate models are available. Over the past decade, MAD has been a vehicle of progress in phasing new crystal structures. Both MIR and MAD require the presence of either appropriate heavy atoms or anomalous scatterers, which are naturally occurring or specifically introduced into the macromolecule [6]. The standard method of derivatization in MIR involves soaking or crystallizing the native crystals in dilute solutions of heavy-metal reagents. In MAD, selenium derivatization is carried out by genetic engineering, in which the naturally occurring methionines (Met) are replaced by selenomethionine (Se-Met). Both approaches have drawbacks because heavy-atom derivatization can result in non-isomorphism between the native and derivatized protein crystals [7]. Sometimes, several derivatives are required to achieve success. By collecting multiple-wavelength anomalous diffraction (MAD) data at two or more wavelengths, the phase angle can be determined unambiguously [8,9].
In comparison, the single-wavelength anomalous diffraction (SAD) technique requires only a single set of X-ray data to provide the positions of the anomalous scatterers, which together with density modification can reveal the structure of the complete protein.
The sulfur SAD phasing method allows the determination of protein structures de novo without reference to derivatives such as Se-Met [10,11]. For targets with a weak MR solution from which the structure cannot be determined, the MR solution can be incorporated into the SAD experiment, and the increased number of sites identified by combined SAD/MR can be used in a subsequent SAD experiment with no MR component to calculate the phases [12]. The number of protein structures phased from only a single set of diffraction data by the SAD method has increased owing to improved in-house X-ray sources, detectors and software. This technique has led to the routine use of anomalous scattering to obtain phase information either from intrinsic sulfur or phosphorus atoms present in macromolecules or from heavy-metal reagents added to the native protein crystal by soaking or co-crystallization [13,14]. In its purest form, SAD can simply utilize the intrinsic anomalous scatterers present in the macromolecule, such as the sulfur atoms of cysteine and methionine or bound ions [15]. The challenge is to maximize and measure the small signal, since the Bijvoet ratio can be as low as 1% [16,17]. Both copper and chromium anodes (wavelengths 1.54 Å and 2.29 Å) have been increasingly employed for this purpose in laboratory X-ray sources with much success [18,19]. SAD phasing has already been carried out with anomalous scatterers such as mercury [20], uranium [21], iodine [22] and a tantalum bromide cluster [23] incorporated into the crystal lattice. However, heavy-atom derivatives suffer from nonspecific binding, which results in low occupancy of the heavy-atom sites and, in turn, a weak anomalous signal, disruption of the crystal lattice and failed derivatization. Surprisingly, it has been observed that short halide soaks can improve the crystal diffraction [24]. Many such soaks also require the use of toxic chemicals and stringent safety precautions [25]. Exploiting the anomalous signal already present in the native protein or in the solvent would eliminate the extra experimental work of derivatization and would also eliminate the risk of non-isomorphism. Phasing using the anomalous signal of sulfur alone has been achieved earlier [26]. Wavelengths longer than Cu Kα (1.54 Å) would produce a larger signal, but at the same time experimental difficulties may increase, as does the noise level in the data [27]. It has recently been reported that data collection wavelengths in the range λ = 1.5-3.0 Å are fairly easy to handle in a diffraction experiment, even at home sources using Cr Kα radiation [28,29].
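The size of the expected signal can be estimated with the widely used approximation ⟨|ΔF|⟩/⟨F⟩ ≈ √(2·N_A/N_P)·f″/Z_eff, where N_A is the number of anomalous scatterers, N_P the number of non-hydrogen protein atoms and Z_eff ≈ 6.7 e the effective scattering factor of an average protein atom. The sketch below is only illustrative: the 1000-atom protein with 10 sulfur atoms is hypothetical, full occupancy is assumed, and the f″ values are those quoted later in this paper for the Cr Kα wavelength. Summing N·f″² over several scatterer types is an assumed extension of the single-element formula.

```python
import math

Z_EFF = 6.7  # assumed effective scattering factor (e) of an average non-H protein atom


def bijvoet_ratio(scatterers, n_protein_atoms, z_eff=Z_EFF):
    """Estimate <|dF|>/<F> at zero scattering angle.

    scatterers: iterable of (count, f_double_prime) pairs, one per element.
    n_protein_atoms: number of non-hydrogen atoms in the asymmetric unit.
    """
    weighted = sum(count * f2 ** 2 for count, f2 in scatterers)
    return math.sqrt(2.0 * weighted / n_protein_atoms) / z_eff


# Hypothetical protein: 1000 non-H atoms, 10 sulfur atoms, f''(S) = 1.14 e.
native = bijvoet_ratio([(10, 1.14)], n_protein_atoms=1000)

# Same hypothetical protein plus one fully occupied I3C (3 iodines, f''(I) = 12.82 e).
with_i3c = bijvoet_ratio([(10, 1.14), (3, 12.82)], n_protein_atoms=1000)

print(f"native: {native:.1%}, with I3C: {with_i3c:.1%}")
```

Under these assumptions the sulfur-only estimate is a few percent, while three iodine atoms raise it several-fold. Partial occupancy would lower the iodine contribution accordingly.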
Chromium-anode X-ray radiation is very useful for SAD experiments. The anomalous scattering signal at this wavelength is more than doubled for various metals when compared with the conventional copper characteristic wavelength. Furthermore, naturally bound metals and atoms from crystallization solutions tend to show a significant increase in anomalous scattering with chromium radiation [30]. Improved data quality helps in exploiting the weak anomalous signal derived only from sulfur atoms or, in particular, from halide ions incorporated by soaking. Therefore, a new class of compound, 5-amino-2,4,6-triiodoisophthalic acid (I3C), which combines heavy atoms for phasing with functional groups for specific interaction(s) with biological macromolecules, was used to give rise to strong anomalous signals using in-house Cr Kα radiation. I3C contains three iodine atoms arranged in an equilateral triangle (6.1 Å per side) (Figure 1).
I3C has low toxicity compared with other heavy-atom reagents. For the present study, I3C was incorporated into three proteins, viz. lysozyme (HEWL), glucose isomerase (GI) and thermolysin (TL), by quick-soaking and co-crystallization. Lysozyme and glucose isomerase contain a higher proportion of sulfur atoms than most proteins in the bacterial or eukaryotic proteomes, providing a favorable Bijvoet ratio [31]. Thermolysin contains fewer sulfur atoms than lysozyme and glucose isomerase. The crystals were successfully derivatized with I3C using soaking concentrations of 500 mM for lysozyme and 150 mM for glucose isomerase. Thermolysin was derivatized with I3C by co-crystallization. The functional groups of the compound interact well with the proteins via hydrogen bonds. The strong anomalous signal of the iodine atoms in I3C makes it a powerful phasing tool for in-house data.
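Because the three iodine atoms form an equilateral triangle, a list of candidate heavy-atom sites can be screened for triplets with the expected geometry. The following is a minimal sketch, not part of the authors' procedure: the function name, the 0.5 Å tolerance and the example coordinates are all illustrative assumptions, and the sites are assumed to be given in orthogonal (Cartesian) Å coordinates with symmetry equivalents already expanded.

```python
import math
from itertools import combinations


def find_equilateral_triangles(sites, side=6.1, tol=0.5):
    """Return index triplets whose three pairwise distances all lie within tol of side (Å)."""
    hits = []
    for i, j, k in combinations(range(len(sites)), 3):
        distances = (
            math.dist(sites[i], sites[j]),
            math.dist(sites[j], sites[k]),
            math.dist(sites[i], sites[k]),
        )
        if all(abs(d - side) <= tol for d in distances):
            hits.append((i, j, k))
    return hits


# Made-up coordinates (Å): three sites on an ideal 6.1 Å triangle plus one stray site.
height = 6.1 * math.sqrt(3) / 2
sites = [(0.0, 0.0, 0.0), (6.1, 0.0, 0.0), (3.05, height, 0.0), (20.0, 20.0, 20.0)]

triangles = find_equilateral_triangles(sites)
print(triangles)
```

A triplet that passes this test is only a candidate I3C site; it would still need to be confirmed against the anomalous difference map.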

Crystallization
The hen egg white lysozyme crystallization droplet consisted of 2 µl protein solution (20 mg/ml) and 1 µl reservoir solution [50 mM Sodium Acetate and 1 M Sodium Chloride, pH 4.7] and was equilibrated against 1 ml well solution at 25 °C. The glucose isomerase crystallization droplet consisted of 2 µl protein solution (33 mg/ml) and 1 µl reservoir solution [200 mM Magnesium Chloride and 100 mM Tris, pH 4.7] and was equilibrated against 1 ml well solution at 25 °C. The thermolysin crystallization droplet consisted of 2 µl protein solution (25 mg/ml) and 1 µl reservoir solution [1.4 mM Calcium Acetate, 10 mM Zinc Acetate, 1 mM Sodium Nitrate and 50 mM Tris, pH 7.3] and was equilibrated against 1 ml well solution at 20 °C. Lysozyme, glucose isomerase and thermolysin crystals appeared after a day and belonged to the tetragonal P4₃2₁2, orthorhombic I222 and hexagonal P6₁22 space groups, respectively, with one molecule per asymmetric unit. All protein crystals were obtained using the hanging-drop vapour-diffusion method. Crystals of each protein were harvested for collecting native datasets and I3C quick-soaked or co-crystallized datasets (500 mM I3C, 150 mM I3C and 300 mM I3C).
Stock solutions of 1 M I3C were obtained by dissolving the solid material in 2 M lithium hydroxide solution [32] to deprotonate the carboxyl groups, thereby producing a salt with high solubility. If sodium or potassium hydroxide solution or an ammonia base is used, the resulting salt has limited solubility. The lysozyme crystal was soaked for about 45 seconds in 500 mM I3C solution. Glucose isomerase crystals were soaked in 500 mM, 400 mM, 300 mM and 250 mM I3C for various time periods, but the crystals eventually degraded. Therefore, the glucose isomerase crystal was soaked for 3 minutes 10 seconds (190 seconds) in 150 mM I3C solution. The lysozyme and glucose isomerase crystals were then back-soaked for 5 seconds in a cryosolution containing the same salt and buffer concentrations with 30% Glycerol and 25% MPD (2-methyl-2,4-pentanediol), respectively. The thermolysin crystal was grown by co-crystallization, adding 0.5 µl of 300 mM I3C to the crystallization drop itself. The thermolysin crystals grown with I3C were cryo-soaked in [10 mM Calcium Acetate, 7% (v/v) DMSO, 20% (v/v) Glycerol and 10 mM Tris, pH 7.3]. The crystals were then flash-cooled in liquid nitrogen (100 K).

Data Collection and Processing
Six datasets (native and 500 mM I3C for lysozyme; native and 150 mM I3C for glucose isomerase; native and 300 mM I3C for thermolysin) were collected separately using a Rigaku R-Axis IV++ image plate detector equipped with a Cr Kα (2.29 Å) anode X-ray generator operated at 45 kV and 45 mA. The crystals diffracted up to 2.53 Å resolution. In each case, 360 frames were collected at a crystal-to-detector distance of 110 mm with 0.5° oscillation steps and an exposure time of 180 seconds per frame. The intensities were integrated with HKL2000 [33], refining all parameters including crystal mosaicity. Scaling and merging were also done with the same package.
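The data collection parameters above fix a few derived quantities that are easy to check: the total rotation range, the total exposure time and, through Bragg's law λ = 2d·sin θ, the scattering angle of the highest-resolution reflections. The short sketch below works these out from the values in the text. The radius at which the 2.53 Å reflections fall assumes a flat detector perpendicular to the beam with no 2θ offset, which is an assumption and not stated in the text.

```python
import math

WAVELENGTH = 2.29   # Å, Cr K-alpha (from the text)
D_MIN = 2.53        # Å, resolution limit (from the text)
DISTANCE = 110.0    # mm, crystal-to-detector distance (from the text)
N_FRAMES = 360      # frames per dataset
OSCILLATION = 0.5   # degrees per frame
EXPOSURE = 180.0    # seconds per frame

total_rotation = N_FRAMES * OSCILLATION          # degrees of crystal rotation
total_exposure_h = N_FRAMES * EXPOSURE / 3600.0  # hours of exposure, excluding readout

# Bragg's law: lambda = 2 d sin(theta)
theta = math.asin(WAVELENGTH / (2.0 * D_MIN))
two_theta_deg = math.degrees(2.0 * theta)

# Assumed geometry: flat detector normal to the beam, zero 2-theta offset.
radius_mm = DISTANCE * math.tan(2.0 * theta)

print(f"rotation range: {total_rotation:.0f} deg")
print(f"exposure time:  {total_exposure_h:.0f} h per dataset")
print(f"2-theta at {D_MIN} Å: {two_theta_deg:.1f} deg")
print(f"radius on detector: {radius_mm:.0f} mm")
```

The large scattering angle at this wavelength illustrates why a short crystal-to-detector distance is needed to record data to 2.53 Å with Cr Kα radiation.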

Substructure Solution and Data Analysis
The possibility of locating the anomalous scatterers using the dual-space recycling algorithm implemented in SHELXD depends on the significance of the anomalous signal present in the data [34][35][36]. For the location of substructures of anomalous scatterers with SHELXD, only the internal loop, which relies on the strongest E magnitudes, is used. The success rate of SHELXD solutions depends critically on data quality and on the redundancy of the measurements. Using the direct-methods program SHELXD, it was possible to obtain the positions of the anomalous scatterers from the anomalous signal contained in all the diffraction data. Density modification with SHELXE [37,38] resulted in high-quality starting phases. Model building was performed with ARP/wARP [39] and refinement with REFMAC [40], both available in the CCP4i suite [41]. Figures were prepared using PyMOL [42].
At the chromium wavelength of 2.29 Å, sulfur has an f″ value of 1.14 e. SHELXD found seven sulfur atoms and eight chloride ions (the latter appearing as weaker peaks in the substructure solution) as anomalous scatterers in the native lysozyme data (Figure 2(a)). Chlorine, a halide lighter than iodine, has its K edge at a long wavelength (4.39 Å) and displays only a small anomalous effect [43][44][45][46]. For the native glucose isomerase data, one manganese and nine sulfur atoms were located as anomalous scatterers using SHELXD and used for phasing (Figure 2(b)). Similarly, for the native thermolysin data, one zinc ion, four calcium ions and two sulfur atoms were located using SHELXD and used directly for phasing (Figure 2(c)). The sites were given as input to SHELXE to obtain the electron density maps. The final density-modified maps were submitted to the ARP/wARP web server [47] for automatic chain tracing and model building. The electron density map allowed ARP/wARP to build 122 residues out of a total of 129 amino acids for native lysozyme, with four disulfide bridges. Similarly, 388 residues were automatically built out of a total of 389 amino acids for native glucose isomerase, with a single disulfide bridge. For native thermolysin, ARP/wARP automatically built 314 residues out of a total of 316.
Iodine retains a significant anomalous signal (f″ = 12.82 e) at the chromium characteristic wavelength (2.29 Å). For the I3C-soaked lysozyme dataset, the substructure solution gave heavy-atom sites for the iodines of I3C, which formed an equilateral triangle, together with eight sulfur atoms and eight chloride ions (Figure 3(a)). For the I3C-soaked glucose isomerase dataset, twelve sulfur atoms, one manganese ion and an equilateral triangle (I3C) were determined using SHELXD (Figure 3(b)). For the I3C co-crystallized thermolysin dataset, two calcium ions, one zinc ion, two sulfur atoms and one I3C molecule were determined using SHELXD (Figure 3(c)). Density modification was carried out using SHELXE and the resulting map was given as input to ARP/wARP. The final ARP/wARP conventional and free R factors obtained with REFMAC were 20.3% and 22.4% (lysozyme), 21.4% and 24.9% (glucose isomerase) and 20.1% and 22.4% (thermolysin), respectively. The iterative free-atom density modification and model-building procedure reliably built 123 of the 129 residues of lysozyme, 386 of the 389 residues of glucose isomerase and 312 of the 316 residues of thermolysin. Only halide sites corresponding to peaks higher than 5σ in the anomalous map were included.
The solvent contents of lysozyme, glucose isomerase and thermolysin are 37%, 55% and 46%, respectively. In all the cases discussed above, it was possible not only to locate the anomalous scatterers but also subsequently to solve the protein model by SAD phasing. All the collected datasets are of good quality and are close to 100% complete. In all the above-mentioned structures, the asymmetric unit contains only a monomer. Longer wavelengths not only provide an increased anomalous signal for phase determination but also allow a much clearer definition of substructures, including their positions and occupancies, which may turn out to be very important for elucidating the function of a molecule.
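The residue counts reported for automatic model building can be turned into percentages for a quick side-by-side comparison of the six datasets. The snippet below is simple bookkeeping on the numbers given in the text. The dictionary labels are illustrative names, not identifiers used by ARP/wARP.

```python
# (residues built automatically, residues in the sequence), as reported in the text
built = {
    "lysozyme, native": (122, 129),
    "glucose isomerase, native": (388, 389),
    "thermolysin, native": (314, 316),
    "lysozyme, I3C": (123, 129),
    "glucose isomerase, I3C": (386, 389),
    "thermolysin, I3C": (312, 316),
}

# Percentage of the sequence that was traced in each case.
percent_built = {name: 100.0 * n / total for name, (n, total) in built.items()}

for name, pct in percent_built.items():
    print(f"{name}: {pct:.1f}% of residues built")
```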

Native Sulfur Binding Sites
By contrast, sulfur is present in almost all proteins. It is heavier than the other elements (C, N and O) found in most proteins and displays some anomalous signal. Phasing a protein using only the inherent anomalous signal derived from the sulfur atoms of cysteines and methionines and from ions bound in the ordered solvent region was possible for the lysozyme and glucose isomerase datasets, which had a redundancy of 10 and above at a wavelength of 2.29 Å. The structure of the 129-residue tetragonal lysozyme (P4₃2₁2) was phased using only the anomalous signal derived from seven sulfurs in the protein and eight coordinated chloride anions, with high redundancy. Similarly, the structure of orthorhombic glucose isomerase was phased using one manganese and nine sulfur atoms, from eight methionines and one cysteine among the 389 residues of the protein. The structure of the 316-residue hexagonal thermolysin (P6₁22) was phased using the anomalous signal derived from two sulfur atoms, one zinc ion and four calcium ions.
The metal ions originated from the crystallization buffer used for each protein.

I3C Binding Sites
One binding site for I3C was observed in each of lysozyme, glucose isomerase and thermolysin. The occupancies of the three halogen atoms per site were (0.70, 0.66 and 0.62) for lysozyme, (0.65, 0.60 and 0.56) for glucose isomerase and (0.81, 0.77 and 0.71) for thermolysin. Interestingly, the occupancy values of both proteins differ slightly, although similar soaking conditions were tried. The data from the I3C derivatives showed a significant anomalous signal-to-noise ratio (1.78% for lysozyme, 1.82% for glucose isomerase and 1.45% for thermolysin) throughout the entire resolution range to 2.53 Å. The interactions of I3C in lysozyme are very similar to those previously reported [21]. The I3C molecules mostly replace water molecules in the crystal lattice. Inspection of the I3C sites using PyMOL showed several hydrogen-bond interactions. The three functional groups of the phasing molecule formed hydrogen bonds with the side chains or the main chains of the amino acids. One carboxyl group interacts with an arginine residue (ARG 114). The same carboxyl group interacts with the oxygen and nitrogen atoms of an asparagine residue (ASN 37) (bifurcation) via water molecules. The hydroxyl group of the I3C molecule also interacts with the nitrogen of a lysine residue (LYS 33) (Figure 4(a)).

I3C-Cocrystallized
In the I3C-soaked glucose isomerase data, the amino group of I3C forms hydrogen bonds to the hydroxyl group present in the phenylalanine residue (PHE 296).
Similarly, the amino group also shows an interaction with a glycine residue (GLY 298) via a water molecule. The carboxyl group interacts with the oxygen atom of the aspartic acid residue (ASP 295) (Figure 4(b)). In the I3C co-crystallized thermolysin data, the hydroxyl group of I3C forms a hydrogen bond with a serine residue (SER 279) (Figure 4(c)). Sulfur atoms could also be located in the 500 mM I3C, 150 mM I3C and 300 mM I3C datasets collected to 2.53 Å resolution using Cr Kα radiation. Crystal data statistics, phasing and model building details are listed in Table 1. In all cases, the figure of merit was greater than 0.55. More than 96% of the residues and 92% of the side chains were placed automatically using the warpNtrace mode of ARP/wARP. Anomalous difference Fourier maps were computed at the 5σ level.
The concentration of iodides in the soaking solution seems to influence their occupancy more significantly. The iodine sites of the I3C ring make hydrogen-bonding contacts with hydrogen-donor groups of the protein or with water molecules. They tend to occupy ordered sites around the protein surface with varying occupancy, and therefore share these sites with nearby water molecules. This shows that I3C diffused easily into the protein crystals during quick-soaking and was readily incorporated during co-crystallization. The quick cryo-soaking and co-crystallization methods with halides described here may be an alternative method for phasing protein crystal structures.

Conclusion
Data quality is decisive for successful location of the anomalous substructure. The examples of successful SAD phasing based on the signal of weak anomalous scatterers such as sulfur atoms and chloride ions show that even the anomalous signal naturally present in a macromolecule is sufficient to solve crystal structures using in-house chromium-anode X-ray radiation. The results also indicate that phasing after a short soak in a buffer containing a halide salt, or after co-crystallization, is much easier and more likely to succeed. I3C represents a novel class of compound that interacts specifically with protein molecules and can be used for experimental phasing. It is a compound of choice, since its iodine atoms give rise to a strong anomalous signal for SAD phasing.

Figure 2. Anomalous map at 5 sigma level. (a) The peaks of seven sulfur atoms (big) and eight chloride ions (small) with water molecules in HEWL; (b) the peaks of nine sulfur atoms and one manganese ion with water molecules in GI; and (c) the peaks of four calcium ions, one zinc ion and two sulfur atoms with water molecules in TL.

Figure 3. Anomalous map at 5 sigma level. (a) The peaks of one I3C molecule, eight sulfur atoms (big) and eight chloride ions (small) with water molecules in HEWL; (b) the peaks of one I3C molecule, twelve sulfur atoms and one manganese ion with water molecules in GI; and (c) the peaks of two calcium ions, one zinc ion, two sulfur atoms and one I3C molecule with water molecules in TL.

Table 1. Crystal data statistics, phasing and model building details.
SN and DV thank UGC, G financial support for this research and UGC-SAP for funding facilities to the Centre for Advanced Study in Crystallography and Biophysics. Chromium datasets were collected at the X-ray facility, CCMB, Hyderabad, funded by the CSIR Facility Creation Project (FAC0004) as part of the Eleventh Five Year Plan. SN and DV thank Dr. R. Shankaranarayanan for extending his lab facilities to collect anomalous scattering datasets using Cr Kα radiation.