Comparative Molecular Mechanics and Quantum Mechanics Study of Microhydration of Nucleic Acid Bases

DNA is the most important biological molecule, and its hydration contributes essentially to the structure and functions of the double helix. We analyze the microhydration of the individual bases of nucleic acids and their methyl derivatives using methods of molecular mechanics (MM) with the Poltev-Malenkov (PM), AMBER and OPLS force fields, as well as ab initio Quantum Mechanics (QM) calculations at MP2/6-31G(d,p) level of theory. A comparison is made between the calculated interaction energies and the experimental enthalpies of microhydration of bases, obtained from mass spectrometry at low temperatures. Each local water-base interaction energy minimum obtained with MM corresponds to the minimum obtained with QM. General qualitative agreement was observed in the geometrical characteristics of the local minima obtained via the two groups of methods. MM minima correspond to slightly more coplanar structures than those obtained via QM methods, and the absolute MM energy values overestimate corresponding values obtained with QM. For Adenine and Thymine the QM local minima energy values are closer to those obtained by the PM potential (average of 0.72 kcal/mol) than by the AMBER force field (1.86 kcal/mol). The differences in energy between MM and QM results are more pronounced for Guanine and Cytosine, especially for minima with the water molecule forming H-bonds with two proton-acceptor centers of the base. Such minima are the deepest ones obtained via MM methods while QM calculations result in the global minima corresponding to water molecule H-bonded to one acceptor and one donor site of the base. Calculations for trimethylated bases with a water molecule corroborate the MM results. The energy profiles were obtained with some degrees of freedom of the water molecule being frozen. This data will contribute to the improvement of the molecular mechanics force fields.


Introduction
Water is one of the most abundant chemical compounds on the planet, and it constitutes a high percentage of the cell composition. To understand the role of interactions of biomolecules with water in relation to their functions, it is essential to have a detailed description of the energetic and structural aspects of interactions of the molecules involved. The first data on DNA fibers obtained by X-ray diffraction showed that DNA is highly hydrated and the interactions with water are responsible for its conformational changes [1]. The microhydration of nucleic acid (NA) bases, i.e. interactions of the bases with separate water molecules, plays an important role in structural stabilization of the double helix. QM calculations (of HF, DFT, and MP2) performed for hydrated complexes of DNA bases revealed that the geometric properties of such complexes are extremely sensitive to the interactions with one or few water molecules [2][3][4], the presence of just one water molecule is enough to completely change the structure of a complex of nucleic acid bases in the global minimum.
From the analysis of experimental results on hydration of oligomeric DNA duplexes, Schneider and his group [5,6] evaluated the distribution of water molecules around the components of the NA by considering it as a "construction of hydrated blocks". A modular scheme for the hydration was suggested. It determines the average sites of water molecules around the components of the NA, and can generate predictive patterns for the distribution of water molecules around the NA fragments. The studies of microhydration of the individual components of nucleic acids, obtained by both experimental and theoretical methods complete this scheme. Quantitative evaluation of the sites of hydration also contributes to the improvement of Molecular Mechanics force fields [7,8].
Experimental spectroscopy studies have provided valuable data on the hydration of the components of the NA. The first studies of water clusters with nucleic bases using mass spectrometry in a primary ionization field were made by the group of Sukhodub, who determined the enthalpies of hydration of DNA bases and some of their derivatives [9]. Important mass spectroscopy experiments were performed by Kim et al. [10], where the threshold ionization energies for hydrated Adenine (A) and Thymine (T) bases were reported. Recent studies of UV photoionization in vacuum by a supersonic molecular beam using optical spectroscopy and comparison with theoretical results enabled the determination of the ionization energies of microhydrated DNA bases [11] and of tautomers of hydrated 9-methylguanine [12]. The studies of mononucleotide complexes with individual water molecules have been reported recently [13]. However, all these studies do not provide direct information about the structure and stabilization energy, and theoretical interpretation of the results is necessary. The experiments of Sukhodub et al. [9] represent the only exception, because they determine the gas-phase interaction energies of water-base complexes from the temperature dependences of the equilibrium constants of the association. However this method, in contrast to the theoretical results, does not specify the geometry of the complexes. So, despite the success of modern experimental methods, they still do not provide direct data for the detailed topology of the network of water-base hydrogen bonds (H-bonds). Thus, the computational methods of both molecular mechanics and quantum mechanics are the indispensable tools for the detailed study of the fine structure of hydration of nucleic acids.
The microhydration of the bases has been the subject of numerous theoretical studies by Monte Carlo [14][15][16] and Molecular Dynamics [17] techniques using different force fields. The hydration sites can be compared with quantum mechanics ab initio [12,[18][19][20][21][22][23][24] and DFT calculations for interactions of some water molecules with DNA bases and base pairs [4,[25][26][27]. The molecular mechanics calculations have demonstrated that the deepest minima of the interaction energy of a water molecule with nucleic bases correspond to the formation of a water bridge between two hydrophilic atoms of the base. Such a bridge can be formed in three different ways, namely between two H-bond acceptor centers of the base, between two donor centers, and one acceptor and one donor centers. The first of these scenarios was analyzed with ab intio quantum mechanical calculations more extensively (applying different basis sets), as this configuration in molecular mechanics corresponds to global minima for Guanine and Cytosine. We performed preliminary ab initio calculations using the bases with rigid geometry followed by the complete energy minimization in the space of all the degrees of freedom. We also performed the study of energy profiles, i.e. the dependences of the interaction energy on some water displacements around the base hydration sites fixing certain geometrical parameters. This information will contribute to future improvements in force fields. The comparison of theoretical MM and QM results with the experimental data, demonstrates that we need to reconsider the geometry of some minima positions for the force field parameters adjustments.

Method of calculation
The systems considered contain one of methylated nucleotide bases (1methylpyrimidine or 9-methylpurine) and one water molecule. The starting geometries of the bases are the average structures obtained from X-ray experimental data in crystals, these geometries have been used in previous works [15,28]. For simplicity we will name the above mentioned methylated bases simply as Adenine, Thymine, Cytosine, and Guanine. The calculations of the interaction energy were performed within the two schemes of molecular mechanics (MM) and quantum mechanics (QM). For MM calculations Poltev-Malenkov (PM) force field [7,8] was used along with the potentials implemented in the AMBER program [29,30], in both cases the interaction energy is calculated as the sum of pair interactions of all the atoms constituting the molecules. For the PM potential, each atom-atom interaction consists of a Coulomb term and of Lennard-Jones (or 6-12) one (Eq. 1). To describe the interaction between the atoms capable of forming hydrogen bonds, the 6 -12 term is replaced by a 10-12 term (Eq. 2) E(r ij ) = kq i q j r ij -1 -A ij r ij -6 +B ij r ij -12 (1) E(r ij ) = kq i qr ij -1 -A ij (10) r ij -10 +B ij (10) r ij -12 (2) In these equations, k is a numerical constant, q i , q j are the effective charges of atoms i and j respectively (calculated by semiempirical quantum chemistry methods and reproduced the experimental dipole moments of the molecules), r ij is the distance between the atoms. The coefficients A ij , B ij and A ij (10) , B ij (10) are adjustable parameters whose numerical values are the same as in previous articles [8,28]. The AMBER potentials [29] take into account the intra-molecular terms (whose contributions are small) and do not contain the 10-12 terms.
Quantum mechanics calculations were performed using the GAUSSIAN 03W program [31], at MP2/6-31G(d, p). The interaction energies E int was evaluated considering the basis set superposition error correction using the counterpoised procedure of Boys-Bernardi implemented the GAUSSIAN package [32]. After energy minimization, additional single point calculations were performed with counterpoise option to evaluate the energy of the first molecule with the basic functions of the second one, and vice versa the energy of the second molecule with the functions of the first one. These terms are subtracted from the total energy and so the corrected energy E BSSE of the system is obtained.
All the local minima were verified by the calculations of the matrices of second derivatives of energy (Hessian) which appeared to be positive. For some local minima of Guanine and Cytosine more extensive basis set (aug-cc-pvdz) was used in order to confirm the geometry. For each base, energy scans were performed with both methods (MM and QM) by changing the position of the water molecule around the hydrophilic centers. Some geometric parameters were varied gradually, with other ones being fixed. For example azimuthal scans were made, i.e. the angle θ ( Fig. 3a.) was varied to change the position of the water molecule in the base plane around the base atom capable of forming a hydrogen bond. During these minimizations the distance r between two atoms of the two molecules and parameters x, y and z which determine the rotations of the water hydrogen's around the water oxygen were varied ( Fig. 3a.). The energy profiles obtained provide fine details of geometry changes which will contribute to the improvement of force fields.

Local minima of interaction energy between DNA bases and single water molecule
The extensive calculations of the water-base systems via MM and QM methods described in the previous section enable us to reveal all the local minima for these systems. The calculated interaction energies along with those of other authors are presented in Table 1. The hydrogen bond distances for these complexes are shown in Table 2.
The structures obtained with both methods are shown in Figure 1. The same number of local minima and rather close water oxygen positions were revealed in both MM and QM calculations, i.e. every MM minimum corresponds to QM minimum with the mutual water-base positions resembling those of QM minima. The PM potential functions favor the coplanar configurations of the complexes, i.e. the base ring and the 3 atoms of the water molecule located in nearly the same plane. The only exceptions are the configuration 3 for Guanine and the configuration 1 for Cytosine (Fig. 3).
Qualitatively similar structures are obtained with the AMBER potentials and the potential of Jorgensen (OPLS), the latter values were calculated in reference [28] and coincide with the values obtained in [14] using the method of diffusion Monte Carlo.
The results of our ab initio calculations revealed both non-coplanar and coplanar conformations, for example the structures corresponding to minimum 2 for Adenine and 3 for Cytosine are completely planar whereas for structures 2 and 3 for Thymine and 3 for Guanine, one of the hydrogens of the water molecule remains in the plane of the base while the other hydrogen deviates from the plane for approximately 30°. Comparison of the interaction energies of the minima obtained with the MM method and those obtained with QM shows generally higher values for the former, this is true for both PM and AMBER potentials (not for all the OPLS results). This difference can be due to the MM potential adjustment to the hydration of the bases in aqueous solution [28,8] where water molecules of the first shell are affected by the "bulk" water. The tendency for shortening the interatomic distances on including the other water molecules can be seen from comparison with other publications. This feature is reported in a DFT study [2] for complexes of Cytosine with 14 molecules of water, where for water position corresponding to minimum 4 the O-H W ...O2 distance is of 1,82 Å, while our QM calculations give the value of 1.91 Å. The same tendency took place in the Hartree-Fock study [19] for Guanine with 7 to 13 water molecules.
The values of the interaction energies in minima calculated with the method MM/PM are closer to QM ones for the Adenine and Thymine (the average differences being of 0.72 kcal/mol) than for Guanine and Cytosine (2.8 kcal/mol). The reason for these differences is due to the fact that QM calculations result in rather small interaction energies for H-bonding of water molecule to two proton acceptors of the bases. This situation will be discussed in the next section.

Interactions of water molecule with two H-bond acceptors of the bases
The most significant differences between MM and QM results refer to the minimum 1 of Guanine and 3 of Cytosine (Fig. 1) corresponding to the interaction of water molecule with two H-bond acceptors of the base. From calculations carried out with the potential PM, we found that: The first minimum for Guanine obtained via PM potentials is the most profound one, and it is only 0.2 kcal/mol deeper than the minimum 2, which is the global one for AMBER force field (  [25,26] (the values of -9.97 and -9.1 kcal/mol respectively). Microhydration of Cytosine and its radical anion were investigated with the DFT-B3LYP method [26], and the deepest minimum for Cytosine was found at position 2, but for the Cytosine anion it was located at position 3, the bond lengths being shorter compared to our QM values, (Hw…O2 of 1.95 Å and Hw...N3 of 2.16 Å). The barrier between the minima 3 and 2 for cytosine-water complex is quite small; the minima 2 and 3 obtained via PM potentials have nearly the same energy (the difference of 0.1 kcal/mol). The same effect can be seen for different tautormers of Guanine via MM calculations [28]. The enol tautomer with the OH group oriented towards the N1 atom forms a complex with water molecule H-bonded to two acceptor atoms (N7 and O6) with the energy of -10.04 kcal/mol, and interatomic distances with O6 of 1.85 Å and with N7 of 1.97Å. This minimum was also revealed using ab initio RI-MP2 method with TZVPP basis set [18]. The deepest minimum for dCMP (B3LYP/6-31G* level of DFT) corresponds to the minimum 3 of Cytosine [33]. Thus, the MP2/6-31G(d,p) level of QM calculations results in the minima of poor stability when water molecule forms H bonds with two base acceptors, and the potential barrier is rather small, however, the MM potentials show them as ones of the lowest energy.

Microhydration of some methylated derivatives of nucleic acid bases
There are experimental mass spectrometry data [9] for some trimethylated bases which can be useful in comparison of the results of MM and QM methods on search for waterbase interaction energy minima. The substitution of the hydrogen atom capable to form H-bond by a methyl group excludes some local energy minima of water-base interactions, thus helping to refer experimental data to the definite water position. The methylation changes slightly the calculated values of water-base interaction energy minima for water positions not involved in H bonding with the hydrogen to be substituted. It was demonstrated for MM calculations earlier, and it is confirmed for QM calculations here.
The calculation results obtained via MM and QM methods for trimethylated bases and the experimental enthalpies of water-base complex formation are listed in the Table 3.
The values corresponding to global minima for 1-methylcytosine and 9-methylpurines (from the Table 1) are added for comparison.
The results demonstrate rather close experimental values of the enthalpies of complex formation with water molecule for m 9 Gua and m 229 Gua. The same is true for m 1 Cyt and m 144 Cyt (Table 3). Rather small differences between the values of monomethylated and trimethylated Gua and Cyt suggest the nearly same positions of the water molecule for complexes observed in experimental study and in global minima. The comparison of experimental data for m 9 Ade and m 669 Ade demonstrates less negative values for the trimethylated base, i.e. the substitution of amino group hydrogens by methyl groups changes the position of water molecule in the complex. Both MM and QM calculations suggest that the m 669 Ade-water complex correspond to minimum 3 for m 9 Ade, as the formation of other two minima for m 9 Ade-water complexes are blocked by methyl groups.
The calculations for m 2,2,9 Gua do not help to decide which minimum is more favorable, the minimum 1 (as predicted by PM potentials) or 2 (as predicted by QM and AMBER calculations). Both minima are possible for m 2,2,9 Gua-water complex (Table 3), and the calculated values for these minima are close for those of m 9 Gua (Table 1) The calculations for trimethylated bases suggest the necessity of both improvement of MM force fields and more sophisticated QM calculations to reach more adequate description of water-base interactions.

Interaction energy dependences on displacement of water molecule from energy minima
Three types of scans via QM and MM methods described above have been performed for various displacements of water molecule from the positions corresponding to energy minima. The first type refers to azimuthal displacement of the water oxygen (other variables being free). The Figure 3 presents an example of such scans for Adenine-water complexes; Figure 3a demonstrates the regions of the angle variations between N3, N1, and N7 acceptor sites. The dependences of interaction energy on the angle demonstrate a similarity for MM and QM curves, and the energy values obtained with the two methods are rather close, especially for the regions near the minima 3 and 2.
The energy difference between two methods is largest in the region of the minimum 1 (1.15 kcal/mol, this value is different from that of Table 1, (0.39 kcal/mol) as we put constrains on some parameters). Other similarities of the two sets of curves refers to the distances between the atoms of two molecules; the dependencies of N...Ow and H...Ow distance on the angle nearly coincide.
Similar azimuthal scans were obtained for other bases; for Thymine-water system maximum energy difference, 1.83 kcal/mol, corresponds to the trajectory from minimum 1 to minimum 2. For the Guanine-water and Cytosine-water systems, there are more pronounced differences in energy, though the distances between the participating in H bonds atoms of the base and the water molecules are rather close for the two methods. It is noteworthy that for QM structures the distances of out-of-plane water hydrogen from base acceptor atom are nearly the same as for corresponding coplanar MM water-base complexes.
The second type of scans performed refer to moving a water molecule towards and away from the base starting from the minima positions (during the optimization x, y, and z parameters were varied, the angle was fixed). When we make a radial scan such that a water molecule approaching the methyl group of the base to the distances between the oxygen and carbon shorter than 3.15 Å, the structures obtained with MM may be non-coplanar due to the repulsion of atoms. In this case the energy dependence as a function of r for the two methods demonstrate the same pattern.
The third type of scans was performed by the displacements of water molecules out of the plane of the bases, the angle z being varied from 0° to 90°. In this case the energies have the same tendency to decrease when the water moves away from the base plane (to z = 90 °), at the end of the scan path there can arise a marked difference (up to 3 kcal/mol for scans near amino or methyl groups).
Some MM minima refer to both water hydrogen's in the base plane while corresponding QM minima refer to displacement of the water hydrogen not forming H-bond by 30°-45° out of the plane (e.g. in Thymine and Guanine 2nd minima). We performed MM and QM energy scans as functions of the angle of rotation about O-H water bond (H being H-bonded to the base and the bond being in the base plane). The energy differences between QM and MM water positions fall in 0.2 kcal/mol region, thus being not great, but may be significant for some cases. More profound QM calculations and MM parameter adjustment are required for more exact water-base system description in this respect.

Conclusion
This paper concerns the evaluation of the interactions of nucleic acid bases with single water molecule. The calculations for such simple systems can be performed via the methods of various complexities, from simple atom-atom MM computations of the rigid molecules to correlated ab initio QM computations using extended basis sets. The comparison of the results obtained via various methods demonstrates both some common features and some differences in quantitative geometry and energy characteristics. The simulation of biomolecular systems in surrounding water is possible via MM methods only. Thus, continuous improvement of MM force fields is required for adequate reproduction and prediction of important features of the systems containing nucleic acid fragments and hundreds of water molecules (and other biologically important molecules). Such improvement is not possible using experimental data only due to the insufficient amount of such data. The high level QM computations of the simple systems can help to fill this gap.     Figure 1 are listed in parentheses.