Intramolecular interactions are normally described by bond stretching, angle bending and torsional angle motion terms. These interactions involve the closest bonded atoms described by two-, three-, and four-body terms, respectively. The bond and angle terms are normally given as harmonic wells while the torsion term is most often expressed as a Fourier sum.
 
Intermolecular interactions are those between separate molecules but also include all interactions within the same molecule beyond the bonded interactions. Non-bonded interactions are further divided into short-range and long-range interactions. The short-range interactions mimic the van der Waals type of forces, while the long-range interactions are electrostatic interactions. The electron distributions around atoms are approximated by fixed point charges, and their interactions are treated by using Coulomb's law.
 
The effective range of short-range interactions is limited to a specified cut-off. By assuming a uniform particle density beyond the cut-off, correction terms to the energy can be obtained by integrating over the non-zero part of the interactions [1-3]. Thus the fast-decaying short-range interactions can be accurately approximated by truncation at the cut-off distance in most calculations.
 
Artificially collecting and dividing the diffuse and fluctuating electron densities inside and around molecules onto single atomic sites is a crude but conceptually simple and effective approximation, since Coulomb's law can be invoked. However, this simplification comes at a price since the interactions between point charges stretch over very long distances. Furthermore, these long-range interactions cannot be truncated without introducing simulation artifacts [4-7].
 
As the system size grows, calculating the electrostatic interactions becomes the major computational bottleneck. Methods based on Ewald summation [8,9] are still considered as the most reliable choice, and a large variety of schemes to compute them in computer simulations have been proposed [10-16]. There is also a multitude of alternative methods for representing electrostatic interactions. Examples are methods based on a cut-off [17-19], tree and multipole based methods [20-25], multigrid methods [26-28], reaction field methods [29-31], the particle mesh method [32,33] and the isotropic sum method [34].
 
In this paper, we describe an approach to Ewald summation based on the non-uniform fast Fourier transform technique. We use the acronym ENUF: Ewald summation using Non-Uniform fast Fourier transform (NFFT) technique. Our method combines the traditional Ewald summation technique with the NFFT to calculate electrostatic energies and forces in molecular computer simulations. In the paper, we show that ENUF is an easy-to-implement, practical, and efficient method for calculating electrostatic interactions. Energy and momentum are both conserved to floating-point accuracy. By a suitable choice of parameters, ENUF can be made to behave as traditional Ewald summation but at the same time gives a computational complexity of $\mathcal{O}(N \log N)$, where $N$ is the number of electrostatic interaction sites in the system. Weighing all these properties together, we believe that ENUF should be an attractive alternative in simulations where the high accuracy of Ewald summation is desired.
 
In the next section, we summarize the basic methodology for applying Ewald summation to the computation of electrostatic energies and forces within periodic boundary conditions. In Section 3, we introduce the Ewald method where the reciprocal space part is calculated based on non-uniform FFTs and discuss the underlying concepts and use of the libraries. We also give general guidelines for implementing the method in existing simulation programs. In Section 4, we discuss its implementation in the general purpose atomistic Molecular Dynamics simulation package M.DynaMix and give its scaling characteristics on a standard desktop computer. In Section 5, we demonstrate its implementation in a Dissipative Particle Dynamics package, where the method is applied to simulating soft charged mesoscopic particles. For such soft particles, charge distributions must be used instead of point charges in order to avoid non-physical aggregation. This issue is discussed further in Section 5.
 
2. Electrostatic Interactions
 
We start by describing a model system of charged particles which captures the most salient features of electrostatic interactions in general MD systems. The electrostatic potential of a system with periodic boundary conditions (PBC) is first stated; we follow with the manipulation of the basic formulas to the form in which they are commonly written; this Section ends with a summary of expressions for both energy and forces.
 
2.1. Ewald Summation
 
Consider a cubic simulation box with edge length $L$, containing $N$ charged particles, each with a charge $q_i$, located at $\mathbf{r}_i$. The boundary conditions in a system without cut-off are represented by replicating the simulation box in all directions. The total electrostatic potential energy of the charge-charge interactions is then given by

$$U = \frac{1}{2}\sum_{\mathbf{n}}^{\dagger}\sum_{i=1}^{N}\sum_{j=1}^{N}\frac{q_i q_j}{|\mathbf{r}_{ij}+\mathbf{n}L|} \tag{1}$$

where $\mathbf{r}_{ij} = \mathbf{r}_i - \mathbf{r}_j$, and $|\mathbf{x}|$ denotes the length (2-norm) of the vector $\mathbf{x}$. Because of the long-range nature of the electrostatic interactions, $U$ includes contributions from all replicas, but excludes self-interactions, which is expressed in the triple sum in Equation (1). The outer sum is taken over all integer vectors $\mathbf{n} = (n_x, n_y, n_z) \in \mathbb{Z}^3$. The $\dagger$ symbol on the first summation sign in Equation (1) indicates that the self-interaction terms should not be included, i.e., when $\mathbf{n} = \mathbf{0}$ the $i = j$ terms are omitted.
 
The sum in Equation (1) is not an absolutely convergent series, but rather conditionally convergent. As a consequence, the order of summation affects the value of the series. In fact, it was discovered by Riemann that any conditionally convergent series of real terms can be rearranged to yield a series which converges to any prescribed sum [35]. In a sense, this is a situation very similar to the case when a linear equation has an infinite number of solutions because it is under-determined; by adding a set of conditions a unique solution may be defined. For the specific case of Equation (1), a physically relevant summation order has to be prescribed and the boundary conditions of the surrounding media have to be specified.
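The dependence on summation order can be made concrete with a much simpler conditionally convergent series. The sketch below is an illustration only and is not part of the Ewald method: it sums the alternating harmonic series in its natural order (which converges to ln 2) and in a greedily rearranged order that steers the partial sums toward an arbitrary target. The function name and the target value 1.5 are hypothetical choices made for this example.

```python
import math


def rearranged_alternating_harmonic(target, n_terms):
    """Greedy Riemann rearrangement of 1 - 1/2 + 1/3 - 1/4 + ...

    Positive terms (odd denominators) are added while the partial sum is at
    or below the target; negative terms (even denominators) otherwise.
    Every term of the original series is used at most once.
    """
    pos = 1      # next odd denominator, contributes +1/pos
    neg = 2      # next even denominator, contributes -1/neg
    total = 0.0
    for _ in range(n_terms):
        if total <= target:
            total += 1.0 / pos
            pos += 2
        else:
            total -= 1.0 / neg
            neg += 2
    return total


# Natural order: partial sums approach ln 2.
natural = sum((-1) ** (k + 1) / k for k in range(1, 200001))
# Same terms, different order: partial sums approach the chosen target.
steered = rearranged_alternating_harmonic(1.5, 200000)
print(natural, math.log(2.0), steered)
```

The same terms give different limits depending on the order, which is why a physically motivated summation order (and boundary condition) has to be fixed before Equation (1) has a definite value.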
 
The lattice sum of Equation (1) can be calculated by a method that was first developed in 1921 by Ewald [8]. He used it to calculate lattice potentials in solids. In the context of Molecular Dynamics, there are several different derivations of the Ewald summation method that are more recent; a small selection is given by [3,9]. In the following discussion we mainly follow the work of de Leeuw et al. [9,36,37].
 
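In [9], de Leeuw et al. developed a technique using convergence factors that transforms the sum of a conditionally convergent series into a series with a well-defined sum. Furthermore, they showed that applying a specific convergence factor is equivalent to a certain summation order. Assuming an overall charge neutral system, $\sum_{i=1}^{N} q_i = 0$, and summing the terms in Equation (1) over all integer vectors $\mathbf{n}$ in concentric spherical order, they showed that the electrostatic potential energy can be written as

$$U(\epsilon_s) = \frac{1}{L}\sum_{i<j} q_i q_j\,\psi\!\left(\frac{\mathbf{r}_{ij}}{L}\right) + \frac{\xi}{2L}\sum_{i=1}^{N} q_i^2 + \frac{2\pi}{(2\epsilon_s+1)L^3}\left|\sum_{i=1}^{N} q_i\mathbf{r}_i\right|^2 \tag{2}$$

when the surrounding media of the periodically replicated cell is a uniform dielectric with dielectric constant $\epsilon_s$ and distances are calculated with the minimum image convention. When the surrounding media is a conductor, $\epsilon_s = \infty$, the energy can be written as

$$U(\infty) = \frac{1}{L}\sum_{i<j} q_i q_j\,\psi\!\left(\frac{\mathbf{r}_{ij}}{L}\right) + \frac{\xi}{2L}\sum_{i=1}^{N} q_i^2 \tag{3}$$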
 
From Equation (3) it is clear that the boundary conditions, vacuum or conductor, have an effect on the energy of the system. Depending on the simulated system and the properties of interest, the choice of boundary conditions can affect the obtained results [38,39].
 
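The function $\psi(\mathbf{r})$ is given by

$$\psi(\mathbf{r}) = \sum_{\mathbf{n}}\frac{\operatorname{erfc}(\alpha|\mathbf{r}+\mathbf{n}|)}{|\mathbf{r}+\mathbf{n}|} + \frac{1}{\pi}\sum_{\mathbf{n}\neq\mathbf{0}}\frac{\exp\!\left(-\pi^2|\mathbf{n}|^2/\alpha^2\right)}{|\mathbf{n}|^2}\exp\!\left(2\pi\imath\,\mathbf{n}\cdot\mathbf{r}\right) \tag{4}$$

with the two error functions defined as $\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt$ and $\operatorname{erfc}(x) = 1 - \operatorname{erf}(x)$. The number $\xi$ used in Equations (2) and (3) is defined as

$$\xi = \sum_{\mathbf{n}\neq\mathbf{0}}\left[\frac{\operatorname{erfc}(\alpha|\mathbf{n}|)}{|\mathbf{n}|} + \frac{\exp\!\left(-\pi^2|\mathbf{n}|^2/\alpha^2\right)}{\pi|\mathbf{n}|^2}\right] - \frac{2\alpha}{\sqrt{\pi}} \tag{5}$$

where $\alpha$ is a free parameter.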
 
Equations (2) and (3) are not in a form that is appropriate for efficient numerical calculations and in the case of Molecular Dynamics simulation we also need expressions for the forces. To arrive at a more suitable form we make the necessary analysis for the electrostatic energy and forces in the following Sections.
 
2.2. Energies in Ewald Summation
 
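To rearrange and expand Equation (2) we first insert $\mathbf{r} = \mathbf{r}_{ij}/L$ in Equation (4)

$$\psi\!\left(\frac{\mathbf{r}_{ij}}{L}\right) = L\sum_{\mathbf{n}}\frac{\operatorname{erfc}\!\left(\frac{\alpha}{L}|\mathbf{r}_{ij}+\mathbf{n}L|\right)}{|\mathbf{r}_{ij}+\mathbf{n}L|} + \frac{1}{\pi}\sum_{\mathbf{n}\neq\mathbf{0}}\frac{\exp\!\left(-\pi^2|\mathbf{n}|^2/\alpha^2\right)}{|\mathbf{n}|^2}\exp\!\left(\frac{2\pi\imath}{L}\,\mathbf{n}\cdot\mathbf{r}_{ij}\right) \tag{6}$$

Next we rescale $\alpha$ by making the substitution $\alpha \to \alpha L$ in Equations (5) and (6) to get

$$\xi = \sum_{\mathbf{n}\neq\mathbf{0}}\left[\frac{\operatorname{erfc}(\alpha L|\mathbf{n}|)}{|\mathbf{n}|} + \frac{\exp\!\left(-\pi^2|\mathbf{n}|^2/(\alpha^2L^2)\right)}{\pi|\mathbf{n}|^2}\right] - \frac{2\alpha L}{\sqrt{\pi}} \tag{7}$$

and

$$\psi\!\left(\frac{\mathbf{r}_{ij}}{L}\right) = L\sum_{\mathbf{n}}\frac{\operatorname{erfc}(\alpha|\mathbf{r}_{ij}+\mathbf{n}L|)}{|\mathbf{r}_{ij}+\mathbf{n}L|} + \frac{1}{\pi}\sum_{\mathbf{n}\neq\mathbf{0}}\frac{\exp\!\left(-\pi^2|\mathbf{n}|^2/(\alpha^2L^2)\right)}{|\mathbf{n}|^2}\exp\!\left(\frac{2\pi\imath}{L}\,\mathbf{n}\cdot\mathbf{r}_{ij}\right) \tag{8}$$

Inserting Equations (7) and (8) into Equation (2) we get

$$\begin{aligned}U ={}& \sum_{i<j} q_i q_j\left[\sum_{\mathbf{n}}\frac{\operatorname{erfc}(\alpha|\mathbf{r}_{ij}+\mathbf{n}L|)}{|\mathbf{r}_{ij}+\mathbf{n}L|} + \frac{1}{\pi L}\sum_{\mathbf{n}\neq\mathbf{0}}\frac{\exp\!\left(-\pi^2|\mathbf{n}|^2/(\alpha^2L^2)\right)}{|\mathbf{n}|^2}\exp\!\left(\frac{2\pi\imath}{L}\,\mathbf{n}\cdot\mathbf{r}_{ij}\right)\right]\\ &+ \frac{1}{2}\sum_{i=1}^{N} q_i^2\left[\sum_{\mathbf{n}\neq\mathbf{0}}\left(\frac{\operatorname{erfc}(\alpha|\mathbf{n}L|)}{|\mathbf{n}L|} + \frac{\exp\!\left(-\pi^2|\mathbf{n}|^2/(\alpha^2L^2)\right)}{\pi L|\mathbf{n}|^2}\right) - \frac{2\alpha}{\sqrt{\pi}}\right] + \frac{2\pi}{(2\epsilon_s+1)L^3}\left|\sum_{i=1}^{N} q_i\mathbf{r}_i\right|^2\end{aligned} \tag{9}$$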
 
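Note that the summation in Equation (9) is for $i < j$ in the first sum. We make further simplifications by studying the terms on the right hand side of Equation (9) for $i = j$ and $i \neq j$.

When $i = j$ we have the following terms

$$\frac{1}{2}\sum_{i=1}^{N} q_i^2\sum_{\mathbf{n}}^{\dagger}\left[\frac{\operatorname{erfc}(\alpha|\mathbf{n}L|)}{|\mathbf{n}L|} + \frac{\exp\!\left(-\pi^2|\mathbf{n}|^2/(\alpha^2L^2)\right)}{\pi L|\mathbf{n}|^2}\right] - \frac{\alpha}{\sqrt{\pi}}\sum_{i=1}^{N} q_i^2 + \frac{2\pi}{(2\epsilon_s+1)L^3}\left|\sum_{i=1}^{N} q_i\mathbf{r}_i\right|^2 \tag{10}$$

with the terms independent of $\mathbf{n}$ included. The $\dagger$ symbol indicates that the $\mathbf{n} = \mathbf{0}$ terms are excluded from the daggered sum.

When $i \neq j$ we get

$$\frac{1}{2}\sum_{i\neq j} q_i q_j\left[\sum_{\mathbf{n}}\frac{\operatorname{erfc}(\alpha|\mathbf{r}_{ij}+\mathbf{n}L|)}{|\mathbf{r}_{ij}+\mathbf{n}L|} + \frac{1}{\pi L}\sum_{\mathbf{n}\neq\mathbf{0}}\frac{\exp\!\left(-\pi^2|\mathbf{n}|^2/(\alpha^2L^2)\right)}{|\mathbf{n}|^2}\exp\!\left(\frac{2\pi\imath}{L}\,\mathbf{n}\cdot\mathbf{r}_{ij}\right)\right] \tag{11}$$

The factor $\frac{1}{2}$ in Equation (11) comes from changing the summation from $i < j$ to all pairs $i$ and $j$ with $i \neq j$, and using the symmetry induced by $\mathbf{r}_{ji} = -\mathbf{r}_{ij}$ and $\mathbf{n} \to -\mathbf{n}$.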
 
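By combining Equations (10) and (11) we identify the real-space term $U^{\mathrm{real}}$, the reciprocal-space term $U^{\mathrm{recip}}$, the self-interaction term $U^{\mathrm{self}}$, and the boundary-condition term $U^{\mathrm{bc}}$. The real-space term is given by

$$U^{\mathrm{real}} = \frac{1}{2}\sum_{\mathbf{n}}^{\dagger}\sum_{i=1}^{N}\sum_{j=1}^{N} q_i q_j\frac{\operatorname{erfc}(\alpha|\mathbf{r}_{ij}+\mathbf{n}L|)}{|\mathbf{r}_{ij}+\mathbf{n}L|} \tag{12}$$

and the reciprocal-space term is

$$U^{\mathrm{recip}} = \frac{1}{2\pi L}\sum_{\mathbf{n}\neq\mathbf{0}}\frac{\exp\!\left(-\pi^2|\mathbf{n}|^2/(\alpha^2L^2)\right)}{|\mathbf{n}|^2}\sum_{i=1}^{N}\sum_{j=1}^{N} q_i q_j\exp\!\left(\frac{2\pi\imath}{L}\,\mathbf{n}\cdot\mathbf{r}_{ij}\right) \tag{13}$$

However, with the symmetries generated by $\mathbf{n}$ and $-\mathbf{n}$, we get

$$U^{\mathrm{recip}} = \frac{1}{2\pi L}\sum_{\mathbf{n}\neq\mathbf{0}}\frac{\exp\!\left(-\pi^2|\mathbf{n}|^2/(\alpha^2L^2)\right)}{|\mathbf{n}|^2}\sum_{i=1}^{N}\sum_{j=1}^{N} q_i q_j\cos\!\left(\frac{2\pi}{L}\,\mathbf{n}\cdot\mathbf{r}_{ij}\right) \tag{14}$$

Furthermore, we have

$$U^{\mathrm{self}} = -\frac{\alpha}{\sqrt{\pi}}\sum_{i=1}^{N} q_i^2 \tag{15}$$

$$U^{\mathrm{bc}} = \frac{2\pi}{(2\epsilon_s+1)L^3}\left|\sum_{i=1}^{N} q_i\mathbf{r}_i\right|^2 \tag{16}$$

and finally

$$U = U^{\mathrm{real}} + U^{\mathrm{recip}} + U^{\mathrm{self}} + U^{\mathrm{bc}} \tag{17}$$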
 
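The reciprocal-space part, Equation (14), can be expanded in two different forms. The first form is in terms of the structure factor $S(\mathbf{n})$,

$$S(\mathbf{n}) = \sum_{i=1}^{N} q_i\exp\!\left(-\frac{2\pi\imath}{L}\,\mathbf{n}\cdot\mathbf{r}_i\right) \tag{18}$$

and is given by

$$U^{\mathrm{recip}} = \frac{1}{2\pi L}\sum_{\mathbf{n}\neq\mathbf{0}}\frac{\exp\!\left(-\pi^2|\mathbf{n}|^2/(\alpha^2L^2)\right)}{|\mathbf{n}|^2}\,|S(\mathbf{n})|^2 \tag{19}$$

Now for a fixed $\mathbf{n}$, the structure factor $S(\mathbf{n})$ is just a complex number $z = a + \imath b$, and the simple fact that $|z|^2 = a^2 + b^2$ gives the real form of Equation (14)

$$U^{\mathrm{recip}} = \frac{1}{2\pi L}\sum_{\mathbf{n}\neq\mathbf{0}}\frac{\exp\!\left(-\pi^2|\mathbf{n}|^2/(\alpha^2L^2)\right)}{|\mathbf{n}|^2}\left[\left(\sum_{i=1}^{N} q_i\cos\!\left(\frac{2\pi}{L}\,\mathbf{n}\cdot\mathbf{r}_i\right)\right)^2 + \left(\sum_{i=1}^{N} q_i\sin\!\left(\frac{2\pi}{L}\,\mathbf{n}\cdot\mathbf{r}_i\right)\right)^2\right] \tag{20}$$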
 
The first form in Equation (19) is used in our fast approach to calculating the reciprocal-space part. The last form in Equation (20) is the most common point-of-departure when implementing the reciprocal-space part.
 
2.3. Forces in Ewald Summation
 
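Now that we have calculated the electrostatic energy of the system we can easily compute the electrostatic forces $\mathbf{F}_i$ that act on each particle $i$. Splitting the forces in the same way as we have split the energy and using Equation (17) we get the total electrostatic force by finding the negative of the gradient of the electrostatic energy

$$\mathbf{F}_i = -\nabla_i U = \mathbf{F}_i^{\mathrm{real}} + \mathbf{F}_i^{\mathrm{recip}} + \mathbf{0} + \mathbf{F}_i^{\mathrm{bc}} \tag{21}$$

The subscript $i$ on the $\nabla$ operator indicates that we take the partial derivatives with respect to the position of particle $i$ and the $\mathbf{0}$ in Equation (21) comes from the self-interaction term in Equation (15) being independent of $\mathbf{r}_i$. Before we do this calculation we note a couple of basic, but helpful, formulas for calculating derivatives

$$\nabla_i|\mathbf{r}_{ij}+\mathbf{n}L| = \frac{\mathbf{r}_{ij}+\mathbf{n}L}{|\mathbf{r}_{ij}+\mathbf{n}L|}, \qquad \frac{d}{dx}\operatorname{erfc}(x) = -\frac{2}{\sqrt{\pi}}\,e^{-x^2} \tag{22}$$

With the above formulas, it is straightforward to find the different terms of $\mathbf{F}_i$. The contribution from the real-space term, $\mathbf{F}_i^{\mathrm{real}}$, becomes

$$\mathbf{F}_i^{\mathrm{real}} = q_i\sum_{\mathbf{n}}\sum_{j}^{\dagger} q_j\left[\frac{\operatorname{erfc}(\alpha|\mathbf{r}_{ij}+\mathbf{n}L|)}{|\mathbf{r}_{ij}+\mathbf{n}L|} + \frac{2\alpha}{\sqrt{\pi}}\exp\!\left(-\alpha^2|\mathbf{r}_{ij}+\mathbf{n}L|^2\right)\right]\frac{\mathbf{r}_{ij}+\mathbf{n}L}{|\mathbf{r}_{ij}+\mathbf{n}L|^2} \tag{23}$$

such that when $\mathbf{n} \neq \mathbf{0}$ include all $j$ and otherwise only $j \neq i$. Equation (20) is convenient to use when calculating the reciprocal-space contribution because it is expressed in terms of charge locations $\mathbf{r}_i$ rather than relative distances $\mathbf{r}_{ij}$. Thus the reciprocal-space force is given by

$$\mathbf{F}_i^{\mathrm{recip}} = \frac{2q_i}{L^2}\sum_{\mathbf{n}\neq\mathbf{0}}\frac{\mathbf{n}}{|\mathbf{n}|^2}\exp\!\left(-\frac{\pi^2|\mathbf{n}|^2}{\alpha^2L^2}\right)\left[\sin\!\left(\frac{2\pi}{L}\,\mathbf{n}\cdot\mathbf{r}_i\right)\sum_{j=1}^{N} q_j\cos\!\left(\frac{2\pi}{L}\,\mathbf{n}\cdot\mathbf{r}_j\right) - \cos\!\left(\frac{2\pi}{L}\,\mathbf{n}\cdot\mathbf{r}_i\right)\sum_{j=1}^{N} q_j\sin\!\left(\frac{2\pi}{L}\,\mathbf{n}\cdot\mathbf{r}_j\right)\right] \tag{24}$$

Finally, the contribution that depends on the boundary condition is

$$\mathbf{F}_i^{\mathrm{bc}} = -\frac{4\pi q_i}{(2\epsilon_s+1)L^3}\sum_{j=1}^{N} q_j\mathbf{r}_j \tag{25}$$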
 
2.4. Formulas for Energy and Forces in Ewald Summation
 
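Consider a periodically replicated system, with the central box consisting of $N$ point charges $q_i$ and $\sum_{i=1}^{N} q_i = 0$. Assume that the surrounding media at the boundary of the periodically replicated system is a uniform dielectric with dielectric constant $\epsilon_s$. The cubic box has edge length $L$; each charge $q_i$ is located at $\mathbf{r}_i$, and distances are calculated with the minimum image convention. After expansion and rearrangements of Equation (2), rescaling $\alpha$ and using symmetries induced by $\mathbf{n}$ and $-\mathbf{n}$, the total electrostatic energy of the system can be written as

$$U = U^{\mathrm{real}} + U^{\mathrm{recip}} + U^{\mathrm{self}} + U^{\mathrm{bc}} \tag{26}$$

with the different terms given by

$$U^{\mathrm{real}} = \frac{1}{2}\sum_{\mathbf{n}}^{\dagger}\sum_{i=1}^{N}\sum_{j=1}^{N} q_i q_j\frac{\operatorname{erfc}(\alpha|\mathbf{r}_{ij}+\mathbf{n}L|)}{|\mathbf{r}_{ij}+\mathbf{n}L|} \tag{27}$$

$$U^{\mathrm{recip}} = \frac{1}{2\pi L}\sum_{\mathbf{n}\neq\mathbf{0}}\frac{\exp\!\left(-\pi^2|\mathbf{n}|^2/(\alpha^2L^2)\right)}{|\mathbf{n}|^2}\,|S(\mathbf{n})|^2 \tag{28}$$

$$U^{\mathrm{self}} = -\frac{\alpha}{\sqrt{\pi}}\sum_{i=1}^{N} q_i^2 \tag{29}$$

$$U^{\mathrm{bc}} = \frac{2\pi}{(2\epsilon_s+1)L^3}\left|\sum_{i=1}^{N} q_i\mathbf{r}_i\right|^2 \tag{30}$$

Note that $\alpha$ is a free parameter. The structure factor $S(\mathbf{n})$ is defined as

$$S(\mathbf{n}) = \sum_{i=1}^{N} q_i\exp\!\left(-\frac{2\pi\imath}{L}\,\mathbf{n}\cdot\mathbf{r}_i\right) \tag{31}$$

The total electrostatic force, $\mathbf{F}_i$, on each particle is

$$\mathbf{F}_i = \mathbf{F}_i^{\mathrm{real}} + \mathbf{F}_i^{\mathrm{recip}} + \mathbf{F}_i^{\mathrm{bc}} \tag{32}$$

Each of the force terms is given by

$$\mathbf{F}_i^{\mathrm{real}} = q_i\sum_{\mathbf{n}}\sum_{j}^{\dagger} q_j\left[\frac{\operatorname{erfc}(\alpha|\mathbf{r}_{ij}+\mathbf{n}L|)}{|\mathbf{r}_{ij}+\mathbf{n}L|} + \frac{2\alpha}{\sqrt{\pi}}\exp\!\left(-\alpha^2|\mathbf{r}_{ij}+\mathbf{n}L|^2\right)\right]\frac{\mathbf{r}_{ij}+\mathbf{n}L}{|\mathbf{r}_{ij}+\mathbf{n}L|^2} \tag{33}$$

$$\mathbf{F}_i^{\mathrm{recip}} = \frac{2q_i}{L^2}\sum_{\mathbf{n}\neq\mathbf{0}}\frac{\mathbf{n}}{|\mathbf{n}|^2}\exp\!\left(-\frac{\pi^2|\mathbf{n}|^2}{\alpha^2L^2}\right)\left[\sin\!\left(\frac{2\pi}{L}\,\mathbf{n}\cdot\mathbf{r}_i\right)\mathrm{Re}\,S(\mathbf{n}) + \cos\!\left(\frac{2\pi}{L}\,\mathbf{n}\cdot\mathbf{r}_i\right)\mathrm{Im}\,S(\mathbf{n})\right] \tag{34}$$

$$\mathbf{F}_i^{\mathrm{bc}} = -\frac{4\pi q_i}{(2\epsilon_s+1)L^3}\sum_{j=1}^{N} q_j\mathbf{r}_j \tag{35}$$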
 
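The positive number $\alpha$, the so called Ewald convergence parameter, is chosen for computational convenience. Observe that $\operatorname{erfc}(x) \approx e^{-x^2}$ for large values of $x$. By choosing $\alpha$ large enough in Equation (27), we can ensure that the only terms that contribute in the real-space sum are those with $\mathbf{n} = \mathbf{0}$. This may be expressed so that all terms with $|\mathbf{r}_{ij}| < r_c$ should be included.

Choose a cut-off in both real-space and reciprocal-space so that the neglected terms in the real-space and reciprocal-space parts are of the same order $\varepsilon$, or less. The truncation in real-space implies that a sufficient number of terms must be included in the reciprocal-space sums, Equation (28).

Given a required accuracy $\varepsilon$, $\alpha$ is fixed by

$$e^{-\alpha^2 r_c^2} \le \varepsilon \quad\Longrightarrow\quad \alpha \ge \frac{\sqrt{-\ln\varepsilon}}{r_c} \tag{36}$$

and $n_c$ is determined by

$$\exp\!\left(-\frac{\pi^2 n_c^2}{\alpha^2L^2}\right) \le \varepsilon \quad\Longrightarrow\quad n_c \ge \frac{\alpha L}{\pi}\sqrt{-\ln\varepsilon} \tag{37}$$

We have two conditions and four parameters. With a required $\varepsilon$ we may just as well pick a suitable value for $r_c$ and let the above two equations determine $\alpha$ and $n_c$.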
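As a small worked example of this parameter selection, the sketch below picks the convergence parameter and the reciprocal-space cut-off from a required accuracy and a chosen real-space cut-off. It assumes the commonly used rule of thumb that the Gaussian factors exp(-alpha^2 r_c^2) and exp(-pi^2 n_c^2 / (alpha^2 L^2)) should both be no larger than the required accuracy; the function name and the numerical values are hypothetical.

```python
import math


def ewald_parameters(eps, r_c, box_length):
    """Return (alpha, n_c) for a required accuracy eps and real-space cut-off r_c.

    Assumption: both truncated sums are controlled by their Gaussian factors,
    exp(-(alpha * r_c)**2) <= eps and
    exp(-(pi * n_c / (alpha * box_length))**2) <= eps.
    """
    s = math.sqrt(-math.log(eps))
    alpha = s / r_c                                    # real-space condition
    n_c = math.ceil(alpha * box_length * s / math.pi)  # reciprocal-space condition
    return alpha, n_c


alpha, n_c = ewald_parameters(eps=1e-6, r_c=10.0, box_length=30.0)
print(alpha, n_c)
```

With the real-space cut-off held fixed, alpha is constant and n_c grows linearly with the box length, which is the reason the reciprocal-space part dominates the cost for large systems.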
 
With an optimal choice of parameters the computational effort of the Ewald method becomes $\mathcal{O}(N^{3/2})$ [36], giving a considerable improvement over the $\mathcal{O}(N^2)$ computational complexity implied by the "infinite" reach of the Coulomb interactions.
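To make the summary concrete, here is a minimal direct-summation sketch of the Ewald energy in plain Python. It assumes Gaussian units, a cubic box, and a conducting boundary (so the boundary-condition term vanishes), and uses the standard textbook forms of the real-space, reciprocal-space and self terms. The function name, the fixed image and mode ranges (`n_real`, `n_recip`) and the two-ion test cell are hypothetical choices for illustration, not the production algorithm.

```python
import cmath
import itertools
import math


def ewald_energy(charges, positions, L, alpha, n_real=2, n_recip=8):
    """Direct Ewald sum for a cubic cell of edge L (conducting boundary)."""
    N = len(charges)

    # Real-space term: the i == j pair is skipped only in the central box.
    u_real = 0.0
    for n in itertools.product(range(-n_real, n_real + 1), repeat=3):
        for i in range(N):
            for j in range(N):
                if n == (0, 0, 0) and i == j:
                    continue
                d = [positions[i][t] - positions[j][t] + n[t] * L for t in range(3)]
                r = math.sqrt(sum(c * c for c in d))
                u_real += 0.5 * charges[i] * charges[j] * math.erfc(alpha * r) / r

    # Reciprocal-space term from the structure factor S(n), n != 0.
    u_recip = 0.0
    for n in itertools.product(range(-n_recip, n_recip + 1), repeat=3):
        n2 = sum(c * c for c in n)
        if n2 == 0:
            continue
        s = sum(q * cmath.exp(-2j * math.pi * sum(n[t] * r[t] for t in range(3)) / L)
                for q, r in zip(charges, positions))
        u_recip += math.exp(-math.pi ** 2 * n2 / (alpha * L) ** 2) / n2 * abs(s) ** 2
    u_recip /= 2.0 * math.pi * L

    # Self-interaction term.
    u_self = -alpha / math.sqrt(math.pi) * sum(q * q for q in charges)
    return u_real + u_recip + u_self


# CsCl-like cell: +1 at the corner, -1 at the body centre of a unit cube.
charges = [1.0, -1.0]
positions = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)]
u_a = ewald_energy(charges, positions, L=1.0, alpha=3.0)
u_b = ewald_energy(charges, positions, L=1.0, alpha=5.0)
print(u_a, u_b)
```

Because the convergence parameter only moves work between the real-space and reciprocal-space sums, the two printed values should agree once both sums are converged; this is a convenient sanity check for any implementation.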
 
3. ENUF: A Fast Method for Calculating Electrostatic Interactions
 
In Section 2 we summarized known results and prepared the ground for the development of a fast method for Ewald summation using the nonuniform fast Fourier transform (NFFT).
 
3.1. Discrete Fourier Transforms for Non-Equispaced Data
 
The fast Fourier transform for nonuniform data-points (NFFT) [40] is a generalization of the FFT [41]. Several similar approaches have been proposed; some examples are [42-50] with comparisons in [47,51,52].
 
The basic idea of NFFT is to combine the standard FFT and linear combinations of a window function that is well localized in both the spatial domain and the frequency domain. A controlled approximation using a cut-off in the frequency domain and a limited number of terms in the spatial domain results in an aliasing error and a truncation error, respectively. The aliasing error is controlled by the oversampling factor $\sigma$, and the truncation error is controlled by the number of terms, $m$, in the spatial/time approximation. For a number of window functions (Gaussian, B-spline, Sinc-power, Kaiser-Bessel), it has been shown that for a fixed oversampling factor, $\sigma > 1$, the error decays exponentially with $m$ [53].
 
3.1.1. Problem Definition
 
We wish to calculate the discrete Fourier transform for nonequispaced data. The problem can be stated as follows. For a finite number of given Fourier coefficients $\hat{f}_{\mathbf{k}} \in \mathbb{C}$ with $\mathbf{k} \in I_N$ we want to evaluate the trigonometric polynomial $f(\mathbf{x}) = \sum_{\mathbf{k}\in I_N}\hat{f}_{\mathbf{k}}\,e^{-2\pi\imath\,\mathbf{k}\cdot\mathbf{x}}$ at each of the given nonequispaced points $\mathbf{x}_j$, $j = 0,\dots,M-1$. In the literature, points are often called knots. We use the two terms synonymously.

Obviously, the details of an NDFT depend on the definitions of a sampling set for knots, $\mathcal{X}$, and an index space $I_N$. More in-depth discussions and further details can be found in [53,54]. The presentation that follows is mainly drawn from these sources.
 
3.1.2. Underlying Concepts
 
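Consider a $d$-dimensional domain $\mathbb{T}^d$ in which the set of nonequispaced knots, or data points, are located. Let

$$\mathbb{T}^d := \left\{\mathbf{x}\in\mathbb{R}^d : -\tfrac{1}{2}\le x_t < \tfrac{1}{2},\ t = 0,\dots,d-1\right\} \tag{38}$$

and the set of $M$ data points $\mathcal{X} := \{\mathbf{x}_j\in\mathbb{T}^d : j = 0,\dots,M-1\}$. For the application we have in mind $d$ is usually 2 or 3. Let $T_N$ be a function space of trigonometric polynomials with degree $N_t$ in dimension $t$; the function space $T_N$ can be defined as

$$T_N := \operatorname{span}\left\{e^{-2\pi\imath\,\mathbf{k}\cdot\mathbf{x}} : \mathbf{k}\in I_N\right\} \tag{39}$$

The dimension of this function space is $N_\Pi = \prod_{t=0}^{d-1} N_t$. The frequencies $\mathbf{k}$ with the index set $I_N$ are such that

$$I_N := \left\{\mathbf{k} = (k_t)_{t=0,\dots,d-1}\in\mathbb{Z}^d : -\frac{N_t}{2}\le k_t < \frac{N_t}{2}\right\} \tag{40}$$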
 
3.1.3. Matrix-Vector Formulation
 
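With these preliminary definitions we carry on with the problem of calculating the discrete Fourier transform for nonequispaced data. For a finite number of given Fourier coefficients $\hat{f}_{\mathbf{k}}\in\mathbb{C}$ with $\mathbf{k}\in I_N$ we want to evaluate the trigonometric polynomial $f(\mathbf{x}) = \sum_{\mathbf{k}\in I_N}\hat{f}_{\mathbf{k}}\,e^{-2\pi\imath\,\mathbf{k}\cdot\mathbf{x}}$ at each of the given nonequispaced knots in $\mathcal{X}$, where the product $\mathbf{k}\cdot\mathbf{x}$ is the usual scalar product of the two vectors $\mathbf{k}$ and $\mathbf{x}$ as $\mathbf{k}\cdot\mathbf{x} = k_0x_0 + k_1x_1 + \dots + k_{d-1}x_{d-1}$. Consequently, for each $\mathbf{x}_j\in\mathcal{X}$, we evaluate $f_j = f(\mathbf{x}_j) = \sum_{\mathbf{k}\in I_N}\hat{f}_{\mathbf{k}}\,e^{-2\pi\imath\,\mathbf{k}\cdot\mathbf{x}_j}$. This may be reformulated in matrix-vector notation by setting

$$\mathbf{f} := (f_j)_{j=0,\dots,M-1}, \qquad \mathbf{A} := \left(e^{-2\pi\imath\,\mathbf{k}\cdot\mathbf{x}_j}\right)_{j=0,\dots,M-1;\ \mathbf{k}\in I_N}, \qquad \hat{\mathbf{f}} := (\hat{f}_{\mathbf{k}})_{\mathbf{k}\in I_N} \tag{41}$$

and writing $\mathbf{f} = \mathbf{A}\hat{\mathbf{f}}$.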
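The matrix-vector product can be written out directly as a double loop. The sketch below is a one-dimensional illustration in plain Python of the direct (slow) transform and its adjoint, assuming the sign convention exp(-2 pi i k x) for the forward transform and an even number of coefficients with frequencies k = -N/2, ..., N/2 - 1. The function names are hypothetical and this is not the NFFT library interface.

```python
import cmath
import math


def index_set(n):
    """Frequencies -n/2, ..., n/2 - 1 for an even number n of coefficients."""
    return list(range(-(n // 2), n // 2))


def ndft(f_hat, knots):
    """f_j = sum_k f_hat_k * exp(-2*pi*i*k*x_j); cost O(M * N)."""
    ks = index_set(len(f_hat))
    return [sum(c * cmath.exp(-2j * math.pi * k * x) for c, k in zip(f_hat, ks))
            for x in knots]


def adjoint_ndft(f, knots, n):
    """h_k = sum_j f_j * exp(+2*pi*i*k*x_j); cost O(M * N)."""
    return [sum(v * cmath.exp(2j * math.pi * k * x) for v, x in zip(f, knots))
            for k in index_set(n)]


# Nonequispaced knots in [-1/2, 1/2) and a few Fourier coefficients.
knots = [-0.47, -0.2, -0.05, 0.11, 0.3, 0.42]
f_hat = [0.5, -1.0 + 0.25j, 2.0, 0.0, 1.0j, -0.75, 0.1, 0.3]
samples = ndft(f_hat, knots)
print(samples[0])
```

Every sample needs a sum over all coefficients, which is where the product of the number of knots and the number of coefficients in the operation count comes from.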
 
3.1.4. Related Matrix-Vector Products
 
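A number of related NDFT matrix-vector products can also be defined. To write them down we let $\bar{\mathbf{A}}$ be the complex conjugate of the elements of the matrix $\mathbf{A}$ and $\mathbf{A}^{\mathrm{H}}$ the transposed complex conjugate of the matrix $\mathbf{A}$. Using these conventions we can name and summarize the related NDFT matrix-vector products and their component representation as

$$\text{NDFT:}\quad \mathbf{f} = \mathbf{A}\hat{\mathbf{f}}, \qquad f_j = \sum_{\mathbf{k}\in I_N}\hat{f}_{\mathbf{k}}\,e^{-2\pi\imath\,\mathbf{k}\cdot\mathbf{x}_j} \tag{42}$$

$$\text{adjoint NDFT:}\quad \hat{\mathbf{h}} = \mathbf{A}^{\mathrm{H}}\mathbf{f}, \qquad \hat{h}_{\mathbf{k}} = \sum_{j=0}^{M-1} f_j\,e^{+2\pi\imath\,\mathbf{k}\cdot\mathbf{x}_j} \tag{43}$$

$$\text{conjugated NDFT:}\quad \mathbf{f} = \bar{\mathbf{A}}\hat{\mathbf{f}}, \qquad f_j = \sum_{\mathbf{k}\in I_N}\hat{f}_{\mathbf{k}}\,e^{+2\pi\imath\,\mathbf{k}\cdot\mathbf{x}_j} \tag{44}$$

$$\text{transposed NDFT:}\quad \hat{\mathbf{h}} = \mathbf{A}^{\mathrm{T}}\mathbf{f}, \qquad \hat{h}_{\mathbf{k}} = \sum_{j=0}^{M-1} f_j\,e^{-2\pi\imath\,\mathbf{k}\cdot\mathbf{x}_j} \tag{45}$$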
 
3.1.5. NDFT, FFT and NFFT
 
From the different NDFT products written in matrix-vector form, as in Equations (42)-(45), it is clear that it takes $\mathcal{O}(M N_\Pi)$ arithmetic operations to transform between the Fourier-samples and the Fourier-coefficients.

This is simply because the matrix $\mathbf{A}$ is $M \times N_\Pi$, with $N_\Pi = \prod_{t=0}^{d-1} N_t$.

However, for the special case of $M = N_\Pi$ equispaced knots, the Fourier-samples $f_j$ can be calculated from the Fourier-coefficients $\hat{f}_{\mathbf{k}}$ by the fast Fourier transform (FFT) with $\mathcal{O}(N_\Pi \log N_\Pi)$ arithmetic operations.

The fast Fourier transform for nonequispaced knots (NFFT) is a generalization of the FFT. The essential idea is that of combining a window function with the standard FFT. The window function is a well localized function in both the space domain and frequency domain. Several different window functions and similar approaches have been proposed. The resulting algorithms are approximate and some of them have been shown to have a computational complexity of $\mathcal{O}(N_\Pi \log N_\Pi + |\log\varepsilon|^d M)$, where $\varepsilon$ is the desired accuracy [53].
 
3.2. Fast Ewald Summation
 
Using optimal parameters in the Ewald summation method implies that the time to calculate the real-space part and the reciprocal-space part are approximately equal. As the number of particles in the system grows we would like to combine the calculation of the short-range part of the potential with the real-space part. This implies that we need to choose a real-space cut-off about the same size as the short-range cut-off. With this nonoptimal choice, the reciprocal-space parts of the Ewald summation method become the most time-consuming to calculate [55].
 
To show how a fast Ewald summation approach may be obtained from the regular Ewald method, described in Section 2, we focus on the reciprocal-space parts. In Section 3.1, we gave the details of the discrete Fourier transform (DFT) for data that is nonuniformly spaced (NDFT). Based on these definitions we get a number of useful algorithmic primitives. First we reformulate the reciprocal-space part of the regular Ewald method in terms of the NDFT primitives. Then we show how the fast Fourier transform for nonequispaced data (NFFT) can be applied, yielding an Ewald method based on the nonuniform fast Fourier transform.
 
3.2.1. Reciprocal Space Terms as DFT
 
We apply the generalized DFT, described in Section 3.1, to the calculation of the reciprocal-space energy and forces. This allows us to formulate the standard Ewald method for calculating the reciprocal energy and forces in terms of the NDFT primitives.
 
3.2.2. Reciprocal Energy
 
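In the case of the electrostatic energy we have from Equation (28)

$$U^{\mathrm{recip}} = \frac{1}{2\pi L}\sum_{\mathbf{n}\neq\mathbf{0}}\frac{\exp\!\left(-\pi^2|\mathbf{n}|^2/(\alpha^2L^2)\right)}{|\mathbf{n}|^2}\,|S(\mathbf{n})|^2 \tag{46}$$

with the structure factor $S(\mathbf{n})$ defined as

$$S(\mathbf{n}) = \sum_{j=1}^{N} q_j\exp\!\left(-\frac{2\pi\imath}{L}\,\mathbf{n}\cdot\mathbf{r}_j\right) \tag{47}$$

By comparing the definition of the transposed NDFT in Equation (45) and the structure factor in Equation (47) we note that they have the same structure; after a renumbering of the location indexes, the summation limits are also the same. In fact, by setting the normalized locations, $\mathbf{x}_j = \mathbf{r}_j/L$, and the samples, $f_j = q_j$, we see by inspection that Equation (47) is a 3D instance of Equation (45) with $M = N$. Furthermore, assuming that the MD simulation box is centered around the origin, the normalized locations can be assumed to be in the domain $\mathbb{T}^3$ as defined in Equation (38).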
 
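Consequently, we can use the NDFT approach to calculate each of the components of the structure factor. From a computational point of view this means that we can also expect to utilize an NFFT based algorithm to calculate the components of the structure factor $S(\mathbf{n})$, rather than the straightforward summation normally used in the Ewald method.

Recasting Equation (46) in terms of Fourier-components $\hat{h}_{\mathbf{n}} = S(\mathbf{n})$ gives

$$U^{\mathrm{recip}} = \frac{1}{2\pi L}\sum_{\mathbf{n}\neq\mathbf{0}}\frac{\exp\!\left(-\pi^2|\mathbf{n}|^2/(\alpha^2L^2)\right)}{|\mathbf{n}|^2}\,|\hat{h}_{\mathbf{n}}|^2 \tag{48}$$

Calculating the energy, $U^{\mathrm{recip}}$, using Equation (48) means that we

1) calculate all $\hat{h}_{\mathbf{n}}$ using the transposed NDFT,

2) scale each $|\hat{h}_{\mathbf{n}}|^2$ with a factor given by Equation (48),

3) sum all the scaled components.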
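The three steps above can be sketched with a direct (slow) transposed transform standing in for the fast one. This is a minimal illustration in plain Python, assuming Gaussian units, a structure factor with the sign convention exp(-2 pi i n.x), and a symmetric cube of modes with components from -n_c to n_c; the function names are hypothetical and none of them belong to the NFFT library.

```python
import cmath
import itertools
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def transposed_ndft(samples, knots, n_c):
    """Direct sum h_n = sum_j f_j exp(-2*pi*i n.x_j) for all n in [-n_c, n_c]^3."""
    h = {}
    for n in itertools.product(range(-n_c, n_c + 1), repeat=3):
        h[n] = sum(f * cmath.exp(-2j * math.pi * dot(n, x))
                   for f, x in zip(samples, knots))
    return h


def reciprocal_energy(charges, positions, L, alpha, n_c):
    knots = [[c / L for c in r] for r in positions]   # normalized locations
    h = transposed_ndft(charges, knots, n_c)          # step 1: structure factor
    total = 0.0
    for n, h_n in h.items():
        n2 = dot(n, n)
        if n2 == 0:
            continue                                  # the n = 0 mode is excluded
        weight = math.exp(-math.pi ** 2 * n2 / (alpha * L) ** 2) / n2
        total += weight * abs(h_n) ** 2               # steps 2 and 3: scale and sum
    return total / (2.0 * math.pi * L)


charges = [1.0, -1.0, 0.5, -0.5]
positions = [[0.1, 0.2, -0.3], [-0.4, 0.3, 0.2], [0.45, -0.1, 0.0], [-0.2, -0.35, 0.4]]
print(reciprocal_energy(charges, positions, L=1.0, alpha=2.0, n_c=3))
```

Replacing `transposed_ndft` by a fast approximate transform changes only step 1; the scaling and summation stay the same, which is the separation of mathematical and physical code the method aims for.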
 
3.2.3. Reciprocal Forces
 
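We calculate the contribution from the reciprocal-space forces using a similar approach as for the energy. In the formula below, $\mathrm{Re}(\cdot)$ and $\mathrm{Im}(\cdot)$ denote the real and imaginary part of the arguments, respectively. From Equation (24) we have that

$$\mathbf{F}_i^{\mathrm{recip}} = \frac{2q_i}{L^2}\sum_{\mathbf{n}\neq\mathbf{0}}\frac{\mathbf{n}}{|\mathbf{n}|^2}\exp\!\left(-\frac{\pi^2|\mathbf{n}|^2}{\alpha^2L^2}\right)\left[\sin\!\left(\frac{2\pi}{L}\,\mathbf{n}\cdot\mathbf{r}_i\right)\mathrm{Re}\big(S(\mathbf{n})\big) + \cos\!\left(\frac{2\pi}{L}\,\mathbf{n}\cdot\mathbf{r}_i\right)\mathrm{Im}\big(S(\mathbf{n})\big)\right] \tag{49}$$

Now, the structure factor $S(\mathbf{n})$ is just a complex number so the expression in the brackets in Equation (49) can be written as the imaginary part of a product

$$\sin\!\left(\frac{2\pi}{L}\,\mathbf{n}\cdot\mathbf{r}_i\right)\mathrm{Re}\big(S(\mathbf{n})\big) + \cos\!\left(\frac{2\pi}{L}\,\mathbf{n}\cdot\mathbf{r}_i\right)\mathrm{Im}\big(S(\mathbf{n})\big) = \mathrm{Im}\!\left(\exp\!\left(\frac{2\pi\imath}{L}\,\mathbf{n}\cdot\mathbf{r}_i\right)S(\mathbf{n})\right) \tag{50}$$

Inserting Equation (50) into Equation (49) gives

$$\mathbf{F}_i^{\mathrm{recip}} = \frac{2q_i}{L^2}\sum_{\mathbf{n}\neq\mathbf{0}}\frac{\mathbf{n}}{|\mathbf{n}|^2}\exp\!\left(-\frac{\pi^2|\mathbf{n}|^2}{\alpha^2L^2}\right)\mathrm{Im}\!\left(\exp\!\left(\frac{2\pi\imath}{L}\,\mathbf{n}\cdot\mathbf{r}_i\right)S(\mathbf{n})\right) \tag{51}$$

Note that Equation (51) is a vector equation. Furthermore, each of the three components has the same structure as the conjugated NDFT of Equation (44). By setting the normalized locations, $\mathbf{x}_i = \mathbf{r}_i/L$, and the samples,

$$\hat{\mathbf{g}}_{\mathbf{n}} = \frac{\mathbf{n}}{|\mathbf{n}|^2}\exp\!\left(-\frac{\pi^2|\mathbf{n}|^2}{\alpha^2L^2}\right)S(\mathbf{n}) \tag{52}$$

we see, again, by inspection that each component of Equation (51) is a 3D instance of Equation (44). Assuming that $\mathbf{n}$ is in the index set $I_N$ of Equation (40), $\mathbf{n}\neq\mathbf{0}$, and setting $\hat{\mathbf{g}}_{\mathbf{0}} = \mathbf{0}$, we can formulate $\mathbf{F}_i^{\mathrm{recip}}$ directly in Fourier-terms

$$\mathbf{F}_i^{\mathrm{recip}} = \frac{2q_i}{L^2}\,\mathrm{Im}(\mathbf{g}_i), \qquad \mathbf{g}_i = \sum_{\mathbf{n}\in I_N}\hat{\mathbf{g}}_{\mathbf{n}}\,e^{+2\pi\imath\,\mathbf{n}\cdot\mathbf{x}_i} \tag{53}$$

Calculating the reciprocal-space force $\mathbf{F}_i^{\mathrm{recip}}$ on particle $i$, using Equation (53), means that we

1) start with the structure factor components, $S(\mathbf{n})$, already obtained when we calculated $U^{\mathrm{recip}}$,

2) scale each $S(\mathbf{n})$ using Equation (52),

3) giving a new set of Fourier-coefficients that are transformed back to real-space, via Equation (53), using the conjugated NDFT, and finally

4) with Equation (53), taking the imaginary part of coefficient $\mathbf{g}_i$ and scaling it with $2q_i/L^2$ gives the reciprocal force on particle $i$.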
 
Thus we can use the NDFT approach to calculate each of the components of the reciprocal forces. Again, from a computational point of view this means we can expect to utilize an NFFT based algorithm to find the respective components.
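The four steps can be checked numerically against a finite-difference derivative of the reciprocal energy. The sketch below again uses direct (slow) sums in plain Python in place of the fast transforms, and assumes Gaussian units and a structure factor defined with exp(-2 pi i n.r/L), so that the back-transform uses the opposite sign. All function names and the three-particle test configuration are hypothetical.

```python
import cmath
import itertools
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def modes(n_c):
    """All integer vectors in [-n_c, n_c]^3 except the zero vector."""
    return [n for n in itertools.product(range(-n_c, n_c + 1), repeat=3) if any(n)]


def damping(n, alpha, L):
    n2 = dot(n, n)
    return math.exp(-math.pi ** 2 * n2 / (alpha * L) ** 2) / n2


def structure_factor(charges, positions, L, n):
    return sum(q * cmath.exp(-2j * math.pi * dot(n, r) / L)
               for q, r in zip(charges, positions))


def reciprocal_energy(charges, positions, L, alpha, n_c):
    return sum(damping(n, alpha, L) * abs(structure_factor(charges, positions, L, n)) ** 2
               for n in modes(n_c)) / (2.0 * math.pi * L)


def reciprocal_forces(charges, positions, L, alpha, n_c):
    ns = modes(n_c)
    # Step 1: structure factor components (shared with the energy calculation).
    S = [structure_factor(charges, positions, L, n) for n in ns]
    # Step 2: scaled Fourier coefficients, one set per Cartesian component.
    g_hat = [[n[t] * damping(n, alpha, L) * s for n, s in zip(ns, S)]
             for t in range(3)]
    forces = []
    for q, r in zip(charges, positions):
        # Step 3: conjugated transform evaluated at this particle's location.
        phases = [cmath.exp(2j * math.pi * dot(n, r) / L) for n in ns]
        g = [sum(c * p for c, p in zip(g_hat[t], phases)) for t in range(3)]
        # Step 4: imaginary part, scaled by 2 q_i / L^2.
        forces.append([2.0 * q / L ** 2 * g_t.imag for g_t in g])
    return forces


charges = [1.0, -0.6, -0.4]
positions = [[0.3, -0.5, 0.1], [-0.7, 0.2, 0.6], [0.9, 0.8, -0.4]]
forces = reciprocal_forces(charges, positions, L=2.0, alpha=1.5, n_c=3)
print(forces[0])
```

With exact transforms the forces sum to zero component by component, which reflects the momentum conservation mentioned in the introduction; an approximate fast transform preserves this only to within its own accuracy.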
 
3.2.4. Combining the Ewald Method and NFFT
 
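The reformulation of $U^{\mathrm{recip}}$ and $\mathbf{F}_i^{\mathrm{recip}}$ as in Equations (48) and (53) shows the central role of the transposed and conjugated NDFT in calculating the reciprocal-space energy and forces. Starting with the location of the charged particles, the structure factor $S(\mathbf{n})$ is calculated via a transposed NDFT. In the language of Ewald summation, we transform from real-space to reciprocal-space. Scaling the squared absolute value of the Fourier-components and summing gives $U^{\mathrm{recip}}$. To find $\mathbf{F}_i^{\mathrm{recip}}$ we go back from the reciprocal-space to the real-space by first calculating the Fourier-components of the forces and then performing a conjugated NDFT.

An implementation of Ewald summation uses cut-offs, in reciprocal-space, $n_c$, and real-space, $r_c$; with $\alpha$ large enough and with a required accuracy, $\varepsilon$, truncate the sums Equation (27) and Equation (28) at the respective cut-offs so that the last term added is smaller than $\varepsilon$ in each of the sums. The parameter $\alpha$ is fixed by

$$e^{-\alpha^2 r_c^2} \le \varepsilon \quad\Longrightarrow\quad \alpha \ge \frac{\sqrt{-\ln\varepsilon}}{r_c} \tag{54}$$

Then $n_c$ is determined by

$$\exp\!\left(-\frac{\pi^2 n_c^2}{\alpha^2L^2}\right) \le \varepsilon \quad\Longrightarrow\quad n_c \ge \frac{\alpha L}{\pi}\sqrt{-\ln\varepsilon} \tag{55}$$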
 
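We have recast $U^{\mathrm{recip}}$ and $\mathbf{F}_i^{\mathrm{recip}}$ in terms of Fourier-components, with the size of the frequency index set in each dimension set from the reciprocal-space cut-off $n_c$ and the oversampling factor $\sigma$. In general the computational complexity of the NFFT method is $\mathcal{O}(N_\Pi \log N_\Pi + |\log\varepsilon|^d M)$, where $\varepsilon$ is the desired accuracy in the approximation used within NFFT [53].

Using Equation (54) and the above definition of the index set, we see that, for a fixed real-space cut-off and density, $n_c$ grows in proportion to $L$, so the number of Fourier-components grows in proportion to the number of particles $N$. Note that the number of terms $m$ is a function of $\varepsilon$, for a fixed oversampling factor. With a controlled approximation of the structure factor via the use of nonuniform fast Fourier transform, the original computational complexity of $\mathcal{O}(N^{3/2})$ becomes $\mathcal{O}(N \log N)$.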
 
At this stage, the path to a fast Ewald method should be clear. By specifying an accuracy $\varepsilon$, we replace the transposed and conjugated NDFT with the corresponding operations using the NFFT algorithm. Thus Equations (48) and (53) become a concise procedure to calculate approximations of $U^{\mathrm{recip}}$ and $\mathbf{F}_i^{\mathrm{recip}}$. Most of the mathematical details can be kept separate and hidden in a set of library routines and the remaining formulas pertain to the physics of the problem. Furthermore, with a library implementation based on a state-of-the-art FFT-library, we have good reason to expect it to be efficient.
 
3.2.5. Implementation and Results
 
In a first implementation [56,57], we used the libraries FFTW [58] and NFFT [54]. Details of the accuracy and scaling properties can be found in the referenced papers.
 
Basing the implementation on libraries has a number of advantages. It makes the implementation task easier and introduces a convenient division of labor in the program code: the mathematical aspects are mainly concentrated to the libraries while the physical aspects of the problem remain. Also, since the code becomes quite compact without becoming convoluted, it becomes easier to check, understand, and explain. Improvements and optimizations of the libraries can be easily included in the program, usually by just relinking the program. For example, the customization of the window function used in the NFFT algorithm---Gaussian functions, dilated cardinal B-splines, Sinc functions, or Kaiser-Bessel functions---is currently achieved by recompiling the NFFT library and relinking the application. Due to the comparatively small size of the NFFT library this is very quick. Furthermore, improvements in either theory or implementation of the used libraries will be easily accessible.
 
In summary, we claim that the ENUF method is
 
・ efficient and concise, and
 
・ has a clear separation of concerns between mathematical and physical details.
 
In a sense it can be said that we get the best of two worlds: a concise and efficient algorithm. The separation of mathematical concerns is a bonus that has the potential to simplify implementation and further developments due to the fact that they may occur independently of each-other.
 
4. ENUF Implemented in M.DynaMix
 
M.DynaMix is a highly modular general purpose parallel Molecular Dynamics code for simulations of arbitrary mixtures of either rigid or flexible molecules. It was released by Lyubartsev and Laaksonen in the late 1990s [59]. Most common force fields can be used in simulations with a variety of periodic boundary conditions (cubic, rectangular, hexagonal or truncated octahedron). Quantum corrections to the atomic motion can be done using the Path Integral Molecular Dynamics approach. M.DynaMix has been used in applications from materials design to biological processes.
 
M.DynaMix deals with particle systems interacting through a force field consisting of Lennard-Jones, electrostatic, covalent bond, angle and torsion angle potentials, as well as some optional terms, in a periodic rectangular, hexagonal or truncated octahedron cell. Rigid bonds are constrained by the SHAKE algorithm [60]. For flexible molecular models the double time step algorithm is used [61]. Algorithms for the NVE, NVT and NPT statistical ensembles are implemented, as well as the Ewald sum approach for the treatment of the electrostatic interactions. An option to calculate free energy by the expanded ensemble method with Wang-Landau optimization of the balancing factor is included in later versions. Thanks to its features and capabilities, M.DynaMix is used by a large modeling and simulation community. Written in FORTRAN 77, it can be run on a wide variety of hardware architectures in both sequential and parallel execution. The entire program source code consists of a number of FORTRAN files (modules), made of blocks of subroutines, performing different tasks or groups of tasks.
 
4.1. Framework Overview
 
The situation outlined for M.DynaMix is very common in the field of computational science: a package created in the late 1990s that is still in use despite innovations in computing platforms. In a situation like this, each upgrade requires a trade-off between the need to preserve the existing structure and the wish to obtain as much efficiency as possible.
 
To take advantage of the latest programming techniques and tools, we decided to use the C language for some of the new code segments. On the one hand, modules in FORTRAN and C can easily coexist in the same code, provided a few mild guidelines are followed [62]; on the other hand, C provides several enhancements related to intrinsic features (dynamic memory allocation and direct pointer references) and the possibility to set up complex data structures; furthermore, it allows more direct access to a large number of external routine libraries written in C. From a general point of view, the M.DynaMix code has been designed with a good degree of modularity, and it has been possible to make most of the upgrades just by replacing a block (subroutine) with a new one.
 
4.2. Input Parameters
 
To use the ENUF method for treating long range interactions, the M.DynaMix user needs to set the proper key string in the input file [63]; the parameters to specify are the Ewald convergence factor $\alpha$ and the number of points for the FFT grid in the $x$, $y$ and $z$ directions; starting from them, the program automatically sets proper values for the oversampling factor $\sigma$ and the approximation parameter $m$.
 
4.3. Implementation
 
Ewald-like methods for computing electrostatic energies typically replace part of the summation in real space with an equivalent summation in Fourier space; among M.DynaMix modules, there is one devoted to this reciprocal space duty. In the starting setup, this module was present in M.DynaMix in a version related to the full Ewald algorithm; ENUF implementation involved the creation of a second instance of this module.
 
A scheme of the ENUF module as implemented in M.DynaMix is presented in Table 1; the module in this case is a group of files, some written in FORTRAN and some in C; each file contains routines related to the algorithm steps. For each step, the table displays the programming language used to write the corresponding file. Some files make calls to external libraries, typically used to perform the nonuniform Fourier transform and its inverse.
 
 
   
 
 
Table 1. ENUF algorithm scheme: for each step, the language of the related routines and the presence of calls to external libraries are listed.
 
  
4.3.1. Coordinate Scaling and Adjoint Transform
 
In the first step of the algorithm, particle coordinates are rescaled with respect to the simulation box, in order to obtain the proper set of nonequispaced knots as required in Equation (38). Knots are stored consecutively in a one-dimensional FORTRAN array, saving coordinates in reverse component order $(z, y, x)$ for each knot. Particle charges are stored in a second one-dimensional array.

At the next step, a C routine performs the three-dimensional adjoint transform of the charges located at the knots. The C routine accesses the one-dimensional FORTRAN vectors of knots and charges as three-dimensional arrays; the reverse component order adopted when filling the knot vector is related to this and depends on the different way multi-dimensional arrays are arranged in memory: by columns for FORTRAN, by rows for C. Transformed data are stored in a complex array.
 
4.3.2. Energy, Regular Transform and Forces
 
In this stage, the energy is evaluated by a sum, according to Equation (48). Then, the transformed array is rescaled using Equation (51); starting from it, three complex rescaled arrays are created, one for each direction in the reciprocal space. A C routine back-transforms the three complex data sets on the same set of knots. In this step, three independent back transformations are performed, producing three complex arrays. The force contributions for each particle are obtained according to Equation (53); the components in the $x$, $y$ and $z$ directions are evaluated using the imaginary parts of these three arrays.
 
4.4. External Libraries
 
To perform the Fourier transform on nonequispaced data, our current implementation makes use of an existing external function library. Among the available resources, we selected the library NFFT [54]. NFFT is a widely used C subroutine library for computing the nonequispaced discrete Fourier transform in one or more dimensions, of arbitrary input size and of complex data; it is based on FFTW [58].
 
4.5. Validation
 
In the original version of M.DynaMix, long-range interactions are treated with the full direct Ewald summation. Using a cross-check approach, namely comparing the ENUF method against this established one, it was possible to debug every step of the implementation and verify its correctness in depth.
 
 After the implementation, the same cross-check mechanism was used to investigate the precision and efficiency of ENUF. Figure 1 displays the execution time spent on evaluating long-range interactions with respect to the number of particles N, for full Ewald and for ENUF. The simulated system is composed of 50 Na$^+$ ions and 50 Cl$^-$ ions in water solution in a cubic cell. The number of water molecules is increased from $10^3$ to $25 \times 10^3$ in seven discrete steps, keeping the density constant at 1.02 g/cm$^3$. The simulations were run on an Intel Xeon workstation (1.86 GHz, 8 GB of RAM).
 
The full Ewald method was taken as the reference; for each system size, the ENUF parameters were tuned to reproduce the Ewald precision, evaluated in terms of the electrostatic energy contribution, tolerating a maximum deviation of 2%.
 
 Figure 1. Execution time for long-range interactions vs. number of particles N, for Ewald and ENUF. The system is composed of 50 Na$^+$ and 50 Cl$^-$ ions in H$_2$O solution in a periodic cubic cell. The number of water molecules is increased from $10^3$ to $25 \times 10^3$, keeping the density at 1.02 g/cm$^3$.

 The plot shows that the execution time for long-range interactions, which is the efficiency bottleneck in the original M.DynaMix version, is drastically reduced at the same numerical precision. The plot also shows the different trends of execution time vs. N in the two methods: the point set for the full Ewald scheme is well fitted by its well-known theoretical behavior, while the data points for ENUF correspond quite well to the expected $O(N \log N)$ scaling.
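One simple way to compare measured timings with candidate scaling laws is to fit only a prefactor for each law and inspect the residuals. The sketch below does this in log space. The timings are synthetic, generated from an $N \log N$ law with 2% noise purely for illustration; they are not the measurements behind Figure 1, and the set of candidate laws is an assumption.

```python
import numpy as np

# Candidate scaling laws t ~ c * f(N); the selection is illustrative.
MODELS = {
    "N^2": lambda n: n**2,
    "N^(3/2)": lambda n: n**1.5,
    "N log N": lambda n: n * np.log(n),
}

def fit_scaling(n, t):
    """Fit the prefactor c of each law in log space.

    Returns {name: (c, rms residual of log t)}.
    """
    results = {}
    for name, law in MODELS.items():
        log_ratio = np.log(t) - np.log(law(n))
        log_c = log_ratio.mean()
        rms = np.sqrt(np.mean((log_ratio - log_c) ** 2))
        results[name] = (np.exp(log_c), rms)
    return results

# Synthetic timings (NOT measured data): N log N with 2% multiplicative noise.
n = np.array([3.0e3, 1.0e4, 2.0e4, 3.0e4, 4.5e4, 6.0e4, 7.5e4])
rng = np.random.default_rng(0)
t = 2.0e-6 * n * np.log(n) * (1.0 + 0.02 * rng.standard_normal(n.size))

fits = fit_scaling(n, t)
best = min(fits, key=lambda name: fits[name][1])
```

Fitting a single prefactor per law keeps the comparison fair: every candidate has one free parameter, so the residuals reflect the shape of the curve only.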
 
5. Implementation of ENUF Method in Dissipative Particle Dynamics Scheme
 
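 The Dissipative Particle Dynamics (DPD) simulation method, originally proposed by Hoogerbrugge and Koelman, is a particle-based method for simulating hydrodynamic phenomena at the mesoscopic level [64-66]. In the DPD model, several atoms or molecules are grouped together to form coarse-grained particles. The interactions between any pair of DPD particles $i$ and $j$ are normally composed of three pairwise additive forces: the conservative force $\mathbf{F}^{C}_{ij}$, the dissipative force $\mathbf{F}^{D}_{ij}$, and the random force $\mathbf{F}^{R}_{ij}$,

 $$\mathbf{F}_{i} = \sum_{j \neq i} \left( \mathbf{F}^{C}_{ij} + \mathbf{F}^{D}_{ij} + \mathbf{F}^{R}_{ij} \right)$$
 (56)

with

 $$\mathbf{F}^{C}_{ij} = \alpha_{ij}\, \omega^{C}(r_{ij})\, \hat{\mathbf{r}}_{ij}$$
 (57)

 $$\mathbf{F}^{D}_{ij} = -\gamma\, \omega^{D}(r_{ij})\, (\mathbf{v}_{ij} \cdot \hat{\mathbf{r}}_{ij})\, \hat{\mathbf{r}}_{ij}$$
 (58)

 $$\mathbf{F}^{R}_{ij} = \sigma\, \omega^{R}(r_{ij})\, \theta_{ij}\, \hat{\mathbf{r}}_{ij}$$
 (59)

 where $\mathbf{r}_{ij} = \mathbf{r}_{i} - \mathbf{r}_{j}$, $r_{ij} = |\mathbf{r}_{ij}|$, $\hat{\mathbf{r}}_{ij} = \mathbf{r}_{ij}/r_{ij}$, and $\mathbf{v}_{ij} = \mathbf{v}_{i} - \mathbf{v}_{j}$. The parameters $\alpha_{ij}$, $\gamma$, and $\sigma$ determine the strength of the conservative, dissipative, and random forces, respectively. $\theta_{ij}$ is a randomly fluctuating variable with zero mean and unit variance.

 The pairwise conservative force is written in terms of a weight function $\omega^{C}(r_{ij})$, where $\omega^{C}(r_{ij}) = 1 - r_{ij}/R_{c}$ is usually chosen for $r_{ij} \le R_{c}$ and $\omega^{C}(r_{ij}) = 0$ for $r_{ij} > R_{c}$, such that the conservative force is soft and repulsive. The unit of length $R_{c}$ is related to the volume of DPD particles. The two weight functions $\omega^{D}(r_{ij})$ and $\omega^{R}(r_{ij})$ for the dissipative and random forces, respectively, are coupled through the fluctuation-dissipation theorem

 $$\omega^{D}(r_{ij}) = \left[ \omega^{R}(r_{ij}) \right]^{2}, \qquad \sigma^{2} = 2 \gamma k_{B} T$$
 (60)

 to form a thermostat and naturally generate the canonical distribution (constant number of particles, N, volume, V, and temperature, T) [67]. In most applications, the weight function $\omega^{D}(r_{ij})$ adopts the simple form [68]

 $$\omega^{D}(r_{ij}) = \left[ \omega^{R}(r_{ij}) \right]^{2} = \begin{cases} (1 - r_{ij}/R_{c})^{2}, & r_{ij} \le R_{c} \\ 0, & r_{ij} > R_{c} \end{cases}$$
 (61)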
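The three pairwise forces described above can be sketched as a plain double loop. This is a minimal illustration and not the authors' code: the function name and argument names are hypothetical, periodic boundaries are omitted, and the factor $1/\sqrt{\Delta t}$ in the random force is the usual discrete-time convention, assumed here rather than taken from the text.

```python
import numpy as np

def dpd_pair_forces(pos, vel, a, gamma, kT, rc, dt, rng):
    """Total DPD force on each particle (no periodic boundaries).

    a      : conservative repulsion strength (same for all pairs here)
    gamma  : friction coefficient; sigma follows from sigma^2 = 2 gamma kT
    rc     : cut-off radius, dt : time step (enters the random force only)
    """
    sigma = np.sqrt(2.0 * gamma * kT)
    forces = np.zeros_like(pos)
    n = len(pos)
    for i in range(n - 1):
        for j in range(i + 1, n):
            rij = pos[i] - pos[j]
            r = np.linalg.norm(rij)
            if r >= rc or r == 0.0:
                continue
            e = rij / r                      # unit vector from j to i
            w = 1.0 - r / rc                 # w^C = w^R; w^D = w**2
            f_c = a * w
            f_d = -gamma * w**2 * np.dot(vel[i] - vel[j], e)
            f_r = sigma * w * rng.standard_normal() / np.sqrt(dt)
            f = (f_c + f_d + f_r) * e
            forces[i] += f                   # Newton's third law keeps
            forces[j] -= f                   # total momentum conserved
    return forces
```

Because every contribution is applied with opposite signs to the two partners, the forces sum to zero, which is what lets the DPD thermostat preserve hydrodynamics.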
 
In the original DPD model, one critical advantage is the soft repulsive nature of the conservative potential, which allows the equations of motion to be integrated with a large time step. However, this advantage hinders the direct incorporation of electrostatic interactions into the DPD model. The main problem is that dissipative particles carrying opposite point charges tend to collapse onto each other and form artificial ionic clusters, because the electrostatic attraction is stronger than the soft repulsive conservative interaction.
 
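 In order to avoid such unphysical behavior, the point charges at the centers of dissipative particles are usually replaced by charge density distributions smeared around the particles, which removes the divergence of the electrostatic interactions at $r = 0$ [69,70]. In our implementation [71,72], a Slater-type charge density distribution is considered, of the form

 $$\rho(r) = \frac{q}{\pi \lambda^{3}}\, e^{-2r/\lambda}$$
 (62)

 in which $\lambda$ is the decay length. The integration of Equation (62) over the whole space gives the total charge $q$.

 The electrostatic potential $\phi(r)$ generated by the Slater-type charge distribution $\rho(r)$ can be obtained by solving Poisson's equation

 $$\nabla^{2} \phi(r) = -\frac{\rho(r)}{\varepsilon_{0} \varepsilon_{r}}$$
 (63)

 in which $\varepsilon_{0}$ and $\varepsilon_{r}$ are the permittivity of vacuum and the dielectric constant of water at room temperature, respectively. In spherical coordinates, Poisson's equation becomes

 $$\frac{1}{r} \frac{\partial^{2}}{\partial r^{2}} \left[ r \phi(r) \right] = -\frac{\rho(r)}{\varepsilon_{0} \varepsilon_{r}}$$
 (64)

 By solving Poisson's equation, the electrostatic potential field $\phi(r)$ can be expressed analytically as

 $$\phi(r) = \frac{q}{4 \pi \varepsilon_{0} \varepsilon_{r}\, r} \left[ 1 - \left( 1 + \frac{r}{\lambda} \right) e^{-2r/\lambda} \right]$$
 (65)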
 
 The electrostatic energy between two interacting charge density distributions 
 and 
 is the product of the charge density distribution 
 and the electrostatic potential generated by charge distribution 
 at position 
 
 
 
 (66) 
 
 The electrostatic force on charge distribution 
 is the negative of the derivative of the potential energy 
 with respect to its position 
 
 
 
 (67) 
 
 By defining parameter 
 as the reduced center-to-center distance between two charge distributions and dimensionless parameter
, respectively, the reduced electrostatic energy and force between two Slater-type charge distributions are given by 
 
 
 (68) 
 
 
 (69) 
 
Compared with the electrostatic energy and force between point charges in the atomistic simulations of the previous section, the electrostatic energy and force between two Slater-type charge distributions in DPD simulations are scaled by corresponding correction factors:
 
 
 (70) 
 
 
 (71) 
 
The similarity between the electrostatic energy and force for point charges and their counterparts for charge density distributions implies that, once the point-charge energy and force are known, those between Slater-type charge density distributions in DPD simulations can be obtained directly by rescaling with the corresponding correction factors.
 
For point charges, both the electrostatic energy and the force diverge as the separation approaches zero. For charge density distributions, in contrast, in the limit of vanishing separation the reduced electrostatic energy and force are, respectively, described by
 
 
 (72) 
 
 
 (73) 
 
It is clear that the adoption of the Slater-type charge distribution in DPD simulations removes the divergence of the electrostatic interactions at zero separation, so that both the electrostatic energy and the force between charge distributions remain finite.
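The origin of the correction factors and of the finite limits can be made explicit with a short worked sketch. It rests on assumptions that are not taken from the source: a Slater density $\rho(r) = q\,e^{-2r/\lambda}/(\pi\lambda^{3})$ with decay length $\lambda$, the common approximation in which a charge $q_i$ sits in the potential of the smeared charge $q_j$, and the notation $R = r/R_c$, $\beta = R_c/\lambda$, $B_1$, $B_2$, all introduced here for illustration. The prefactors of the reduced expressions in the text may differ.

```latex
\begin{align}
U_{ij}(r) &= \frac{q_i q_j}{4\pi\varepsilon_0\varepsilon_r\, r}\, B_1(R),
  & B_1(R) &= 1 - (1+\beta R)\, e^{-2\beta R}, \\
F_{ij}(r) &= -\frac{\mathrm{d}U_{ij}}{\mathrm{d}r}
           = \frac{q_i q_j}{4\pi\varepsilon_0\varepsilon_r\, r^{2}}\, B_2(R),
  & B_2(R) &= 1 - \bigl[1 + 2\beta R\,(1+\beta R)\bigr]\, e^{-2\beta R}, \\
B_1(R) &= \beta R - \tfrac{2}{3}(\beta R)^{3} + \dots,
  & B_2(R) &= \tfrac{4}{3}(\beta R)^{3} + \dots \qquad (R \to 0), \\
\lim_{R\to 0} U_{ij} &= \frac{q_i q_j\, \beta}{4\pi\varepsilon_0\varepsilon_r\, R_c},
  & \lim_{R\to 0} F_{ij} &= 0 .
\end{align}
```

For $\beta R \gg 1$ both $B_1$ and $B_2$ tend to one, so the point-charge Coulomb energy and force are recovered at large separation.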
 
 By matching the maximum electrostatic energy between charge distributions at 
 with Groot’s previous work [69] gives
. In our detailed implementations, we adopted a particular coarse-graining scheme [73] with 
 and
, in which the former parameter means 
 water molecules being coarse-grained into one DPD particle and the latter means there are 
 DPD particles in the volume of
. With this particular scheme, the length unit 
 is given as
. From the relation of
, we can get
, which is consistent with the electrostatic smearing radii used in González-Melchor’s work [70]. 
 
Figure 2 shows the reduced electrostatic energy and force as functions of the relative distance between Slater-type charge distributions. For comparison, we also include the typical soft conservative potential and force between dissipative particles, as well as the standard Coulombic potential and force between point charges, which diverge at zero separation. At short range, both the electrostatic energy and the force remain finite, which is attributable to the use of Slater-type charge distributions instead of point charges. At long range, both are consistent with their point-charge counterparts, which implies that the ENUF-DPD method captures the essential characteristics of electrostatic interactions at the mesoscopic level.
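The short-range and long-range behavior described above can be checked numerically. The sketch assumes the Slater-type correction factors commonly used in the DPD literature, written with a reduced distance `R` and a dimensionless inverse decay length `beta`; both names, the prefactor-free reduced units and the value `beta = 1.0` are illustrative assumptions, not the parameters of Figure 2.

```python
import numpy as np

def slater_factors(R, beta):
    """Correction factors multiplying the point-charge energy (b1) and force (b2)."""
    x = beta * np.asarray(R, dtype=float)
    damp = np.exp(-2.0 * x)
    b1 = 1.0 - (1.0 + x) * damp
    b2 = 1.0 - (1.0 + 2.0 * x * (1.0 + x)) * damp
    return b1, b2

def smeared_energy_force(R, beta):
    """Reduced energy and force between two like unit charges (prefactor omitted)."""
    R = np.asarray(R, dtype=float)
    b1, b2 = slater_factors(R, beta)
    return b1 / R, b2 / R**2

beta = 1.0                                     # illustrative value only
u_short, f_short = smeared_energy_force(1.0e-4, beta)
u_long, f_long = smeared_energy_force(5.0, beta)
```

At `R = 1e-4` the energy is close to `beta` and the force is close to zero, while at `R = 5` both agree with the point-charge values `1/R` and `1/R**2` to better than a few percent.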
 
 Combining the electrostatic force and the soft repulsive conservative force gives the total conservative force on each particle in DPD simulations. The total conservative force, together with the dissipative and random forces, as well as the intramolecular bonding force for polymers and surfactants, acts on the dissipative particles and drives the simulated system toward equilibrium before statistical analysis is carried out.
 
 Figure 2. Electrostatic potential and force between two charge density distributions in the DPD scheme, calculated with the ENUF and Ewald summation methods using reference parameters. For comparison, the standard Coulombic potential and force, both of which diverge at $r = 0$, and the typical conservative potential and force of the standard DPD method are also included. Both the electrostatic potential and the force are plotted for two like-charged distributions.

As the number of charged DPD particles in the simulated system grows, the calculation of the reciprocal-space electrostatic interactions becomes the most time-consuming part. Using suitable parameters in the ENUF-DPD method ensures that the time to calculate the real-space summations is approximately the same as the time to calculate the reciprocal-space summations, thereby reducing the total computational time. Here we explore the ENUF-DPD parameters to obtain a suitable set for further applications.
 
 As addressed in Section 3.2, the implementation of the ENUF-DPD method uses the Ewald convergence parameter, the required accuracy, and two cut-offs (one for the real-space and one for the reciprocal-space summation). These parameters are correlated through the two conditions shown in Equations (54) and (55). Given the required accuracy, it is most convenient to first pick a suitable value for the convergence parameter; the two cut-offs then follow directly from these conditions. However, because the reciprocal-space cut-off must be an integer and the real-space cut-off must suit the cell-linked-list update scheme used in DPD simulations, we adopt another procedure to obtain suitable parameters.
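The coupling between the convergence parameter, the accuracy and the two cut-offs can be sketched with the Gaussian-tail estimates often used for Ewald sums, $e^{-(\alpha r_c)^2} \le \delta$ and $e^{-(\pi n_c/\alpha L)^2} \le \delta$. These forms, the symbols and the function names are assumptions made for illustration and may differ from Equations (54) and (55).

```python
import math

def ewald_cutoffs(alpha, delta, box):
    """Real-space cut-off r_c and integer reciprocal cut-off n_c for tolerance delta.

    Assumes exp(-(alpha*r_c)**2) = delta and exp(-(pi*n_c/(alpha*box))**2) <= delta.
    """
    s = math.sqrt(-math.log(delta))
    r_c = s / alpha
    n_c = math.ceil(s * alpha * box / math.pi)   # round up to the next integer
    return r_c, n_c

def alpha_from_cutoff(r_c, delta):
    """Inverse route: fix r_c first (e.g. to suit a cell list), then derive alpha."""
    return math.sqrt(-math.log(delta)) / r_c

# Illustrative numbers only (box length and tolerance are hypothetical).
r_c, n_c = ewald_cutoffs(alpha=0.5, delta=1.0e-5, box=10.0)
```

The second function reflects the practical constraint mentioned above: when the real-space cut-off is dictated by the neighbor-search scheme, the convergence parameter is derived from it rather than the other way round.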
 
 During the calculation of the real-space summation and self-interaction parts of the electrostatic energy and force between point charges, we can directly multiply by the corresponding correction factors to obtain the electrostatic energy and force between Slater-type charge density distributions. This is not possible for the reciprocal-space summation, since the results of the NFFT transformation cannot be rescaled. However, if we choose a suitable real-space cutoff beyond which the two correction factors converge to unity, the NFFT results can be adopted directly, without any correction. We find that the electrostatic energy and force between charge density distributions are consistent with their point-charge counterparts beyond a certain relative distance between the two distributions. Hence, in our simulations, this distance is taken as the cutoff for the real-space summations of electrostatic interactions.
 
 By evaluating the Madelung constant of a face-centered cubic (FCC) lattice, we adopt the Ewald convergence parameter with the value of
, which reproduces the Madelung constant of the FCC lattice structure accurately. We then perform coarse-grained simulations of a bulk electrolyte system to explore suitable values for
, and approximation parameter 
 in NFFT. It is specified that approximation parameter 
 and cutoff 
 for reciprocal space summations can generate consistent electrostatic energy and force in comparison with those obtained from traditional Ewald summation method with reference parameters. Larger 
 values further increase the accuracy of the electrostatic interactions in the ENUF-DPD method, but also increase the total computational time spent on them. As a compromise between accuracy and computational speed in the ENUF-DPD method, we adopt 
 and 
 in the following simulations. 
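A Madelung-constant check of this kind can be sketched with a plain Ewald sum. The example below assumes the rock-salt (NaCl) structure, whose two ion species each sit on an FCC lattice; which structure and parameter values the authors used is not stated in the text, so the structure, the function names and the values of `alpha` are illustrative. Gaussian units and a unit cubic cell are used.

```python
import itertools
import math
import numpy as np

def ewald_energy(pos, q, box, alpha, n_real=1, n_rec=8):
    """Total Ewald energy of a neutral cubic cell (Gaussian units)."""
    erfc = np.vectorize(math.erfc)
    qq = np.outer(q, q)
    e_real = 0.0
    # Real-space sum over the home cell and its nearest periodic images.
    for shift in itertools.product(range(-n_real, n_real + 1), repeat=3):
        d = pos[:, None, :] - pos[None, :, :] + box * np.array(shift)
        r = np.linalg.norm(d, axis=2)
        mask = r > 1e-12                     # skip i == j in the home cell
        e_real += 0.5 * np.sum(qq[mask] * erfc(alpha * r[mask]) / r[mask])
    # Reciprocal-space sum over k != 0.
    n = np.arange(-n_rec, n_rec + 1)
    nvec = np.array(np.meshgrid(n, n, n, indexing="ij")).reshape(3, -1).T
    nvec = nvec[np.any(nvec != 0, axis=1)]
    k = 2.0 * math.pi * nvec / box
    k2 = np.sum(k * k, axis=1)
    s = q @ np.exp(1j * pos @ k.T)           # structure factor S(k)
    e_rec = 2.0 * math.pi / box**3 * np.sum(
        np.exp(-k2 / (4.0 * alpha**2)) / k2 * np.abs(s) ** 2)
    # Self-interaction correction.
    e_self = -alpha / math.sqrt(math.pi) * np.sum(q**2)
    return e_real + e_rec + e_self

def nacl_madelung(alpha):
    """Madelung constant of rock salt, referred to the nearest-neighbor distance."""
    na = np.array([[0, 0, 0], [0, .5, .5], [.5, 0, .5], [.5, .5, 0]], dtype=float)
    cl = np.array([[.5, 0, 0], [0, .5, 0], [0, 0, .5], [.5, .5, .5]], dtype=float)
    pos = np.vstack([na, cl])
    q = np.array([1.0] * 4 + [-1.0] * 4)
    energy = ewald_energy(pos, q, box=1.0, alpha=alpha)
    return -energy * 0.5 / 4.0               # 4 ion pairs, neighbor distance 0.5
```

For a converged sum the result does not depend on `alpha`, so comparing a few values of the convergence parameter against the tabulated constant (about 1.747565 for rock salt) is a convenient accuracy test.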
 
With this set of optimized parameters, we address the computational complexity of the ENUF-DPD method in treating electrostatic interactions. The complexity is approximately $O(N \log N)$, which is remarkably more efficient than the traditional Ewald summation method, with acceptable accuracy in treating long-range electrostatic interactions between charged particles at the mesoscopic level.
 
The ENUF-DPD method is then validated by investigating the influence of the charge fraction of a polyelectrolyte on its conformational properties [71]. With increasing charge fraction, both the intramolecular correlations between charged beads on the polyelectrolyte chain and the intermolecular correlations between charged beads on the polyelectrolyte and counterions are enhanced. The conformational transition of the polyelectrolyte chain from a collapsed state to a fully extended conformation can be visualized from the simulations. The dependence of the conformations of a fully ionized polyelectrolyte on the valency and concentration of added salts is also studied in detail. Counterions with larger valency condense more strongly on polyelectrolyte chains. Such counterions can drive polyelectrolyte chains from an extended conformation to a compact state, and then to a swollen conformation as the counterion concentration increases.
 
With the ENUF-DPD method, we further investigate the specific binding structures of dendrimers on amphiphilic bilayer membranes [74]. We construct mutually consistent coarse-grained models for dendrimers and lipid molecules, which can properly describe the conformation of charged dendrimers and the surface tension of amphiphilic membranes, respectively. Systematic simulations are performed and simulation results reveal that the permeability of dendrimers across membranes is enhanced upon increasing dendrimer sizes. The negative curvature of amphiphilic membrane formed in dendrimer-membrane complexes is related to dendrimer concentration. Higher dendrimer concentration together with the synergistic effect between charged dendrimers can also enhance the permeability of dendrimers across amphiphilic membranes.
 
With these two representative applications, we see that the newly implemented ENUF-DPD method captures the essential characteristics of electrostatic interactions at the mesoscopic level. The method retains all capabilities of the ordinary DPD method while extending it to applications where electrostatic interactions are essential but were previously inaccessible, and can hence be used to study charged complex systems at the mesoscopic level.
 
6. Summary and Conclusions
 
Treatment of electrostatic interactions based on Ewald summation techniques is reviewed. While Ewald summation is still considered the most accurate scheme for computing long-ranged interactions, it is also the part that slows simulations down. Its scaling makes computations on large systems very time-consuming; in particular, the reciprocal part of the Ewald sum becomes a bottleneck. As an attractive alternative to mesh-based schemes, which show a linear scaling, we introduce an Ewald method based on non-uniform fast Fourier transforms (ENUF), giving examples of two implementations in existing software packages, for atomistic Molecular Dynamics and for Dissipative Particle Dynamics. We demonstrate that the implementation becomes straightforward because it relies on the NFFT library. We discuss the optimization of convergence parameters and window functions.
 
 The ENUF method scales nearly linearly, as $O(N \log N)$, and conserves both energy and momentum to floating-point accuracy, making it a very robust and accurate method.
 
Acknowledgements
 
Y.-L. W. and A. L. acknowledge the KA Wallenberg Foundation for financial support. F. M. acknowledges the Wenner-Gren and the Carl Tryggers Foundations for funding for Visiting Professorship. Y.-L. W. and A. L. thank the Kavli Institute for Theoretical Physics China (KITPC) and Chinese Academy of Sciences (CAS) for support and scientific environment during their stay. This work is supported by SERC the Swedish e-Science Center and Swedish Science Council. Parts of the computations were performed on resources provided by the Swedish National Infrastructure for Computing (SNIC) at PDC, HPC2N, and NSC.