The LBFGS Quasi-Newtonian Method for Molecular Modeling Prion AGAAAAGA Amyloid Fibrils

Experimental X-ray crystallography, NMR (Nuclear Magnetic Resonance) spectroscopy, dual polarization interferometry, etc are indeed very powerful tools to determine the 3-Dimensional structure of a protein (including the membrane protein); theoretical mathematical and physical computational approaches can also allow us to obtain a description of the protein 3D structure at a submicroscopic level for some unstable, noncrystalline and insoluble proteins. X-ray crystallography finds the X-ray final structure of a protein, which usually need refinements using theoretical protocols in order to produce a better structure. This means theoretical methods are also important in determinations of protein structures. Optimization is always needed in the computer-aided drug design, structure-based drug design, molecular dynamics, and quantum and molecular mechanics. This paper introduces some optimization algorithms used in these research fields and presents a new theoretical computational method - an improved LBFGS Quasi-Newtonian mathematical optimization method - to produce 3D structures of Prion AGAAAAGA amyloid fibrils (which are unstable, noncrystalline and insoluble), from the potential energy minimization point of view. Because the NMR or X-ray structure of the hydrophobic region AGAAAAGA of prion proteins has not yet been determined, the model constructed by this paper can be used as a reference for experimental studies on this region, and may be useful in furthering the goals of medicinal chemistry in this field.


Introduction
Neurodegenerative diseases including Parkinson's, Alzheimer's, Huntington's, and Prion's were found they all featured amyloid fibrils [1,2,3,4,5,6]. Amyloid is characterized by a cross-β sheet quaternary structure and recent X-ray diffraction studies of microcrystals revealed atomistic details of core region of amyloid [7,8]. All the quaternary structures of amyloid cross-β spines can be reduced to one of the 8 classes of steric zippers of [8], with strong van der Waals (vdW) interactions between β-sheets and hydrogen bonds (HBs) to maintain the β-strands. A new era in the structural analysis of amyloids started from the 'steric zipper'-β-sheets [7]. As the two β-sheets zip up, Hydrophobic Packings (HPs & vdWs) have been formed. The extension of the 'steric zipper' above and below (i.e. the β-strands) is maintained by Hydrogen Bonds (HBs) (but usually there is no HB between the two β-sheets). This is the common structure associated with some 20 neurodegenerative amyloid diseases.
We first do some mathematical analysis for the common structure. Let r be the distance between two atoms, the vdW contacts of the two atoms are described by the Lennard-Jones (LJ) potential energy:

1)
where ǫ is the depth of the potential well and σ is the atom diameter; these parameters can be fitted to reproduce experimental data or deduced from results of accurate quantum chemistry calculations. The ( σ r ) 12 item describes repulsion and the −( σ r ) 6 item describes attraction ( Figure 1). If we introduce the coordinates of the atoms whose number is denoted by N and let ǫ = σ = 1 be the reduced units, then, for N atoms, Eq. 1.1 is is the coordinates of atom i, N ≥ 2. The minimization of LJ potential energy f (x) Figure 1: The LJ potential energy (Eqs. 1 Figure 1 of [9]).

.1 and 1.4) (can be seen in
on R n (where n = 3N) is an optimization problem which is a well-known and challenging test problem for global optimization (see http://www-wales.ch.cam.ac.uk/CCD.html and its recent references such as [10,11,12,13,14]). It is very hard for global optimization to directly solve Eq. 1. where A, B, C, D are given constants and usually most of the HBs are still kept during the phase of molecular modeling. Thus, the amyloid fibril molecular modeling problem can be reduced to solve the optimization problem Eq. 1.3 though it is not easy to accurately solve Eq. 1.3 for a large molecule.
Alternatively, we have found another way to solve Eq. 1.3 [15]. Seeing Figure  1, we may know that the optimization problem Eq. 1.3 reaches its optimal value at the bottom of the LJ potential well, where the distance between two atoms equals to the sum of vdW radii of the two atoms. Hence, the amyloid fibril molecular modeling problem can be looked as a molecular distance geometry problem (MDGP) [16]. As an example to explain MDGP, the problem of locating sensors in telecommunication networks is a DGP. In such a case, the positions of some sensors are known (which are called anchors) and some of the distances between sensors (which can or cannot be anchors) are known. The DGP is to locate the positions of all the sensors. The MDGP looks sensors as atoms and their telecommunication network as a molecule. In mathematics, the following Eqs. 1.6∼1.8 can express the MDGP for Eq. 1.3. The 3D structure of a molecule with N atoms can be described by specifying the 3D coordinate positions x 1 , x 2 , . . . , x N ∈ R 3 of all its atoms. Given bond lengths d ij between a subset S of the atom pairs, the determination of the molecular 3D structure is (1.6) where || · || denotes a norm in a real vector space and in this paper it is calculated as the Euclidean distance 2-norm. Eq. 1.6 can be reformulated as a mathematical global optimization problem in the terms of finding the global minimum of the function P (X), where w ij , (i, j) ∈ S are positive weights, X = (x 1 , x 2 , . . . , x N ) T ∈ R n [17] and usually S has fewer elements than N 2 /2 due to the error in the theoretical or experimental data [18,16]. Even there may not exist any solution x 1 , x 2 , . . . , x N to satisfy the distance constraints in Eq. 1.6, for example when data for atoms i, j, k ∈ S violate the triangle inequality; in this case, we may add a perturbation item −ε T X to P (X): where ε ≥ 0. In some cases, instead exact values d ij , (i, j) ∈ S can be found, we can only specify lower and upper bounds on the distances: l ij ≤ ||x i − x j || ≤ u ij , (i, j) ∈ S; in such cases we may penalize all the unsatisfied constraints into the objective function of (P ε ) by adding (i,j)∈S max [18,16], where we may let d ij be the interatomic distance (less than 6 angstroms) for the pair in successive residues of a protein and set l ij = (1 − 0.05)d ij and u ij = (1 + 0.05)d ij [16]. In this paper, we aim to solve Eq. 1.8 (or Eq. 1.3) for modeling amyloid fibril molecular 3D structures.
Neurodegeneration is the progressive loss of structure or function of neurons, including death of neurons. A prion is a misshapen protein that acts like an infectious agent (but not requiring either DNA, RNA, or both) to cause a number of fatal diseases. Prion diseases are rich in β-sheets (compared with the normal prion protein PrP C in rich of α-helices) and are so-called "protein structural conformational" diseases. The normal hydrophobic region 113-120 AGAAAAGA peptide of prion proteins is an inhibitor/blocker of prion diseases. PrP lacking this palindrome could not convert to prion diseases. Brown et al. pointed out that the AGAAAAGA peptide was found to be necessary (though not sufficient) for blocking the toxicity and amyloidogenicity of PrP 106-126, and the peptide AGAA does not form fibrils [19]. The minimum sequence necessary for fibril formation should be AGAAA, AGAAAA, AGAAAAG, AGAAAAGA and GAAAAGA, but the molecular structures of these fibrils have not known yet. This paper addresses an important problem on modeling the 3D molecular structures of prion AGAAAAGA amyloid fibrils of neurodegenerative diseases. The rest of this paper is arranged as follows. In the next section, i.e. Section 2, an improved LBFGS Quasi-Newtonian method is presented for solving Eq. 1.8. Section 3 implements this Quasi-Newtonian method by constructing an 3D molecular structure of prion AGAAAAGA amyloid fibrils of neurodegenerative prion diseases. Numerical results of computations show that the method designed in Section 2 is very effective and successful. This concluding remark will be made in the last section, i.e. Section 4.

Methods
In a (macro)molecular system, if it is very far from equilibrium, then the forces may be excessively large, a robust energy minimization (EM) is required; another reason to perform an EM is the removal of all kinetic energy from the system: EM reduces the thermal noise in the structures and potential energies [20]. EM, with the images at the endpoints fixed in space, of the total system energy provides a minimum energy path. EM can be done using steepest descent (SD), conjugate gradient (CG), and Limited-memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) methods.
Three kinds of possible EM methods are: (1) derivative-free methods -that require only function evaluations, e.g. the simplex method and its variants; (2) derivative information methods -the partial derivatives of the potential energy with respect to all coordinates are known and the forces are minimized, e.g. SD, CG methods; and (3) second derivative information methods, e.g. LBFGS method. "SD is based on the observation that if the real-valued function f (x) is defined and differentiable in a neighborhood of a point x 0 then f (x) decreases fastest if one goes from x 0 in the direction of the negative gradient of f (x) at x 0 and SD local search method converges fast [21]. SD is robust and easy to implement but it is not most efficient especially when closer to minimum; at this moment, we may use the efficient CG. CG is slower than SD in the early stages but more efficient when closer to minimum. CG algorithm adds an orthogonal vector to the current direction of the search, and then moves them in another direction nearly perpendicular to this vector. The hybrid of SD-CG will make SD or CG more efficient than SD or CG alone. However, CG cannot be used to find the EM path, for example, when "forces are truncated according to the tangent direction, making it impossible to define a Lagrangian" [22,23]. In this case, the powerful and faster quasi-Newtonian method (e.g. the LBFGS quasi-Newtonian minimizer) can be used [22,24,25,26,27,28]. We briefly introduce the LBFGS quasi-Newtonian method as follows.
Newton's method in optimization explicitly calculates the Hessian matrix of the second-order derivatives of the objective function and the reverse of the Hessian matrix [29]. The convergence of this method is quadratic, so it is faster than SD or CG. In high dimensions, finding the inverse of the Hessian is very expensive. In some cases, the Hessian is a non-invertible matrix, and furthermore in some cases, the Hessian is symmetric indefinite. Quasi-Newton methods thus appear to overcome all these shortcomings.
Quasi-Newton methods (a special case of variable metric methods) are to approximate the Hessian. Currently, the most common quasi-Newton algorithms are the SR1 formula, the BHHH method, the widespread BFGS method and its limited/low-memory extension LBFGS, DFP, MS, and Broyden's methods [30,31,32,33]. In Amber [23] and Gromacs [20], LBFGS is used, and the hybrid of LBFGS with CG -a Truncated Newton linear CG method with optional LBFGS Preconditioning [25] -is used in Amber [23].
For BFGS method, whether it converges at all on nonconvex problems is still an open problem. In fact, Powell (1984) gave a counter-example that shows that BFGS with an inexact line-search search may fail to converge [34,35,36]. Li and Fukushima (2001) proposed a modified BFGS method for nonconvex objec-tive function [37]. Basing on [37,24,38,25,39], in this paper we present an improved LBFGS method described as follows [40] -which presents the nonmonotone line search technique [41,42] for the Wolfe-type search. The improved LBFGS method presented in this paper is much better than the standard BFGS method in view of the CPU time (see Figure 2) tested through more than 30 nonlinear programming problems (where each selected problem is regular, that is, its first and second derivatives exist and are continuous everywhere, and each problem is with different dimensions, i.e., 100, 500, 1000 and 10000 dimensions) and its mathematical theory to support this algorithm can be seen from the Supplementary Materials [40] listed at the end of this paper. This paper implements the Wolfe-type search by the approximation technique of piecewise linear/quadratic function [43].
Step 1: If g k = 0, then output x k and stop; otherwise, go to Step 2.
Step 2: Solve the following linear equation to get d k : Step 3: Find a step-size λ k > 0 satisfying the Wolfe-type line search conditions: Step 4: Let the next iterate by x k+1 = x k + λ k d k . Calculate g k+1 and g k+1 .
In Algorithm 1, g k denotes the gradient of f (x) at x k , and the convergence and the R-linear convergent rate of Algorithm 1 are guaranteed by the following assumptions: (A1) f is twice continuously differentiable, (A2) f has Lipschitz continuous gradients and Hessians, (A3) the Hessian at the stationary point is always positively definite, and (A4) the level set of f is bounded. The detailed proof of the convergence and R-linear convergent rate of Algorithm 1 can be found in the Supplementary Materials supplied at the end of this paper. Figure 2: Performance based on CPU time, where the definitions of P and τ can be found in [44].
In [40], the following nonmonotone line search technique for stepsize λ k is used: f 1), and M 0 is a nonnegative integer. This is a difference between the algorithm of [40] and this paper. All in all, it is well known that quasi-Newton method is an efficient solution method for unconstrained and continuously differentiable minimization problem [45,46,47]. However, it needs computing and storage of the updated matrix which is an approximation to the Hessian matrix at each iteration of the method. Hence, its efficiency may decrease when it is applied to large scale optimization problem. To overcome the drawback, limited memory quasi-Newton method is proposed [48]. The main ideal of this method is nearly identical to that of the standard BFGS method, and the only difference is that the inverse Hessian approximation is not formed explicitly, but defined by a small number, saym, of BFGS updates. This technique received much attention in recent years and numerical experiments show that it is very competitive [49,24], and its global convergence and R-linear convergence rate with Wolfe line search are established for the uniformly convex case [26,24]. Since the limited memory BFGS method may suffer from ill-conditions for small value ofm, Al-Baali (1999) [50] made some modifications to the method and establish its global convergence based on the same assumptions, and Byrd et al. (1994) [51] derives new representation of limited memory quasi-Newton matrices for the benefit of computing the updated matrix. Recently, a non-monotone line search is introduced, see e.g., [41,42]. Then it is showed to be more competitive and practical for solving nonlinear optimization problems, and [52] established the global convergence of this line search applied to limited memory BFGS method based on the uniformly convex assumption. Motivated by the above observation, it turn out that in two respects the limited memory BFGS method is much less effective. First, we note that the convergence analysis of these method are focused on the uniformly convex assumption and little is known for nonconvex case. Second, numerical experiments have suggested the main weakness of limited memory method is that it may converge very slowly in terms of number of iterations for ill-conditioned problems. The purpose of the above Algorithm 1 is to reduce these defects and Figure 2 shows the effectiveness of the proposed algorithm. We will apply it into the molecular modeling of prion AGAAAAGA amyloid fibrils in the next section.

Results and Discussion
From their research of prion, scientists found that the cross-β structure of peptides is with the nature of self-aggregation, the self-aggregating to form fibers. This provides us a new research idea for nanomaterials. HBs can be formed between peptide β-strands, and one peptide monomer connects together with another in accordance with the specific structure to form fibers. Many laboratories in the world are synthesizing peptides that can self-aggregate to form fibers, and want to be able to control the growth of the fiber to find out new functional materials [53,54]. The studies of this paper not only benefit nanometerials research, but also benefit the research on neurodegenerative amyloid fibril diseases. Prion AGAAAAGA peptide has been reported to own an amyloid fibril forming property (initially described in 1992 by Gasset   Amyloid fibril identification for prions Figure 3: Prion AGAAAAGA (113-120) segment is clearly and surely identified as the amyloid fibril formation region, because its energy is less than the amyloid fibril formation threshold energy -26 kcal/mol [80].

Material for the Molecular Modeling
This paper uses a suitable pdb file template 2OMP.pdb (the LYQLEN peptide derived from human insulin residues 13-18 [8]) from the Protein Data Bank to build an 8-chain AGAAAAGA prion amyloid fibril molecular model to illuminate Algorithm 1 works very well. To choose 2OMP.pdb (Figure 4) as the modeling template is due to it can pass all the long procedures of SDCG-SA (equilibrations & productions)-SDCG of [79]. By observations of Figure 4 and the 2nd column of coordinates of 2OMP.pdb, we know that E(F) chains can be calculated on the XZ-plane from A(B) chains by Eq. 3.1 and other chains can be got by a parallel

New Molecular Modeling Homology Model
Basing on the template 2OMP.pdb from the Protein Data Bank (Figure 4) Figure 5, where the twice of the vdW radius of CB atom is 3.4 angstroms).
In [79] the commercial package InsightII (http://accelrys.com/) is used to build models. Instead of InsightII, because this package is not available by the authors, this paper uses Algorithm 1 to build and optimize Model 1. In "Zipper 1", fixing the coordinates of A.ALA3.CB, A.ALA1.CB and letting the coordinates of E.ALA6.CB, E.ALA4.CB be variables, we may get an optimization problem:   The initial structure of Model 1 illuminated in Figure 6(a)∼6(b) is not the optimal structure with the lowest total potential energy. The initial structure also has no hydrogen atoms (so no hydrogen bonds existed) and water molecules added. For each Chain, the C-terminal and N-terminal atoms also have problems. Clearly there are a lot of close/bad contacts between β-strand atoms as illuminated in Figure 6(a)∼6(b). We used the ff03 force field of AMBER 11,in a neutral pH environment. The amyloid fibrils were surrounded with a 8 angstroms layer of TIP3PBOX water molecules using the XLEaP module of AMBER 11. 1944 waters and 408 hydrogen atoms were added for Model 1 byt he XLEaP module. solvated amyloid fibril was inimized by the method.
The LJ potential energy of atoms' vdW interactions is just a part of the total potential energy of a protein, and by observations from Model 1 computed by Eqs. 3.10, 3.2∼3.3 and 3.11 we can see that the contacts between β-strand atoms and β-sheet atoms are too close/bad. Thus, we need to relax Model 1 computed. The relaxation is done in the use of local search LBFGS Quasi-Newton method (lbfgs memory depth=3) within AMBER 11 [23]. The relaxed/optimized Model 1 is illuminated in Figure 6(c). Seeing Figure 6(d) compared with Figure  6(b), we may know the vdW interactions between the two β-sheets are very relaxed/optimized now. Figure 6(c) shows the Model 1 of optimal molecular structure for prion AGAAAAGA amyloid fibrils.

Conclusion
In a (macro)molecular system, a robust energy minimization is very necessarily required. Mathematical optimization minimization methods find a place to apply in these systems. Because in physics the (macro)molecular system usually is not a simple two-body problem of system, local search optimization methods are very useful in the applications to the (macro)molecular system. On anther sense, when a protein is unstable, noncrystalline or insoluble and very difficult to detect its 3D structure by the expensive and costly NMR and X-ray, theoretical mathematical or physical computational method can be used to produce the 3D structure of the protein. Moreover, even the X-ray crystallography finds the X-ray final structure of a protein, we still need refinements using theoretical protocols in order to produce a better structure. The theoretical computational method -an improved LBFGS Quasi-Newtonian mathematical optimization method -presented in this paper and other mathematical optimization methods mentioned in this paper should be very useful in the protein molecular modeling research field. This paper also shows the effectiveness of the improved LBFGS mathematical optimization method presented. Prion AGAAAAGA amyloid fibrils have not much structural information. This paper presents some bioinformatics on the molecular structures of prion AGAAAAGA amyloid fibrils in the sense of theoretical emphasis. The structures may be helpful in the advance in the biochemical knowledge of prion protein misfolding or instability and in the future applications for therapeutic agent design.
VR0063 & 488 on its Peak Computing Facility at the University of Melbourne, an initiative of the Victorian Government (Australia). The first author appreciates the financial support by Professors Changyu Wang and Yiju Wang for his visits to Qufu Normal University (China), and by Professor Xiangsun Zhang for his visits to Chinese Academy of Sciences (Beijing); the first author also appreciates Professor Adil M. Bagirov (University of Ballarat) for his instructions to implement the Wolfe-type search. This paper is dedicated to Professor Xiangsun Zhang (Academia Sinica, Beijing) on the occasion of his 70th birthday.
where B k = H −1 k ,s l = x l+1 − x l , y * l = y l + γ l s l , and B k−m+1 k = B 0 for all k.
Theorem 1 Let Assumptions (A1)∼(A4) hold, and {x k } denotes the sequence generated by Algorithm 1. Then lim k→∞ inf g k = 0; moreover, there exits a constant t ∈ [0, 1) such that which is to say that x k converges to the minimum R-linearly.
Proof. Denote by M 1 an upper bound of g k on Ω and by y * k = y k + γ k s k , we can prove that y * T where M 4 is a positive constant. Combining Eqs. 5.2 and 5.3, we obtain a constant δ > 0 such that cos(θ k ) ≥ δ, where cos(θ k ) = s T k B k s k s k B k s k .
Because g(x) is Lipschitz continuous and by the Wolfe-type line search condition Eq. 2.2, we have which implies Using the Wolfe-type line search condition Eq. 2.1 and Eq. 5.4, we have Then using the monotonicity and the bound of {f k } on Ω, we obtain ∞ k=1 g k 2 cos 2 θ k < ∞.
Under the Assumptions (A2)∼(A3), there is a neighbourhood U(x * ) of x * such that for all x ∈ U(x * ), where m is a positive constant got according to the Assumptions. Hence, Eqs. 5.5∼5.6 hold with x = x k for all k sufficiently large.