In many cases dynamical processes essentially contribute to protein function, e.g., in ligand binding or in enzymatic reactions. Representing a microscopic description of protein dynamics in the framework of classical mechanics, molecular dynamics (MD) simulations [1, 2, 3] can serve as a tool to interpret experimental data and to provide predictions on hitherto unobserved processes.

MD simulations of protein dynamics pose a computational challenge, since suitable simulation systems have to include the native environment (e.g., water and lipid molecules) of these biological macromolecules [4, 5, 6, 7] and, therefore, typically have to comprise several 10,000 atoms. Furthermore, they have to use femtosecond integration time steps to enable smooth descriptions of the fastest degrees of freedom. Simulations of such systems are currently limited to nanoseconds even if the most powerful supercomputers are employed. Although there are biochemically important processes which actually occur on a nanosecond time scale [8] and have been successfully described by MD simulations [9, 10], most biochemical processes occur on time scales well above nanoseconds and, thus, are inaccessible to conventional MD methods.

In the MD approach the classical Newtonian equations are solved
numerically to obtain the motion of all
atoms [11, 12, 13].
For each numerical integration step the Coulomb sum

between all pairs of atoms (*i*, *j*) with partial charges
and
positions has to be evaluated.
In common force fields for protein and solvent molecules most
atoms carry partial charges; therefore
the evaluation of *U* dominates the computational effort in MD simulations
as it scales quadratically with the number *N* of
charged particles.

In truncation methods [14], which simply neglect long
range electrostatic interactions beyond a cutoff distance of
typically 8 - 10Å, the computation of
*U* scales with
*N* instead of
and, consequently, such methods
are widely used in MD simulations.
However, the truncation of electrostatic interactions
leads to serious errors in forces, energies and
other observables. These errors are generally too large to be acceptable
for descriptions
of protein
dynamics [15, 16, 17, 18].
For instance, using a cutoff distance of 10Å the relative errors
in the computation of electrostatic forces are of the order of
10% [19]. One of the major artifacts concerning the
description of protein dynamics, that is entailed by these sizable errors,
is a suppression of dynamical correlations between slowly fluctuating
protein modes, as these are mainly coupled by the neglected long-range
Coulomb forces [16, 18].
Thus, truncation of electrostatics trades a description of protein dynamics
which is computationally prohibitive but adequate in the
sense of basic physics for a description
which is computationally manageable but physically
inadequate.

As a result of the above considerations, one of the major tasks which has to be tackled in the strive for improved MD methods is the design of approximate algorithms for computationally efficient and sufficiently accurate evaluation of long-range Coulomb interactions and integration of the Newtonian equations.

However, any attempt to solve the problems stated above is confronted with two difficulties: First, requiring efficiency and accuracy on equal footings typically represents a contradiction, since approximate algorithmic schemes tend to gain computational efficiency at the cost of accuracy; that contradiction already has shown up in the above discussion of truncation methods and has to be taken care of in the design of any new approximation scheme. Second, it is much simpler to estimate the computational complexity of an approximate MD algorithm, since this is just a matter of counting floating point operations, than to judge its accuracy. In fact, as discussed in refs. [16] and [20], the latter task is highly nontrivial due to the complicated, nonlinear, and chaotic dynamics of macromolecular systems and requires careful comparisons of a variety of statistical observables on structural and dynamical properties which have to be extracted from extensive sample simulations. For instance, the particular artifact of the truncation method noted above, i.e., the suppression of dynamical correlations, became apparent only after the progress of computer technology had enabled simulations of sufficiently long duration. To the extent that quality assessments of approximate MD algorithms will become necessary in the present paper, we will adopt the problem adapted approach developed in refs. [16] and [20] and will employ the procedures and some of the statistical observables which have been suggested there as useful and suitable tools to judge algorithmic accuracy (see also further below).

Up to now various approaches have been suggested which aim at a more efficient approximate treatment of the long-range electrostatic interactions in MD simulations. These approaches may be classified into two categories: multipole [21, 22, 23, 24, 25, 26, 18, 19] and multiple-time-step methods [27, 28, 29, 30, 31, 32, 33, 16].

Multipole methods approximate the long-range forces originating from a group
of point charges by truncated multipole expansions of their electrostatic
potential.
Using a hierarchy of grids for subdivision of space,
nested on multiple scales, and
a corresponding hierarchical organization of charge groups and
multipole expansions [22] a computational complexity
of *O*(*N log N*) is achieved.
By additionally using a hierarchy of local Taylor expansions for the
evaluation of the electrostatic forces acting on particles
Greengard and Rokhlin have constructed the so-called
*fast multipole method* (FMM)
that even scales with *O*(*N*) for large systems [23, 24, 25].
Several implementations of this method suited for MD simulations have been
reported [34, 35, 36].

For MD simulations of biomolecules the FMM-type grouping of charges, defined by a fixed and regular subdivision of space, requires multipole expansions of rather high order as to achieve sufficient numerical accuracy [24]. If, instead, charge grouping is adapted to structural and dynamical properties of the simulated biomolecules, multipole expansions can be truncated at low orders without large sacrifices of accuracy [19, 26, 18]. Accordingly, such a structure adapted multipole method (SAMM) provides a further substantial speed-up for MD simulations.

Multiple-time-step methods are based on the observation that forces between distant atoms generally exhibit slower fluctuations than forces between close atoms. Therefore, without significant loss of accuracy, the more slowly fluctuating forces can be treated with longer integration step sizes. The required separation of forces can be implemented, e.g., by grouping atom pairs into distance classes. Simple multiple-time-step methods define only two distance classes, an inner and an outer one [27], whereas more advanced methods employ a hierarchy of such classes [28, 29, 30, 31, 32, 33, 16].

As one can expect from the general considerations presented further above, the enhanced efficiency of multipole and multiple-time-step methods is obtained at the cost of various kinds of algorithmic artifacts. A typical example is the artificial introduction of ``algorithmic noise'', which arises from discontinuous and discretized representations of continuous and smooth functions and processes. Such noise effectively can act like an uncontrolled stochastic force on to the atomic motion, and can turn the simulated dynamics into a Langevin-type dynamics with the nasty property, that the algorithmic noise is not properly balanced by a corresponding friction term and, therefore, constantly adds heat to the simulated system. By thorough studies of sample simulations these artifacts have been shown to be negligible in the case of the SAMM method [19, 26, 18] and to be optimally small in the case of a particular multiple-time-step procedure, the so-called DC-1d extrapolation scheme, which has been suggested and carefully compared with other methods in ref. [16].

From a general point of view, multipole methods and multiple-time-step methods
exploit different regularities of Coulomb interactions as to obtain enhanced
computational efficiency:
the former make use of regularities *in space*, whereas
the latter exploit regularities *in time*.
This complementarity led us to propose [16]
that a combination of both concepts should be capable to render additional
speed-ups. In view of the proven superiority of the SAMM method as compared
to fixed grid FMM procedures on the one hand, and of the DC-1d scheme
as compared to other multiple-time-step schemes on the other hand,
these two methods appear to be the natural candidates for the envisaged
combination. It is the purpose of the present paper to show how a suitable
combination of these methods can be achieved and to demonstrate
that the resulting *fast multiple-time-step
structure-adapted multipole method* (FAMUSAMM) actually
preserves and combines the advantages of the parent methods with respect
to enhanced computational efficiency and to the lack of substantial
algorithmic artifacts.
In particular we will demonstrate by comparison of a combination of SAMM
with other multiple-time-step schemes, that the DC-1d procedure actually
is the method of choice.

Windemuth [37] has recently presented a first step
towards a combination of multipole and time-step methods;
using a simple variant of a multiple-time-step method with only two
distance classes and a conventional FMM algorithm he achieves considerable
speed-ups.
Unfortunately, a quality assessment is lacking in Windemuth's paper.
Zhou and Berne [38] have reported a more advanced combination
of the multiple-time-step algorithm RESPA [31]
with a slightly modified fast multipole method using a rectangular
subdivision of the simulation volume.
Here, quality assessments are based on short MD simulations in
the ps range.
For MD simulations with periodic boundaries [39, 40, 41]
a fast method based on a combination of RESPA and
the smooth particle mesh Ewald method [42]
has been described [43].
The latter method is particularly useful for simulations of crystalline
structures, e.g., protein crystals.
Also, for the simulation of non-crystalline systems, like in lipid membranes
soluted proteins [4], periodic boundaries have been used as an
approximation due to the availability of efficient algorithms.
In the latter case, however, it is *a priori* not clear whether that
approximation is justified (for a discussion of possible artifacts
see, e.g., ref. [44, 45, 46, 47]).
In our paper we develop an efficient algorithm to speed up simulations of
non-periodic systems, in which, typically, the environment is effectively
included through the use of appropriate boundary forces
(see, e.g., refs [44, 48]).
We provide a careful analysis of possible algorithmic artifacts and their
influence on structural and dynamical properties of a test system.

We intend to achieve high efficiency by combining SAMM and a multiple-time-step scheme. To that aim, we first outline the basic features of the parent methods and, subsequently, present our new algorithm FAMUSAMM. As will be explained in the following sections, in our multipole method charge grouping will be based on the specific structural features of macromolecules rather than on fixed subdivisions of space as used in the methods cited above. In particular, our combination of SAMM with a multiple-time-step method is expected to be more efficient and less memory consuming, since we intend to apply the multiple-time-step scheme to local Taylor expansions of the electrostatic potential rather than to Coulomb forces acting on individual atoms. Having defined suitable methods to evaluate FAMUSAMM's accuracy, we present the results of our accuracy assessment. After documenting the enhanced computational performance of our method, we discuss and summarize our results.

Wed Apr 30 15:40:09 MET DST 1997