Multi-GPU UNRES for scalable coarse-grained simulations of very large protein systems
Graphics processing units (GPUs) are now widely used in all-atom molecular simulations because atom pairs can be partitioned efficiently among the kernels that compute the contributions to energy and forces, which enables the treatment of very large systems. Longer time scales and larger system sizes are also sought through the development of coarse-grained (CG) models, in which atoms are merged into extended interaction sites. Implementing CG codes on GPUs, particularly on multi-GPU platforms, is, however, a challenge: the potentials are more complicated and the explicit solvent is removed, which forces developers to use interaction-domain rather than space-domain decomposition. In this paper, we propose a design of a multi-GPU coarse-grained simulator and report the implementation of the heavily coarse-grained, physics-based UNited RESidue (UNRES) model of polypeptide chains. By moving all computations to the GPUs and keeping communication with the CPUs to a minimum, we achieved an almost 5-fold speed-up with 8 A100 GPU accelerators for systems with over 200,000 amino-acid residues. This result makes UNRES the most scalable coarse-grained software and enables laboratory-time millisecond-scale simulations of cell components such as tubulin within days of wall-clock time.
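To illustrate what interaction-domain decomposition means in practice, the following is a minimal sketch in plain Python, not the UNRES implementation. It assumes a toy one-dimensional harmonic pair force and a round-robin assignment of site pairs to devices; the function names (split_pairs, partial_forces, total_forces) and the potential are hypothetical and chosen only for illustration. Each "device" accumulates forces from its own share of pairs, and the partial force arrays are then summed, which stands in for the reduction step across GPUs.

```python
from itertools import combinations


def split_pairs(n_sites, n_devices):
    """Deal all i<j site pairs round-robin into n_devices work lists.

    Assumption: every pair costs about the same, so equal counts
    give a balanced load. Real CG potentials may need weighting.
    """
    work = [[] for _ in range(n_devices)]
    for k, pair in enumerate(combinations(range(n_sites), 2)):
        work[k % n_devices].append(pair)
    return work


def pair_force(xi, xj):
    """Toy 1-D harmonic force on site i due to site j (placeholder potential)."""
    return xj - xi


def partial_forces(coords, pairs):
    """Forces accumulated by one device from its own share of the pairs."""
    forces = [0.0] * len(coords)
    for i, j in pairs:
        fij = pair_force(coords[i], coords[j])
        forces[i] += fij  # action on site i
        forces[j] -= fij  # equal and opposite reaction on site j
    return forces


def total_forces(coords, n_devices):
    """Sum the per-device partial forces (stand-in for a cross-GPU reduction)."""
    work = split_pairs(len(coords), n_devices)
    partials = [partial_forces(coords, w) for w in work]
    return [sum(column) for column in zip(*partials)]
```

Because the pairs, not regions of space, are divided, every device needs the full coordinate array and the only exchange is the final sum over force arrays. In this sketch the result is independent of the number of devices, which is the property a real multi-GPU implementation must preserve up to floating-point summation order.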