SUperman: Efficient permanent computation on GPUs
Description
The permanent is a function defined for square matrices, with applications in quantum computing, statistical physics, complexity theory, combinatorics, and graph theory. Its formula resembles that of the determinant; unlike the determinant, however, its exact computation is #P-complete, i.e., no polynomial-time algorithm exists unless P = NP. For an n × n matrix, the fastest known exact algorithm has a time complexity of O(2^(n-1) n). Although supercomputers have been employed for permanent computation before, there is no prior work and, more importantly, no publicly available software that leverages cutting-edge High-Performance Computing accelerators such as GPUs. In this work, we design, develop, and investigate the performance of SUperman, a complete software suite that computes matrix permanents across multiple nodes and GPUs of a cluster while handling various matrix types (real, complex, or binary; sparse or dense) with a dedicated treatment for each type. Running on a single Nvidia A100 GPU, SUperman is up to 86× faster than a state-of-the-art parallel algorithm on 44 Intel Xeon cores running at 2.10 GHz. Leveraging 192 GPUs, SUperman computes the permanent of a 62 × 62 matrix in 1.63 days, marking the largest reported permanent computation to date.
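To make the O(2^(n-1) n) bound concrete, here is a minimal single-threaded sketch in plain Python. It uses Glynn's formula with Gray-code ordering, one well-known way to reach that complexity; it is not taken from SUperman, whose kernels and per-type optimizations are not described here. The function name `permanent` and the list-of-lists input format are assumptions made for illustration only.

```python
def permanent(a):
    """Permanent of a square matrix given as a list of rows.

    Illustrative sketch using Glynn's formula with Gray-code updates:
    2^(n-1) iterations, each costing O(n). Not SUperman's implementation.
    """
    n = len(a)
    if n == 0:
        return 1  # permanent of the empty matrix is 1 by convention

    # Column sums for the initial sign vector delta = (+1, ..., +1).
    col = [sum(a[i][j] for i in range(n)) for j in range(n)]
    delta = [1] * n   # delta[0] stays fixed at +1
    sign = 1          # product of all entries of delta
    total = 0
    steps = 1 << (n - 1)

    for k in range(steps):
        prod = 1
        for v in col:
            prod *= v
        total += sign * prod

        if k + 1 < steps:
            # Gray code: flip the row indexed by the lowest set bit of k+1.
            i = ((k + 1) & -(k + 1)).bit_length()  # in 1 .. n-1
            delta[i] = -delta[i]
            for j in range(n):
                col[j] += 2 * delta[i] * a[i][j]
            sign = -sign

    # For integer input the sum is exactly divisible by 2^(n-1).
    if isinstance(total, int):
        return total // steps
    return total / steps


print(permanent([[1, 2], [3, 4]]))  # 1*4 + 2*3 = 10
```

Each Gray-code step changes a single sign, so the column sums are updated in O(n) rather than recomputed in O(n^2). Because the 2^(n-1) iterations can be split into independent chunks, this kind of loop lends itself to distribution over many threads or devices.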