- HOS-Ocean v2.1: Extensions of a nonlinear wave solver based on the High-Order Spectral method
  In this work, we report a new version of the HOS-Ocean code, an open-source solver for deterministic nonlinear ocean wave propagation. The numerical model makes use of the so-called High-Order Spectral method, which ensures high efficiency and accuracy. This new release includes i) additional physical features, such as spatially varying current and bathymetry together with dedicated models to account for wave breaking, and ii) numerical developments, including parallelization of the code through MPI and a more user-friendly code base (pre-built binaries, a simplified building procedure, and easier operation). HOS-Ocean v2.1 is released as open source, available from GitLab, and developed and distributed under the terms of the GNU General Public License (GPLv3). Along with the source code, detailed documentation in Sphinx format is available.
- BicAn: An integrated, open-source framework for polyspectral analysis
  We present a novel platform for higher-order spectral analysis of time-series data in Python. The theory, utility, and applications of such analyses are summarized. Direct estimates of coherence (n = 2), bicoherence (n = 3), and tricoherence (n = 4) spectra are given for test signals; higher-order (n > 4) spectra are inferred at single points in polyfrequency space. Quantification of uncertainty for nonstationary processes is considered, and applications to nonlinear dynamics research are given.
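The direct (FFT-based) estimator behind the bicoherence spectrum above can be sketched in a few lines of NumPy. This is a generic illustration of the standard segment-averaged estimator, not the BicAn API; the function name, windowing, and segmentation scheme are my assumptions:

```python
import numpy as np

def bicoherence(x, nperseg=256):
    """Segment-averaged squared bicoherence b^2(f1, f2).

    Splits x into non-overlapping windowed segments, FFTs each one, and
    averages the triple product X(f1) X(f2) X*(f1+f2) over segments,
    normalized so that phase-coupled triads give values near 1.
    """
    nseg = len(x) // nperseg
    segs = x[:nseg * nperseg].reshape(nseg, nperseg)
    segs = (segs - segs.mean(axis=1, keepdims=True)) * np.hanning(nperseg)
    X = np.fft.rfft(segs, axis=1)
    nf = X.shape[1] // 2                      # keep f1 + f2 inside the band
    idx = np.add.outer(np.arange(nf), np.arange(nf))
    T = X[:, :nf, None] * X[:, None, :nf]     # X(f1) X(f2), per segment
    S = X[:, idx]                             # X(f1 + f2), per segment
    num = np.abs((T * np.conj(S)).mean(axis=0)) ** 2
    den = (np.abs(T) ** 2).mean(axis=0) * (np.abs(S) ** 2).mean(axis=0)
    return num / (den + 1e-30)
```

By the Cauchy-Schwarz inequality the estimate lies in [0, 1]; for a signal in which a component at f1 + f2 carries the summed phases of the components at f1 and f2, the estimate at (f1, f2) approaches 1.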
- Multi-GPU acceleration of PALABOS fluid solver using C++ standard parallelism
  This article presents the principles, software architecture, and performance analysis of the GPU port of the lattice Boltzmann software library Palabos [J. Latt et al., “Palabos: Parallel lattice Boltzmann solver”, Comput. Math. Appl. 81, 334-350 (2021)]. A hybrid CPU-GPU execution model is adopted, in which numerical components are selectively assigned to either the CPU or the GPU, depending on considerations of performance or convenience. This design enables a progressive porting strategy, allowing most features of the original CPU-based codebase to be gradually and seamlessly adapted to GPU execution. The new architecture builds upon two complementary paradigms: a classical object-oriented structure for CPU execution, and a data-oriented counterpart for GPUs, which reproduces the modularity of the original code while eliminating object-oriented overhead detrimental to GPU performance. Central to this approach is the use of modern C++, including standard parallel algorithms and template metaprogramming techniques, which permit the generation of hardware-agnostic computational kernels. This facilitates the development of user-defined, GPU-accelerated components such as collision operators or boundary conditions, while preserving compatibility with the existing codebase and avoiding the need for external libraries or non-standard language extensions. The correctness and performance of the GPU-enabled Palabos are demonstrated through a series of three-dimensional multiphysics benchmarks, including the laminar-turbulent transition in a Taylor-Green vortex, lid-driven cavity flow, and pore-scale flow in Berea sandstone. Despite the high-level abstraction of the implementation, the single-GPU performance is similar to that of CUDA-native solvers, and multi-GPU tests exhibit good weak and strong scaling across all test cases. Beyond the specific context of Palabos, the porting methodology illustrated here provides a generalizable framework for adapting large, complex C++ simulation codes to GPU architectures while maintaining extensibility, maintainability, and high computational performance.
- CuPyMag: GPU-accelerated finite-element micromagnetics with magnetostriction
  We introduce CuPyMag, an open-source, Python-based framework for large-scale micromagnetic simulations with magnetostriction. CuPyMag solves micromagnetics with finite elements in a GPU-resident workflow in which key operations, such as right-hand-side assembly, spatial derivatives, and volume averages, are tensorized using CuPy’s BLAS-accelerated backend. Benchmark tests show that the GPU solvers in CuPyMag achieve speedups of up to two orders of magnitude over the CPU codes. Its runtime grows linearly or sublinearly with problem size, demonstrating high efficiency. Additionally, CuPyMag uses the Gauss-Seidel projection method for time integration, which not only allows stable time steps (up to 11 ps) but also solves each governing equation with only 1-3 conjugate-gradient iterations without preconditioning. CuPyMag accounts for magnetoelastic coupling and for far-field effects arising from the boundary of the magnetic body, both of which play an important role in magnetization reversal in the presence of local defects. CuPyMag solves these computationally intensive multiphysics simulations with a high-resolution mesh (up to 3M nodes) in under three hours on an NVIDIA H200 GPU. This acceleration enables micromagnetic simulations with non-trivial defect geometries and resolves nanoscale magnetic structures, expanding the scope of micromagnetic simulations towards realistic, large-scale problems that can guide experiments. More broadly, CuPyMag is built on widely adopted Python libraries, which provide cross-platform compatibility, ease of installation, and accessibility for adaptation to diverse applications.
- Pigreads: The Python-integrated GPU-enabled reaction-diffusion solver using OpenCL for cardiac electrophysiology and other applications
  Pigreads is a streamlined Python module for efficient numerical solution of reaction-diffusion systems on graphics cards (GPUs), with a CPU fallback. It exposes a simple and straightforward NumPy-compatible API. Users may employ built-in models – including electrophysiology examples – or supply custom reaction terms. Supported features include 0D-3D uniform Cartesian grids, no-flux and periodic boundary conditions, anisotropic diffusion, spatially varying diffusion and reaction, and localised source terms. The project is open source, tested, documented, and distributed with examples and tutorials.
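The kind of computation such a solver performs can be illustrated with a minimal NumPy sketch: one explicit time step of a reaction-diffusion system on a 2D grid with no-flux boundaries. This is a generic illustration (using the Fisher-KPP reaction term as a stand-in), not the Pigreads API; all names here are hypothetical:

```python
import numpy as np

def laplacian_noflux(u, dx):
    """5-point Laplacian; edge padding realizes no-flux (Neumann) boundaries."""
    p = np.pad(u, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
            - 4.0 * u) / dx**2

def step_fisher(u, D, r, dt, dx):
    """One explicit Euler step of du/dt = D lap(u) + r u (1 - u)."""
    return u + dt * (D * laplacian_noflux(u, dx) + r * u * (1.0 - u))
```

With explicit Euler stepping, stability requires roughly dt <= dx^2 / (4 D) in 2D; a custom reaction model amounts to swapping the `r * u * (1 - u)` term for another function of `u`.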
- cfdmfFTFoam: a front-tracking solver for multiphase flows on general unstructured grids in OpenFOAM
  The Front-Tracking Method (FTM) is a promising approach for the numerical solution of multiphase flows, offering a good trade-off between accuracy and computational cost. Existing open-source, open-access software for the FTM is scarce, owing to the complexity of its coding and algorithms, and is limited to structured Cartesian grids or to hybrid FTMs with connectivity-free front meshes. To provide a pure FTM solver on general unstructured grids, the Ftc3D FTM code has been integrated into the OpenFOAM CFD software by implementing the necessary front-mesh-to-Eulerian-grid communication and front-node advection algorithms, applicable to unstructured grids and to both serial and parallel runs. The new FTM package, called cfdmfFTFoam, has been further equipped with a variety of FTM sub-algorithms, including front volume correction, remeshing, surface tension computation, and indicator function construction. Assessments and validations of the new solver are provided against several standard multiphase flow benchmarks. It is anticipated that cfdmfFTFoam will facilitate future research on, and algorithm improvement in, the FTM.
- A numerical approach to calculate cross sections for relativistic electrons and terrestrial gamma-ray flashes
  Terrestrial gamma-ray flashes (TGFs) are bursts of energetic X- and gamma-rays emitted from thunderstorms as the bremsstrahlung of relativistic electrons. While such relativistic particles propagate through the atmosphere, they interact with air molecules, which can be studied with particle Monte Carlo models. Such models require cross sections as input. Whilst there are well-established data for elastic scattering, bremsstrahlung, and impact ionization by relativistic electrons, as well as for photoionization, Compton scattering, and pair production by energetic photons, we lack cross sections for photoexcitation, photodissociation, and the excitation of air molecules by electrons, processes which can contribute to the chemical activation of the atmosphere and are potentially relevant for the study of greenhouse gas production by lightning. In order to fill this gap, we present a novel numerical tool that calculates cross sections directly from Feynman diagrams, providing both differential and total cross sections. We give an overview of the code structure and present benchmarking cases against well-known cross sections. Additionally, we present a first application by calculating the photodissociation cross section for a wide range of energies. The presented model is capable of calculating cross sections for more leptonic and photonic processes in the atmosphere than is possible today. This will in turn allow for more realistic simulations of energetic phenomena in our atmosphere.
- rhodent: A python package for analyzing real-time TDDFT response
  Real-time time-dependent density functional theory (rt-TDDFT) is a well-established method for studying the dynamic response of matter on femtosecond or optical timescales. In this method, the Kohn-Sham wave functions are propagated forward in time, and in principle one can extract any observable at any given time. Alternatively, by taking a Fourier transform, spectroscopic quantities can be extracted. There are many publicly available codes implementing rt-TDDFT, which differ in their numerical solution of the KS equations, their available exchange-correlation functionals, and their analysis capabilities. For users of rt-TDDFT, this is an inconvenient situation, because they may need a numerical method available in one code but an analysis method available in another. Here, we introduce rhodent, a modular Python package for processing the output of rt-TDDFT calculations. Our package can be used to calculate hot-carrier distributions, energies, induced densities, and dipole moments, as well as various decompositions thereof. In its current version, rhodent handles calculation results from the gpaw code, but it can readily be extended to support other rt-TDDFT codes. Additionally, under the assumption of linear response, rhodent can calculate the response to a narrow-band laser from the response to a broad-band perturbation, greatly speeding up the analysis of frequency-dependent excitations. We demonstrate the capabilities of rhodent via a set of examples for systems consisting of Al and Ag clusters and organic molecules.
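The narrow-band-from-broad-band trick mentioned in that abstract follows from linearity: the dipole response to a weak delta kick fixes the system's impulse response, and convolving it with any weak pulse yields the response to that pulse. A minimal signal-processing sketch of the idea (generic, not the rhodent API; all names are hypothetical):

```python
import numpy as np

def response_to_pulse(dt, dipole_kick, kick_strength, pulse):
    """Linear response to an arbitrary weak pulse, from the delta-kick response.

    dipole_kick[i] is the induced dipole at time i*dt after a kick
    E(t) = kick_strength * delta(t).  By linearity, the response to pulse(t)
    is the convolution (dipole_kick / kick_strength) * pulse, evaluated here
    in the frequency domain with zero-padding to avoid circular wrap-around.
    """
    n = len(dipole_kick)
    m = 2 * n                                             # pad: linear, not circular
    alpha_w = np.fft.fft(dipole_kick, m) / kick_strength  # impulse-response spectrum
    d_w = alpha_w * np.fft.fft(pulse, m)
    return np.fft.ifft(d_w)[:n].real * dt
```

One broad-band (kick) calculation thus stands in for a whole family of narrow-band laser runs: only the cheap frequency-domain product changes per pulse.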
- High-performance simulations of higher representations of Wilson fermions
  We present HiRep v2, an open-source software suite for high-performance lattice field theory simulations with dynamical Wilson fermions in higher representations of SU(N_g) gauge groups. This new version fully supports graphics processing unit (GPU) acceleration, optimizing both gauge configuration generation and measurements for NVIDIA and AMD GPUs. HiRep v2 integrates improved gauge and fermionic lattice actions, advanced inverters, and Monte Carlo algorithms, including the (Rational) Hybrid Monte Carlo ((R)HMC) with Hasenbusch acceleration. It exhibits excellent scalability across multiple GPUs and nodes with minimal efficiency loss, making it a robust tool for large-scale simulations in physics beyond the Standard Model.
- TBPLaS 2.0: A tight-binding package for large-scale simulation
  Common exact-diagonalization-based techniques for solving tight-binding models suffer from O(N^2) memory and O(N^3) CPU-time scaling with respect to model size, hindering their application to large tight-binding models. In contrast, the tight-binding propagation method (TBPM) achieves linear scaling in both memory and CPU time and is capable of handling large tight-binding models with billions of orbitals. In this paper, we introduce version 2.0 of TBPLaS, a package for large-scale simulation based on the TBPM [1]. This new version brings significant improvements and many new features. The existing Python/Cython modeling tools have been thoroughly optimized, and a compatible C++ implementation of the modeling tools is now available, offering efficiency enhancements of several orders of magnitude. The solvers have been rewritten in C++ from scratch, improving efficiency severalfold or even by an order of magnitude. The workflow for utilizing the solvers has also been unified in a more comprehensive and consistent manner. New features include spin texture, Berry curvature, and Chern number calculations, partial diagonalization for specific eigenvalues and eigenstates, analytical Hamiltonians, and GPU computing support. The documentation and tutorials have also been updated to the new version. In this paper, we discuss the revisions with respect to version 1.3 and demonstrate the new features. Benchmarks of the modeling tools and solvers are also provided.
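The linear scaling claimed for the TBPM comes from replacing diagonalization with repeated sparse matrix-vector products. A minimal sketch of the core idea, Chebyshev expansion of the time-evolution operator applied to a state (a generic illustration, not the TBPLaS API; the function name and arguments are assumptions):

```python
import numpy as np
from scipy.special import jv  # Bessel functions of the first kind

def chebyshev_propagate(H, psi, t, emax, nterms=None):
    """Evaluate exp(-i H t) |psi> via a Chebyshev expansion.

    H is a sparse Hermitian Hamiltonian with spectrum inside [-emax, emax].
    Uses exp(-i x tau) = J_0(tau) T_0(x) + 2 sum_k (-i)^k J_k(tau) T_k(x)
    with x = H/emax and tau = emax * t.  Only sparse matrix-vector products
    and a few vectors are needed, so cost and memory scale linearly with
    the number of orbitals.
    """
    Hs = H / emax                    # rescale spectrum into [-1, 1]
    tau = emax * t
    if nterms is None:
        nterms = int(tau) + 40       # J_k(tau) decays rapidly once k > tau
    phi_prev, phi = psi, Hs @ psi    # T_0 and T_1 applied to psi
    out = jv(0, tau) * phi_prev + 2 * (-1j) * jv(1, tau) * phi
    for k in range(2, nterms):
        phi_prev, phi = phi, 2 * (Hs @ phi) - phi_prev  # T_k recurrence
        out = out + 2 * (-1j) ** k * jv(k, tau) * phi
    return out
```

The same polynomial-expansion machinery, applied to spectral functions rather than the propagator, underlies linear-scaling estimates of densities of states and correlation functions from a handful of random initial states.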
