Integration of LHCb workflows on the Santos Dumont supercomputer
Description
To handle the growing amount of data coming from the LHC Run 3 and then the High-Luminosity LHC phases, LHCb aims to integrate Monte Carlo simulation workflows into supercomputers such as Santos Dumont, hosted at LNCC in Brazil. The data focus on the use of Santos Dumont CPU resources over three months, in the context of LHCb. They mainly consist of CSV files related to the CPU usage of the resources (CPU benchmark results, CPU seconds processed per second, job statuses, wallclock time allocated and used), as well as a Jupyter Notebook that presents plots based on these data. The data show that providing a more accurate CPU power value and leveraging multi-node allocations would ease the exploitation of a larger number of resources.
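As a minimal sketch of how the wallclock figures described above could be inspected, the snippet below loads one CSV and computes the fraction of allocated wallclock time actually used. The file path and column names (`wallclock_allocated`, `wallclock_used`) are assumptions for illustration, not the actual schema of the dataset.

```python
# Minimal sketch: inspect wallclock usage with pandas.
# NOTE: the file path and column names are hypothetical,
# not the actual schema shipped with this dataset.
import pandas as pd

df = pd.read_csv("data/hpc/wallclock.csv")  # hypothetical path

# Efficiency: fraction of the allocated wallclock actually used per job.
df["efficiency"] = df["wallclock_used"] / df["wallclock_allocated"]
print(df["efficiency"].describe())
```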
Files
Steps to reproduce
- Raw data and transformed data come from the Santos Dumont supercomputer and from the LHCbDIRAC webapp interface: more details about data collection are available in `/data/<webapp or hpc>/README.md`.
- `SDumontAnalysis.ipynb` is the main file: it gathers data from `data` and contains the source code to generate the plots in `res` (see the sketch after this list for the kind of plot it produces). To run it properly, you need Python 3 installed along with Jupyter Notebook, matplotlib, seaborn, numpy, and pandas:

```
jupyter notebook SDumontAnalysis.ipynb
```
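For reference, here is a minimal sketch of the kind of plot the notebook generates from the CSV files. The file path and the `status` column are hypothetical stand-ins; the actual notebook may organize the data differently.

```python
# Minimal sketch of a plot like those written to `res`.
# NOTE: the file path and column name are hypothetical stand-ins
# for the actual files under `data/`.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

jobs = pd.read_csv("data/hpc/jobs.csv")  # hypothetical path

# Distribution of job statuses over the three-month period.
sns.countplot(data=jobs, x="status")
plt.title("LHCb job statuses on Santos Dumont")
plt.tight_layout()
plt.savefig("res/job_statuses.png")
```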