Operational Data for Fault Prognosis in Particle Accelerators with Machine Learning

Published: 16 October 2023 | Version 3 | DOI: 10.17632/9zxrt6pf2k.3
Majdi Radaideh, Sarah Cousineau


This repository contains real-world operational data from the power systems of the Spallation Neutron Source facility, which delivers the world's most intense neutron beam. The dataset is intended for developing techniques and algorithms that identify system faults before they occur, so that operators can intervene in time and maintenance can be planned effectively. The authors used a radio-frequency test facility (RFTF) to run controlled laboratory experiments that simulate system failures without triggering a catastrophic breakdown. The dataset comprises waveform signals recorded during both normal operation and deliberate fault induction, providing a substantial amount of data for training statistical or machine learning models. The authors then carried out 21 test experiments in which they gradually introduced faults into the RFTF system, to evaluate how well the models detect and anticipate impending faults. These experiments combined changes to magnetic flux compensation with adjustments to the start pulse width, which gradually degraded several waveform characteristics, such as the system output voltage and current, and thereby mimicked real fault scenarios. All experiments took place at the Oak Ridge National Laboratory's Spallation Neutron Source facility in Oak Ridge, Tennessee, United States, during July 2022. Intended users of this dataset include researchers in control, predictive maintenance, machine learning, and signal processing.
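To make the gradual-degradation idea concrete, the following is a minimal sketch of one possible baseline, not the authors' method: it fits a per-sample mean and standard deviation on normal pulses and scores new pulses by their RMS z-score. The function names `fit_baseline` and `deviation_score`, the array layout (pulses, time samples, waveforms), and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def fit_baseline(normal_pulses):
    """Per-sample mean and std over normal pulses.

    normal_pulses: array of shape (n_pulses, n_samples, n_waveforms).
    This axis order is an assumption, not taken from the dataset.
    """
    mean = normal_pulses.mean(axis=0)
    std = normal_pulses.std(axis=0) + 1e-12  # guard against division by zero
    return mean, std


def deviation_score(pulses, mean, std):
    """RMS z-score of each pulse relative to the normal baseline."""
    z = (pulses - mean) / std
    return np.sqrt((z ** 2).mean(axis=(1, 2)))


def synthetic_demo(seed=0):
    """Score synthetic pulses whose offset grows step by step (illustrative only)."""
    rng = np.random.default_rng(seed)
    normal = rng.normal(size=(200, 100, 3))
    mean, std = fit_baseline(normal)
    # Four test pulses with offsets of 0, 1, 2 and 3 noise standard deviations.
    offsets = np.arange(4).reshape(-1, 1, 1)
    drifting = rng.normal(size=(4, 100, 3)) + offsets
    return deviation_score(drifting, mean, std)
```

On the synthetic pulses the score grows with the injected offset. Whether such a simple score is useful on the real waveforms is untested here; it is only a starting point against which trained models could be compared.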


Steps to reproduce

Dataset Specifications:
- Specific Areas: Machine learning, fault detection, control engineering
- Data type: Signals (time series data)
- Data Format: Raw data (minimally processed)
- Data usage: A simple Python script called "load_dataset.py" is provided with the dataset to show the user how to load and plot the data.
- Data Acquisition: The data acquisition system of the radio-frequency test facility (RFTF) captures 12 waveforms, each spanning 1.5 milliseconds and sampled every 400 nanoseconds (a 2.5 MHz sampling rate), before saving them to an external hard drive. These data are organized into 3D arrays and stored as binary NumPy files.
- Readable data: Excerpts of the binary data are provided in "data/train/sample_train_data.xlsx" and "data/test/sample_test_data.xlsx" in human-readable form, to give the user an impression of the nature and structure of the binary data.
- Parameters for data collection: Time series data were collected from real-time operation of the radio-frequency test facility (RFTF) high voltage converter modulator system during July 2022. The dataset features 12 unique waveforms, such as IGBT currents, modulator voltage and current, cap bank voltage, and magnetic flux.
- Data source location: Spallation Neutron Source, Oak Ridge National Laboratory, Oak Ridge, Tennessee, United States.
- Related works: This dataset is described in detail in a "Data in Brief" article and is analyzed in a paper published in the "International Journal of Prognostics and Health Management". See "Related links" below for further information about these papers.
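The acquisition details above can be sketched in a few lines of Python. This is not the provided "load_dataset.py" script; it is a minimal illustration that assumes the arrays are plain `.npy` files with shape (pulses, time samples, waveforms). The helper names, the axis order, and the file name `example.npy` are hypothetical, and the sample count of 3750 is only the arithmetic 1.5 ms / 400 ns, which the real files may not match exactly.

```python
import os
import tempfile

import numpy as np

SAMPLE_INTERVAL_S = 400e-9  # 400 ns between samples, per the description
PULSE_LENGTH_S = 1.5e-3     # 1.5 ms per waveform, per the description
N_WAVEFORMS = 12            # waveforms captured by the RFTF acquisition system


def expected_samples():
    """Samples per waveform implied by the stated length and interval."""
    return round(PULSE_LENGTH_S / SAMPLE_INTERVAL_S)


def load_waveforms(path):
    """Load a 3D waveform array from a binary NumPy (.npy) file."""
    data = np.load(path)
    if data.ndim != 3:
        raise ValueError(f"expected a 3D array, got shape {data.shape}")
    return data


def time_axis_us(n_samples, dt=SAMPLE_INTERVAL_S):
    """Sample times in microseconds for a waveform of n_samples points."""
    return np.arange(n_samples) * dt * 1e6


def demo():
    """Round-trip a synthetic stand-in array; no real dataset file is read."""
    fake = np.zeros((4, expected_samples(), N_WAVEFORMS))
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.npy")  # hypothetical file name
        np.save(path, fake)
        data = load_waveforms(path)
    return data.shape, time_axis_us(data.shape[1])
```

With a real file, `load_waveforms` would be pointed at one of the dataset's NumPy files, and the returned time axis could be passed to any plotting library to draw a single waveform against time.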


Oak Ridge National Laboratory


Signal Processing, Machine Learning, Prognosis, Power Electronics, Particle Accelerator


U.S. Department of Energy