Predictive maintenance dataset

Published: 23 January 2026 | Version 1 | DOI: 10.17632/5ww3zv87y7.1
Contributor:
Ragavan K

Description

The AI4I 2020 Predictive Maintenance Dataset is a synthetic industrial dataset designed to support research and development in predictive maintenance using machine learning. It simulates realistic operational conditions of industrial machinery, enabling failure prediction when real-world industrial data is unavailable due to privacy or cost constraints. The dataset was created for benchmarking predictive maintenance algorithms and is publicly available through the UCI Machine Learning Repository.

It contains approximately 10,000 observations, each representing a machine operating instance. The dataset includes process parameters, operational conditions, and failure information, making it suitable for binary and multi-class classification tasks. Key operational features include air temperature, process temperature, rotational speed, torque, and tool wear, along with product quality categories.

The primary target variable, Machine failure, indicates whether a failure occurred. Additionally, the dataset provides labels for five specific failure modes:

- Tool Wear Failure (TWF)
- Heat Dissipation Failure (HDF)
- Power Failure (PWF)
- Overstrain Failure (OSF)
- Random Failure (RNF)

A machine is considered failed if at least one of these failure modes occurs.

Although synthetic, the dataset is structured to closely resemble real industrial sensor data and is widely used for:

- Predictive maintenance modeling
- Failure detection and classification
- Feature importance and explainable AI (XAI) studies
- Benchmarking machine learning algorithms

Because of its clean structure and labeled failure modes, the AI4I 2020 dataset is especially useful for educational purposes, academic research, and baseline model evaluation in predictive maintenance applications.
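The description says a machine counts as failed when any of the five failure modes occurs. The following is a minimal Python sketch of that rule, and of one possible way to derive a multi-class label from the same flags. It assumes the failure modes are stored as 0/1 columns named TWF, HDF, PWF, OSF, and RNF. The function names and the example rows are hypothetical and are not taken from the dataset files.

```python
# The five failure-mode flags named in the description (assumed to be 0/1 columns).
FAILURE_MODES = ("TWF", "HDF", "PWF", "OSF", "RNF")


def machine_failed(row):
    """Return 1 if at least one failure-mode flag is set, else 0."""
    return int(any(int(row[mode]) == 1 for mode in FAILURE_MODES))


def failure_class(row):
    """Return the first active failure mode as a multi-class label, or "none"."""
    for mode in FAILURE_MODES:
        if int(row[mode]) == 1:
            return mode
    return "none"


# Hypothetical rows for illustration only; these are not records from the dataset.
rows = [
    {"TWF": 0, "HDF": 0, "PWF": 0, "OSF": 0, "RNF": 0},
    {"TWF": 0, "HDF": 1, "PWF": 0, "OSF": 0, "RNF": 0},
    {"TWF": 0, "HDF": 0, "PWF": 1, "OSF": 1, "RNF": 0},
]

binary_labels = [machine_failed(r) for r in rows]
class_labels = [failure_class(r) for r in rows]
print(binary_labels)
print(class_labels)
```

The `int(...)` conversion is there so the same functions would also accept string values, as produced when rows are read with `csv.DictReader`. When several modes are active in one row, this sketch returns the first one in the order of `FAILURE_MODES`. That tie-breaking choice is an assumption, and a multi-label encoding may be more appropriate for some studies.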

Categories

Data Science, Machine Learning, Industrial Analysis, Industry 4.0

Licence