A High-Dimensional Machine Learning Framework for Robust Vehicle Failure Prediction and Predictive Maintenance

Published: 23 February 2026 | Version 1 | DOI: 10.17632/vs9jgz76yt.1
Contributor:
Ahmed AYON

Description

The rapid digitalization of modern vehicles generates high-dimensional sensor data that enables advanced predictive maintenance and early failure detection. Traditional diagnostic systems often fail to capture the complex nonlinear patterns in such data. This study proposes a robust machine learning benchmarking framework for vehicle failure prediction using a dataset containing 171 sensor-based attributes. Seven supervised learning algorithms are evaluated under a consistent preprocessing and validation strategy: Logistic Regression, Support Vector Machine, k-Nearest Neighbors, Naïve Bayes, Decision Tree, Random Forest, and Neural Network. Performance is assessed using accuracy, precision, recall, F1-score, and the area under the Receiver Operating Characteristic curve (ROC–AUC), so that results remain informative in imbalanced classification settings. Experimental results show that ensemble-based methods, particularly Random Forest, significantly outperform linear, probabilistic, and distance-based models, with Random Forest achieving a ROC–AUC of 0.983 on the test set. The findings provide empirical support for ensemble learning as a reliable and scalable approach to real-world vehicle failure diagnostics and predictive maintenance.
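
As an illustration of the ROC–AUC metric named above, the following is a minimal sketch that computes it from labels and predicted scores using the rank (Mann-Whitney U) formulation. The function name `roc_auc` and the toy labels and scores are assumptions made for this example; they are not taken from the dataset or from the study's code.

```python
def roc_auc(labels, scores):
    """ROC-AUC via the rank (Mann-Whitney U) formulation.

    labels: iterable of 0/1 values (1 = failure, an assumed encoding).
    scores: iterable of predicted scores, higher = more likely failure.
    Tied scores receive the average of the ranks they span.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC-AUC needs at least one positive and one negative")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


# Toy example (hypothetical values): one failure is scored below one healthy unit.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)
```

The value can be read as the probability that a randomly chosen failure case receives a higher score than a randomly chosen non-failure case, which is why it does not depend on the class ratio in the way accuracy does.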

Steps to reproduce

Prior studies in predictive vehicle maintenance typically evaluate only a subset of available classification algorithms, use accuracy as the primary (or sole) evaluation metric, and do not account for the class imbalance that is inherent to failure datasets. These limitations can result in overly optimistic performance estimates and inappropriate algorithm selection for production deployment.
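
The effect of relying on accuracy alone can be sketched with a small example. The class ratio (20 failures in 1000 records), the two hypothetical predictors, and the helper name `binary_metrics` are all assumptions chosen for illustration; they do not come from the dataset.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = failure)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Conventions assumed here: undefined ratios are reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced data: 20 failures followed by 980 healthy records.
y_true = [1] * 20 + [0] * 980

# Predictor A: always predicts "healthy".
always_healthy = [0] * 1000
baseline = binary_metrics(y_true, always_healthy)

# Predictor B: catches 15 of the 20 failures but raises 30 false alarms.
detector = [1] * 15 + [0] * 5 + [1] * 30 + [0] * 950
detected = binary_metrics(y_true, detector)

print("always healthy:", baseline)
print("detector:      ", detected)
```

Under these assumed numbers the trivial predictor reaches 98% accuracy while detecting no failures, and the detector that finds 75% of failures scores lower on accuracy (96.5%). Ranking the two by accuracy alone would select the predictor that is useless for maintenance.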

Categories

Machine Learning Algorithm, Machine Learning Theory
