A High-Dimensional Machine Learning Framework for Robust Vehicle Failure Prediction and Predictive Maintenance
Description
The rapid digitalization of modern vehicles has led to the generation of high-dimensional sensor data, enabling advanced predictive maintenance and early failure detection strategies. Traditional diagnostic systems often fail to capture the complex nonlinear patterns inherent in such data. This study proposes a machine learning benchmarking framework for vehicle failure prediction using a dataset containing 171 sensor-based attributes. Seven supervised learning algorithms are evaluated under a consistent preprocessing and validation strategy: Logistic Regression, Support Vector Machine, k-Nearest Neighbors, Naïve Bayes, Decision Tree, Random Forest, and Neural Network. Performance is assessed using accuracy, precision, recall, F1-score, and the area under the Receiver Operating Characteristic curve (ROC-AUC), so that the evaluation remains informative in imbalanced classification settings. Experimental results show that the ensemble-based Random Forest outperforms the linear, probabilistic, and distance-based models, achieving an ROC-AUC of 0.983 on the test set. The findings provide empirical support for ensemble learning as a reliable and scalable solution for real-world vehicle failure diagnostics and predictive maintenance applications.
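The benchmarking setup described above could be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the study's actual pipeline: the data is a synthetic stand-in generated with make_classification (171 features, an assumed failure rate of about 5%), and the imputation, scaling, split ratio, and all hyperparameters are assumptions chosen only to show seven classifiers sharing one preprocessing and evaluation path.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real dataset (assumption): 171 features,
# roughly 5% of samples in the positive (failure) class.
X, y = make_classification(
    n_samples=2000, n_features=171, n_informative=20,
    weights=[0.95, 0.05], random_state=0,
)

# Stratified split keeps the failure rate the same in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0,
)

# The seven model families named in the description; hyperparameters
# here are illustrative defaults, not the study's settings.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Support Vector Machine": SVC(probability=True, random_state=0),
    "k-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "Neural Network": MLPClassifier(hidden_layer_sizes=(64,), max_iter=300,
                                    random_state=0),
}

results = {}
for name, model in models.items():
    # Every model goes through the same imputation and scaling steps,
    # fitted on the training split only.
    pipe = make_pipeline(SimpleImputer(strategy="median"),
                         StandardScaler(), model)
    pipe.fit(X_train, y_train)
    scores = pipe.predict_proba(X_test)[:, 1]  # probability of failure
    preds = pipe.predict(X_test)
    results[name] = {
        "roc_auc": roc_auc_score(y_test, scores),
        "f1": f1_score(y_test, preds, zero_division=0),
    }

for name, metrics in sorted(results.items(),
                            key=lambda kv: kv[1]["roc_auc"], reverse=True):
    print(f"{name:24s} ROC-AUC={metrics['roc_auc']:.3f} F1={metrics['f1']:.3f}")
```

Wrapping each model in the same pipeline is one way to keep the comparison fair, since no model sees differently prepared inputs. Scores on this synthetic data say nothing about the reported results on the real dataset.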
Steps to reproduce
Prior studies in predictive vehicle maintenance typically evaluate only a subset of available classification algorithms, use accuracy as the primary (or sole) evaluation metric, and do not account for the class imbalance that is inherent to failure datasets. These limitations can result in overly optimistic performance estimates and inappropriate algorithm selection for production deployment.
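To make the accuracy concern concrete, the short sketch below uses hypothetical numbers (1000 vehicles, 20 of which fail; these figures are an assumption, not taken from the dataset) and a trivial classifier that never predicts a failure. It scores high on accuracy while detecting no failures at all.

```python
from sklearn.metrics import accuracy_score, f1_score, recall_score

# Hypothetical imbalanced test set: 20 failures (1) among 1000 vehicles.
y_true = [1] * 20 + [0] * 980

# A degenerate classifier that always predicts "no failure".
y_pred = [0] * 1000

accuracy = accuracy_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred, zero_division=0)

print(f"accuracy={accuracy:.2f} recall={recall:.2f} f1={f1:.2f}")
# prints: accuracy=0.98 recall=0.00 f1=0.00
```

Recall and F1 expose the failure that accuracy hides, which is why a benchmark on imbalanced failure data benefits from reporting them alongside ROC-AUC.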
Institutions
- North South University, Dhaka Division, Dhaka