Large-Scale Curated Multivariate Time Series Anomaly Detection Dataset for Laptop Performance Metrics
Description
High-quality multivariate time-series datasets are far less accessible than datasets of more common types such as images or text, because collecting them requires resource-intensive continuous monitoring, precise annotation, and long-term observation. This paper introduces a cost-effective alternative: a large-scale, curated dataset designed for anomaly detection in computing systems’ performance metrics. The dataset comprises 45 GB of multivariate time-series data collected from 66 systems, capturing key performance indicators such as CPU usage, memory consumption, disk I/O, system load, and power consumption across diverse hardware configurations and real-world usage scenarios. Annotated anomalies, including performance degradation and resource inefficiencies, provide ground truth for benchmarking anomaly detection models. By lowering the barrier to obtaining time-series data, this resource supports machine learning work on anomaly detection, predictive maintenance, and system optimisation, and is intended as a practical foundation for researchers and practitioners developing reliable and efficient computing systems.
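
To illustrate how annotated anomalies can serve as ground truth when evaluating a detector, here is a minimal sketch in Python. It is not the authors' method and does not use the dataset's actual file layout or column names, which are not given here. The series `cpu_usage`, the `labels` list, and the helper functions `rolling_zscore_flags` and `precision_recall` are all hypothetical. The sketch covers a single metric for brevity, whereas the dataset itself is multivariate.

```python
from statistics import mean, pstdev


def rolling_zscore_flags(values, window=20, threshold=3.0):
    """Flag points lying more than `threshold` standard deviations
    from the mean of the preceding `window` samples."""
    flags = [False] * len(values)
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # A flat history: any change at all counts as a deviation.
            flags[i] = values[i] != mu
        else:
            flags[i] = abs(values[i] - mu) / sigma > threshold
    return flags


def precision_recall(predicted, truth):
    """Point-wise precision and recall of predicted flags against labels."""
    tp = sum(p and t for p, t in zip(predicted, truth))
    fp = sum(p and not t for p, t in zip(predicted, truth))
    fn = sum(t and not p for p, t in zip(predicted, truth))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Synthetic stand-in for one metric (assumption: NOT real dataset values).
cpu_usage = [30.0 + (i % 5) for i in range(100)]
labels = [False] * 100
cpu_usage[60] = 95.0   # injected spike
labels[60] = True      # its hypothetical ground-truth annotation

flags = rolling_zscore_flags(cpu_usage)
precision, recall = precision_recall(flags, labels)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A trailing-window z-score is only a simple baseline chosen for illustration. Point-wise precision and recall are likewise just one possible scoring scheme, and range-based metrics may suit interval-style annotations better.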