ClusterSense-CO: A Large-Scale Cost Optimization Dataset for Energy-Efficient CPU Cluster Operations
Description
ClusterSense-CO is a large-scale dataset for research on operational cost optimization and energy-efficient resource management in CPU cluster infrastructures. It records resource utilization, power consumption, operational expenses, workload distribution, energy costs, and optimization-related performance metrics. These data support research on cost-aware scheduling strategies, energy consumption prediction, resource provisioning techniques, and sustainable computing. The dataset is suited to building machine learning and optimization models that reduce operational costs while maintaining system performance and resource efficiency in cluster computing environments.
Steps to reproduce
1. Collect resource utilization data from CPU cluster systems, including CPU usage, memory consumption, energy usage, and workload information.
2. Record operational cost metrics associated with resource consumption and system operation.
3. Aggregate workload and cost-related information over predefined monitoring intervals.
4. Clean and preprocess the collected data by handling missing values and removing anomalies.
5. Calculate optimization-related indicators where applicable.
6. Store the finalized dataset in CSV format.
7. Utilize the dataset to develop and evaluate cost optimization, energy efficiency, and resource management models for cluster computing environments.
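The aggregation, cleaning, indicator, and CSV steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline. The description does not give the dataset's schema or methods, so the column names (`timestamp`, `cpu_util`, `mem_util`, `energy_kwh`, `cost_usd`), the 300-second interval, median imputation, the interquartile-range anomaly rule, and the `cost_per_kwh` indicator are all assumptions chosen for the example.

```python
import csv
import statistics
from collections import defaultdict

# Hypothetical column names: the real schema is not given in the description.
MEAN_COLS = ("cpu_util", "mem_util")      # averaged within an interval
SUM_COLS = ("energy_kwh", "cost_usd")     # summed within an interval


def aggregate(samples, interval_s=300):
    """Group raw samples into fixed monitoring intervals (assumed 300 s)."""
    buckets = defaultdict(list)
    for s in samples:
        buckets[s["timestamp"] // interval_s * interval_s].append(s)
    rows = []
    for start in sorted(buckets):
        row = {"interval_start": start}
        for col in MEAN_COLS + SUM_COLS:
            # Skip missing readings; an interval with none stays None for now.
            vals = [s[col] for s in buckets[start] if s.get(col) is not None]
            if not vals:
                row[col] = None
            elif col in MEAN_COLS:
                row[col] = statistics.mean(vals)
            else:
                row[col] = sum(vals)
        rows.append(row)
    return rows


def fill_missing(rows):
    """Replace missing values with the column median (one simple choice)."""
    for col in MEAN_COLS + SUM_COLS:
        median = statistics.median([r[col] for r in rows if r[col] is not None])
        for r in rows:
            if r[col] is None:
                r[col] = median
    return rows


def drop_anomalies(rows, col="energy_kwh", k=1.5):
    """Drop rows outside the interquartile-range fences for one column."""
    q1, _, q3 = statistics.quantiles([r[col] for r in rows], n=4)
    spread = k * (q3 - q1)
    return [r for r in rows if q1 - spread <= r[col] <= q3 + spread]


def add_cost_per_kwh(rows):
    """Add an example indicator: operational cost per kWh consumed."""
    for r in rows:
        energy = r["energy_kwh"]
        r["cost_per_kwh"] = r["cost_usd"] / energy if energy else None
    return rows


def write_csv(rows, path):
    """Store the finalized rows in CSV format."""
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
```

Under these assumptions, a run would chain the functions in the order of the steps: `rows = add_cost_per_kwh(drop_anomalies(fill_missing(aggregate(samples))))`, followed by `write_csv(rows, "clustersense_co.csv")`, where `samples` is a list of dictionaries and the output file name is a placeholder.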
Institutions
- Daffodil International University, Dhaka Division, Dhaka