Data set for learning heuristic parameters by a multi-target regression with dependent outputs

Published: 22 December 2022 | Version 1 | DOI: 10.17632/hm5d3cc45y.1
Christian Gahm


In the research paper "Learning-augmented heuristics for scheduling parallel serial-batch processing machines" (Computers & Operations Research, 2022, 10.1016/j.cor.2022.106122), we propose to improve solution efficiency by using machine learning (i.e., artificial neural networks, NN) to predict the most suitable parameter configurations for a heuristic from the literature (BATCS-b) and for a new heuristic (BATCS-d). This requires a multi-target regression with dependent outputs. For the learning task, the heuristic parameter configurations achieving the best objective value (minimum weighted tardiness) are known for all 93,360 instances provided by the UGWT PSBIJF instance set (cf. “Extended instance sets for the parallel serial-batch scheduling problem with incompatible job families, sequence-dependent setup times, and arbitrary sizes”, Mendeley Data, V2, 10.17632/rxc695hj2k.2). An important aspect to consider during learning is that more than one parameter configuration can achieve the best objective value for a single scheduling problem instance. For each PSBIJF problem instance, the provided data set contains the instance identifier, the best objective value, two feature vectors (CF with 12 features and AF with 85 features), and the best parameter configurations (targets). Details on the data can be found in the research paper cited above.
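
Because several parameter configurations can be equally good for one instance, a learner should not be penalized for predicting any one of them. The following minimal Python sketch illustrates one possible way to handle this: it groups all best configurations by instance identifier and scores a prediction by its squared distance to the nearest best configuration. The row layout, the identifiers, the parameter values, and the function names `group_targets` and `nearest_target_loss` are hypothetical and chosen for illustration only; this is not necessarily the loss used in the research paper, and the actual file layout should be taken from the data set itself.

```python
from collections import defaultdict
from math import inf


def group_targets(rows):
    """Collect every best parameter configuration per instance identifier."""
    targets = defaultdict(list)
    for instance_id, config in rows:
        targets[instance_id].append(tuple(config))
    return dict(targets)


def nearest_target_loss(prediction, best_configs):
    """Squared Euclidean distance from a prediction to the closest best configuration."""
    best = inf
    for config in best_configs:
        dist = sum((p - c) ** 2 for p, c in zip(prediction, config))
        best = min(best, dist)
    return best


# Hypothetical toy rows: (instance identifier, best parameter configuration).
# "inst-1" has two configurations that both reach the best objective value.
rows = [
    ("inst-1", (0.5, 2.0)),
    ("inst-1", (1.0, 2.0)),
    ("inst-2", (0.1, 4.0)),
]

targets = group_targets(rows)
loss = nearest_target_loss((1.25, 2.0), targets["inst-1"])
print(loss)
```

Taking the minimum over all best configurations means a prediction that matches any one of them exactly receives a loss of zero, regardless of how far it is from the others.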



Universität Augsburg, Wirtschaftswissenschaftliche Fakultät


Heuristics, Machine Learning, Batch Scheduling, Scheduling