Multi-task synthetic dataset

Published: 10 December 2025 | Version 2 | DOI: 10.17632/r2mnkjfmh3.2

Description

A collection of synthetic datasets for evaluating multi-task learning and transfer learning algorithms in both regression and binary classification settings. It consists of 100 independently generated batches, each initialized with a distinct random seed to promote diversity across tasks. Every batch contains 10 tasks, two of which are designated outliers. Each task has 300 training and 1,000 test instances described by five input features. Class representation is balanced, and variation between tasks is controlled through a weighting parameter of w = 0.9.
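
To make the layout concrete, the following is a minimal Python sketch of one batch with the dimensions stated above. It is not the authors' generation procedure, which lives in the linked repository. The function name make_batch, the linear mixing of a shared and a task-specific coefficient vector through w, the treatment of the last two tasks as outliers, and the median threshold used to balance the classes are all assumptions made for illustration only.

```python
import numpy as np

# Sizes taken from the dataset description.
N_BATCHES, N_TASKS, N_OUTLIERS = 100, 10, 2
N_TRAIN, N_TEST, N_FEATURES = 300, 1000, 5
W = 0.9  # weighting parameter w from the description


def make_batch(seed):
    """Hypothetical generator for one batch; NOT the authors' procedure."""
    rng = np.random.default_rng(seed)  # one distinct seed per batch
    common = rng.normal(size=N_FEATURES)  # component shared across tasks
    batch = []
    for task in range(N_TASKS):
        specific = rng.normal(size=N_FEATURES)  # task-specific component
        # Assumption: the last N_OUTLIERS tasks ignore the shared component.
        is_outlier = task >= N_TASKS - N_OUTLIERS
        coef = specific if is_outlier else W * common + (1 - W) * specific
        X = rng.uniform(-1.0, 1.0, size=(N_TRAIN + N_TEST, N_FEATURES))
        y_reg = X @ coef  # regression target
        # Assumption: thresholding at the median yields balanced classes.
        y_clf = (y_reg > np.median(y_reg)).astype(int)
        batch.append({
            "outlier": is_outlier,
            "X_train": X[:N_TRAIN], "X_test": X[N_TRAIN:],
            "y_reg_train": y_reg[:N_TRAIN], "y_reg_test": y_reg[N_TRAIN:],
            "y_clf_train": y_clf[:N_TRAIN], "y_clf_test": y_clf[N_TRAIN:],
        })
    return batch


# Instances per task, per batch, and over the whole collection.
per_task = N_TRAIN + N_TEST
per_batch = N_TASKS * per_task
total = N_BATCHES * per_batch
```

With the stated sizes, each task holds 1,300 instances, each batch 13,000, and the full collection 1,300,000 per setting. Calling make_batch with a different seed for each of the 100 batches mirrors the seeding scheme described above.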

Steps to reproduce

To reproduce the dataset, please refer to the corresponding GitHub repository available at https://github.com/GAA-UAM/R-MTGB

Institutions

  • Universidad Autonoma de Madrid

Categories

Multi-Objective Parameter Optimization, Linear Regression, Binary Classification
