Synthetic Clustered Multi-Task Dataset

Name: Synthetic Clustered Multi-Task Dataset
Creator: Seyedsaman Emami
Published: 2026-01-26T11:43:30.402Z
Keywords: Machine Learning, Clustering, Linear Regression, Binary Classification

Emami, Seyedsaman; Hernández Lobato, Daniel; Martínez Muñoz, Gonzalo

doi:10.17632/3gms2fvs93.1


Published: 26 January 2026 | Version 1 | DOI: 10.17632/3gms2fvs93.1

Contributors:

Seyedsaman Emami, Daniel Hernández Lobato, Gonzalo Martínez Muñoz

Description

Data are generated with a mixing parameter $\omega = 0.9$. Input features $\mathbf{x}$ are sampled uniformly from the hypercube $[-1,1]^{d_x}$, where $d_x = 5$ denotes the input dimensionality.
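The input sampling described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' generation script. The function name `sample_inputs`, the number of samples, and the seed are assumptions made for the example; the description does not state how many points are drawn per task.

```python
import random

D_X = 5  # input dimensionality d_x, as stated in the description


def sample_inputs(n_samples, d_x=D_X, seed=0):
    """Draw n_samples points uniformly from the hypercube [-1, 1]^d_x.

    n_samples and seed are illustrative assumptions, not dataset settings.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(d_x)] for _ in range(n_samples)]


# Hypothetical usage: ten 5-dimensional input vectors.
X = sample_inputs(n_samples=10)
```

Each row of `X` is one input vector $\mathbf{x}$ with $d_x = 5$ coordinates, each drawn independently from $[-1, 1]$.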

Files

Steps to reproduce

Define the number of repetitions (100), number of tasks per repetition (25), number of clusters (5), and tasks per cluster (5). Fix the mixing parameter to $\omega = 0.9$ and the input dimension to $d_x = 5$. For each cluster, instantiate a shared latent function using independently sampled random Fourier features. This component represents the common structure shared by all tasks within the same cluster.
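A rough sketch of these steps, under stated assumptions, might look like the following. The experiment sizes, $\omega$, and $d_x$ are taken from the text. The number of random Fourier features, the length scale, the Gaussian weights on the features, and the contiguous task-to-cluster assignment are assumptions introduced only for illustration. The sketch covers the cluster-level shared function only and does not show how $\omega$ combines it with any task-specific component, since that is not specified here.

```python
import math
import random

N_REPETITIONS = 100       # repetitions, as stated
N_CLUSTERS = 5            # clusters per repetition, as stated
TASKS_PER_CLUSTER = 5     # tasks in each cluster, as stated
N_TASKS = N_CLUSTERS * TASKS_PER_CLUSTER  # 25 tasks per repetition
OMEGA = 0.9               # mixing parameter (its use is not sketched here)
D_X = 5                   # input dimension, as stated

N_FEATURES = 50           # ASSUMPTION: number of random Fourier features
LENGTH_SCALE = 1.0        # ASSUMPTION: length scale of the frequencies


def make_rff_function(rng, d_x=D_X, n_features=N_FEATURES,
                      length_scale=LENGTH_SCALE):
    """Return a random function f(x) built from random Fourier features.

    f(x) = sqrt(2 / D) * sum_j theta_j * cos(w_j . x + b_j), with
    w_j ~ N(0, I / length_scale^2), b_j ~ U[0, 2*pi], theta_j ~ N(0, 1).
    """
    freqs = [[rng.gauss(0.0, 1.0 / length_scale) for _ in range(d_x)]
             for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    weights = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def f(x):
        total = 0.0
        for w, b, theta in zip(freqs, phases, weights):
            total += theta * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
        return scale * total

    return f


rng = random.Random(0)  # ASSUMPTION: fixed seed for reproducibility
# One independently sampled shared latent function per cluster.
cluster_functions = [make_rff_function(rng) for _ in range(N_CLUSTERS)]
# ASSUMPTION: tasks are assigned to clusters in contiguous blocks of five.
task_to_cluster = [t // TASKS_PER_CLUSTER for t in range(N_TASKS)]
```

Evaluating `cluster_functions[task_to_cluster[t]]` at an input gives the component shared by task `t` and the other tasks in its cluster. Repeating the construction with a fresh seed for each of the `N_REPETITIONS` repetitions would yield independent draws.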

Institutions

Universidad Autonoma de Madrid
Madrid, Madrid
