Synthetic Clustered Multi-Task Dataset

Published: 26 January 2026 | Version 1 | DOI: 10.17632/3gms2fvs93.1
Contributors:
Seyedsaman Emami

Description

Data are generated with a mixing parameter $\omega = 0.9$. Input features $\mathbf{x}$ are sampled uniformly from the hypercube $[-1,1]^{d_x}$, where $d_x = 5$ denotes the input dimensionality.
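As an illustration of the input sampling described above, the following minimal sketch draws points uniformly from $[-1,1]^{d_x}$ with $d_x = 5$. The function name `sample_inputs`, the sample count of 200, and the seed are assumptions made for this example; they are not stated in the dataset record.

```python
import random

D_X = 5  # input dimensionality d_x, as stated in the description


def sample_inputs(n_samples, d_x=D_X, seed=0):
    """Draw n_samples points uniformly from the hypercube [-1, 1]^d_x.

    n_samples and seed are illustrative choices, not values from the dataset.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(d_x)] for _ in range(n_samples)]


# Hypothetical usage: 200 input vectors, each with 5 coordinates in [-1, 1].
X = sample_inputs(n_samples=200)
```

Each row of `X` is one input vector $\mathbf{x}$; a fixed seed keeps the draw reproducible.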

Files

Steps to reproduce

Define the number of repetitions (100), number of tasks per repetition (25), number of clusters (5), and tasks per cluster (5). Fix the mixing parameter to $\omega = 0.9$ and the input dimension to $d_x = 5$. For each cluster, instantiate a shared latent function using independently sampled random Fourier features. This component represents the common structure shared by all tasks within the same cluster.
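The setup above can be sketched as follows. This is only an illustration under stated assumptions: the number of Fourier features (`N_FEATURES`), the kernel length scale (`LENGTH_SCALE`), the Gaussian frequency distribution, the seed, and the contiguous assignment of tasks to clusters are hypothetical choices that the record does not specify. The mixing parameter $\omega$ is declared but not applied, because the step that uses it is not described here.

```python
import math
import random

# Values stated in the steps to reproduce.
N_REPETITIONS = 100
N_CLUSTERS = 5
TASKS_PER_CLUSTER = 5
N_TASKS = N_CLUSTERS * TASKS_PER_CLUSTER  # 25 tasks per repetition
OMEGA = 0.9  # mixing parameter (declared only; its use is not described)
D_X = 5      # input dimension

# Assumptions for this sketch (not given in the dataset record).
N_FEATURES = 50
LENGTH_SCALE = 1.0


def make_rff_function(rng, d_x=D_X, n_features=N_FEATURES, length_scale=LENGTH_SCALE):
    """Return a random function f(x) built from random Fourier features.

    f(x) = sqrt(2 / D) * sum_j theta_j * cos(w_j . x + b_j),
    with Gaussian frequencies w_j, uniform phases b_j, Gaussian weights theta_j.
    """
    freqs = [[rng.gauss(0.0, 1.0 / length_scale) for _ in range(d_x)]
             for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    weights = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def f(x):
        total = 0.0
        for w, b, theta in zip(freqs, phases, weights):
            projection = sum(wi * xi for wi, xi in zip(w, x))
            total += theta * math.cos(projection + b)
        return scale * total

    return f


rng = random.Random(0)

# One independently sampled shared latent function per cluster.
shared_functions = [make_rff_function(rng) for _ in range(N_CLUSTERS)]

# Assumed layout: tasks 0-4 belong to cluster 0, tasks 5-9 to cluster 1, and so on.
task_to_cluster = [t // TASKS_PER_CLUSTER for t in range(N_TASKS)]
```

Evaluating `shared_functions[task_to_cluster[t]](x)` gives the cluster-level component for task `t` at input `x`. Repeating the construction with a fresh seed per repetition would yield the 100 repetitions.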

Institutions

Categories

Machine Learning, Clustering, Linear Regression, Binary Classification

Licence