Metadata for Simulated Brain Lesion Classification Tasks

Published: 29 July 2025 | Version 1 | DOI: 10.17632/gcdng8r3g3.1
Contributors:

Description

The BrainLesion-15Class (100 Patients) dataset is a fully synthetic metadata collection describing 100 simulated brain MRI cases, each labeled with one of 15 lesion types modeled on real-world conditions, such as Glioma, Stroke, MS (multiple sclerosis), and AVM (arteriovenous malformation). Each entry records demographic information (age, gender), imaging modality (T1, T2, FLAIR, T1c), scan resolution, and scanner type. The dataset is suited to prototyping machine learning workflows such as multi-class classification, metadata filtering, and data stratification. Because it contains no real patient data, it is privacy-safe and not intended for clinical use.
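As an illustration of the metadata filtering and stratification uses mentioned above, the following is a minimal sketch with pandas. It builds a small stand-in table rather than loading the published file, and the column names (age, modality, lesion_type), the seed, and the 20% test fraction are all assumptions, not details taken from the dataset.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the published CSV. Column names are assumptions, and only
# the four lesion types named in the description are used here.
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame({
    "age": rng.integers(20, 81, size=n),
    "modality": rng.choice(["T1", "T2", "FLAIR", "T1c"], size=n),
    "lesion_type": rng.choice(["Glioma", "Stroke", "MS", "AVM"], size=n),
})

# Metadata filtering: keep FLAIR scans of subjects aged 50 or older.
flair_50_plus = df[(df["modality"] == "FLAIR") & (df["age"] >= 50)]

# Stratified split: draw about 20% of each lesion class for the test set,
# so class proportions are roughly preserved in both parts.
test = df.groupby("lesion_type").sample(frac=0.2, random_state=0)
train = df.drop(test.index)

print(len(flair_50_plus), len(train), len(test))
```

With 15 classes and only 100 rows, some classes will have very few samples, so per-class sampling like this may leave a class with no test rows.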

Files

Steps to reproduce

To reproduce the BrainLesion-15Class (100 Patients) dataset, first define a list of 15 common brain lesion types, such as Glioma, Stroke, and AVM. Set a fixed random seed so that the output is reproducible, then use Python libraries such as numpy and pandas to randomly generate the metadata for each patient: age (between 20 and 80), gender (male or female), MRI modality (T1, T2, FLAIR, or T1c), scan resolution, and scanner type (e.g., GE 3T). Build a DataFrame in which each row represents one synthetic subject, and assign each row a lesion type drawn at random from the predefined list. Finally, export the DataFrame to a CSV file with df.to_csv(). The result is a fully synthetic, privacy-safe metadata file suitable for machine learning and medical imaging experimentation.
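The steps above can be sketched roughly as follows. This is not the authors' original script: the seed value, column names, output file name, resolution strings, and the scanner labels other than GE 3T are assumptions made for illustration. Only four of the 15 lesion types are named in this record, so the remaining eleven appear as clearly marked placeholders. The sketch uses numpy's Generator API for seeding; the original may have used a different seeding call.

```python
import numpy as np
import pandas as pd

SEED = 42          # assumption: the source does not state the seed value
N_PATIENTS = 100

# Only these four labels are named in the dataset description. The other
# eleven are placeholders, not the dataset's actual class names.
NAMED_TYPES = ["Glioma", "Stroke", "MS", "AVM"]
LESION_TYPES = NAMED_TYPES + [f"LesionType{i:02d}" for i in range(5, 16)]

MODALITIES = ["T1", "T2", "FLAIR", "T1c"]
RESOLUTIONS = ["1x1x1 mm", "0.5x0.5x1 mm"]      # assumed example values
SCANNERS = ["GE 3T", "ScannerB", "ScannerC"]    # only "GE 3T" is from the source


def generate_metadata(n=N_PATIENTS, seed=SEED):
    """Return a DataFrame with one randomly generated row per synthetic subject."""
    rng = np.random.default_rng(seed)  # fixed seed makes the output repeatable
    return pd.DataFrame({
        "patient_id": [f"P{i:03d}" for i in range(1, n + 1)],
        "age": rng.integers(20, 81, size=n),  # upper bound exclusive: ages 20..80
        "gender": rng.choice(["Male", "Female"], size=n),
        "modality": rng.choice(MODALITIES, size=n),
        "resolution": rng.choice(RESOLUTIONS, size=n),
        "scanner": rng.choice(SCANNERS, size=n),
        "lesion_type": rng.choice(LESION_TYPES, size=n),
    })


df = generate_metadata()
# Hypothetical output file name; index=False keeps the row index out of the CSV.
df.to_csv("brainlesion_15class_metadata.csv", index=False)
```

Because every field is drawn independently and uniformly at random, the lesion label carries no relationship to the other columns. A classifier trained on this metadata alone should therefore perform at about chance level, which makes the file more useful for testing pipelines than for measuring model accuracy.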

Institutions

VIT Bhopal University

Categories

Neurocomputing, Brain Tumor, Brain Abnormalities

Licence