Metadata for Simulated Brain Lesion Classification Tasks
Description
The BrainLesion-15Class (100 Patients) dataset is a fully synthetic metadata collection describing 100 simulated brain MRI cases, each labeled with one of 15 lesion types inspired by real-world conditions, such as Glioma, Stroke, MS (multiple sclerosis), and AVM (arteriovenous malformation). Each entry records demographic information (age, gender), imaging modality (T1, T2, FLAIR, or T1c), scan resolution, and scanner type. The dataset is suited to machine learning tasks such as multi-class classification, metadata filtering, and data stratification. Because it is entirely synthetic and non-clinical, it contains no real patient information.
Steps to reproduce
To reproduce the BrainLesion-15Class (100 Patients) dataset, first define a list of 15 commonly known brain lesion types, such as Glioma, Stroke, or AVM. Then use Python libraries such as numpy and pandas to randomly generate metadata for 100 patients: age (between 20 and 80), gender (male or female), MRI modality (T1, T2, FLAIR, or T1c), scan resolution, and scanner type (e.g., GE 3T). Build a DataFrame in which each row represents one synthetic subject, and assign each row a lesion type drawn at random from the predefined list. Set a fixed random seed before generating any values so that the output is reproducible. Finally, export the DataFrame to a CSV file with df.to_csv(). The result is a fully synthetic, privacy-safe metadata file suitable for machine learning and medical imaging experimentation.
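The steps above can be sketched as follows. This is a minimal illustration under several assumptions, not the exact script used to build the dataset: the seed value, column names, output filename, resolution strings, and all scanner names other than GE 3T are hypothetical, and only four of the 15 lesion names are given in the description, so the remaining eleven are placeholders to be replaced with the actual class names.

```python
import numpy as np
import pandas as pd

SEED = 42          # assumed value; the source only states that a fixed seed is used
N_PATIENTS = 100

# Only these four names appear in the description. The other eleven entries
# are placeholders, not real class names from the dataset.
LESION_TYPES = ["Glioma", "Stroke", "MS", "AVM"] + [
    f"LesionType_{i:02d}" for i in range(5, 16)
]
MODALITIES = ["T1", "T2", "FLAIR", "T1c"]
RESOLUTIONS = ["1x1x1 mm", "0.5x0.5x1 mm"]       # hypothetical values
SCANNERS = ["GE 3T", "Scanner_B", "Scanner_C"]   # only "GE 3T" is from the source


def make_metadata(n=N_PATIENTS, seed=SEED):
    """Return a DataFrame with one row of random metadata per synthetic subject."""
    rng = np.random.default_rng(seed)  # seeded generator makes the output repeatable
    return pd.DataFrame({
        "patient_id": [f"P{i:03d}" for i in range(1, n + 1)],
        "age": rng.integers(20, 81, size=n),  # upper bound is exclusive, so 20..80
        "gender": rng.choice(["Male", "Female"], size=n),
        "modality": rng.choice(MODALITIES, size=n),
        "resolution": rng.choice(RESOLUTIONS, size=n),
        "scanner": rng.choice(SCANNERS, size=n),
        "lesion_type": rng.choice(LESION_TYPES, size=n),
    })


df = make_metadata()
df.to_csv("brainlesion_15class_metadata.csv", index=False)  # hypothetical filename
```

Because every column is drawn independently, the lesion label in this sketch carries no relationship to the other fields. Calling make_metadata() again with the same seed returns an identical table.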