Brain tumor MRI image for Federated Learning

Published: 29 May 2026 | Version 2 | DOI: 10.17632/x69nzw9zzw.2

Description

This dataset contains 10,417 de-identified, single-channel (grayscale) brain MRI slices. It is designed to support machine learning and computer vision research for robust, four-class brain tumor classification, particularly contributing to innovations in privacy-preserving federated learning and explainable AI (XAI) frameworks.

Dataset Composition: The dataset is organized into four distinct, clinically meaningful categories based on pathology.
⦁ Glioma tumor (2,547 images): MRI slices exhibiting glioma tumors.
⦁ Meningioma tumor (2,712 images): MRI slices exhibiting meningioma tumors.
⦁ Pituitary tumor (2,658 images): MRI slices exhibiting pituitary tumors.
⦁ No tumor (2,500 images): Healthy, tumor-free brain MRI slices.

Preprocessing:
⦁ Slices are harmonized to 224x224 pixels and loaded as single-channel grayscale images.
⦁ A unified preprocessing pipeline was applied, including resizing, optional center/zero-padding, and per-image normalization to ensure consistency.
⦁ For federated learning simulations, the dataset is pre-partitioned across four clients and further divided into an 80/10/10 (train/validation/test) split conducted patient-wise to prevent data leakage.

This dataset can be effectively used for:
⦁ Multiclass image classification and brain tumor recognition tasks.
⦁ Deep learning model development, including parameter-efficient CNNs, deeper CNNs, and hybrid architectures.
⦁ Benchmarking synchronous Federated Averaging (FedAvg) and privacy-preserving multi-site training methodologies.
⦁ Evaluating quantitative explainability (XAI) metrics, such as Deletion AUC and Grad-CAM++ visualizations.

File Information:
⦁ Total Images: 10,417
⦁ Image Format: Grayscale MRI slices
⦁ Resolution: 224x224 pixels
⦁ Data Structure: Distributed across 4 distinct client partitions.
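The patient-wise 80/10/10 split mentioned above can be illustrated with a minimal sketch. It assumes each slice can be paired with a patient identifier; the description does not say how slices map to patients, so the `records` list, the patient IDs, and the function name `patient_wise_split` are all hypothetical. The point is only that whole patients, not individual slices, are assigned to a subset, so slices from one patient never appear in both training and evaluation data.

```python
import random
from collections import defaultdict


def patient_wise_split(records, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split (patient_id, image_name) records so no patient spans two subsets.

    The patient IDs are an assumption for illustration; the dataset
    description does not document a slice-to-patient mapping.
    """
    by_patient = defaultdict(list)
    for patient_id, image_name in records:
        by_patient[patient_id].append(image_name)

    # Sort before shuffling so the result depends only on the seed.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    # Fractions are applied to the number of patients, not of images.
    n_train = int(fractions[0] * len(patients))
    n_val = int(fractions[1] * len(patients))
    groups = {
        "train": patients[:n_train],
        "val": patients[n_train:n_train + n_val],
        "test": patients[n_train + n_val:],
    }
    return {
        name: [img for patient in ids for img in by_patient[patient]]
        for name, ids in groups.items()
    }


# Toy input: 20 hypothetical patients with 3 slices each.
records = [
    (f"patient_{p:02d}", f"slice_{p:02d}_{s}.jpg")
    for p in range(20)
    for s in range(3)
]
splits = patient_wise_split(records)
```

Because the fractions apply to patients, the image-level proportions only approximate 80/10/10 when patients contribute different numbers of slices.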

Steps to reproduce

1. Download the dataset: Access the dataset files from the respective repository and extract the contents into your working environment.

2. Dataset structure: Because this dataset is designed for federated learning, the data is distributed across distinct client partitions. Inside each client, the data is organized by pathology class:

BrainTumor_MRI_Dataset/
├── client_1/
│   ├── glioma_tumor/
│   │   ├── glioma_tumor_client_1_1.jpg
│   │   ├── ...
│   ├── meningioma_tumor/
│   ├── pituitary_tumor/
│   └── no_tumor/
├── client_2/
│   ├── glioma_tumor/
│   ├── ...
├── client_3/
└── client_4/

3. Data preprocessing:
⦁ Ensure all images are loaded in grayscale (single-channel) format and resized to the harmonized 224x224 pixel dimensions.
⦁ Apply the unified preprocessing pipeline, utilizing center/zero-padding where necessary, and perform per-image normalization to ensure consistency across the dataset.
⦁ Maintain the patient-wise 80/10/10 (train/validation/test) split within or across clients to strictly prevent data leakage during model validation.

4. Model training example (optional):
⦁ Load the data using standard data loaders in PyTorch, TensorFlow, or a federated learning framework like Flower (Flwr) or PySyft.
⦁ Choose a model architecture, ranging from parameter-efficient CNNs to deeper hybrid architectures.
⦁ To benchmark privacy-preserving multi-site training, initialize a synchronous Federated Averaging (FedAvg) simulation in which each client folder (client_1 through client_4) trains locally and sends its weights to a central server.

5. Evaluation:
⦁ Evaluate standard classification metrics such as accuracy, precision, recall, and F1-score across the four clinical categories.
⦁ Assess the interpretability of your trained models by extracting Explainable AI (XAI) metrics, specifically generating Grad-CAM++ visualizations to highlight tumor regions and calculating quantitative scores like Deletion AUC to verify the model's spatial focus.
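As a rough illustration of steps 2 and 4, the sketch below indexes the client/class folder layout and then applies the FedAvg aggregation rule, a sample-size-weighted mean of the clients' parameters. It is not the authors' training code: the helper names `index_clients` and `fedavg`, the class-to-index order, and the toy weights are assumptions, and plain Python lists stand in for the tensors a framework such as PyTorch or Flower would exchange.

```python
from pathlib import Path

# Class folder names as shown in the directory layout; the index order is an assumption.
CLASSES = ("glioma_tumor", "meningioma_tumor", "pituitary_tumor", "no_tumor")


def index_clients(root):
    """Map each client folder name to a list of (image_path, class_index) pairs."""
    index = {}
    for client_dir in sorted(Path(root).glob("client_*")):
        samples = []
        for label, class_name in enumerate(CLASSES):
            for image_path in sorted((client_dir / class_name).glob("*.jpg")):
                samples.append((image_path, label))
        index[client_dir.name] = samples
    return index


def fedavg(client_weights, client_sizes):
    """Average per-client parameters, weighting each client by its sample count.

    client_weights: {client: {param_name: [floats]}}
    client_sizes:   {client: number of local training samples}
    """
    total = sum(client_sizes.values())
    reference = next(iter(client_weights.values()))
    averaged = {}
    for name, values in reference.items():
        averaged[name] = [
            sum(client_weights[c][name][i] * client_sizes[c] for c in client_weights) / total
            for i in range(len(values))
        ]
    return averaged


# Toy round with two hypothetical clients holding 1 and 3 samples.
local_weights = {
    "client_1": {"w": [0.0, 4.0]},
    "client_2": {"w": [4.0, 0.0]},
}
local_sizes = {"client_1": 1, "client_2": 3}
global_weights = fedavg(local_weights, local_sizes)
print(global_weights)  # {'w': [3.0, 1.0]}
```

In a simulation over the real data, the sizes passed to `fedavg` could be taken from `index_clients("BrainTumor_MRI_Dataset")`, for example `{name: len(samples) for name, samples in index.items()}`, restricted to each client's training portion.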

Categories

Machine Learning, Brain Tumor, Deep Learning, Federated Learning
