LA-Breast DCE-MRI Dataset

Published: 7 June 2024 | Version 1 | DOI: 10.17632/8rzyn3ng9c.1
Contributors:
Ruben Fonnegra

Description

Images were selected from retrospective studies of Latin American patients. Studies were selected according to their acquisition parameters, with the aim of preserving diversity while ensuring homogeneity among them. Each study was anonymized following the convention “MRI_N”, where N is the integer patient number (between 1 and 200). Each study contains 15 imaging sequences: a pre-contrast T1 fat-saturated dynamic (d0), five post-contrast T1 fat-saturated dynamics (d1 to d5), T1 without fat saturation (t1), T2 without fat saturation (t2), the apparent diffusion coefficient image (ADC) and the diffusion image (Diff). All image sequences were acquired and stored in the standard DICOM 3.0 format. All studies were acquired on 1.5T scanners (several different units, since the data were collected retrospectively), and all contrast agents were gadolinium-based, with dosages between 0.014 and 0.016 l/mol. Patient data were filtered according to the available clinical findings; benign and malignant lesions were prioritized to ensure at least one relevant clinical finding across all images. As a result, the data contain train, test and validation sets that are balanced in terms of benign/malignant lesions as well as non-dense/dense tissue. Additionally, per-image annotations are provided with the lesion location (x and y coordinates), BI-RADS category and tissue density. This dataset can be used for multiple purposes, such as image synthesis, image characterization, lesion classification and tissue segmentation, among others.
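The balance described above can be checked once the annotations are loaded. The following is only a minimal sketch: the record layout and the field names (`patient`, `split`, `lesion`, `density`) are assumptions made for illustration and are not the dataset's actual annotation schema, and the four records are made-up placeholders rather than real data.

```python
from collections import Counter

# Hypothetical annotation records. The field names and values are
# illustrative assumptions, not the schema shipped with the dataset.
records = [
    {"patient": "MRI_1", "split": "train", "lesion": "benign", "density": "A"},
    {"patient": "MRI_2", "split": "train", "lesion": "malignant", "density": "C"},
    {"patient": "MRI_3", "split": "test", "lesion": "benign", "density": "B"},
    {"patient": "MRI_4", "split": "test", "lesion": "malignant", "density": "D"},
]


def balance_by_split(rows, key):
    """Count how often each value of `key` occurs within each split."""
    counts = {}
    for row in rows:
        counts.setdefault(row["split"], Counter())[row[key]] += 1
    return counts


if __name__ == "__main__":
    for split, counter in balance_by_split(records, "lesion").items():
        print(split, dict(counter))
```

Calling `balance_by_split(records, "density")` gives the same per-split tally for tissue density.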

Steps to reproduce

Images were selected from retrospective studies in the archive of the Instituto de Alta Tecnología Médica (IATM), in Medellín, Colombia. Accordingly, all images correspond to Latin American patients. Studies were selected according to their acquisition parameters, with the aim of preserving diversity while ensuring homogeneity among them. All images were obtained on 1.5T scanners (several different units), and all contrast agents were gadolinium-based, with dosages between 0.014 and 0.016 l/mol. Because the studies were obtained retrospectively, the acquisition conditions vary. Each study was anonymized following the convention “MRI_N”, where N is the integer patient number (between 1 and 200).

Each study was annotated with the visible lesions in the tissue (benign and malignant), and the location of each lesion was recorded. For image postprocessing after acquisition, we computed the phase postcontrast sequences manually using the software Horos. Images were then stored and treated individually as independent sequences. For image preprocessing, all image volumes were filtered so that only slices containing at least one ROI were kept. The slice of each annotation is located through the Center_z parameter in the sequences, and images were taken separately according to their visualization. Each image was stored in TIFF format under a name containing the patient identifier, the ROI identifier and the original DICOM filename (e.g., Mri_1_R1_IM-1682-0111.tiff).

Finally, patients were manually split into train, test and val partitions according to their BI-RADS and ACR categories. In making the selection, the test and val partitions were prioritized to be balanced in terms of benign and malignant lesions for analysis purposes. A similar procedure was followed for the ACR parameter, where tissue densities A and B were considered mostly fatty tissue, and tissue densities C and D were considered mostly fibroglandular tissue.
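The filename convention and the ACR grouping described above can be sketched in a few lines. This is an illustration under stated assumptions, not code distributed with the dataset: the pattern is inferred from the single example name Mri_1_R1_IM-1682-0111.tiff, it assumes the ROI identifier is always the letter R followed by an integer, and the names `parse_slice_name` and `density_group` are hypothetical.

```python
import re
from typing import NamedTuple


class SliceName(NamedTuple):
    patient_id: int   # N in the "MRI_N" study identifier
    roi_id: int       # integer after "R" (assumed format, inferred from the example)
    dicom_name: str   # original DICOM filename, without the .tiff suffix


# Pattern inferred from the one example in the description; files that
# follow a different layout will not match.
_PATTERN = re.compile(r"^Mri_(\d+)_R(\d+)_(.+)\.tiff$", re.IGNORECASE)

# Grouping stated in the description: A/B mostly fatty, C/D mostly fibroglandular.
_DENSITY_GROUP = {
    "A": "fatty",
    "B": "fatty",
    "C": "fibroglandular",
    "D": "fibroglandular",
}


def parse_slice_name(filename: str) -> SliceName:
    """Split a slice filename into patient, ROI and DICOM name parts."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected filename layout: {filename!r}")
    return SliceName(int(match.group(1)), int(match.group(2)), match.group(3))


def density_group(acr: str) -> str:
    """Map an ACR density letter (A-D) to its coarse tissue group."""
    return _DENSITY_GROUP[acr.strip().upper()]


if __name__ == "__main__":
    print(parse_slice_name("Mri_1_R1_IM-1682-0111.tiff"))
    print(density_group("c"))
```

Raising an error on unmatched names, instead of guessing, keeps files with an unexpected layout from being silently assigned to the wrong patient or ROI.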

Institutions

Tecnologico Pascual Bravo, Suramericana SA Colombia, Instituto Tecnologico Metropolitano

Categories

Medical Imaging, Magnetic Resonance Imaging, Breast Cancer

Licence