CADICA: a new dataset for coronary artery disease

Published: 19 February 2024 | Version 2 | DOI: 10.17632/p9bpx9ctcv.2
Contributors:

Description

The CADICA dataset is an annotated Invasive Coronary Angiography (ICA) dataset of 42 patients. In ICA imaging, lesion severity is commonly assessed by visual estimation, which introduces subjectivity and interobserver variability. Accurate recognition of lesions is crucial for correct diagnosis and treatment, which motivates the development of computer-aided systems that can support specialists in their clinical procedures. The dataset can be used by clinicians to train their skills in angiographic assessment of coronary artery disease (CAD) severity, by computer scientists to build computer-aided diagnostic systems that assist in such assessment, and to validate existing CAD detection methods, bringing solutions closer to clinical settings.

The CADICA dataset includes ICA images, manually labeled lesion bounding boxes, and selected clinical features. It is distributed as a directory containing the file "metadata.xlsx", which holds the clinical data, and two main folders that separate the videos according to whether the medical team selected them for each patient: "nonselectedVideos" and "selectedVideos". Inside each folder, sub-directories follow the naming convention "pX", where X is the ID of the patient, and "vY", where Y is the ID of a video of that patient.

The folder "pX" contains the following information:
"vY": several sub-directories, one for each video selected for that patient.
"lesionVideos.txt": the IDs of the selected videos in which at least one labeled lesion appears.
"nonlesionVideos.txt": the IDs of the selected videos with no visible lesions.

The folder "vY" contains the following information:
"input": a sub-directory containing a separate PNG file for each video frame.
"pX_vY_selectedFrames.txt": the IDs of the keyframes chosen by the medical team.
"groundtruth": a sub-directory available only if there are lesions in that selected video.

The folder "groundtruth" contains the following information:
"pX_vY_000ZZ.txt": the bounding boxes and their categories, one per row. There is one such file for each frame listed in "pX_vY_selectedFrames.txt". Bounding boxes are specified in the format [x,y,w,h], where (x,y) are the pixel coordinates of the top left corner, w is the width and h is the height of the bounding box.
"pX_vY_groundTruthTable.mat": a table with the ground truth information of that video.
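The layout and the [x,y,w,h] convention described above can be sketched in Python as follows. This is a minimal illustration and not part of the dataset itself. It assumes that each row of a ground truth file holds four integers followed by a single category token, that patient and video IDs appear in folder names without zero padding, and that the frame ID in "pX_vY_000ZZ.txt" is zero-padded to five digits. The helper names (Box, parse_row, groundtruth_file) and the category label used in the example are hypothetical.

```python
import re
from pathlib import Path
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """One labeled lesion: top-left corner (x, y), width, height, category."""
    x: int
    y: int
    w: int
    h: int
    category: str

    def corners(self) -> Tuple[int, int, int, int]:
        # Convert [x, y, w, h] to (x_min, y_min, x_max, y_max).
        return (self.x, self.y, self.x + self.w, self.y + self.h)


def parse_row(row: str) -> Box:
    # Assumption: four integers then a category token, separated by
    # whitespace, commas, or square brackets. The real files may differ.
    tokens = [t for t in re.split(r"[\s,\[\]]+", row.strip()) if t]
    if len(tokens) != 5:
        raise ValueError(f"expected 4 numbers and a category, got: {row!r}")
    x, y, w, h = (int(t) for t in tokens[:4])
    return Box(x, y, w, h, tokens[4])


def groundtruth_file(root: Path, patient: int, video: int, frame: int) -> Path:
    # Assumption: unpadded IDs in "pX"/"vY" and a five-digit frame ID.
    stem = f"p{patient}_v{video}"
    return (root / "selectedVideos" / f"p{patient}" / f"v{video}"
            / "groundtruth" / f"{stem}_{frame:05d}.txt")
```

For example, parse_row("10 20 30 40 lesionA").corners() gives (10, 20, 40, 60), the corner form that many detection libraries expect. The row layout should be checked against an actual file before relying on this parser.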

Steps to reproduce

The CADICA dataset images were acquired at Hospital Universitario Virgen de la Victoria, Málaga, Spain. The invasive coronary angiography (ICA) videos were acquired as Digital Imaging and Communications in Medicine (DICOM) files recorded at 10 frames per second, with durations of 4 to 8 seconds depending on the projection used, and were then converted to PNG images for easier handling. The frame size of each video is 512 x 512 pixels, while the length of the videos varies from 1 to 151 frames. The cardiac angiography equipment used was an Artis Zee (Siemens AG, Muenchen, Germany). The radiation dose administered in each projection ranges from 5 to 50 mGy. The protocol normally used in each angiography included five projections for the left coronary artery (LCA), such as right anterior oblique (RAO) and left anterior oblique (LAO), both with cranial and caudal angulation, with some additional projections in case of diagnostic difficulties. The projections used for the right coronary artery (RCA) are LAO and RAO, with cranial and caudal angulation.
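The stated frame rate gives a simple mapping between frame counts and time, sketched below in Python. This assumes that the 10 frames per second of the original DICOM recordings also applies to the exported PNG frames and that frame indices start at zero. Both are assumptions, and the function names are hypothetical.

```python
FPS = 10  # acquisition rate stated in the description


def frame_time(index: int, fps: int = FPS) -> float:
    """Time in seconds of a frame, assuming zero-based frame indices."""
    return index / fps


def duration(n_frames: int, fps: int = FPS) -> float:
    """Length in seconds of a video with n_frames frames."""
    return n_frames / fps
```

Under these assumptions, a video of 60 frames lasts duration(60) = 6.0 seconds.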

Institutions

Universidad de Málaga

Categories

Medical Imaging, Machine Learning, Image Analysis (Medical Imaging), Image Classification, Coronary Artery Disease, Cardiovascular Imaging, Deep Learning

Funding

Ministerio de Ciencia e Innovación: PID2022-136764OA-I00

Universidad de Málaga: B1-2019_02

Universidad de Málaga: B1-2021_20

Fundación Unicaja: PUNI-003_2023

Universidad de Málaga: B1-2019_01

Junta de Andalucía: UMA20-FEDERJA-108

Universidad de Málaga: B1-2022_14

Universidad de Málaga: B4-2022

Licence