CADICA: a new dataset for coronary artery disease
Description
The CADICA dataset is an annotated invasive coronary angiography (ICA) dataset of 42 patients. In ICA imaging, the degree of a lesion is commonly assessed by visual estimation, which is subjective and prone to interobserver variability. Accurate recognition of lesions is crucial for correct diagnosis and treatment, which motivates the development of computer-aided systems that can support specialists in their clinical procedures. The dataset can be used by clinicians to train their skills in the angiographic assessment of coronary artery disease (CAD) severity, and by computer scientists to create computer-aided diagnostic systems that help in such assessment and to validate existing CAD detection methods, bringing them closer to use in clinical settings. The CADICA dataset includes ICA images, manually labeled lesion bounding boxes, and selected clinical features.

The dataset is distributed as a directory that contains the file "metadata.xlsx", which holds the clinical data, and two main folders that separate the videos according to whether the medical team selected them for each patient: "nonselectedVideos" and "selectedVideos". Inside each folder there are several sub-directories named "pX", where X is the ID of the patient, each containing sub-directories named "vY", where Y is the ID of a video of that patient.

The folder "pX" contains the following information:
"vY": one sub-directory for each video selected for that patient.
"lesionVideos.txt": lists the IDs of the selected videos in which at least one labeled lesion appears.
"nonlesionVideos.txt": lists the IDs of the selected videos with no visible lesions.

The folder "vY" contains the following information:
"input": a sub-directory containing a separate PNG file for each video frame.
"pX_vY_selectedFrames.txt": lists the IDs of the keyframes chosen by the medical team. This file is present for all the selected videos.
"groundtruth": a sub-directory available only if there are lesions in that selected video.
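The layout described above can be navigated with a few path helpers. The following is a minimal sketch, not part of the dataset itself: the function names (`video_dir`, `read_ids`, `frame_paths`) are hypothetical, the patient and video IDs are assumed to be the values that follow "p" and "v" in the folder names, and the list files are assumed to hold one ID per line, which the description does not state.

```python
from pathlib import Path


def video_dir(root, patient_id, video_id, selected=True):
    """Build the path to one video folder, e.g. <root>/selectedVideos/p1/v2."""
    top = "selectedVideos" if selected else "nonselectedVideos"
    return Path(root) / top / f"p{patient_id}" / f"v{video_id}"


def read_ids(list_file):
    """Read the IDs stored in a list file such as "lesionVideos.txt".

    Assumption: one ID per non-empty line. A missing file yields an
    empty list, so the caller does not need to check for it first.
    """
    path = Path(list_file)
    if not path.exists():
        return []
    return [line.strip() for line in path.read_text().splitlines() if line.strip()]


def frame_paths(root, patient_id, video_id):
    """List the PNG frames in the "input" folder of a selected video."""
    input_dir = video_dir(root, patient_id, video_id) / "input"
    return sorted(input_dir.glob("*.png"))
```

For example, `read_ids(video_dir(root, 1, 2).parent / "lesionVideos.txt")` would return the lesion video IDs of patient 1, provided the file follows the assumed one-ID-per-line format.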
The folder "groundtruth" contains the following information:
"pX_vY_000ZZ.txt": contains one bounding box and its category in each row. There is one such file for each frame listed in "pX_vY_selectedFrames.txt". Bounding boxes are specified in the format [x,y,w,h], where (x,y) are the pixel coordinates of the top-left corner, w is the width, and h is the height of the bounding box.
"pX_vY_groundTruthTable.mat": contains a table with the ground truth information of that video.
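As an illustration of the [x,y,w,h] convention, the sketch below parses annotation rows and converts a box to corner coordinates. It rests on assumptions the description does not confirm: each row is taken to hold four numbers followed by a category label, separated by whitespace or commas, and the frame number in the file name is taken to be zero-padded to five digits, as the pattern "000ZZ" suggests. The names `Box`, `parse_boxes`, `to_corners`, and `annotation_name` are hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    x: int  # column of the top-left corner, in pixels
    y: int  # row of the top-left corner, in pixels
    w: int  # width in pixels
    h: int  # height in pixels
    category: str


def parse_boxes(text):
    """Parse annotation rows of the assumed form "x y w h category"."""
    boxes = []
    for line in text.splitlines():
        parts = line.replace(",", " ").split()
        if len(parts) < 5:
            continue  # skip blank or incomplete rows
        x, y, w, h = (int(float(p)) for p in parts[:4])
        boxes.append(Box(x, y, w, h, parts[4]))
    return boxes


def to_corners(box):
    """Convert [x, y, w, h] to (x_min, y_min, x_max, y_max)."""
    return (box.x, box.y, box.x + box.w, box.y + box.h)


def annotation_name(patient_id, video_id, frame_id):
    """Build a file name following "pX_vY_000ZZ.txt" (assumed 5-digit padding)."""
    return f"p{patient_id}_v{video_id}_{int(frame_id):05d}.txt"
```

The corner form is convenient for drawing or for computing overlaps, while the stored [x,y,w,h] form stays unchanged. The category strings are passed through as-is, since the description does not list the label set.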
Steps to reproduce
The CADICA dataset images were acquired at Hospital Universitario Virgen de la Victoria, Málaga, Spain. The invasive coronary angiography (ICA) videos were acquired as Digital Imaging and Communications in Medicine (DICOM) files, recorded at 10 frames per second with durations of 4-8 seconds depending on the projection used, and were then converted to PNG images for easier handling. Each frame is 512 x 512 pixels, and the videos range in length from 1 to 151 frames. The cardiac angiography equipment used was an Artis Zee (Siemens AG, Muenchen, Germany). The radiation dose administered in each projection ranged from 5 to 50 mGy. The protocol normally used in each angiography included five projections for the left coronary artery (LCA), such as right anterior oblique (RAO) and left anterior oblique (LAO), both with cranial and caudal angulation, plus additional projections in case of diagnostic difficulty. The projections used for the right coronary artery (RCA) were LAO and RAO, with cranial and caudal angulation.
Funding
Ministerio de Ciencia e Innovación: PID2022-136764OA-I00
Universidad de Málaga: B1-2019_02
Universidad de Málaga: B1-2021_20
Fundación Unicaja: PUNI-003_2023
Universidad de Málaga: B1-2019_01
Junta de Andalucía: UMA20-FEDERJA-108
Universidad de Málaga: B1-2022_14
Universidad de Málaga: B4-2022