RSPHC CT-Scan Dataset

Published: 22 April 2024 | Version 1 | DOI: 10.17632/f8p462hpmv.1
Radiologi RSPHC


The RSPHC Dataset comes from the radiology department of RSPHC Surabaya, which serves over 1.5 million patients annually with a team of more than 115 medical professionals and 320 nursing staff. It contains chest radiographs of patients affected by COVID-19 as well as of patients without the infection.

The dataset comprises 10,354 2D chest radiograph images in two primary groups:
- Images from patients diagnosed with COVID-19.
- Images from individuals without the infection (referred to as ‘normal’ individuals).

Annotation process: The images were originally provided without annotations. To address this, we used a pre-trained UNet model for automatic segmentation, creating an annotation for each image. A subset of the dataset was then validated by radiologists at Airlangga University Hospital to ensure the accuracy of the annotations. From this process, 494 images (188 from COVID-19 patients and 306 from normal patients) received verified annotations, making them a reliable subset for further research.

The dataset is organized into several folders for ease of access and analysis:
- “images”: Contains the raw chest radiograph images.
- “masks”: Includes the corresponding masks for the images, highlighting regions of interest and potential abnormalities.
- “segmented images”: Features images with the lungs segmented out from the original radiographs, using the masks to focus on lung structures.
- “npy_images”, “npy_masks”, “npy_segmented images”: Contain the same data as above in NumPy array format, a standard in machine learning for efficient data handling.

With its validated annotations and structured format, the RSPHC Dataset supports the development of AI-driven diagnostic tools. It can also contribute to the understanding of the radiological characteristics of COVID-19 and other respiratory ailments, and to further work in medical imaging analysis.
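As a rough illustration of how the NumPy folders might be used, the sketch below loads an image and its mask and then reproduces the kind of masking that the “segmented images” folder describes. It rests on several assumptions that the description does not confirm: that each image and its mask share a file name, that both arrays are 2D with the same shape, and that non-zero mask pixels mark the lungs. The helper names `load_pair` and `apply_mask` are hypothetical.

```python
from pathlib import Path

import numpy as np


def load_pair(root, stem):
    """Load one image and its mask from the NumPy folders.

    Assumption: both files are named "<stem>.npy"; the actual naming
    scheme in the dataset may differ.
    """
    root = Path(root)
    image = np.load(root / "npy_images" / f"{stem}.npy")
    mask = np.load(root / "npy_masks" / f"{stem}.npy")
    return image, mask


def apply_mask(image, mask):
    """Keep pixels where the mask is non-zero and set the rest to 0.

    Assumption: image and mask have the same shape, and non-zero mask
    values mark the region of interest.
    """
    if image.shape != mask.shape:
        raise ValueError("image and mask must have the same shape")
    return np.where(mask > 0, image, np.zeros_like(image))
```

For example, `image, mask = load_pair("path/to/dataset", "some_file")` followed by `apply_mask(image, mask)` would give an array comparable to an entry in “npy_segmented images”, if the assumptions above hold. The path and file name here are placeholders.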



Artificial Intelligence, Image Segmentation, Computed Tomography, Deep Learning, Chest Radiograph