Synthetic Medical Images for Robust, Privacy-Preserving Training of AI: Application to Retinopathy of Prematurity Diagnosis

Creator: Aaron S. Coyner
Published: 2022-02-14T13:33:53.961Z
Keywords: Ophthalmology, Retina, Retinopathy of Prematurity, Synthetic Image, Ophthalmic Disorder, Diagnosis of Retinopathy of Prematurity, Computer-Aided Diagnosis, Convolutional Neural Network, Deep Learning

Coyner, Aaron S.; Chen, Jimmy S.; Chang, Ken; Singh, Praveer; Ostmo, Susan; Chan, R.V. Paul; Chiang, Michael F.; Kalpathy-Cramer, Jayashree; Campbell, J. Peter

doi:10.17632/fscyyhg6vt.1

Published: 14 February 2022 | Version 1 | DOI: 10.17632/fscyyhg6vt.1

Description

Purpose: Developing robust artificial intelligence (AI) models for medical image analysis requires large quantities of diverse, well-curated data, which can be difficult to collect because of privacy concerns, disease rarity, or diagnostic label quality. Collecting datasets for training diagnostic models for retinopathy of prematurity (ROP), a potentially blinding disease, suffers from all of these challenges. Progressively growing generative adversarial networks (PGANs) may help, as they can synthesize highly realistic images that may increase both the size and diversity of medical datasets.

Design: Diagnostic validation study, using synthetic data, of convolutional neural networks (CNNs) for detecting plus disease, a component of severe ROP.

Participants: 5842 retinal fundus images (RFIs) collected from 963 preterm infants.

Methods: Retinal vessel maps (RVMs) were segmented from RFIs. PGANs were trained to synthesize RVMs with normal, pre-plus, or plus disease vasculature. CNNs were trained on real or synthetic RVMs and evaluated on detecting plus disease in two real RVM test datasets.

Main Outcome Measures: Features of real and synthetic RVMs were evaluated using Uniform Manifold Approximation and Projection (UMAP). Similarities were evaluated at the dataset and feature level using Fréchet Inception Distance (FID) and Euclidean distance, respectively. CNN performance was assessed via the area under the receiver operating characteristic curve (AUROC); AUROCs were compared via bootstrapping and DeLong’s test for correlated ROC curves. Confusion matrices were compared using McNemar’s χ² test and Cohen’s κ.

Results: The CNN trained on synthetic RVMs had a significantly higher AUROC (0.971, p = 0.006, p = 0.004) and classified plus disease more similarly to a set of eight international experts (κ = 0.922) than the CNN trained on real RVMs (AUROC = 0.934, κ = 0.701). Real and synthetic RVMs overlapped, by plus disease diagnosis, on the UMAP manifold, showing that synthetic images spanned the disease severity spectrum. FID and Euclidean distances suggested that real and synthetic RVMs were more dissimilar to one another than real RVMs were to each other, further suggesting that synthetic RVMs were distinct from the training data with respect to privacy considerations.

Conclusions: Synthetic medical datasets may be useful for training robust medical AI models. Furthermore, PGANs may be able to synthesize realistic data for use without protected health information concerns.
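
To make the outcome measures named above more concrete, the following is a minimal Python sketch of three of them: AUROC, a paired bootstrap comparison of two AUROCs, Cohen's κ, and McNemar's χ² test. It is not the authors' code. The function names, the percentile bootstrap, the continuity correction, and the toy data are all assumptions made for illustration, and DeLong's test is not shown.

```python
import math
import random

def auroc(labels, scores):
    """AUROC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def paired_bootstrap_auroc_diff(labels, scores_a, scores_b, n_boot=2000, seed=0):
    """Resample test cases with replacement; return observed AUROC difference and a 95% percentile interval."""
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # skip resamples that lost one of the classes
        a = auroc(y, [scores_a[i] for i in idx])
        b = auroc(y, [scores_b[i] for i in idx])
        diffs.append(a - b)
    diffs.sort()
    lo = diffs[int(0.025 * n_boot)]
    hi = diffs[int(0.975 * n_boot) - 1]
    observed = auroc(labels, scores_a) - auroc(labels, scores_b)
    return observed, (lo, hi)

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two sets of categorical labels."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    p_exp = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    if p_exp == 1.0:
        return 1.0  # both raters used a single identical category
    return (p_obs - p_exp) / (1.0 - p_exp)

def mcnemar_chi2(y_true, pred_a, pred_b):
    """McNemar's chi-square (with continuity correction, an assumption) on discordant cases."""
    b = sum(pa == y and pb != y for y, pa, pb in zip(y_true, pred_a, pred_b))
    c = sum(pa != y and pb == y for y, pa, pb in zip(y_true, pred_a, pred_b))
    if b + c == 0:
        return 0.0, 1.0
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square survival function, 1 df
    return stat, p_value

# Toy data (hypothetical, not from the dataset): 1 = plus disease, 0 = not plus.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores_a = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
scores_b = [0.1, 0.6, 0.3, 0.4, 0.35, 0.7, 0.2, 0.9]
observed_diff, interval = paired_bootstrap_auroc_diff(labels, scores_a, scores_b)
print(auroc(labels, scores_a), auroc(labels, scores_b), observed_diff, interval)
```

The bootstrap resamples whole test cases so that both models are always scored on the same images, which is what makes the comparison paired. In practice, established implementations in statistics libraries would usually be preferable to hand-written versions like these.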

Steps to reproduce

Images were generated using a progressively growing generative adversarial network trained on a proprietary dataset of retinopathy of prematurity retinal vessel maps. More information can be found in the accompanying paper, "Synthetic Medical Images for Robust, Privacy-Preserving Training of AI: Application to Retinopathy of Prematurity Diagnosis."

Institutions

Illinois Eye and Ear Infirmary
Oregon Health & Science University
Athinoula A Martinos Center for Biomedical Imaging
Massachusetts General Hospital
