Combining artificial intelligence and human expertise for more accurate dermoscopic melanoma diagnosis: A two-session retrospective reader study - Supplementary Material

Published: 5 February 2024 | Version 1 | DOI: 10.17632/3cmknchc2g.1
Mario Giulini, Mohamad Goldust, Stephan Grabbe, Christian Ludwigs, Dominik Seliger, Priyanka Karagaiah, Hadrian Schepler, Florian Butsch, Beate Weidenthaler-Barth, Stephan Rietz


In our two-session retrospective reader study, 64 physicians (33 dermatologists, 11 dermatology residents, and 20 general practitioners) evaluated a deliberately challenging test set of 100 dermoscopic images comprising 50 melanomas and 50 nevi. In session 1, participants provided diagnoses and management decisions based solely on the images. In session 2, conducted at least four months later, participants evaluated the same images in a different sequence, this time with the assistance of a traffic light system assigned by a convolutional neural network (CNN). Supplementary Figure 1 depicts the classification performance (melanoma, benign nevus) of individual physicians in session 1 (without CNN) and session 2 (with CNN). Physicians without CNN are denoted by grey dots (n=64) and physicians with CNN by black dots (n=64). Means are denoted by squares. The blue line shows the CNN’s performance on the test set. Supplementary Table 1 shows the mean differences in correctly classified lesions (melanoma, benign nevus) between session 1 (without CNN) and session 2 (with CNN) by physician group. Correctly classified lesions include true positives and true negatives.
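The quantity reported in Supplementary Table 1 can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the data layout, the function names `correct_count` and `mean_difference_by_group`, and the toy readings are all hypothetical and serve only to show how correctly classified lesions (true positives plus true negatives) and their per-group mean difference between sessions could be computed.

```python
from statistics import mean


def correct_count(predictions, truth):
    """Count correctly classified lesions (true positives + true negatives)."""
    return sum(p == t for p, t in zip(predictions, truth))


def mean_difference_by_group(readers, truth):
    """Mean per-reader change in correct classifications (session 2 - session 1).

    `readers` is a hypothetical list of dicts with keys
    'group', 'session1', and 'session2'.
    """
    diffs = {}
    for reader in readers:
        change = (correct_count(reader["session2"], truth)
                  - correct_count(reader["session1"], truth))
        diffs.setdefault(reader["group"], []).append(change)
    return {group: mean(values) for group, values in diffs.items()}


# Toy data (invented for illustration): 4 lesions, 3 readers.
truth = ["melanoma", "nevus", "melanoma", "nevus"]
readers = [
    {"group": "dermatologist",
     "session1": ["melanoma", "nevus", "nevus", "nevus"],
     "session2": ["melanoma", "nevus", "melanoma", "nevus"]},
    {"group": "dermatologist",
     "session1": ["melanoma", "melanoma", "nevus", "nevus"],
     "session2": ["melanoma", "nevus", "nevus", "nevus"]},
    {"group": "general practitioner",
     "session1": ["nevus", "nevus", "nevus", "melanoma"],
     "session2": ["melanoma", "nevus", "nevus", "nevus"]},
]

result = mean_difference_by_group(readers, truth)
print(result)
```

A positive value means that readers in that group classified more lesions correctly with CNN assistance than without it.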



Artificial Intelligence, Skin Cancer, Melanoma, Diagnostics, Nevus, Convolutional Neural Network