Diagnostic performance of artificial intelligence for histologic melanoma recognition compared to 18 international expert pathologists: Supplementary Material
The study was designed to compare the performance of classifiers based on image analysis by convolutional neural networks (CNNs) with that of 18 expert dermatopathologists in a binary classification task for pigmented skin lesions.

Mendeley Supplementary Figure 1 shows a schematic representation of the testing approach. The CNNs were trained by cross-testing. Each iteration consists of five folds. Orange rectangles represent the folds used for testing and blue rectangles represent the folds used for training. For each iteration, a CNN is trained and tested on the respective folds. The performance of each trained CNN is determined on its test fold. To calculate the overall performance, the sum of all 5 performances is taken. This procedure was repeated 3 times and the results were combined to generate an ensemble.

Mendeley Supplementary Figure 2 depicts the whole slide image (WSI) analysis. Tissue sections on whole slide images (top) were divided into tiles (middle). The CNN (a pre-trained ResNeXt50) assigned a malignancy score to every individual tile (bottom). Red tiles were classified as melanoma, blue tiles as nevus. The scores of all tiles on one image were averaged to give a final malignancy score for the complete slide.

Supplementary Table 1 shows the characteristics of the pigmented skin lesions included in the test set. Supplementary Table 2 provides an analysis of statistical differences between the performance of pathologists and CNN classifiers.
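
The fold rotation described for Mendeley Supplementary Figure 1 and the tile averaging described for Mendeley Supplementary Figure 2 can be illustrated with a minimal sketch. It is not the study's code: the function names, the interleaved assignment of items to folds, and the 0.5 decision threshold are assumptions made for illustration, and the CNN itself is left out (tile scores are taken as given numbers between 0 and 1).

```python
from statistics import mean


def cross_testing_folds(items, n_folds=5):
    """Yield (train, test) pairs so that each fold serves as the test fold once.

    Assumption: items are assigned to folds by simple interleaving; the
    source does not state how slides were distributed across folds.
    """
    folds = [items[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [x for j, fold in enumerate(folds) if j != k for x in fold]
        yield train, test


def slide_malignancy_score(tile_scores):
    """Average the per-tile malignancy scores into one slide-level score."""
    if not tile_scores:
        raise ValueError("slide has no tiles")
    return mean(tile_scores)


def classify_slide(tile_scores, threshold=0.5):
    """Label a slide from its averaged score.

    Assumption: the 0.5 threshold is hypothetical; the source does not
    state the cut-off used to separate melanoma from nevus.
    """
    score = slide_malignancy_score(tile_scores)
    return "melanoma" if score >= threshold else "nevus"


if __name__ == "__main__":
    slides = [f"slide_{i}" for i in range(10)]  # hypothetical slide identifiers
    for train, test in cross_testing_folds(slides):
        print(len(train), "training slides,", len(test), "test slides")
    print(classify_slide([0.25, 0.75, 0.875]))
```

In this sketch, repeating the procedure three times would mean calling `cross_testing_folds` on three different orderings of the slides and combining the resulting models; how the study combined them into an ensemble is not specified here.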