Multi-Laboratory Hematoxylin and Eosin Staining Variance Supervised Machine Learning Dataset

Published: 12 September 2022 | Version 1 | DOI: 10.17632/8c5hkbwykd.1
Fabi Prezja, Ilkka Pölönen


We provide the generated dataset used for supervised machine learning in [1]. The data are in CSV format and contain all principal components and ground-truth labels for each tissue type. The tissue type codes are C1 for kidney, C2 for skin, and C3 for colon; 'PC' stands for principal component. Features were extracted independently for each tissue type. For feature extraction specifications, please see the original design in [1].

Reference:
[1] Prezja, F.; Pölönen, I.; Äyrämö, S.; Ruusuvuori, P.; Kuopio, T. H&E Multi-Laboratory Staining Variance Exploration with Machine Learning. Appl. Sci. 2022, 12, 7511.
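To illustrate the CSV layout described above, here is a minimal sketch of separating the principal-component columns from the ground-truth labels. It uses only the Python standard library. The column names (`PC1`, `PC2`, `PC3`, `label`), the class values, and the numbers are hypothetical placeholders, not taken from the dataset; check the header row of the actual files and adjust `label_column` and `pc_prefix` to match.

```python
import csv
import io


def split_features_and_labels(rows, label_column="label", pc_prefix="PC"):
    """Split parsed CSV rows into PC feature vectors and label values.

    `label_column` and `pc_prefix` are assumptions about the header names;
    adapt them to the real files.
    """
    features, labels = [], []
    for row in rows:
        # Keep every column whose name starts with the PC prefix, in file order.
        pc_names = [name for name in row if name.startswith(pc_prefix)]
        features.append([float(row[name]) for name in pc_names])
        labels.append(row[label_column])
    return features, labels


# Tiny synthetic stand-in for one tissue-type file (all values are made up).
example_csv = io.StringIO(
    "PC1,PC2,PC3,label\n"
    "0.12,-1.30,0.44,class_a\n"
    "-0.58,0.91,0.07,class_b\n"
)

rows = list(csv.DictReader(example_csv))
X, y = split_features_and_labels(rows)
print(len(X), "samples,", len(X[0]), "principal components")
```

To read one of the real files, replace the `io.StringIO` object with `open(path, newline="")`, where `path` points to the downloaded CSV for the tissue type of interest (C1, C2, or C3). Because features were extracted independently per tissue type, each file is probably best loaded and modeled separately.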


Steps to reproduce

The exact specifications can be found in [1].


Jyvaskylan Yliopisto


Artificial Intelligence, Machine Learning, Supervised Learning, Histopathology