H&E-stained oral squamous cell carcinoma histological images dataset
Description
Computer-aided diagnosis (CAD) can be used as an important tool to aid and enhance pathologists' diagnostic decision-making. Deep learning techniques, such as convolutional neural networks (CNN) and fully convolutional networks (FCN), have been successfully applied in medical and biological research. Unfortunately, histological image segmentation is often constrained by the availability of labeled training data once labeling histological images for segmentation purposes is a highly-skilled, complex, and time-consuming task. This paper presents the hematoxylin and eosin (H&E) stained oral cavity-derived cancer (OCDC) dataset, a labeled dataset containing H&E-stained histological images of oral squamous cell carcinoma (OSCC) cases. The tumor regions in our dataset are labeled manually by a specialist and validated by a pathologist. The OCDC dataset presents 1,020 histological images of size 640x640 pixels containing tumor regions fully annotated for segmentation purposes. All the histological images are digitized at 20x magnification.