Panoramic radiography images with diagnosis of need for apicoectomy surgery as label
This dataset consists of panoramic radiography images of patients' jaws, labeled by a skilled endodontist. It was collected for the purpose of training machine learning algorithms to automate the diagnosis workflow. The dataset contains a test directory; this portion of the data was held out from model training and used only for assessment and comparison.

Data is annotated with bounding boxes. Each bounding box captures a single tooth that either was subjected to root canal treatment in the past (class 0) or is considered in need of surgery by the endodontist (class 1). The bounding boxes of each image can be found in a text file with the same name as the image; each line describes one box's location and class in the format (class_id x_center y_center w h). Keep in mind that these numbers are normalized.

Besides bounding boxes, there is a binary label for each original image that indicates the final result for that image: if one or more teeth are found in need of surgery, the label is 1, meaning the doctor concluded the patient needs surgical treatment; otherwise, the label is 0, meaning the patient does not need this treatment.

The training set contains a directory named single_teeth, whose contents are cropped from the original images; in other words, they are the teeth that the labels suggest need further investigation. Original images are resized to (2000, 1000, 3) for convenience of use and storage; single_teeth images, however, have various shapes due to the nature of their content. Also, all images in the dataset are saved in PNG or JPG format, not DICOM.
Steps to reproduce
The raw data was acquired from an open-source project (a link will be provided in the Related links section). The first step in building the dataset was to protect patient data, so we removed metadata from the DICOM images. Patient names had already been removed from the data, but information such as gender, age, and time of acquisition was present as metadata in the DICOM files.

Next, we had to deal with the large size of the images (median width 3000 px, median height 1500 px), so we reduced each image to 2000 x 1000 px using a center-crop technique. This method preserves the aspect ratio of the image, and data loss is minimal because the central region of the image is our region of interest.

In the annotation step, the Roboflow platform was used to draw the bounding boxes. First, we uploaded our data to the Roboflow platform (a link will be provided in the Related links section), then we manually annotated our images. When all images were annotated, the platform automatically converted the drawings to annotation text files. There are open-source and commercial alternative annotation tools that could have been used in this step.

Next, the dataset was split into two directories, one for training and one for testing. Although it would normally be fine to draw both splits from the same distribution, because of the skewed nature of the data we added more positive examples to the test set to better evaluate our models.

In the last part, we cropped the bounding boxes out of the images and put them in a sub-directory of the training set named single_teeth, to provide training data for another model in the workflow: one model inspects the original picture and performs object detection, while the other learns to look only at the suspicious parts of the picture and inspect them in a concentrated manner. For this step we used a Python script and the PIL library to automatically carry out this time-consuming task.