Testing Dataset for Accelerated Superpixel Image Segmentation with a Parallelized DBSCAN Algorithm

Published: 06-05-2021 | Version 2 | DOI: 10.17632/m52mb6ptj7.2
Seng Cheong Loke,
Bruce MacDonald,
Matthew Parsons,
Burkhard Wünsche


This dataset consists of the processing output from the comparison of superpixel algorithms for the article "Accelerated Superpixel Image Segmentation with a Parallelized DBSCAN Algorithm". Division of images into groups of perceptually similar and proximate pixels is a necessary pre-processing step in many computer graphics algorithms such as image segmentation, classification, object tracking, and motion estimation. Grouping pixels into superpixels reduces the number of units to be processed, so these algorithms can run more efficiently and hence faster. The article describes the development of a new superpixel algorithm and compares its performance to the optimized algorithms in the OpenCV library and to others selected for their processing speed and quality. The new algorithm's segmentation performance in terms of Boundary Recall, Under-Segmentation Error, and Achievable Segmentation Accuracy is comparable to that of the OpenCV algorithms, while it runs between 4 and 135 times faster.

The original images were obtained from the Berkeley Segmentation Data Set and Benchmarks 500 (BSDS500) and should be placed in the root folder of the dataset (*.jpg), while the ground truth MATLAB files (*.mat) should be placed in the folder "GroundTruth". The folders "GroundTruthBoundaries" and "GroundTruthSegmentation" are generated from the ground truth files. The software compares them with the corresponding outputs of the respective superpixel algorithms in "OutlineImages" and "SegmentationImages" to generate the segmentation metrics. The folders "NoiseOutlineImages" and "NoiseSegmentationImages" contain outputs generated from BSDS500 images with increasing amounts of added uniform (salt-and-pepper) noise. The "Results" folder contains the output Excel spreadsheet and the MATLAB script files used to generate the graphs in the article. The "Images" sub-folder in the "Results" folder contains selected processed and segmented images that were used for visual comparison.
The "ScanSegment" folder contains intermediate processing files and the segmented output for CRS, ERS, ETPS, RT-DBSCAN (FastSuperpixel), and the new algorithm (ScanSegment). Segmentation output for the OpenCV algorithms is not included due to space limitations, but it can be generated from the software.
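
Because the BSDS500 images and ground truth files have to be copied in by hand, a quick layout check can help before running the software. The following is a minimal sketch, not part of the published software: the function name check_layout is hypothetical, and it assumes that all of the folders named above sit directly under the dataset root.

```python
from pathlib import Path

# Folder names taken from the description above. Assumption: they all
# sit directly under the dataset root.
EXPECTED_FOLDERS = [
    "GroundTruth",
    "GroundTruthBoundaries",
    "GroundTruthSegmentation",
    "OutlineImages",
    "SegmentationImages",
    "NoiseOutlineImages",
    "NoiseSegmentationImages",
    "Results",
    "ScanSegment",
]


def check_layout(root):
    """Report missing folders and count the manually supplied BSDS500 files."""
    root = Path(root)
    missing = [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]

    # BSDS500 images (*.jpg) are expected in the root folder itself.
    jpg_count = sum(1 for _ in root.glob("*.jpg"))

    # Ground truth MATLAB files (*.mat) are expected in "GroundTruth".
    truth_dir = root / "GroundTruth"
    mat_count = sum(1 for _ in truth_dir.glob("*.mat")) if truth_dir.is_dir() else 0

    return {
        "missing_folders": missing,
        "jpg_in_root": jpg_count,
        "mat_in_ground_truth": mat_count,
    }
```

Calling check_layout on the dataset root returns a small dictionary. An empty "missing_folders" list and matching image and ground truth counts suggest the files were placed as described.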


Steps to reproduce

The performance of our algorithm was evaluated against other superpixel algorithms by testing on the BSDS500 images and the accompanying ground truth human-segmented images and boundaries. For commonly used algorithms such as LSC, SEEDS, SLIC, and the SLIC variants Zero parameter SLIC (SLICO) and Manifold SLIC (MSLIC), the optimized OpenCV versions were used with the default parameters [27-29]. The top three performers from a recent head-to-head comparison (ETPS, CRS, and ERS), which do not have optimized OpenCV versions, were also tested using the implementations from the test suite with the default parameters. As not all algorithms return the same number of clusters as the number of superpixels requested, a pre-processing step adjusted the requested number so that the returned clusters approximately matched target counts from 100 to 600 in steps of 100, giving six batches.

Processing speed was measured as the average time taken to process a BSDS500 image in each batch. The processing time was taken as the total time required to convert an OpenCV Mat color input image into the algorithm's native format and obtain the output as a Mat image labelled from 0 to N – 1, where N is the number of returned superpixels. Visual comparisons were made by cropping out selected regions with both high and low contrast areas from the BSDS500 images.

Quantitative comparisons were made with four commonly used metrics: 1) processing time (PT), 2) boundary recall (BR), 3) under-segmentation error (UE), and 4) achievable segmentation accuracy (ASA). As each BSDS500 image was hand-segmented by at least five independent observers, the values for PT, BR, UE, and ASA were given as the average across all observations for that image and algorithm. BR is defined as the fraction of ground truth boundary pixels that lie within two pixels of a superpixel boundary.
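
The BR definition above can be illustrated with a short sketch. This is not the code used to produce the dataset: the function name is hypothetical, boundary maps are plain nested lists of 0/1 values, and "within two pixels" is assumed to mean a square (Chebyshev) neighbourhood, which the description does not specify.

```python
def boundary_recall(gt_boundary, sp_boundary, tolerance=2):
    """Fraction of ground truth boundary pixels that have a superpixel
    boundary pixel within `tolerance` pixels.

    Both arguments are 2-D nested lists of 0/1 values with the same shape.
    Assumption: distance is measured as Chebyshev distance, i.e. a
    (2 * tolerance + 1)-wide square window around each pixel.
    """
    rows = len(gt_boundary)
    cols = len(gt_boundary[0]) if rows else 0
    total = 0  # ground truth boundary pixels seen
    hits = 0   # those with a superpixel boundary pixel nearby

    for r in range(rows):
        for c in range(cols):
            if not gt_boundary[r][c]:
                continue
            total += 1
            # Clip the search window to the image bounds.
            r0, r1 = max(0, r - tolerance), min(rows, r + tolerance + 1)
            c0, c1 = max(0, c - tolerance), min(cols, c + tolerance + 1)
            if any(sp_boundary[rr][cc]
                   for rr in range(r0, r1)
                   for cc in range(c0, c1)):
                hits += 1

    # Convention chosen for this sketch: no ground truth boundary -> 1.0.
    return hits / total if total else 1.0


# Toy 5x8 example: a vertical ground truth boundary in column 2.
gt = [[1 if c == 2 else 0 for c in range(8)] for _ in range(5)]
near = [[1 if c == 3 else 0 for c in range(8)] for _ in range(5)]  # 1 px away
far = [[1 if c == 7 else 0 for c in range(8)] for _ in range(5)]   # 5 px away

print(boundary_recall(gt, near))  # 1.0
print(boundary_recall(gt, far))   # 0.0
```

For an image with several human segmentations, the reported value would be the mean of this quantity over the observers, as described above.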
UE is determined by assigning each superpixel to the human-segmented cluster with which it has the largest (majority or plurality) overlap and counting the pixels that fall outside their assigned cluster, divided by the total pixel count of the image. ASA is the complement of UE and measures the fraction of pixels that lie within their assigned clusters.

The processing time was remeasured after scaling the width and height of the first ten BSDS500 images by a multiplier in the range of 0.5 to 8.0, with a fixed superpixel count of 200. Scaling was performed using the OpenCV implementation of the Lanczos resampling filter. The square root of the processing time was then plotted against the length multiplier.

Resistance to noise was tested by adding uniform ('salt-and-pepper') noise to all BSDS500 images with a fixed superpixel count of 200. The amount of added noise was varied from 0% to 20% in steps of 2.5%. BR and ASA were then measured and plotted against the fraction of added noise.
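
The UE and ASA definitions given above can be sketched as follows. This is an illustrative reading of the text, not the published implementation: the function name is hypothetical, and label maps are plain nested lists of integer labels with identical shapes.

```python
from collections import Counter, defaultdict


def ue_and_asa(superpixels, ground_truth):
    """Return (UE, ASA) for two label maps of identical shape.

    Each superpixel is assigned to the ground truth segment it overlaps
    most. UE is the fraction of pixels outside their superpixel's assigned
    segment, and ASA is the fraction inside it.
    """
    overlap = defaultdict(Counter)  # superpixel label -> ground truth label counts
    total = 0
    for sp_row, gt_row in zip(superpixels, ground_truth):
        for sp_label, gt_label in zip(sp_row, gt_row):
            overlap[sp_label][gt_label] += 1
            total += 1
    if total == 0:
        raise ValueError("label maps are empty")

    # Pixels covered by the best-overlapping segment of each superpixel.
    inside = sum(max(counts.values()) for counts in overlap.values())
    return (total - inside) / total, inside / total


# Toy 2x4 example: superpixel 1 leaks one pixel into ground truth segment 0.
sp_labels = [[0, 0, 1, 1],
             [0, 0, 1, 1]]
gt_labels = [[0, 0, 0, 1],
             [0, 0, 1, 1]]

print(ue_and_asa(sp_labels, gt_labels))  # (0.125, 0.875)
```

Under these definitions the two values always sum to 1, which is why ASA can be read as the complement of UE.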