Fossil diatom microscopy image datasets and annotations for object detection

Published: 20 November 2025 | Version 3 | DOI: 10.17632/457x9c7rp3.3

Description

This dataset consists primarily of scanned images of permanent slides prepared from surface sediments collected across the Southern Ocean for fossil diatom observation. The eighteen sampling sites span a wide range of geomorphological settings and depositional systems, making the dataset well suited for evaluating object detection models that target fossil diatoms and for investigating fossil diatom biogeography in paleoenvironmental reconstruction. The dataset is divided into virtual slides, tile images, annotations, and trained models. The virtual slides are high-resolution images obtained by imaging the permanent slides and are provided in NDPI format (the native format of Hamamatsu Photonics K.K.). Each tile image is a JPEG file extracted from a virtual slide and covers a field of view of 552 × 552 μm. YOLO-format annotations are provided for each tile image, in folders that correspond to the tile-image folders. The tile images and annotations are designed specifically for detecting Eucampia antarctica (Castracane) Mangin, a diatom species endemic to the Southern Ocean, and are suitable as training and testing data for object detection models. Object detection models (YOLOv5x, Ultralytics) trained on a subset of these data are stored under “Trained models.” Note that, because of the storage limits of the data repository, this dataset includes only selected regions of the areas from which tile images were originally captured.
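As a rough illustration of how the annotations relate to the tiles, the sketch below reads one YOLO-format label line (class index, then box centre x, centre y, width, and height, all normalized to the image size) and converts it to a pixel-space box and to micrometres. The tile pixel size is not stated in this description, so `tile_px` is a hypothetical parameter that must be read from the actual JPEG, and the function names are illustrative only.

```python
from typing import NamedTuple

# Tile edge length in micrometres, as stated in the description above.
FIELD_OF_VIEW_UM = 552.0


class Box(NamedTuple):
    class_id: int
    left: float    # pixels
    top: float     # pixels
    width: float   # pixels
    height: float  # pixels


def parse_yolo_line(line: str, tile_px: int) -> Box:
    """Convert one YOLO label line to a pixel-space box.

    Assumes a square tile of tile_px x tile_px pixels (hypothetical value;
    read the real size from the image file).
    """
    class_id, xc, yc, w, h = line.split()
    xc, yc, w, h = (float(v) * tile_px for v in (xc, yc, w, h))
    return Box(int(class_id), xc - w / 2, yc - h / 2, w, h)


def px_to_um(length_px: float, tile_px: int) -> float:
    """Convert a pixel length to micrometres using the stated field of view."""
    return length_px * FIELD_OF_VIEW_UM / tile_px
```

For example, with an assumed 1000-pixel tile, the label line `0 0.5 0.5 0.25 0.125` describes a box 250 pixels wide, which corresponds to 138 μm.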

Steps to reproduce

Virtual slides and tile images were acquired with a NanoZoomer SQ (a digital whole-slide scanner developed by Hamamatsu Photonics K.K.) and its associated software. Annotations were produced using the Visual Object Tagging Tool (v2.2.0, Microsoft) and converted to a format suitable for training YOLOv5 models (Ultralytics, https://github.com/ultralytics/yolov5/releases).
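The conversion script itself is not part of this description. As a minimal sketch of what such a conversion involves, the function below turns a pixel-space bounding box into one YOLO label line. It assumes the annotation tool reports each box as left, top, width, and height in pixels along with the image dimensions; the function name and argument layout are hypothetical.

```python
def to_yolo_line(class_id: int, left: float, top: float,
                 width: float, height: float,
                 img_w: int, img_h: int) -> str:
    """Format a pixel-space box as a YOLO label line.

    YOLO labels store the box centre and size, each normalized
    to the image width or height.
    """
    xc = (left + width / 2) / img_w
    yc = (top + height / 2) / img_h
    w = width / img_w
    h = height / img_h
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"


# Hypothetical example: one box on a 1000 x 1000 pixel tile.
print(to_yolo_line(0, 375, 437.5, 250, 125, 1000, 1000))
```

One such line is written per annotated object, in a text file that shares its base name with the tile image.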

Categories

Object Detection, Microfossil, Light Microscopy

Licence