Bacteria Data for Machine Vision and Digital Biology

Published: 3 September 2023 | Version 1 | DOI: 10.17632/cvkgfzp7ck.1
Mohammad (Behdad) Jamshidi, Saleh Sargolzaee, Salimeh Foorginezhad, Omid Moztarzadeh


The dataset used in this work combines two sources: the dataset in [1] and the Digital Images of Bacteria Species (DIBaS) dataset, made publicly available by Zieliński et al. in 2017 [2]. DIBaS includes RGB images of 33 isolated bacteria species, with 20-23 images per species, totaling 689 images. A dataset of this size is relatively small for training deep learning models, which can lead to overfitting and poor generalization. Additional data were therefore collected for this study to augment the dataset, resulting in a total of 2722 RGB images. These images were captured using a Nikon E200 microscope with a 100x objective after staining the bacteria with the Gram method. The combined dataset is not only larger but also draws on multiple institutions and equipment sources, which enhances its real-world applicability.

1. M.B. Jamshidi, S. Sargolzaei, S. Foorginezhad, and O. Moztarzadeh, "Metaverse and Microorganism Digital Twins: A Deep Transfer Learning Approach," Applied Soft Computing, 2023.
2. B. Zieliński, A. Plichta, K. Misztal, P. Spurek, M. Brzychczy-Włoch, and D. Ochońska, "Deep learning approach to bacterial colony classification," PLoS ONE, vol. 12, no. 9, p. e0184554, 2017.



Univerzita Karlova, Islamic Azad University Mashhad Branch, Zapadoceska univerzita


Artificial Intelligence, Data Science, Machine Learning, Machine Vision, Digital Twin, Deep Transfer Learning, Digital Health, Digital Twin Technology