Body fluid cell Dataset

Published: 20 January 2025 | Version 1 | DOI: 10.17632/9xtjb9gczd.1
Contributor:
Chollanot Kaset

Description

The dataset used in this study consists of 12,966 cells categorized into 14 distinct cell classes. It was divided into training, validation, and test sets in proportions of approximately 70%, 20%, and 10%, respectively, to ensure comprehensive representation and robust evaluation of the model's performance.

Training Set

The training set comprises 8,660 cells, the majority of the dataset, so that the model can learn the diverse morphological features of the various cell types. The distribution includes common cell types, such as 4,997 lymphocytes and 3,305 neutrophils, as well as rarer classes, such as 46 signet-ring cells and 14 mitotic cells. Data augmentation techniques were applied to enhance the diversity and balance of the training set, particularly for rare categories.

Validation Set

The validation set contains 2,495 cells, selected to evaluate the model's performance during training. It includes 1,551 lymphocytes and 1,030 neutrophils, along with less frequent types, such as 10 basophils and 48 promonocytes, ensuring that all classes are represented. The validation set was used to fine-tune model parameters and prevent overfitting.

Test Set

The test set comprises 1,811 cells, reserved for final performance evaluation. It includes 745 lymphocytes, 442 neutrophils, and 6 signet-ring cells, among others. It provides a diverse range of cell types for assessing the model's ability to generalize to unseen data, highlighting strengths and areas for improvement in its classification capabilities.
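The split sizes quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the dataset itself; the names `split_counts` and `split_proportions` are hypothetical and chosen only for illustration. It uses only the three set sizes stated in the description and reports each split's share of the total.

```python
# Split sizes exactly as stated in the description (cells per split).
# The dictionary and function names are illustrative assumptions.
split_counts = {"train": 8660, "validation": 2495, "test": 1811}


def split_proportions(counts):
    """Return each split's share of the total, in percent, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


total_cells = sum(split_counts.values())
proportions = split_proportions(split_counts)

print(f"total: {total_cells}")
for name, pct in proportions.items():
    print(f"{name}: {pct}%")
```

With these figures the three sets sum to the stated 12,966 cells, and the shares come out to roughly 66.8%, 19.2%, and 14.0%, so the 70/20/10 description is best read as approximate.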

Institutions

Thammasat University - Rangsit Campus

Categories

Lymphocyte, Macrophage, Neutrophil, Body Fluid, Plasma Cell, Mesothelial Cell

Funding

Thammasat University

None

Licence