HDSNE a New Unsupervised Multiple Image Database Fusion Learning Algorithm with Flexible and Crispy Production of One Database: A Proof Case Study of Lung Infection Diagnose In Chest X-ray Images
Description
HDSNE is a new unsupervised learning algorithm that fuses multiple image databases to support COVID-19 diagnosis from chest X-ray images. We propose a global-scale data aggregation model that combines six image databases selected from specific global resources. The MD5 hash algorithm computes a fixed-length digest of each image file; byte-identical files produce the same digest, which makes it suitable for detecting and removing exact duplicates. The MD5 and t-SNE algorithms are applied recursively, producing a balanced and uniform database with an equal number of samples per category: normal, pneumonia, and COVID-19. The dataset is then cleaned by removing non-posteroanterior (PA) views and CT images, which would otherwise affect model training. These duplicates and out-of-scope images are removed during curation, and the refined dataset is available for download. The final version of the aggregated dataset contains 441 frontal X-ray images per class. This database accompanies the main article submitted to BMC Medical Imaging.
Steps to reproduce
The MD5 hash algorithm is applied to every image within each class. Byte-identical files produce the same digest, so any image whose digest has already been seen can be removed as a duplicate. The MD5 and t-SNE algorithms are then applied recursively, producing a balanced and uniform database that contains an equal number of samples per category: normal, pneumonia, and COVID-19.
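The duplicate-removal step can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `md5_of_file` and `find_duplicates`, the one-folder-per-class layout, and the file extensions are assumptions made for illustration. Only the standard-library `hashlib` and `pathlib` modules are used, and the t-SNE step is not covered here.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large images are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(class_dir, patterns=("*.png", "*.jpg", "*.jpeg")):
    """Split the images of one class folder into unique files and duplicates.

    Hypothetical helper: assumes one folder per class (for example
    normal/, pneumonia/, covid19/) and the extensions listed in `patterns`.
    """
    seen = {}        # digest -> first path that produced it
    duplicates = []  # later paths whose digest was already seen
    paths = sorted(p for pattern in patterns for p in Path(class_dir).glob(pattern))
    for path in paths:
        key = md5_of_file(path)
        if key in seen:
            duplicates.append(path)
        else:
            seen[key] = path
    return seen, duplicates
```

Calling `find_duplicates` on a class folder returns the first file found for each digest together with the list of later files that repeat an existing digest. Note that MD5 matches only byte-identical files; a resized or re-encoded copy of the same radiograph produces a different digest and would not be caught by this step alone.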