SalmonScan: A Novel Image Dataset for Machine Learning and Deep Learning Analysis in Fish Disease Detection in Aquaculture

Published: 2 April 2024 | Version 3 | DOI: 10.17632/x3fz2nfm4w.3
Md Shoaib Ahmed


The SalmonScan dataset is a collection of images of salmon, covering both healthy and infected fish. The dataset consists of two classes of images:

- Fresh salmon 🐟
- Infected salmon 🐠

This dataset is suited to a range of computer vision tasks in machine learning and deep learning applications. Whether you are a researcher, developer, or student, the SalmonScan dataset offers a data source to support your projects and experiments. So, dive in and explore the fascinating world of salmon health and disease!

The SalmonScan dataset (raw) consists of 24 fresh fish and 91 infected fish. [Due to server cleaning in the past, some raw datasets have been deleted.]

The SalmonScan dataset (augmented) consists of 1,208 images of salmon, classified into two classes:

- Fresh salmon (healthy fish with no visible signs of disease): 456 images
- Infected salmon (fish with disease): 752 images

Each class contains a representative and diverse collection of images, capturing a range of perspectives, scales, and lighting conditions. The images have been curated so that they are of high quality and suitable for a variety of computer vision tasks.

Data Preprocessing

The input images were preprocessed to improve their quality and suitability for further analysis. The following steps were taken:

Resizing 📏: All images were resized to a uniform 600 pixels in width and 250 pixels in height to ensure compatibility with the learning algorithm.

Image Augmentation 📸: To compensate for the small number of images, several augmentation techniques were applied to the input images, each producing additional samples:

- Horizontal Flip ↩️: The images were flipped horizontally.
- Vertical Flip ⬆️: The images were flipped vertically.
- Rotation 🔄: The images were rotated.
- Cropping 🪓: A portion of each image was randomly cropped.
- Gaussian Noise 🌌: Gaussian noise was added to the images.
- Shearing 🌆: The images were sheared.
- Contrast Adjustment (Gamma) ⚖️: Gamma correction was applied to the images to adjust their contrast.
- Contrast Adjustment (Sigmoid) ⚖️: A sigmoid function was applied to the images to adjust their contrast.

Usage

To use the SalmonScan dataset in your ML and DL projects, follow these steps:

- Clone or download the SalmonScan dataset repository from GitHub.
- Use standard libraries such as numpy or pandas, together with an image reader, to convert the images into arrays that can be fed into a machine learning or deep learning model.
- Split the dataset into training, validation, and test sets as your project requires.
- Preprocess the data as needed, for example by resizing and normalizing the images.
- Train your ML/DL model on the preprocessed training data.
- Evaluate the model on the test set and make predictions on new, unseen data.
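The usage steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' own pipeline. The folder names `FreshFish` and `InfectedFish`, the helper names, and the 70/15/15 split are assumptions made for the example, so adjust them to the layout of your download. Image loading assumes Pillow and NumPy are installed.

```python
import random
from collections import defaultdict
from pathlib import Path

# Hypothetical folder names: adjust to match the layout of your download.
CLASS_DIRS = {"FreshFish": 0, "InfectedFish": 1}
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_samples(root):
    """Return (path, label) pairs for every image under the class folders."""
    samples = []
    for folder, label in CLASS_DIRS.items():
        for path in sorted(Path(root, folder).rglob("*")):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples


def stratified_split(samples, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle within each class, then cut into train/val/test so that
    every split keeps roughly the original class ratio."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample[1]].append(sample)

    train, val, test = [], [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_val = round(len(group) * val_frac)
        n_test = round(len(group) * test_frac)
        val.extend(group[:n_val])
        test.extend(group[n_val:n_val + n_test])
        train.extend(group[n_val + n_test:])
    return train, val, test


def load_image(path, size=(600, 250)):
    """Load one image as a float32 array of shape (250, 600, 3) in [0, 1]."""
    # Imported here so the split helpers above work without these packages.
    import numpy as np
    from PIL import Image

    with Image.open(path) as img:
        img = img.convert("RGB").resize(size)  # Pillow expects (width, height)
        return np.asarray(img, dtype=np.float32) / 255.0
```

A typical call would be `train, val, test = stratified_split(list_samples("SalmonScan"))`, followed by `load_image(path)` for each sample. Because the augmented set is derived from a small number of raw images, a purely random split may place variants of the same source image in both the training and test sets. Grouping files by their source image before splitting, where the file names allow it, is likely to give a more honest evaluation.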



Artificial Intelligence, Computer Vision, Aquaculture, Machine Learning, Marine Biology, Aquaculture Disease, Fish Disease, Deep Learning