Water Quality Monitoring Dataset for Tilapia (Oreochromis niloticus) Aquaculture in Montería, Colombia (2024)
Description
This dataset contains comprehensive water quality measurements for a tilapia (Oreochromis niloticus) aquaculture pond located in Montería, Colombia, collected over a six-month period in 2024. The dataset is part of a study aimed at enhancing water quality management in aquaculture systems, particularly in rural environments with limited technological infrastructure.

Data Collection and Parameters: The data was gathered through an Internet of Things (IoT) system designed to continuously monitor key water quality parameters. Parameters include:
Temperature (°C): Essential for maintaining optimal conditions for fish metabolism and growth.
Dissolved Oxygen (DO) (mg/L): Crucial for fish respiration and overall pond health.
pH: Indicates the acidity or alkalinity of the water, which affects fish health and nutrient availability.
Turbidity (NTU): Reflects the clarity of the water, which can impact light penetration and fish behavior.

Purpose and Applications: This dataset supports predictive modeling for water quality management using Machine Learning (ML) algorithms, including Random Forest and Support Vector Machines (SVM), with optimizations implemented via the Quantum Approximate Optimization Algorithm (QAOA). The dataset was utilized to train and validate models for real-time water quality prediction, achieving high accuracy in managing aquaculture conditions and reducing fish mortality.

Significance: The dataset provides valuable insights into water quality dynamics in tropical aquaculture settings, making it suitable for researchers, aquaculture managers, and data scientists focused on sustainable aquaculture practices. The data can aid in developing predictive models to stabilize water quality, support rural and urban aquaculture management, and contribute to global food security initiatives.

Acknowledgments: The collection of this dataset was made possible by an IoT-based monitoring system tailored for the environmental conditions of Montería, with an emphasis on adaptability for resource-limited rural regions.
Steps to reproduce
Set Up Study Site and Environment: Identify a controlled aquaculture environment with stable water flow and similar climate characteristics to Montería, Córdoba, Colombia. Establish ponds for tilapia (Oreochromis niloticus) and ensure natural spring water supply if possible.

Deploy IoT System Architecture: Install sensors in the pond to measure temperature, dissolved oxygen (DO), pH, and turbidity. Connect sensors to a Raspberry Pi configured as the central processing unit for data collection and transmission. Ensure the Raspberry Pi is capable of storing data locally and transmitting it to a remote server over Wi-Fi. Set up a Django-based web interface for real-time data visualization and alert threshold configuration.

Calibrate Sensors: Calibrate DO and pH sensors according to ISO 5814:2012 and ISO 10523:2008 standards. Repeat calibration every 15 days to ensure measurement accuracy, especially in varying environmental conditions.

Implement IoT Data Management System: Use the "IoT Data Management" algorithm provided in the study to initialize and monitor sensor data. Program the Raspberry Pi to manage data collection, send alerts for threshold deviations, and update data on the web interface. Ensure continuous operation by maintaining the Wi-Fi connection and local database on the Raspberry Pi.

Data Collection and Recording: Collect data on temperature (°C), dissolved oxygen (mg/L), pH, and turbidity (NTU) every hour. Ensure data normalization for variables (temperature and DO) using Min-Max Scaling for predictive modeling integration.

Apply Fuzzy Comprehensive Evaluation (FCE): Use FCE to categorize the water quality parameters into "Good," "Moderate," and "Poor" based on local conditions and thresholds. Apply these categories for real-time decision-making and alert generation on water quality status.

Train Machine Learning Models: Select Random Forest (RF) and Support Vector Machine (SVM) models. Train the RF model with 100 decision trees and a maximum depth of 10. Train the SVM with an RBF kernel and penalty parameter C = 1.0, specifically for pH and DO predictions.

Optimize with Quantum Approximate Optimization Algorithm (QAOA): Use QAOA to optimize model processing time, reducing it by approximately 50%. Follow the provided pseudocode for QAOA and adjust parameters iteratively using gradient descent to minimize the cost function.

Evaluate Model Performance: Calculate performance metrics such as RRMSPE, MAPE, RRMSE, and R² to assess model accuracy. Validate the model's robustness in predicting water quality variables with real-time accuracy requirements.

Maintain and Update System: Regularly update the Django interface for enhanced visualization. Perform periodic sensor maintenance, especially for pH and DO electrodes, to ensure long-term accuracy. Implement inline calibration in future steps if possible for enhanced reliability.
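
The Min-Max Scaling named in the data collection step can be sketched as follows. This is a minimal illustration, not the study's code: the function name min_max_scale and the sample readings are assumptions made up for the example.

```python
def min_max_scale(values):
    """Rescale a sequence of readings linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no spread; return zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Illustrative hourly readings (invented values, not taken from the dataset).
temperature_c = [26.0, 27.0, 28.0, 30.0]
do_mg_l = [4.0, 5.0, 6.0, 8.0]

scaled_temperature = min_max_scale(temperature_c)
scaled_do = min_max_scale(do_mg_l)
```

When scaling feeds a predictive model, the minimum and maximum are usually taken from the training data only and then reused for new readings, so that test data does not leak into the scaling.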
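
One way the Fuzzy Comprehensive Evaluation step might look in code is shown below. It is a sketch under stated assumptions: the limits, the weights, the use of deviation from an assumed optimum for pH and temperature, and the weighted-average operator are all hypothetical choices for illustration, since the study's actual thresholds and membership functions are not given here.

```python
GRADES = ("Good", "Moderate", "Poor")


def grade_memberships(value, good_limit, poor_limit):
    """Return membership degrees in (Good, Moderate, Poor) for one reading.

    good_limit is where the reading counts as fully Good, poor_limit where it
    counts as fully Poor; the midpoint is fully Moderate. The function works
    whether higher values are better (good_limit > poor_limit) or worse.
    """
    # Position between the limits: 0 means fully Good, 1 means fully Poor.
    t = (value - good_limit) / (poor_limit - good_limit)
    t = min(1.0, max(0.0, t))
    good = max(0.0, 1.0 - 2.0 * t)
    poor = max(0.0, 2.0 * t - 1.0)
    moderate = 1.0 - good - poor
    return (good, moderate, poor)


def evaluate(readings, limits, weights):
    """Combine per-parameter memberships with a weighted average and pick the top grade."""
    scores = [0.0, 0.0, 0.0]
    for name, value in readings.items():
        memberships = grade_memberships(value, *limits[name])
        for i, m in enumerate(memberships):
            scores[i] += weights[name] * m
    best = max(range(len(GRADES)), key=lambda i: scores[i])
    return GRADES[best], scores


# Hypothetical (good_limit, poor_limit) pairs; not the study's thresholds.
LIMITS = {
    "do_mg_l": (6.0, 3.0),         # higher dissolved oxygen is better
    "turbidity_ntu": (20.0, 60.0),  # lower turbidity is better
    "ph_deviation": (0.5, 2.0),     # |pH - assumed optimum|
    "temp_deviation": (2.0, 6.0),   # |temperature - assumed optimum| in °C
}
# Hypothetical weights that sum to 1.
WEIGHTS = {"do_mg_l": 0.4, "turbidity_ntu": 0.2, "ph_deviation": 0.2, "temp_deviation": 0.2}

label, scores = evaluate(
    {"do_mg_l": 6.5, "turbidity_ntu": 30.0, "ph_deviation": 0.5, "temp_deviation": 1.0},
    LIMITS,
    WEIGHTS,
)
```

The returned label can drive an alert (for example, notify when the grade is "Poor"), while the score vector shows how close the pond is to the neighbouring grades.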
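
The model settings listed in the training step map directly onto scikit-learn parameters. The sketch below assumes scikit-learn and NumPy are installed, treats the task as regression (an assumption suggested by the listed metrics), and uses synthetic data with an invented temperature-to-DO relationship in place of the real dataset, whose file layout is not described here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-in for the sensor data (illustration only).
n = 500
temperature = rng.uniform(24.0, 32.0, n)
ph = rng.uniform(6.5, 8.5, n)
turbidity = rng.uniform(5.0, 60.0, n)
# Toy relationship: DO falls as temperature rises, plus noise.
do = 14.0 - 0.3 * temperature + rng.normal(0.0, 0.2, n)

X = np.column_stack([temperature, ph, turbidity])
X_train, X_test, y_train, y_test = train_test_split(
    X, do, test_size=0.2, random_state=0
)

# Random Forest with 100 trees and a maximum depth of 10, as in the steps.
rf = RandomForestRegressor(n_estimators=100, max_depth=10, random_state=0)
rf.fit(X_train, y_train)

# SVM with an RBF kernel and C = 1.0; inputs are Min-Max scaled first
# because RBF kernels are sensitive to feature scale.
svm = make_pipeline(MinMaxScaler(), SVR(kernel="rbf", C=1.0))
svm.fit(X_train, y_train)

rf_r2 = rf.score(X_test, y_test)    # R² on held-out data
svm_r2 = svm.score(X_test, y_test)  # R² on held-out data
```

Putting the scaler inside the pipeline means it is fitted on the training split only, which keeps the held-out score honest.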
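
The evaluation metrics can be computed with a few lines of plain Python. Definitions of RRMSE and RRMSPE vary between papers; the versions below (RMSE divided by the observed mean, and the root mean square of relative errors, both in percent) are common ones and are an assumption about what the study used. The sample values are invented.

```python
import math


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rrmse(y_true, y_pred):
    """Root mean square error relative to the observed mean, in percent."""
    n = len(y_true)
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    return 100.0 * rmse / (sum(y_true) / n)


def rrmspe(y_true, y_pred):
    """Root mean square of the relative errors, in percent."""
    n = len(y_true)
    return 100.0 * math.sqrt(sum(((t - p) / t) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Invented DO values in mg/L, for illustration only.
observed = [4.0, 5.0, 6.0]
predicted = [4.4, 5.0, 5.4]

metrics = {
    "MAPE": mape(observed, predicted),
    "RRMSE": rrmse(observed, predicted),
    "RRMSPE": rrmspe(observed, predicted),
    "R2": r_squared(observed, predicted),
}
```

The percentage-based metrics divide by the observed value, so they are undefined when an observation is zero; that is unlikely for pH or temperature but worth checking for DO and turbidity.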