Planetary Health: A Comprehensive View of Food in Brazil - PHFood Brazil
Description
This dataset is a result of "Integrating national open databases for a comprehensive view on food, environment and health in Brazil", an initiative that integrates diverse national open databases to provide insights into food production, consumption, environmental impact, and public health in Brazil from 1974 to 2022. The primary hypothesis behind the dataset is that analyzing food systems, nutrition, and agricultural practices over time will reveal patterns that can inform sustainable food policies and health interventions. We further hypothesize that aggregating and harmonizing data from various national sources will make it possible to identify critical relationships between food production, nutrient availability, and environmental sustainability. The integrated dataset allows us to evaluate how agricultural practices and food consumption patterns have evolved in response to changing environmental, economic, and societal conditions in Brazil.
Food Production and Nutrient Data: The dataset includes detailed information on harvested areas, food production (in tons), and nutrient availability (e.g., energy, protein, fiber, vitamins, and minerals) for various food groups across Brazilian regions and states.
Water and Environmental Data: Water usage, water deficit, and environmental impact data are linked to agricultural activities, allowing water efficiency in food production to be assessed.
Pesticides and Residue Monitoring: Information on pesticide usage, maximum residue levels (MRL), and residue percentage in food items is provided, with additional details on authorized pesticide types, toxicity classifications, and environmental risks.
Food Consumption: Data on food acquisition and consumption patterns, broken down by food group (e.g., beans, vegetables, fruits), are presented, highlighting dietary trends across population groups.
Data Collection and Methodology: The dataset was constructed by linking several government datasets, including agricultural statistics, water use reports, and residue monitoring programs. Each source was harmonized through Extract, Transform, Load (ETL) processes so that data fields are consistent and comparable across time and regions. The source data span several decades and therefore capture longitudinal trends in food systems and health.
Interpretation and Use: This dataset can be used to explore the connections between food systems, health, and the environment. It supports studies in agrifood systems, sustainability, public health, and climate change. Researchers can investigate the impacts of food production on nutrient availability, water consumption, and pesticide use over time. It can also be used to model future food system sustainability under different climate scenarios and dietary patterns, contributing to the development of food policies that balance human and planetary health.
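As an illustration of the kind of analysis described under Interpretation and Use, the sketch below computes water use per tonne of food produced, by crop and year. It is a minimal example under stated assumptions: the field names (`state`, `year`, `crop`, `production_t`, `water_m3`) and all values are hypothetical and are not taken from the published files.

```python
from collections import defaultdict

# Hypothetical records: field names and values are illustrative only,
# not the actual columns or figures of the published dataset.
records = [
    {"state": "SP", "year": 2020, "crop": "beans", "production_t": 200.0, "water_m3": 300000.0},
    {"state": "SP", "year": 2021, "crop": "beans", "production_t": 250.0, "water_m3": 325000.0},
    {"state": "BA", "year": 2020, "crop": "beans", "production_t": 100.0, "water_m3": 200000.0},
]


def water_intensity_by_year(rows):
    """Return cubic metres of water per tonne produced for each (crop, year).

    Water and production are summed across states before dividing, so the
    result is a production-weighted figure rather than a mean of state ratios.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # (crop, year) -> [water, production]
    for row in rows:
        key = (row["crop"], row["year"])
        totals[key][0] += row["water_m3"]
        totals[key][1] += row["production_t"]
    # Skip groups with zero production to avoid division by zero.
    return {key: water / prod for key, (water, prod) in totals.items() if prod > 0}


intensity = water_intensity_by_year(records)
for (crop, year), value in sorted(intensity.items()):
    print(f"{crop} {year}: {value:.1f} m3 per tonne")
```

Summing before dividing is one possible design choice; a per-state ratio may be more appropriate when regional differences are the object of study.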
Files
Steps to reproduce
Data Collection: Aggregate data from official Brazilian national databases, including agricultural statistics, environmental data (water usage and deficits), food consumption patterns, and pesticide residue monitoring. Sources include government platforms such as IBGE (Brazilian Institute of Geography and Statistics), ANVISA (Brazilian Health Regulatory Agency), and the Ministry of Agriculture.
Data Preprocessing: Clean the raw data by standardizing formats (e.g., units, dates) and handling missing or incomplete data. Use ETL (Extract, Transform, Load) processes to harmonize data from various sources into a consistent format.
Data Integration: Merge datasets by common keys such as food types, regions, and years. Ensure compatibility of columns across datasets. Link food production data with water usage, pesticide residue, and nutrient composition data.
Data Transformation: Calculate aggregate values for nutrients (e.g., protein, fiber, vitamins) based on food production and consumption. Classify food items according to food groups, regions, and other relevant categories.
Machine Learning Application: Apply state-of-the-art machine learning techniques to identify trends and correlations between food production, environmental sustainability, and health outcomes. These machine learning models can be adapted to suit specific research queries based on the dataset.
FAIR Principles Adherence: Ensure that the dataset is structured in a way that adheres to FAIR (Findable, Accessible, Interoperable, Reusable) principles to facilitate reproducibility.
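The preprocessing, integration, and transformation steps above can be sketched roughly as follows. This is not the authors' pipeline: the field names, the unit table, the policy of dropping rows with missing production, and the composition values are all assumptions made for illustration.

```python
# Hypothetical raw inputs: names and values are illustrative, not real data.
production_raw = [
    {"food": " Beans ", "state": "sp", "year": "2020", "production": "1500", "unit": "kg"},
    {"food": "beans", "state": "SP", "year": "2021", "production": "2", "unit": "t"},
    {"food": "Rice", "state": "RS", "year": "2020", "production": "", "unit": "t"},
]
# Assumed nutrient composition lookup (grams of protein per 100 g of food).
composition = {
    "beans": {"protein_g_per_100g": 20.0},
    "rice": {"protein_g_per_100g": 7.0},
}
# Assumed water-use lookup keyed by (food, state, year).
water_use = {("beans", "SP", 2020): 3000.0}

# Divisors that convert each supported unit to tonnes.
DIVISOR_TO_TONNES = {"t": 1.0, "kg": 1000.0}


def preprocess(row):
    """Standardize labels, types, and units; return None if production is missing."""
    raw_amount = row["production"].strip()
    if not raw_amount:
        return None  # one possible policy: drop rows with no production value
    return {
        "food": row["food"].strip().lower(),
        "state": row["state"].strip().upper(),
        "year": int(row["year"]),
        "production_t": float(raw_amount) / DIVISOR_TO_TONNES[row["unit"]],
    }


def integrate(rows, composition, water_use):
    """Merge cleaned production rows with composition and water use by common keys."""
    merged = []
    for row in rows:
        clean = preprocess(row)
        if clean is None:
            continue
        nutrients = composition.get(clean["food"])
        if nutrients is None:
            continue  # no composition entry, so nutrient totals cannot be derived
        key = (clean["food"], clean["state"], clean["year"])
        clean["water_m3"] = water_use.get(key)  # None when no water record matches
        # Tonnes of protein = tonnes of food * (g protein per 100 g) / 100.
        clean["protein_t"] = clean["production_t"] * nutrients["protein_g_per_100g"] / 100.0
        merged.append(clean)
    return merged


integrated = integrate(production_raw, composition, water_use)
for record in integrated:
    print(record)
```

Keeping unmatched water records as `None` rather than dropping the row preserves the production data while making the gap explicit; other handling of missing values may suit a given analysis better.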
Institutions
Categories
Funding
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior