Pantanal deforestation forecast

Published: 5 November 2023 | Version 1 | DOI: 10.17632/8kxbz4ht2c.1
Pedro Machado


This research aims to predict deforestation in the Pantanal until 2030 using modern machine learning techniques under different scenarios that vary in agricultural production, the number of head of cattle, and the area burned each year. The model chosen for the study is Extreme Gradient Boosting (XGBoost), an algorithm that builds many simple models sequentially, each one correcting the errors of those before it, and that often yields state-of-the-art results on tabular data. Our main objective is to simulate different scenarios of agricultural and cattle-ranching decisions and burned areas (realistic, optimistic, and pessimistic) to support strategic land-use decisions in Brazil. As a secondary objective, we want to build a robust and scalable code base so that others can contribute to the project, increasing its reach and impact. Ultimately, we hope this project helps preserve the integrity of this critical biome and safeguard the well-being of future generations.

The model uses two data sources. The first is the Brazilian Institute of Geography and Statistics (IBGE), from which we extract land use (ha) and quantity produced (t) for temporary and permanent crops from 1985 to 2021, as well as the cattle headcount (number of head) in Brazil over the same period. IBGE publishes the temporary crop, permanent crop, and cattle headcount series separately. In this repository, these variables are stored in five Excel files: "culturas_temporarias_1", "culturas_temporarias_2", and "culturas_temporarias_3" (temporary crops), "culturas_permanentes" (permanent crops), and "pecuaria" (cattle headcount). Within each crop file, the sheet "Área destinada à colheita (H..." contains land use per year in hectares, and the sheet "Quantidade produzida (Tonela..." contains the quantity produced per year. The cattle file has a single sheet with the number of head from 1985 to 2021.
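
Because the crop sheet names are quoted here only in truncated form, one cautious way to open them is to match on the visible prefix rather than guess the full name. The sketch below is illustrative only and is not taken from the project's code: `find_sheet` and `load_crop_sheets` are hypothetical helpers, and the loader assumes pandas plus an Excel engine such as openpyxl are installed.

```python
import unicodedata
from typing import Iterable

# Prefixes exactly as quoted in the description; the full sheet names are truncated there.
AREA_PREFIX = "Área destinada à colheita"
QUANTITY_PREFIX = "Quantidade produzida"


def _norm(text: str) -> str:
    # Normalize accents and case so "Á" compares equal however it was encoded.
    return unicodedata.normalize("NFC", text).strip().casefold()


def find_sheet(sheet_names: Iterable[str], prefix: str) -> str:
    """Return the single sheet whose name starts with `prefix`."""
    wanted = _norm(prefix)
    matches = [name for name in sheet_names if _norm(name).startswith(wanted)]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one sheet starting with {prefix!r}, found {matches}"
        )
    return matches[0]


def load_crop_sheets(path):
    """Hypothetical loader: read the land use and quantity sheets of one crop file."""
    import pandas as pd  # imported lazily so find_sheet works without pandas

    with pd.ExcelFile(path) as book:
        area = book.parse(find_sheet(book.sheet_names, AREA_PREFIX))
        quantity = book.parse(find_sheet(book.sheet_names, QUANTITY_PREFIX))
    return area, quantity
```

Matching on a prefix and failing loudly when zero or several sheets match avoids silently reading the wrong table if a workbook is laid out differently from what is described above.
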
The second data source is MapBiomas, from which we obtain land use transitions (in hectares) and forest fires (burned area, in hectares). Both datasets can be downloaded from MapBiomas. In this repository, the land use transition data are called "mapbiomas-2021" and the fire data are called "mapbiomas-queimadas-2021". The code is also included in this repository. To run the model, please read the "READ ME" file. In short, place all Excel files in a single folder and run the code from that folder. More information on the data can be found in the "metadata" file.
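
Since the code expects every input file to sit in one folder, a quick check before running can save a failed start. The sketch below is an assumption-laden convenience, not part of the published code: `missing_inputs` is a hypothetical helper, and it compares file names without their extension because the description gives the names but not the extensions.

```python
from pathlib import Path

# File names as listed in the description, without extensions.
EXPECTED_STEMS = [
    "culturas_temporarias_1",
    "culturas_temporarias_2",
    "culturas_temporarias_3",
    "culturas_permanentes",
    "pecuaria",
    "mapbiomas-2021",
    "mapbiomas-queimadas-2021",
]


def missing_inputs(folder="."):
    """Return the expected file names that are not present in `folder`."""
    present = {p.stem for p in Path(folder).iterdir() if p.is_file()}
    return [stem for stem in EXPECTED_STEMS if stem not in present]


if __name__ == "__main__":
    missing = missing_inputs(".")
    if missing:
        print("Missing input files:", ", ".join(missing))
    else:
        print("All expected input files found.")
```

Run from the folder that holds the data, this lists any file that still needs to be added before starting the model.
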



Universidade de Sao Paulo Escola Politecnica


Wetlands, Agricultural Land, Brazil, Deforestation