Vectorial dataset for thermokarst detection in the European Arctic region by Earth Observation and AI

Published: 15 December 2025 | Version 1 | DOI: 10.17632/xhy65jmpbb.1
Contributors:

Description

The dataset comprises the vector database developed to train an image segmentation model for detecting and mapping thermokarst-affected areas across the European Arctic. Training polygons were randomly selected from several representative sites undergoing permafrost thaw and were delineated through detailed geomorphological mapping. High-resolution Maxar satellite imagery (0.6 m) served as the primary source for identifying thermokarst-induced features within Arctic peatlands. To construct the training dataset, areas of thermokarst activity and areas of stable terrain were manually labelled in ArcGIS Pro using its image classification tools. Polygons were digitized over distinct thermokarst landforms, relying on the high spatial resolution of the World Imagery service for accurate feature delineation. In total, 133 polygon features representing thermokarst landsystems were extracted from northern Finland, 205 from the Abisko region and northeastern Sweden, and 59 from northern Norway. No sample size was fixed in advance; the aim was to include as many representative features as possible. The final dataset contains vector data for both thermokarst landsystems and independently mapped thermokarst ponds, along with full spatial references and metadata. Shapefiles delineating the areas where thermokarst and non-thermokarst features were detected are also provided.
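As a quick consistency check, the regional polygon counts quoted above can be tallied and compared with the total of 397 polygons reported under "Steps to reproduce". The snippet below is only an illustrative sketch: the dictionary keys are descriptive labels chosen here, not attribute or file names from the dataset.

```python
# Polygon counts per region, as stated in the description.
# The keys are descriptive labels only (an assumption), not dataset field names.
counts = {
    "northern Finland": 133,
    "Abisko region and northeastern Sweden": 205,
    "northern Norway": 59,
}

total = sum(counts.values())

# Share of each region in the training set, in percent (rounded to one decimal).
shares = {region: round(100 * n / total, 1) for region, n in counts.items()}

for region, n in counts.items():
    print(f"{region}: {n} polygons ({shares[region]} %)")
print(f"total: {total}")
```

The total agrees with the 397 digitized polygons mentioned in the reproduction steps.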

Steps to reproduce

Thermokarst detection and mapping were carried out using high-resolution satellite imagery from SPOT (2.5 m), DigitalGlobe, and Maxar (0.5 m) datasets available through the ArcGIS World Imagery service. Only snow-free images acquired between late May and late August were selected to ensure visibility of surface features during the palsa mire growing season. Lower-resolution imagery such as Planet and Sentinel-2 was tested in a preliminary phase but proved unsuitable for identifying small-scale landforms. Training and detection areas were chosen in Arctic Finland, Sweden, and Norway, focusing on peatlands within discontinuous or sporadic permafrost zones, where thermokarst textures are best expressed. Manual geomorphological mapping was performed in ArcGIS Pro by delineating polygons over distinct thermokarst features using established visual interpretation criteria, including shape, tone, size, texture, and spatial association. A total of 397 polygons were digitized to represent characteristic landforms such as peat plateaus, palsas, thaw depressions, and thermokarst ponds smaller than one hectare. These samples were used to train a deep learning model for automatic detection of thermokarst and non-thermokarst areas. The trained model was then applied across Arctic test areas to identify thermokarst terrains and lakes. Post-processing steps refined the results through spatial filtering and area thresholds, producing two final datasets: one mapping all thermokarst features and another isolating thermokarst ponds indicative of ongoing permafrost degradation.
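The area-threshold part of the post-processing can be illustrated with a minimal sketch. It assumes polygons are given as lists of (x, y) vertices in a projected coordinate system with metre units, and it uses the one-hectare upper bound mentioned above for ponds. The lower bound `MIN_AREA_M2` and the function names are hypothetical and do not come from the dataset or from the authors' workflow, which was carried out in ArcGIS Pro.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]

HECTARE_M2 = 10_000.0  # one hectare in square metres
MIN_AREA_M2 = 10.0     # hypothetical lower bound to drop tiny artefacts (assumption)


def polygon_area(ring: Sequence[Point]) -> float:
    """Planar area of a simple polygon (shoelace formula), coordinates in metres."""
    n = len(ring)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def filter_by_area(
    polygons: Sequence[Sequence[Point]],
    min_area_m2: float = MIN_AREA_M2,
    max_area_m2: float = HECTARE_M2,
) -> List[Sequence[Point]]:
    """Keep polygons whose area lies in [min_area_m2, max_area_m2)."""
    return [p for p in polygons if min_area_m2 <= polygon_area(p) < max_area_m2]


# Three toy squares: 1 m, 50 m and 200 m on a side.
tiny = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
pond = [(0.0, 0.0), (50.0, 0.0), (50.0, 50.0), (0.0, 50.0)]
lake = [(0.0, 0.0), (200.0, 0.0), (200.0, 200.0), (0.0, 200.0)]

kept = filter_by_area([tiny, pond, lake])
print(len(kept), polygon_area(kept[0]))
```

Only the 50 m square (0.25 ha) passes both bounds. For real data the areas should be computed in an equal-area projection, since planar formulas applied to geographic coordinates give meaningless values.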

Institutions

  • Ilmatieteen Laitos
  • Universitatea din Bucuresti Facultatea de Geografie
  • Institutul de Speologie Emil Racovita

Categories

Remote Sensing, Permafrost Processes, Earth Observation, Thawing, Deep Learning, U-Net
