Species data used in Random Forest modelling to determine predictors of extinction.

Published: 18 August 2023 | Version 1 | DOI: 10.17632/tc6syk8vwf.1

Description

Plant species information extracted from the South African National Red List (http://redlist.sanbi.org/index.php). The list comprises taxa categorised by the Red List as Extinct or Critically Endangered Possibly Extinct, together with a sample of extant taxa (Critically Endangered, Endangered, Vulnerable, Near Threatened, Least Concern), totalling 946 taxa. The fields included are Genspec (unique ID), Group (Extinct, Threatened or not-Threatened), Life form (LF), Growth form (GF), Fam (taxonomic family), the South African biome in which the taxon occurs, national Red List status, range size in km², and the threats affecting the species as identified during Red List assessment, coded as binary variables (Yes/No): Habitat loss, Habitat degradation, Invasive alien species (IAS), Other, Over-exploitation, Pollution and Unknown. The dataset was used in Random Forest modelling to classify taxa as Extinct (EX), threatened or not-threatened, and to identify which predictors are most important for classifying taxa as extinct. The first dataset (All_threat_data.csv) includes threat information for all taxa, while the second dataset (Ex_Threatened_threat_data.csv) includes threat information for extinct and threatened taxa only.
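As a rough illustration of how the Yes/No threat coding might be prepared for modelling, the Python sketch below maps the threat flags to 1/0 and derives an extinct-plus-threatened subset comparable to the second file. The column headers, the helper names (encode_threats, load_taxa) and the three sample rows are assumptions made for illustration; they are not taken from the published CSV files and should be checked against the real headers.

```python
import csv
import io

# Assumed column headers; the headers in the published CSV files may differ.
THREAT_COLUMNS = [
    "Habitat loss", "Habitat degradation", "IAS", "Other",
    "Over-exploitation", "Pollution", "Unknown",
]


def encode_threats(row):
    """Return a copy of the row with Yes/No threat flags mapped to 1/0."""
    encoded = dict(row)
    for col in THREAT_COLUMNS:
        value = row[col].strip().lower()
        if value not in ("yes", "no"):
            raise ValueError(f"unexpected value {row[col]!r} in column {col!r}")
        encoded[col] = 1 if value == "yes" else 0
    return encoded


def load_taxa(handle, groups=None):
    """Read taxa from an open CSV handle, optionally keeping only some Group values."""
    rows = [encode_threats(r) for r in csv.DictReader(handle)]
    if groups is not None:
        rows = [r for r in rows if r["Group"] in groups]
    return rows


# Placeholder rows invented for this sketch; they are not real Red List records.
SAMPLE = """Genspec,Group,Habitat loss,Habitat degradation,IAS,Other,Over-exploitation,Pollution,Unknown
Taxon a,Extinct,Yes,No,No,No,No,No,No
Taxon b,Threatened,Yes,Yes,Yes,No,No,No,No
Taxon c,not-Threatened,No,No,No,No,No,No,Yes
"""

all_taxa = load_taxa(io.StringIO(SAMPLE))
ex_threatened = load_taxa(io.StringIO(SAMPLE), groups={"Extinct", "Threatened"})
print(len(all_taxa), len(ex_threatened))
```

To use the real files, the io.StringIO(SAMPLE) argument could be replaced with an open handle to All_threat_data.csv. The encoded rows could then be passed to whichever Random Forest implementation is preferred, with Group as the response variable.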

Steps to reproduce

The required data fields were extracted from the South African National Plant Red List database using a query in an Access database. Similar data are available upon request.

Institutions

South African National Biodiversity Institute, Stellenbosch University

Categories

Botany, Conservation Biology
