Species data used in Random Forest modelling to determine predictors of extinction.
Description
Plant species information extracted from the South African National Red List (http://redlist.sanbi.org/index.php). The list comprises taxa categorised by the Red List as Extinct or Critically Endangered Possibly Extinct, plus a sample of extant taxa (Critically Endangered, Endangered, Vulnerable, Near Threatened, Least Concern), totalling 946 taxa. The fields included are Genspec (UniqueID), Group (Extinct, Threatened or not-Threatened), Life form (LF), Growth form (GF), Fam (taxonomic family), the South African biome in which the taxon occurs, national Red List status, Range size in km², and the threats impacting the species as identified during Red List assessment. Threats are coded as binary values (Yes/No) for Habitat loss, Habitat degradation, Invasive alien species (IAS), Other, Over-exploitation, Pollution and Unknown. This dataset was used to classify taxa as Extinct (EX), threatened or not-threatened, and to identify which predictors are the most important for classifying taxa as extinct using Random Forest modelling. The first dataset (All_threat_data.csv) includes threat information for all taxa, while the second dataset (Ex_Threatened_threat_data.csv) includes threat information only for extinct and threatened taxa.
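As an illustration of the binary threat coding described above, the following is a minimal Python sketch, not the authors' workflow. It reads a few made-up rows from an in-memory CSV, converts the Yes/No threat fields to 1/0, and keeps only extinct and threatened taxa, which is one way a subset like the second dataset could be derived. The column names (Range_km2, Habitat_loss, IAS, Pollution), the Group labels as spelled here, and all sample rows are assumptions for illustration; check them against the headers in the actual files before reuse.

```python
import csv
import io

# Hypothetical sample rows: column names and values are illustrative assumptions,
# not copied from All_threat_data.csv.
SAMPLE_CSV = """Genspec,Group,Range_km2,Habitat_loss,IAS,Pollution
Species a,Extinct,12,Yes,No,No
Species b,Threatened,340,Yes,Yes,No
Species c,Not-Threatened,15000,No,No,No
"""

# Assumed names of the Yes/No threat columns in this sample.
THREAT_COLUMNS = ["Habitat_loss", "IAS", "Pollution"]


def encode_row(row):
    """Return a copy of the row with Yes/No threats as 1/0 and range size as float."""
    encoded = dict(row)
    for col in THREAT_COLUMNS:
        encoded[col] = 1 if row[col].strip().lower() == "yes" else 0
    encoded["Range_km2"] = float(row["Range_km2"])
    return encoded


def load_taxa(handle):
    """Read all rows from an open CSV handle and encode each one."""
    return [encode_row(r) for r in csv.DictReader(handle)]


def extinct_and_threatened(rows):
    """Keep only rows whose Group is Extinct or Threatened."""
    return [r for r in rows if r["Group"] in ("Extinct", "Threatened")]


all_taxa = load_taxa(io.StringIO(SAMPLE_CSV))
subset = extinct_and_threatened(all_taxa)
print(len(all_taxa), len(subset))
```

To apply the same functions to the real file, pass an open file handle instead of the in-memory sample, for example `load_taxa(open("All_threat_data.csv", newline=""))`, after adjusting the assumed column names. The encoded rows could then serve as input to a Random Forest implementation of your choice.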
Files
Steps to reproduce
The specific data fields required were extracted from the South African National Plant Red List database using a query in an Access database. Similar data are available upon request.