Distribution Modelling of Sea Pens, Sponges, Stalked Tunicates and Soft Corals from Research Vessel Survey Data in the Gulf of St. Lawrence for Use in the Identification of Significant Benthic Areas
Models of probability of occurrence and predicted biomass distribution were created using random forest (RF) machine learning techniques for four invertebrate taxa in the Gulf of St. Lawrence. Response data were derived from by-catch data collected during DFO research vessel trawl surveys, which follow a stratified random design based on depth and geographic region. Predictors were drawn from 78 environmental data layers. Occurrence models performed very well for sea pens and stalked tunicates, and less well for soft corals and sponges, with cross-validated AUC (area under the receiver operating characteristic curve) values ranging from 0.71 to 0.91. For the biomass models, soft corals and sea pens had the highest R² values in the southern Gulf of St. Lawrence (0.42 and 0.37, respectively), and stalked tunicates and sea pens had the highest values in the north (0.41 and 0.27, respectively). Sponges had R² values below 0.1 in both areas, indicating poor model performance. Biomass models from RF were compared with generalized additive models (GAMs). In most cases RF and GAM models gave similar results and both were good options, although RF requires fewer assumptions, which makes it the more convenient method. These results could be used to identify the potential distribution of some vulnerable marine ecosystem indicator taxa and to help refine the borders of the significant benthic area polygons, which delineate significant concentrations of these taxa as identified through kernel density analyses. In particular, these models can be used to extrapolate to areas not covered by the research vessel surveys.
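To show how the two performance measures quoted above can be read, here is a minimal Python sketch. It is an illustration only, not the code used for the models. The function names (auc, r_squared) and the toy data are hypothetical, and the definitions used in the report may differ in detail (for example, in how cross-validation folds are combined). AUC is computed as the probability that a randomly chosen presence receives a higher predicted probability than a randomly chosen absence; R² is computed as one minus the ratio of the residual to the total sum of squares.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the fraction of (presence, absence) pairs in which the
    presence has the higher score; tied pairs count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data, not survey data.
presence = [1, 0, 1, 0, 1, 0]              # 1 = taxon caught in the tow
probability = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]  # predicted probability of occurrence
example_auc = auc(presence, probability)

biomass_obs = [1.0, 2.0, 3.0, 4.0]         # observed biomass per tow
biomass_pred = [1.5, 2.0, 2.5, 4.0]        # predicted biomass per tow
example_r2 = r_squared(biomass_obs, biomass_pred)

print(f"AUC = {example_auc:.3f}, R2 = {example_r2:.3f}")
```

Under these definitions, an AUC of 0.5 corresponds to ranking presences no better than chance and 1.0 to perfect ranking, while an R² near zero means the predictions explain little more of the variation in biomass than the mean does.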