Quantitative Developments - ARMI Papers & Reports
Papers & Reports: Simulated soundscapes and transfer learning boost the performance of acoustic classifiers under data scarcity
Authors: Matthew J Weldy; Damon B Lesmeister; Tom Denton; Adam Duarte; Ben J Vernasco; Amandine Gasc; Jennifer C Rowe; Michael J Adams; Matthew G Betts
Date: 2025-06-26 | Outlet: Methods in Ecology and Evolution
The biodiversity crisis necessitates spatially extensive methods to monitor multiple taxonomic groups for evidence of change in response to evolving environmental conditions. Programs that combine passive acoustic monitoring and machine learning are increasingly used to meet this need. These methods require large, annotated datasets, which are time-consuming and expensive to produce, creating potential barriers to adoption in data- and funding-poor regions. Recently released pre-trained avian acoustic classification models provide opportunities to reduce the need for manual labelling and accelerate the development of new acoustic classification algorithms through transfer learning. Transfer learning is a strategy for developing algorithms under data scarcity that uses pre-trained models from related tasks to adapt to new tasks.
Our primary objective was to develop a transfer learning strategy using the feature embeddings of a pre-trained avian classification model to train custom acoustic classification models in data-scarce contexts. We used three annotated avian acoustic datasets to test whether transfer learning and soundscape simulation-based data augmentation could substantially reduce the annotated training data necessary to develop performant custom acoustic classifiers. We also conducted a sensitivity analysis for hyperparameter choice and model architecture. We then assessed the generalizability of our strategy to increasingly novel non-avian classification tasks.
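As a rough illustration of training a custom classifier on the feature embeddings of a pre-trained model, the sketch below fits a small softmax layer on fixed embedding vectors with two training examples per class. It is not the authors' implementation: the embeddings are synthetic stand-ins generated with `random`, and the function names, learning rate, and epoch count are assumptions chosen for the example.

```python
import math
import random


def softmax(scores):
    """Convert raw class scores to probabilities (shifted for numerical stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(weights, biases, embedding):
    """Linear layer: one score per class from a fixed embedding vector."""
    return [sum(w * x for w, x in zip(row, embedding)) + b
            for row, b in zip(weights, biases)]


def train_linear_head(embeddings, labels, n_classes, lr=0.1, epochs=200):
    """Fit a softmax classifier on frozen embeddings with plain SGD."""
    dim = len(embeddings[0])
    weights = [[0.0] * dim for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(embeddings, labels):
            probs = softmax(class_scores(weights, biases, x))
            for c in range(n_classes):
                # Gradient of the cross-entropy loss with respect to score c.
                grad = probs[c] - (1.0 if c == y else 0.0)
                biases[c] -= lr * grad
                for j in range(dim):
                    weights[c][j] -= lr * grad * x[j]
    return weights, biases


def predict(weights, biases, embedding):
    """Return the index of the highest-scoring class."""
    scores = class_scores(weights, biases, embedding)
    return scores.index(max(scores))


# Synthetic stand-ins for embeddings from a pre-trained model (hypothetical data).
rng = random.Random(0)
centers = [[2.0] * 4 + [0.0] * 4, [0.0] * 4 + [2.0] * 4]
train_x, train_y = [], []
for label, center in enumerate(centers):
    for _ in range(2):  # two training examples per class
        train_x.append([v + rng.gauss(0.0, 0.3) for v in center])
        train_y.append(label)

weights, biases = train_linear_head(train_x, train_y, n_classes=2)
print([predict(weights, biases, c) for c in centers])
```

Only the small output layer is trained here; the embedding model itself stays fixed, which is what keeps the number of parameters to learn small when labelled examples are scarce.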
With as few as two training examples per class, our soundscape simulation data augmentation approach consistently yielded new classifiers with improved performance relative to the pre-trained classification model and transfer learning classifiers trained with other augmentation approaches. Performance increases were evident for three avian test datasets, including single-class and multi-label contexts. We observed that the relative performance among our data augmentation approaches varied for the avian datasets and nearly converged for one dataset when we included more training examples.
We demonstrate an efficient approach to developing new acoustic classifiers leveraging open-source sound repositories and pre-trained networks to reduce manual labelling. With very few examples, our soundscape simulation approach to data augmentation yielded classifiers with performance equivalent to those trained with many more examples, showing it is possible to reduce manual labelling while still achieving high-performance classifiers and, in turn, expanding the potential for passive acoustic monitoring to address rising biodiversity monitoring needs.
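One simple way to picture soundscape simulation as data augmentation is to overlay a single annotated call on background recordings at random positions and signal-to-noise ratios. The sketch below does that with plain Python lists. It is an illustration under stated assumptions, not the procedure used in the paper: the function names, the SNR range, and the synthetic sine-wave call and Gaussian-noise backgrounds are all hypothetical.

```python
import math
import random


def rms(signal):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def mix_at_snr(call, background, snr_db, offset):
    """Overlay `call` on `background` at `offset`, scaled to a target SNR in dB."""
    if offset < 0 or offset + len(call) > len(background):
        raise ValueError("call must fit inside the background clip")
    # Choose the gain so that 20*log10(rms(scaled call) / rms(background)) == snr_db.
    gain = rms(background) * 10 ** (snr_db / 20) / rms(call)
    mixed = list(background)
    for i, sample in enumerate(call):
        mixed[offset + i] += gain * sample
    return mixed


def simulate_soundscapes(call, label, backgrounds, n, rng, snr_range=(-5.0, 15.0)):
    """Generate n labelled synthetic soundscapes from one annotated call."""
    examples = []
    for _ in range(n):
        background = rng.choice(backgrounds)
        offset = rng.randrange(len(background) - len(call) + 1)
        snr_db = rng.uniform(*snr_range)
        examples.append((mix_at_snr(call, background, snr_db, offset), label))
    return examples


# Hypothetical inputs: a sine-wave "call" and three Gaussian-noise backgrounds.
rng = random.Random(1)
call = [math.sin(2 * math.pi * 0.05 * t) for t in range(1000)]
backgrounds = [[rng.gauss(0.0, 0.1) for _ in range(8000)] for _ in range(3)]
augmented = simulate_soundscapes(call, "species_a", backgrounds, n=5, rng=rng)
print(len(augmented), len(augmented[0][0]))
```

In practice the call and backgrounds would come from real recordings, and each simulated clip would then be passed through the pre-trained model to obtain an embedding for training.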
Papers & Reports: Bayesian networks facilitate updating of species distribution and habitat suitability models
Authors: Adam Duarte; Robert S Spaan; James T Peterson; Christopher A Pearl; Michael J Adams
Date: 2024-12-06 | Outlet: Ecological Modelling
Managers often rely on predictions of species distributions and habitat suitability to inform conservation and management decisions. Although numerous approaches are available to develop models to make these predictions, few approaches exist to update existing models as new data accumulate. Updatable models are needed as a matter of good modeling practice: they keep pace with changes in the environment and in data availability, so that decisions continue to be informed by the best-available science. We demonstrated a workflow to deliver predictive models to user groups within Bayesian networks, allowing models to be used to make predictions across new sites and to be easily updated with new data. To demonstrate this workflow, we focus on species distribution and habitat suitability models given their importance for informing conservation strategies across the globe. In particular, we followed a standard process of collating species encounter data available in online databases and ancillary covariate data to develop a habitat suitability model. We then used this model to parameterize a Bayesian network and updated the model with new data to predict species presence in a new focal ecoregion. We found the network updated relatively quickly as new data were incorporated, and the overall error rate generally decreased with each model update. Our approach allows for the formal incorporation of new data into predictions to help ensure model predictions are based on all relevant data available, regardless of whether they were collected after initial model development. Although our focus is on species distribution and habitat suitability models to inform conservation efforts, the workflow we describe herein can easily be applied to any use case where reduced model uncertainty and increased prediction accuracy are desired via model updating as new data become available. Thus, our paper describes a generalizable workflow to implement model updating, which is widely recognized as a good modeling practice but is underutilized in applied ecology.
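To make the idea of parameterizing a network from a fitted model and then updating it with new data concrete, the sketch below treats a single node, P(present | habitat state), as a set of Beta pseudo-counts. This is a simplified illustration and not the workflow from the paper: the class name `PresenceNode`, the `prior_weight` parameter (how many observations the initial model is treated as being worth), and the habitat states and probabilities are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class PresenceNode:
    """Beta pseudo-counts for P(present | habitat state) in a single network node."""
    detections: dict = field(default_factory=dict)
    non_detections: dict = field(default_factory=dict)

    @classmethod
    def from_model(cls, predicted_prob, prior_weight):
        """Seed the node from model-predicted probabilities.

        prior_weight is the number of observations the initial model is
        treated as being worth (an assumption of this sketch).
        """
        node = cls()
        for state, p in predicted_prob.items():
            node.detections[state] = p * prior_weight
            node.non_detections[state] = (1.0 - p) * prior_weight
        return node

    def update(self, state, present):
        """Incorporate one new survey record for the given habitat state."""
        if present:
            self.detections[state] += 1.0
        else:
            self.non_detections[state] += 1.0

    def prob(self, state):
        """Posterior mean probability of presence for the habitat state."""
        a = self.detections[state]
        b = self.non_detections[state]
        return a / (a + b)


# Hypothetical starting probabilities from an existing habitat suitability model.
node = PresenceNode.from_model({"low": 0.1, "high": 0.6}, prior_weight=10.0)
for present in [True, True, True, False]:  # new surveys at low-suitability sites
    node.update("low", present)
print(round(node.prob("low"), 3), round(node.prob("high"), 3))
```

A smaller `prior_weight` lets new data move the estimate faster, while a larger one makes the node lean more heavily on the original model.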
Papers & Reports: Informative priors can account for location uncertainty in stop-level analyses of the North American Breeding Bird Survey (BBS), allowing fine-scale ecological analyses
Authors: Ryan C Burner; Alan Kirschbaum; Jeffrey A. Hostetler; David J. Ziolkowski Jr; Nicholas M. Anich; Daniel Turek; Eli D. Striegel; Neal D. Niemuth
Date: 2024-09-14 | Outlet: Ornithological Applications
Ecologists can learn a lot about species by studying the precise locations in which they do (and do not) occur, but the location information associated with many species records is imprecise. A prominent example of this is the North American Breeding Bird Survey (BBS), in which volunteer observers have surveyed birds at points along consistent routes across the United States for over fifty-five years. As the BBS was designed for large-scale analyses, detailed location information for each bird count is not recorded. We estimate location uncertainty, and the resulting uncertainty in land cover covariates, for the BBS data and present a modeling method that accounts for this uncertainty in a way that opens new possibilities for fine-scale uses of this extensive dataset, unlocking its potential to advance the study of the relationships between birds and their immediate habitat. More broadly, our methods and modeling framework could be used in a variety of situations in which covariate or location uncertainty is a challenge.
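One way to see how location uncertainty turns into covariate uncertainty is to weight a set of candidate stop positions by how plausible each is, then summarize the land cover values at those positions as a mean and standard deviation that could parameterize an informative prior. The sketch below is a hedged illustration only and does not reproduce the authors' model: the Gaussian weighting, the function names, and the offsets and forest-cover values are hypothetical.

```python
import math


def location_weights(offsets_m, sd_m):
    """Normalized Gaussian weights for candidate positions around the nominal stop.

    offsets_m: distance of each candidate from the nominal stop location (metres).
    sd_m: assumed standard deviation of the location error (metres).
    """
    raw = [math.exp(-0.5 * (d / sd_m) ** 2) for d in offsets_m]
    total = sum(raw)
    return [r / total for r in raw]


def covariate_prior(values, weights):
    """Weighted mean and standard deviation of a covariate over candidate positions."""
    mean = sum(w * v for w, v in zip(weights, values))
    variance = sum(w * (v - mean) ** 2 for w, v in zip(weights, values))
    return mean, math.sqrt(variance)


# Hypothetical candidates along a route and the forest cover proportion at each.
offsets = [-200.0, -100.0, 0.0, 100.0, 200.0]
forest_cover = [0.2, 0.4, 0.5, 0.6, 0.8]
weights = location_weights(offsets, sd_m=100.0)
prior_mean, prior_sd = covariate_prior(forest_cover, weights)
print(round(prior_mean, 3), round(prior_sd, 3))
```

Where land cover is uniform around a stop the resulting standard deviation shrinks toward zero, so location error matters little. Where cover changes sharply along the route, the prior is correspondingly wide.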
* USGS neither sponsors nor endorses non-USGS web sites; per requirement "3.4.1 Prohibition of Commercial Endorsement."
* PDF documents require Adobe Reader or Google Chrome Browser for viewing.