Detecting primate vocalizations in a jungle of audio recordings
Biodiversity monitoring in tropical rainforests in relation to wood logging certification
Bioacoustic monitoring:
Applied Data Science seed grant for interdisciplinary collaboration
Team:
Species | # vocalizations | example |
---|---|---|
Chimpanzee | 1190 | |
Guenon | 554 | |
Mandrill | 2717 | |
Red Capped Mangabey | 584 | |
“We need more”
Combine vocalizations with jungle noise
Dampen the vocalizations to simulate distance
For each segment, 4 new segments are created
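The augmentation steps above can be sketched roughly as follows. This is a minimal illustration, not the team's actual pipeline: the function names, the four gain values, and the sine-wave stand-in for a vocalization are all assumptions made for the example. Only the idea of mixing with jungle noise, dampening to simulate distance, and producing 4 new segments per original comes from the slides.

```python
import numpy as np

def mix_with_noise(vocalization, noise, gain_db):
    """Scale a vocalization by gain_db (0 or negative) and add it to a noise clip of equal length."""
    gain = 10.0 ** (gain_db / 20.0)  # convert decibels to a linear amplitude factor
    return noise + gain * vocalization

def augment(vocalization, noise, rng, gains_db=(0.0, -6.0, -12.0, -18.0)):
    """Return one augmented segment per gain value (assumed values), each on a random noise excerpt."""
    n = len(vocalization)
    segments = []
    for gain_db in gains_db:
        start = rng.integers(0, len(noise) - n + 1)  # random position in the noise recording
        segments.append(mix_with_noise(vocalization, noise[start:start + n], gain_db))
    return segments

rng = np.random.default_rng(0)
sr = 16000                                       # assumed sample rate
t = np.arange(sr) / sr
call = 0.5 * np.sin(2 * np.pi * 440 * t)         # 1 s stand-in for a real vocalization
jungle = 0.05 * rng.standard_normal(10 * sr)     # 10 s stand-in for jungle background noise
segments = augment(call, jungle, rng)            # 4 new segments for this one call
```

Lower gains make the call quieter relative to the background, which is one simple way to imitate an animal calling from further away.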
How to test the classifiers?
Feature extraction inspired by human speech recognition
Determining number of features to select with RFE
Using the feature_importances_ attribute of ExtraTreesClassifier to select the 50 most important features
Train SVM model with selected features
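A minimal sketch of the importance-based selection and SVM training, assuming scikit-learn and a synthetic feature matrix in place of the real acoustic features. The data set size, number of trees, scaling step, and RBF kernel are assumptions chosen for illustration. The RFE step used to decide how many features to keep is not shown.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the acoustic feature matrix: 4 classes, 200 features.
X, y = make_classification(n_samples=400, n_features=200, n_informative=20,
                           n_classes=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# Rank features by the impurity-based importances of an extra-trees ensemble.
forest = ExtraTreesClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
top50 = np.argsort(forest.feature_importances_)[::-1][:50]  # indices of the 50 highest scores

# Train an SVM on the selected columns only (scaling and kernel are assumptions).
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
svm.fit(X_train[:, top50], y_train)
predictions = svm.predict(X_test[:, top50])
```

Fitting the forest on the training split only keeps the test split out of the feature selection, so the later evaluation is not biased by it.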
Convolutional Neural Networks (CNN)
Spectrogram - represents the intensity of different frequencies as they change over time, typically using a color map
Log-mel spectrogram - a variation of the standard spectrogram that applies a mel filter bank and then takes the logarithm
Makes quieter sounds more detectable
Aligns the representation with human auditory perception
Normalizes the features
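To make the log-mel spectrogram concrete, here is a small NumPy-only sketch. It is an illustration under assumed settings (FFT size 1024, hop 256, 64 mel bands, the common 2595 * log10(1 + f/700) mel formula), not the configuration used in the project. In practice a library such as librosa would usually do this.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(sr, n_fft, n_mels):
    """Triangular filters evenly spaced on the mel scale, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, len(fft_freqs)))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def log_mel_spectrogram(signal, sr, n_fft=1024, hop=256, n_mels=64):
    """Return a (n_mels, n_frames) array of mel-band power in decibels."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2        # standard power spectrogram
    mel = power @ mel_filter_bank(sr, n_fft, n_mels).T      # apply the mel filter bank
    return 10.0 * np.log10(mel + 1e-10).T                   # log compresses the dynamic range

sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)       # 1 s test tone standing in for a recording
S = log_mel_spectrogram(tone, sr)
```

The resulting 2D array can be treated like a single-channel image, which is what makes it a natural input for a CNN.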
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
designed for audio event detection and classification
a combination of convolutional blocks and pooling operations
Trained on | SVM | CNN | CNN10 |
---|---|---|---|
Sanctuary | 0.86 | 0.81 | 0.83 |
Synthetic | 0.65 | 0.82 | 0.85 |
Sanctuary + Synthetic | 0.87 | 0.83 | 0.87 |
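Values are Unweighted Average Recall (UAR): the mean of the per-class recalls, so every species counts equally regardless of how many vocalizations it has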
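As a small illustration of the metric, the sketch below computes UAR on a made-up set of labels. The function name and the toy predictions are assumptions for the example and are unrelated to the results in the table.

```python
import numpy as np

def unweighted_average_recall(y_true, y_pred):
    """Mean of the per-class recalls, giving each class equal weight."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))

# Toy example: 4 chimpanzee calls (3 found) and 2 mandrill calls (both found).
y_true = ["chimpanzee"] * 4 + ["mandrill"] * 2
y_pred = ["chimpanzee", "chimpanzee", "chimpanzee", "mandrill", "mandrill", "mandrill"]
uar = unweighted_average_recall(y_true, y_pred)  # (0.75 + 1.0) / 2 = 0.875
```

Plain accuracy on the same toy data would be 5/6, because it weights the larger class more heavily. With class sizes as uneven as in the vocalization counts, UAR avoids rewarding a classifier that mostly predicts the most common species.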
Python modules for audio analysis
Public train and test data
Publication
Generic Audio Analysis Platform
Parisa Zahedi & Jelle Treep