With one in three of New Zealand’s native bird species threatened or near extinction, the Department of Conservation (DOC) needs to take innovative approaches to ensure their survival.
New Zealand’s national bird, the kiwi, is one of these threatened species. There are only about 68,000 kiwi left, and they’re disappearing at a rate of about 20 per week. DOC is using many tactics to increase kiwi numbers, from predator control and raising chicks in captivity, to genetic research.
To understand the impact of these conservation efforts, and identify which tactics are working, it’s essential that DOC can accurately measure and monitor kiwi populations.
As part of their kiwi monitoring programme, DOC placed microphones in Fiordland kiwi habitats. Here, an extensive but dwindling population of kiwi is spread across almost one million hectares of forest.
The microphones captured 2,000 hours of audio, split into 8,000 15-minute recordings. Each recording needed to be listened to and tagged for kiwi calls so DOC scientists could track where kiwi were in the area and how many there were.
With such a large and unstructured data set, manually locating kiwi calls in each file was a huge task. It would take 12 straight weeks of listening around the clock just to get through all 8,000 audio files. On top of this, identifying kiwi calls is incredibly difficult, with ambient forest noise and other bird, insect and animal sounds obscuring the audio.
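The scale of the listening task follows directly from the figures above. The short calculation below is only a back-of-the-envelope check; the variable names are illustrative.

```python
# Figures taken from the text above.
NUM_RECORDINGS = 8_000
MINUTES_PER_RECORDING = 15

total_hours = NUM_RECORDINGS * MINUTES_PER_RECORDING / 60

# Listening non-stop: 24 hours a day, 7 days a week.
weeks_nonstop = total_hours / (24 * 7)

print(f"{total_hours:.0f} hours of audio")
print(f"{weeks_nonstop:.1f} weeks of continuous listening")
```

The result is just under 12 weeks, which matches the estimate quoted above and assumes no breaks and no time spent tagging.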
DOC challenged Qrious to develop a model that could automatically identify kiwi bird sounds within audio recordings faster and more accurately than human-based approaches.
Given recent progress in artificial intelligence image recognition, Qrious data scientists decided to transform the unstructured audio files provided by DOC into spectrograms: images that show how the frequency content of a sound changes over time.
They could then use image classification technology and machine learning to train a model to automatically identify kiwi calls in those spectrograms.
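To illustrate what turning audio into a spectrogram involves, here is a minimal sketch using only the Python standard library. It is not the pipeline Qrious built: the window size, hop length, sample rate and synthetic test tone are all assumptions chosen for the demonstration, and a production system would typically use an optimised FFT library rather than this naive transform.

```python
import cmath
import math

WINDOW_SIZE = 64  # samples per time slice (assumed value)
HOP = 32          # samples between the starts of consecutive slices (assumed value)


def spectrogram(samples, window_size=WINDOW_SIZE, hop=HOP):
    """Return a list of time slices, each a list of magnitudes per frequency bin."""
    # A Hann window tapers each slice to reduce spectral leakage at its edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (window_size - 1))
              for n in range(window_size)]
    n_bins = window_size // 2 + 1
    slices = []
    for start in range(0, len(samples) - window_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(window_size)]
        # Naive discrete Fourier transform: fine for a demo, slow for real audio.
        magnitudes = []
        for k in range(n_bins):
            total = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / window_size)
                        for n in range(window_size))
            magnitudes.append(abs(total))
        slices.append(magnitudes)
    return slices


# Synthetic quarter-second tone at 1,000 Hz, sampled at 8,000 Hz (illustrative only).
SAMPLE_RATE = 8_000
TONE_HZ = 1_000
signal = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE) for t in range(2_000)]

spec = spectrogram(signal)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / WINDOW_SIZE

print(len(spec), "time slices,", len(spec[0]), "frequency bins")
print("strongest frequency in first slice:", peak_hz, "Hz")
```

Each slice becomes one column of the image and each frequency bin one row, so a sustained call shows up as a bright band that an image classifier can learn to recognise.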
DOC provided Qrious with 8,000 15-minute audio files as training data, each manually tagged as ‘kiwi’ or ‘non-kiwi’.
Using AWS technology, each recording was transformed into a 900-second spectrogram. Wherever a kiwi call was identified in a spectrogram, that section was cropped into a seven-second segment, and these segments became the training data.
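The cropping step can be sketched as follows. The text does not say how each seven-second segment was positioned around a call, so this example assumes the segment is centred on the tagged time and shifted inward at the ends of the recording. The function names and the `columns_per_second` parameter are hypothetical.

```python
RECORDING_SECONDS = 900  # one 15-minute recording, as in the text
SEGMENT_SECONDS = 7      # training segment length, as in the text


def segment_bounds(call_time_s):
    """Return (start, end) in seconds of a seven-second window around a tagged call.

    Assumption: the window is centred on the tag and shifted inward
    if it would otherwise run past either end of the recording.
    """
    start = call_time_s - SEGMENT_SECONDS / 2
    start = max(0.0, min(start, RECORDING_SECONDS - SEGMENT_SECONDS))
    return start, start + SEGMENT_SECONDS


def crop_columns(spectrogram, call_time_s, columns_per_second):
    """Crop the time axis of a spectrogram stored as rows of frequency bins."""
    start_s, end_s = segment_bounds(call_time_s)
    first = round(start_s * columns_per_second)
    last = round(end_s * columns_per_second)
    return [row[first:last] for row in spectrogram]


# Toy spectrogram: 2 frequency rows, 10 columns per second over 900 seconds.
toy = [[0.0] * 9_000 for _ in range(2)]
segment = crop_columns(toy, call_time_s=100, columns_per_second=10)

print(segment_bounds(100))  # window in the middle of the recording
print(segment_bounds(2))    # window clamped to the start
print(len(segment[0]), "columns in the cropped segment")
```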
Building the model
An image classification algorithm was then used to build the image recognition model, and a neural network model was trained to automatically classify a spectrogram as ‘kiwi’, ‘other bird’ or ‘background noise’.
To test the model, additional spectrograms were cropped into seven-second frames, which the model then sorted into ‘kiwi’ or ‘non-kiwi’ files based on how closely they matched the training data.
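The text mentions three classes during training and a two-way ‘kiwi’ or ‘non-kiwi’ split during testing, without saying how one maps to the other. One plausible reading, shown here purely as an assumption, is that each frame takes its highest-scoring class and anything other than ‘kiwi’ is treated as ‘non-kiwi’. The scores below are made up for illustration.

```python
LABELS = ("kiwi", "other bird", "background noise")  # classes named in the text


def to_binary_label(scores):
    """Pick the highest-scoring class and collapse it to 'kiwi' or 'non-kiwi'."""
    best = max(LABELS, key=lambda label: scores[label])
    return "kiwi" if best == "kiwi" else "non-kiwi"


# Hypothetical per-frame scores, standing in for real model output.
frames = [
    {"kiwi": 0.81, "other bird": 0.12, "background noise": 0.07},
    {"kiwi": 0.10, "other bird": 0.70, "background noise": 0.20},
    {"kiwi": 0.05, "other bird": 0.15, "background noise": 0.80},
]

predictions = [to_binary_label(frame) for frame in frames]
print(predictions)
```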
Automating the results
The tool can now automatically convert each recording into 128 cropped image frames and find kiwi sounds within them. If the model detects a kiwi call in one of the frames, it records when the call occurs and automatically converts that frame back into a sound snippet for DOC’s team to confirm.
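The figure of 128 frames is consistent with slicing a 900-second recording into consecutive, non-overlapping seven-second frames and discarding the short remainder. That layout is an assumption, as are the function names below. The sketch shows how a flagged frame index could be mapped back to a position in the original recording.

```python
RECORDING_SECONDS = 900  # one 15-minute recording, as in the text
FRAME_SECONDS = 7        # frame length, as in the text

# Assumption: frames do not overlap and the leftover seconds are dropped.
FRAME_COUNT = RECORDING_SECONDS // FRAME_SECONDS
LEFTOVER_SECONDS = RECORDING_SECONDS - FRAME_COUNT * FRAME_SECONDS


def frame_to_time_range(index):
    """Return the (start, end) offsets in seconds of the frame at a zero-based index."""
    if not 0 <= index < FRAME_COUNT:
        raise IndexError(f"frame index {index} is outside 0..{FRAME_COUNT - 1}")
    start = index * FRAME_SECONDS
    return start, start + FRAME_SECONDS


def format_mmss(seconds):
    """Format a whole number of seconds as MM:SS."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}:{secs:02d}"


start, end = frame_to_time_range(42)
print(FRAME_COUNT, "frames per recording,", LEFTOVER_SECONDS, "seconds left over")
print(f"frame 42 covers {format_mmss(start)} to {format_mmss(end)}")
```

With the start and end offsets in hand, the matching audio can be cut from the original file by multiplying each offset by the recording's sample rate.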
"The results are excellent. DOC can now consider integrating the machine learning model into its wider monitoring programmes, and this model will have significant impact on our efforts to help save our endangered birds."
- Gavin Walker, Chief Architect, DOC
As more data is introduced, the model will become more robust and accuracy will continue to increase. It can then be applied to the wider monitoring programme, or extended to other at-risk species and protected animals, saving DOC time and money in conservation efforts, and helping to save these iconic New Zealand birds.