
150 years down to 10 days - how the Bat Conservation Trust improved its data analysis vision with AWS

Madeline Bennett, July 10, 2024
Summary:
What would have taken 150 work years of effort can now be done in 10 days.


Bats might be most commonly associated with the vampire kind. But banish those thoughts of shape-shifting, blood-sucking creatures of the night from your mind. Of the 1,400 species of bat, only three actually drink blood.

Far from being creatures to be afraid of, bats play a key role in the natural world as seed dispersers, plant pollinators and insect controllers. It’s estimated that bats eat enough pests to save more than $1 billion every year in crop damage and pesticide costs in the US corn industry alone; across all agricultural production, that rises to more than $3 billion per year. 

To ensure bats can continue to play their vital role in our ecosystem, they need protection from a range of human-made threats, including habitat destruction, light pollution, wind turbines and forest fires.

That’s where the Bat Conservation Trust (BCT) comes in. Set up in 1990 as an umbrella organization for the 80-plus bat groups across the UK, the charity works to support the recovery and resilience of bat populations, identify threats, protect habitats and bust misconceptions.

To carry out this work, data collection and analysis are crucial. The BCT produces the official stats for bat species in the UK, such as the lesser horseshoe bat and common pipistrelle, including trends, population numbers and locations.

A lot of the Trust’s work centers on acoustic monitoring – recording and analyzing sound to understand which bat species are present in a particular area. While bat calls are usually pitched at too high a frequency for humans to hear naturally, they can be heard using a bat detector and recorded onto an SD card.

The primary detectors the BCT uses are static units about the size of a credit card, which cost around a tenth of the price of traditional bat detectors. This significant price drop has allowed the charity to expand both the number of detectors in use and the amount of data collected via citizen science surveys. Dr Lia Gilmour, Head of Conservation Projects at the Bat Conservation Trust, explains:

A lot of our work is citizen science, so using citizen scientists to collect vast amounts of data, and using that data to ask questions about population, monitoring how species are faring and predictions for the future.

Sound and vision 

One night’s worth of sound recording occupies around 24-25 gigabytes of data – which soon adds up with the hundreds of detectors dotted around. These vast amounts of data allow the BCT to better understand the threats bats are facing and how global change is going to impact different species. Gilmour adds:

We've got some bats moving north, and then we've also got maybe a mismatch - if there's going to be the insects or the diet that they need in those new places. It's important that we understand, to be able to do something about that.

This is where the latest technology from AWS is proving helpful, supporting auto-identification software that makes analyzing all this data far quicker and more efficient.

Traditionally, this analysis is carried out manually: someone first converts a bat call into an image called a spectrogram, then examines the characteristics of that image to identify the species of bat.
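As a rough illustration of that first step – turning a recording into a spectrogram – here is a minimal Python sketch using scipy and matplotlib. The file name and the assumption of a mono, high-sample-rate recording are purely illustrative:

```python
# Minimal sketch: turn a bat-detector recording into a spectrogram image.
# Assumes a mono WAV file; the file name is hypothetical, and bat detectors
# typically record at very high sample rates (e.g. 256 kHz) to capture
# ultrasonic calls.
import numpy as np
from scipy.io import wavfile
from scipy.signal import spectrogram
import matplotlib.pyplot as plt

sample_rate, audio = wavfile.read("bat_call.wav")

# Compute the spectrogram: how the frequency content changes over time.
freqs, times, power = spectrogram(audio, fs=sample_rate, nperseg=512, noverlap=384)

# Plot on a log scale so quieter harmonics remain visible.
plt.pcolormesh(times * 1000, freqs / 1000, 10 * np.log10(power + 1e-12), shading="auto")
plt.xlabel("Time (ms)")
plt.ylabel("Frequency (kHz)")
plt.title("Spectrogram of a bat call")
plt.savefig("bat_call_spectrogram.png", dpi=150)
```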

Thanks to an AWS UK Imagine Grant, the BCT has been able to develop a new version of its Sound Classification System (SCS) acoustic survey tool, which will fully automate the identification of bat calls from audio, removing the need for manual analysis.

The SCS is hosted on AWS. The BCT selected AWS because Martin Newman, its IT Research and Development Specialist, had worked with another client using the technology and felt it was a good match for the project.

Newman put together a proof of concept shortly before Covid hit, and it lasted the organization until last year. Since then, the BCT has been working with partner Lambert Labs to make the SCS truly cloud native. Newman adds:

What I did wasn't really cloud native and the benefits of making it cloud native have been immense in terms of efficiency, cost saving and ease of use.

Lambert Labs developed a scalable and cost-effective audio processing pipeline for the BCT. This was based on Newman’s existing proof of concept, which ran on non-scaling EC2 instances and needed to be re-implemented to handle over a hundred terabytes of input data.

Lambert Labs also assisted the BCT in applying for the AWS Imagine Grant, and deployed a serverless technology stack to handle large volumes of data without incurring high costs during idle periods. Newman adds:

One of the great things about serverless and AWS is it scales really easily. So our SCS works both for comparatively small-scale citizen science projects and very large-scale research projects. It will cope with both just as easily.

If you had to buy loads and loads of computers to do the analysis, it would not be economical. We collect the data over a comparatively small fraction of the year. That means that for a short period, we want an awful lot of processing power.
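As a rough sketch of how that kind of elastic, serverless fan-out can be wired together on AWS – an illustration only, not the BCT's actual stack – an S3 upload of a recording could trigger a Lambda function that queues one AWS Batch job per file. The bucket layout, queue and job-definition names below are hypothetical:

```python
# Hypothetical sketch of a serverless fan-out: an AWS Lambda function,
# triggered by S3 uploads, that queues one AWS Batch job per recording.
# Queue and job-definition names are made up for illustration.
import re
import boto3

batch = boto3.client("batch")

def handler(event, context):
    """Handle S3 'ObjectCreated' events for newly uploaded detector recordings."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Queue a Batch job that runs the classifier container against this file.
        # AWS Batch scales the underlying compute up for a burst of jobs and
        # back down to nothing when the queue is empty.
        batch.submit_job(
            jobName=re.sub(r"[^A-Za-z0-9_-]", "-", key)[:128],
            jobQueue="scs-processing-queue",       # hypothetical queue name
            jobDefinition="scs-classifier",        # hypothetical job definition
            containerOverrides={
                "environment": [
                    {"name": "INPUT_BUCKET", "value": bucket},
                    {"name": "INPUT_KEY", "value": key},
                ]
            },
        )
    return {"submitted": len(event["Records"])}
```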

Timescales

For the largest project the BCT ran last year, it collected 66 terabytes of data. By pushing that through AWS using serverless techniques, the BCT was able to process it all in just 10 days. Newman explains:

That involved running 300,000 batch jobs to achieve that. To have listened to the recordings in the way that you would have to do human identification on that - because there are ways, even though it's high frequency, that you can bring the frequency down so you can listen to it - it would have taken around 150 work years of effort to have done that. We've gone from 150 years down to 10 days, which makes the whole thing feasible. Technology has helped us in every step of the way to make this happen.

The BCT has also been working with UCL and Edinburgh University to use machine learning to improve the classifier aspect of bat detection. The resulting BatDetect2 technology uses BCT training data to help determine whether there's a bat call in the audio recording, and then identify the particular bat species.
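The BatDetect2 code itself isn't shown here; the short Python sketch below is just a schematic of that two-stage idea – first decide whether a window of audio contains a bat call at all, then assign a species to the windows that do – using made-up model outputs rather than BatDetect2's actual models or API:

```python
# Schematic of the detect-then-classify flow: NOT BatDetect2's actual code,
# just an illustration of the two-stage idea with made-up model outputs.
import numpy as np

SPECIES = ["common pipistrelle", "lesser horseshoe", "noctule"]  # illustrative subset

def detect_and_classify(detection_scores, species_probs, threshold=0.5):
    """detection_scores: (n_windows,) call-presence scores from a detector model.
    species_probs: (n_windows, n_species) per-window species probabilities.
    Returns (window_index, species, confidence) for each detected call."""
    results = []
    for i, score in enumerate(detection_scores):
        if score < threshold:
            continue  # stage 1: no bat call in this window
        best = int(np.argmax(species_probs[i]))  # stage 2: most likely species
        results.append((i, SPECIES[best], float(species_probs[i, best])))
    return results

# Toy example with invented scores:
scores = np.array([0.1, 0.92, 0.87])
probs = np.array([[0.3, 0.4, 0.3], [0.8, 0.1, 0.1], [0.2, 0.7, 0.1]])
print(detect_and_classify(scores, probs))
```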

With the new tech in place, the BCT will be able to make use of previously untapped information to discover more crucial details about bat populations and trends. Gilmour explains:

Before we were relying on small amounts of data to make quite big inferences about populations, but we've got potential to be able to tap into information that we haven't done before.

As an example, the BCT is currently looking into whether it’s possible to use acoustics to identify whether there are breeding bats in a certain location. Gilmour says:

If we can do that without intervention or invasive techniques, just by eavesdropping, that will be amazing because we'll be able to understand if there's a maternity roost and breeding females, so we know that species is doing really well in that area. There are so many questions that can be asked of this acoustic data, it's just learning how to mine that and asking the right questions.

Next, the BCT is going to explore adding a soundscape algorithm to the SCS. This will let it analyze the soundscapes of forest habitats and see whether the ratio of human-made to natural sounds gives an indication of a habitat's health. Gilmour adds:

The three-year vision, if we get to the end, is to be as fully automated as possible to reduce staff costs. A lot of what AWS has started doing has stopped us doing it manually, which costs us, as a charity, lots of money. Running the data through AWS isn't that expensive, it's a fraction of the cost so the more we can automate that, the better.
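One established way to quantify the kind of ratio Gilmour describes is a soundscape index such as the Normalized Difference Soundscape Index (NDSI), which compares acoustic energy in frequency bands typically dominated by human-made noise with energy in bands typically dominated by wildlife. Whether the BCT's soundscape algorithm will take this form isn't stated; the Python sketch below simply illustrates the idea, with band limits and the file name chosen for illustration:

```python
# Illustrative sketch of an NDSI-style soundscape measure: compare acoustic
# energy in a band typically dominated by human-made noise (anthrophony)
# with a band typically dominated by wildlife (biophony).
# Band limits and the file name are assumptions, not the BCT's settings.
import numpy as np
from scipy.io import wavfile
from scipy.signal import welch

def ndsi(audio, sample_rate, anthro_band=(1000, 2000), bio_band=(2000, 11000)):
    freqs, psd = welch(audio, fs=sample_rate, nperseg=4096)
    anthro = psd[(freqs >= anthro_band[0]) & (freqs < anthro_band[1])].sum()
    bio = psd[(freqs >= bio_band[0]) & (freqs < bio_band[1])].sum()
    # Ranges from -1 (all human-made noise) to +1 (all natural sound).
    return (bio - anthro) / (bio + anthro)

sample_rate, audio = wavfile.read("forest_soundscape.wav")  # hypothetical mono recording
print(f"NDSI: {ndsi(audio.astype(np.float64), sample_rate):.2f}")
```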

While the BCT is keen to automate bat monitoring and identification as much as possible, Newman notes that it will still need some human involvement, for instance to retrain the algorithms. He adds:

The machines are learning from us and there has to be an us.
