Gunshot Detection from Audio Streams in Portable Devices

Bokelund, Linnea; Grane, Ellen

Bokelund, Linnea ^LU and Grane, Ellen ^LU (2022) EITM01 20221
Department of Electrical and Information Technology

Abstract: Machine learning and artificial neural networks can be used to classify or detect specific sound events in audio signals. Gunshot detection is one use case for such networks and can be used to help law enforcement by alerting officers or triggering camera recordings.

However, artificial neural networks with a high performance usually require large amounts of computational power, meaning that they do not work on smaller portable devices. This thesis shows that a small convolutional neural network (CNN) can be used for real-time gunshot detection on a portable camera without requiring too much memory, battery consumption, or CPU power.

We implemented a CNN with four layers and 100k trainable parameters to detect gunshots. We could reach an average precision of 0.98 and an F1 score of 0.95. We benchmarked the runtime performance of this architecture on the Axis Body Worn Cameras (BWCs). For real-time gunshot detection, our system uses 11.9 MB RAM and requires 4.9 MB persistent memory; it decreases the battery time by only 8.4% and uses approximately 11.5% of the CPU. With our configuration, the real-time detection has a latency of 3.6 seconds on the BWC.
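
As a reminder of how an F1 score such as the one quoted above relates to precision and recall, here is a minimal Python sketch. The confusion counts are invented for illustration and are not taken from the thesis, and the function name `precision_recall_f1` is hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts (not from the thesis): 95 detected gunshots, 5 false alarms, 5 misses.
p, r, f1 = precision_recall_f1(tp=95, fp=5, fn=5)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Because F1 is a harmonic mean, it stays high only when both precision and recall are high, which is why it is a common single-number summary for detection tasks.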

The results of our Master’s thesis show that audio-based gunshot detection on portable devices is indeed viable. We hope it will encourage the research on simpler features for audio classification.

Popular Abstract: When a threatening situation suddenly arises, it may be difficult to quickly get an opportunity to alert the authorities. During such circumstances, it would be beneficial if a nearby electronic device could be set to automatically detect the threat, and trigger appropriate actions to get help.

In this master's thesis, we focus on detecting gunshot sounds on the Axis Body Worn Camera (BWC) that is used by police officers and guards. It is not hard to imagine that in a situation where shots are fired, the police officer may not be able to start a recording manually. If the camera could identify gunshots, it could trigger a recording automatically and thus ensure that evidence is captured. The BWC has a prebuffer functionality that allows the last 90 seconds before a recording is started to be included in the resulting video, meaning that the events leading up to the gunshot will also be included.
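
One common way to realise a prebuffer of this kind is a fixed-length ring buffer that always holds the most recent frames. The sketch below illustrates the idea in Python with `collections.deque`. It is not the Axis implementation; the class name `PreBuffer`, the frame rate, and the use of integers as stand-in frames are all assumptions.

```python
from collections import deque

class PreBuffer:
    """Keep only the most recent `seconds` worth of frames (hypothetical helper)."""

    def __init__(self, seconds, frames_per_second):
        # A deque with maxlen silently drops the oldest item once it is full.
        self._frames = deque(maxlen=seconds * frames_per_second)

    def push(self, frame):
        self._frames.append(frame)

    def flush(self):
        """Return the buffered frames, oldest first, and clear the buffer."""
        frames = list(self._frames)
        self._frames.clear()
        return frames

# 90-second prebuffer as described in the text; one frame per second is an
# assumption chosen to keep the example small.
buffer = PreBuffer(seconds=90, frames_per_second=1)
for t in range(200):          # simulate 200 seconds of footage
    buffer.push(t)

recording = buffer.flush()    # a detection at t = 199 starts the recording
print(recording[0], recording[-1], len(recording))
```

When a detection fires, flushing the buffer yields the 90 seconds that preceded it, so the events leading up to the trigger become the start of the recording.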

Gunshot detection has been done before, but often the detection is performed on a powerful device such as a server. Our target device is a small, portable camera with limited battery life, CPU power, and memory. Therefore, our main challenge was to find a solution that uses as little energy and memory as possible while still identifying gunshots with high accuracy.

Within machine learning, image recognition has been researched and developed widely over the last decades. Recognition of particular sounds in audio recordings has not been explored to the same extent. However, by transforming the audio signals into spectrograms that can be viewed as image representations of the signals, the advances in image recognition can be applied to sound identification as well. A common technique used for image recognition is convolutional neural networks (CNNs), and this is what we use for our gunshot detection.
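
To make the audio-to-image step concrete, the following Python sketch computes a magnitude spectrogram by slicing a signal into overlapping, Hann-windowed frames and taking a discrete Fourier transform of each frame. It is a minimal illustration only: the thesis's actual features and parameters are not given here, so the sample rate, frame length, hop size, and the function name `spectrogram` are assumptions, and a practical implementation would use an FFT library rather than this naive DFT.

```python
import cmath
import math

def spectrogram(signal, frame_len, hop):
    """Magnitude spectrogram via a naive DFT of Hann-windowed frames."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        # Keep only the non-negative frequency bins 0 .. frame_len / 2.
        spectrum = [
            abs(sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(frame_len // 2 + 1)
        ]
        frames.append(spectrum)
    return frames  # frames[t][k]: time index t, frequency bin k

# Assumed test signal: a 1000 Hz tone sampled at 8000 Hz (values chosen for illustration).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(256)]
spec = spectrogram(tone, frame_len=64, hop=32)

# The strongest bin in each frame should sit at the tone's frequency.
peak_bins = [max(range(len(frame)), key=frame.__getitem__) for frame in spec]
peak_hz = [k * sample_rate / 64 for k in peak_bins]
print(len(spec), len(spec[0]), peak_hz[0])
```

The result is a two-dimensional grid of time frames by frequency bins, which is what allows image-oriented models such as CNNs to be applied to sound.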

A very important aspect of machine learning is the data. Regardless of the chosen technique, the model will need a lot of data to train on. In our case, we needed data in the form of sound recordings containing gunshots. We also needed other sounds, for the model to learn what a gunshot does not sound like. We used a public dataset called FSD50K containing tens of thousands of clips with sounds of people, traffic, animals, and much more. However, this dataset did not have enough gunshot sounds. Therefore, we recorded our own dataset in cooperation with a local pistol club.

By implementing a small CNN that we trained on the FSD50K dataset and our gunshot sounds, we were able to create a prototype that works on the BWC. With a smaller CNN, the computations needed to analyze one audio clip are reduced, and thereby also the energy consumption. We found that a small CNN is sufficient to detect gunshot sounds in audio signals, and thus our conclusion is that it is indeed viable to implement machine-learning-based gunshot detection on a small device such as the BWC.
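
To illustrate why a smaller network needs fewer computations per clip, the following Python sketch counts the trainable parameters and multiply-accumulate operations of a single 2D convolution layer. The layer shapes are hypothetical and do not describe the architecture used in the thesis; the function name `conv2d_cost` is likewise an assumption.

```python
def conv2d_cost(in_ch, out_ch, kernel, out_h, out_w):
    """Return (parameters, multiply-accumulates) for one 2D convolution layer."""
    weights_per_filter = kernel * kernel * in_ch
    params = (weights_per_filter + 1) * out_ch          # +1 for each filter's bias
    macs = weights_per_filter * out_ch * out_h * out_w  # each filter is applied once per output position
    return params, macs

# Hypothetical layer shapes, chosen only for illustration.
small = conv2d_cost(in_ch=1, out_ch=16, kernel=3, out_h=64, out_w=64)
large = conv2d_cost(in_ch=1, out_ch=64, kernel=3, out_h=64, out_w=64)
print(small, large)
```

In this example, quadrupling the number of filters quadruples both the parameter count and the work per input, which is the kind of trade-off that matters on a battery-powered device.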

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/9090317

author

Bokelund, Linnea ^LU and Grane, Ellen ^LU

supervisor

Pierre Nugues ^LU

organization

Department of Electrical and Information Technology

course

EITM01 20221

year

2022

type

H2 - Master's Degree (Two Years)

subject

Technology and Engineering

keywords

machine learning, gunshot detection, convolutional neural networks, sound event detection, portable devices

report number

LU/LTH-EIT 2022-873

language

English

id

9090317

date added to LUP

2022-06-21 10:40:45

date last changed

2022-06-21 10:40:45

@misc{9090317,
  abstract     = {{Machine learning and artificial neural networks can be used to classify or detect specific sound events in audio signals. Gunshot detection is one use case for such networks and can be used to help law enforcement by alerting officers or triggering camera recordings.

However, artificial neural networks with a high performance usually require large amounts of computational power, meaning that they do not work on smaller portable devices. This thesis shows that a small convolutional neural network (CNN) can be used for real-time gunshot detection on a portable camera without requiring too much memory, battery consumption, or CPU power.

We implemented a CNN with four layers and 100k trainable parameters to detect gunshots. We could reach an average precision of 0.98 and an F1 score of 0.95. We benchmarked the runtime performance of this architecture on the Axis Body Worn Cameras (BWCs). For real-time gunshot detection, our system uses 11.9 MB RAM and requires 4.9 MB persistent memory; it decreases the battery time by only 8.4% and uses approximately 11.5% of the CPU. With our configuration, the real-time detection has a latency of 3.6 seconds on the BWC.

The results of our Master’s thesis show that audio-based gunshot detection on portable devices is indeed viable. We hope it will encourage the research on simpler features for audio classification.}},
  author       = {{Bokelund, Linnea and Grane, Ellen}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Gunshot Detection from Audio Streams in Portable Devices}},
  year         = {{2022}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
