The Sound of Identity: Deep Learning vs. Classic ML for Handclap Biometrics
(2026) BMEM05 20261
Division for Biomedical Engineering
- Abstract
- Various devices rely on invasive biometric identifiers, such as voice and facial recognition, raising privacy concerns. Acoustic handclap control presents a privacy-preserving alternative for interaction with computers, as it does not capture speech or visual data. An analysis of the original implementation suggested that existing classifiers were influenced by environmental bias. With the objective of improving performance and reducing environmental bias, this thesis compares two distinct approaches: a classical machine learning approach that applies classical models to extracted features, and a deep learning approach employing a modified ResNet-18 neural network trained on mel spectrograms. To isolate the handclap from room reverberation, the study introduced a preprocessing step that truncates the audio to 20 ms, compared with the approximately 1 second of data used in the previous study. Furthermore, the system architecture was decoupled into distinct handclap detection and biometric classification stages with the goal of improving performance.
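The 20 ms truncation described above can be illustrated with a minimal sketch. The abstract does not say how the start of the clap is located, so the threshold-based onset rule, the function name `truncate_clap`, and the `threshold_ratio` parameter are all assumptions made for illustration, not the authors' method.

```python
def truncate_clap(samples, sample_rate, window_ms=20.0, threshold_ratio=0.5):
    """Cut a fixed-length window starting at an estimated clap onset.

    Assumption: the onset is the first sample whose absolute amplitude
    reaches `threshold_ratio` times the peak amplitude. The thesis abstract
    does not specify the onset rule; this one is hypothetical.
    """
    # Number of samples that make up the requested window length.
    length = int(round(sample_rate * window_ms / 1000.0))
    if not samples:
        return [0.0] * length

    peak = max(abs(s) for s in samples)
    if peak == 0:
        onset = 0  # silent input: no onset to find, start at the beginning
    else:
        onset = next(
            i for i, s in enumerate(samples) if abs(s) >= threshold_ratio * peak
        )

    window = list(samples[onset:onset + length])
    # Zero-pad when the recording ends before the window is full.
    window.extend([0.0] * (length - len(window)))
    return window


# Synthetic example: silence, a sharp impulse, then a quiet tail.
signal = [0.0] * 100 + [1.0, 0.8, 0.6] + [0.01] * 1000
clip = truncate_clap(signal, sample_rate=16000)
```

At an assumed sample rate of 16 kHz, a 20 ms window is 320 samples, against roughly 16,000 samples for the 1 second clips used previously, so most of the reverberant tail falls outside the window.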
The offline evaluation demonstrated that the deep learning approach outperformed the classical machine learning approach in both biometrics and clap configuration classification. While the 20 ms truncation showed a significant reduction in environmental bias, this dependency persisted, with classifiers continuing to associate user identities with specific room acoustics. Furthermore, real-time testing revealed high model sensitivity to unclassified impulsive noises, and the memory footprint of the ResNet-18 presents a significant constraint for devices with limited memory. Decoupling the system also did not show any meaningful improvements. However, the impact of these suggested architectural changes was likely fundamentally limited by the restricted, environmentally biased dataset. These findings demonstrate that while biometric identification via handclaps is possible, it remains a highly complex task which requires a large and diverse dataset.
- Popular Abstract (Swedish)
- Can a handclap identify you?
Today we use technologies such as facial recognition and fingerprints to, for example, unlock a mobile phone or authorize payments. These methods are convenient, but they collect permanent identifying characteristics that can be misused if they fall into the wrong hands. This thesis investigated a more privacy-friendly alternative: identifying people based on their handclaps.
Everyone claps differently. The shape of the hand, the contact between the fingers, and the force behind the clap create a unique acoustic signature. The question is whether a computer model can learn to recognize this signature and link it to the right person. Earlier studies have presented promising results but have shown signs that the model cheated. Instead of learning the clap itself, it learned how the room or the microphone sounds, a phenomenon known as environmental bias.
This study compared two methods: a classical machine learning method in which relevant features of the sound were selected manually, and a deep learning method in which a neural network learned for itself what was important.
Two strategies were used to investigate environmental bias. The first was an experiment that examined how often the model guessed that a person was another person recorded on the same day. In the second experiment, Grad-CAM, a tool that creates heat maps to show exactly what the model is looking at, was used to analyze which components of the input had the most influence on the model's decisions.
The results of these analyses indicated that environmental bias was present in the data, which had also been used in an earlier work. To counteract this, the audio clips were trimmed to 20 milliseconds, enough to capture the clap itself while excluding the room context. The classical models were then compared against the neural network.
Deep learning performed better than the classical machine learning methods. Although the trimming reduced the influence of the room, there were still signs of residual environmental bias. The conclusion was that biometric identification via handclaps is possible but requires a large and varied dataset to work reliably in all environments.
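The same-day experiment mentioned in the popular abstract can be sketched as a simple error statistic. This is one possible reading, not the authors' procedure: the function name `same_day_error_share` is hypothetical, and the sketch assumes each person was recorded on a single day, which the record does not state.

```python
def same_day_error_share(true_ids, predicted_ids, recording_day):
    """Share of misclassifications that land on a same-day person.

    Assumption: `recording_day` maps each person to exactly one recording
    day. Returns None when there are no misclassifications to analyze.
    """
    # Keep only the predictions where the model named the wrong person.
    errors = [(t, p) for t, p in zip(true_ids, predicted_ids) if t != p]
    if not errors:
        return None
    # Count errors where the wrong guess was recorded on the same day.
    same_day = sum(1 for t, p in errors if recording_day[t] == recording_day[p])
    return same_day / len(errors)


# Made-up labels for illustration only; not data from the thesis.
recording_day = {"a": 1, "b": 1, "c": 2}
true_ids = ["a", "a", "b", "c", "c"]
predicted_ids = ["a", "b", "a", "a", "c"]
share = same_day_error_share(true_ids, predicted_ids, recording_day)
```

A share well above what the recording schedule alone would produce by chance would point to the model relying on session or room cues rather than on the clap itself.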
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/student-papers/record/9241006
- author
- Lundager, Mark Peter LU and Perlerot, Moa LU
- supervisor
- organization
- alternative title
- Handklappens Identitet: En Jämförelse mellan Djupinlärning och Klassisk Maskininlärning i Biometrisk Identifiering
- course
- BMEM05 20261
- year
- 2026
- type
- H2 - Master's Degree (Two Years)
- subject
- keywords
- Deep Learning, Classical Machine Learning, Feature Engineering, Biometrics
- language
- English
- additional info
- 2026-13
- id
- 9241006
- date added to LUP
- 2026-06-29 08:53:46
- date last changed
- 2026-06-29 08:53:46
@misc{9241006,
abstract = {{Various devices rely on invasive biometric identifiers, such as voice and facial recognition, raising privacy concerns. Acoustic handclap control presents a privacy-preserving alternative for interaction with computers, as it does not capture speech or visual data. An analysis of the original implementation suggested that existing classifiers were influenced by environmental bias. With the objective of improving performance and reducing environmental bias, this thesis compares two distinct approaches: a classical machine learning approach utilizing classical models with feature extraction, and a deep learning approach employing a modified ResNet-18 neural network trained on mel spectrograms. To isolate the handclap from room reverberation, the study introduced a 20 ms audio truncation preprocessing step in comparison to the approximately 1 second of data utilized in the previous study. Furthermore, the system architecture was decoupled into distinct handclap detection and biometric classification stages with the goal of improving performance.
The offline evaluation demonstrated that the deep learning approach outperformed the classical machine learning approach in both biometrics and clap configuration classification. While the 20 ms truncation showed a significant reduction in environmental bias, this dependency persisted, with classifiers continuing to associate user identities with specific room acoustics. Furthermore, real-time testing revealed high model sensitivity to unclassified impulsive noises, and the memory footprint of the ResNet-18 presents a significant constraint for devices with limited memory. Decoupling the system also did not show any meaningful improvements. However, the impact of these suggested architectural changes was likely fundamentally limited by the restricted, environmentally biased dataset. These findings demonstrate that while biometric identification via handclaps is possible, it remains a highly complex task which requires a large and diverse dataset.}},
author = {{Lundager, Mark Peter and Perlerot, Moa}},
language = {{eng}},
note = {{Student Paper}},
title = {{The Sound of Identity: Deep Learning vs. Classic ML for Handclap Biometrics}},
year = {{2026}},
}