A self-calibrating system for finger tracking using sound waves

Hammarlund, Linus LU (2017) In Master's Theses in Mathematical Sciences FMA820 20152
Mathematics (Faculty of Engineering)
Abstract
In this thesis a system for tracking the fingers of a user using sound waves is developed. The proposed solution is to attach a small speaker to each finger and then have a number of microphones placed ad hoc around a computer monitor listening to the speakers. The system should then be able to track the positions of the fingers so that the coordinates can be mapped to the computer monitor and be used for human-computer interfacing. The thesis focuses on the proof-of-concept of the system. The system pipeline consists of three parts: signal processing, system self-calibration and real-time sound source tracking.

In the signal processing step, four different signal methods are constructed and evaluated. It is shown that multiple signals can be used in parallel. The best-performing signal method uses a sum of damped sine waves, each with a different frequency within a specified frequency band. The goal was to use ultrasonic frequency bands for the system, but experiments showed that they gave rise to substantial aliasing, rendering the higher frequency bands unusable.
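As a rough illustration of such a signal, the sketch below sums exponentially damped sine waves whose frequencies are spread over a band. The function names, the band (2 to 4 kHz), the pulse length, the decay rate and both sample rates are assumptions chosen for the example; the abstract does not state the values used in the thesis. The second helper shows one way aliasing can arise: a tone above half the sample rate folds back to a lower apparent frequency.

```python
import numpy as np

def damped_multitone(freqs_hz, duration_s, fs_hz, decay_per_s):
    """Sum of exponentially damped sine waves, one per frequency, peak-normalised."""
    t = np.arange(int(round(duration_s * fs_hz))) / fs_hz
    envelope = np.exp(-decay_per_s * t)                    # shared decay envelope
    tones = np.sin(2.0 * np.pi * np.outer(freqs_hz, t))    # shape (n_freqs, n_samples)
    signal = envelope * tones.sum(axis=0)
    return signal / np.max(np.abs(signal))

def aliased_frequency(f_hz, fs_hz):
    """Apparent frequency of a real tone at f_hz after sampling at fs_hz."""
    return abs(f_hz - fs_hz * round(f_hz / fs_hz))

# Hypothetical parameters: five tones in a 2-4 kHz band, 10 ms long, 48 kHz sampling.
fs = 48_000
pulse = damped_multitone(np.linspace(2_000, 4_000, 5), 0.010, fs, decay_per_s=400.0)

# Assumed example: a 30 kHz tone sampled at 44.1 kHz folds down into the audible range.
folded = aliased_frequency(30_000, 44_100)
```

Giving each speaker its own set of frequencies is one conceivable way to keep parallel signals separable, although the abstract does not say how the thesis achieves this.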

The second step, system self-calibration, performs a scene reconstruction to find the positions of the microphones and the sound source path using only the received signals. First, the time-difference-of-arrival (TDOA) values are estimated using robust techniques centred around GCC-PHAT. The time offsets are then estimated in order to convert the TDOA problem into a time-of-arrival (TOA) problem, so that the positions of the receivers and sound events can be calculated. Finally, a "virtual screen" is fitted to the sound source path to be used for coordinate projection.
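For readers unfamiliar with GCC-PHAT, the following is a minimal textbook-style sketch of a TDOA estimate between two microphone signals. It is not the thesis implementation and omits the robustness measures mentioned above; the function name, the synthetic noise burst and the 48 kHz sample rate are assumptions made for the example.

```python
import numpy as np

def gcc_phat(sig, ref, fs_hz, max_tau_s=None):
    """Estimate the delay of `sig` relative to `ref`, in seconds, using GCC-PHAT."""
    n = len(sig) + len(ref)                      # zero-pad to avoid circular wrap-around
    cross = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    cross /= np.abs(cross) + 1e-12               # PHAT weighting: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    if max_tau_s is not None:
        max_shift = min(int(fs_hz * max_tau_s), max_shift)
    lags = np.arange(-max_shift, max_shift + 1)  # candidate delays in samples
    best = lags[np.argmax(np.abs(cc[lags % n]))]
    return best / fs_hz

# Synthetic check: the second channel is the first one delayed by 25 samples.
rng = np.random.default_rng(0)
fs = 48_000
ref = rng.standard_normal(4096)
delayed = np.concatenate((np.zeros(25), ref[:-25]))
tau = gcc_phat(delayed, ref, fs)
```

A positive result means `sig` arrives later than `ref`. Passing `max_tau_s` restricts the search to physically plausible delays, for example the largest microphone spacing divided by the speed of sound.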

The scene reconstruction was successful in 80 % of the test cases, in the sense that it produced estimates of the spatial positions at all. For the successful test cases, the microphone position estimates had errors of 11.8 ± 5 centimetres on average, which is worse than the results presented in previous research. However, the best test case outperformed the results of another paper. The newly developed technique for finding the virtual screen was far from robust and found a reasonable virtual screen in only 12.5 % of the test cases.
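The abstract does not describe how the virtual screen is fitted. Purely as a generic baseline, and not as the technique developed in the thesis, a plane can be fitted to a reconstructed 3-D source path by least squares (via the singular value decomposition) and the points projected onto it to obtain 2-D coordinates. All names and the synthetic path below are hypothetical.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through 3-D points: returns (centroid, u, v, normal)."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # Rows of vt are orthonormal directions ordered by decreasing spread of the data.
    _, _, vt = np.linalg.svd(pts - centroid, full_matrices=False)
    u, v, normal = vt          # two in-plane axes, then the direction of least spread
    return centroid, u, v, normal

def project_to_plane(points, centroid, u, v):
    """2-D coordinates of the points in the plane's (u, v) basis."""
    rel = np.asarray(points, dtype=float) - centroid
    return np.column_stack((rel @ u, rel @ v))

# Synthetic source path lying in the plane z = 0.4 m (assumed data for illustration).
rng = np.random.default_rng(1)
path = np.column_stack((rng.uniform(0, 0.5, 50),
                        rng.uniform(0, 0.3, 50),
                        np.full(50, 0.4)))
centroid, u, v, normal = fit_plane(path)
screen_xy = project_to_plane(path, centroid, u, v)
```

The in-plane axes returned by the decomposition have arbitrary orientation and sign, so mapping them to monitor coordinates would still require an additional alignment step.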

In the third step, the sound events were estimated one at a time using the SRP-PHAT method with the CFRC improvement. Unfortunate choices of search volumes made the calculations very computationally heavy. The results were comparable to those of the system self-calibration when using the same data and the estimated microphone positions.
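To make the cost of the search volume concrete, here is a minimal sketch of plain SRP-PHAT over a fixed grid of candidate positions, without the CFRC refinement used in the thesis. Every candidate point is scored against every microphone pair, so the work grows with the number of grid points. The microphone layout, grid, speed of sound and sample rate are assumptions for the example, and the one-sample tolerance window is a simplification to absorb rounding of delays.

```python
import itertools
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def phat_correlation(a, b, n):
    """PHAT-weighted circular cross-correlation of a against b, length n."""
    cross = np.fft.rfft(a, n=n) * np.conj(np.fft.rfft(b, n=n))
    cross /= np.abs(cross) + 1e-12
    return np.fft.irfft(cross, n=n)

def srp_phat(signals, mic_pos, candidates, fs_hz):
    """Return the candidate point with the highest steered response power."""
    n = 2 * signals.shape[1]
    # Distance from every candidate to every microphone: shape (n_candidates, n_mics).
    dists = np.linalg.norm(candidates[:, None, :] - mic_pos[None, :, :], axis=2)
    power = np.zeros(len(candidates))
    for i, j in itertools.combinations(range(len(mic_pos)), 2):
        cc = phat_correlation(signals[i], signals[j], n)
        # Expected delay of mic i relative to mic j, in samples, for each candidate.
        lags = np.rint((dists[:, i] - dists[:, j]) / SPEED_OF_SOUND * fs_hz).astype(int)
        for k in (-1, 0, 1):   # tolerate one sample of rounding error
            power += cc[(lags + k) % n]
    return candidates[np.argmax(power)]

# Synthetic scene: four microphones, one noise burst emitted from a known grid point.
rng = np.random.default_rng(0)
fs = 48_000
mics = np.array([[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [0.0, 0.4, 0.0], [0.5, 0.4, 0.1]])
source = np.array([0.3, 0.1, 0.4])
burst = rng.standard_normal(1024)
delays = np.rint(np.linalg.norm(mics - source, axis=1) / SPEED_OF_SOUND * fs).astype(int)
signals = np.zeros((len(mics), 2048))
for row, d in zip(signals, delays):
    row[d:d + len(burst)] = burst          # each channel is the burst shifted by its delay
grid = np.array(list(itertools.product(np.linspace(0.0, 0.5, 6),
                                       np.linspace(0.0, 0.4, 5),
                                       np.linspace(0.2, 0.6, 5))))
estimate = srp_phat(signals, mics, grid, fs)
```

Region-contraction schemes such as CFRC aim to avoid evaluating a dense grid over the whole volume by repeatedly shrinking the search region around the most promising points.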
author
Hammarlund, Linus LU
supervisor
organization
course
FMA820 20152
year
type
H2 - Master's Degree (Two Years)
subject
keywords
cfrc, gcc-phat, srp-phat, self-calibration, tdoa, toa
publication/series
Master's Theses in Mathematical Sciences
report number
LUTFMA-3333-2017
ISSN
1404-6342
other publication id
2017:E63
language
English
id
8926259
date added to LUP
2017-09-29 15:48:19
date last changed
2017-09-29 15:48:19
@misc{8926259,
  author       = {Hammarlund, Linus},
  issn         = {1404-6342},
  keyword      = {cfrc,gcc-phat,srp-phat,self-calibration,tdoa,toa},
  language     = {eng},
  note         = {Student Paper},
  series       = {Master's Theses in Mathematical Sciences},
  title        = {A self-calibrating system for finger tracking using sound waves},
  year         = {2017},
}