From Sound to Structure : Robust Localization of Sources, Sensors, and Surroundings
(2025)
- Abstract
- The topic of this thesis is how to infer geometric information using sound data. Achieving this requires solving several subproblems. First, signal processing of the recorded sound is needed to compute measurements of primitive geometric relations. Secondly, robust estimation is needed to go from primitive geometric measurements to more useful higher-level information such as the locations of microphones and sound sources.
In the case of an uncontrolled sound source, one of the main ways of extracting geometric information comes from computing the time between a sound arriving at each of two microphones. This measurement is referred to as the Time-Difference-of-Arrival (TDOA) and it defines a hyperboloid relative to the two microphones, on which the sound source must lie. While classical correlation-based techniques exist for how to compute the TDOA from two recordings, they typically struggle in reverberant environments where the two signals are not just shifted noisy versions of each other. One of the results of this thesis is showing that better time-delay estimation can be performed by using a learning-based approach. The main issue with using a learning-based approach in this domain is a lack of data. However, this thesis demonstrates that it is possible to solve this issue by utilizing simulations of sound propagation to create synthetic data. This data can then be used to train an energy-based model, which demonstrates improved performance on real data compared to classical methods.
After computing primitive geometric relationships from the sensor data, the goal is to convert them into more useful higher-level information such as the locations of microphones and sound sources. The main problem here lies in that a fraction of the measurements are outliers which means that robust estimation methods such as RANSAC (a hypothesis-and-test framework) need to be used. Since the speed of hypothesis creation is key when using RANSAC, this thesis shows how to construct new minimal solvers for several problems. One example is that we show that sensor network self-calibration in the presence of a reverberant plane allows for minimal problems containing fewer microphones than in the echo-free case.
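As an illustration of the classical correlation-based TDOA estimate that the abstract contrasts with the learning-based approach, the following is a minimal sketch and not the thesis's method. It assumes two equally sampled mono recordings, a plain cross-correlation peak (no reverberation handling), and a nominal speed of sound of 343 m/s; the function names `estimate_tdoa` and `range_difference` are hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed nominal value for air at room temperature


def estimate_tdoa(x1, x2, fs):
    """Estimate the delay of x2 relative to x1 in seconds.

    Positive values mean the sound reached microphone 2 later than microphone 1.
    This is a plain cross-correlation peak search, which works for shifted noisy
    copies of one signal and degrades in reverberant rooms.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    # Full cross-correlation; output index 0 corresponds to lag -(len(x1) - 1).
    cc = np.correlate(x2, x1, mode="full")
    lag = int(np.argmax(cc)) - (len(x1) - 1)
    return lag / fs


def range_difference(tau):
    """Convert a TDOA in seconds to a distance difference in metres.

    The source s then satisfies ||s - m2|| - ||s - m1|| = range_difference(tau),
    which is one sheet of a hyperboloid with the microphones m1, m2 as foci.
    """
    return SPEED_OF_SOUND * tau
```

For example, calling `estimate_tdoa(x1, x2, fs=16000)` on two recordings of the same source gives a delay that `range_difference` turns into the constant defining the hyperboloid on which the source must lie.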
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/8fa24dae-dab9-4405-abea-581777aee9e1
- author
- Tegler, Erik LU
- supervisor
- Kalle Åström LU
- Magnus Oskarsson LU
- Viktor Larsson LU
- Fredrik Tufvesson LU
- Bo Bernhardsson LU
- opponent
- Prof. Virtanen, Tuomas, Tampere University, Finland.
- organization
- alternative title
- Från ljud till struktur : Robust lokalisering av ljudkällor, sensorer och omgivande miljö
- publishing date
- 2025-12-04
- type
- Thesis
- publication status
- published
- subject
- pages
- 91 pages
- publisher
- Centre for Mathematical Sciences, Lund University
- defense location
- Lecture Hall MH:Hörmander, Centre for Mathematical Sciences, Märkesbacken 4, Faculty of Engineering LTH, Lund University, Lund.
- defense date
- 2026-01-16 13:15:00
- ISSN
- 1404-0034
- ISBN
- 978-91-8104-764-6
- 978-91-8104-763-9
- language
- English
- LU publication?
- yes
- id
- 8fa24dae-dab9-4405-abea-581777aee9e1
- date added to LUP
- 2025-12-04 13:40:47
- date last changed
- 2025-12-10 10:48:48
@phdthesis{8fa24dae-dab9-4405-abea-581777aee9e1,
abstract = {{The topic of this thesis is how to infer geometric information using sound data. Achieving this requires solving several subproblems. First, signal processing of the recorded sound is needed to compute measurements of primitive geometric relations. Secondly, robust estimation is needed to go from primitive geometric measurements to more useful higher-level information such as the locations of microphones and sound sources.

In the case of an uncontrolled sound source, one of the main ways of extracting geometric information comes from computing the time between a sound arriving at each of two microphones. This measurement is referred to as the Time-Difference-of-Arrival (TDOA) and it defines a hyperboloid relative to the two microphones, on which the sound source must lie. While classical correlation-based techniques exist for how to compute the TDOA from two recordings, they typically struggle in reverberant environments where the two signals are not just shifted noisy versions of each other. One of the results of this thesis is showing that better time-delay estimation can be performed by using a learning-based approach. The main issue with using a learning-based approach in this domain is a lack of data. However, this thesis demonstrates that it is possible to solve this issue by utilizing simulations of sound propagation to create synthetic data. This data can then be used to train an energy-based model, which demonstrates improved performance on real data compared to classical methods.

After computing primitive geometric relationships from the sensor data, the goal is to convert them into more useful higher-level information such as the locations of microphones and sound sources. The main problem here lies in that a fraction of the measurements are outliers which means that robust estimation methods such as RANSAC (a hypothesis-and-test framework) need to be used. Since the speed of hypothesis creation is key when using RANSAC, this thesis shows how to construct new minimal solvers for several problems. One example is that we show that sensor network self-calibration in the presence of a reverberant plane allows for minimal problems containing fewer microphones than in the echo-free case.}},
author = {{Tegler, Erik}},
isbn = {{978-91-8104-764-6}},
issn = {{1404-0034}},
language = {{eng}},
month = {{12}},
publisher = {{Centre for Mathematical Sciences, Lund University}},
school = {{Lund University}},
title = {{From Sound to Structure : Robust Localization of Sources, Sensors, and Surroundings}},
url = {{https://lup.lub.lu.se/search/files/234840108/thesis_tegler_e_spik.pdf}},
year = {{2025}},
}