From Sound to Structure : Robust Localization of Sources, Sensors, and Surroundings

Tegler, Erik

Tegler, Erik (LU) (2025)

Abstract: This thesis addresses how to infer geometric information from sound data, which requires solving several subproblems. First, the recorded sound must be processed to compute measurements of primitive geometric relations. Second, robust estimation is needed to go from these primitive measurements to more useful higher-level information, such as the locations of microphones and sound sources.

For an uncontrolled sound source, one of the main ways of extracting geometric information is to compute the difference between the times at which a sound arrives at two microphones. This measurement is called the Time-Difference-of-Arrival (TDOA), and it defines a hyperboloid relative to the two microphones, on which the sound source must lie. Classical correlation-based techniques can compute the TDOA from two recordings, but they typically struggle in reverberant environments, where the two signals are not just shifted, noisy versions of each other. One result of this thesis is that a learning-based approach gives better time-delay estimation. The main obstacle to learning-based approaches in this domain is a lack of data. This thesis demonstrates that the obstacle can be overcome by simulating sound propagation to create synthetic data. That data is then used to train an energy-based model, which outperforms classical methods on real data.
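
As an illustration of the classical correlation-based baseline mentioned above, the following is a minimal sketch of TDOA estimation using generalized cross-correlation with phase transform (GCC-PHAT). It is not taken from the thesis; the function name `gcc_phat_tdoa` and its parameters are assumptions made for this example, and it presumes two equally sampled single-channel recordings.

```python
import numpy as np

def gcc_phat_tdoa(x, y, fs, max_tau=None):
    """Estimate the delay of y relative to x in seconds (positive if y lags x).

    Hypothetical helper for illustration: x and y are 1-D recordings sampled
    at fs Hz, and max_tau optionally bounds the search range in seconds.
    """
    n = len(x) + len(y)              # zero-pad so the correlation does not wrap around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)           # cross-power spectrum
    cross /= np.abs(cross) + 1e-12   # PHAT weighting: discard magnitude, keep phase
    cc = np.fft.irfft(cross, n=n)    # generalized cross-correlation over circular lags

    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so that index i corresponds to lag i - max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs
```

Multiplying the returned delay by the speed of sound (roughly 343 m/s in air at 20 °C) gives the range difference that defines the hyperboloid. In a reverberant room the correlation typically has several competing peaks, which is the failure mode the abstract describes.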

After primitive geometric relations have been computed from the sensor data, the goal is to convert them into more useful higher-level information, such as the locations of microphones and sound sources. The main difficulty is that a fraction of the measurements are outliers, so robust estimation methods such as RANSAC (a hypothesis-and-test framework) must be used. Since the speed of hypothesis generation is key in RANSAC, this thesis shows how to construct new minimal solvers for several problems. For example, it shows that sensor network self-calibration in the presence of a reverberant plane admits minimal problems with fewer microphones than the echo-free case.
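
The hypothesis-and-test loop can be sketched as below. This is a generic illustration, not one of the thesis's solvers: the minimal solver shown is a toy that fits a 2-D line through two points, and all names (`ransac`, `line_from_two_points`, `line_residuals`) are assumptions made for this example.

```python
import numpy as np

def ransac(data, solve_minimal, residuals, sample_size, threshold,
           iterations=200, seed=0):
    """Generic RANSAC: repeatedly fit a model to a minimal sample, then score it."""
    rng = np.random.default_rng(seed)
    best_model = None
    best_inliers = np.zeros(len(data), dtype=bool)
    for _ in range(iterations):
        # Hypothesis: draw a minimal sample and run the minimal solver on it.
        idx = rng.choice(len(data), size=sample_size, replace=False)
        for model in solve_minimal(data[idx]):  # a solver may return several candidates
            # Test: count measurements consistent with the hypothesis.
            inliers = residuals(model, data) < threshold
            if inliers.sum() > best_inliers.sum():
                best_model, best_inliers = model, inliers
    return best_model, best_inliers

def line_from_two_points(pts):
    """Toy minimal solver: the line y = a*x + b through two points."""
    (x0, y0), (x1, y1) = pts
    if x0 == x1:
        return []                    # degenerate sample, no hypothesis
    a = (y1 - y0) / (x1 - x0)
    return [(a, y0 - a * x0)]

def line_residuals(model, pts):
    """Vertical distance of every point to the line."""
    a, b = model
    return np.abs(pts[:, 1] - (a * pts[:, 0] + b))
```

Each iteration calls the minimal solver once, so its cost dominates the loop. A smaller minimal sample also makes an all-inlier draw more likely, which is why minimal problems that need fewer microphones are attractive.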

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/8fa24dae-dab9-4405-abea-581777aee9e1

author

Tegler, Erik (LU)

supervisor

opponent

Prof. Virtanen, Tuomas, Tampere University, Finland.

organization

alternative title

Från ljud till struktur : Robust lokalisering av ljudkällor, sensorer och omgivande miljö

publishing date

2025-12-04

type

Thesis

publication status

published

subject

Computer Vision and learning System

pages

91 pages

publisher

Centre for Mathematical Sciences, Lund University

defense location

Lecture Hall MH:Hörmander, Centre for Mathematical Sciences, Märkesbacken 4, Faculty of Engineering LTH, Lund University, Lund.

defense date

2026-01-16 13:15:00

ISSN

1404-0034

ISBN

978-91-8104-764-6

978-91-8104-763-9

project

Mathematical Imaging Group

language

English

LU publication?

yes

id

8fa24dae-dab9-4405-abea-581777aee9e1

date added to LUP

2025-12-04 13:40:47

date last changed

2026-02-12 13:11:55

@phdthesis{8fa24dae-dab9-4405-abea-581777aee9e1,
  abstract     = {{This thesis addresses how to infer geometric information from sound data, which requires solving several subproblems. First, the recorded sound must be processed to compute measurements of primitive geometric relations. Second, robust estimation is needed to go from these primitive measurements to more useful higher-level information, such as the locations of microphones and sound sources.

For an uncontrolled sound source, one of the main ways of extracting geometric information is to compute the difference between the times at which a sound arrives at two microphones. This measurement is called the Time-Difference-of-Arrival (TDOA), and it defines a hyperboloid relative to the two microphones, on which the sound source must lie. Classical correlation-based techniques can compute the TDOA from two recordings, but they typically struggle in reverberant environments, where the two signals are not just shifted, noisy versions of each other. One result of this thesis is that a learning-based approach gives better time-delay estimation. The main obstacle to learning-based approaches in this domain is a lack of data. This thesis demonstrates that the obstacle can be overcome by simulating sound propagation to create synthetic data. That data is then used to train an energy-based model, which outperforms classical methods on real data.

After primitive geometric relations have been computed from the sensor data, the goal is to convert them into more useful higher-level information, such as the locations of microphones and sound sources. The main difficulty is that a fraction of the measurements are outliers, so robust estimation methods such as RANSAC (a hypothesis-and-test framework) must be used. Since the speed of hypothesis generation is key in RANSAC, this thesis shows how to construct new minimal solvers for several problems. For example, it shows that sensor network self-calibration in the presence of a reverberant plane admits minimal problems with fewer microphones than the echo-free case.}},
  author       = {{Tegler, Erik}},
  isbn         = {{978-91-8104-764-6}},
  issn         = {{1404-0034}},
  language     = {{eng}},
  month        = {{12}},
  publisher    = {{Centre for Mathematical Sciences, Lund University}},
  school       = {{Lund University}},
  title        = {{From Sound to Structure : Robust Localization of Sources, Sensors, and Surroundings}},
  url          = {{https://lup.lub.lu.se/search/files/234840108/thesis_tegler_e_spik.pdf}},
  year         = {{2025}},
}

Lund University Publications
