Optimal Transport Regularization for Simulation-Informed Room Impulse Response Estimation
(2025) In IEEE Transactions on Signal Processing 73. p.5244-5256
- Abstract
Many audio applications, including echo-cancellation and active noise control, rely on the availability of accurately estimated room impulse responses (RIRs). For these applications, it is common that the source signal is short and primarily consists of speech or music, which may cause the estimation of the RIR to be poorly conditioned. Although priors on the amplitudes of the RIR could in principle be used to resolve the conditioning issue, there are situations where also the delay structure of the RIR is uncertain. In particular, we here consider when the prior is a simulated RIR obtained from a 3D-reconstruction of the room, from where uncertainties in the geometry, speed of sound, and the source and receiver positions all cause uncertainties in the delay structure of the simulated RIR. By considering such sources of error, we derive two robust regularizers for RIR estimation based on the concept of optimal transport. For each estimator, an efficient solver is proposed based on proximal splitting and Sinkhorn-type iterations. From numerical experiments on real data, we find that when only the uncertainty in the amplitude structure is considered in the regularizer, the simulated prior can in fact worsen the estimation as compared to the Tikhonov and Lasso estimators. Interestingly enough, when robustness for uncertainties in the delay structure is also introduced using the proposed regularizers, even the most naive room model, i.e., a shoe-box approximation, can significantly improve the estimate.
- author
- Björkman, Anton; Sundström, David (LU); Jakobsson, Andreas (LU) and Elvander, Filip (LU)
- organization
- publishing date
- 2025
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- optimal transport, room impulse response, spatial audio modelling
- in
- IEEE Transactions on Signal Processing
- volume
- 73
- pages
- 13 pages
- publisher
- IEEE - Institute of Electrical and Electronics Engineers Inc.
- external identifiers
- scopus:105025683787
- ISSN
- 1053-587X
- DOI
- 10.1109/TSP.2025.3643595
- language
- English
- LU publication?
- yes
- additional info
- Publisher Copyright: © 1991-2012 IEEE.
- id
- 42ce450e-9225-4707-bc6c-3b2ecfb5b233
- date added to LUP
- 2026-01-20 08:30:37
- date last changed
- 2026-03-24 13:10:36
@article{42ce450e-9225-4707-bc6c-3b2ecfb5b233,
abstract = {{Many audio applications, including echo-cancellation and active noise control, rely on the availability of accurately estimated room impulse responses (RIRs). For these applications, it is common that the source signal is short and primarily consists of speech or music, which may cause the estimation of the RIR to be poorly conditioned. Although priors on the amplitudes of the RIR could in principle be used to resolve the conditioning issue, there are situations where also the delay structure of the RIR is uncertain. In particular, we here consider when the prior is a simulated RIR obtained from a 3D-reconstruction of the room, from where uncertainties in the geometry, speed of sound, and the source and receiver positions all cause uncertainties in the delay structure of the simulated RIR. By considering such sources of error, we derive two robust regularizers for RIR estimation based on the concept of optimal transport. For each estimator, an efficient solver is proposed based on proximal splitting and Sinkhorn-type iterations. From numerical experiments on real data, we find that when only the uncertainty in the amplitude structure is considered in the regularizer, the simulated prior can in fact worsen the estimation as compared to the Tikhonov and Lasso estimators. Interestingly enough, when robustness for uncertainties in the delay structure is also introduced using the proposed regularizers, even the most naive room model, i.e., a shoe-box approximation, can significantly improve the estimate.}},
author = {{Björkman, Anton and Sundström, David and Jakobsson, Andreas and Elvander, Filip}},
issn = {{1053-587X}},
keywords = {{optimal transport; room impulse response; spatial audio modelling}},
language = {{eng}},
pages = {{5244--5256}},
publisher = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
journal = {{IEEE Transactions on Signal Processing}},
title = {{Optimal Transport Regularization for Simulation-Informed Room Impulse Response Estimation}},
url = {{http://dx.doi.org/10.1109/TSP.2025.3643595}},
doi = {{10.1109/TSP.2025.3643595}},
volume = {{73}},
year = {{2025}},
}