Lund University Publications

LUND UNIVERSITY LIBRARIES

Learning an interpretable end-to-end network for real-time acoustic beamforming

Liang, Hao; Zhou, Guanxing; Tu, Xiaotong (LU); Jakobsson, Andreas (LU); Ding, Xinghao and Huang, Yue (2024). In Journal of Sound and Vibration 591.
Abstract

Recently, many industrial audio applications, such as sound monitoring and source localization, have begun exploiting smart multi-modal devices equipped with a microphone array. Regrettably, model-based methods are often difficult to employ for such devices due to their high computational complexity, as well as the difficulty of appropriately selecting the user-determined parameters. As an alternative, one may use deep network-based methods, but these often generalize poorly and cannot generate the desired beamforming map directly. In this paper, a computationally efficient acoustic beamforming algorithm is proposed, which may be unrolled to form a model-based deep learning network for real-time imaging, here termed the DAMAS-FISTA-Net. By exploiting the natural structure of an acoustic beamformer, the proposed network inherits the physical knowledge of the acoustic system, and thus learns the underlying physical properties of the propagation. As a result, all the network parameters may be learned end-to-end, guided by a model-based prior using back-propagation. Notably, the proposed network offers excellent interpretability and can process the raw data directly. Extensive numerical experiments using both simulated and real-world data illustrate the preferable performance of the DAMAS-FISTA-Net as compared to alternative approaches.

author
Liang, Hao; Zhou, Guanxing; Tu, Xiaotong; Jakobsson, Andreas; Ding, Xinghao and Huang, Yue
organization
publishing date
type
Contribution to journal
publication status
published
subject
keywords
Acoustic beamforming, Acoustic imaging, Array signal processing, Interpretable network, Model-based deep learning, Source localization
in
Journal of Sound and Vibration
volume
591
article number
118620
publisher
Elsevier
external identifiers
  • scopus:85198609264
ISSN
0022-460X
DOI
10.1016/j.jsv.2024.118620
language
English
LU publication?
yes
id
7a5a6121-3616-420b-bf3d-6a0cc6c8d189
date added to LUP
2024-08-26 15:47:59
date last changed
2024-08-26 15:49:01
@article{7a5a6121-3616-420b-bf3d-6a0cc6c8d189,
  abstract     = {{Recently, many forms of audio industrial applications, such as sound monitoring and source localization, have begun exploiting smart multi-modal devices equipped with a microphone array. Regrettably, model-based methods are often difficult to employ for such devices due to their high computational complexity, as well as the difficulty of appropriately selecting the user-determined parameters. As an alternative, one may use deep network-based methods, but these are often difficult to generalize, nor can they generate the desired beamforming map directly. In this paper, a computationally efficient acoustic beamforming algorithm is proposed, which may be unrolled to form a model-based deep learning network for real-time imaging, here termed the DAMAS-FISTA-Net. By exploiting the natural structure of an acoustic beamformer, the proposed network inherits the physical knowledge of the acoustic system, and thus learns the underlying physical properties of the propagation. As a result, all the network parameters may be learned end-to-end, guided by a model-based prior using back-propagation. Notably, the proposed network enables an excellent interpretability and the ability of being able to process the raw data directly. Extensive numerical experiments using both simulated and real-world data illustrate the preferable performance of the DAMAS-FISTA-Net as compared to alternative approaches.}},
  author       = {{Liang, Hao and Zhou, Guanxing and Tu, Xiaotong and Jakobsson, Andreas and Ding, Xinghao and Huang, Yue}},
  issn         = {{0022-460X}},
  keywords     = {{Acoustic beamforming; Acoustic imaging; Array signal processing; Interpretable network; Model-based deep learning; Source localization}},
  language     = {{eng}},
  publisher    = {{Elsevier}},
  journal      = {{Journal of Sound and Vibration}},
  title        = {{Learning an interpretable end-to-end network for real-time acoustic beamforming}},
  url          = {{http://dx.doi.org/10.1016/j.jsv.2024.118620}},
  doi          = {{10.1016/j.jsv.2024.118620}},
  volume       = {{591}},
  year         = {{2024}},
}