Blind subband beamforming with time-delay constraints for moving source speech enhancement

Yermeche, Zohra; Grbic, Nedelko; Claesson, Ingvar

Yermeche, Zohra; Grbic, Nedelko and Claesson, Ingvar (2007) In IEEE Transactions on Audio, Speech, and Language Processing 15(8). p. 2360-2372

Abstract: A new robust microphone array method to enhance speech signals generated by a moving person in a noisy environment is presented. This blind approach is based on a two-stage scheme. First, a subband time-delay estimation method is used to localize the dominant speech source. The second stage involves speech enhancement, based on the acquired spatial information, by means of a soft-constrained subband beamformer. The novelty of the proposed method lies in treating the spatial spreading of the sound source as equivalent to a time-delay spreading, thus allowing the estimated intersensor time-delays to be used directly in the beamforming operations. In comparison to previous approaches, this new method requires no special array geometry, knowledge of the array manifold, or acquisition of calibration data to adapt the array weights. Furthermore, such a scheme allows the beamformer to adapt efficiently to speaker movement. The robustness of the time-delay estimation of speech signals in high noise levels is improved by making use of the non-Gaussian nature of speech through a subband kurtosis-weighted structure. Evaluation in a real environment with a moving speaker shows promising results, with suppression levels of up to 16 dB for background noise and interfering (speech) signals, together with relatively little speech distortion.
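The two-stage idea in the abstract (estimate intersensor time-delays, then use them directly to steer the beamformer) can be illustrated with a much simpler stand-in. The sketch below is not the method of the paper: it works on the full band rather than in subbands, uses plain GCC-PHAT instead of the kurtosis-weighted estimator, and uses a fixed delay-and-sum combiner instead of the soft-constrained adaptive beamformer. All function names (`excess_kurtosis`, `estimate_delay`, `shift`, `delay_and_sum`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def excess_kurtosis(x):
    """Sample excess kurtosis: about 0 for Gaussian data, positive for
    heavier-tailed (more speech-like) data. Illustrates the non-Gaussianity
    cue mentioned in the abstract; it is not the paper's weighting rule."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    variance = np.mean(centered ** 2)
    return np.mean(centered ** 4) / variance ** 2 - 3.0


def estimate_delay(x_ref, x_other, max_lag):
    """Integer delay (in samples) of x_other relative to x_ref via GCC-PHAT.

    Assumption: full-band GCC-PHAT stands in for the subband estimator."""
    n = len(x_ref) + len(x_other)  # zero-pad to avoid circular wrap-around
    spec_ref = np.fft.rfft(x_ref, n=n)
    spec_other = np.fft.rfft(x_other, n=n)
    cross = spec_other * np.conj(spec_ref)
    cross /= np.abs(cross) + 1e-12  # PHAT: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)
    # Reorder so that index 0 corresponds to lag -max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag


def shift(x, d):
    """Delay x by d samples (advance if d is negative), zero-filling edges."""
    y = np.zeros_like(x)
    if d > 0:
        y[d:] = x[:-d]
    elif d < 0:
        y[:d] = x[-d:]
    else:
        y[:] = x
    return y


def delay_and_sum(channels, delays):
    """Undo each channel's estimated delay, then average the channels.

    Assumption: a fixed delay-and-sum combiner, not an adaptive beamformer."""
    aligned = [shift(ch, -d) for ch, d in zip(channels, delays)]
    return np.mean(aligned, axis=0)
```

In use, one microphone would serve as the reference, `estimate_delay` would be called once per remaining microphone on each short frame, and the resulting delays would be passed to `delay_and_sum`. Re-estimating the delays frame by frame is what lets such a scheme follow a moving speaker without knowledge of the array geometry.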

Please use this URL to cite or link to this publication: https://lup.lub.lu.se/record/5b9d148b-1aca-4ae3-aff6-dd6c865ee210

author

Yermeche, Zohra; Grbic, Nedelko and Claesson, Ingvar

publishing date

2007-11

type

Contribution to journal

publication status

published

subject

Electrical Engineering, Electronic Engineering, Information Engineering

in

IEEE Transactions on Audio, Speech, and Language Processing

volume

15

issue

8

article number

8

pages

13 pages

publisher

IEEE - Institute of Electrical and Electronics Engineers Inc.

external identifiers

scopus:64149114799

ISSN

1558-7916

DOI

10.1109/TASL.2007.903309

language

English

LU publication?

yes

id

5b9d148b-1aca-4ae3-aff6-dd6c865ee210

date added to LUP

2016-06-23 14:07:11

date last changed

2025-10-14 11:59:18

@article{5b9d148b-1aca-4ae3-aff6-dd6c865ee210,
  abstract     = {{A new robust microphone array method to enhance speech signals generated by a moving person in a noisy environment is presented. This blind approach is based on a two-stage scheme. First, a subband time-delay estimation method is used to localize the dominant speech source. The second stage involves speech enhancement, based on the acquired spatial information, by means of a soft-constrained subband beamformer. The novelty of the proposed method lies in treating the spatial spreading of the sound source as equivalent to a time-delay spreading, thus allowing the estimated intersensor time-delays to be used directly in the beamforming operations. In comparison to previous approaches, this new method requires no special array geometry, knowledge of the array manifold, or acquisition of calibration data to adapt the array weights. Furthermore, such a scheme allows the beamformer to adapt efficiently to speaker movement. The robustness of the time-delay estimation of speech signals in high noise levels is improved by making use of the non-Gaussian nature of speech through a subband kurtosis-weighted structure. Evaluation in a real environment with a moving speaker shows promising results, with suppression levels of up to 16 dB for background noise and interfering (speech) signals, together with relatively little speech distortion.}},
  author       = {{Yermeche, Zohra and Grbic, Nedelko and Claesson, Ingvar}},
  issn         = {{1558-7916}},
  language     = {{eng}},
  number       = {{8}},
  pages        = {{2360--2372}},
  publisher    = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  journal      = {{IEEE Transactions on Audio, Speech, and Language Processing}},
  title        = {{Blind subband beamforming with time-delay constraints for moving source speech enhancement}},
  url          = {{http://dx.doi.org/10.1109/TASL.2007.903309}},
  doi          = {{10.1109/TASL.2007.903309}},
  volume       = {{15}},
  year         = {{2007}},
}

Lund University Publications
