Harmonic minimum mean squared error filters for multichannel speech enhancement

Jensen, Jesper Rindom; Christensen, Mads Groesboll; Jakobsson, Andreas

(2017) 42nd IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017, p. 501-505

Abstract: Many state-of-the-art multichannel speech enhancement methods rely on second-order statistics of the desired speech signal, the noise signal, or both. Estimating these statistics is difficult in practice, so the practical performance is typically much lower than the potential theoretical performance. We propose two multichannel enhancement techniques that instead rely on a model for voiced speech. That is, the proposed methods are driven by the signals' fundamental frequencies, which may be accurately estimated even in noisy scenarios. The first method is designed independently of the microphone array geometry and source position, whereas these are utilized in the second approach. Thereby, we can investigate when to exploit such information in the case of localization errors and violations of the spatial assumptions. Numerical results show that the proposed method is able to outperform competing methods in terms of both output SNRs and PESQ scores.
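The model-driven idea in the abstract can be illustrated with a minimal single-channel sketch. It is not the authors' multichannel MMSE filter; it only shows why a known fundamental frequency helps: voiced speech is commonly modeled as a sum of sinusoids at integer multiples of the fundamental, so a noisy frame can be projected onto that harmonic basis. All names and values here (`harmonic_projection`, `FS`, `F0`, `NUM_HARMONICS`, `FRAME_LEN`, the noise level) are assumptions chosen for illustration, and the fundamental frequency is taken as known rather than estimated.

```python
import math
import random


def harmonic_projection(x, f0, fs, num_harmonics):
    """Project a frame onto cosines and sines at integer multiples of f0.

    Assumption: the frame spans a whole number of pitch periods, so the
    harmonic basis vectors are mutually orthogonal and each amplitude can
    be obtained by a plain correlation instead of a least-squares solve.
    """
    n_samples = len(x)
    estimate = [0.0] * n_samples
    for l in range(1, num_harmonics + 1):
        w = 2.0 * math.pi * l * f0 / fs  # angular frequency of harmonic l
        a = 2.0 / n_samples * sum(x[n] * math.cos(w * n) for n in range(n_samples))
        b = 2.0 / n_samples * sum(x[n] * math.sin(w * n) for n in range(n_samples))
        for n in range(n_samples):
            estimate[n] += a * math.cos(w * n) + b * math.sin(w * n)
    return estimate


def snr_db(clean, observed):
    """SNR in dB of `observed` relative to the known clean signal."""
    signal_power = sum(s * s for s in clean)
    noise_power = sum((o - s) ** 2 for s, o in zip(clean, observed))
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical test signal: 5 harmonics of 200 Hz, 400 samples at 8 kHz
# (exactly 10 pitch periods), plus white Gaussian noise.
FS, F0, NUM_HARMONICS, FRAME_LEN = 8000, 200.0, 5, 400
rng = random.Random(0)
clean = [
    sum(math.cos(2.0 * math.pi * l * F0 * n / FS + 0.3 * l) / l
        for l in range(1, NUM_HARMONICS + 1))
    for n in range(FRAME_LEN)
]
noisy = [s + rng.gauss(0.0, 0.5) for s in clean]

enhanced = harmonic_projection(noisy, F0, FS, NUM_HARMONICS)
snr_in = snr_db(clean, noisy)
snr_out = snr_db(clean, enhanced)
print(f"input SNR: {snr_in:.1f} dB, output SNR: {snr_out:.1f} dB")
```

Because only 2 x 5 basis vectors out of 400 dimensions are kept, most of the white noise is discarded while the harmonic signal passes unchanged. The sketch ignores everything that makes the paper's problem hard: multiple microphones, fundamental frequency estimation errors, and nonstationary speech.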

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/a0558000-3137-45e0-bd8e-63630c3cf8b0

author

Jensen, Jesper Rindom; Christensen, Mads Groesboll and Jakobsson, Andreas (LU)

organization

Mathematical Statistics

publishing date

2017-06-16

type

Chapter in Book/Report/Conference proceeding

publication status

published

subject

keywords

multichannel speech enhancement, voiced speech, MMSE filtering, harmonic filters, DOA mismatch

host publication

2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings

article number

7952206

pages

5 pages

publisher

IEEE - Institute of Electrical and Electronics Engineers Inc.

conference name

42nd IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017

conference location

New Orleans, United States

conference dates

2017-03-05 - 2017-03-09

external identifiers

scopus:85023776452

ISBN

9781509041176

DOI

10.1109/ICASSP.2017.7952206

project

Statistical Signal Processing Group

language

English

LU publication?

yes

id

a0558000-3137-45e0-bd8e-63630c3cf8b0

date added to LUP

2017-02-14 10:32:22

date last changed

2026-02-10 14:50:37

@inproceedings{a0558000-3137-45e0-bd8e-63630c3cf8b0,
  abstract     = {{Many state-of-the-art multichannel speech enhancement methods rely on second-order statistics of the desired speech signal, the noise signal, or both. Estimation of those are difficult in practice, resulting in a practical performance that is typically much lower than their potential theoretical performance. We propose two multichannel enhancement techniques that instead rely on a model for voiced speech. That is, the proposed methods are driven by the signals' fundamental frequencies, which may be accurately estimated even in noisy scenarios. The first method is designed independently of the microphone array geometry and source position, whereas these are utilized in the second approach. Thereby, we can investigate when to exploit such information in the case of localization errors and violations of the spatial assumptions. Numerical results show that the proposed method is able to outperform competing methods in terms of both output SNRs and PESQ scores.}},
  author       = {{Jensen, Jesper Rindom and Christensen, Mads Groesboll and Jakobsson, Andreas}},
  booktitle    = {{2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings}},
  isbn         = {{9781509041176}},
  keywords     = {{multichannel speech enhancement; voiced speech; MMSE filtering; harmonic filters; DOA mismatch}},
  language     = {{eng}},
  month        = {{06}},
  pages        = {{501--505}},
  publisher    = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  title        = {{Harmonic minimum mean squared error filters for multichannel speech enhancement}},
  url          = {{http://dx.doi.org/10.1109/ICASSP.2017.7952206}},
  doi          = {{10.1109/ICASSP.2017.7952206}},
  year         = {{2017}},
}