Speech presence detection in the time-frequency domain using minimum statistics

Sorensen, KV and Andersen, Sören Vang (2004) 6th Nordic Signal Processing Symposium (NORSIG 2004) In NORSIG 2004: PROCEEDINGS OF THE 6TH NORDIC SIGNAL PROCESSING SYMPOSIUM 46. p. 340-343
Abstract
The contribution of this paper is a time-frequency domain speech presence detection method that classifies power bins in the time-frequency domain as containing speech or not. An initial decision rule is based on ratios between optimally time-smoothed signal-plus-noise periodograms and weighted noise periodogram estimates, obtained from minimum statistics as proposed by Martin [1]. The initial decision rule is generalized into a weighted decomposition where the weights are obtained from off-line training by means of an artificial neural network. Experiments show that the method can be configured to be very sensitive to speech presence even in very high levels of noise and without classifying much of the noise as speech. It is shown that a fixed set of weights gives good performance at different signal-to-noise ratios, indicating that the terms in the decision rule have been adequately chosen.
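The initial decision rule described in the abstract can be illustrated with a minimal sketch for a single frequency bin. This is not the authors' implementation: the fixed smoothing constant `alpha`, the plain sliding-window minimum without bias compensation, and the single hand-picked `weight` are simplifying assumptions (the paper uses optimal time-varying smoothing and weights trained with a neural network), and all function names are hypothetical.

```python
def smooth_periodogram(power, alpha=0.8):
    """Recursively smooth one bin's power values over time.

    Assumption: a fixed alpha stands in for the optimal
    time-varying smoothing mentioned in the abstract.
    """
    smoothed = []
    prev = power[0]
    for p in power:
        prev = alpha * prev + (1.0 - alpha) * p
        smoothed.append(prev)
    return smoothed


def minimum_noise_estimate(smoothed, window=8):
    """Track the noise floor as the minimum over the last `window` frames.

    Assumption: no bias compensation is applied to the minimum.
    """
    return [
        min(smoothed[max(0, t - window + 1): t + 1])
        for t in range(len(smoothed))
    ]


def detect_speech(power, alpha=0.8, window=8, weight=3.0):
    """Flag a frame as speech when the smoothed power exceeds the
    weighted noise estimate, i.e. when smoothed / noise > weight.

    Assumption: `weight` is hand-picked here, not trained.
    """
    smoothed = smooth_periodogram(power, alpha)
    noise = minimum_noise_estimate(smoothed, window)
    return [s > weight * n for s, n in zip(smoothed, noise)]


# Synthetic bin: a steady noise floor with a short burst of higher power.
power = [1.0] * 20 + [50.0] * 5 + [1.0] * 20
flags = detect_speech(power)
print(flags[18:27])
```

Because the noise estimate is a windowed minimum, it stays at the floor level during a short burst, so the ratio rises sharply while the burst lasts. The window length therefore trades off how quickly the estimate follows changing noise against how long a speech segment can be before it starts to raise the estimated floor.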
author
Sorensen, KV and Andersen, Sören Vang
publishing date
2004
type
Chapter in Book/Report/Conference proceeding
publication status
published
in
NORSIG 2004: PROCEEDINGS OF THE 6TH NORDIC SIGNAL PROCESSING SYMPOSIUM
volume
46
pages
340 - 343
publisher
HELSINKI UNIVERSITY TECHNOLOGY
conference name
6th Nordic Signal Processing Symposium (NORSIG 2004)
external identifiers
  • wos:000225463400086
  • scopus:11844291956
ISSN
1458-6401
language
English
LU publication?
no
id
f7ae5ed4-9bcf-4ccb-8f46-dbc9d66e6579 (old id 4092548)
date added to LUP
2013-10-17 10:43:49
date last changed
2017-06-11 04:37:43
@inproceedings{f7ae5ed4-9bcf-4ccb-8f46-dbc9d66e6579,
  abstract     = {The contribution of this paper is a time-frequency domain speech presence detection method that classifies power bins in the time-frequency domain as containing speech or not. An initial decision rule is based on ratios between optimally time-smoothed signal-plus-noise periodograms and weighted noise periodogram estimates, obtained from minimum statistics as proposed by Martin [1]. The initial decision rule is generalized into a weighted decomposition where the weights are obtained from off-line training by means of an artificial neural network. Experiments show that the method can be configured to be very sensitive to speech presence even in very high levels of noise and without classifying much of the noise as speech. It is shown that a fixed set of weights gives good performance at different signal-to-noise ratios, indicating that the terms in the decision rule have been adequately chosen.},
  author       = {Sorensen, KV and Andersen, Sören Vang},
  booktitle    = {NORSIG 2004: PROCEEDINGS OF THE 6TH NORDIC SIGNAL PROCESSING SYMPOSIUM},
  issn         = {1458-6401},
  language     = {eng},
  pages        = {340--343},
  publisher    = {HELSINKI UNIVERSITY TECHNOLOGY},
  title        = {Speech presence detection in the time-frequency domain using minimum statistics},
  volume       = {46},
  year         = {2004},
}