Comparing spectrum estimators in speaker verification under additive noise degradation

Hanilci, C.; Kinnunen, T.; Saeidi, R.; Pohjalainen, J.; Alku, P.; Ertas, F.; Sandberg, J.; Sandsten, Maria

Hanilci, C.; Kinnunen, T.; Saeidi, R.; Pohjalainen, J.; Alku, P.; Ertas, F.; Sandberg, J. and Sandsten, Maria (LU) (2012). 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), p. 4769-4772

Abstract: Different short-term spectrum estimators for speaker verification under additive noise are considered. Conventionally, mel-frequency cepstral coefficients (MFCCs) are computed from discrete Fourier transform (DFT) spectra of windowed speech frames. Recently, linear prediction (LP) and its temporally weighted variants have been substituted as the spectrum analysis method in speech and speaker recognition. In this paper, 12 different short-term spectrum estimation methods are compared for speaker verification under additive noise contamination. Experimental results conducted on NIST 2002 SRE show that the spectrum estimation method has a large effect on recognition performance and stabilized weighted LP (SWLP) and minimum variance distortionless response (MVDR) methods yield approximately 7 % and 8 % relative improvements over the standard DFT method at -10 dB SNR level of factory and babble noises, respectively in terms of equal error rate (EER).
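
As a rough illustration of the two baseline estimators named in the abstract, the sketch below computes a short-term power spectrum of one windowed frame in two ways: from the DFT, and from a conventional LP all-pole model fitted with the autocorrelation method. It is not the authors' implementation. The function names, the Hamming window, the 240-sample frame, the LP order of 20, and the 512-point FFT are all assumptions chosen for illustration; the record does not state the paper's settings, and the weighted LP variants (such as SWLP) and MVDR are not shown.

```python
import numpy as np


def dft_spectrum(frame, nfft=512):
    """Power spectrum of a Hamming-windowed frame via the DFT."""
    windowed = frame * np.hamming(len(frame))
    return np.abs(np.fft.rfft(windowed, nfft)) ** 2


def lp_spectrum(frame, order=20, nfft=512):
    """All-pole power spectrum from conventional LP (autocorrelation method)."""
    windowed = frame * np.hamming(len(frame))
    n = len(windowed)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(windowed[: n - k], windowed[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], solved for the predictor coefficients
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])
    # Prediction error energy, used as the gain of the all-pole model
    gain = r[0] - np.dot(a, r[1:])
    # Frequency response of the inverse filter A(z) = 1 - sum_k a_k z^{-k}
    A = np.fft.rfft(np.concatenate(([1.0], -a)), nfft)
    return gain / np.abs(A) ** 2


# Hypothetical test frame: a 1 kHz tone at 8 kHz sampling plus weak white noise
fs = 8000
n = np.arange(240)
rng = np.random.default_rng(0)
frame = np.sin(2 * np.pi * 1000 * n / fs) + 0.01 * rng.standard_normal(len(n))

p_dft = dft_spectrum(frame)
p_lp = lp_spectrum(frame)
peak_dft = int(np.argmax(p_dft))
peak_lp = int(np.argmax(p_lp))
```

In an MFCC front end, either spectrum would then be passed through the mel filterbank, logarithm, and discrete cosine transform; only the spectrum estimation step differs between the two.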

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/3290081

author

Hanilci, C.; Kinnunen, T.; Saeidi, R.; Pohjalainen, J.; Alku, P.; Ertas, F.; Sandberg, J. and Sandsten, Maria (LU)

organization

Mathematical Statistics

publishing date

2012

type

Chapter in Book/Report/Conference proceeding

publication status

published

subject

Probability Theory and Statistics

keywords

speaker verification, spectrum estimation

host publication

Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on

pages

4 pages

publisher

IEEE - Institute of Electrical and Electronics Engineers Inc.

conference name

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

conference location

Kyoto, Japan

conference dates

2012-03-25 - 2012-03-30

external identifiers

scopus:84867590081

ISSN

1520-6149

ISBN

978-1-4673-0044-5 (online)

978-1-4673-0045-2 (print)

DOI

10.1109/ICASSP.2012.6288985

project

Statistical Signal Processing Group

language

English

LU publication?

yes

id

19390f56-2fc3-4b94-8f08-7bc18dec6457 (old id 3290081)

alternative location

http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6288985

date added to LUP

2016-04-01 14:08:02

date last changed

2026-02-10 14:36:19

@inproceedings{19390f56-2fc3-4b94-8f08-7bc18dec6457,
  abstract     = {{Different short-term spectrum estimators for speaker verification under additive noise are considered. Conventionally, mel-frequency cepstral coefficients (MFCCs) are computed from discrete Fourier transform (DFT) spectra of windowed speech frames. Recently, linear prediction (LP) and its temporally weighted variants have been substituted as the spectrum analysis method in speech and speaker recognition. In this paper, 12 different short-term spectrum estimation methods are compared for speaker verification under additive noise contamination. Experimental results conducted on NIST 2002 SRE show that the spectrum estimation method has a large effect on recognition performance and stabilized weighted LP (SWLP) and minimum variance distortionless response (MVDR) methods yield approximately 7 % and 8 % relative improvements over the standard DFT method at -10 dB SNR level of factory and babble noises, respectively in terms of equal error rate (EER).}},
  author       = {{Hanilci, C. and Kinnunen, T. and Saeidi, R. and Pohjalainen, J. and Alku, P. and Ertas, F. and Sandberg, J. and Sandsten, Maria}},
  booktitle    = {{Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on}},
  isbn         = {{978-1-4673-0044-5 (online)}},
  issn         = {{1520-6149}},
  keywords     = {{speaker verification; spectrum estimation}},
  language     = {{eng}},
  pages        = {{4769--4772}},
  publisher    = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  title        = {{Comparing spectrum estimators in speaker verification under additive noise degradation}},
  url          = {{http://dx.doi.org/10.1109/ICASSP.2012.6288985}},
  doi          = {{10.1109/ICASSP.2012.6288985}},
  year         = {{2012}},
}

Lund University Publications