What Else is New Than the Hamming Window? Robust MFCCs for Speaker Recognition via Multitapering

Kinnunen, Tomi; Saeidi, Rahim; Sandberg, Johan; Sandsten, Maria

Kinnunen, Tomi; Saeidi, Rahim; Sandberg, Johan and Sandsten, Maria (2010) Interspeech 2010, p. 2734-2737

Abstract: Mel-frequency cepstral coefficients (MFCCs) are usually derived from a Hamming-windowed DFT spectrum. In this paper, we advocate using a so-called multitaper method instead. Multitaper methods form a spectrum estimate using multiple window functions and frequency-domain averaging. Multitapers provide a robust spectrum estimate but have received little attention in speech processing. Our speaker recognition experiment on NIST 2002 yields equal error rates (EERs) of 9.66 % (clean data) and 16.41 % (-10 dB SNR) for the conventional Hamming method, and 8.13 % (clean data) and 14.63 % (-10 dB SNR) using multitapers. Multitapering is a simple and robust alternative to the Hamming window method.
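
The contrast the abstract draws, between one Hamming window and several windows whose spectra are averaged in the frequency domain, can be illustrated with a minimal NumPy sketch. This record does not state which tapers or weights the paper uses, so the sine tapers, the uniform averaging, the defaults (`num_tapers=6`, `nfft=512`) and all function names below are assumptions made for illustration only.

```python
import numpy as np


def sine_tapers(frame_len, num_tapers):
    """Return a (num_tapers, frame_len) array of orthonormal sine tapers.

    Assumption: sine tapers are used here only because they have a simple
    closed form; the record does not say which tapers the paper uses.
    """
    n = np.arange(1, frame_len + 1)
    k = np.arange(1, num_tapers + 1)[:, None]
    return np.sqrt(2.0 / (frame_len + 1)) * np.sin(np.pi * k * n / (frame_len + 1))


def multitaper_spectrum(frame, num_tapers=6, nfft=512):
    """Average the power spectra obtained with several tapers (uniform weights)."""
    tapers = sine_tapers(len(frame), num_tapers)
    # One windowed DFT per taper, then a frequency-domain average.
    eigenspectra = np.abs(np.fft.rfft(tapers * frame, n=nfft, axis=1)) ** 2
    return eigenspectra.mean(axis=0)


def hamming_spectrum(frame, nfft=512):
    """Conventional single-window estimate with a unit-energy Hamming window."""
    window = np.hamming(len(frame))
    window = window / np.sqrt(np.sum(window ** 2))
    return np.abs(np.fft.rfft(window * frame, n=nfft)) ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(200)  # stand-in for one speech frame
    print(multitaper_spectrum(frame).shape, hamming_spectrum(frame).shape)
```

In an MFCC front end, either estimate would then pass through the usual mel filterbank, logarithm and DCT steps; only the spectrum estimator changes. Averaging over several tapers lowers the variance of the estimate at the cost of some frequency resolution.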

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/1718661

author

Kinnunen, Tomi; Saeidi, Rahim; Sandberg, Johan and Sandsten, Maria

organization

Mathematical Statistics

publishing date

2010

type

Chapter in Book/Report/Conference proceeding

publication status

published

subject

Probability Theory and Statistics

keywords

speaker verification, multiple window method

host publication

Interspeech 2010

pages

2734 - 2737

conference name

Interspeech 2010

conference location

Makuhari, Japan

conference dates

0001-01-02

external identifiers

scopus:79959826333

project

Statistical Signal Processing Group

language

English

LU publication?

yes

id

ab9b6427-8f65-4cd1-8918-91a68f028072 (old id 1718661)

alternative location

http://cs.joensuu.fi/pages/tkinnu/webpage/pdf/MultiTaper_Interspeech2010.pdf

date added to LUP

2016-04-04 14:38:17

date last changed

2026-02-10 15:00:15

@inproceedings{ab9b6427-8f65-4cd1-8918-91a68f028072,
  abstract     = {{Mel-frequency cepstral coefficients (MFCCs) are usually derived from a Hamming-windowed DFT spectrum. In this paper, we advocate using a so-called multitaper method instead. Multitaper methods form a spectrum estimate using multiple window functions and frequency-domain averaging. Multitapers provide a robust spectrum estimate but have received little attention in speech processing. Our speaker recognition experiment on NIST 2002 yields equal error rates (EERs) of 9.66 % (clean data) and 16.41 % (-10 dB SNR) for the conventional Hamming method, and 8.13 % (clean data) and 14.63 % (-10 dB SNR) using multitapers. Multitapering is a simple and robust alternative to the Hamming window method.}},
  author       = {{Kinnunen, Tomi and Saeidi, Rahim and Sandberg, Johan and Sandsten, Maria}},
  booktitle    = {{Interspeech 2010}},
  keywords     = {{speaker verification; multiple window method}},
  language     = {{eng}},
  pages        = {{2734--2737}},
  title        = {{What Else is New Than the Hamming Window? Robust MFCCs for Speaker Recognition via Multitapering}},
  url          = {{http://cs.joensuu.fi/pages/tkinnu/webpage/pdf/MultiTaper_Interspeech2010.pdf}},
  year         = {{2010}},
}
