Nonlinear vocal phenomena and speech intelligibility

Anikin, Andrey; Reby, David; Pisanski, Katarzyna

Anikin, Andrey; Reby, David and Pisanski, Katarzyna (2025) In Philosophical Transactions of the Royal Society B: Biological Sciences 380(1923).

Abstract: At some point in our evolutionary history, humans lost vocal membranes and air sacs, representing an unexpected simplification of the vocal apparatus relative to other great apes. One hypothesis is that these simplifications represent anatomical adaptations for speech because a simpler larynx provides a suitably stable and tonal vocal source with fewer nonlinear vocal phenomena (NLP). The key assumption that NLP reduce speech intelligibility is indirectly supported by studies of dysphonia, but it has not been experimentally tested. Here, we manipulate NLP in vocal stimuli ranging from single vowels to sentences, showing that the vocal source needs to be stable, but not necessarily tonal, for speech to be readily understood. When the task is to discriminate synthesized monophthong and diphthong vowels, continuous NLP (subharmonics, amplitude modulation and even deterministic chaos) actually improve vowel perception in high-pitched voices, likely because the resulting dense spectrum reveals formant transitions. Rough-sounding voices also remain highly intelligible when continuous NLP are added to recorded words and sentences. In contrast, voicing interruptions and pitch jumps dramatically reduce speech intelligibility, likely by interfering with voicing contrasts and normal intonation. We argue that NLP were not eliminated from the human vocal repertoire as we evolved for speech, but only brought under better control.

This article is part of the theme issue ‘Nonlinear phenomena in vertebrate vocalizations: mechanisms and communicative functions’.

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/21be370a-bbd5-4df6-9ee9-072a2466062d

author

Anikin, Andrey (LU); Reby, David and Pisanski, Katarzyna

organization

publishing date

2025

type

Contribution to journal

publication status

published

subject

keywords

speech, evolution of language, nonlinear vocal phenomena, voice, formant frequencies, vocal membranes

in

Philosophical Transactions of the Royal Society B: Biological Sciences

volume

380

issue

1923

article number

20240254

publisher

Royal Society Publishing

external identifiers

pmid:40176514
scopus:105001863545

ISSN

1471-2970

DOI

10.1098/rstb.2024.0254

language

English

LU publication?

yes

id

21be370a-bbd5-4df6-9ee9-072a2466062d

date added to LUP

2025-04-03 18:42:14

date last changed

2025-11-19 18:09:04

@article{21be370a-bbd5-4df6-9ee9-072a2466062d,
  abstract     = {{At some point in our evolutionary history, humans lost vocal membranes and air sacs, representing an unexpected simplification of the vocal apparatus relative to other great apes. One hypothesis is that these simplifications represent anatomical adaptations for speech because a simpler larynx provides a suitably stable and tonal vocal source with fewer nonlinear vocal phenomena (NLP). The key assumption that NLP reduce speech intelligibility is indirectly supported by studies of dysphonia, but it has not been experimentally tested. Here, we manipulate NLP in vocal stimuli ranging from single vowels to sentences, showing that the vocal source needs to be stable, but not necessarily tonal, for speech to be readily understood. When the task is to discriminate synthesized monophthong and diphthong vowels, continuous NLP (subharmonics, amplitude modulation and even deterministic chaos) actually improve vowel perception in high-pitched voices, likely because the resulting dense spectrum reveals formant transitions. Rough-sounding voices also remain highly intelligible when continuous NLP are added to recorded words and sentences. In contrast, voicing interruptions and pitch jumps dramatically reduce speech intelligibility, likely by interfering with voicing contrasts and normal intonation. We argue that NLP were not eliminated from the human vocal repertoire as we evolved for speech, but only brought under better control. This article is part of the theme issue ‘Nonlinear phenomena in vertebrate vocalizations: mechanisms and communicative functions’.}},
  author       = {{Anikin, Andrey and Reby, David and Pisanski, Katarzyna}},
  issn         = {{1471-2970}},
  keywords     = {{speech; evolution of language; nonlinear vocal phenomena; voice; formant frequencies; vocal membranes}},
  language     = {{eng}},
  number       = {{1923}},
  publisher    = {{Royal Society Publishing}},
  journal      = {{Philosophical Transactions of the Royal Society B: Biological Sciences}},
  title        = {{Nonlinear vocal phenomena and speech intelligibility}},
  url          = {{http://dx.doi.org/10.1098/rstb.2024.0254}},
  doi          = {{10.1098/rstb.2024.0254}},
  volume       = {{380}},
  year         = {{2025}},
}

Lund University Publications