Lund University Publications

Nonlinear vocal phenomena and speech intelligibility

Anikin, Andrey; Reby, David and Pisanski, Katarzyna (2025) In Philosophical Transactions of the Royal Society B: Biological Sciences
Abstract
At some point in our evolutionary history, humans lost vocal membranes and air sacs, representing an unexpected simplification of the vocal apparatus relative to other great apes. One hypothesis is that these simplifications represent anatomical adaptations for speech because a simpler larynx provides a suitably stable and tonal vocal source with fewer nonlinear vocal phenomena (NLP). The key assumption that NLP reduce speech intelligibility is indirectly supported by studies of dysphonia, but it has not been experimentally tested. Here, we manipulate NLP in vocal stimuli ranging from single vowels to sentences, showing that the vocal source needs to be stable, but not necessarily tonal, for speech to be readily understood. When the task is to discriminate synthesized monophthong and diphthong vowels, continuous NLP (subharmonics, amplitude modulation and even deterministic chaos) actually improve vowel perception in high-pitched voices, likely because the resulting dense spectrum reveals formant transitions. Rough-sounding voices also remain highly intelligible when continuous NLP are added to recorded words and sentences. In contrast, voicing interruptions and pitch jumps dramatically reduce speech intelligibility, likely by interfering with voicing contrasts and normal intonation. We argue that NLP were not eliminated from the human vocal repertoire as we evolved for speech, but only brought under better control.

This article is part of the theme issue ‘Nonlinear phenomena in vertebrate vocalizations: mechanisms and communicative functions’.
author
Anikin, Andrey; Reby, David and Pisanski, Katarzyna
organization
publishing date
2025
type
Contribution to journal
publication status
published
subject
keywords
speech, evolution of language, nonlinear vocal phenomena, voice, formant frequencies, vocal membranes
in
Philosophical Transactions of the Royal Society B: Biological Sciences
publisher
Royal Society Publishing
external identifiers
  • pmid:40176514
  • scopus:105001863545
ISSN
1471-2970
DOI
10.1098/rstb.2024.0254
language
English
LU publication?
yes
id
21be370a-bbd5-4df6-9ee9-072a2466062d
date added to LUP
2025-04-03 18:42:14
date last changed
2025-06-18 04:00:56
@article{21be370a-bbd5-4df6-9ee9-072a2466062d,
  abstract     = {{At some point in our evolutionary history, humans lost vocal membranes and air sacs, representing an unexpected simplification of the vocal apparatus relative to other great apes. One hypothesis is that these simplifications represent anatomical adaptations for speech because a simpler larynx provides a suitably stable and tonal vocal source with fewer nonlinear vocal phenomena (NLP). The key assumption that NLP reduce speech intelligibility is indirectly supported by studies of dysphonia, but it has not been experimentally tested. Here, we manipulate NLP in vocal stimuli ranging from single vowels to sentences, showing that the vocal source needs to be stable, but not necessarily tonal, for speech to be readily understood. When the task is to discriminate synthesized monophthong and diphthong vowels, continuous NLP (subharmonics, amplitude modulation and even deterministic chaos) actually improve vowel perception in high-pitched voices, likely because the resulting dense spectrum reveals formant transitions. Rough-sounding voices also remain highly intelligible when continuous NLP are added to recorded words and sentences. In contrast, voicing interruptions and pitch jumps dramatically reduce speech intelligibility, likely by interfering with voicing contrasts and normal intonation. We argue that NLP were not eliminated from the human vocal repertoire as we evolved for speech, but only brought under better control. This article is part of the theme issue ‘Nonlinear phenomena in vertebrate vocalizations: mechanisms and communicative functions’.}},
  author       = {{Anikin, Andrey and Reby, David and Pisanski, Katarzyna}},
  issn         = {{1471-2970}},
  keywords     = {{speech; evolution of language; nonlinear vocal phenomena; voice; formant frequencies; vocal membranes}},
  language     = {{eng}},
  publisher    = {{Royal Society Publishing}},
  journal      = {{Philosophical Transactions of the Royal Society B: Biological Sciences}},
  title        = {{Nonlinear vocal phenomena and speech intelligibility}},
  url          = {{http://dx.doi.org/10.1098/rstb.2024.0254}},
  doi          = {{10.1098/rstb.2024.0254}},
  year         = {{2025}},
}