Lund University Publications

LUND UNIVERSITY LIBRARIES

Exploring the Guessing-Game Experimental Paradigm : Inferences From Closed- Versus Open-Ended Semantic Space

Kuleshova, Svetlana ; Ćwiek, Aleksandra ; Hartmann, Stefan ; Pleyer, Michael ; Sibierska, Marta ; Placiński, Marek ; Blomberg, Johan LU ; Żywiczyński, Przemysław and Wacewicz, Sławomir (2026) In Cognitive Science 50(3).
Abstract

How we measure success in signal comprehension experiments fundamentally shapes our conclusions. Two recent studies have demonstrated that humans can guess the meanings of novel vocalizations and ape gestures above chance when selecting from limited alternatives. We replicated both experiments using open-ended responses instead of multiple choice. For the vocalization data, where participants provided single-word or short-phrase responses, we systematically compared three evaluation methods applied to the same responses: exact matching, graded similarity ratings, and computational semantic similarity. For the gesture data, we applied graded similarity ratings. Each evaluation method revealed a different semantic landscape. Participants’ success was very low when measured by exact matching, moderate by similarity ratings, and substantially greater by computational measures, which capture broader thematic connections. Despite these differences, a consistent pattern emerged across both datasets and all evaluation methods: success was determined primarily by properties of the signals (their semantic category and degree of transparency) rather than individual participant abilities. Participants often reliably distinguished broad categories (actions vs. objects, animals vs. artifacts) but rarely identified specific concepts, and these distinct patterns only became visible through a combination of evaluation methods. In sum, our results partly align with the original studies yet also diverge in ways conducive to different conclusions about naïve humans’ ability to understand novel vocalizations or ape gestures. We show that closed- versus open-ended response formats, and different evaluation scales, function as complementary research tools rather than competing approaches. Each reveals different aspects of how humans navigate semantic space when interpreting novel signals. Experimental and evaluation designs are, therefore, not a technical detail but a theoretical choice about which semantic relationships we seek to expose.

author
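Kuleshova, Svetlana ; Ćwiek, Aleksandra ; Hartmann, Stefan ; Pleyer, Michael ; Sibierska, Marta ; Placiński, Marek ; Blomberg, Johan LU ; Żywiczyński, Przemysław and Wacewicz, Sławomir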
type
Contribution to journal
publication status
published
subject
keywords
Bayesian hierarchical modeling, Conceptual replication, Ecological validity, Experimental semiotics, Semantic space, Understanding
in
Cognitive Science
volume
50
issue
3
article number
e70199
publisher
Wiley-Blackwell
external identifiers
  • scopus:105033984609
  • pmid:41870092
ISSN
0364-0213
DOI
10.1111/cogs.70199
language
English
LU publication?
yes
id
27348df7-880a-49bf-89ea-c9f38d9eea78
date added to LUP
2026-05-21 15:58:52
date last changed
2026-06-04 16:56:00
@article{27348df7-880a-49bf-89ea-c9f38d9eea78,
  abstract     = {{How we measure success in signal comprehension experiments fundamentally shapes our conclusions. Two recent studies have demonstrated that humans can guess the meanings of novel vocalizations and ape gestures above chance when selecting from limited alternatives. We replicated both experiments using open-ended responses instead of multiple choice. For the vocalization data, where participants provided single-word or short-phrase responses, we systematically compared three evaluation methods applied to the same responses: exact matching, graded similarity ratings, and computational semantic similarity. For the gesture data, we applied graded similarity ratings. Each evaluation method revealed a different semantic landscape. Participants’ success was very low when measured by exact matching, moderate by similarity ratings, and substantially greater by computational measures, which capture broader thematic connections. Despite these differences, a consistent pattern emerged across both datasets and all evaluation methods: success was determined primarily by properties of the signals (their semantic category and degree of transparency) rather than individual participant abilities. Participants often reliably distinguished broad categories (actions vs. objects, animals vs. artifacts) but rarely identified specific concepts, and these distinct patterns only became visible through a combination of evaluation methods. In sum, our results partly align with the original studies yet also diverge in ways conducive to different conclusions about naïve humans’ ability to understand novel vocalizations or ape gestures. We show that closed- versus open-ended response formats, and different evaluation scales, function as complementary research tools rather than competing approaches. Each reveals different aspects of how humans navigate semantic space when interpreting novel signals. Experimental and evaluation designs are, therefore, not a technical detail but a theoretical choice about which semantic relationships we seek to expose.}},
  author       = {{Kuleshova, Svetlana and Ćwiek, Aleksandra and Hartmann, Stefan and Pleyer, Michael and Sibierska, Marta and Placiński, Marek and Blomberg, Johan and Żywiczyński, Przemysław and Wacewicz, Sławomir}},
  issn         = {{0364-0213}},
  keywords     = {{Bayesian hierarchical modeling; Conceptual replication; Ecological validity; Experimental semiotics; Semantic space; Understanding}},
  language     = {{eng}},
  number       = {{3}},
  publisher    = {{Wiley-Blackwell}},
  journal      = {{Cognitive Science}},
  title        = {{Exploring the Guessing-Game Experimental Paradigm : Inferences From Closed- Versus Open-Ended Semantic Space}},
  url          = {{http://dx.doi.org/10.1111/cogs.70199}},
  doi          = {{10.1111/cogs.70199}},
  volume       = {{50}},
  year         = {{2026}},
}