Replication Data for: Learning to predict - second language perception of reduced multi-word sequences

Lorenz, David; Tizon-Couto, David

Lorenz, David (LU) and Tizon-Couto, David (2024)

Abstract: This is the data and code from a word-monitoring task, in which advanced learners of English responded to the word 'to' in verb + to-infinitive structures (V-to-Vinf) in English, where 'to' could occur in a full or reduced pronunciation (e.g. "prefer to" [tʊ] or "preferda" [ɾə]). The design of this experiment is replicated from our earlier study with American English native speakers (Lorenz & Tizón-Couto, 2019, see link to paper and dataset below *).
We tested the effects of string frequency (V+to) and transitional probability (of 'to' given the V) on the accuracy and speed of recognition of "to" in spoken sentences. These effects were analysed with mixed-effects generalized additive models (GAMM); the code also includes visualisations of these models.
The experiment was run with OpenSesame (version 3.2.6 for Mac, see Mathôt et al. 2012). The data include information on frequencies of occurrence of words and bigrams; this was extracted from the Corpus of Contemporary American English (COCA, Davies 2008–). We used R (version 4.3.1, R Core Team 2023) for all data analyses, hence the code can best be replicated in R.
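
As a rough illustration of the two predictors named above, the transitional probability of 'to' given the verb can be estimated as the count of the bigram (V+to) divided by the count of the verb, while string frequency is the bigram count itself. The sketch below is in Python for illustration only; it is not the authors' R code, the function name is hypothetical, and the counts are invented rather than taken from COCA or the dataset.

```python
def transitional_probability(bigram_count: int, first_word_count: int) -> float:
    """Estimate P(second word | first word) as count(w1 w2) / count(w1)."""
    if first_word_count <= 0:
        raise ValueError("first_word_count must be positive")
    if not 0 <= bigram_count <= first_word_count:
        raise ValueError("bigram_count must lie between 0 and first_word_count")
    return bigram_count / first_word_count


# Hypothetical counts for illustration only (NOT real corpus figures).
# "verb" is the count of the verb, "verb_to" the count of the string V+to.
counts = {
    "prefer": {"verb": 20000, "verb_to": 5000},
    "want": {"verb": 400000, "verb_to": 300000},
}

# Transitional probability of 'to' given each verb.
tp = {
    verb: transitional_probability(c["verb_to"], c["verb"])
    for verb, c in counts.items()
}

for verb, value in tp.items():
    print(f"{verb} + to: string frequency = {counts[verb]['verb_to']}, TP = {value:.2f}")
```

With these invented counts, a verb can have a high string frequency and a high transitional probability at once, or differ on the two measures, which is why both can be entered as separate predictors.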

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/ccac51fe-d3f0-4a18-a58e-d232ba99ac5d

author

Lorenz, David (LU) and Tizon-Couto, David

organization

English Studies

publishing date

2024-05-14

type

Other contribution

publication status

published

subject

Studies of Specific Languages

keywords

Data set, data analysis, response times, Generalized Additive Models, word-monitoring task

DOI

10.18710/TE5ZOG

project

Cognitive representation of multi-word sequences in English: Converging evidence

language

English

LU publication?

yes

additional info

This dataset may be reused according to the CLARIN PUB+BY+NC+LRT license. The license terms are described under the Dataset Terms tab: https://dataverse.no/dataset.xhtml?persistentId=doi:10.18710/TE5ZOG&version=1.0&selectTab=termsTab

id

ccac51fe-d3f0-4a18-a58e-d232ba99ac5d

date added to LUP

2024-06-05 16:48:19

date last changed

2025-04-04 15:19:38

@misc{ccac51fe-d3f0-4a18-a58e-d232ba99ac5d,
  abstract     = {{This is the data and code from a word-monitoring task, in which advanced learners of English responded to the word 'to' in verb + to-infinitive structures (V-to-Vinf) in English, where 'to' could occur in a full or reduced pronunciation (e.g. "prefer to" [tʊ] or "preferda" [ɾə]). The design of this experiment is replicated from our earlier study with American English native speakers (Lorenz \& Tizón-Couto, 2019, see link to paper and dataset below *). We tested the effects of string frequency (V+to) and transitional probability (of 'to' given the V) on the accuracy and speed of recognition of "to" in spoken sentences. These effects were analysed with mixed-effects generalized additive models (GAMM); the code also includes visualisations of these models. The experiment was run with OpenSesame (version 3.2.6 for Mac, see Mathôt et al. 2012). The data include information on frequencies of occurrence of words and bigrams; this was extracted from the Corpus of Contemporary American English (COCA, Davies 2008–). We used R (version 4.3.1, R Core Team 2023) for all data analyses, hence the code can best be replicated in R.}},
  author       = {{Lorenz, David and Tizon-Couto, David}},
  keywords     = {{Data set; data analysis; response times; Generalized Additive Models; word-monitoring task}},
  language     = {{eng}},
  month        = {{05}},
  title        = {{Replication Data for: Learning to predict - second language perception of reduced multi-word sequences}},
  url          = {{http://dx.doi.org/10.18710/TE5ZOG}},
  doi          = {{10.18710/TE5ZOG}},
  year         = {{2024}},
}

Lund University Publications

LUND UNIVERSITY LIBRARIES