Many Labs 2: Investigating variation in replicability across samples and settings

Innes-Ker, Åse; Klein, Richard A.; Vianello, Michelangelo; Hasselman, Fred; Adams, Byron G.; Adams, Reginald B. and Alper, Sinan (2018) In Advances in Methods and Practices in Psychological Science 1(4). p.443-490
Abstract
We conducted preregistered replications of 28 classic and contemporary published findings, with protocols that were peer reviewed in advance, to examine variation in effect magnitudes across samples and settings. Each protocol was administered to approximately half of 125 samples that comprised 15,305 participants from 36 countries and territories. Using the conventional criterion of statistical significance (p < .05), we found that 15 (54%) of the replications provided evidence of a statistically significant effect in the same direction as the original finding. With a strict significance criterion (p < .0001), 14 (50%) of the replications still provided such evidence, a reflection of the extremely high-powered design. Seven (25%) of the replications yielded effect sizes larger than the original ones, and 21 (75%) yielded effect sizes smaller than the original ones. The median comparable Cohen’s ds were 0.60 for the original findings and 0.15 for the replications. The effect sizes were small (< 0.20) in 16 of the replications (57%), and 9 effects (32%) were in the direction opposite the direction of the original effect. Across settings, the Q statistic indicated significant heterogeneity in 11 (39%) of the replication effects, and most of those were among the findings with the largest overall effect sizes; only 1 effect that was near zero in the aggregate showed significant heterogeneity according to this measure. Only 1 effect had a tau value greater than .20, an indication of moderate heterogeneity. Eight others had tau values near or slightly above .10, an indication of slight heterogeneity. Moderation tests indicated that very little heterogeneity was attributable to the order in which the tasks were performed or whether the tasks were administered in lab versus online. 
Exploratory comparisons revealed little heterogeneity between Western, educated, industrialized, rich, and democratic (WEIRD) cultures and less WEIRD cultures (i.e., cultures with relatively high and low WEIRDness scores, respectively). Cumulatively, variability in the observed effect sizes was attributable more to the effect being studied than to the sample or setting in which it was studied.
type
Contribution to journal
publication status
published
keywords
social psychology, cognitive psychology, replication, culture, individual differences, sampling effects, situational effects, meta-analysis, Registered Report, open data, open materials, preregistered
in
Advances in Methods and Practices in Psychological Science
volume
1
issue
4
pages
443 - 490
publisher
SAGE Publications Ltd
ISSN
2515-2459
DOI
10.1177/2515245918810225
language
English
LU publication?
yes
id
d0c82860-1c03-4182-9a6f-f385e21458c6
date added to LUP
2019-04-08 08:40:41
date last changed
2019-04-08 09:26:13
@article{d0c82860-1c03-4182-9a6f-f385e21458c6,
  abstract     = {We conducted preregistered replications of 28 classic and contemporary published findings, with protocols that were peer reviewed in advance, to examine variation in effect magnitudes across samples and settings. Each protocol was administered to approximately half of 125 samples that comprised 15,305 participants from 36 countries and territories. Using the conventional criterion of statistical significance (p $<$ .05), we found that 15 (54\%) of the replications provided evidence of a statistically significant effect in the same direction as the original finding. With a strict significance criterion (p $<$ .0001), 14 (50\%) of the replications still provided such evidence, a reflection of the extremely high-powered design. Seven (25\%) of the replications yielded effect sizes larger than the original ones, and 21 (75\%) yielded effect sizes smaller than the original ones. The median comparable Cohen’s ds were 0.60 for the original findings and 0.15 for the replications. The effect sizes were small ($<$ 0.20) in 16 of the replications (57\%), and 9 effects (32\%) were in the direction opposite the direction of the original effect. Across settings, the Q statistic indicated significant heterogeneity in 11 (39\%) of the replication effects, and most of those were among the findings with the largest overall effect sizes; only 1 effect that was near zero in the aggregate showed significant heterogeneity according to this measure. Only 1 effect had a tau value greater than .20, an indication of moderate heterogeneity. Eight others had tau values near or slightly above .10, an indication of slight heterogeneity. Moderation tests indicated that very little heterogeneity was attributable to the order in which the tasks were performed or whether the tasks were administered in lab versus online.
Exploratory comparisons revealed little heterogeneity between Western, educated, industrialized, rich, and democratic (WEIRD) cultures and less WEIRD cultures (i.e., cultures with relatively high and low WEIRDness scores, respectively). Cumulatively, variability in the observed effect sizes was attributable more to the effect being studied than to the sample or setting in which it was studied.},
  author       = {Innes-Ker, Åse and Klein, Richard A. and Vianello, Michelangelo and Hasselman, Fred and Adams, Byron G. and Adams, Reginald B. and Alper, Sinan},
  issn         = {2515-2459},
  keyword      = {social psychology,cognitive psychology,replication,culture,individual differences,sampling effects,situational effects,meta-analysis,Registered Report,open data,open materials,preregistered},
  language     = {eng},
  month        = {12},
  number       = {4},
  pages        = {443--490},
  publisher    = {SAGE Publications Ltd},
  journal      = {Advances in Methods and Practices in Psychological Science},
  title        = {Many Labs 2: Investigating variation in replicability across samples and settings},
  url          = {http://dx.doi.org/10.1177/2515245918810225},
  volume       = {1},
  year         = {2018},
}