Lund University Publications

Multiple imputation in veterinary epidemiological studies : A case study and simulation

Dohoo, Ian R. ; Nielsen, Christel R. and Emanuelson, Ulf (2016) In Preventive Veterinary Medicine 129. p.35-47
Abstract

The problem of missing data occurs frequently in veterinary epidemiological studies. Most studies use a complete case (CC) analysis which excludes all observations for which any relevant variable has missing values. Alternative approaches (most notably multiple imputation (MI)) which avoid the exclusion of observations with missing values are now widely available but have been used very little in veterinary epidemiology. This paper uses a case study based on research into dairy producers' attitudes toward mastitis control procedures, combined with two simulation studies to evaluate the use of MI and compare results with a CC analysis. MI analysis of the original data produced results which had relatively minor differences from the CC analysis. However, most of the missing data in the original data set were in the dependent variable and a subsequent simulation study based on the observed missing data pattern and 1000 simulations showed that an MI analysis would not be expected to offer any advantages over a CC analysis in this situation. This was true regardless of the missing data mechanism (MCAR - missing completely at random, MAR - missing at random, or NMAR - not missing at random) underlying the missing values. Surprisingly, recent textbooks dealing with MI make little reference to this limitation of MI for dealing with missing values in the dependent variable. An additional simulation study (1000 runs for each of the three missing data mechanisms) compared MI and CC analyses for data in which varying levels (n = 7) of missing data were created in predictor variables. This study showed that MI analyses generally produced results that were less biased on average, were more precise (smaller SEs), were more consistent (less variability between simulation runs) and consequently were more likely to produce estimates that were close to the "truth" (results obtained from a data set with no missing values).
While the benefit of MI varied with the mechanism used to generate the missing data, MI always performed as well as, or better than, CC analysis.
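
The three missing-data mechanisms named in the abstract can be illustrated with a small toy simulation. The sketch below is not the authors' simulation and contains no MI step: it only deletes values of a dependent variable under MCAR, MAR and NMAR and fits a complete-case regression. The linear model, the coefficient values, the missingness probabilities and all function names are assumptions made for illustration.

```python
import math
import random

TRUE_SLOPE = 0.5  # illustrative value, not taken from the paper


def simulate(n, seed):
    """Draw n (x, y) pairs from y = 2 + TRUE_SLOPE * x + standard normal noise."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        pairs.append((x, 2.0 + TRUE_SLOPE * x + rng.gauss(0, 1)))
    return pairs


def complete_cases(pairs, mechanism, seed):
    """Delete y under the given mechanism and return the rows that remain."""
    rng = random.Random(seed)
    kept = []
    for x, y in pairs:
        if mechanism == "MCAR":
            p_missing = 0.3  # unrelated to any variable
        elif mechanism == "MAR":
            p_missing = 1 / (1 + math.exp(-(x - 1)))  # depends on observed x
        elif mechanism == "NMAR":
            p_missing = 1 / (1 + math.exp(-(y - 3)))  # depends on y itself
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        if rng.random() >= p_missing:
            kept.append((x, y))
    return kept


def ols_slope(pairs):
    """Least-squares slope of y on x."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    return sxy / sxx


if __name__ == "__main__":
    full = simulate(50000, seed=1)
    print("full data", round(ols_slope(full), 3))
    for mech in ("MCAR", "MAR", "NMAR"):
        cc = complete_cases(full, mech, seed=2)
        print(mech, len(cc), round(ols_slope(cc), 3))
```

In this toy setup the complete-case slope should stay close to the true value under MCAR and MAR and be pulled toward zero under NMAR. The sketch says nothing about how MI would compare with CC; that comparison is the subject of the paper's own simulations.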

author
Dohoo, Ian R. ; Nielsen, Christel R. and Emanuelson, Ulf
publishing date
2016-07
type
Contribution to journal
publication status
published
subject
keywords
"Dependent variable", "Multiple imputation", MAR, MCAR, NMAR, Questionnaire, Simulation
in
Preventive Veterinary Medicine
volume
129
pages
13 pages
publisher
Elsevier
external identifiers
  • scopus:84969856003
  • pmid:27317321
  • wos:000378967500005
ISSN
0167-5877
DOI
10.1016/j.prevetmed.2016.04.003
language
English
LU publication?
yes
id
752c9590-423e-4c38-b4fd-605d1b279295
date added to LUP
2017-01-18 15:28:01
date last changed
2024-02-03 08:46:52
@article{752c9590-423e-4c38-b4fd-605d1b279295,
  abstract     = {{The problem of missing data occurs frequently in veterinary epidemiological studies. Most studies use a complete case (CC) analysis which excludes all observations for which any relevant variable has missing values. Alternative approaches (most notably multiple imputation (MI)) which avoid the exclusion of observations with missing values are now widely available but have been used very little in veterinary epidemiology. This paper uses a case study based on research into dairy producers' attitudes toward mastitis control procedures, combined with two simulation studies to evaluate the use of MI and compare results with a CC analysis. MI analysis of the original data produced results which had relatively minor differences from the CC analysis. However, most of the missing data in the original data set were in the dependent variable and a subsequent simulation study based on the observed missing data pattern and 1000 simulations showed that an MI analysis would not be expected to offer any advantages over a CC analysis in this situation. This was true regardless of the missing data mechanism (MCAR - missing completely at random, MAR - missing at random, or NMAR - not missing at random) underlying the missing values. Surprisingly, recent textbooks dealing with MI make little reference to this limitation of MI for dealing with missing values in the dependent variable. An additional simulation study (1000 runs for each of the three missing data mechanisms) compared MI and CC analyses for data in which varying levels (n = 7) of missing data were created in predictor variables. This study showed that MI analyses generally produced results that were less biased on average, were more precise (smaller SEs), were more consistent (less variability between simulation runs) and consequently were more likely to produce estimates that were close to the "truth" (results obtained from a data set with no missing values).
While the benefit of MI varied with the mechanism used to generate the missing data, MI always performed as well as, or better than, CC analysis.}},
  author       = {{Dohoo, Ian R. and Nielsen, Christel R. and Emanuelson, Ulf}},
  issn         = {{0167-5877}},
  keywords     = {{"Dependent variable"; "Multiple imputation"; MAR; MCAR; NMAR; Questionnaire; Simulation}},
  language     = {{eng}},
  month        = {{07}},
  pages        = {{35--47}},
  publisher    = {{Elsevier}},
  journal      = {{Preventive Veterinary Medicine}},
  title        = {{Multiple imputation in veterinary epidemiological studies : A case study and simulation}},
  url          = {{http://dx.doi.org/10.1016/j.prevetmed.2016.04.003}},
  doi          = {{10.1016/j.prevetmed.2016.04.003}},
  volume       = {{129}},
  year         = {{2016}},
}