Lund University Publications

Detecting potential outliers in longitudinal data with time-dependent covariates

Mramba, Lazarus K. ; Liu, Xiang ; Lynch, Kristian F. LU ; Yang, Jimin ; Aronsson, Carin Andrén LU ; Hummel, Sandra ; Norris, Jill M. ; Virtanen, Suvi M. ; Hakola, Leena and Uusitalo, Ulla M., et al. (2024) In European Journal of Clinical Nutrition
Abstract

Background: Outliers can influence regression model parameters and change the direction of the estimated effect, over-estimating or under-estimating the strength of the association between a response variable and an exposure of interest. Identifying visit-level outliers in longitudinal data with continuous time-dependent covariates is important when the distribution of such a variable is highly skewed. Objectives: The primary objective was to identify potential outliers at follow-up visits using the interquartile range (IQR) statistic and to assess their influence on estimated Cox regression parameters. Methods: The study was motivated by a large TEDDY dietary longitudinal and time-to-event dataset with continuous time-varying vitamin B12 intake as the exposure of interest and development of Islet Autoimmunity (IA) as the response variable. An IQR algorithm was applied to the TEDDY dataset to detect potential outliers at each visit. To assess the impact of detected outliers, data were analyzed using the extended time-dependent Cox model with a robust sandwich estimator. Partial residual diagnostic plots were examined for highly influential outliers. Results: Extreme vitamin B12 observations that were cases of IA had a stronger influence on the Cox regression model than non-cases. Identified outliers changed the direction of hazard ratios, standard errors, or the strength of association with the risk of developing IA. Conclusion: At the exploratory data analysis stage, the IQR algorithm can be used as a data quality control tool to identify potential outliers at the visit level, which can be further investigated.
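
The visit-level IQR screening described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the abstract does not give the fence multiplier, the quantile method, or any data transformation, so the sketch assumes the conventional Tukey fences (1.5 × IQR) computed separately within each visit. The function name `flag_visit_outliers` and the record layout `(subject_id, visit, value)` are hypothetical.

```python
from collections import defaultdict
from statistics import quantiles


def flag_visit_outliers(records, k=1.5):
    """Flag (subject_id, visit) pairs whose value lies outside the
    per-visit fences [Q1 - k*IQR, Q3 + k*IQR].

    Assumptions (not taken from the abstract): k = 1.5 (Tukey's
    convention) and quartiles from statistics.quantiles with the
    "inclusive" method.
    """
    # Group the observed values by visit so each visit gets its own fences.
    by_visit = defaultdict(list)
    for subject_id, visit, value in records:
        by_visit[visit].append((subject_id, value))

    flagged = set()
    for visit, rows in by_visit.items():
        values = [value for _, value in rows]
        if len(values) < 4:
            # Too few observations for meaningful quartiles; skip this visit.
            continue
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        lower, upper = q1 - k * iqr, q3 + k * iqr
        for subject_id, value in rows:
            if value < lower or value > upper:
                flagged.add((subject_id, visit))
    return flagged
```

Flagged observations are candidates for further review rather than automatic exclusions, in line with the abstract's framing of the IQR step as a data quality control tool.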

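author
Mramba, Lazarus K. ; Liu, Xiang ; Lynch, Kristian F. ; Yang, Jimin ; Aronsson, Carin Andrén ; Hummel, Sandra ; Norris, Jill M. ; Virtanen, Suvi M. ; Hakola, Leena ; Uusitalo, Ulla M. and Krischer, Jeffrey P.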
publishing date
2024
type
Contribution to journal
publication status
epub
in
European Journal of Clinical Nutrition
publisher
Nature Publishing Group
external identifiers
  • pmid:38172348
  • scopus:85181259126
ISSN
0954-3007
DOI
10.1038/s41430-023-01393-6
language
English
LU publication?
yes
id
d61ff6b6-bee5-45ef-ba05-a01939c9c709
date added to LUP
2024-02-06 15:29:22
date last changed
2024-04-22 22:32:56
@article{d61ff6b6-bee5-45ef-ba05-a01939c9c709,
  abstract     = {{Background: Outliers can influence regression model parameters and change the direction of the estimated effect, over-estimating or under-estimating the strength of the association between a response variable and an exposure of interest. Identifying visit-level outliers in longitudinal data with continuous time-dependent covariates is important when the distribution of such a variable is highly skewed. Objectives: The primary objective was to identify potential outliers at follow-up visits using the interquartile range (IQR) statistic and to assess their influence on estimated Cox regression parameters. Methods: The study was motivated by a large TEDDY dietary longitudinal and time-to-event dataset with continuous time-varying vitamin B$_{12}$ intake as the exposure of interest and development of Islet Autoimmunity (IA) as the response variable. An IQR algorithm was applied to the TEDDY dataset to detect potential outliers at each visit. To assess the impact of detected outliers, data were analyzed using the extended time-dependent Cox model with a robust sandwich estimator. Partial residual diagnostic plots were examined for highly influential outliers. Results: Extreme vitamin B$_{12}$ observations that were cases of IA had a stronger influence on the Cox regression model than non-cases. Identified outliers changed the direction of hazard ratios, standard errors, or the strength of association with the risk of developing IA. Conclusion: At the exploratory data analysis stage, the IQR algorithm can be used as a data quality control tool to identify potential outliers at the visit level, which can be further investigated.}},
  author       = {{Mramba, Lazarus K. and Liu, Xiang and Lynch, Kristian F. and Yang, Jimin and Aronsson, Carin Andrén and Hummel, Sandra and Norris, Jill M. and Virtanen, Suvi M. and Hakola, Leena and Uusitalo, Ulla M. and Krischer, Jeffrey P.}},
  issn         = {{0954-3007}},
  language     = {{eng}},
  publisher    = {{Nature Publishing Group}},
  journal      = {{European Journal of Clinical Nutrition}},
  title        = {{Detecting potential outliers in longitudinal data with time-dependent covariates}},
  url          = {{http://dx.doi.org/10.1038/s41430-023-01393-6}},
  doi          = {{10.1038/s41430-023-01393-6}},
  year         = {{2024}},
}