Lund University Publications

Nutritional markers of undiagnosed type 2 diabetes in adults : Findings of a machine learning analysis with external validation and benchmarking

De Silva, Kushan ; Lim, Siew ; Mousa, Aya ; Teede, Helena ; Forbes, Andrew ; Demmer, Ryan T ; Jönsson, Daniel LU and Enticott, Joanne (2021) In PLoS ONE 16(5). p.1-21
Abstract

OBJECTIVES: Using a nationally-representative, cross-sectional cohort, we examined nutritional markers of undiagnosed type 2 diabetes in adults via machine learning.

METHODS: A total of 16429 men and non-pregnant women ≥ 20 years of age were analysed from five consecutive cycles of the National Health and Nutrition Examination Survey. Cohorts from years 2013-2016 (n = 6673) were used for external validation. Undiagnosed type 2 diabetes was determined by a negative response to the question "Have you ever been told by a doctor that you have diabetes?" and a positive glycaemic response to one or more of the three diagnostic tests (HbA1c > 6.4% or FPG > 125 mg/dl or 2-hr post-OGTT glucose > 200 mg/dl). Following a comprehensive literature search, 114 potential nutritional markers were modelled with 13 behavioural and 12 socio-economic variables. We tested three machine learning algorithms on original and resampled training datasets built using three resampling methods. The resulting 12 predictive models were validated on internal and external validation cohorts. Magnitudes of associations were gauged through odds ratios in logistic models and variable importance in others. Models were benchmarked against the ADA diabetes risk test.

RESULTS: The prevalence of undiagnosed type 2 diabetes was 5.26%. Four best-performing models (AUROC range: 74.9%-75.7%) classified 39 markers of undiagnosed type 2 diabetes; 28 via one or more of the three best-performing non-linear/ensemble models and 11 uniquely by the logistic model. They comprised 14 nutrient-based, 12 anthropometry-based, 9 socio-behavioural, and 4 diet-associated markers. The AUROCs of all models were on a par with the ADA diabetes risk test on both internal and external validation cohorts (p > 0.05).

CONCLUSIONS: Models performed comparably to the chosen benchmark. Novel behavioural markers such as the number of meals not prepared at home were revealed. This approach may be useful in nutritional epidemiology to unravel new associations with type 2 diabetes.
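
The case definition given under METHODS can be written as a small rule. The following is a minimal Python sketch of that rule only, not the authors' code: the function name, parameter names, and the choice to treat a missing test result as "not positive" are illustrative assumptions that the abstract does not specify.

```python
from typing import Optional


def is_undiagnosed_t2d(
    told_has_diabetes: bool,
    hba1c_pct: Optional[float] = None,
    fpg_mg_dl: Optional[float] = None,
    ogtt_2h_mg_dl: Optional[float] = None,
) -> bool:
    """Apply the abstract's definition of undiagnosed type 2 diabetes.

    A participant counts as undiagnosed when they answer "no" to having
    been told by a doctor that they have diabetes AND at least one of the
    three glycaemic tests exceeds its threshold.
    """
    # A prior diagnosis rules out the "undiagnosed" label.
    if told_has_diabetes:
        return False

    # Thresholds are taken from the abstract. Treating a missing value
    # (None) as a non-positive test is an assumption of this sketch.
    positive_tests = [
        hba1c_pct is not None and hba1c_pct > 6.4,
        fpg_mg_dl is not None and fpg_mg_dl > 125,
        ogtt_2h_mg_dl is not None and ogtt_2h_mg_dl > 200,
    ]
    return any(positive_tests)
```

For example, `is_undiagnosed_t2d(False, hba1c_pct=6.5)` is true under this rule, while values exactly at the thresholds (6.4%, 125 mg/dl, 200 mg/dl) are not counted as positive because the inequalities are strict.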

author
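De Silva, Kushan ; Lim, Siew ; Mousa, Aya ; Teede, Helena ; Forbes, Andrew ; Demmer, Ryan T ; Jönsson, Daniel and Enticott, Joanne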
publishing date
2021
type
Contribution to journal
publication status
published
keywords
Adult, Algorithms, Benchmarking/methods, Biomarkers/metabolism, Blood Glucose/metabolism, Cohort Studies, Cross-Sectional Studies, Diabetes Mellitus, Type 2/metabolism, Female, Glycated Hemoglobin/metabolism, Humans, Logistic Models, Machine Learning, Male, Middle Aged, Nutrition Surveys, Prediabetic State/metabolism
in
PLoS ONE
volume
16
issue
5
article number
e0250832
pages
1 - 21
publisher
Public Library of Science (PLoS)
external identifiers
  • pmid:33951067
  • scopus:85105302917
ISSN
1932-6203
DOI
10.1371/journal.pone.0250832
language
English
LU publication?
no
id
bfbf818a-b8c5-4e36-a49a-e2bf2a0c3683
date added to LUP
2024-07-04 10:41:39
date last changed
2024-07-05 04:02:16
@article{bfbf818a-b8c5-4e36-a49a-e2bf2a0c3683,
  abstract     = {{OBJECTIVES: Using a nationally-representative, cross-sectional cohort, we examined nutritional markers of undiagnosed type 2 diabetes in adults via machine learning. METHODS: A total of 16429 men and non-pregnant women ≥ 20 years of age were analysed from five consecutive cycles of the National Health and Nutrition Examination Survey. Cohorts from years 2013-2016 (n = 6673) were used for external validation. Undiagnosed type 2 diabetes was determined by a negative response to the question "Have you ever been told by a doctor that you have diabetes?" and a positive glycaemic response to one or more of the three diagnostic tests (HbA1c > 6.4% or FPG > 125 mg/dl or 2-hr post-OGTT glucose > 200 mg/dl). Following a comprehensive literature search, 114 potential nutritional markers were modelled with 13 behavioural and 12 socio-economic variables. We tested three machine learning algorithms on original and resampled training datasets built using three resampling methods. The resulting 12 predictive models were validated on internal and external validation cohorts. Magnitudes of associations were gauged through odds ratios in logistic models and variable importance in others. Models were benchmarked against the ADA diabetes risk test. RESULTS: The prevalence of undiagnosed type 2 diabetes was 5.26%. Four best-performing models (AUROC range: 74.9%-75.7%) classified 39 markers of undiagnosed type 2 diabetes; 28 via one or more of the three best-performing non-linear/ensemble models and 11 uniquely by the logistic model. They comprised 14 nutrient-based, 12 anthropometry-based, 9 socio-behavioural, and 4 diet-associated markers. The AUROCs of all models were on a par with the ADA diabetes risk test on both internal and external validation cohorts (p > 0.05). CONCLUSIONS: Models performed comparably to the chosen benchmark. Novel behavioural markers such as the number of meals not prepared at home were revealed. This approach may be useful in nutritional epidemiology to unravel new associations with type 2 diabetes.}},
  author       = {{De Silva, Kushan and Lim, Siew and Mousa, Aya and Teede, Helena and Forbes, Andrew and Demmer, Ryan T and Jönsson, Daniel and Enticott, Joanne}},
  issn         = {{1932-6203}},
  keywords     = {{Adult; Algorithms; Benchmarking/methods; Biomarkers/metabolism; Blood Glucose/metabolism; Cohort Studies; Cross-Sectional Studies; Diabetes Mellitus, Type 2/metabolism; Female; Glycated Hemoglobin/metabolism; Humans; Logistic Models; Machine Learning; Male; Middle Aged; Nutrition Surveys; Prediabetic State/metabolism}},
  language     = {{eng}},
  number       = {{5}},
  pages        = {{1--21}},
  publisher    = {{Public Library of Science (PLoS)}},
  journal      = {{PLoS ONE}},
  title        = {{Nutritional markers of undiagnosed type 2 diabetes in adults : Findings of a machine learning analysis with external validation and benchmarking}},
  url          = {{http://dx.doi.org/10.1371/journal.pone.0250832}},
  doi          = {{10.1371/journal.pone.0250832}},
  volume       = {{16}},
  year         = {{2021}},
}