Lund University Publications

A machine learning tool for identifying patients with newly diagnosed diabetes in primary care

Wändell, Per LU; Carlsson, Axel C.; Wierzbicka, Marcelina LU; Sigurdsson, Karolina LU; Ärnlöv, Johan; Eriksson, Julia; Wachtler, Caroline LU and Ruge, Toralph LU (2024). In: Primary Care Diabetes 18(5), p. 501-505
Abstract

Background and aim: It is crucial to identify diabetes early. The aim was to create a predictive model using machine learning (ML) to identify new cases of diabetes in primary health care (PHC). Methods: A case-control study utilizing data on PHC visits for sex-, age-, and PHC-matched controls. Stochastic gradient boosting was used to construct a model for predicting cases of diabetes based on diagnostic codes from PHC consultations during the year before the index (diagnosis) date and on the number of consultations. Variable importance was estimated using the normalized relative influence (NRI) score. Risks of having diabetes were calculated using odds ratios of marginal effects (OR_ME). Four groups by age and sex were studied: men and women aged 35–64 years and ≥ 65 years. Results: The most important predictive factors were hypertension, with NRI 21.4–29.7 %, and obesity, with NRI 4.8–15.2 %. The NRI for the other top ten diagnoses and administrative codes generally ranged from 1.0 to 4.2 %. Conclusions: Our data confirm the known risk patterns for predicting a new diagnosis of diabetes, and the need to test blood glucose frequently. To assess the full potential of ML for risk prediction purposes in clinical practice, future studies could include clinical data on lifestyle patterns, laboratory tests and prescribed medication.
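
As a rough illustration of how the NRI percentages in the abstract can be read, the sketch below rescales raw relative-influence scores so that they sum to 100 %. It assumes the common gradient-boosting convention for normalization; the record does not describe the authors' exact computation. The function name and the raw scores are hypothetical and are not data from the study.

```python
def normalized_relative_influence(raw_influence):
    """Rescale non-negative raw influence scores so they sum to 100 %.

    Assumption: NRI is each predictor's raw relative influence divided by
    the total over all predictors, expressed as a percentage.
    """
    total = sum(raw_influence.values())
    if total <= 0:
        raise ValueError("raw influence scores must have a positive sum")
    return {name: 100.0 * score / total for name, score in raw_influence.items()}


# Hypothetical raw scores for illustration only (not taken from the paper).
raw = {"hypertension": 50.0, "obesity": 20.0, "other_codes": 130.0}
nri = normalized_relative_influence(raw)

# List predictors from most to least influential.
for name, value in sorted(nri.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {value:.1f} %")
```

Under this convention, an NRI of 21.4–29.7 % for hypertension would mean that this single code accounts for roughly a fifth to a third of the total influence across all predictors in the model.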

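author
Wändell, Per; Carlsson, Axel C.; Wierzbicka, Marcelina; Sigurdsson, Karolina; Ärnlöv, Johan; Eriksson, Julia; Wachtler, Caroline and Ruge, Toralph
publishing date
2024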
type
Contribution to journal
publication status
published
subject
keywords
Artificial intelligence, Diabetes, Gradient boosting, Normalized relative influence, Prediction, Primary care
in
Primary Care Diabetes
volume
18
issue
5
pages
501-505 (5 pages)
publisher
Elsevier
external identifiers
  • scopus:85197041315
  • pmid:38944562
ISSN
1751-9918
DOI
10.1016/j.pcd.2024.06.010
language
English
LU publication?
yes
id
e89ac3a2-decf-49ca-9dd0-56d54b238fbf
date added to LUP
2024-12-12 11:05:04
date last changed
2025-06-13 01:16:48
@article{e89ac3a2-decf-49ca-9dd0-56d54b238fbf,
  abstract     = {{Background and aim: It is crucial to identify diabetes early. The aim was to create a predictive model using machine learning (ML) to identify new cases of diabetes in primary health care (PHC). Methods: A case-control study utilizing data on PHC visits for sex-, age-, and PHC-matched controls. Stochastic gradient boosting was used to construct a model for predicting cases of diabetes based on diagnostic codes from PHC consultations during the year before the index (diagnosis) date and on the number of consultations. Variable importance was estimated using the normalized relative influence (NRI) score. Risks of having diabetes were calculated using odds ratios of marginal effects (OR\textsubscript{ME}). Four groups by age and sex were studied: men and women aged 35–64 years and ≥ 65 years. Results: The most important predictive factors were hypertension, with NRI 21.4–29.7 %, and obesity, with NRI 4.8–15.2 %. The NRI for the other top ten diagnoses and administrative codes generally ranged from 1.0 to 4.2 %. Conclusions: Our data confirm the known risk patterns for predicting a new diagnosis of diabetes, and the need to test blood glucose frequently. To assess the full potential of ML for risk prediction purposes in clinical practice, future studies could include clinical data on lifestyle patterns, laboratory tests and prescribed medication.}},
  author       = {{Wändell, Per and Carlsson, Axel C. and Wierzbicka, Marcelina and Sigurdsson, Karolina and Ärnlöv, Johan and Eriksson, Julia and Wachtler, Caroline and Ruge, Toralph}},
  issn         = {{1751-9918}},
  keywords     = {{Artificial intelligence; Diabetes; Gradient boosting; Normalized relative influence; Prediction; Primary care}},
  language     = {{eng}},
  number       = {{5}},
  pages        = {{501--505}},
  publisher    = {{Elsevier}},
  journal      = {{Primary Care Diabetes}},
  title        = {{A machine learning tool for identifying patients with newly diagnosed diabetes in primary care}},
  url          = {{http://dx.doi.org/10.1016/j.pcd.2024.06.010}},
  doi          = {{10.1016/j.pcd.2024.06.010}},
  volume       = {{18}},
  year         = {{2024}},
}