
Lund University Publications

LUND UNIVERSITY LIBRARIES

A data-driven approach for early dementia prediction using insights from the Swedish National Study on Aging and Care

Javeed, Ashir ; Saleem, Muhammad Asim ; Anderberg, Peter ; Sanmartin Berglund, Johan ; Grande, Giulia ; Overton, Marieclaire LU and Elmståhl, Sölve LU (2025) In Intelligence-Based Medicine 12.
Abstract

Patients with dementia experience a steady deterioration in cognitive function that increases mortality and impairment. Moreover, the prevalence of dementia is expected to rise significantly as the world's population ages, placing a strain on healthcare systems worldwide. Early identification and prediction of dementia are therefore essential, because they enable timely treatment, enhanced patient care, and potentially preventative measures. The aim of this project is to construct a diagnostic system that leverages patient electronic medical data to predict dementia and identify its risk factors. To accomplish this goal, we developed a novel variable selection method (VSM) based on data mining techniques that selects the variables in the dataset most relevant to the onset of dementia in older people. We employed a random forest (RF) model to classify subjects as having dementia or being healthy, and tuned the hyperparameters of the RF model using a random search approach. Because the proposed diagnostic system combines these two components into a single hybrid system, we named it the VSM_RF model. We obtained the dataset from the Swedish National Study on Aging and Care (SNAC) to verify the reliability and accuracy of the proposed VSM_RF model. The three SNAC locations collectively yielded 8191 data observations, each including 75 variables. Several validation metrics, including accuracy, balanced accuracy, sensitivity, specificity, and the Matthews correlation coefficient, were used to thoroughly assess the performance of the proposed VSM_RF model. Using only six of the 75 variables, the model achieved its maximum accuracy of 98.00% and a balanced accuracy of 97.29%.
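
The validation metrics named in the abstract can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch of the standard definitions, not the authors' code; the function name `binary_metrics` and the example counts are hypothetical and are not results from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fn count positive cases (e.g. dementia) predicted correctly/incorrectly;
    tn/fp count negative cases (e.g. healthy) predicted correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Balanced accuracy averages the two per-class rates, so it is not
    # inflated by a large majority class.
    balanced_accuracy = (sensitivity + specificity) / 2
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical, imbalanced example counts (NOT taken from the paper).
metrics = binary_metrics(tp=90, fp=20, tn=880, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With imbalanced classes such as these, accuracy and balanced accuracy diverge, which is one reason to report both alongside the Matthews correlation coefficient.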

author
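Javeed, Ashir ; Saleem, Muhammad Asim ; Anderberg, Peter ; Sanmartin Berglund, Johan ; Grande, Giulia ; Overton, Marieclaire and Elmståhl, Sölve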
publishing date
2025
type
Contribution to journal
publication status
published
keywords
Dementia, Machine learning, Risk factors, Variable selection
in
Intelligence-Based Medicine
volume
12
article number
100298
publisher
Elsevier
external identifiers
  • scopus:105018192335
DOI
10.1016/j.ibmed.2025.100298
language
English
LU publication?
yes
id
0af5f710-c52b-418f-bef9-fa5314c0a42c
date added to LUP
2025-11-28 11:35:57
date last changed
2025-11-28 11:36:21
@article{0af5f710-c52b-418f-bef9-fa5314c0a42c,
  abstract     = {{Patients with dementia experience a steady deterioration in cognitive function that increases mortality and impairment. Moreover, the prevalence of dementia is expected to rise significantly as the world's population ages, placing a strain on healthcare systems worldwide. Early identification and prediction of dementia are therefore essential, because they enable timely treatment, enhanced patient care, and potentially preventative measures. The aim of this project is to construct a diagnostic system that leverages patient electronic medical data to predict dementia and identify its risk factors. To accomplish this goal, we developed a novel variable selection method (VSM) based on data mining techniques that selects the variables in the dataset most relevant to the onset of dementia in older people. We employed a random forest (RF) model to classify subjects as having dementia or being healthy, and tuned the hyperparameters of the RF model using a random search approach. Because the proposed diagnostic system combines these two components into a single hybrid system, we named it the VSM_RF model. We obtained the dataset from the Swedish National Study on Aging and Care (SNAC) to verify the reliability and accuracy of the proposed VSM_RF model. The three SNAC locations collectively yielded 8191 data observations, each including 75 variables. Several validation metrics, including accuracy, balanced accuracy, sensitivity, specificity, and the Matthews correlation coefficient, were used to thoroughly assess the performance of the proposed VSM_RF model. Using only six of the 75 variables, the model achieved its maximum accuracy of 98.00% and a balanced accuracy of 97.29%.}},
  author       = {{Javeed, Ashir and Saleem, Muhammad Asim and Anderberg, Peter and Sanmartin Berglund, Johan and Grande, Giulia and Overton, Marieclaire and Elmståhl, Sölve}},
  keywords     = {{Dementia; Machine learning; Risk factors; Variable selection}},
  language     = {{eng}},
  publisher    = {{Elsevier}},
  journal      = {{Intelligence-Based Medicine}},
  title        = {{A data-driven approach for early dementia prediction using insights from the Swedish National Study on Aging and Care}},
  url          = {{http://dx.doi.org/10.1016/j.ibmed.2025.100298}},
  doi          = {{10.1016/j.ibmed.2025.100298}},
  volume       = {{12}},
  year         = {{2025}},
}