A data-driven approach for early dementia prediction using insights from the Swedish National Study on Aging and Care
(2025) In Intelligence-Based Medicine 12.
- Abstract
Patients with dementia experience a steady deterioration in cognitive function that increases mortality and impairment. Moreover, the prevalence of dementia is expected to rise significantly as the world's population ages, placing a strain on healthcare systems worldwide. Early identification and prediction of dementia are therefore essential, because they enable timely treatment, enhanced patient care, and preventive measures. The aim of this project is to construct a diagnostic system that leverages patients' electronic medical data to predict dementia and identify dementia risk factors. To accomplish this goal, we developed a novel variable selection method (VSM) based on data mining techniques that selects the variables in the dataset most relevant to the onset of dementia in older people. We employed a random forest (RF) model to classify subjects as having dementia or being healthy, and the hyperparameters of the RF model were tuned using a random search approach. The proposed diagnostic system combines these two components into a single hybrid system, which we named the VSM_RF model. We obtained the dataset from the Swedish National Study on Aging and Care (SNAC) to verify the reliability and accuracy of the proposed VSM_RF model. The three SNAC locations collectively yielded 8191 data observations, each including 75 variables. Several validation metrics, including accuracy, balanced accuracy, sensitivity, specificity, and the Matthews correlation coefficient, were used to thoroughly assess the performance of the proposed VSM_RF model. Only six of the 75 variables were needed to achieve the maximum accuracy and balanced accuracy of 98.00% and 97.29%, respectively.
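The validation metrics named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch in Python, not the authors' evaluation code: the function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration, and the counts are not results from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute the metrics named in the abstract from confusion-matrix counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    sensitivity = tp / (tp + fn)  # share of dementia cases correctly detected
    specificity = tn / (tn + fp)  # share of healthy subjects correctly detected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Balanced accuracy averages the two per-class recall values.
    balanced_accuracy = (sensitivity + specificity) / 2
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only; not taken from the paper.
m = binary_metrics(tp=40, fp=5, tn=50, fn=5)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Balanced accuracy and the Matthews correlation coefficient are commonly reported alongside plain accuracy when the classes are imbalanced, which is why a study of this kind would typically list all three.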
- author
- Javeed, Ashir ; Saleem, Muhammad Asim ; Anderberg, Peter ; Sanmartin Berglund, Johan ; Grande, Giulia ; Overton, Marieclaire LU and Elmståhl, Sölve LU
- publishing date
- 2025-01
- type
- Contribution to journal
- publication status
- published
- keywords
- Dementia, Machine learning, Risk factors, Variable selection
- in
- Intelligence-Based Medicine
- volume
- 12
- article number
- 100298
- publisher
- Elsevier
- external identifiers
- scopus:105018192335
- DOI
- 10.1016/j.ibmed.2025.100298
- language
- English
- LU publication?
- yes
- id
- 0af5f710-c52b-418f-bef9-fa5314c0a42c
- date added to LUP
- 2025-11-28 11:35:57
- date last changed
- 2025-11-28 11:36:21
@article{0af5f710-c52b-418f-bef9-fa5314c0a42c,
abstract = {{Patients with dementia experience a steady deterioration in cognitive function that increases mortality and impairment. Moreover, the prevalence of dementia is expected to rise significantly as the world's population ages, placing a strain on healthcare systems worldwide. Early identification and prediction of dementia are therefore essential, because they enable timely treatment, enhanced patient care, and preventive measures. The aim of this project is to construct a diagnostic system that leverages patients' electronic medical data to predict dementia and identify dementia risk factors. To accomplish this goal, we developed a novel variable selection method (VSM) based on data mining techniques that selects the variables in the dataset most relevant to the onset of dementia in older people. We employed a random forest (RF) model to classify subjects as having dementia or being healthy, and the hyperparameters of the RF model were tuned using a random search approach. The proposed diagnostic system combines these two components into a single hybrid system, which we named the VSM_RF model. We obtained the dataset from the Swedish National Study on Aging and Care (SNAC) to verify the reliability and accuracy of the proposed VSM_RF model. The three SNAC locations collectively yielded 8191 data observations, each including 75 variables. Several validation metrics, including accuracy, balanced accuracy, sensitivity, specificity, and the Matthews correlation coefficient, were used to thoroughly assess the performance of the proposed VSM_RF model. Only six of the 75 variables were needed to achieve the maximum accuracy and balanced accuracy of 98.00% and 97.29%, respectively.}},
author = {{Javeed, Ashir and Saleem, Muhammad Asim and Anderberg, Peter and Sanmartin Berglund, Johan and Grande, Giulia and Overton, Marieclaire and Elmståhl, Sölve}},
keywords = {{Dementia; Machine learning; Risk factors; Variable selection}},
language = {{eng}},
publisher = {{Elsevier}},
journal = {{Intelligence-Based Medicine}},
title = {{A data-driven approach for early dementia prediction using insights from the Swedish National Study on Aging and Care}},
url = {{http://dx.doi.org/10.1016/j.ibmed.2025.100298}},
doi = {{10.1016/j.ibmed.2025.100298}},
volume = {{12}},
year = {{2025}},
}