Lund University Publications

Breaking barriers : a statistical and machine learning-based hybrid system for predicting dementia

Javeed, Ashir ; Anderberg, Peter ; Ghazi, Ahmad Nauman ; Noor, Adeeb ; Elmståhl, Sölve and Berglund, Johan Sanmartin (2023) In Frontiers in Bioengineering and Biotechnology 11.
Abstract

Introduction: Dementia is a condition (a collection of related signs and symptoms) that causes a continuing deterioration in cognitive function, and millions of people are impacted by dementia every year as the world population continues to rise. Conventional approaches for determining dementia rely primarily on clinical examinations, analyzing medical records, and administering cognitive and neuropsychological testing. However, these methods are time-consuming and costly in terms of treatment. Therefore, this study aims to present a noninvasive method for the early prediction of dementia so that preventive steps can be taken.
Methods: We developed a hybrid diagnostic system based on statistical and machine learning (ML) methods that used patient electronic health records to predict dementia. The dataset used for this study was obtained from the Swedish National Study on Aging and Care (SNAC), with a sample size of 43040 and 75 features. The newly constructed diagnostic system extracts a subset of useful features from the dataset through a statistical method (F-score). For the classification, we developed an ensemble voting classifier based on five different ML models: decision tree (DT), naive Bayes (NB), logistic regression (LR), support vector machines (SVM), and random forest (RF). To address the problem of ML model overfitting, we used a cross-validation approach to evaluate the performance of the proposed diagnostic system. Various assessment measures, such as accuracy, sensitivity, specificity, receiver operating characteristic (ROC) curve, and Matthews correlation coefficient (MCC), were used to thoroughly validate the devised diagnostic system’s efficiency.
Results: According to the experimental results, the proposed diagnostic method achieved the best accuracy of 98.25%, as well as sensitivity of 97.44%, specificity of 95.744%, and MCC of 0.7535.
Discussion: The effectiveness of the proposed diagnostic approach is compared to various cutting-edge feature selection techniques and baseline ML models. From experimental results, it is evident that the proposed diagnostic system outperformed the prior feature selection strategies and baseline ML models regarding accuracy.
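
The voting and evaluation steps named in the abstract can be illustrated with a minimal, standard-library Python sketch. It assumes hard (majority) voting over binary labels, which the abstract does not specify; the function names, the toy labels, and the per-model predictions are all hypothetical and are not data or code from the study.

```python
from collections import Counter
from math import sqrt


def majority_vote(predictions_per_model):
    """Hard voting: for each sample, return the label most models predicted."""
    # zip(*...) regroups the per-model lists into per-sample tuples of votes.
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_model)]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity and MCC for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # MCC denominator; it is zero when any row or column of the
    # confusion matrix is empty, in which case MCC is reported as 0.0.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Toy ground truth (1 = dementia, 0 = no dementia); NOT data from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]

# Hypothetical predictions standing in for the five base models.
model_outputs = [
    [1, 1, 0, 0, 0, 0, 0, 1],  # decision tree
    [1, 0, 1, 0, 0, 0, 1, 1],  # naive Bayes
    [1, 1, 1, 0, 0, 1, 0, 1],  # logistic regression
    [1, 1, 0, 0, 0, 0, 0, 0],  # support vector machine
    [0, 1, 1, 0, 1, 0, 0, 0],  # random forest
]

y_vote = majority_vote(model_outputs)
metrics = binary_metrics(y_true, y_vote)
print(y_vote)
print(metrics)
```

With five voters and two classes a tie cannot occur, so the majority label is always well defined. In this toy run the ensemble corrects several individual mistakes but still produces one false positive, which lowers specificity and MCC while sensitivity stays at 1.0.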

author
Javeed, Ashir ; Anderberg, Peter ; Ghazi, Ahmad Nauman ; Noor, Adeeb ; Elmståhl, Sölve and Berglund, Johan Sanmartin
publishing date
2023
type
Contribution to journal
publication status
published
keywords
dementia, F-score, feature selection, machine learning, voting classifier
in
Frontiers in Bioengineering and Biotechnology
volume
11
article number
1336255
publisher
Frontiers Media S. A.
external identifiers
  • pmid:38260734
  • scopus:85182656352
ISSN
2296-4185
DOI
10.3389/fbioe.2023.1336255
language
English
LU publication?
yes
id
c1070881-1cbe-4027-8b93-217becd3c7a4
date added to LUP
2024-02-15 14:10:12
date last changed
2024-04-16 13:11:33
@article{c1070881-1cbe-4027-8b93-217becd3c7a4,
  abstract     = {{Introduction: Dementia is a condition (a collection of related signs and symptoms) that causes a continuing deterioration in cognitive function, and millions of people are impacted by dementia every year as the world population continues to rise. Conventional approaches for determining dementia rely primarily on clinical examinations, analyzing medical records, and administering cognitive and neuropsychological testing. However, these methods are time-consuming and costly in terms of treatment. Therefore, this study aims to present a noninvasive method for the early prediction of dementia so that preventive steps can be taken. Methods: We developed a hybrid diagnostic system based on statistical and machine learning (ML) methods that used patient electronic health records to predict dementia. The dataset used for this study was obtained from the Swedish National Study on Aging and Care (SNAC), with a sample size of 43040 and 75 features. The newly constructed diagnostic system extracts a subset of useful features from the dataset through a statistical method (F-score). For the classification, we developed an ensemble voting classifier based on five different ML models: decision tree (DT), naive Bayes (NB), logistic regression (LR), support vector machines (SVM), and random forest (RF). To address the problem of ML model overfitting, we used a cross-validation approach to evaluate the performance of the proposed diagnostic system. Various assessment measures, such as accuracy, sensitivity, specificity, receiver operating characteristic (ROC) curve, and Matthews correlation coefficient (MCC), were used to thoroughly validate the devised diagnostic system’s efficiency. Results: According to the experimental results, the proposed diagnostic method achieved the best accuracy of 98.25%, as well as sensitivity of 97.44%, specificity of 95.744%, and MCC of 0.7535.
Discussion: The effectiveness of the proposed diagnostic approach is compared to various cutting-edge feature selection techniques and baseline ML models. From experimental results, it is evident that the proposed diagnostic system outperformed the prior feature selection strategies and baseline ML models regarding accuracy.}},
  author       = {{Javeed, Ashir and Anderberg, Peter and Ghazi, Ahmad Nauman and Noor, Adeeb and Elmståhl, Sölve and Berglund, Johan Sanmartin}},
  issn         = {{2296-4185}},
  keywords     = {{dementia; F-score; feature selection; machine learning; voting classifier}},
  language     = {{eng}},
  publisher    = {{Frontiers Media S. A.}},
  journal      = {{Frontiers in Bioengineering and Biotechnology}},
  title        = {{Breaking barriers : a statistical and machine learning-based hybrid system for predicting dementia}},
  url          = {{http://dx.doi.org/10.3389/fbioe.2023.1336255}},
  doi          = {{10.3389/fbioe.2023.1336255}},
  volume       = {{11}},
  year         = {{2023}},
}