
Lund University Publications


The advanced machine learner XGBoost did not reduce prehospital trauma mistriage compared with logistic regression : a simulation study

Larsson, Anna ; Berg, Johanna LU ; Gellerfors, Mikael and Gerdin Wärnberg, Martin (2021) In BMC Medical Informatics and Decision Making 21(1).
Abstract

Background: Accurate prehospital trauma triage is crucial for identifying critically injured patients and determining the level of care. In the prehospital setting, time and data are often scarce, limiting the complexity of triage models. The aim of this study was to assess whether, compared with logistic regression, the advanced machine learner XGBoost (eXtreme Gradient Boosting) is associated with reduced prehospital trauma mistriage.

Methods: We conducted a simulation study based on data from the US National Trauma Data Bank (NTDB) and the Swedish Trauma Registry (SweTrau). We used categorized systolic blood pressure, respiratory rate, Glasgow Coma Scale and age as our predictors. The outcome was the difference in under- and overtriage rates between the models for different training dataset sizes.

Results: We used data from 813,567 patients in the NTDB and 30,577 patients in SweTrau. In SweTrau, the smallest training set of 10 events per free parameter was sufficient for model development. XGBoost achieved undertriage rates in the range of 0.314–0.324 with corresponding overtriage rates of 0.319–0.322. Logistic regression achieved undertriage rates ranging from 0.312 to 0.321 with associated overtriage rates ranging from 0.321 to 0.323. In NTDB, XGBoost required the largest training set size of 1000 events per free parameter to achieve robust results, whereas logistic regression achieved stable performance from a training set size of 25 events per free parameter. For the training set size of 1000 events per free parameter, XGBoost obtained an undertriage rate of 0.406 with an overtriage of 0.463. For logistic regression, the corresponding undertriage was 0.395 with an overtriage of 0.468.

Conclusion: The under- and overtriage rates associated with the advanced machine learner XGBoost were similar to the rates associated with logistic regression regardless of sample size, but XGBoost required larger training sets to obtain robust results. We do not recommend using XGBoost over logistic regression in this context when predictors are few and categorical.
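
The two quantities the abstract reports can be illustrated with a minimal sketch. It rests on assumptions the abstract does not confirm: undertriage is taken as the share of severely injured patients assigned low priority (1 minus sensitivity), and overtriage as the share of non-severely injured patients assigned high priority (1 minus specificity). The study's own definitions may differ, since overtriage is sometimes computed relative to all patients triaged high. The function names, the toy data and the parameter count of 5 are hypothetical.

```python
from typing import Sequence, Tuple


def mistriage_rates(
    severe: Sequence[bool], triaged_high: Sequence[bool]
) -> Tuple[float, float]:
    """Return (undertriage, overtriage) under the ASSUMED definitions:
    undertriage = severe patients triaged low / all severe patients,
    overtriage  = non-severe patients triaged high / all non-severe patients.
    """
    if len(severe) != len(triaged_high):
        raise ValueError("inputs must have the same length")
    n_severe = sum(1 for s in severe if s)
    n_not_severe = len(severe) - n_severe
    if n_severe == 0 or n_not_severe == 0:
        raise ValueError("need at least one severe and one non-severe patient")
    missed = sum(1 for s, t in zip(severe, triaged_high) if s and not t)
    false_alarms = sum(1 for s, t in zip(severe, triaged_high) if t and not s)
    return missed / n_severe, false_alarms / n_not_severe


def events_needed(events_per_parameter: int, n_free_parameters: int) -> int:
    """Number of outcome events a training set must contain for a given
    events-per-free-parameter target (simple product; an assumption about
    how such a target translates into a sample size)."""
    return events_per_parameter * n_free_parameters


# Toy data (hypothetical): 3 severe and 5 non-severe patients.
severe = [True, True, True, False, False, False, False, False]
triaged_high = [True, True, False, True, False, False, False, False]

under, over = mistriage_rates(severe, triaged_high)
print(f"undertriage={under:.3f}, overtriage={over:.3f}")

# A model with a hypothetical 5 free parameters at 10 events per parameter.
print(events_needed(10, 5))
```

With definitions like these, comparing two models reduces to comparing their rate pairs on the same test set, which is how the XGBoost and logistic regression figures above are reported.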

author
Larsson, Anna ; Berg, Johanna ; Gellerfors, Mikael and Gerdin Wärnberg, Martin
publishing date
2021
type
Contribution to journal
publication status
published
keywords
Clinical prediction model, Machine learning, Overtriage, Prehospital triage, Trauma, Undertriage
in
BMC Medical Informatics and Decision Making
volume
21
issue
1
article number
192
publisher
BioMed Central (BMC)
external identifiers
  • pmid:34148560
  • scopus:85108185043
ISSN
1472-6947
DOI
10.1186/s12911-021-01558-y
language
English
LU publication?
no
id
be66e616-e867-43c9-8370-38104ac8b594
alternative location
https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-021-01558-y
date added to LUP
2021-08-11 09:01:09
date last changed
2024-06-15 14:02:30
@article{be66e616-e867-43c9-8370-38104ac8b594,
  abstract     = {{Background: Accurate prehospital trauma triage is crucial for identifying critically injured patients and determining the level of care. In the prehospital setting, time and data are often scarce, limiting the complexity of triage models. The aim of this study was to assess whether, compared with logistic regression, the advanced machine learner XGBoost (eXtreme Gradient Boosting) is associated with reduced prehospital trauma mistriage. Methods: We conducted a simulation study based on data from the US National Trauma Data Bank (NTDB) and the Swedish Trauma Registry (SweTrau). We used categorized systolic blood pressure, respiratory rate, Glasgow Coma Scale and age as our predictors. The outcome was the difference in under- and overtriage rates between the models for different training dataset sizes. Results: We used data from 813,567 patients in the NTDB and 30,577 patients in SweTrau. In SweTrau, the smallest training set of 10 events per free parameter was sufficient for model development. XGBoost achieved undertriage rates in the range of 0.314–0.324 with corresponding overtriage rates of 0.319–0.322. Logistic regression achieved undertriage rates ranging from 0.312 to 0.321 with associated overtriage rates ranging from 0.321 to 0.323. In NTDB, XGBoost required the largest training set size of 1000 events per free parameter to achieve robust results, whereas logistic regression achieved stable performance from a training set size of 25 events per free parameter. For the training set size of 1000 events per free parameter, XGBoost obtained an undertriage rate of 0.406 with an overtriage of 0.463. For logistic regression, the corresponding undertriage was 0.395 with an overtriage of 0.468. Conclusion: The under- and overtriage rates associated with the advanced machine learner XGBoost were similar to the rates associated with logistic regression regardless of sample size, but XGBoost required larger training sets to obtain robust results. We do not recommend using XGBoost over logistic regression in this context when predictors are few and categorical.}},
  author       = {{Larsson, Anna and Berg, Johanna and Gellerfors, Mikael and Gerdin Wärnberg, Martin}},
  issn         = {{1472-6947}},
  keywords     = {{Clinical prediction model; Machine learning; Overtriage; Prehospital triage; Trauma; Undertriage}},
  language     = {{eng}},
  number       = {{1}},
  publisher    = {{BioMed Central (BMC)}},
  journal      = {{BMC Medical Informatics and Decision Making}},
  title        = {{The advanced machine learner XGBoost did not reduce prehospital trauma mistriage compared with logistic regression : a simulation study}},
  url          = {{http://dx.doi.org/10.1186/s12911-021-01558-y}},
  doi          = {{10.1186/s12911-021-01558-y}},
  volume       = {{21}},
  year         = {{2021}},
}