Lund University Publications

AP-GRIP evaluation framework for data-driven train delay prediction models : Systematic literature review

Yong, Tiong Kah; Ma, Zhenliang and Palmqvist, Carl William (2025). In European Transport Research Review 17(1).
Abstract

The surging demand for Intelligent Transportation Systems (ITS) to deliver advanced train-related information for dispatchers and passengers has spurred the development of advanced train delay prediction models. Despite considerable efforts devoted to developing methodologies that can be used to model train operation conditions and produce anticipated train delays, the evaluation strategies for train delay prediction models remain under-researched, as is particularly evident where accuracy is the sole determinant in model selection. The absence of a standardised evaluation procedure for assessing the effectiveness of these prediction models has hindered their practical implementation. To bridge this gap, the study conducted a systematic literature review on data-driven train delay prediction models and introduced the novel AP-GRIP (Accuracy, Precision, Generalisability, Robustness, Interpretability, Practicality) evaluation framework. The framework covers six key aspects across overall, spatial, temporal, and train-specific dimensions, providing a systematic approach for the comprehensive assessment of train delay prediction models. Each aspect and dimension is thoroughly discussed and synthesised with its definitions, measuring metrics, and important considerations. A critical discussion clarifies several interactions, such as predetermined objectives, desired outputs, model type, benchmark models, and data availability, resulting in a logical framework for assessing train delay prediction models. The proposed framework uncovers inadequate prediction patterns, offering insights on when, where, and why the prediction models excel and fall short, assisting end-users in determining model suitability for specific prediction tasks.
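
To make the idea of assessing a model across overall, spatial, temporal, and train-specific dimensions concrete, the following is a minimal sketch and not the authors' method. It assumes hypothetical records of actual and predicted delays, uses mean absolute error as a stand-in for the accuracy aspect, and uses the standard deviation of errors as a stand-in for the precision aspect. The record layout, the function names, and both metric choices are illustrative assumptions; the abstract does not specify them.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical records (assumed layout, not from the paper):
# (station, hour_of_day, train_id, actual_delay_min, predicted_delay_min)
records = [
    ("A", 8, "T1", 5.0, 4.0),
    ("A", 17, "T2", 10.0, 6.0),
    ("B", 8, "T1", 2.0, 2.0),
    ("B", 17, "T2", 12.0, 15.0),
]


def errors_by(records, key_index=None):
    """Group errors (predicted - actual) by one field; None means a single overall group."""
    groups = defaultdict(list)
    for rec in records:
        key = "overall" if key_index is None else rec[key_index]
        groups[key].append(rec[4] - rec[3])
    return groups


def summarise(records, key_index=None):
    """Return {group: (MAE, population std of errors)} for the chosen dimension."""
    return {
        key: (mean(abs(e) for e in errs), pstdev(errs))
        for key, errs in errors_by(records, key_index).items()
    }


overall = summarise(records)        # overall dimension
by_station = summarise(records, 0)  # spatial dimension
by_hour = summarise(records, 1)     # temporal dimension
by_train = summarise(records, 2)    # train-specific dimension

for name, table in [("overall", overall), ("station", by_station),
                    ("hour", by_hour), ("train", by_train)]:
    for group, (mae, spread) in table.items():
        print(f"{name:8} {group!s:8} MAE={mae:.2f} std={spread:.2f}")
```

In this toy data the overall error looks moderate, while the per-hour breakdown shows that most of the error comes from the 17:00 records. This is the kind of "when and where" pattern a single aggregate score would hide.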

author
Yong, Tiong Kah; Ma, Zhenliang and Palmqvist, Carl William
organization
publishing date
2025
type
Contribution to journal
publication status
published
subject
keywords
Data-driven, Machine learning, Performance evaluation, Train delay prediction
in
European Transport Research Review
volume
17
issue
1
article number
13
publisher
Springer
external identifiers
  • scopus:86000771629
ISSN
1867-0717
DOI
10.1186/s12544-024-00704-7
project
Utvärdering av ankomstprognoser för tåg (Evaluation of arrival forecasts for trains)
language
English
LU publication?
yes
additional info
Publisher Copyright: © The Author(s) 2025.
id
8e8821ab-2c6b-453b-969d-06e8377c594d
date added to LUP
2025-03-26 22:04:50
date last changed
2025-04-04 14:47:04
@article{8e8821ab-2c6b-453b-969d-06e8377c594d,
  abstract     = {{The surging demand for Intelligent Transportation Systems (ITS) to deliver advanced train-related information for dispatchers and passengers has spurred the development of advanced train delay prediction models. Despite considerable efforts devoted to developing methodologies that can be used to model train operation conditions and produce anticipated train delays, the evaluation strategies for train delay prediction models remain under-researched, particularly evident when accuracy is always found to be the only determinant in model selection. The absence of a standardised evaluation procedure for assessing the effectiveness of these prediction models has hindered the practical implementation of these models. To bridge this gap, the study conducted a systematic literature review on data-driven train delay prediction models and introduced the novel AP-GRIP (Accuracy, Precision, Generalisability, Robustness, Interpretability, Practicality) evaluation framework. The framework covers six key aspects across overall, spatial, temporal, and train-specific dimensions, providing a systematic approach for the comprehensive assessment of train delay prediction models. Each aspect and dimension is thoroughly discussed and synthesised with its definitions, measuring metrics, and important considerations. A critical discussion clarifies several interactions, such as predetermined objectives, desired outputs, model type, benchmark models, and data availability, resulting in a logical framework for assessing train delay prediction models. The proposed framework uncovers inadequate prediction patterns, offering insights on when, where, and why the prediction models excel and fall short, assisting end-users in determining model suitability for specific prediction tasks.}},
  author       = {{Yong, Tiong Kah and Ma, Zhenliang and Palmqvist, Carl William}},
  issn         = {{1867-0717}},
  keywords     = {{Data-driven; Machine learning; Performance evaluation; Train delay prediction}},
  language     = {{eng}},
  month        = {{03}},
  number       = {{1}},
  publisher    = {{Springer}},
  journal      = {{European Transport Research Review}},
  title        = {{AP-GRIP evaluation framework for data-driven train delay prediction models : Systematic literature review}},
  url          = {{http://dx.doi.org/10.1186/s12544-024-00704-7}},
  doi          = {{10.1186/s12544-024-00704-7}},
  volume       = {{17}},
  year         = {{2025}},
}