AP-GRIP evaluation framework for data-driven train delay prediction models : Systematic literature review

Yong, Tiong Kah; Ma, Zhenliang; Palmqvist, Carl William

(2025) In European Transport Research Review 17(1).

Abstract: The surging demand for Intelligent Transportation Systems (ITS) to deliver advanced train-related information for dispatchers and passengers has spurred the development of advanced train delay prediction models. Despite considerable efforts devoted to developing methodologies that can be used to model train operation conditions and produce anticipated train delays, the evaluation strategies for train delay prediction models remain under-researched, particularly evident when accuracy is always found to be the only determinant in model selection. The absence of a standardised evaluation procedure for assessing the effectiveness of these prediction models has hindered the practical implementation of these models. To bridge this gap, the study conducted a systematic literature review on data-driven train delay prediction models and introduced the novel AP-GRIP (Accuracy, Precision, Generalisability, Robustness, Interpretability, Practicality) evaluation framework. The framework covers six key aspects across overall, spatial, temporal, and train-specific dimensions, providing a systematic approach for the comprehensive assessment of train delay prediction models. Each aspect and dimension is thoroughly discussed and synthesised with its definitions, measuring metrics, and important considerations. A critical discussion clarifies several interactions, such as predetermined objectives, desired outputs, model type, benchmark models, and data availability, resulting in a logical framework for assessing train delay prediction models. The proposed framework uncovers inadequate prediction patterns, offering insights on when, where, and why the prediction models excel and fall short, assisting end-users in determining model suitability for specific prediction tasks.

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/8e8821ab-2c6b-453b-969d-06e8377c594d

author

Yong, Tiong Kah; Ma, Zhenliang and Palmqvist, Carl William

publishing date

2025-03-10

type

Contribution to journal

publication status

published

subject

Transport Systems and Logistics

keywords

Data-driven, Machine learning, Performance evaluation, Train delay prediction

in

European Transport Research Review

volume

17

issue

1

article number

13

publisher

Springer

external identifiers

scopus:86000771629

ISSN

1867-0717

DOI

10.1186/s12544-024-00704-7

project

Utvärdering av ankomstprognoser för tåg

language

English

LU publication?

yes

id

8e8821ab-2c6b-453b-969d-06e8377c594d

date added to LUP

2025-03-26 22:04:50

date last changed

2026-02-23 15:26:28

@article{8e8821ab-2c6b-453b-969d-06e8377c594d,
  abstract     = {{The surging demand for Intelligent Transportation Systems (ITS) to deliver advanced train-related information for dispatchers and passengers has spurred the development of advanced train delay prediction models. Despite considerable efforts devoted to developing methodologies that can be used to model train operation conditions and produce anticipated train delays, the evaluation strategies for train delay prediction models remain under-researched, particularly evident when accuracy is always found to be the only determinant in model selection. The absence of a standardised evaluation procedure for assessing the effectiveness of these prediction models has hindered the practical implementation of these models. To bridge this gap, the study conducted a systematic literature review on data-driven train delay prediction models and introduced the novel AP-GRIP (Accuracy, Precision, Generalisability, Robustness, Interpretability, Practicality) evaluation framework. The framework covers six key aspects across overall, spatial, temporal, and train-specific dimensions, providing a systematic approach for the comprehensive assessment of train delay prediction models. Each aspect and dimension is thoroughly discussed and synthesised with its definitions, measuring metrics, and important considerations. A critical discussion clarifies several interactions, such as predetermined objectives, desired outputs, model type, benchmark models, and data availability, resulting in a logical framework for assessing train delay prediction models. The proposed framework uncovers inadequate prediction patterns, offering insights on when, where, and why the prediction models excel and fall short, assisting end-users in determining model suitability for specific prediction tasks.}},
  author       = {{Yong, Tiong Kah and Ma, Zhenliang and Palmqvist, Carl William}},
  issn         = {{1867-0717}},
  keywords     = {{Data-driven; Machine learning; Performance evaluation; Train delay prediction}},
  language     = {{eng}},
  month        = {{03}},
  number       = {{1}},
  publisher    = {{Springer}},
  journal      = {{European Transport Research Review}},
  title        = {{AP-GRIP evaluation framework for data-driven train delay prediction models : Systematic literature review}},
  url          = {{http://dx.doi.org/10.1186/s12544-024-00704-7}},
  doi          = {{10.1186/s12544-024-00704-7}},
  volume       = {{17}},
  year         = {{2025}},
}

Lund University Publications
