
LUP Student Papers

LUND UNIVERSITY LIBRARIES

(XG)Boosting Covid-19 forecast accuracy - Utilizing panel data to forecast Covid-19 test positivity in Uppsala County

Endrédi, David LU and Almfors, Noa LU (2025) DABN01 20251
Department of Economics
Department of Statistics
Abstract
The Covid-19 outbreak in early 2020 clearly underscored the importance of societal preparedness and the need for reliable forecasting in managing public health crises. To support effective policies and timely interventions, governments and health authorities require accurate short-term forecasts. In 2022, researchers at Uppsala University evaluated various forecasting techniques using Covid-19-related data from Uppsala County. Building on their work, this study applies the XGBoost algorithm to the same dataset to investigate whether it can improve forecast accuracy. We generate weekly forecasts using an expanding window approach and test three different models, each leveraging the panel structure of the data and using XGBoost as the base learner. These models incorporate varying degrees of traditional time series and panel data transformations. Our results show that the best performing model is the one without data transformations, achieving a root mean squared forecast error (RMSFE) of 0.038, slightly outperforming both the First-Difference model and prior ensemble approaches. These findings demonstrate that a single, well-tuned XGBoost model can match or exceed the performance of more complex ensembles. They also highlight the benefits of utilizing the full panel structure rather than training individual models per unit, offering practical implications for future model development in similar forecasting tasks.
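As an illustration of the evaluation scheme the abstract describes, the following is a minimal sketch of one-step-ahead forecasting with an expanding window on a panel, scored by RMSFE. It is not the authors' code: the pooled AR(1) least-squares model is a simple stand-in for XGBoost, the data are synthetic, and all function names (`rmsfe`, `expanding_window_forecasts`, `fit_pooled_ar1`, `predict_pooled_ar1`) are hypothetical.

```python
import numpy as np


def rmsfe(actual, forecast):
    """Root mean squared forecast error over all forecast points."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))


def expanding_window_forecasts(panel, first_origin, fit, predict):
    """One-step-ahead forecasts with an expanding training window.

    panel: array of shape (n_weeks, n_units), one column per panel unit.
    first_origin: index of the first week to forecast.
    fit(train) returns a model; predict(model, train) returns a forecast
    of shape (n_units,) for the week right after the training window.
    """
    forecasts = []
    for t in range(first_origin, panel.shape[0]):
        train = panel[:t]  # all weeks before t; grows by one week per step
        model = fit(train)
        forecasts.append(predict(model, train))
    return np.array(forecasts), panel[first_origin:]


def fit_pooled_ar1(train):
    """Stand-in model (assumption): AR(1) pooled over all units via least squares."""
    x = train[:-1].ravel()  # lagged values, all units stacked
    y = train[1:].ravel()   # next-week values, same ordering
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # [intercept, slope]


def predict_pooled_ar1(coef, train):
    """Forecast next week for every unit from the last observed week."""
    return coef[0] + coef[1] * train[-1]


# Synthetic panel: 40 weeks, 5 units, values in a plausible positivity range.
rng = np.random.default_rng(0)
panel = rng.uniform(0.0, 0.3, size=(40, 5))

forecasts, actuals = expanding_window_forecasts(
    panel, first_origin=20, fit=fit_pooled_ar1, predict=predict_pooled_ar1
)
score = rmsfe(actuals, forecasts)
print(f"RMSFE on synthetic data: {score:.4f}")
```

Fitting one model on the stacked observations of all units, rather than one model per column, is what "using the panel structure" means in this sketch. Swapping in a different learner only requires replacing the `fit` and `predict` callables.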
author: Endrédi, David LU and Almfors, Noa LU
course: DABN01 20251
year: 2025
type: H1 - Master's Degree (One Year)
keywords: Forecast, COVID-19, XGBoost, Panel Data
language: English
id: 9194228
date added to LUP: 2025-09-12 09:03:56
date last changed: 2025-09-12 09:03:56
@misc{9194228,
  abstract     = {{The Covid-19 outbreak in early 2020 clearly underscored the importance of societal preparedness and the need for reliable forecasting in managing public health crises. To support effective policies and timely interventions, governments and health authorities require accurate short-term forecasts. In 2022, researchers at Uppsala University evaluated various forecasting techniques using Covid-19-related data from Uppsala County. Building on their work, this study applies the XGBoost algorithm to the same dataset to investigate whether it can improve forecast accuracy. We generate weekly forecasts using an expanding window approach and test three different models, each leveraging the panel structure of the data and using XGBoost as the base learner. These models incorporate varying degrees of traditional time series and panel data transformations. Our results show that the best performing model is the one without data transformations, achieving a root mean squared forecast error (RMSFE) of 0.038, slightly outperforming both the First-Difference model and prior ensemble approaches. These findings demonstrate that a single, well-tuned XGBoost model can match or exceed the performance of more complex ensembles. They also highlight the benefits of utilizing the full panel structure rather than training individual models per unit, offering practical implications for future model development in similar forecasting tasks.}},
  author       = {{Endrédi, David and Almfors, Noa}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{(XG)Boosting Covid-19 forecast accuracy - Utilizing panel data to forecast Covid-19 test positivity in Uppsala County}},
  year         = {{2025}},
}