(XG)Boosting Covid-19 forecast accuracy - Utilizing panel data to forecast Covid-19 test positivity in Uppsala County
(2025) DABN01 20251
Department of Economics
Department of Statistics
- Abstract
- The Covid-19 outbreak in early 2020 clearly underscored the importance of societal preparedness and the need for reliable forecasting in managing public health crises. To support effective policies and timely interventions, governments and health authorities require accurate short-term forecasts. In 2022, researchers at Uppsala University evaluated various forecasting techniques using Covid-19-related data from Uppsala County. Building on their work, this study applies the XGBoost algorithm to the same dataset to investigate whether it can improve forecast accuracy. We generate weekly forecasts using an expanding window approach and test three different models, each leveraging the panel structure of the data and using XGBoost as the base learner. These models incorporate varying degrees of traditional time series and panel data transformations. Our results show that the best performing model is the one without data transformations, achieving a root mean squared forecast error (RMSFE) of 0.038, slightly outperforming both the First-Difference model and prior ensemble approaches. These findings demonstrate that a single, well-tuned XGBoost model can match or exceed the performance of more complex ensembles. They also highlight the benefits of utilizing the full panel structure rather than training individual models per unit, offering practical implications for future model development in similar forecasting tasks.
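The expanding window evaluation and the RMSFE metric mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the `min_train` parameter, and the toy positivity values are all hypothetical, and a naive last-value forecaster stands in for the XGBoost learner so the example runs with the standard library alone.

```python
import math
from typing import Callable, Sequence


def expanding_window_rmsfe(
    series: Sequence[float],
    fit_predict: Callable[[Sequence[float]], float],
    min_train: int = 3,
) -> float:
    """One-step-ahead RMSFE with an expanding training window.

    At each step t the forecaster sees series[:t] and predicts series[t];
    the training window then grows by one observation.
    """
    if len(series) <= min_train:
        raise ValueError("series must be longer than min_train")
    squared_errors = []
    for t in range(min_train, len(series)):
        forecast = fit_predict(series[:t])  # train on everything before t
        squared_errors.append((series[t] - forecast) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def last_value(train: Sequence[float]) -> float:
    """Placeholder forecaster (assumption): repeat the last observed value."""
    return train[-1]


# Made-up weekly test-positivity rates, for illustration only.
positivity = [0.10, 0.12, 0.15, 0.11, 0.09, 0.13]
score = expanding_window_rmsfe(positivity, last_value, min_train=3)
print(f"RMSFE: {score:.4f}")
```

Any model that maps a training history to a one-step forecast could be passed as `fit_predict`; in a panel setting the history would hold observations for all units rather than a single series.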
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9194228
- author
- Endrédi, David LU and Almfors, Noa LU
- supervisor
- organization
- course
- DABN01 20251
- year
- 2025
- type
- H1 - Master's Degree (One Year)
- subject
- keywords
- Forecast, COVID-19, XGBoost, Panel Data
- language
- English
- id
- 9194228
- date added to LUP
- 2025-09-12 09:03:56
- date last changed
- 2025-09-12 09:03:56
@misc{9194228,
  abstract = {{The Covid-19 outbreak in early 2020 clearly underscored the importance of societal preparedness and the need for reliable forecasting in managing public health crises. To support effective policies and timely interventions, governments and health authorities require accurate short-term forecasts. In 2022, researchers at Uppsala University evaluated various forecasting techniques using Covid-19-related data from Uppsala County. Building on their work, this study applies the XGBoost algorithm to the same dataset to investigate whether it can improve forecast accuracy. We generate weekly forecasts using an expanding window approach and test three different models, each leveraging the panel structure of the data and using XGBoost as the base learner. These models incorporate varying degrees of traditional time series and panel data transformations. Our results show that the best performing model is the one without data transformations, achieving a root mean squared forecast error (RMSFE) of 0.038, slightly outperforming both the First-Difference model and prior ensemble approaches. These findings demonstrate that a single, well-tuned XGBoost model can match or exceed the performance of more complex ensembles. They also highlight the benefits of utilizing the full panel structure rather than training individual models per unit, offering practical implications for future model development in similar forecasting tasks.}},
  author = {{Endrédi, David and Almfors, Noa}},
  language = {{eng}},
  note = {{Student Paper}},
  title = {{(XG)Boosting Covid-19 forecast accuracy - Utilizing panel data to forecast Covid-19 test positivity in Uppsala County}},
  year = {{2025}},
}