LUP Student Papers

LUND UNIVERSITY LIBRARIES

Forecasting Storage Demand in Multi-Tenant Cloud ERP Systems

Djärv, Ella LU and Karlsson, Sebastian LU (2026) In Master's Theses in Mathematical Sciences FMSM01 20252
Mathematical Statistics
Abstract
Accurately forecasting storage demand is a central challenge in multi-tenant, cloud-based Enterprise Resource Planning (ERP) systems, both at the level of individual users of the platform (tenants) and for the platform as a whole. This thesis investigates storage forecasting in a modern cloud ERP system using large-scale, table-level telemetry data.

At the tenant level, the thesis addresses the problem of estimating long-term storage requirements for new tenants when no historical usage data is available. Using a small set of database tables that can be reasonably estimated during tenant onboarding, several predictive models are evaluated. A gradient boosting model is selected as the final solution. The resulting forecasting tool produces practically useful predictions, with a majority of forecasts within a moderate error margin and a systematic bias toward overestimation, which is operationally safer for tenants.
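The two qualities named above, the share of forecasts inside an error margin and the tendency to overestimate, can be expressed as simple summary statistics. The following is a minimal sketch, not the evaluation used in the thesis: the function name, the 25% margin, and the sample figures are all hypothetical and chosen only for illustration.

```python
def summarize_forecast_errors(actual, predicted, margin=0.25):
    """Summarize relative forecast errors.

    `margin` is a hypothetical threshold (25%); the thesis does not
    state the margin it uses.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equal in length")
    # Positive relative error means the forecast was above the actual value.
    rel_errors = [(p - a) / a for a, p in zip(actual, predicted)]
    n = len(rel_errors)
    return {
        "within_margin": sum(abs(e) <= margin for e in rel_errors) / n,
        "overestimated": sum(e > 0 for e in rel_errors) / n,
        "mean_rel_error": sum(rel_errors) / n,
    }


# Illustrative numbers only (storage in GB), not data from the thesis.
actual_gb = [100.0, 250.0, 40.0, 500.0]
predicted_gb = [110.0, 240.0, 60.0, 550.0]
summary = summarize_forecast_errors(actual_gb, predicted_gb)
print(summary)
```

A positive mean relative error together with a high share of overestimates would indicate the kind of bias the abstract describes as operationally safer.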

At the platform level, aggregated storage usage across the entire ERP system is modelled as a time series to support capacity planning. Traditional statistical models and Prophet are evaluated, showing that overall storage growth is largely trend-driven and can be forecast reliably using relatively simple models.
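To illustrate what a relatively simple, trend-driven forecast can look like, the sketch below fits a straight line to a daily series by ordinary least squares and extrapolates it. It is an assumption-laden example rather than a reproduction of the statistical models or Prophet configuration evaluated in the thesis; the function names and the sample series are hypothetical.

```python
def fit_linear_trend(values):
    """Fit y = intercept + slope * t by ordinary least squares, t = 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    s_tt = sum((t - mean_t) ** 2 for t in range(n))
    s_ty = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    slope = s_ty / s_tt
    intercept = mean_y - slope * mean_t
    return intercept, slope


def forecast_trend(values, horizon):
    """Extrapolate the fitted line `horizon` steps past the last observation."""
    intercept, slope = fit_linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + h) for h in range(horizon)]


# Illustrative daily totals (TB), not data from the thesis.
daily_total_tb = [10.0, 10.5, 11.0, 11.5, 12.0]
next_days = forecast_trend(daily_total_tb, 3)
print(next_days)
```

A linear extrapolation like this ignores seasonality and changepoints, which is why it only makes sense as a baseline when growth is dominated by trend.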

In addition, the structural relationships between database tables are analysed using network dynamics measures to evaluate alternative input features for tenant-level forecasting. While this method provides valuable insight, it is primarily useful for validation rather than standalone feature selection.
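As a rough illustration of treating tables as a network, the sketch below builds an undirected graph from table-to-table links and ranks tables by degree centrality. The abstract does not say which network measures the thesis uses, so degree centrality is a stand-in chosen for simplicity, and the table names and links are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return degree centrality per node for an undirected graph.

    Centrality is the number of distinct neighbours divided by (n - 1),
    where n is the number of nodes that appear in `edges`.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-links
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


# Hypothetical table relationships, for illustration only.
table_links = [
    ("SalesOrder", "SalesLine"),
    ("SalesOrder", "Customer"),
    ("SalesLine", "Item"),
    ("Invoice", "SalesOrder"),
]
centrality = degree_centrality(table_links)
ranked = sorted(centrality, key=centrality.get, reverse=True)
print(ranked[0], centrality[ranked[0]])
```

Tables that score highly on such a measure could be compared against the tables already chosen as model inputs, which matches the validation role the abstract assigns to this analysis.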
Popular Abstract
Many people have experienced running out of cloud storage on their phone or computer and know how frustrating it can be. The same problem can occur for businesses, who rely on cloud-based Enterprise Resource Planning (ERP) systems to store business data and manage core operations. Unexpected storage shortages can lead to increased costs and reduced performance.

These ERP systems are often delivered as multi-tenant Software-as-a-Service (SaaS) platforms, where many organizations share the same underlying infrastructure. As a result, accurately predicting storage growth is important both for individual companies planning their future needs and for platform owners responsible for long-term capacity planning. This thesis addresses this challenge by focusing on predicting storage growth in a multi-tenant cloud ERP system.

By using machine learning and statistical methods, storage growth is analysed from two perspectives: the ERP system as a whole and individual customer usage.

For the analysis focused on the ERP platform, the goal is to support capacity planning and ensure that the platform can handle future data growth. Total storage usage across the entire system is analysed over time by tracking how the amount of stored data changes from day to day. This creates a time series that makes it possible to identify long-term patterns and trends, which can then be used to forecast future storage growth. The project evaluates a range of forecasting approaches, from simple linear models to more advanced time series methods. The results show that storage growth at the platform level is largely driven by a long-term trend, meaning that future storage demand can be predicted reliably without the need for overly complex models.

The second part of the thesis focuses on individual companies using the ERP system, where having the wrong amount of storage can be both costly and inefficient. To address this, an easy-to-use forecasting tool is developed. Using only a limited set of available information about a customer’s business, the tool applies a machine learning model to predict how the company’s storage usage is likely to grow over time. The study resulted in a model and a successful prototype that can be further developed in future work.

In addition, the underlying structure of the ERP system is analysed to better understand how different types of data grow together. By representing the ERP system as a network of interconnected data, network dynamics analysis is used to study these relationships. This analysis provides new insights into the structural properties of the system and how they relate to storage growth.

Together, these results demonstrate how data-driven forecasting methods can support more efficient and reliable resource planning in cloud ERP systems.
author
Djärv, Ella LU and Karlsson, Sebastian LU
supervisor
organization
alternative title
Prognostisering av lagringsbehov i multi-tenant molnbaserade ERP-system
course
FMSM01 20252
year
2026
type
H2 - Master's Degree (Two Years)
subject
keywords
Enterprise Resource Planning, Cloud-Based Systems, Multi-Tenant Systems, Storage Forecasting, Capacity Planning, Time Series Analysis, Machine Learning, Tree-Based Models, Network Analysis
publication/series
Master's Theses in Mathematical Sciences
report number
LUTFMS-3550-2026
ISSN
1404-6342
other publication id
2026:E14
language
English
id
9222397
date added to LUP
2026-02-13 08:29:35
date last changed
2026-02-13 08:29:35
@misc{9222397,
  abstract     = {{Accurately forecasting storage demand is a central challenge in multi-tenant, cloud-based Enterprise Resource Planning (ERP) systems, both at the level of individual users of the platform (tenants) and for the platform as a whole. This thesis investigates storage forecasting in a modern cloud ERP system using large-scale, table-level telemetry data.

At the tenant level, the thesis addresses the problem of estimating long-term storage requirements for new tenants when no historical usage data is available. Using a small set of database tables that can be reasonably estimated during tenant onboarding, several predictive models are evaluated. A gradient boosting model is selected as the final solution. The resulting forecasting tool produces practically useful predictions, with a majority of forecasts within a moderate error margin and a systematic bias toward overestimation, which is operationally safer for tenants.

At the platform level, aggregated storage usage across the entire ERP system is modelled as a time series to support capacity planning. Traditional statistical models and Prophet are evaluated, showing that overall storage growth is largely trend-driven and can be forecast reliably using relatively simple models.

In addition, the structural relationships between database tables are analysed using network dynamics measures to evaluate alternative input features for tenant-level forecasting. While this method provides valuable insight, it is primarily useful for validation rather than standalone feature selection.}},
  author       = {{Djärv, Ella and Karlsson, Sebastian}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master's Theses in Mathematical Sciences}},
  title        = {{Forecasting Storage Demand in Multi-Tenant Cloud ERP Systems}},
  year         = {{2026}},
}