Forecasting Storage Demand in Multi-Tenant Cloud ERP Systems
(2026) In Master's Theses in Mathematical Sciences FMSM01 20252
Mathematical Statistics
- Abstract
- Accurately forecasting storage demand is a central challenge in multi-tenant, cloud-based Enterprise Resource Planning (ERP) systems, both at the level of individual users of the platform (tenants) and for the platform as a whole. This thesis investigates storage forecasting in a modern cloud ERP system using large-scale, table-level telemetry data.
At the tenant level, the thesis addresses the problem of estimating long-term storage requirements for new tenants when no historical usage data is available. Using a small set of database tables that can be reasonably estimated during tenant onboarding, several predictive models are evaluated. A gradient boosting model is selected as the final solution. The resulting forecasting tool produces practically useful predictions, with a majority of forecasts within a moderate error margin and a systematic bias toward overestimation, which is operationally safer for tenants.
At the platform level, aggregated storage usage across the entire ERP system is modelled as a time series to support capacity planning. Traditional statistical models and Prophet are evaluated, showing that overall storage growth is largely trend-driven and can be forecast reliably using relatively simple models.
In addition, the structural relationships between database tables are analysed using network dynamics measures to evaluate alternative input features for tenant-level forecasting. While this method provides valuable insight, it is primarily useful for validation rather than standalone feature selection.
- Popular Abstract
- Many people have experienced running out of cloud storage on their phone or computer and know how frustrating it can be. The same problem can occur for businesses that rely on cloud-based Enterprise Resource Planning (ERP) systems to store business data and manage core operations. Unexpected storage shortages can lead to increased costs and reduced performance.
These ERP systems are often delivered as multi-tenant Software-as-a-Service (SaaS) platforms, where many organizations share the same underlying infrastructure. As a result, accurately predicting storage growth is important both for individual companies planning their future needs and for platform owners responsible for long-term capacity planning. This thesis addresses this challenge by focusing on predicting storage growth in a multi-tenant cloud ERP system.
By using machine learning and statistical methods, storage growth is analysed from two perspectives: the ERP system as a whole and individual customer usage.
For the analysis focused on the ERP platform, the goal is to support capacity planning and ensure that the platform can handle future data growth. Total storage usage across the entire system is analysed over time by tracking how the amount of stored data changes from day to day. This creates a time series that makes it possible to identify long-term patterns and trends, which can then be used to forecast future storage growth. The project evaluates a range of forecasting approaches, from simple linear models to more advanced time series methods. The results show that storage growth at the platform level is largely driven by a long-term trend, meaning that future storage demand can be predicted reliably without the need for overly complex models.
The second part of the thesis focuses on individual companies using the ERP system, where having the wrong amount of storage can be both costly and inefficient. To address this, an easy-to-use forecasting tool is developed. Using only a limited set of available information about a customer’s business, the tool applies a machine learning model to predict how the company’s storage usage is likely to grow over time. The study resulted in a model and a successful prototype that can be further developed in future work.
In addition, the underlying structure of the ERP system is analysed to better understand how different types of data grow together. By representing the ERP system as a network of interconnected data, network dynamics analysis is used to study these relationships. This analysis provides new insights into the structural properties of the system and how they relate to storage growth.
Together, these results demonstrate how data-driven forecasting methods can support more efficient and reliable resource planning in cloud ERP systems.
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9222397
- author
- Djärv, Ella LU and Karlsson, Sebastian LU
- supervisor
- organization
- alternative title
- Prognostisering av lagringsbehov i multi-tenant molnbaserade ERP-system
- course
- FMSM01 20252
- year
- 2026
- type
- H2 - Master's Degree (Two Years)
- subject
- keywords
- Enterprise Resource Planning, Cloud-Based Systems, Multi-Tenant Systems, Storage Forecasting, Capacity Planning, Time Series Analysis, Machine Learning, Tree-Based Models, Network Analysis
- publication/series
- Master's Theses in Mathematical Sciences
- report number
- LUTFMS-3550-2026
- ISSN
- 1404-6342
- other publication id
- 2026:E14
- language
- English
- id
- 9222397
- date added to LUP
- 2026-02-13 08:29:35
- date last changed
- 2026-02-13 08:29:35
@misc{9222397,
abstract = {{Accurately forecasting storage demand is a central challenge in multi-tenant, cloud-based Enterprise Resource Planning (ERP) systems, both at the level of individual users of the platform (tenants) and for the platform as a whole. This thesis investigates storage forecasting in a modern cloud ERP system using large-scale, table-level telemetry data.
At the tenant level, the thesis addresses the problem of estimating long-term storage requirements for new tenants when no historical usage data is available. Using a small set of database tables that can be reasonably estimated during tenant onboarding, several predictive models are evaluated. A gradient boosting model is selected as the final solution. The resulting forecasting tool produces practically useful predictions, with a majority of forecasts within a moderate error margin and a systematic bias toward overestimation, which is operationally safer for tenants.
At the platform level, aggregated storage usage across the entire ERP system is modelled as a time series to support capacity planning. Traditional statistical models and Prophet are evaluated, showing that overall storage growth is largely trend-driven and can be forecast reliably using relatively simple models.
In addition, the structural relationships between database tables are analysed using network dynamics measures to evaluate alternative input features for tenant-level forecasting. While this method provides valuable insight, it is primarily useful for validation rather than standalone feature selection.}},
author = {{Djärv, Ella and Karlsson, Sebastian}},
issn = {{1404-6342}},
language = {{eng}},
note = {{Student Paper}},
series = {{Master's Theses in Mathematical Sciences}},
title = {{Forecasting Storage Demand in Multi-Tenant Cloud ERP Systems}},
year = {{2026}},
}