Bayesian Optimization with Applications to LPJ-GUESS
(2025) In Master’s Theses in Mathematical Sciences BERM03 20251
Mathematics (Faculty of Sciences)
Centre for Mathematical Sciences
- Abstract
- This work applied Bayesian Optimization (BO) to the task of calibrating methane-related parameters in the Lund-Potsdam-Jena General Ecosystem Simulator (LPJ-GUESS v4.1). A Gaussian Process (GP) was used as the surrogate model within the BO framework. Additional enhancements were applied to the BO framework, such as complexity-penalizing priors for the GP hyperparameters and the numerically stable acquisition functions log-Expected Improvement (logEI) and log-Probability of Improvement (logPI). The BO algorithm was then validated on standard benchmark optimization test functions.
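To illustrate why a log-scale acquisition function helps numerically, here is a minimal sketch of log-Expected Improvement for a minimization problem. It is not the thesis implementation: the function names, the switch-over threshold of -25, and the four-term asymptotic series are assumptions chosen for illustration. The point is that plain EI underflows to zero far from the incumbent, while its logarithm stays finite and usable for ranking candidates.

```python
import math

LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)


def naive_ei(mu, sigma, f_best):
    """Plain Expected Improvement (minimization); underflows for poor candidates."""
    z = (f_best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return sigma * (pdf + z * cdf)


def log_h(z):
    """log(phi(z) + z * Phi(z)), kept finite for very negative z."""
    if z > -25.0:
        # Direct evaluation is accurate enough in this range.
        pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))
        return math.log(pdf + z * cdf)
    # Asymptotic series for z -> -infinity (illustrative assumption):
    # phi(z) + z * Phi(z) ~ phi(z) / z^2 * (1 - 3/z^2 + 15/z^4 - 105/z^6)
    inv = 1.0 / (z * z)
    series = 1.0 - 3.0 * inv + 15.0 * inv ** 2 - 105.0 * inv ** 3
    return -0.5 * z * z - LOG_SQRT_2PI - 2.0 * math.log(-z) + math.log(series)


def log_ei(mu, sigma, f_best):
    """Log of Expected Improvement for a GP prediction N(mu, sigma^2)."""
    z = (f_best - mu) / sigma
    return math.log(sigma) + log_h(z)
```

With a predicted mean 40 standard deviations worse than the incumbent, `naive_ei(40.0, 1.0, 0.0)` evaluates to exactly zero, so every such candidate looks equally useless to the optimizer, whereas `log_ei(40.0, 1.0, 0.0)` still returns a finite value.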
The BO framework was then applied to LPJ-GUESS through two experiments. The first, a "twin experiment," used synthetic methane flux data generated by LPJ-GUESS itself to assess parameter retrievability by minimizing a root mean squared error (RMSE) loss. The second experiment applied BO to minimize the RMSE loss against real-world observed methane fluxes from wetlands in Finland.
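The twin-experiment idea can be sketched in a few lines. The example below is an assumption-laden toy: `toy_flux_model` is a hypothetical stand-in for a simulator (it is not LPJ-GUESS), and its two parameters are invented for illustration. It shows only the structure of the setup: observations are generated by the model itself at known parameters, so the RMSE loss is exactly zero at the truth and the optimizer's ability to recover those parameters can be checked.

```python
import math


def toy_flux_model(params, days):
    """Hypothetical simulator: a seasonal flux curve with two parameters."""
    amplitude, peak_day = params
    return [amplitude * math.exp(-((d - peak_day) / 40.0) ** 2) for d in days]


def rmse(simulated, observed):
    """Root mean squared error between two equally long series."""
    n = len(observed)
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(simulated, observed)) / n)


days = list(range(0, 365, 5))

# Twin experiment: the "observations" come from the model at known parameters.
true_params = (3.0, 200.0)
synthetic_obs = toy_flux_model(true_params, days)


def loss(params):
    """Objective an optimizer would minimize over the parameter space."""
    return rmse(toy_flux_model(params, days), synthetic_obs)


loss_at_truth = loss(true_params)        # zero by construction
loss_elsewhere = loss((2.0, 180.0))      # strictly positive
```

In the second kind of experiment described above, `synthetic_obs` would be replaced by measured fluxes, in which case no parameter vector is guaranteed to drive the loss to zero.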
Results showed that the BO approach significantly improved the LPJ-GUESS model's fit to observed methane data. Furthermore, the study provides insights into the loss landscape and parameter sensitivities, revealing that parameters associated with rougher loss surfaces (e.g., CH4/CO2 ratio, Acrotelm porosity) were better identified. A key finding from both LPJ-GUESS experiments is that the RMSE loss defined solely on methane fluxes, while effectively reduced, is insufficient to fully constrain all ten targeted parameters within the methane module. This suggests that the loss function is insensitive to a subset of these parameters, highlighting challenges in parameter identifiability with the current objective function. The study concludes that BO is a powerful and sample-efficient method for complex model calibration, also offering valuable diagnostics for understanding parameter sensitivities and potentially informing future experimental design.
- Popular Abstract
- Predicting our planet's future, and all the ecosystem interactions involved, relies on sophisticated computer simulations. Such ecosystem models are vital to our understanding of complex environmental changes and for making informed decisions. One such model is LPJ-GUESS, which simulates how plants grow, compete, and interact with their surroundings, including crucial processes such as the release of methane and other greenhouse gases.
For LPJ-GUESS to produce the most reliable ecosystem simulations, the parameters that control its behavior must be tuned, or optimized. Optimization can be thought of in terms of the following analogy. Imagine trying to find the highest point in a mountainous village. Each specific location (your coordinates) in the village represents a particular set of parameters for our model, and the altitude at that location represents how well the model performs with those settings. The highest peak in this imaginary village is our goal: the parameter settings that give the most accurate and useful simulations. Of course, you could visit every single spot in the village to find the highest point, but in the world of complex models, we have limited resources – time and computational power. So, the challenge is to find that peak efficiently, without exploring every corner and every alley.
A traditional approach to searching for the optimal point (the highest peak) is called gradient ascent. It operates in a way similar to sending out a blindfolded explorer with a walking stick. The explorer starts at a random spot and, at each step, feels the ground to find the steepest upward slope, then takes a step in that direction. When the ground feels flat, the explorer stops, claiming to have found the summit. With such a method, we might find a peak, but not necessarily the highest one in the entire village.
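The blindfolded explorer can be sketched on a one-dimensional toy landscape. Everything here is an illustrative assumption: the `altitude` function with a low hill near x = 1 and a taller one near x = 4, the step size, and the stopping tolerance. The sketch shows that the summit reached depends entirely on where the explorer starts.

```python
import math


def altitude(x):
    """Toy landscape: a low hill near x = 1 and a taller hill near x = 4."""
    return math.exp(-(x - 1.0) ** 2) + 2.0 * math.exp(-(x - 4.0) ** 2)


def slope(f, x, h=1e-6):
    """Feel the ground: central finite-difference estimate of the slope."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def gradient_ascent(f, x0, step=0.1, tol=1e-6, max_iter=10_000):
    """Walk uphill until the ground feels flat."""
    x = x0
    for _ in range(max_iter):
        g = slope(f, x)
        if abs(g) < tol:
            break
        x += step * g
    return x


summit_a = gradient_ascent(altitude, 0.0)  # starts on the slope of the low hill
summit_b = gradient_ascent(altitude, 3.0)  # starts on the slope of the tall hill
```

Starting at 0.0 the walk stops on the lower hill, while starting at 3.0 it reaches the taller one; neither run has any way of knowing whether a higher hill exists elsewhere.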
My thesis explores a more sophisticated approach called Bayesian Optimization. Picture this method as a dedicated cartographer in an archive filled with photographs of the village. The cartographer randomly picks a photo – say, from coordinates "61 North, 24 East." "Aha!" she exclaims, "This photo shows a fish market. The village must be close to water!" With each new photograph (each model run with different parameters), she refines her mental map of the village's landscape. Unlike the blindfolded explorer who operates step-by-step, our cartographer can selectively jump from one potential point of interest to another, anywhere in the village. This allows her to build a more faithful understanding of the entire terrain while searching for the true highest peak. The cartographer stops when her map is detailed enough to confidently identify the village's highest point.
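The cartographer's loop can be sketched as follows: fit a Gaussian Process "map" to the points visited so far, score every unvisited candidate by Expected Improvement, evaluate the most promising one, and repeat. This is a minimal standard-library sketch under stated assumptions, not the method used in the thesis: the toy `objective`, the fixed squared-exponential kernel with length scale 0.7, the candidate grid, and the three starting points are all hypothetical, and the hyperparameters are not learned.

```python
import math


def objective(x):
    """Toy 'altitude' to maximize; the taller peak is near x = 4."""
    return math.exp(-(x - 1.0) ** 2) + 2.0 * math.exp(-(x - 4.0) ** 2)


def kernel(a, b, length=0.7):
    """Squared-exponential covariance with unit signal variance."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(A):
    """Lower-triangular L with L L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def solve_upper_t(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def fit_gp(xs, ys, jitter=1e-6):
    """Factor the kernel matrix once per iteration (zero prior mean)."""
    K = [[kernel(a, b) + (jitter if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    L = cholesky(K)
    return L, solve_upper_t(L, solve_lower(L, ys))


def predict(xs, L, alpha, x_new):
    """Posterior mean and standard deviation at x_new."""
    k_star = [kernel(a, x_new) for a in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(kernel(x_new, x_new) - sum(t * t for t in v), 1e-12)
    return mean, math.sqrt(var)


def expected_improvement(mean, std, best):
    """Expected Improvement for maximization."""
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return std * (z * cdf + pdf)


def bayes_opt(n_iter=12):
    grid = [i / 20.0 for i in range(101)]   # candidate locations on [0, 5]
    xs = [0.0, 1.25, 5.0]                   # small fixed initial design
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        L, alpha = fit_gp(xs, ys)
        best = max(ys)
        candidates = [x for x in grid if x not in xs]
        x_next = max(candidates, key=lambda x: expected_improvement(
            *predict(xs, L, alpha, x), best))
        xs.append(x_next)
        ys.append(objective(x_next))
    i_best = max(range(len(ys)), key=ys.__getitem__)
    return xs, ys, xs[i_best], ys[i_best]


visited, altitudes, best_x, best_y = bayes_opt()
```

Unlike the step-by-step explorer, each new evaluation may land anywhere on the grid, because the choice is driven by the surrogate's predicted mean and uncertainty rather than by the local slope. How quickly the search settles on the taller peak depends on the assumed kernel settings and starting points.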
We put these cartography techniques (Bayesian Optimization) to the test by using them to tune the parameters related to methane emissions in the LPJ-GUESS model. Our method was able to improve the model's predictions, making them more representative of real-world methane measurements. But our cartography method offered an additional, valuable insight. It gave us a "map" of how different parameter settings affected the model's performance.
Interestingly, we discovered that some parameters had very little, or even no, effect on the model's methane predictions for the specific setup we studied. It was like our cartographer finding the highest hill, only to realize that, in certain directions, the hilltop was completely flat. You could walk along these flat ridges without changing your altitude at all. This tells us that the way we were measuring the model's performance (our "altitude") and the data we were using weren't sensitive enough to help us tune every single parameter perfectly. Some parameters, it turned out, didn't significantly change the specific outcome we were looking at.
Our study has demonstrated the power of Bayesian Optimization not only for improving the performance of complex environmental models like LPJ-GUESS but also for revealing important insights about the models themselves and how we try to optimize them. We learned that simply trying to match methane emissions wasn't enough to fine-tune every aspect of the model. This understanding paves the way for future research to design even better optimization setups – perhaps by looking at more types of data or by defining "model performance" in a more comprehensive way, which can then be combined with the powerful Bayesian Optimization technique to achieve reliable ecosystem simulations.
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9201268
- author
- Abrash, Mohamed LU
- supervisor
- organization
- course
- BERM03 20251
- year
- 2025
- type
- H2 - Master's Degree (Two Years)
- subject
- keywords
- Bayesian Optimization, Gaussian Random Field, LPJ-GUESS, Parameter Estimation, Data Assimilation, Methane Modeling, Global Optimization, Acquisitions, Surrogate Modeling, Gaussian Process, Complexity Penalizing Priors, Kriging, Automatic Relevance Determination (ARD), Sequential Decision Making, Vegetation Dynamics, Inverse Modeling, Computational Science
- publication/series
- Master’s Theses in Mathematical Sciences
- report number
- LUNFBV-3004-2025
- ISSN
- 1404-6342
- other publication id
- 2025:E62
- language
- English
- id
- 9201268
- date added to LUP
- 2025-07-01 13:50:40
- date last changed
- 2025-07-01 13:55:44
@misc{9201268,
  abstract = {{This work applied Bayesian Optimization (BO) to the task of calibrating methane-related parameters in the Lund-Potsdam-Jena General Ecosystem Simulator (LPJ-GUESS v4.1). A Gaussian Process (GP) was used as the surrogate model within the BO framework. Additional enhancements were applied to the BO framework, such as complexity-penalizing priors for the GP hyperparameters and the numerically stable acquisition functions log-Expected Improvement (logEI) and log-Probability of Improvement (logPI). The BO algorithm was then validated on standard benchmark optimization test functions. The BO framework was then applied to LPJ-GUESS through two experiments. The first, a "twin experiment," used synthetic methane flux data generated by LPJ-GUESS itself to assess parameter retrievability by minimizing a root mean squared error (RMSE) loss. The second experiment applied BO to minimize the RMSE loss against real-world observed methane fluxes from wetlands in Finland. Results showed that the BO approach significantly improved the LPJ-GUESS model's fit to observed methane data. Furthermore, the study provides insights into the loss landscape and parameter sensitivities, revealing that parameters associated with rougher loss surfaces (e.g., CH4/CO2 ratio, Acrotelm porosity) were better identified. A key finding from both LPJ-GUESS experiments is that the RMSE loss defined solely on methane fluxes, while effectively reduced, is insufficient to fully constrain all ten targeted parameters within the methane module. This suggests that the loss function is insensitive to a subset of these parameters, highlighting challenges in parameter identifiability with the current objective function. The study concludes that BO is a powerful and sample-efficient method for complex model calibration, also offering valuable diagnostics for understanding parameter sensitivities and potentially informing future experimental design.}},
  author = {{Abrash, Mohamed}},
  issn = {{1404-6342}},
  language = {{eng}},
  note = {{Student Paper}},
  series = {{Master’s Theses in Mathematical Sciences}},
  title = {{Bayesian Optimization with Applications to LPJ-GUESS}},
  year = {{2025}},
}