Parameter Estimation for Hawke’s Processes
(2025) In Bachelor’s Theses in Mathematical Sciences MASK11 20251 Mathematical Statistics
- Abstract
- Spatio-temporal Hawkes processes are a natural way of modelling point process data. They are used in a range of contexts for their ability to separate “background” arrivals from secondary spread events, yet standard inference algorithms assume a complete event history. Motivated by a study of photographic records of an invasive rose-bush population, taken at irregular time intervals, this thesis attempts to develop a forward–backward expectation–maximisation (EM) algorithm that (i) estimates the baseline rate, triggering strength and spatial spread, and (ii) reconstructs the latent intensity fields for any fully unobserved time steps and the time steps they subsequently affect. For (ii), the E-step combines forward kernel convolutions with FFT-based deconvolution of the empirical intensity to estimate the latent intensity fields. The M-step updates the model parameters.
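As a rough illustration of the forward convolution and FFT-based deconvolution named above, the following is a minimal sketch and not the thesis's implementation. It assumes a discrete-time intensity of the form `mu + alpha * (kernel * previous field)`, a Gaussian kernel, periodic boundaries, and a Wiener-style regularisation constant `eps`; all function and parameter names are hypothetical.

```python
import numpy as np

def gaussian_kernel(shape, sigma):
    """Periodic Gaussian kernel centred at index (0, 0), normalised to sum to 1."""
    ny, nx = shape
    # Signed wrap-around offsets from the origin along each axis.
    dy = np.fft.fftfreq(ny) * ny
    dx = np.fft.fftfreq(nx) * nx
    r2 = dy[:, None] ** 2 + dx[None, :] ** 2
    k = np.exp(-r2 / (2.0 * sigma ** 2))
    return k / k.sum()

def forward_intensity(prev_field, mu, alpha, sigma):
    """Forward step (assumed form): background rate plus kernel-smoothed previous field."""
    k_hat = np.fft.fft2(gaussian_kernel(prev_field.shape, sigma))
    triggered = np.fft.ifft2(np.fft.fft2(prev_field) * k_hat).real
    return mu + alpha * triggered

def backward_field(next_intensity, mu, alpha, sigma, eps=1e-3):
    """Backward step (assumed form): regularised FFT deconvolution of the excess intensity."""
    k_hat = np.fft.fft2(gaussian_kernel(next_intensity.shape, sigma))
    excess_hat = np.fft.fft2(next_intensity - mu) / alpha
    # eps keeps the division stable where the kernel spectrum is close to zero.
    est_hat = excess_hat * np.conj(k_hat) / (np.abs(k_hat) ** 2 + eps)
    return np.clip(np.fft.ifft2(est_hat).real, 0.0, None)

if __name__ == "__main__":
    # Smooth toy field: forward-propagate it, then try to recover it again.
    x = np.arange(32)
    field = 5.0 + 2.0 * np.cos(2 * np.pi * x / 32)[None, :] * np.ones((32, 1))
    lam = forward_intensity(field, mu=0.5, alpha=0.8, sigma=1.5)
    recovered = backward_field(lam, mu=0.5, alpha=0.8, sigma=1.5)
    print("max abs reconstruction error:", np.abs(recovered - field).max())
```

The regularisation term matters because a Gaussian kernel suppresses high spatial frequencies almost completely, so an unregularised division would amplify noise. The periodic boundary is a convenience of the FFT and is only one of several possible edge treatments.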
In a simulation study of 30 time steps with every eighth slice hidden, repeated for 30 randomly selected seeds, the algorithm recovered the true parameters to within 30% on average, with missing time step intensity reconstructions attaining an RMSE of 22.22 (fully observed steps score 0.105). Sensitivity tests showed that bias originates from the spatial edges due to our model assumptions. Decreasing the kernel spread parameter improved parameter accuracy to within 18%, but increased the RMSE in the missing time step intensity reconstructions to 23.37 (fully observed steps score 0.076).
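The evaluation design described above (every eighth step hidden, RMSE reported separately for hidden and observed steps) could be set up along the following lines. This is a sketch under assumptions: steps are counted from 1, so steps 8, 16 and 24 are hidden, and the RMSE is taken over all grid cells of the selected slices. The helper names are hypothetical.

```python
import numpy as np

def hidden_mask(n_steps, every):
    """Boolean mask that is True for hidden time steps (steps counted from 1)."""
    steps = np.arange(1, n_steps + 1)
    return steps % every == 0

def rmse(estimate, truth):
    """Root-mean-square error over all entries of the two arrays."""
    diff = np.asarray(estimate, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def split_rmse(estimate, truth, hidden):
    """Return (RMSE on hidden steps, RMSE on observed steps)."""
    return (rmse(estimate[hidden], truth[hidden]),
            rmse(estimate[~hidden], truth[~hidden]))

if __name__ == "__main__":
    mask = hidden_mask(30, 8)
    truth = np.zeros((30, 4, 4))      # toy intensity fields, one per time step
    estimate = truth.copy()
    estimate[mask] += 3.0             # pretend only hidden steps are misestimated
    print(split_rmse(estimate, truth, mask))
```

Reporting the two errors separately makes it visible when reconstruction error is concentrated in the unobserved steps, as in the results quoted above.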
The model can be extended; the analysis points to boundary correction and Monte Carlo E-steps as directions for future research.
- Popular Abstract
- Imagine watching a patch of land where new rose bushes keep popping up.
Some bushes arrive “out of the blue” (maybe birds dropped the seeds) while others grow because an earlier bush spread its seeds nearby.
A Hawkes process is a mathematical framework for telling these two stories apart: it guesses how many bushes are brand-new arrivals and how many are “children” of earlier bushes.
In real fieldwork, however, the records are often not perfect. For the rose bush study that inspired this thesis, aerial photographs were taken at irregular intervals, so whole seasons of growth, sometimes several years, are missing. If we feed those patchy pictures into an ordinary Hawkes model, it gets confused because it expects to see every single bush appear in order.
In this thesis, we build a proof-of-concept algorithm to explore whether those gaps can be filled in for a simplified version of the true process behind the spread of the rose bushes. We do this by first looking forward from the last photo before the gap and predicting where new bushes should have emerged, and then looking backward from the first photo after the gap and working out what must have happened in between.
Blending the two views, the programme repeatedly fine-tunes our best guess of the three numbers we use to describe the model's behaviour (the natural arrival rate, the typical seed-spread distance and the strength of “parent-to-child” influence) and redraws the unobserved gaps using these numbers. In a simulated study with 30 time steps, of which every eighth was deleted, the method recovered the true parameter values to within 30% on average. When the simulated seed-spread distance was made shorter, the error fell to 18%, most likely due to model simplifications. The filled-in gaps are not perfect, but they follow the general trends of the true growth behaviour reasonably well. The analysis suggests that future work could focus on corrections near the edges of the study area and on other simulation-based versions of the algorithm, such as Monte Carlo variations.
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9214287
- author
- Sandström, Patrick LU
- supervisor
- organization
- course
- MASK11 20251
- year
- 2025
- type
- M2 - Bachelor Degree
- subject
- keywords
- Hawke's Processes, Self-Exciting Processes, Self-Exciting Point Processes
- publication/series
- Bachelor’s Theses in Mathematical Sciences
- report number
- LUNFMS-4087-2025
- ISSN
- 1654-6229
- other publication id
- 2025:K35
- language
- English
- id
- 9214287
- date added to LUP
- 2025-10-24 11:03:28
- date last changed
- 2025-10-24 11:03:28
@misc{9214287,
abstract = {{Spatio-temporal Hawkes processes are a natural way of modelling point process data. They are used in a range of contexts for their ability to separate “background” arrivals from secondary spread events, yet standard inference algorithms assume a complete event history. Motivated by a study of photographic records of an invasive rose-bush population, taken at irregular time intervals, this thesis attempts to develop a forward–backward expectation–maximisation (EM) algorithm that (i) estimates the baseline rate, triggering strength and spatial spread, and (ii) reconstructs the latent intensity fields for any fully unobserved time steps and the time steps they subsequently affect. For (ii), the E-step combines forward kernel convolutions with FFT-based deconvolution of the empirical intensity to estimate the latent intensity fields. The M-step updates the model parameters.
In a simulation study of 30 time steps with every eighth slice hidden, repeated for 30 randomly selected seeds, the algorithm recovered the true parameters to within 30% on average, with missing time step intensity reconstructions attaining an RMSE of 22.22 (fully observed steps score 0.105). Sensitivity tests showed that bias originates from the spatial edges due to our model assumptions. Decreasing the kernel spread parameter improved parameter accuracy to within 18%, but increased the RMSE in the missing time step intensity reconstructions to 23.37 (fully observed steps score 0.076).
The model can be extended; the analysis points to boundary correction and Monte Carlo E-steps as directions for future research.}},
author = {{Sandström, Patrick}},
issn = {{1654-6229}},
language = {{eng}},
note = {{Student Paper}},
series = {{Bachelor’s Theses in Mathematical Sciences}},
title = {{Parameter Estimation for Hawke’s Processes}},
year = {{2025}},
}