
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Parameter Estimation for Hawke’s Processes

Sandström, Patrick LU (2025) In Bachelor’s Theses in Mathematical Sciences MASK11 20251
Mathematical Statistics
Abstract
Spatio-temporal Hawkes processes are a natural way of modelling point process data. They are used in a range of contexts for their ability to separate “background” arrivals from secondary spread events, yet standard inference algorithms work from a complete event history. Motivated by a study looking at photographic records of an invasive rose-bush population with irregular time intervals between images, this thesis attempts to develop a forward–backward expectation–maximisation (EM) algorithm that (i) estimates the baseline rate, triggering strength and spatial spread, and (ii) reconstructs the latent intensity fields for any fully unobserved time steps and their subsequent affected time steps. The latter E-step consists of forward kernel convolutions and empirical intensity FFT-based deconvolution to estimate latent intensity fields. The M-step updates the model parameters.
In a simulation study of 30 time steps with every eighth slice hidden, repeated for 30 randomly selected seeds, the algorithm recovered the true parameters to within 30% on average, with missing time step intensity reconstructions attaining an RMSE of 22.22 (fully observed steps score 0.105). Sensitivity tests showed that bias originates from the spatial edges due to our model assumptions. Decreasing the kernel spread parameter improved parameter accuracy to within 18%, but increased the RMSE in the missing time step intensity reconstructions to 23.37 (fully observed steps score 0.076).
This model can be expanded upon, with analysis pointing to boundary correction and Monte-Carlo E-steps as directions for future research.
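The forward and backward views of the E-step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the Gaussian kernel, the periodic (wrap-around) grid, the one-step triggering memory, the regularised Wiener-style filter and all names (`gaussian_kernel`, `forward_step`, `backward_step`, `mu`, `alpha`, `sigma`, `eps`) are assumptions made for illustration only.

```python
import numpy as np


def gaussian_kernel(shape, sigma):
    """Periodic Gaussian kernel centred at index (0, 0), normalised to sum to 1.

    Assumption: the spatial spread is Gaussian and the grid wraps around.
    """
    ny, nx = shape
    y = np.fft.fftfreq(ny) * ny  # wrapped integer offsets: 0, 1, ..., -2, -1
    x = np.fft.fftfreq(nx) * nx
    yy, xx = np.meshgrid(y, x, indexing="ij")
    kernel = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return kernel / kernel.sum()


def forward_step(prev_counts, mu, alpha, sigma):
    """Forward view: expected intensity one step after `prev_counts`.

    Baseline rate `mu` plus triggering strength `alpha` times the previous
    counts convolved (circularly, via FFT) with the spatial kernel.
    """
    k_hat = np.fft.fft2(gaussian_kernel(prev_counts.shape, sigma))
    triggered = np.real(np.fft.ifft2(np.fft.fft2(prev_counts) * k_hat))
    return mu + alpha * triggered


def backward_step(next_intensity, mu, alpha, sigma, eps=1e-3):
    """Backward view: estimate the field that produced `next_intensity`.

    Removes the baseline, rescales by `alpha`, and undoes the kernel blur
    with a regularised inverse filter (`eps` is a hypothetical damping term
    that keeps weak high frequencies from being amplified).
    """
    k_hat = np.fft.fft2(gaussian_kernel(next_intensity.shape, sigma))
    excess_hat = np.fft.fft2((next_intensity - mu) / alpha)
    est_hat = excess_hat * np.conj(k_hat) / (np.abs(k_hat) ** 2 + eps)
    # Counts cannot be negative, so clip the ringing left by the filter.
    return np.clip(np.real(np.fft.ifft2(est_hat)), 0.0, None)
```

Because the grid is treated as periodic here, mass that leaves one edge re-enters at the opposite one; a real study area does not behave this way, which is one reason edge effects matter for this kind of estimator.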
Popular Abstract
Imagine watching a patch of land where new rose bushes keep popping up.
Some bushes arrive “out of the blue” (maybe birds dropped the seeds) while others grow because an earlier bush spread its seeds nearby.
A Hawkes process is a mathematical framework for telling these two stories apart: it guesses how many bushes are brand-new arrivals and how many are “children” of earlier bushes.
In real fieldwork, however, the records are often not perfect. For the rose bush study that inspired this thesis, aerial photographs were taken at irregular intervals, so whole seasons of growth, sometimes several years, are missing. If we feed those patchy pictures into an ordinary Hawkes model, it gets confused because it expects to see every single bush appear in order.
In this thesis, we build a proof-of-concept algorithm to explore how well those gaps can be filled in for a simplified version of the true process behind the spread of the rose bushes. We do this in two ways: first, by looking forward from the last photo before the gap and predicting where new bushes should have emerged; and second, by looking backwards from the first photo after the gap and working out what must have happened in between.
Blending the two views, the programme repeatedly fine-tunes our best guess of three numbers that describe the model's behaviour (the natural arrival rate, the typical seed-spread distance and the strength of “parent-to-child” influence) and redraws the unobserved gaps using these numbers. In a simulated study with 30 time steps, of which every eighth was deleted, the method recovered the true parameter values to within 30% on average. When the simulated seed-spread distance was made shorter, the error fell to 18%, which suggests that much of the error stems from model simplifications. The filled-in gaps are not perfect, but they follow the general trends of the true growth behaviour reasonably well. The analysis suggests future work could focus on corrections near the edges of the study area and on other simulation-based versions of the algorithm, such as Monte-Carlo variations.
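The evaluation set-up mentioned above (30 time steps, every eighth hidden, errors reported as RMSE and as a percentage of the true parameter value) can be sketched as follows. This is an illustrative reading, not the thesis code: the zero-based indexing, the choice of which steps count as "every eighth", and the helper names `hidden_steps`, `rmse` and `relative_error` are all assumptions.

```python
import numpy as np


def hidden_steps(n_steps=30, every=8):
    """Zero-based indices of the hidden slices (assumed: 8th, 16th, 24th step)."""
    return [t for t in range(n_steps) if (t + 1) % every == 0]


def rmse(estimate, truth):
    """Root-mean-square error between a reconstructed and a true field."""
    diff = np.asarray(estimate, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean(diff**2)))


def relative_error(estimate, truth):
    """Absolute error of a parameter estimate as a fraction of the true value."""
    return abs(estimate - truth) / abs(truth)
```

Under these assumptions, a parameter estimate of 1.3 against a true value of 1.0 has a relative error of 0.3, which corresponds to the "within 30%" figure quoted in the abstract.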
author
Sandström, Patrick LU
supervisor
organization
course
MASK11 20251
year
type
M2 - Bachelor Degree
subject
keywords
Hawke's Processes, Self-Exciting Processes, Self-Exciting Point Processes
publication/series
Bachelor’s Theses in Mathematical Sciences
report number
LUNFMS-4087-2025
ISSN
1654-6229
other publication id
2025:K35
language
English
id
9214287
date added to LUP
2025-10-24 11:03:28
date last changed
2025-10-24 11:03:28
@misc{9214287,
  abstract     = {{Spatio-temporal Hawkes processes are a natural way of modelling point process data. They are used in a range of contexts for their ability to separate “background” arrivals from secondary spread events, yet standard inference algorithms work from a complete event history. Motivated by a study looking at photographic records of an invasive rose-bush population with irregular time intervals between images, this thesis attempts to develop a forward–backward expectation–maximisation (EM) algorithm that (i) estimates the baseline rate, triggering strength and spatial spread, and (ii) reconstructs the latent intensity fields for any fully unobserved time steps and their subsequent affected time steps. The latter E-step consists of forward kernel convolutions and empirical intensity FFT-based deconvolution to estimate latent intensity fields. The M-step updates the model parameters.
In a simulation study of 30 time steps with every eighth slice hidden, repeated for 30 randomly selected seeds, the algorithm recovered the true parameters to within 30% on average, with missing time step intensity reconstructions attaining an RMSE of 22.22 (fully observed steps score 0.105). Sensitivity tests showed that bias originates from the spatial edges due to our model assumptions. Decreasing the kernel spread parameter improved parameter accuracy to within 18%, but increased the RMSE in the missing time step intensity reconstructions to 23.37 (fully observed steps score 0.076).
This model can be expanded upon, with analysis pointing to boundary correction and Monte-Carlo E-steps as directions for future research.}},
  author       = {{Sandström, Patrick}},
  issn         = {{1654-6229}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Bachelor’s Theses in Mathematical Sciences}},
  title        = {{Parameter Estimation for Hawke’s Processes}},
  year         = {{2025}},
}