
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Multi-scale Bark Beetle Predictions Using Machine Learning

Øhrman Wellendorf, Albert LU (2023) In Master Thesis in Geographical Information Science GISM01 20222
Dept of Physical Geography and Ecosystem Science
Abstract
Bark beetle attacks have led to widespread tree disturbance and deaths in many parts of the world, and thereby also to economic and biodiversity losses. Forest-rich Sweden has experienced periodic attacks, most recently in 2018. There is great interest in identifying the most important explanatory features behind bark beetle attacks and in making spatial predictions of where attacks might happen. This could limit the reliance on expensive ad-hoc measures to diminish the negative effects of bark beetle attacks. This is especially important in future years, as bark beetle attacks are expected to increase under climate change.
Machine learning is a family of algorithms capable of finding complex patterns in data and making predictions on unseen data. Earlier studies have already used different types of algorithms to predict bark beetle infestation spots and detect the most important features that characterise these spots. One problem with machine learning algorithms, and with the earlier studies in the field, is the lack of consideration of spatial autocorrelation and heterogeneity in the modelling. The current study aims to address these limitations by looking at the differences in prediction accuracy on different spatial scales. The study area (south-eastern Sweden) is divided into different numbers of zones (2, 4, 6, 8, 10, and 15) and the prediction accuracy and spatial distribution of feature importance are assessed and compared to that of the global model (full dataset). Furthermore, the results for a drought period (2018) and a normal period (2019-2020) are compared.
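As a rough illustration of what dividing a study area into a varying number of zones can look like, the sketch below assigns point records to zones. The abstract does not say how the zones were delineated, so the equal-count strips along the easting axis used here are purely an assumption, as are the function name `assign_zones` and the toy coordinates.

```python
from collections import Counter


def assign_zones(points, n_zones):
    """Assign each (easting, northing) point to one of n_zones strips.

    ASSUMPTION: zones are equal-count strips ordered by easting. This is a
    placeholder zoning rule, not the scheme used in the thesis.
    Returns a list of zone indices aligned with `points`.
    """
    if not 1 <= n_zones <= len(points):
        raise ValueError("n_zones must be between 1 and the number of points")
    # Rank the points from west to east, then cut the ranking into n_zones
    # groups of near-equal size.
    order = sorted(range(len(points)), key=lambda i: points[i][0])
    zones = [0] * len(points)
    for rank, idx in enumerate(order):
        zones[idx] = rank * n_zones // len(points)
    return zones


# Hypothetical toy coordinates: 12 points on a 6 x 2 grid.
points = [(float(i % 6), float(i // 6)) for i in range(12)]

for k in (2, 4, 6):
    sizes = Counter(assign_zones(points, k))
    print(k, "zones, points per zone:", sorted(sizes.values()))
```

Each zone's records could then be passed to a separate local model, while the global model is fitted once on all points.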
Three algorithms are assessed: random forest, support vector machine and logistic regression. Random forest performs best on the global dataset (normal period), albeit only marginally better than the support vector machine. Random forest is therefore also used for the local modelling in the created zones.
The results from the local modelling indicate that zooming in to a more local scale (only considering the points in a zone) can result in better predictions for both the drought period (2018) and the normal period (2019 and 2020). The prediction accuracy is higher than for the global dataset especially in zones that have a relatively even number of infested and healthy records and not too few points. In the best performing local zones, the feature importance differs from that of the global model, and other features are generally the most important. This indicates that global modelling on the full dataset may mask the fact that some features are more important in different parts of the study area.
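The two zone properties mentioned above, class balance and sample size, are easy to check before fitting any local model. The following is a minimal sketch of such a check. The thresholds `MIN_POINTS` and `MAX_IMBALANCE`, the function names and the toy data are all hypothetical and are not taken from the thesis.

```python
from collections import defaultdict

MIN_POINTS = 30       # hypothetical minimum number of records per zone
MAX_IMBALANCE = 0.2   # hypothetical max distance of infested share from 0.5


def zone_summary(zone_ids, infested):
    """Return {zone: (n_points, infested_share)} for two parallel lists."""
    counts = defaultdict(lambda: [0, 0])  # zone -> [n_points, n_infested]
    for zone, label in zip(zone_ids, infested):
        counts[zone][0] += 1
        counts[zone][1] += int(label)
    return {zone: (n, k / n) for zone, (n, k) in counts.items()}


def suitable_zones(summary, min_points=MIN_POINTS, max_imbalance=MAX_IMBALANCE):
    """List zones that are large enough and roughly balanced."""
    return sorted(
        zone
        for zone, (n, share) in summary.items()
        if n >= min_points and abs(share - 0.5) <= max_imbalance
    )


# Hypothetical toy data: zone 0 is balanced, zone 1 is mostly healthy,
# zone 2 is balanced but has very few records.
zone_ids = [0] * 40 + [1] * 40 + [2] * 10
infested = [1] * 20 + [0] * 20 + [1] * 4 + [0] * 36 + [1] * 5 + [0] * 5

summary = zone_summary(zone_ids, infested)
print(summary)
print("zones passing both checks:", suitable_zones(summary))
```

A check like this only flags zones where a local model has a fair chance of being informative. It does not replace evaluating prediction accuracy on held-out data.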
Multi-scale modelling can be beneficial for adaptation purposes, since different factors can be prioritised in different areas depending on the local feature importance. Studies like the current one are important in light of the future threat of an increase in bark beetle attacks. A more dynamic approach to local modelling has been used in other fields, where local machine learning models are created for every data record. Such dynamic studies would be an interesting addition to current bark beetle research, but they are deemed out of the scope of the current study.
Popular Abstract
Bark beetle attacks have resulted in tree deaths and economic losses in many parts of the world. Sweden has suffered periodic outbreaks due both to storms, such as Gudrun in 2005, and to dry weather, for instance in the summer of 2018. Wind-felled trees constitute a suitable habitat for bark beetles and enable a rapid population increase, which can ultimately end in attacks on, and deaths of, healthy standing trees. Drought decreases tree vigour and increases the sensitivity of standing trees.
Many of the current measures against bark beetle outbreaks are ad-hoc and very expensive. Accurate predictions of vulnerable trees and areas will diminish the reliance on these measures and make more efficient protection possible through suitable forest management before an outbreak occurs. This will be important, especially given the expectation that bark beetle outbreaks will increase in the future in many places due to climate change and more frequent drought years.
Traditional methods to predict bark beetle infestation spots generally lack accuracy. They are not able to capture the high spatial and temporal complexity of the outbreaks. Many explanatory features have been included in earlier studies, often leading to different conclusions. More sophisticated methods such as machine learning algorithms are needed. These are very good at learning patterns and relationships from data and are often used to make predictions on unseen data. The non-parametric variants are especially suited to complex and non-linear problems and could help to accurately predict bark beetle infestation spots, as studies in the field have shown.
One potential problem is the lack of a geographical dimension in machine learning algorithms, which have been described as ‘aspatial’ in nature. Bark beetle outbreaks are spatially complex – the relationship between infestations and explanatory features differs from place to place. This means that one ‘global’ model that includes all the data points from an extensive area would not be expected to take more local variations into account. Earlier studies in the field have not included the spatial dimension to a high degree.
This study uses a novel approach that combines spatial methods with two complex machine learning algorithms (random forest and support vector machine) and a linear, less complex one (logistic regression) to predict bark beetle infestation spots in Southern Sweden. To assess the potential effect of climate change and drought on prediction accuracy, the data was divided into a drought period (2018) and a normal period (2019-2020). Instead of using only one global model to predict in what areas trees will be infested, local models are also created by dividing the study area into several smaller entities and performing modelling inside these. These local models only consider data samples in the neighbourhood, whereas the global model considers all data samples in the whole area. The study area is divided into different numbers of zones (2, 4, 6, 8, 10, and 15) and the prediction accuracy is assessed and compared to that of the global model (full dataset). Furthermore, the results for the drought and normal periods are compared. It is found that random forest performs best, albeit only marginally, compared to support vector machines on the global dataset (normal period). Random forest is therefore also used for the local modelling in the created zones. The results from the local modelling indicate that zooming in to a more local scale can result in better predictions for both the drought and the normal period.
This new approach has the potential to increase our knowledge of the factors that affect bark beetle infestation and of whether their relative importance differs over space, thereby enabling us to make more precise predictions of infestation areas in the future. This could potentially be beneficial to the forest industry, especially under the future threat of climate change.
author
Øhrman Wellendorf, Albert LU
supervisor
organization
course
GISM01 20222
year
2023
type
H2 - Master's Degree (Two Years)
subject
keywords
Geography, GIS, Geographically weighted regression, bark beetle, machine learning
publication/series
Master Thesis in Geographical Information Science
report number
158
language
English
id
9113534
date added to LUP
2023-04-18 09:37:57
date last changed
2024-04-01 03:42:57
@misc{9113534,
  author       = {{Øhrman Wellendorf, Albert}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master Thesis in Geographical Information Science}},
  title        = {{Multi-scale Bark Beetle Predictions Using Machine Learning}},
  year         = {{2023}},
}