Lund University Publications

Combined analysis of satellite and ground data for winter wheat yield forecasting

Broms, Camilla; Nilsson, Mikael LU; Oxenstierna, Andreas; Sopasakis, Alexandros LU and Åström, Karl LU (2023) In Smart Agricultural Technology 3.
Abstract

We built machine learning and image analysis tools to forecast winter wheat yield from a rich multidimensional tensor of agricultural information spanning different scales. This information consists of multi-band satellite images, local soil samples obtained from national databases, local weather, and field data from 23 farms cultivating winter wheat in southern Sweden. This is inherently a large multi-scale problem because of the large temporal and spatial variation of the input data. We aggregate the data into spatially averaged features over grids, which temporally span a seasonal timeline from seeding to harvest. Satellite images are cleaned by interpolation to compensate for cloud obstructions. Furthermore, the data are heavily imbalanced, since the amount of satellite information far exceeds that of the ground data. Data variance can therefore be an issue, which we counter by using a decision tree approach. We find that the Light Gradient Boosting decision tree trained on 262 input features is able to predict winter wheat yield with 82% accuracy. Subsequently, we employ game theory (Shapley values) to better understand the relational importance of specific input features for forecasting yield. Specifically, we find that some of the most important features for the resulting predictions are the percent clay and magnesium in the soil. Similarly, the most important features from the satellite data are: a) the NORM index (Euclidean distance of all bands) computed in the second week of April, b) the NORM index computed in the middle of May, and c) the second spectral band from the last week of June.
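Two of the preprocessing ideas mentioned in the abstract can be illustrated with a short Python sketch. It is not the authors' implementation. It assumes that the NORM index is the Euclidean norm of a pixel's band values (one reading of "Euclidean distance of all bands") and that cloud-obstructed observations are filled by linear interpolation in time; the abstract does not specify the interpolation scheme. The function names `norm_index` and `fill_cloud_gaps` are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def norm_index(bands: Sequence[float]) -> float:
    """Euclidean norm of one pixel's band values.

    Assumption: this is how the NORM index is formed; the abstract only
    describes it as the "Euclidean distance of all bands".
    """
    return math.sqrt(sum(b * b for b in bands))


def fill_cloud_gaps(series: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Linearly interpolate missing (None) values in a time series.

    Assumption: linear interpolation between the nearest valid observations.
    Gaps at the start or end of the series are left as None, because there
    is no second neighbour to interpolate from.
    """
    filled = list(series)
    known = [i for i, value in enumerate(series) if value is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled


# Illustrative values only, not data from the study: weekly NORM values
# for one grid cell, with two weeks hidden by cloud cover.
weekly_norm = [norm_index([3.0, 4.0]), None, None, norm_index([6.0, 8.0])]
print(fill_cloud_gaps(weekly_norm))
```

Leaving leading and trailing gaps unfilled avoids extrapolating beyond the observed season; a real pipeline might handle those cases differently.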

author
Broms, Camilla; Nilsson, Mikael; Oxenstierna, Andreas; Sopasakis, Alexandros and Åström, Karl
type
Contribution to journal
publication status
published
subject
keywords
Decision trees, Relational importance, Satellite, Shapley values, Soil samples, Winter wheat yield
in
Smart Agricultural Technology
volume
3
article number
100107
publisher
Elsevier
external identifiers
  • scopus:85148379558
ISSN
2772-3755
DOI
10.1016/j.atech.2022.100107
language
English
LU publication?
yes
additional info
Publisher Copyright: © 2022 The Authors
id
243ccda8-3658-483d-a9e6-fa1f4572db83
date added to LUP
2023-03-05 21:06:54
date last changed
2023-11-17 10:06:36
@article{243ccda8-3658-483d-a9e6-fa1f4572db83,
  abstract     = {{We built machine learning and image analysis tools to forecast winter wheat yield from a rich multidimensional tensor of agricultural information spanning different scales. This information consists of multi-band satellite images, local soil samples obtained from national databases, local weather, and field data from 23 farms cultivating winter wheat in southern Sweden. This is inherently a large multi-scale problem because of the large temporal and spatial variation of the input data. We aggregate the data into spatially averaged features over grids, which temporally span a seasonal timeline from seeding to harvest. Satellite images are cleaned by interpolation to compensate for cloud obstructions. Furthermore, the data are heavily imbalanced, since the amount of satellite information far exceeds that of the ground data. Data variance can therefore be an issue, which we counter by using a decision tree approach. We find that the Light Gradient Boosting decision tree trained on 262 input features is able to predict winter wheat yield with 82% accuracy. Subsequently, we employ game theory (Shapley values) to better understand the relational importance of specific input features for forecasting yield. Specifically, we find that some of the most important features for the resulting predictions are the percent clay and magnesium in the soil. Similarly, the most important features from the satellite data are: a) the NORM index (Euclidean distance of all bands) computed in the second week of April, b) the NORM index computed in the middle of May, and c) the second spectral band from the last week of June.}},
  author       = {{Broms, Camilla and Nilsson, Mikael and Oxenstierna, Andreas and Sopasakis, Alexandros and Åström, Karl}},
  issn         = {{2772-3755}},
  keywords     = {{Decision trees; Relational importance; Satellite; Shapley values; Soil samples; Winter wheat yield}},
  language     = {{eng}},
  publisher    = {{Elsevier}},
  journal      = {{Smart Agricultural Technology}},
  title        = {{Combined analysis of satellite and ground data for winter wheat yield forecasting}},
  url          = {{http://dx.doi.org/10.1016/j.atech.2022.100107}},
  doi          = {{10.1016/j.atech.2022.100107}},
  volume       = {{3}},
  year         = {{2023}},
}