LUP Student Papers

LUND UNIVERSITY LIBRARIES

Convolutional Neural Network Emulators for DGVMs - A Supervised Machine Learning Approach to Big Data Processing

Nilsson, Amanda LU (2019) In LUNFMS-4030-2019 MASK11 20191
Mathematical Statistics
Abstract
This paper investigates whether a convolutional neural network (CNN) can be trained to estimate, by capturing temporal features in weather data, the expected amount of wheat produced in any year at any geographical location. The aim is to establish whether a CNN can emulate simulated global crop production, as a response to changes in CO2, temperature, water, and nitrogen levels, retrieved from the dynamic global vegetation model (DGVM) Lund-Potsdam-Jena General Ecosystem Simulator (LPJ-GUESS), which takes part in the Global Gridded Crop Model Intercomparison (GGCMI) study. That is, whether a CNN can provide yield estimates at a lower computational cost than the DGVM itself.
Before investigating different CNNs and whether they can emulate annual yield, the paper analyzes the weather data and reviews the basic concepts of convolutional neural networks, how to construct them, and how to analyze what they learn.
The results show that a CNN can emulate annual wheat yield at any location and year without being given spatiotemporal positional arguments, and should hence be considered a worthy candidate for emulation. The temporal resolution of the input could be reduced from daily values to five-day averages; however, substituting summary statistics was not investigated further. We are thus left with the problem of unequally sized input series. This also raises new questions about whether we can rely on the assumed dates of heading and harvest, as these dates would help the CNN, with its otherwise location-invariant pattern recognition, to more easily distinguish between the periods around sowing, heading, and reaping.
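The reduction from daily values to five-day averages mentioned above can be illustrated with a minimal sketch. It assumes a one-dimensional daily series held in a NumPy array; the function name `block_average`, the handling of a trailing partial block, and the sample values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def block_average(daily, block=5):
    """Average a 1-D daily series over consecutive blocks of `block` days.

    Assumption for this sketch: a trailing partial block (series length
    not divisible by `block`) is averaged over the days it contains.
    """
    daily = np.asarray(daily, dtype=float)
    n_full = len(daily) // block
    # Reshape the full blocks into rows and average each row.
    means = daily[: n_full * block].reshape(n_full, block).mean(axis=1)
    if len(daily) % block:
        means = np.append(means, daily[n_full * block:].mean())
    return means

# Made-up daily values for a 12-day window (not real weather data).
temps = np.arange(12.0)
five_day = block_average(temps)
print(five_day)
```

Series of different lengths still yield outputs of different lengths under this scheme, which is the problem of unequally sized input series noted above.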
Popular Abstract
A growing world population, alongside climate change and greater uncertainty in weather, increases concern about food security and raises questions about vulnerabilities and potential adaptation strategies in the agricultural sector. This has created a need for dynamic global vegetation models (DGVMs) that can predict vegetation under many different climate scenarios, many of which have not yet been seen in the historical record, as well as in new potential cultivation locations that have no previous record of food production.
A problem with such vegetation models is their computational burden, especially when various climate scenarios are of interest. Cheap estimates of the simulator outputs can be obtained from an emulator, or surrogate model, which can be seen as a statistical representation of the simulator, trained to map input data to output targets.
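The emulator idea can be sketched in a few lines. This is a toy illustration only: `toy_simulator` is a hypothetical stand-in for an expensive simulator (it is not LPJ-GUESS), the inputs are random made-up values, and a least-squares fit is used in place of the CNN studied in the paper.

```python
import numpy as np

def toy_simulator(temp, water):
    """Hypothetical stand-in for an expensive simulator (not LPJ-GUESS)."""
    return 5.0 - 0.02 * (temp - 18.0) ** 2 + 0.5 * water

def features(temp, water):
    # Design matrix: quadratic in temperature, linear in water.
    return np.column_stack([np.ones_like(temp), temp, temp ** 2, water])

# Run the "simulator" on sampled inputs to build a training set.
rng = np.random.default_rng(0)
temp = rng.uniform(5.0, 30.0, size=200)
water = rng.uniform(0.0, 2.0, size=200)
yield_sim = toy_simulator(temp, water)

# Fit the emulator: least squares from inputs to simulator outputs.
coef, *_ = np.linalg.lstsq(features(temp, water), yield_sim, rcond=None)

def emulator(temp, water):
    """Cheap approximation of toy_simulator, learned from its outputs."""
    temp = np.asarray(temp, dtype=float)
    water = np.asarray(water, dtype=float)
    return features(temp, water) @ coef

# Compare emulator and simulator on inputs not used for fitting.
test_t = np.array([10.0, 20.0])
test_w = np.array([0.5, 1.5])
max_err = np.max(np.abs(emulator(test_t, test_w) - toy_simulator(test_t, test_w)))
print(max_err)
```

Here the toy simulator lies exactly within the span of the chosen features, so the fit is essentially exact; a real simulator would not be reproduced this closely, and the emulation error would have to be assessed on held-out simulator runs.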
This paper investigates whether a convolutional neural network (CNN) can be trained to estimate, by capturing temporal features in weather data, the expected amount of wheat produced in any year and at any geographical location. The aim is to establish whether a CNN can emulate simulated global crop yield, as a response to changes in CO2, temperature, water, and nitrogen levels, retrieved from the dynamic global vegetation model (DGVM) Lund-Potsdam-Jena General Ecosystem Simulator (LPJ-GUESS), which takes part in the Global Gridded Crop Model Intercomparison (GGCMI) study. That is, whether a CNN can provide yield estimates at a lower computational cost than the DGVM itself.
CNNs can process big data in a massively parallel fashion, draw on many well-established machine learning and statistical techniques, and have become a popular tool for pattern recognition in weather- and climate-related problems like the one considered here. Neural networks can work wonders without demanding much of the modeler in terms of understanding or statistical knowledge, but CNNs in particular allow for a thorough analysis of what they learn and can easily be visualized. By displaying where the convolutional neural network puts most weight, we can better understand how the weather affects the yield and how to reduce or aggregate the input weather data, if possible.
The results show that a CNN can emulate annual wheat yield at any location and year without being given spatiotemporal positional arguments, and should hence be considered a worthy candidate for emulation.
author
Nilsson, Amanda LU
supervisor
organization
course
MASK11 20191
year
2019
type
M2 - Bachelor Degree
subject
keywords
Big data, convolutional neural network (CNN), emulator, surrogate model, dynamic global vegetation model (DGVM), statistical modeling, feature detection, automated pattern recognition, supervised machine learning, deep learning, predictive analytics.
publication/series
LUNFMS-4030-2019
report number
2019:K28
ISSN
1654-6229
language
English
id
9004780
date added to LUP
2020-10-05 14:28:36
date last changed
2020-10-05 14:28:36
@misc{9004780,
  abstract     = {{This paper investigates whether a convolutional neural network (CNN) can be trained to estimate, by capturing temporal features in weather data, the expected amount of wheat produced in any year at any geographical location. The aim is to establish whether a CNN can emulate simulated global crop production, as a response to changes in CO2, temperature, water, and nitrogen levels, retrieved from the dynamic global vegetation model (DGVM) Lund-Potsdam-Jena General Ecosystem Simulator (LPJ-GUESS), which takes part in the Global Gridded Crop Model Intercomparison (GGCMI) study. That is, whether a CNN can provide yield estimates at a lower computational cost than the DGVM itself.
Before investigating different CNNs and whether they can emulate annual yield, the paper analyzes the weather data and reviews the basic concepts of convolutional neural networks, how to construct them, and how to analyze what they learn.
The results show that a CNN can emulate annual wheat yield at any location and year without being given spatiotemporal positional arguments, and should hence be considered a worthy candidate for emulation. The temporal resolution of the input could be reduced from daily values to five-day averages; however, substituting summary statistics was not investigated further. We are thus left with the problem of unequally sized input series. This also raises new questions about whether we can rely on the assumed dates of heading and harvest, as these dates would help the CNN, with its otherwise location-invariant pattern recognition, to more easily distinguish between the periods around sowing, heading, and reaping.}},
  author       = {{Nilsson, Amanda}},
  issn         = {{1654-6229}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{LUNFMS-4030-2019}},
  title        = {{Convolutional Neural Network Emulators for DGVMs - A Supervised Machine Learning Approach to Big Data Processing}},
  year         = {{2019}},
}