
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Hyperspectral remote sensing and machine learning for greenhouse gas flux prediction in a semi-arid savannah: key spectral features and their biophysical interpretations

Vidal, Hedwig Sophie LU (2025) In Student thesis series INES NGEM01 20251
Dept of Physical Geography and Ecosystem Science
Abstract
Dryland ecosystems are important yet understudied sinks of carbon dioxide (CO₂) and methane (CH₄), and sources of nitrous oxide (N₂O). Hence, monitoring greenhouse gas (GHG) fluxes and their responses to climate change in these systems is increasingly important. This thesis investigates the potential of hyperspectral reflectance data (between 390 and 1750 nm) to predict greenhouse gas fluxes during a dry spell and a rain peak in a grazed semi-arid savannah ecosystem in Senegal, West Africa. Data from 1) non-steady state chamber flux measurements of CO₂, N₂O, and CH₄; 2) soil variable measurements (temperature and moisture); and 3) hyperspectral measurements (acquired with a handheld FieldSpec3 spectroradiometer), collected at 18 sites during the dry spell and 24 during the rain peak, were used. All fluxes showed high variability and a random spatial distribution. Multiple linear regression (MLR), using soil moisture and temperature as predictors, showed low explanatory power. Only the effect of temperature on N₂O was significant in both campaigns, and the MLR models explained ~40% of the variance in the data.
Normalised difference spectral indices (NDSIs) were calculated. Random forest (RF) models and recursive feature elimination (RFE) were used to select the key wavelengths (single-band) or key NDSIs (two-band ratios) for ecosystem respiration (Reco), gross primary productivity (GPP), and N₂O and CH₄ fluxes. Moreover, Spearman’s correlations were used to analyse relationships between fluxes and the selected wavelengths and NDSIs, respectively. Overall, model performance improved notably with NDSI inputs, and the selected NDSIs often showed significant (p<0.05) and strong correlations with the target GHG fluxes. Predictions of dry spell fluxes were less successful than those of the rain peak. The highest R² (~0.70) was achieved for rain peak N₂O flux using NDSIs as input. The two most important NDSIs were located on the slope between the 970 nm water absorption valley and the ~1100 nm peak in the short-wave-infrared wavelengths, highlighting these features’ relevance. NDSIs including plant and water absorption bands were also among the most important features for predicting rain peak GPP and Reco, with R² values of 0.61 and 0.45, respectively. CH₄ fluxes were extremely variable and not well captured by any model.
The better performance of rain peak models likely results from stronger hyperspectral signals from vegetated surfaces. During the growing season, plant–soil interactions driving GHG fluxes are better captured within the studied wavelength range. In contrast, dry spell processes were poorly represented, likely due to bare, quartz-rich sandy soils that mask spectral features and to deeper subsurface activity driving fluxes. Despite small sample sizes and data variability, the NDSI–RF–RFE approach yielded solid model performance. Combining NDSI-based RF-RFE with correlation analysis improved model explainability and helped identify GHG flux drivers in the studied grazed savannah.
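As an illustration of the NDSI inputs mentioned in the abstract, the following is a minimal sketch that assumes the standard normalised difference form, NDSI(i, j) = (R_i − R_j) / (R_i + R_j), for reflectance R at two bands. The function names, the three example wavelengths, and the reflectance values are hypothetical and are not taken from the thesis.

```python
import numpy as np


def ndsi_matrix(reflectance):
    """Compute NDSI for every ordered band pair.

    reflectance: array-like of shape (n_samples, n_bands).
    Returns an array of shape (n_samples, n_bands, n_bands) where
    out[s, i, j] = (R_i - R_j) / (R_i + R_j) for sample s.
    """
    r = np.asarray(reflectance, dtype=float)
    ri = r[:, :, None]  # band i along axis 1
    rj = r[:, None, :]  # band j along axis 2
    total = ri + rj
    # Pairs whose reflectances sum to zero are set to NaN instead of dividing.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(total != 0, (ri - rj) / total, np.nan)


def unique_pairs(ndsi, wavelengths):
    """Flatten to one feature per unordered band pair (i < j), with labels."""
    n_bands = ndsi.shape[1]
    i_idx, j_idx = np.triu_indices(n_bands, k=1)
    features = ndsi[:, i_idx, j_idx]
    names = [f"NDSI_{wavelengths[i]}_{wavelengths[j]}"
             for i, j in zip(i_idx, j_idx)]
    return features, names


# Hypothetical example: one spectrum sampled at three wavelengths (nm).
wavelengths = [670, 800, 970]
spectra = [[0.2, 0.4, 0.6]]
ndsi = ndsi_matrix(spectra)
features, names = unique_pairs(ndsi, wavelengths)
```

Because NDSI(i, j) = −NDSI(j, i), keeping only the pairs with i < j halves the number of candidate features without losing information.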
Popular Abstract
Greenhouse gases like carbon dioxide (CO₂), methane (CH₄), and nitrous oxide (N₂O) play a major role in climate change. Understanding how and why these gases are released from the land is especially important in dryland regions, which cover more than 15% of the Earth’s surface and are often under-monitored. This thesis explores how remote sensing and machine learning can help predict greenhouse gas emissions in a semi-arid savannah in Senegal.
The study uses hyperspectral remote sensing, a technique that measures how sunlight reflects off the land surface across hundreds of wavelengths. A healthy green plant reflects sunlight differently than a wilting yellow plant or bare brown soil – we can see that difference with our eyes as a change in colour. These spectral “fingerprints” reveal valuable information about plant health, soil moisture, and greenhouse gas fluxes, not only in the visible range but also in the near- and short-wave infrared.
To identify the most relevant information from hundreds of wavelengths, the study applies machine learning, using a method called random forest in combination with recursive feature elimination. Random forest works like a group of decision-makers "voting" on the best prediction, while recursive feature elimination helps filter out wavelengths that carry little or no useful information.
Results showed that predictions worked best during the rainy season, when vegetation was active and the land surface provided stronger, distinct spectral signals. During dry periods, with sparse vegetation, less distinctive surface signals, and more subsurface processes, the models struggled to predict greenhouse gas fluxes. Among the three gases, N₂O was the most successfully predicted, followed by CO₂. CH₄ fluxes were hard to predict based on the available spectral range and data and require further investigation.
These findings demonstrate that hyperspectral remote sensing data combined with machine learning can be a powerful tool for predicting greenhouse gas fluxes, especially in remote or under-studied dryland regions. This technology could contribute to more accurate greenhouse gas inventories, ultimately supporting better-informed climate policies and land management decisions.
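The combination of random forest and recursive feature elimination described above could be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the thesis workflow: the data are synthetic, and the helper name `select_features`, the number of trees, and the number of features to keep are all hypothetical choices.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE


def select_features(X, y, n_keep=2, seed=0):
    """Fit RFE around a random forest and return the fitted selector.

    At each round the forest is refit and the single least important
    feature (step=1) is dropped, until n_keep features remain.
    """
    forest = RandomForestRegressor(n_estimators=200, random_state=seed)
    selector = RFE(estimator=forest, n_features_to_select=n_keep, step=1)
    selector.fit(X, y)
    return selector


# Synthetic stand-in data: 24 "sites" and 8 candidate features, where
# only column 3 actually drives the target (an assumption for the demo).
rng = np.random.default_rng(0)
X = rng.uniform(size=(24, 8))
y = 5.0 * X[:, 3] + rng.normal(scale=0.05, size=24)

selector = select_features(X, y)
kept = np.flatnonzero(selector.support_)  # column indices that survived
predictions = selector.predict(X)         # forest refit on kept columns
print("kept feature columns:", kept)
```

With so few samples, as in the campaigns described here, scores computed on the training data are optimistic, so a cross-validated estimate would be the more cautious way to report model performance.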
author
Vidal, Hedwig Sophie LU
supervisor
organization
course
NGEM01 20251
year
2025
type
H2 - Master's Degree (Two Years)
subject
keywords
Physical geography, Ecosystem analysis, Hyperspectral remote sensing, Dimensionality reduction, Feature importance analysis, Chamber measurements, Sahel, Semi-arid savannah, Flux prediction
publication/series
Student thesis series INES
report number
722
funder
Kartografiska Sällskapet
funder
Max Weber Program
funder
German Academic Exchange Service, DAAD
language
English
id
9200929
date added to LUP
2025-06-19 11:38:49
date last changed
2025-06-19 11:38:49
@misc{9200929,
  abstract     = {{Dryland ecosystems are important yet understudied sinks of carbon dioxide (CO₂) and methane (CH₄), and sources of nitrous oxide (N₂O). Hence, monitoring greenhouse gas (GHG) fluxes and their responses to climate change in these systems is increasingly important. This thesis investigates the potential of hyperspectral reflectance data (between 390 and 1750 nm) to predict greenhouse gas fluxes during a dry spell and a rain peak in a grazed semi-arid savannah ecosystem in Senegal, West Africa. Data from 1) non-steady state chamber flux measurements of CO₂, N₂O, and CH₄; 2) soil variable measurements (temperature and moisture); and 3) hyperspectral measurements (acquired with a handheld FieldSpec3 spectroradiometer), collected at 18 sites during the dry spell and 24 during the rain peak, were used. All fluxes showed high variability and a random spatial distribution. Multiple linear regression (MLR), using soil moisture and temperature as predictors, showed low explanatory power. Only the effect of temperature on N₂O was significant in both campaigns, and the MLR models explained ~40% of the variance in the data.
Normalised difference spectral indices (NDSIs) were calculated. Random forest (RF) models and recursive feature elimination (RFE) were used to select the key wavelengths (single-band) or key NDSIs (two-band ratios) for ecosystem respiration (Reco), gross primary productivity (GPP), and N₂O and CH₄ fluxes. Moreover, Spearman’s correlations were used to analyse relationships between fluxes and the selected wavelengths and NDSIs, respectively. Overall, model performance improved notably with NDSI inputs, and the selected NDSIs often showed significant (p<0.05) and strong correlations with the target GHG fluxes. Predictions of dry spell fluxes were less successful than those of the rain peak. The highest R² (~0.70) was achieved for rain peak N₂O flux using NDSIs as input. The two most important NDSIs were located on the slope between the 970 nm water absorption valley and the ~1100 nm peak in the short-wave-infrared wavelengths, highlighting these features’ relevance. NDSIs including plant and water absorption bands were also among the most important features for predicting rain peak GPP and Reco, with R² values of 0.61 and 0.45, respectively. CH₄ fluxes were extremely variable and not well captured by any model.
The better performance of rain peak models likely results from stronger hyperspectral signals from vegetated surfaces. During the growing season, plant–soil interactions driving GHG fluxes are better captured within the studied wavelength range. In contrast, dry spell processes were poorly represented, likely due to bare, quartz-rich sandy soils that mask spectral features and to deeper subsurface activity driving fluxes. Despite small sample sizes and data variability, the NDSI–RF–RFE approach yielded solid model performance. Combining NDSI-based RF-RFE with correlation analysis improved model explainability and helped identify GHG flux drivers in the studied grazed savannah.}},
  author       = {{Vidal, Hedwig Sophie}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Student thesis series INES}},
  title        = {{Hyperspectral remote sensing and machine learning for greenhouse gas flux prediction in a semi-arid savannah: key spectral features and their biophysical interpretations}},
  year         = {{2025}},
}