Hyperspectral remote sensing and machine learning for greenhouse gas flux prediction in a semi-arid savannah: key spectral features and their biophysical interpretations
(2025) In Student thesis series INES, NGEM01 20251, Dept of Physical Geography and Ecosystem Science
- Abstract
- Dryland ecosystems are important yet understudied sinks of carbon dioxide (CO₂) and methane (CH₄), and sources of nitrous oxide (N₂O). Hence, monitoring greenhouse gas (GHG) fluxes and their responses to climate change in these systems is increasingly important. This thesis investigates the potential of hyperspectral reflectance data (between 390 and 1750 nm) to predict GHG fluxes during a dry spell and a rain peak in a grazed semi-arid savannah ecosystem in Senegal, West Africa. Three datasets, collected at 18 sites during the dry spell and 24 during the rain peak, were used: 1) non-steady-state chamber flux measurements of CO₂, N₂O, and CH₄; 2) soil variable measurements (temperature and moisture); and 3) hyperspectral measurements acquired with a handheld FieldSpec3 spectroradiometer. All fluxes showed high variability and a random spatial distribution. Multiple linear regression (MLR), using soil moisture and temperature as predictors, revealed low explanatory power. Only the effect of temperature on N₂O was significant in both campaigns, and the MLR models explained ~40% of the variance in the data.
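As an illustration of the MLR baseline described above, the following minimal sketch fits an ordinary least squares model with two predictors and reports R². It is not the thesis code: the function name `fit_mlr`, the synthetic soil temperature and moisture values, and the coefficients used to generate the flux are all assumptions made for the example.

```python
import numpy as np

def fit_mlr(predictors, flux):
    """Ordinary least squares fit of flux on the predictors.

    Returns (coefficients including the intercept, in-sample R^2).
    """
    # Prepend a column of ones so the first coefficient is the intercept.
    X = np.column_stack([np.ones(len(flux)), predictors])
    coef, *_ = np.linalg.lstsq(X, flux, rcond=None)
    residuals = flux - X @ coef
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((flux - flux.mean()) ** 2))
    return coef, 1.0 - ss_res / ss_tot

# Synthetic stand-in data (assumed values, not measurements from the thesis).
rng = np.random.default_rng(0)
n_sites = 24
soil_temp = rng.uniform(25.0, 40.0, n_sites)   # degrees C, hypothetical
soil_moist = rng.uniform(2.0, 15.0, n_sites)   # % volumetric, hypothetical
flux = 0.3 * soil_temp + 0.1 * soil_moist + rng.normal(0.0, 1.5, n_sites)

coef, r2 = fit_mlr(np.column_stack([soil_temp, soil_moist]), flux)
print("intercept, temperature, moisture coefficients:", coef)
print("R^2:", r2)
```

With only two soil predictors, R² is bounded by how much of the flux variability those variables carry, which is the limitation the spectral models are meant to address.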
Normalised difference spectral indices (NDSIs) were calculated. Random forest (RF) models and recursive feature elimination (RFE) were used to select the key wavelengths (single-band) or key NDSIs (two-band indices) for ecosystem respiration (Reco), gross primary productivity (GPP), N₂O, and CH₄ fluxes. Spearman’s correlations were then used to analyse the relationships between the fluxes and the selected wavelengths and NDSIs. Overall, model performance improved notably with NDSI inputs, and the selected NDSIs often showed significant (p < 0.05) and strong correlations with the target fluxes. Predictions of dry spell fluxes were less successful than those of the rain peak. The highest R² (~0.70) was achieved for rain peak N₂O flux using NDSIs as input. The two most important NDSIs were located on the slope between the 970 nm water absorption valley and the ~1100 nm peak in the short-wave-infrared wavelengths, highlighting these features’ relevance. NDSIs including plant and water absorption bands were also among the most important features for predicting rain peak GPP and Reco, with R² values of 0.61 and 0.45, respectively. CH₄ fluxes were extremely variable and not well captured by any model.
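The NDSI, RF-RFE and correlation steps described above can be sketched as follows. This is a hedged illustration under stated assumptions, not the thesis implementation: it assumes the standard two-band form NDSI(i, j) = (R_i − R_j) / (R_i + R_j), uses scikit-learn's `RandomForestRegressor` and `RFE` and SciPy's `spearmanr`, and runs on synthetic reflectance with only nine coarse bands. The helper `ndsi_matrix`, the band spacing, the number of retained features, and the synthetic flux are all hypothetical.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE

def ndsi_matrix(reflectance, wavelengths):
    """NDSI features for every band pair (i < j), plus the wavelength pairs.

    NDSI(i, j) = (R_i - R_j) / (R_i + R_j), computed per sample.
    """
    i_idx, j_idx = np.triu_indices(len(wavelengths), k=1)
    r_i = reflectance[:, i_idx]
    r_j = reflectance[:, j_idx]
    features = (r_i - r_j) / (r_i + r_j)
    pairs = [(int(wavelengths[i]), int(wavelengths[j]))
             for i, j in zip(i_idx, j_idx)]
    return features, pairs

# Synthetic stand-in spectra: 24 sites, 9 coarse bands (assumed, not real data).
rng = np.random.default_rng(42)
wavelengths = np.arange(400, 1750, 150)           # 400, 550, ..., 1600 nm
reflectance = rng.uniform(0.05, 0.6, size=(24, len(wavelengths)))

features, pairs = ndsi_matrix(reflectance, wavelengths)

# Hypothetical flux driven by one band pair (1000 nm vs 1150 nm) plus noise.
r_a, r_b = reflectance[:, 4], reflectance[:, 5]
flux = 5.0 * (r_a - r_b) / (r_a + r_b) + rng.normal(0.0, 0.3, 24)

# Recursive feature elimination: refit the forest and drop the least
# important NDSI one at a time until five remain.
selector = RFE(
    RandomForestRegressor(n_estimators=100, random_state=0),
    n_features_to_select=5,
    step=1,
)
selector.fit(features, flux)

# Spearman's rank correlation between each retained NDSI and the flux.
selected = []
for k in np.flatnonzero(selector.support_):
    rho, p_value = spearmanr(features[:, k], flux)
    selected.append((pairs[k], float(rho), float(p_value)))

for pair, rho, p_value in selected:
    print(pair, round(rho, 2), round(p_value, 4))
```

Pairing the RFE ranking with a rank correlation gives each retained index a sign and strength, which is what makes the selected features interpretable rather than just predictive.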
The better performance of rain peak models likely results from stronger hyperspectral signals from vegetated surfaces. During the growing season, plant–soil interactions driving GHG fluxes are better captured within the studied wavelength range. In contrast, dry spell processes were poorly represented, likely due to bare, quartz-rich sandy soils that mask spectral features and to deeper subsurface activity driving the fluxes. Despite small sample sizes and data variability, the NDSI–RF–RFE approach yielded solid model performance. Combining NDSI-based RF-RFE with correlation analysis improved model explainability and helped identify GHG flux drivers in the studied grazed savannah.
- Popular Abstract
- Greenhouse gases like carbon dioxide (CO₂), methane (CH₄), and nitrous oxide (N₂O) play a major role in climate change. Understanding how and why these gases are released from the land is especially important in dryland regions, which cover more than 15% of the Earth’s surface and are often under-monitored. This thesis explores how remote sensing and machine learning can help predict greenhouse gas emissions in a semi-arid savannah in Senegal.
The study uses hyperspectral remote sensing, a technique that measures how sunlight reflects off the land surface across hundreds of wavelengths. A healthy green plant reflects sunlight differently than a wilting yellow plant or bare brown soil – we can see that difference with our eyes as a change in colour. These spectral “fingerprints” reveal valuable information about plant health, soil moisture, and greenhouse gas fluxes, not only in the visible range but also in the near- and short-wave infrared.
To identify the most relevant information from hundreds of wavelengths, the study applies machine learning, using a method called random forest in combination with recursive feature elimination. Random forest works like a group of decision-makers "voting" on the best prediction, while recursive feature elimination helps filter out wavelengths that carry little or no useful information.
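The "voting" picture can be made concrete with a small sketch. For regression, a random forest's prediction is the average of the predictions of its individual trees; the example below checks this with scikit-learn's `RandomForestRegressor` on synthetic data. The data, the number of trees, and all variable names are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data: 40 samples, 6 features (assumed, not thesis data).
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(40, 6))
y = 3.0 * X[:, 0] + rng.normal(0.0, 0.1, 40)

forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Each fitted tree casts its own "vote" for five unseen samples.
X_new = rng.uniform(0.0, 1.0, size=(5, 6))
tree_votes = np.stack([tree.predict(X_new) for tree in forest.estimators_])

# The forest's answer is the mean of the individual tree predictions.
forest_prediction = forest.predict(X_new)
print(np.allclose(tree_votes.mean(axis=0), forest_prediction))
```

Averaging many trees, each trained on a different resample of the data, smooths out the errors of any single tree, which helps when samples are few and noisy.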
Results showed that predictions worked best during the rainy season, when vegetation was active and the land surface provided stronger, distinct spectral signals. During dry periods, with sparse vegetation, less distinctive surface signals, and more subsurface processes, the models struggled to predict greenhouse gas fluxes. Among the three gases, N₂O was the most successfully predicted, followed by CO₂. CH₄ fluxes were hard to predict based on the available spectral range and data and require further investigation.
These findings demonstrate that hyperspectral remote sensing data combined with machine learning can be a powerful tool for predicting greenhouse gas fluxes, especially in remote or under-studied dryland regions. This technology could contribute to more accurate greenhouse gas inventories, ultimately supporting better-informed climate policies and land management decisions.
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9200929
- author
- Vidal, Hedwig Sophie LU
- supervisor
- organization
- course
- NGEM01 20251
- year
- 2025
- type
- H2 - Master's Degree (Two Years)
- subject
- keywords
- Physical geography, Ecosystem analysis, Hyperspectral remote sensing, Dimensionality reduction, Feature importance analysis, Chamber measurements, Sahel, Semi-arid savannah, Flux prediction
- publication/series
- Student thesis series INES
- report number
- 722
- funder
- Kartografiska Sällskapet
- funder
- Max Weber Program
- funder
- German Academic Exchange Service, DAAD
- language
- English
- id
- 9200929
- date added to LUP
- 2025-06-19 11:38:49
- date last changed
- 2025-06-19 11:38:49
@misc{9200929,
  author = {{Vidal, Hedwig Sophie}},
  language = {{eng}},
  note = {{Student Paper}},
  series = {{Student thesis series INES}},
  title = {{Hyperspectral remote sensing and machine learning for greenhouse gas flux prediction in a semi-arid savannah: key spectral features and their biophysical interpretations}},
  year = {{2025}},
}