Lund University Publications

Ground-Based Hyperspectral Retrieval of Soil Arsenic Concentration in Pingtan Island, China

Zheng, Meiduan ; Luan, Haijun (LU) ; Liu, Guangsheng ; Sha, Jinming ; Duan, Zheng (LU) and Wang, Lanhui (LU) (2023) In Remote Sensing 15(17).
Abstract

The optimal selection of characteristic bands and retrieval models for the hyperspectral retrieval of soil heavy metal concentrations poses a significant challenge. Additionally, satellite-based hyperspectral retrieval encounters several issues, including atmospheric effects, limitations in temporal and radiometric resolution, and data acquisition, among others. Given this, the retrieval performance of the soil arsenic (As) concentration in Pingtan Island, the largest island in Fujian Province and the fifth largest in China, is currently unclear. This study aimed to elucidate this issue by identifying optimal characteristic bands from the full spectrum from both statistical and physical perspectives. We tested three linear models, namely Multiple Linear Regression (MLR), Partial Least Squares Regression (PLSR) and Geographically Weighted Regression (GWR), as well as three nonlinear machine learning models, including Back Propagation Neural Network (BP), Support Vector Machine Regression (SVR) and Random Forest Regression (RFR). We then retrieved soil arsenic content using ground-based soil full spectrum data on Pingtan Island. Our results indicate that the RFR model consistently outperformed all others when using both original and optimal characteristic bands. This superior performance suggests a complex, nonlinear relationship between soil arsenic concentration and spectral variables, influenced by diverse landscape factors. The GWR model, which considers spatial non-stationarity and heterogeneity, outperformed traditional models such as BP and SVR. This finding underscores the potential of incorporating spatial characteristics to enhance traditional machine learning models in geospatial studies. When evaluating retrieval model accuracy based on optimal characteristic bands, the RFR model maintained its top performance, and linear models (MLR, PLSR and GWR) showed notable improvement. 
Specifically, the GWR model achieved the highest r value for the validation data, indicating that selecting optimal characteristic bands based on high Pearson’s correlation coefficients (e.g., abs(Pearson’s correlation coefficient) ≥0.45) and high sensitivity to soil active materials successfully mitigates uncertainties linked to characteristic band selection solely based on Pearson’s correlation coefficients. Consequently, two effective retrieval models were generated: the best-performing RFR model and the improved GWR model. Our study on Pingtan Island provides theoretical and technical support for monitoring and evaluating soil arsenic concentrations using satellite-based spectroscopy in densely populated, relatively independent island towns in China and worldwide.
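
The statistical half of the band-selection step described in the abstract (keeping bands whose absolute Pearson correlation with the measured arsenic concentration is at least 0.45) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `pearson_r` and `select_bands` and the toy data are assumptions made up for this example, and the physical criterion (sensitivity to soil active materials) is not modelled here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series has no defined correlation; treat as 0
    return cov / math.sqrt(var_x * var_y)


def select_bands(spectra, arsenic, threshold=0.45):
    """Return indices of bands whose |r| with arsenic meets the threshold.

    spectra: list of samples, each a list of reflectance values (one per band).
    arsenic: measured As concentration for each sample.
    threshold: 0.45 is the example cut-off quoted in the abstract.
    """
    n_bands = len(spectra[0])
    selected = []
    for band in range(n_bands):
        column = [sample[band] for sample in spectra]
        if abs(pearson_r(column, arsenic)) >= threshold:
            selected.append(band)
    return selected


# Hypothetical toy data (5 samples x 3 bands), invented for illustration only.
toy_spectra = [
    [0.10, 0.50, 0.30],
    [0.20, 0.40, 0.35],
    [0.30, 0.55, 0.20],
    [0.40, 0.45, 0.40],
    [0.50, 0.50, 0.25],
]
toy_arsenic = [2.0, 4.1, 5.9, 8.2, 9.8]

selected_bands = select_bands(toy_spectra, toy_arsenic)
```

In this toy case only the first band tracks the arsenic values closely, so it is the only index retained. The selected columns would then be passed to whichever regression model is being compared.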

author
Zheng, Meiduan ; Luan, Haijun ; Liu, Guangsheng ; Sha, Jinming ; Duan, Zheng and Wang, Lanhui
publishing date
2023
type
Contribution to journal
publication status
published
keywords
Geographically Weighted Regression, ground-based soil spectra, Pingtan Island, Random Forest Regression, soil arsenic concentration
in
Remote Sensing
volume
15
issue
17
article number
4349
publisher
MDPI AG
external identifiers
  • scopus:85170354563
ISSN
2072-4292
DOI
10.3390/rs15174349
language
English
LU publication?
yes
id
9396b897-9963-4b20-a1ac-b448274c102a
date added to LUP
2024-01-12 14:39:31
date last changed
2024-01-12 14:39:31
@article{9396b897-9963-4b20-a1ac-b448274c102a,
  abstract     = {{The optimal selection of characteristic bands and retrieval models for the hyperspectral retrieval of soil heavy metal concentrations poses a significant challenge. Additionally, satellite-based hyperspectral retrieval encounters several issues, including atmospheric effects, limitations in temporal and radiometric resolution, and data acquisition, among others. Given this, the retrieval performance of the soil arsenic (As) concentration in Pingtan Island, the largest island in Fujian Province and the fifth largest in China, is currently unclear. This study aimed to elucidate this issue by identifying optimal characteristic bands from the full spectrum from both statistical and physical perspectives. We tested three linear models, namely Multiple Linear Regression (MLR), Partial Least Squares Regression (PLSR) and Geographically Weighted Regression (GWR), as well as three nonlinear machine learning models, including Back Propagation Neural Network (BP), Support Vector Machine Regression (SVR) and Random Forest Regression (RFR). We then retrieved soil arsenic content using ground-based soil full spectrum data on Pingtan Island. Our results indicate that the RFR model consistently outperformed all others when using both original and optimal characteristic bands. This superior performance suggests a complex, nonlinear relationship between soil arsenic concentration and spectral variables, influenced by diverse landscape factors. The GWR model, which considers spatial non-stationarity and heterogeneity, outperformed traditional models such as BP and SVR. This finding underscores the potential of incorporating spatial characteristics to enhance traditional machine learning models in geospatial studies. When evaluating retrieval model accuracy based on optimal characteristic bands, the RFR model maintained its top performance, and linear models (MLR, PLSR and GWR) showed notable improvement.
Specifically, the GWR model achieved the highest r value for the validation data, indicating that selecting optimal characteristic bands based on high Pearson’s correlation coefficients (e.g., abs(Pearson’s correlation coefficient) ≥0.45) and high sensitivity to soil active materials successfully mitigates uncertainties linked to characteristic band selection solely based on Pearson’s correlation coefficients. Consequently, two effective retrieval models were generated: the best-performing RFR model and the improved GWR model. Our study on Pingtan Island provides theoretical and technical support for monitoring and evaluating soil arsenic concentrations using satellite-based spectroscopy in densely populated, relatively independent island towns in China and worldwide.}},
  author       = {{Zheng, Meiduan and Luan, Haijun and Liu, Guangsheng and Sha, Jinming and Duan, Zheng and Wang, Lanhui}},
  issn         = {{2072-4292}},
  keywords     = {{Geographically Weighted Regression; ground-based soil spectra; Pingtan Island; Random Forest Regression; soil arsenic concentration}},
  language     = {{eng}},
  number       = {{17}},
  publisher    = {{MDPI AG}},
  journal      = {{Remote Sensing}},
  title        = {{Ground-Based Hyperspectral Retrieval of Soil Arsenic Concentration in Pingtan Island, China}},
  url          = {{http://dx.doi.org/10.3390/rs15174349}},
  doi          = {{10.3390/rs15174349}},
  volume       = {{15}},
  year         = {{2023}},
}