Lund University Publications

Inverse method using boosted regression tree and k-nearest neighbor to quantify effects of point and non-point source nitrate pollution in groundwater

Motevalli, Alireza LU ; Naghibi, Seyed Amir LU ; Hashemi, Hossein LU ; Berndtsson, Ronny LU ; Pradhan, Biswajeet and Gholami, Vahid (2019) In Journal of Cleaner Production 228. p. 1248-1263
Abstract

Nitrate pollution of groundwater has increased dramatically worldwide owing to population growth and rising agricultural productivity. The resulting nitrate concentration in groundwater is usually a combination of various point and non-point pollutant sources. It is often difficult to distinguish between these sources because groundwater forms in large, complex catchments where various natural processes and anthropogenic influences contribute to a given downstream nitrate concentration. For such conditions, this paper uses a methodology that can inversely determine the type and location of the main nitrate pollutant source. The methodology builds on two state-of-the-art data mining techniques, boosted regression tree (BRT) and k-nearest neighbor (KNN). These techniques are used to produce a nitrate pollution vulnerability map. The methodology can mitigate the effects of subjective judgement in determining the importance of different sources and mechanisms for nitrate transport. The investigated conditioning factors are hydrogeological, hydrological, anthropogenic, topographic, and soil-related. Thus, the proposed methodology is used to separate natural processes from anthropogenic effects on nitrate pollution. To calculate the groundwater vulnerability maps, a groundwater nitrate concentration of 40 mg/L (suggested by WHO with a 20% risk margin) was selected as a general threshold for identifying polluted areas, which resulted in 96 polluted wells. Non-polluted locations were selected from well data with nitrate concentrations below 15 mg/L (96 non-polluted wells). The models were trained on 70% of the polluted and 70% of the non-polluted site data. The remaining data, 30% of the polluted and 30% of the non-polluted sites, were used to validate the simulation results. The results showed that BRT outperformed the KNN algorithm.
The final ranking based on the BRT model showed that hydraulic conductivity, river density, soil, slope percent, net recharge, and distance from villages were, in that order, more important than the other factors.
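
The labelling and validation scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the well data are synthetic, the two numeric features are hypothetical stand-ins for conditioning factors, the choice of `>=` at the 40 mg/L threshold and the rounding of the 70% share are assumptions, and only a plain k-nearest-neighbor vote is shown (the BRT model would need a gradient-boosting library and is omitted here).

```python
import math
import random

POLLUTED_MG_L = 40.0  # polluted threshold quoted in the abstract
CLEAN_MG_L = 15.0     # non-polluted wells lie below this value in the abstract


def label_well(nitrate_mg_l):
    """Return 1 (polluted), 0 (non-polluted) or None (in neither class).

    Treating exactly 40 mg/L as polluted is an assumption of this sketch.
    """
    if nitrate_mg_l >= POLLUTED_MG_L:
        return 1
    if nitrate_mg_l < CLEAN_MG_L:
        return 0
    return None


def stratified_split(samples, train_fraction=0.7, seed=0):
    """Split (features, label) pairs so each class keeps the same train share."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in (0, 1):
        group = [s for s in samples if s[1] == cls]
        rng.shuffle(group)
        cut = round(train_fraction * len(group))  # rounding rule is assumed
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


def knn_predict(train, features, k=5):
    """Majority vote among the k nearest training wells (Euclidean distance)."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], features))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def make_synthetic_wells(n_per_class=96, seed=1):
    """Generate hypothetical wells: (two made-up features, nitrate in mg/L)."""
    rng = random.Random(seed)
    wells = []
    for centre, nitrate_range in ((0.0, (1.0, 14.0)), (3.0, (40.0, 90.0))):
        for _ in range(n_per_class):
            features = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
            wells.append((features, rng.uniform(*nitrate_range)))
    return wells


wells = make_synthetic_wells()
samples = [(f, label_well(n)) for f, n in wells if label_well(n) is not None]
train, test = stratified_split(samples)
accuracy = sum(knn_predict(train, f) == y for f, y in test) / len(test)
print(f"train={len(train)} test={len(test)} accuracy={accuracy:.3f}")
```

Splitting each class separately keeps the polluted and non-polluted shares identical in the training and validation sets, which mirrors the 70%/30% per-class wording of the abstract. The accuracy printed here reflects only the synthetic data and says nothing about the results of the study.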

author
Motevalli, Alireza; Naghibi, Seyed Amir; Hashemi, Hossein; Berndtsson, Ronny; Pradhan, Biswajeet and Gholami, Vahid
publishing date
2019
type
Contribution to journal
publication status
published
keywords
Boosted regression tree, Data mining, GIS, Inverse modeling, K-nearest neighbors, Nitrate pollution
in
Journal of Cleaner Production
volume
228
pages
16 pages
publisher
Elsevier
external identifiers
  • scopus:85065126718
ISSN
0959-6526
DOI
10.1016/j.jclepro.2019.04.293
language
English
LU publication?
yes
id
f8551c70-1e8e-4ed6-8d8e-2db6a64d297b
date added to LUP
2019-05-13 11:49:55
date last changed
2023-10-07 02:59:40
@article{f8551c70-1e8e-4ed6-8d8e-2db6a64d297b,
  abstract     = {{Nitrate pollution of groundwater has increased dramatically worldwide owing to population growth and rising agricultural productivity. The resulting nitrate concentration in groundwater is usually a combination of various point and non-point pollutant sources. It is often difficult to distinguish between these sources because groundwater forms in large, complex catchments where various natural processes and anthropogenic influences contribute to a given downstream nitrate concentration. For such conditions, this paper uses a methodology that can inversely determine the type and location of the main nitrate pollutant source. The methodology builds on two state-of-the-art data mining techniques, boosted regression tree (BRT) and k-nearest neighbor (KNN). These techniques are used to produce a nitrate pollution vulnerability map. The methodology can mitigate the effects of subjective judgement in determining the importance of different sources and mechanisms for nitrate transport. The investigated conditioning factors are hydrogeological, hydrological, anthropogenic, topographic, and soil-related. Thus, the proposed methodology is used to separate natural processes from anthropogenic effects on nitrate pollution. To calculate the groundwater vulnerability maps, a groundwater nitrate concentration of 40 mg/L (suggested by WHO with a 20% risk margin) was selected as a general threshold for identifying polluted areas, which resulted in 96 polluted wells. Non-polluted locations were selected from well data with nitrate concentrations below 15 mg/L (96 non-polluted wells). The models were trained on 70% of the polluted and 70% of the non-polluted site data. The remaining data, 30% of the polluted and 30% of the non-polluted sites, were used to validate the simulation results. The results showed that BRT outperformed the KNN algorithm.
The final ranking based on the BRT model showed that hydraulic conductivity, river density, soil, slope percent, net recharge, and distance from villages were, in that order, more important than the other factors.}},
  author       = {{Motevalli, Alireza and Naghibi, Seyed Amir and Hashemi, Hossein and Berndtsson, Ronny and Pradhan, Biswajeet and Gholami, Vahid}},
  issn         = {{0959-6526}},
  keywords     = {{Boosted regression tree; Data mining; GIS; Inverse modeling; K-nearest neighbors; Nitrate pollution}},
  language     = {{eng}},
  pages        = {{1248--1263}},
  publisher    = {{Elsevier}},
  journal      = {{Journal of Cleaner Production}},
  title        = {{Inverse method using boosted regression tree and k-nearest neighbor to quantify effects of point and non-point source nitrate pollution in groundwater}},
  url          = {{http://dx.doi.org/10.1016/j.jclepro.2019.04.293}},
  doi          = {{10.1016/j.jclepro.2019.04.293}},
  volume       = {{228}},
  year         = {{2019}},
}