APG: A novel python-based ArcGIS toolbox to generate absence-datasets for geospatial studies
(2021) In Geoscience Frontiers 12(6).
- Abstract
- One important step in binary modeling of environmental problems is the generation of absence-datasets, which are traditionally generated by random sampling, an approach that can undermine the quality of outputs. To solve this problem, this study develops the Absence Point Generation (APG) toolbox, a Python-based ArcGIS toolbox for automated construction of absence-datasets for geospatial studies. The APG employs a frequency ratio analysis of four commonly used and important driving factors (altitude, slope degree, topographic wetness index, and distance from rivers) and considers the presence locations buffer and density layers to define the low potential or susceptibility zones where absence-datasets are generated. To test the APG toolbox, we applied two benchmark algorithms, random forest (RF) and boosted regression trees (BRT), in a case study to investigate groundwater potential using three absence datasets, i.e., those from the APG, random sampling, and the selection of absence samples (SAS) toolbox. The BRT-APG and RF-APG had area under the receiver operating characteristic curve (AUC) values of 0.947 and 0.942, while BRT and RF had weaker performances with the SAS and Random datasets. This effect resulted in AUC improvements for BRT and RF of 7.2% and 9.7% over the Random dataset, and of 6.1% and 5.4% over the SAS dataset, respectively. The APG also impacted the importance of the input factors and the pattern of the groundwater potential maps, which demonstrates the importance of absence points in environmental binary issues. The proposed APG toolbox could be easily applied to other environmental hazards such as landslides, floods, gully erosion, and land subsidence.
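The frequency ratio step mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard definition of the frequency ratio for a factor class (share of presence points falling in the class divided by share of the study area covered by the class). The function name, the toy slope classes, and the cutoff of 1.0 are assumptions made for illustration; this is not the APG toolbox's code, and it leaves out ArcGIS as well as the buffer and density layers the abstract describes.

```python
from collections import Counter


def frequency_ratio(class_of_cell, class_of_presence):
    """Return {class: FR}, where
    FR = (share of presence points in class) / (share of area in class).
    Both arguments are sequences of class labels (hypothetical inputs)."""
    area = Counter(class_of_cell)          # cells per class (proxy for area)
    presence = Counter(class_of_presence)  # presence points per class
    total_area = sum(area.values())
    total_presence = sum(presence.values())
    if total_area == 0 or total_presence == 0:
        raise ValueError("need at least one cell and one presence point")
    return {
        cls: (presence.get(cls, 0) / total_presence) / (area[cls] / total_area)
        for cls in area
    }


# Toy data (assumed): slope class of 10 raster cells and of 4 presence points.
cells = ["low"] * 5 + ["mid"] * 3 + ["high"] * 2
presence_points = ["low", "low", "low", "mid"]

fr = frequency_ratio(cells, presence_points)

# Assumed rule for this sketch: classes with FR below 1.0 are under-represented
# among presence points and are treated as candidate zones for absence points.
low_potential = sorted(cls for cls, value in fr.items() if value < 1.0)
print(fr)
print(low_potential)
```

In this toy case the "mid" and "high" classes have a frequency ratio below 1.0, so absence points would be drawn from them; a real workflow would combine several factors and exclude areas near known presence locations.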
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/3debb5da-06cd-4a58-9302-9baf7f74ecf5
- author
- Naghibi, Seyed Amir (LU); Hashemi, Hossein (LU) and Pradhan, Biswajeet
- organization
- publishing date
- 2021-06
- type
- Contribution to journal
- publication status
- published
- subject
- in
- Geoscience Frontiers
- volume
- 12
- issue
- 6
- article number
- 101232
- pages
- 15 pages
- publisher
- China University of Geosciences (Beijing) and Peking University
- external identifiers
- scopus:85107116789
- ISSN
- 1674-9871
- DOI
- 10.1016/j.gsf.2021.101232
- language
- English
- LU publication?
- yes
- id
- 3debb5da-06cd-4a58-9302-9baf7f74ecf5
- date added to LUP
- 2021-06-07 13:23:08
- date last changed
- 2023-10-10 20:36:29
@article{3debb5da-06cd-4a58-9302-9baf7f74ecf5,
  abstract = {{One important step in binary modeling of environmental problems is the generation of absence-datasets, which are traditionally generated by random sampling, an approach that can undermine the quality of outputs. To solve this problem, this study develops the Absence Point Generation (APG) toolbox, a Python-based ArcGIS toolbox for automated construction of absence-datasets for geospatial studies. The APG employs a frequency ratio analysis of four commonly used and important driving factors (altitude, slope degree, topographic wetness index, and distance from rivers) and considers the presence locations buffer and density layers to define the low potential or susceptibility zones where absence-datasets are generated. To test the APG toolbox, we applied two benchmark algorithms, random forest (RF) and boosted regression trees (BRT), in a case study to investigate groundwater potential using three absence datasets, i.e., those from the APG, random sampling, and the selection of absence samples (SAS) toolbox. The BRT-APG and RF-APG had area under the receiver operating characteristic curve (AUC) values of 0.947 and 0.942, while BRT and RF had weaker performances with the SAS and Random datasets. This effect resulted in AUC improvements for BRT and RF of 7.2% and 9.7% over the Random dataset, and of 6.1% and 5.4% over the SAS dataset, respectively. The APG also impacted the importance of the input factors and the pattern of the groundwater potential maps, which demonstrates the importance of absence points in environmental binary issues. The proposed APG toolbox could be easily applied to other environmental hazards such as landslides, floods, gully erosion, and land subsidence.}},
  author = {{Naghibi, Seyed Amir and Hashemi, Hossein and Pradhan, Biswajeet}},
  issn = {{1674-9871}},
  journal = {{Geoscience Frontiers}},
  language = {{eng}},
  number = {{6}},
  publisher = {{China University of Geosciences (Beijing) and Peking University}},
  title = {{APG: A novel python-based ArcGIS toolbox to generate absence-datasets for geospatial studies}},
  url = {{https://lup.lub.lu.se/search/files/98896251/Naghibi_et_al_2021.pdf}},
  doi = {{10.1016/j.gsf.2021.101232}},
  volume = {{12}},
  year = {{2021}},
}