Assessment and benchmarking of spatially enabled RDF stores for the next generation of spatial data infrastructure

Huang, Weiming; Raza, Syed Amir; Mirzov, Oleg and Harrie, Lars (2019). In ISPRS International Journal of Geo-Information 8(7).
Abstract

Geospatial information is indispensable for various real-world applications and is thus a prominent part of today’s data science landscape. Geospatial data is primarily maintained and disseminated through spatial data infrastructures (SDIs). However, current SDIs are facing challenges in terms of data integration and semantic heterogeneity because of their partially siloed data organization. In this context, linked data provides a promising means to unravel these challenges, and it is seen as one of the key factors moving SDIs toward the next generation. In this study, we investigate the technical environment of the support for geospatial linked data by assessing and benchmarking some popular and well-known spatially enabled RDF stores (RDF4J, GeoSPARQL-Jena, Virtuoso, Stardog, and GraphDB), with a focus on GeoSPARQL compliance and query performance. The tests were performed in two different scenarios. In the first scenario, geospatial data forms a part of a large-scale data infrastructure and is integrated with other types of data. In this scenario, we used ICOS Carbon Portal’s metadata—a real-world Earth Science linked data infrastructure. In the second scenario, we benchmarked the RDF stores in a dedicated SDI environment that contains purely geospatial data, and we used geospatial datasets with both crowd-sourced and authoritative data (the same test data used in a previous benchmark study, the Geographica benchmark). The assessment and benchmarking results demonstrate that the GeoSPARQL compliance of the RDF stores has encouragingly advanced in the last several years. The query performances are generally acceptable, and spatial indexing is imperative when handling a large number of geospatial objects. Nevertheless, query correctness remains a challenge for cross-database interoperability. In conclusion, the results indicate that the spatial capacity of the RDF stores has become increasingly mature, which could benefit the development of future SDIs.

type
Contribution to journal
publication status
published
subject
keywords
GeoSPARQL, Geospatial data, Linked data benchmark, RDF stores, Spatial data infrastructure
in
ISPRS International Journal of Geo-Information
volume
8
issue
7
publisher
MDPI AG
external identifiers
  • scopus:85069583593
ISSN
2220-9964
DOI
10.3390/ijgi8070310
language
English
LU publication?
yes
id
f6a936e1-ee07-4dd1-b4a5-8f3ec7a5408a
date added to LUP
2019-08-05 13:01:18
date last changed
2019-08-28 04:57:45
@article{f6a936e1-ee07-4dd1-b4a5-8f3ec7a5408a,
  abstract     = {Geospatial information is indispensable for various real-world applications and is thus a prominent part of today’s data science landscape. Geospatial data is primarily maintained and disseminated through spatial data infrastructures (SDIs). However, current SDIs are facing challenges in terms of data integration and semantic heterogeneity because of their partially siloed data organization. In this context, linked data provides a promising means to unravel these challenges, and it is seen as one of the key factors moving SDIs toward the next generation. In this study, we investigate the technical environment of the support for geospatial linked data by assessing and benchmarking some popular and well-known spatially enabled RDF stores (RDF4J, GeoSPARQL-Jena, Virtuoso, Stardog, and GraphDB), with a focus on GeoSPARQL compliance and query performance. The tests were performed in two different scenarios. In the first scenario, geospatial data forms a part of a large-scale data infrastructure and is integrated with other types of data. In this scenario, we used ICOS Carbon Portal’s metadata---a real-world Earth Science linked data infrastructure. In the second scenario, we benchmarked the RDF stores in a dedicated SDI environment that contains purely geospatial data, and we used geospatial datasets with both crowd-sourced and authoritative data (the same test data used in a previous benchmark study, the Geographica benchmark). The assessment and benchmarking results demonstrate that the GeoSPARQL compliance of the RDF stores has encouragingly advanced in the last several years. The query performances are generally acceptable, and spatial indexing is imperative when handling a large number of geospatial objects. Nevertheless, query correctness remains a challenge for cross-database interoperability. In conclusion, the results indicate that the spatial capacity of the RDF stores has become increasingly mature, which could benefit the development of future SDIs.},
  articleno    = {310},
  author       = {Huang, Weiming and Raza, Syed Amir and Mirzov, Oleg and Harrie, Lars},
  issn         = {2220-9964},
  keywords     = {GeoSPARQL, Geospatial data, Linked data benchmark, RDF stores, Spatial data infrastructure},
  language     = {eng},
  month        = jul,
  number       = {7},
  publisher    = {MDPI AG},
  journal      = {ISPRS International Journal of Geo-Information},
  title        = {Assessment and benchmarking of spatially enabled RDF stores for the next generation of spatial data infrastructure},
  doi          = {10.3390/ijgi8070310},
  url          = {https://doi.org/10.3390/ijgi8070310},
  volume       = {8},
  year         = {2019},
}