The Shortcut Index
(2016) In 1650-2884 EDA920 20152Department of Computer Science
- Abstract
- With a novel path index design, called the Shortcut Index, we partially solve the problem of executing traversal queries on dense neighborhoods in a graph database. We implement our design on top of the graph database Neo4j but it could be used for any graph database that uses the labeled property graph model.
By using a B+ tree, the Shortcut Index can achieve what we call neighborhood locality and range locality of paths. This means that data that belongs to the same part of the graph is located in the same space on disk. We empirically evaluate how this affects performance in terms of response time. In our benchmarks we use two different datasets, one that simulates a real world use case and a "lab environment" that makes it possible... (More) - With a novel path index design, called the Shortcut Index, we partially solve the problem of executing traversal queries on dense neighborhoods in a graph database. We implement our design on top of the graph database Neo4j but it could be used for any graph database that uses the labeled property graph model.
By using a B+ tree, the Shortcut Index can achieve what we call neighborhood locality and range locality of paths. This means that data that belongs to the same part of the graph is located in the same space on disk. We empirically evaluate how this affects performance in terms of response time. In our benchmarks we use two different datasets, one that simulates a real world use case and a "lab environment" that makes it possible to vary neighborhood density and percent of neighborhood interest to more accurately examine how it affects performance. Our experiments show that response time of the index scales very well with neighborhood density and percent of neighborhood interest when compared to Neo4j without the index.
We put focus on making the Shortcut Index useful in an OLTP (online transaction processing) environment, which enforces restrictions on update overhead which in turn restricts how long paths that can be indexed. The presented design can index arbitrary long paths but in our implementation we only index paths of length one.
We conclude that the Shortcut Index improves response time at a reasonable cost and is especially useful when indexing dense neighborhoods that are often queried with limitations on some property value. (Less)
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/8572658
- author
- Persson, Anton ^{LU}
- supervisor
- organization
- alternative title
- Path Indexing in Graph Databases
- course
- EDA920 20152
- year
- 2016
- type
- H3 - Professional qualifications (4 Years - )
- subject
- keywords
- neo4j, ldbc, B+ tree, oltp, percent of neighborhood interest, range locality, neighborhood locality, graph database, labeled property graph, graph density, neighborhood density, path index
- publication/series
- 1650-2884
- report number
- LU-CS-EX 2016-03
- ISSN
- 1650-2884
- language
- English
- id
- 8572658
- date added to LUP
- 2016-02-05 09:36:08
- date last changed
- 2016-02-05 09:36:08
@misc{8572658, abstract = {With a novel path index design, called the Shortcut Index, we partially solve the problem of executing traversal queries on dense neighborhoods in a graph database. We implement our design on top of the graph database Neo4j but it could be used for any graph database that uses the labeled property graph model. By using a B+ tree, the Shortcut Index can achieve what we call neighborhood locality and range locality of paths. This means that data that belongs to the same part of the graph is located in the same space on disk. We empirically evaluate how this affects performance in terms of response time. In our benchmarks we use two different datasets, one that simulates a real world use case and a "lab environment" that makes it possible to vary neighborhood density and percent of neighborhood interest to more accurately examine how it affects performance. Our experiments show that response time of the index scales very well with neighborhood density and percent of neighborhood interest when compared to Neo4j without the index. We put focus on making the Shortcut Index useful in an OLTP (online transaction processing) environment, which enforces restrictions on update overhead which in turn restricts how long paths that can be indexed. The presented design can index arbitrary long paths but in our implementation we only index paths of length one. We conclude that the Shortcut Index improves response time at a reasonable cost and is especially useful when indexing dense neighborhoods that are often queried with limitations on some property value.}, author = {Persson, Anton}, issn = {1650-2884}, keyword = {neo4j,ldbc,B+ tree,oltp,percent of neighborhood interest,range locality,neighborhood locality,graph database,labeled property graph,graph density,neighborhood density,path index}, language = {eng}, note = {Student Paper}, series = {1650-2884}, title = {The Shortcut Index}, year = {2016}, }