Please use this identifier to cite or link to this item:
http://theses.ncl.ac.uk/jspui/handle/10443/4416
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Firth, Hugo Edward Boswell | - |
dc.date.accessioned | 2019-08-13T15:33:05Z | - |
dc.date.available | 2019-08-13T15:33:05Z | - |
dc.date.issued | 2018 | - |
dc.identifier.uri | http://theses.ncl.ac.uk/jspui/handle/10443/4416 | - |
dc.description | PhD Thesis | en_US |
dc.description.abstract | Many modern applications, from social networks to network security tools, rely upon the graph data model, using it as part of an offline analytics pipeline or, increasingly, for storing and querying data online, e.g. in a graph database management system (GDBMS). Unfortunately, effective horizontal scaling of this graph data reduces to the NP-hard problem of “k-way balanced graph partitioning”. Owing to the problem’s importance, several practical approaches exist, producing quality graph partitionings. However, these existing systems are unsuitable for partitioning online graphs, either introducing unnecessary network latency during query processing, being unable to efficiently adapt to changing data and query workloads, or both. In this thesis we propose partitioning techniques which are efficient and sensitive to given query workloads, suitable for application to online graphs and query workloads. To incrementally adapt partitionings in response to workload change, we propose TAPER: a graph repartitioner. TAPER uses novel data structures to compute the probability of expensive inter-partition traversals (ipt) from each vertex, given the current workload of path queries. Subsequently, it iteratively adjusts an initial partitioning by swapping selected vertices amongst partitions, heuristically maintaining low ipt and high partition quality with respect to that workload. Iterations are inexpensive thanks to time and space optimisations in the underlying data structures. To incrementally create partitionings in response to graph growth, we propose Loom: a streaming graph partitioner. Loom uses another novel data structure to detect common patterns of edge traversals when executing a given workload of pattern matching queries. Subsequently, it employs a probabilistic graph isomorphism method to incrementally and efficiently compare sub-graphs in the stream of graph updates to these common patterns. Matches are assigned within individual partitions if possible, thereby also reducing ipt and increasing partitioning quality with respect to the given workload. Both partitioner and repartitioner are extensively evaluated with real and synthetic graph datasets and query workloads. The headline results include that TAPER can reduce ipt by up to 80% over a naive existing partitioning and can maintain this reduction in the event of workload change, through additional iterations. Meanwhile, Loom reduces ipt by up to 40% over a state-of-the-art streaming graph partitioner. | en_US |
dc.language.iso | en | en_US |
dc.publisher | Newcastle University | en_US |
dc.title | Workload-sensitive approaches to improving graph data partitioning online | en_US |
dc.type | Thesis | en_US |
Appears in Collections: | School of Computing Science |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Firth H 2018.pdf | Thesis | 1.52 MB | Adobe PDF |
dspacelicence.pdf | Licence | 43.82 kB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.