[DB Seminar][Colloquium] System Support for Managing Graph Data

Dr. Sameh Elnikety, Microsoft Research

Wednesday November 6, 2013
3:00 pm - Sennott Square - Room 6516
(Hosted by Dr. Panos Chrysanthis)

System Support for Managing Graph Data

ABSTRACT

Many applications employ large graphs to model their data. Graph data management is hard: Developers want to use a declarative language to express both reachability queries (e.g., what is the shortest path between two nodes) and pattern matching queries (e.g., find all matches of a small graph in a larger graph). Today, we do not have a graph engine that can process both types of queries on a general graph. From the system point of view, the graph query processor visits a large number of nodes and edges in a random access pattern with little locality while executing queries. We therefore manage the graph in main memory to provide low-latency access, and if the graph is too large we partition it to fit in the main memory of a cluster of machines, forming a distributed system.

In this talk I will discuss three techniques which address these challenges. First, I will present a system that processes declarative reachability queries on a distributed graph. Second, I will extend the reachability algorithm to handle pattern matching queries, forming a unified framework for processing both types of queries. Finally, I will show how to support updates to the graph efficiently while providing consistent distributed snapshots to queries in a lock-free manner.

BIOGRAPHY OF SPEAKER

Sameh Elnikety is a researcher at Microsoft Research. His research focuses on experimental distributed systems. Sameh's work spans a number of research areas including operating systems, distributed computing and databases. Sameh earned his PhD from EPFL in 2007 and MS from Rice in 2003, and his work on database replication received the best paper award at Eurosys 2007.