Compilation and execution strategies for efficient and explainable evaluation of recursive graph queries
Project
Compilation and execution strategies for efficient and explainable evaluation of recursive graph queries
Principal Investigator
George Fletcher
Eindhoven University of Technology (TU/e), Department of Mathematics & Computer Science
Oracle Fellowship Recipient
Xuming Meng
Oracle Principal Investigator
Hassan Chafi, Vice President, Research and Advanced Development
Sungpack Hong, Vice President, Oracle Labs
Summary
Analytics on massive graphs continues to grow in importance in both academic and industrial application domains. A fundamental analytic task on graph data is to evaluate queries exhibiting recursive patterns. For example, in a bibliographical knowledge graph, a data analyst might be interested in finding all pairs of authors in the same co-authorship network, or, for each such pair, finding the shortest co-authorship paths which connect them. These common query patterns, which are supported in Oracle PGX’s PGQL language, are recursive in the sense that their evaluation requires apriori unbounded navigation along the connectivity structure of the underlying graph data set. Unfortunately, graph analytics technologies have difficulty answering these common queries as graphs continue to grow in size.
In this project, we aim to research and develop fundamental solutions to directly address this situation in the state of the art of graph analytics engines, which is a serious limiting factor in their practical deployment. In particular, we will advance the field by investigating (1) efficient solutions for evaluation of recursive graph queries which select vertices in graphs (e.g., the pairs of authors in our example above); (2) the design and engineering of effective algorithms and data structures for evaluating recursive graph queries which select paths in graphs (e.g., the shortest co-authorship paths in our example above); (3) benchmark tooling to support the principled engineering and experimental study of recursive graph query processing solutions; and, (4) effective solutions for helping data analysts to understand the execution behavior and the output of sophisticated recursive graph queries.
The project will also lead to algorithms and data structures which can be used by the Oracle graph-processing group to enhance the systems they are currently developing. By integrating the developed solutions, the Oracle product group will be able to demonstrate to customers the effectiveness of the new approaches and to efficiently compute analysis results which were previously impractical to obtain.