Compilation and execution strategies for efficient and explainable evaluation of recursive graph queries

Project

Compilation and execution strategies for efficient and explainable evaluation of recursive graph queries

Principal Investigator

George Fletcher

Eindhoven University of Technology (TU/e), Department of Mathematics & Computer Science

Oracle Fellowship Recipient

Xuming Meng

Oracle Principal Investigator

Hassan Chafi, Vice President, Research and Advanced Development
Sungpack Hong, Vice President, Oracle Labs

Summary

Analytics on massive graphs continues to grow in importance in both academic and industrial application domains.  A fundamental analytic task on graph data is to evaluate queries exhibiting recursive patterns.  For example, in a bibliographical knowledge graph, a data analyst might be interested in finding all pairs of authors in the same co-authorship network, or, for each such pair, finding the shortest co-authorship paths which connect them. These common query patterns, which are supported in Oracle PGX’s PGQL language, are recursive in the sense that their evaluation requires apriori unbounded navigation along the connectivity structure of the underlying graph data set.  Unfortunately, graph analytics technologies have difficulty answering these common queries as graphs continue to grow in size.

In this project, we aim to research and develop fundamental solutions to directly address this situation in the state of the art of graph analytics engines, which is a serious limiting factor in their practical deployment.  In particular, we will advance the field by investigating (1) efficient solutions for evaluation of recursive graph queries which select vertices in graphs (e.g., the pairs of authors in our example above); (2) the design and engineering of effective algorithms and data structures for evaluating recursive graph queries which select paths in graphs (e.g., the shortest co-authorship paths in our example above); (3) benchmark tooling to support the principled engineering and experimental study of recursive graph query processing solutions; and, (4) effective solutions for helping data analysts to understand the execution behavior and the output of sophisticated recursive graph queries.

The project will also lead to algorithms and data structures which can be used by the Oracle graph-processing group to enhance the systems they are currently developing. By integrating the developed solutions, the Oracle product group will be able to demonstrate to customers the effectiveness of the new approaches and to efficiently compute analysis results which were previously impractical to obtain.