Parallel Graph AnalytiX (PGX)

Graphs are a powerful abstraction for knowledge discovery in large datasets because they represent relationships explicitly as edges. Graph analysis reveals latent information that is encoded not as fields in the data but as direct and indirect relationships between data elements. This information is not obvious at first glance, but it can have tremendous value once uncovered.

PGX is a toolkit for graph analysis that supports both running algorithms such as PageRank on graphs and performing SQL-like pattern matching on graphs, where queries can draw on the results of algorithmic analyses. Algorithms are parallelized for high performance. The toolkit includes a single-node in-memory engine and a distributed engine for extremely large graphs. Graphs can be loaded from a variety of sources, including flat files, SQL and NoSQL databases, Apache Spark, and Hadoop; incremental updates are supported.
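To make the kind of algorithm mentioned above concrete, here is a minimal, sequential sketch of PageRank by power iteration in plain Python. It does not use the PGX API and is not how PGX implements or parallelizes the algorithm; the function name `pagerank`, its parameters, and the sample edge list are all assumptions chosen for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges.

    Illustrative only: a single-threaded sketch, not the PGX implementation.
    """
    nodes = sorted({v for edge in edges for v in edge})
    out_links = {v: [] for v in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by vertices with no outgoing edges is spread evenly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        # Each vertex passes its rank to its successors in equal shares.
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical four-vertex graph: "c" is linked to by every other vertex.
edges = [("a", "c"), ("b", "c"), ("c", "a"), ("d", "c")]
ranks = pagerank(edges)
top = max(ranks, key=ranks.get)
```

The resulting scores live on vertices rather than in any input field, which is the sense in which the analysis surfaces information encoded only in the relationships. A follow-up query could then filter or match on those scores.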

PGX is available today as an option in Oracle products and is also an active research project at Oracle Labs, where a team of researchers continues to advance the toolkit's capabilities.

 

