Domain Global Graphs

Integrates PGX & Data Studio into solutions that investigate domain graphs, e.g., in the Financial Crime & Compliance Studio (FCC Studio), and researches improvements, e.g., by using Machine Learning.

Project Details

Project Overview

In enterprise use cases of Oracle Labs' PGX and Data Studio projects, organizations often adopt a graph data model to integrate various data sources into one global view of their enterprise domain (e.g., financial, retail, or healthcare data), so that they can run graph analytics and conduct investigations on it. PGX helps users reveal latent information in the graph representation of domain data through a toolkit for graph analysis that supports running graph algorithms such as PageRank as well as custom ones, SQL-like pattern matching on graphs with PGQL (Property Graph Query Language), and graph Machine Learning tools. Oracle Labs Data Studio is a web-based notebook platform that combines live code collaboration in multiple programming languages with graph analytics and rich, interactive visualizations. These visualizations are specialized for graphs: they support filtering graphs, highlighting elements, visualizing geographical data, and expanding or contracting the graph view.
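To make the kind of analysis mentioned above concrete, here is a minimal, illustrative sketch of PageRank in plain Python. It does not use the PGX API; the function name `pagerank`, the account identifiers, and the parameter defaults are assumptions chosen for illustration only.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of directed (source, target) edges.

    Illustrative only: this is a plain-Python sketch, not the PGX implementation.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {node: base for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical money transfers between four accounts.
transfers = [
    ("acct_a", "acct_c"),
    ("acct_b", "acct_c"),
    ("acct_c", "acct_d"),
    ("acct_d", "acct_a"),
]
ranks = pagerank(transfers)
for account, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{account}: {score:.4f}")
```

In this toy graph, the account that receives transfers from the most sources ranks highest, which is the sort of signal an investigator might use to prioritize which entities to look at first.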

A prime example of a domain global graph is the Oracle Financial Services Crime and Compliance Studio in the financial domain. Compliance Studio is an integrated workbench for financial crime data scientists. It provides tools for financial institutions to detect and prevent financial crime. Using Entity Resolution techniques, Compliance Studio can automatically link a bank's data, such as customers, transactions, and alerts, to related data in external watchlists and data sources, forming a global financial graph. It provides a range of tools to support the investigation of potential financial crimes in the financial graph:

* Anti-Money Laundering scenario authoring
* Pre-defined out-of-the-box scenarios, e.g., to calculate risk factors and red flags, and explore investigation cases
* Machine Learning enhancements
* Open tools including Apache Spark, Apache Zeppelin, R and Python
* Frequent updates to the global graph

The Domain Global Graph team helps integrate PGX and Data Studio into solutions that support the investigation of domain global graphs, and conducts further research on how to produce additional insight that facilitates investigation, e.g., by using Machine Learning techniques. We work on a range of challenging research topics, including:

  • Development and enhancement of machine learning and data analysis algorithms for Global Graph applications. Example use cases:
    • Recognizing graph entities and their relationships from text and tables, employing and improving techniques such as Natural Language Processing (NLP), Named Entity Recognition (NER), Relation Extraction, and Co-Reference Resolution.
    • Generating narrative text from a domain global graph, adapted to the graph structure.
    • Improving Entity Resolution (ER) techniques, e.g., by considering combinations of entities' fields and value thresholds, or by considering the graph structure and properties of the resolved entities, while also providing explainability.
    • Improving global graph investigation, e.g., regarding financial patterns by detecting similar or relevant subgraphs, or by automating the processing and classification of cases under investigation.
    • Improving name and address parsing, standardization and matching, across different languages as well.
  • Optimization of the data pipeline for integrating data sources into a graph model and keeping the data synchronized.
  • Ensuring data permission and lineage control for large-scale data integration.
  • Customized visualization for interactive data analysis and explainability.
  • Optimization and validation of the data analysis pipeline and its deployment with a modern software architecture.
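
As an illustration of the Entity Resolution topic listed above, the following sketch shows one simplified reading of "field combinations and value thresholds": each rule names a combination of fields, and two records match when every field in the rule reaches its similarity threshold. This is not the team's actual method. The rules, thresholds, record fields, and the helper names `similarity` and `matching_rule` are all hypothetical, and the string similarity comes from Python's standard `difflib` module rather than any domain-specific matcher.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """String similarity in [0, 1]; missing or empty values never match."""
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


# Hypothetical rules: each maps a combination of fields to minimum similarities.
RULES = [
    {"name": 0.85, "date_of_birth": 1.0},
    {"name": 0.75, "address": 0.80},
]


def matching_rule(record_a, record_b, rules=RULES):
    """Return (rule index, per-field scores) for the first satisfied rule, else None.

    Returning the scores gives a simple form of explainability for each match.
    """
    for index, rule in enumerate(rules):
        scores = {
            field: similarity(record_a.get(field, ""), record_b.get(field, ""))
            for field in rule
        }
        if all(scores[field] >= threshold for field, threshold in rule.items()):
            return index, scores
    return None


# Hypothetical records from a bank's customer data and an external watchlist.
customer = {"name": "John Smith", "date_of_birth": "1980-04-12", "address": "12 Main Street"}
watchlist_entry = {"name": "Jon Smith", "date_of_birth": "1980-04-12"}
unrelated_entry = {"name": "Maria Rossi", "date_of_birth": "1975-01-30"}

print(matching_rule(customer, watchlist_entry))
print(matching_rule(customer, unrelated_entry))
```

Records that match under some rule could then be linked by an edge in the global graph, with the rule index and scores stored as edge properties so that an investigator can see why two entities were resolved together.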

Principal Investigator

Iraklis Psaroudakis

Principal Member of Technical Staff

I am a researcher at Oracle Labs, Switzerland. My main research interests include improving the performance of analytical & graph workloads, parallel programming, and OS / runtime-system interaction. Prior to Oracle, I completed my Ph.D. at the Data-Intensive Application and Systems Laboratory of EPFL (Lausanne, Switzerland), focusing on scaling up highly concurrent analytical database workloads on multi-socket multi-core servers through (a) sharing data and work across concurrent queries, and (b) adaptive NUMA-aware data placement and task scheduling.

Publications