Machine Learning Research Group

OVERVIEW

  • Mission

    The mission of the Machine Learning Research Group is to scale Machine Learning (ML) across Oracle. The group does this by evangelizing ML throughout the company, collaborating with product groups to develop ML-based solutions that improve their products and services, developing ML tools and systems that help those groups implement ML solutions on their own, and doing fundamental ML research to better understand what product groups will need in the future. The Machine Learning Research Group is composed of three teams that collaborate closely with each other, with other Labs research groups, and with product groups and business units across the entire company.

    Data Science Team

    The Data Science Team collaborates with product groups on specific ML projects. Members of the Data Science Team work directly with engineers and managers in other groups to determine what problems they are facing, what data they have available, how they can collect annotations to enable ML algorithms, and, ultimately, how to deploy the ML solution in their product. The Data Science Team works with the Research Team to find novel solutions to collaborator problems and with the Software Team to build solutions that can be transferred easily to the product teams. A sampling of current active collaborations includes:

    Oracle Social Cloud
    Model-based sentiment analysis for social data. The team worked with OSC on collecting and annotating data for sentiment across a dozen languages. The team has also begun investigations into named entity recognition and entity linking in social streams.
    Oracle Global Sales
    Investigating the use of Machine Learning to improve the Oracle sales process using a variety of data sources.
    Oracle Data Cloud
    Developing novel feature selection and structure learning methods for use in the ODC data processing pipeline. Investigating the use of Machine Learning to develop new applications that process data from multiple sources.
    Health Sciences GBU
    Named entity recognition and relationship extraction in the medical domain.

    Software Team

    The Software Team builds systems and infrastructure that enable ML work in our group and throughout Oracle. Their work has three main focuses:

    Notebook Systems Infrastructure
    Notebook interfaces have been growing in popularity for ML and Data Science work. The team’s aim is to build notebook infrastructure that gives data scientists a wide variety of powerful ML tools in an easy-to-use interface that supports scalable, repeatable work.
    Hyperparameter Optimization
    Most ML models have hyperparameters associated with them, such as the number of clusters to use for a clustering algorithm or the learning rate associated with a classification algorithm. For any given model, we want to choose the hyperparameters that lead to the best performance. The number of combinations of hyperparameters for any given model can be quite large. As a result, we need to parallelize the evaluation of sets of hyperparameters using a system that can not only select the next set of hyperparameters to try, but can also provide fair access to computational resources and monitor the training and evaluation of hundreds or thousands of models at once.
    Machine Learning Tools
    The team is responsible for the maintenance and improvement of the ML tools developed by the Machine Learning Research Group.
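
    The parallel evaluation of hyperparameter sets described under Hyperparameter Optimization can be illustrated with a small sketch. This is not the group's system: it substitutes plain random search for the selection strategy, uses a local thread pool in place of a shared compute cluster, and replaces model training with a toy scoring function. The names sample_hyperparameters, evaluate, and random_search, and the search ranges, are hypothetical.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def sample_hyperparameters(rng):
    """Draw one candidate configuration from a hypothetical search space."""
    return {
        "learning_rate": 10 ** rng.uniform(-4, 0),  # log-uniform over [1e-4, 1]
        "num_clusters": rng.randint(2, 20),         # inclusive on both ends
    }


def evaluate(config):
    """Stand-in for training and validating a model.

    Returns a score to maximize. This toy function peaks (at 0) when
    learning_rate is 0.01 and num_clusters is 8; a real evaluation would
    train a model and report a validation metric.
    """
    lr_penalty = (math.log10(config["learning_rate"]) + 2) ** 2
    cluster_penalty = ((config["num_clusters"] - 8) / 10) ** 2
    return -(lr_penalty + cluster_penalty)


def random_search(num_trials=50, max_workers=4, seed=0):
    """Sample configurations up front, score them concurrently, return the best."""
    rng = random.Random(seed)
    configs = [sample_hyperparameters(rng) for _ in range(num_trials)]
    # pool.map preserves input order, so scores[i] belongs to configs[i].
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(evaluate, configs))
    best_index = max(range(num_trials), key=scores.__getitem__)
    return configs[best_index], scores[best_index]


if __name__ == "__main__":
    best_config, best_score = random_search()
    print("best configuration:", best_config)
    print("best score:", best_score)
```

    Random search picks every configuration in advance, so it covers none of the scheduling, fair resource sharing, or monitoring concerns mentioned above. A strategy that selects the next set of hyperparameters from earlier results would replace the up-front sampling step.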

    Research Team

    The Research Team works on emerging technologies that may be of use to Oracle in the future. Their research interests span a number of areas, including:

    Core Machine Learning
    Feature selection, causal inference, classification, inference techniques, unsupervised learning, and active learning
    Statistical Natural Language Processing
    Named entity recognition, entity linking, large-scale coreference resolution, automated knowledge base construction, product attribute extraction, and word embedding models
    Scalable Machine Learning
    Parallel inference, Machine Learning on alternative architectures
    Deep Learning
    Neural variational inference, character-level NLP, and image labelling and segmentation

PUBLICATIONS