The mission of the Machine Learning Research Group is to scale
Machine Learning (ML) across Oracle. The group does this by
evangelizing ML throughout the company, collaborating with product
groups to develop ML-based solutions that improve their products and
services, developing ML tools and systems that help those groups
implement ML solutions on their own, and doing fundamental ML
research to better understand what product groups will need in the
future. The Machine Learning Research Group is composed of three
teams that collaborate closely with each other, with other Labs
research groups, and with product groups and business units across
the entire company.
Data Science Team
The Data Science Team collaborates with product groups on specific
ML projects. Members of the Data Science Team work directly with
engineers and managers in other groups to determine what problems
they are facing, what data they have available, how they can collect
annotations to enable ML algorithms, and, ultimately, how to deploy
the ML solution in their product. The Data Science Team works with
the Research Team to find novel solutions to collaborator problems
and with the Software Team to build solutions that can be
transferred easily to the product teams. A sampling of current active collaborations:
- Oracle Social Cloud (OSC)
- Model-based sentiment analysis for
social data. The team worked with OSC on collecting and annotating
data for sentiment across a dozen languages. The team has also begun
investigations into named entity recognition and entity linking in
- Oracle Global Sales
- Investigating the use of Machine Learning to
improve the Oracle sales process using a variety of data sources.
- Oracle Data Cloud (ODC)
- Developing novel feature selection and structure learning methods for use in the ODC data processing pipeline. Investigating the use of Machine Learning to
develop new applications that process data from multiple sources.
- Health Sciences GBU
- Named entity recognition and
relationship extraction in the medical domain.
Software Team
The Software Team builds systems and infrastructure that enable ML
work in our group and throughout Oracle. Their work has three main
areas of focus:
- Notebook Systems Infrastructure
- Notebook interfaces have been growing in popularity for ML and Data Science work. The team’s aim is to build notebook infrastructure that gives data scientists a wide variety of powerful ML tools behind an easy-to-use interface that supports scalable, repeatable work.
- Hyperparameter Optimization
- Most ML models have hyperparameters, such as the number of clusters for a clustering algorithm or the learning rate for a classification algorithm. For any given model, we want to choose the hyperparameters that lead to the best performance, but the number of possible combinations can be quite large. As a result, we need to parallelize the evaluation of sets of hyperparameters using a system that can not only select the next set of hyperparameters to try, but also provide fair access to computational resources and monitor the training and evaluation of hundreds or thousands of models at once.
- Machine Learning Tools
- The team maintains and improves the ML tools developed by the Machine Learning Research Group.
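As a rough illustration of the Hyperparameter Optimization item above, the following minimal Python sketch evaluates randomly sampled hyperparameter sets in parallel. It is not the group's system: the search ranges, the toy `evaluate` function (a stand-in for training and scoring a real model), and the names `sample_config` and `random_search` are all assumptions made for this example. Plain random search replaces any smarter strategy for selecting the next set, and a local thread pool stands in for a cluster scheduler, so fair resource sharing and monitoring are not addressed.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical search space; the names and ranges are illustrative only.
LEARNING_RATE_RANGE = (1e-4, 1e-1)  # sampled log-uniformly
NUM_CLUSTERS_RANGE = (2, 20)        # sampled uniformly over integers


def sample_config(rng):
    """Propose one set of hyperparameters (plain random search)."""
    lo, hi = LEARNING_RATE_RANGE
    k_lo, k_hi = NUM_CLUSTERS_RANGE
    return {
        "learning_rate": 10 ** rng.uniform(math.log10(lo), math.log10(hi)),
        "num_clusters": rng.randint(k_lo, k_hi),
    }


def evaluate(config):
    """Toy stand-in for training a model and scoring it on held-out data.

    The score is highest (0.0) at learning_rate=0.01 and num_clusters=8.
    """
    lr_term = (math.log10(config["learning_rate"]) + 2.0) ** 2
    k_term = ((config["num_clusters"] - 8) / 10.0) ** 2
    return -(lr_term + k_term)


def random_search(num_trials=50, max_workers=4, seed=0):
    """Evaluate num_trials random configurations in parallel."""
    rng = random.Random(seed)
    configs = [sample_config(rng) for _ in range(num_trials)]
    scores = [None] * num_trials
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its trial index so results stay ordered.
        futures = {pool.submit(evaluate, c): i for i, c in enumerate(configs)}
        for future in as_completed(futures):
            scores[futures[future]] = future.result()
    best = max(range(num_trials), key=lambda i: scores[i])
    return scores[best], configs[best], list(zip(scores, configs))


if __name__ == "__main__":
    best_score, best_config, _ = random_search()
    print(best_score, best_config)
```

With hundreds or thousands of real models, the `evaluate` call would be replaced by a job submitted to shared compute, which is where scheduling and monitoring become the hard part.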
Research Team
The Research Team works on emerging technologies that may be of use to
Oracle in the future. Their research interests span a number of areas, including:
- Core Machine Learning
- Feature selection, causal inference, classification, inference techniques, unsupervised learning, and active learning
- Statistical Natural Language Processing
- Named entity recognition, entity linking, large-scale coreference resolution, automated knowledge base construction, product attribute extraction, and word embedding models
- Scalable Machine Learning
- Parallel inference, Machine Learning on alternative architectures
- Deep Learning
- Neural variational inference, character-level NLP, and image labelling and segmentation