Oracle Labs Internship Program
If you are a student or recent graduate, an internship at Oracle Labs will help you build your skills by working on cutting-edge technology alongside our industry experts and scientists.
Opportunities For You
- Apply your skills and knowledge to build the future of technology
- Work in a distributed, self-driven, international team of industry experts and scientists
- Contribute to cutting-edge products and open-source projects
- Publish the results of your work
- Choose one of our research centers across the globe, or work from the comfort of your home
Your Skills
If you can tick three or more boxes from this list, go ahead and apply to work with us!
- Experience with relational data design and database queries
- Experience in modern object-oriented programming languages
- Experience in computer science fundamentals (data structures, algorithms and complexity analysis)
- Experience with parallel and distributed computing
- Experience with REST APIs and the concepts of RESTful architecture
- Experience with modern IDEs, version control (git), build management and Linux
- Experience with machine learning technologies and toolkits
- Good communication and presentation skills in English (required)
How to Apply
In order to apply, please send an email to the project's point of contact (see details below) including the following:
- Your CV, or a link to a home page containing your CV
- Your area(s) of interest
- Your preferred location
- Link to your GitHub profile (optional)
- For current students and recent graduates: University transcripts
The duration of the internship can vary based on the candidate's constraints; the usual duration is 6 months. We pay a competitive salary. The research topics listed below are indicative; we are open to suggestions depending on your skills and qualifications. By sending in your application, you opt in to the processing of your personal information.
If you would like to withdraw your internship application, please send an email to the project's point of contact.
GraalVM
A high-performance runtime supporting Java and other JVM languages, JavaScript, Ruby, R, Python, C/C++, and more. It can run standalone or embedded in OpenJDK / OracleJDK, Node.js, and Oracle Database.
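For a concrete flavor of this polyglot capability, here is a minimal sketch (assuming GraalVM, or the GraalVM polyglot SDK with the JavaScript language, is available) that evaluates JavaScript from a Java program using the org.graalvm.polyglot API:

```java
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Value;

public class PolyglotHello {
    public static void main(String[] args) {
        // Create a polyglot context and evaluate a JavaScript expression from Java.
        try (Context context = Context.create()) {
            Value result = context.eval("js", "[1, 2, 3, 4].reduce((a, b) => a + b)");
            System.out.println("Sum computed by JavaScript: " + result.asInt());
        }
    }
}
```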
Possible Research Areas
- Implement new optimizations and features for Graal, a modern compiler for Java written in Java
- Develop new language, monitoring, and other JDK features for Native Image
- Build new profile-guided optimization (PGO) features for Native Image
- Help build GraalOS, a new cutting-edge cloud technology based on Native Image
- Work on the Graal Cloud Development Kit, a new technology for building multicloud Java microservices
- Explore new use cases for machine learning within the GraalVM project
- Work on interactive tools and visualizations that help boost developer productivity
- Explore new security features for GraalVM and GraalOS
- Extend GraalPy and many other Truffle-based language implementations with new capabilities
- Join one of many research projects within the Graal project
Point of Contact
To apply, please send an email with the required information (see How to Apply above) to graalvm-internships_ww_grp@oracle.com.
Oracle Database Multilingual Engine
The Multilingual Engine (MLE) research project investigates how to leverage programming language runtimes in database management systems (DBMS). Our hypothesis is that application development and data science can benefit from running code as close to the data as possible. For example, Python workloads for training machine learning models can run directly in the DBMS, using the DBMS as a compute cluster with efficient access to data. Similarly, the best place to run data-centric applications can be the database system itself, completely eliminating performance concerns due to network round trips and reducing infrastructure costs. The focus of our work is to enable Oracle Database to execute such workloads written in modern and popular languages and frameworks. The foundation for the project is GraalVM, Oracle Labs’ high-performance, polyglot programming language runtime. A first outcome of this vision is the JavaScript support in Oracle Database 23c.
Additionally, we leverage Just-In-Time (JIT) compilation to improve the performance of database query processing. We explore making queries on relational tables and document collections faster using code generation and JIT compilation, all based on GraalVM and the Truffle framework.
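As a rough illustration of the "run code close to the data" idea (this is not the MLE interface itself), the JDBC sketch below invokes a hypothetical in-database scoring function with a single round trip instead of shipping every row to the client; the connection URL, credentials, and function name are placeholders.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

public class InDatabaseScoring {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details for illustration only.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/service", "app_user", "secret")) {
            // One call runs the (hypothetical) scoring logic inside the database,
            // next to the data, instead of pulling every row over the network.
            try (CallableStatement call = conn.prepareCall("{ ? = call score_customers(?) }")) {
                call.registerOutParameter(1, Types.NUMERIC);
                call.setInt(2, 2024); // e.g., a model version chosen by the caller
                call.execute();
                System.out.println("Rows scored in the database: " + call.getBigDecimal(1));
            }
        }
    }
}
```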
Internships in the MLE project offer the opportunity to work with state-of-the-art technology at the crossroads of database systems and programming language runtimes. The MLE project conducts research with a strong focus on practical applicability.
Potential Topics
We offer various topics depending on the candidate's skills and interests. Here are some of the projects that can be explored during the internship:
- Efficient columnar data export for in-database data science
- Reliable lock-free shared-memory data structures
- Compilers for tree ensemble inference in DB
- Python parallel computing in RDBMS
- Accelerating JSON processing in the Database
- Optimizing transactional workloads under a closed-world assumption
How to Apply
In order to apply, please send an email to labs-hiring_ww@oracle.com including the following:
- Your CV, or a link to a home page containing your CV
- Description of your motivation and area(s) of interest
- Availability and preferred internship duration
- Preferred location
Oracle Labs Apps
The Oracle Labs Apps team is in charge of designing, building and operating apps that follow the principles of modern app development.
The team develops apps that are used internally as well as apps that improve the developer experience of people who interact with Oracle's open-source projects. One such project is the Oracle Contributor Agreement Signing Service (OCASS). OCASS enables contributors to Oracle-sponsored open-source projects to sign the Oracle Contributor Agreement (OCA), a document which gives Oracle and the contributor joint copyright interests in the contributed code. All apps are developed and operated to adhere to high standards in terms of security, compliance, availability, and more.
Potential Topics
- Development of various features spanning the entire app stack
- Leverage database-centric architectures to simplify the app stack (e.g., transactional event queues for message queuing)
- Observing business metrics
Point of Contact
To apply, please send an email with the required information (see How to Apply above) to Labs-Hiring_ww@oracle.com.
Automating Machine Learning and Explainability (AutoMLx)
Accurate, fast, and easy-to-use automated machine learning pipeline with integrated explainability techniques.
Possible Research Areas
- AutoML and/or explainability for classification, regression, anomaly detection, and forecasting tasks
- Explore support for federated learning
- Explore techniques to reduce model bias while tuning
- Extend dataset support for unstructured (e.g., NLP) and semi-structured (e.g., video/audio/graph) data
- Generic model support including GNNs, DNNs and/or RNNs
Point of Contact
To apply, please send an email with the required information (see How to Apply above) to Labs-Hiring_ww@oracle.com.
Scalable Graph Analytics and Machine Learning
Graph analytics is a powerful tool to efficiently leverage the latent information stored in data connections. As the number of connections grows exponentially in today's Big Data, the ability to process graphs at scale becomes increasingly relevant. At Oracle Labs, we are developing scalable graph-processing solutions that cover a wide range of customer needs and applications:
- PGX: a standalone graph analytical system that supports graph algorithms such as PageRank, graph queries with PGQL (an SQL-like graph query language), and graph ML. PGX includes both a single-machine in-memory engine and a distributed engine for very large graphs; it is available as an option in Oracle products and remains an active research project at Oracle Labs (a small algorithmic sketch follows this list). Learn more about PGX
- Graph-in-DB: scalable graph processing support in the Oracle Database. Graph-in-DB is an ambitious project which leverages knowledge from various domains of computer science, such as databases, graphs, algorithms and data structures, tuning and performance, multicore and distributed computing, machine learning, and compilation. It involves significant research and design effort, as well as challenging engineering tasks. The Graph-in-DB project is also a great opportunity to gain unique software development experience as it takes place in an exceptionally large and complex system.
- Domain Global Graphs: for enterprise use cases, organizations adopt the graph data model to integrate various data sources in one global view from their enterprise domain (e.g., financial), so that they can run graph analytics and conduct investigations. The team integrates PGX and Data Studio into solutions that support the investigation of domain global graphs, and further researches how to gain additional insight to facilitate investigations, e.g., by using (Graph) Machine Learning. Learn more about Domain Global Graphs
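To give a flavor of the kind of computation PGX runs at scale, here is a small, self-contained PageRank sketch in plain Java; it illustrates the algorithm only and deliberately does not use the PGX or PGQL APIs.

```java
import java.util.Arrays;
import java.util.List;

public class PageRankSketch {
    // adjacency.get(v) lists the vertices that v points to; returns one rank value per vertex.
    static double[] pageRank(List<int[]> adjacency, double damping, int iterations) {
        int n = adjacency.size();
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);
        for (int iter = 0; iter < iterations; iter++) {
            double[] next = new double[n];
            Arrays.fill(next, (1.0 - damping) / n);
            for (int v = 0; v < n; v++) {
                int[] out = adjacency.get(v);
                if (out.length == 0) continue;          // ignore dangling vertices for brevity
                double share = damping * rank[v] / out.length;
                for (int w : out) next[w] += share;     // distribute rank along out-edges
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        // Tiny example graph: 0 -> 1, 1 -> 2, 2 -> 0 and 2 -> 1.
        List<int[]> graph = List.of(new int[]{1}, new int[]{2}, new int[]{0, 1});
        System.out.println(Arrays.toString(pageRank(graph, 0.85, 50)));
    }
}
```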
Potential Topics
- PGX (Learn more about potential topics)
  - Distributed fault tolerance & graph snapshots – exploring various options for enhancing fault tolerance of distributed graph processing systems
  - Extended distributed computations – leveraging an asynchronous depth-first runtime to support a broader scope of computations, such as graph algorithms, machine learning and relational operators
  - Distributed data/graph placement – exploring distributed data/graph placement and partitioning techniques in the presence of concurrent users
  - Distributed graph-based ML – retrieving graph embeddings for ML algorithms from distributed graphs
  - Dynamic data loading for very large graphs – supporting dynamic loading of data that is present in offloaded systems
- Graph-in-DB (Learn more about potential topics)
  - Hybrid execution modes – designing computations to efficiently operate when data is partially on disk and partially in memory
  - Complex analytical queries – exploring the latest techniques to autonomously exploit available graph indexes in complex analytical queries
  - ML in the Database – leveraging the latest machine learning techniques to improve the performance of various components of Oracle Database
  - Graph algorithm compilation – extending a compiler for a graph-centric domain-specific language
- Domain Global Graphs (Learn more about potential topics)
  - Text-to-graph conversion – recognizing graph entities and their relationships from unstructured data, employing and improving Seq2Seq, NER, RE, and CR techniques
  - Assistance and automation of global graph investigation workflows – e.g., regarding financial patterns, by detecting similar or relevant subgraphs or by automating crime type classification
  - Advanced entity resolution at scale – exploring text embeddings for similarity search, blocking, and graph machine learning techniques
  - Productization of machine learning and graph analytics research – building model serving APIs, training pipelines, and reusable components that cover the whole machine learning operations lifecycle
As an intern, you will participate in the design, implementation, and evaluation of at least one component of the system, and you will give informal and formal presentations on the progress and results obtained during the course of the internship.
The above topics are indicative; we offer various topics depending on the skills and qualifications of the applicant.
Point of Contact
To apply, please send an email with the required information (see How to Apply above) to Labs-Hiring_ww@oracle.com.
Software Supply Chain Security in the Cloud
Adoption of third-party open-source software (OSS) has increased significantly over the last few years, and for a good reason: OSS helps developers lower costs and reduce time to market by reusing existing components as building blocks for their applications. At the same time, vulnerabilities in OSS pose a significant risk to application security. Developers need to keep track of their (transitive) dependencies and known vulnerabilities (CVEs), and upgrade dependencies whenever a new CVE is found. Application Dependency Management (ADM) is an OCI-native service that helps manage dependencies in the customer's software supply chain. ADM is exploring and researching the software composition analysis space to help users manage the risk associated with using third-party components.
Potential Topics
The goal of this project is to extend the Application Dependency Management cloud service with new capabilities in the areas of automated tuning and upgrades, patching security vulnerabilities in application dependencies, and automated testing. We offer various topics depending on the skills and the interests of the candidate:
- Tailored security policy generation and parameter refinement: The least-privilege principle states that an application should run with the fewest privileges possible. Today, this principle can be enforced by creating a tailored security policy for mechanisms such as Seccomp, AppArmor, and SELinux. However, manually creating such a security policy (i.e., defining the smallest set of privileges that is necessary for the application) is a tedious, error-prone task, and one that needs to be revisited every time the application changes. Previous work has shown that it is instead possible to discover, with good accuracy, the smallest set of necessary privileges using static analysis of the source code, and thus to automatically generate the security policy.
- Improve Vulnerability Curation Process: Application Dependency Management provides a knowledge base of artifacts and their known vulnerabilities (CVEs). New and updated CVEs need to be processed to identify the list of vulnerable artifact versions. This curation process currently involves different tools and scripts, but there is no single solution to optimize the time to curate the CVE itself. The objective of this internship is to evaluate the potential of Data Science and Artificial Intelligence techniques, including machine learning, to automate the manual labor involved in the CVE curation process and enhance its overall efficiency. Additionally, the internship may also encompass the development of a user-friendly application to support the entire CVE curation process, further optimizing it through increased automation.
- Build profiles of third-party libraries: Modern applications heavily depend on open-source software components (dependencies). Developers are often not aware of what capabilities these dependencies need, for example, whether a dependency makes network requests, loads code at runtime, starts new threads, or writes to the file system. Giving developers insight into the capabilities that their dependencies require could help them make more informed decisions about which dependencies to include. The goal of this internship is to develop a system that automatically detects what capabilities an open-source dependency needs, based on its available test suite. Third-party libraries can also suffer from vulnerabilities that are assigned a score according to the Common Vulnerability Scoring System (CVSS). This score depends on the severity of the vulnerability; however, it can only be determined once a vulnerability in the third-party library has been found. Another objective of the internship could therefore be to evaluate a risk score (in addition to CVSS) for third-party dependency libraries. Different paths can be explored towards this idea, such as a risk score that takes into account the maturity of the project and its code repository (whether the project is still maintained, its license, its security policy), the usage of the third party (frequency and ways of usage), and the library's size, complexity, and how widely it is used in other projects.
- Evaluate the impact of changing a third-party library: At the moment, ADM remediation runs the CI/CD pipeline of the customer's service to confirm that updating a vulnerable dependency did not break the customer's application. The idea behind this internship topic is to improve this verification by also including historical data about the application: evaluate the impact of changing a third-party library by simulating an execution based on a recorded trace of an older version and checking whether the exercised code paths are the same. The high-level idea is to replay the history of the service, but with the updated version of the application in place of the version that was used at the time. Once this test is run, the verification step would have a better idea of how likely the dependency upgrade is to break the application, and this can be communicated to the user through something like a “disruption score”.
- Using kernel-level instrumentation to automatically derive build provenance: While producing an SBOM sounds easy in theory (many package managers, such as Maven and npm, have functionality to list all the declared third parties), it is hard in practice. Software stacks are polyglot (multiple programming languages) and composed of in-house scripting logic for building (e.g., downloading dependencies from the Internet), among other things. In this internship, you will explore the feasibility of using eBPF (an advanced Linux kernel-level instrumentation mechanism) to monitor the build processes, observe file and network accesses, and reconstruct all the inputs used for building the software.
Point of Contact
In order to apply, please send an email to Labs-Hiring_ww@oracle.com including the following:
- Your CV, or a link to a home page containing your CV
- Description of your motivation and area of interest
- Your preferred internship dates & location
Graal Cloud Service
Graal Cloud Service (GCS) uses GraalVM Native Image, a technology that compiles Java code ahead of time to a standalone executable. Additionally, the service leverages GraalOS, a new virtualization technology built on modern hardware features, such as control-flow integrity and in-process memory protection, as well as on compilation techniques that isolate untrusted code execution.
Internship details
The goal of this project is to extend GCS with new capabilities. We offer various topics depending on the skills and the interests of the candidate:
- Detect and mitigate metastable failures for applications running on GCS: Metastable failures are a class of failures in distributed systems in which a “sustaining effect” prevents the system's quick recovery after a temporary “trigger”. These failures can be sustained by so-called “Workload Amplification” or “Capacity Degradation Amplification” effects. Examples of sustaining effects include retries, garbage collection, and look-aside caches. The GCS platform, which acts as a cloud orchestrator managing all or part of an application's distributed components and provides its own virtualization layer, has the opportunity to detect those issues with a better overview of the state of the components. It can also offer the possibility to correct those issues, with or without the aid of the application framework.
- Secure and optimized cross-isolate communication for GraalOS: The BlackBox paper describes a technique to improve container isolation. Traditionally, container isolation is guaranteed by the operating system. However, operating systems are big code bases that occasionally have vulnerabilities. Instead, BlackBox runs a container security monitor (CSM) between the container and the operating system. The CSM creates protected physical address spaces (PPASes) for each container such that there is no direct information flow from a container to the operating system or to other containers' PPASes. The authors make clever use of the hardware's virtualization support to run the CSM at the level of a hypervisor, i.e., at a higher privilege level than the kernel and user space. However, the CSM is not a hypervisor itself: it still delegates memory management and task scheduling to the OS. Containers are prohibited from accessing each other's memory, and communication between the container and the OS is encrypted. In the context of GraalOS, one internship topic would be to investigate the current techniques by which Native Image Isolates are isolated, and whether the techniques from BlackBox can be used to improve this isolation.
- Platform to analyze application usage to optimize GraalVM Native Image creation: GraalVM Native Image compiles Java code ahead of time to a standalone executable. It has the benefit of significantly improving startup time as well as memory footprint. However, the peak performance of a native image is lower than the peak performance of running on a traditional JVM that does just-in-time compilation. GraalVM can apply profile-guided optimizations (PGO) for additional performance gains and higher throughput of native images. With PGO, one collects profiling data in advance and then feeds it to the native image builder, which uses this information to optimize the performance of the resulting binary. The goal of this internship is to extend the Graal Cloud Service to automatically generate and apply profiling data for applications that are running on the platform, effectively simplifying the generation of optimized native images.
- Automatically derive OCI IAM policies: In this internship, we will explore automatically generating OCI IAM policies for a given native image application. First, we will use static analysis to understand the application's usage of the OCI SDK. Second, by analyzing the OpenAPI specifications (and source code) of OCI cloud services, we will build a database of the permissions required for each REST operation. Putting these two pieces of information together will enable generating policies that give the application the least privilege possible.
- Query engine on top of Java heap dumps and compilation traces to aggregate and find third-party issues: JVMs have numerous configuration options, and finding their optimal values can be challenging even for experienced developers. Analyzing runtime application metrics could help recognize incorrectly set parameters and determine optimal values for them. The goal of this internship would be to suggest recommendations for JVM parameters by analyzing runtime metrics and to apply these suggestions in a test environment to confirm the performance improvement.
- Large-scale Java trace collection using record/replay: Debugging is a time-consuming task that consumes a large amount of a developer's time. It is challenging to identify the root cause of an incident, especially when an application is running in the cloud. During this internship, you will develop a tool that collects and stores Java execution traces and state, so that the application state can be replayed on demand.
How to Apply
In order to apply, please send an email to gcn-internships_us_grp@oracle.com including the following:
- Your CV, or a link to a home page containing your CV
- Description of your motivation and area of interest
- Your preferred internship dates & location
Graal Cloud Native
Graal Cloud Native (GCN) is an open-source, developer-centered platform built on the Java ecosystem to dramatically improve developer productivity when building applications and microservices that leverage Oracle Cloud. GCN accomplishes this by automating the writing of applications and the management of configurations, and by allowing developers to rapidly build, test, deploy, and debug their applications from their IDE (Visual Studio Code). GCN contains GraalVM Native Image, the Micronaut framework, the GraalVM Tools for Java and Micronaut VS Code extensions, documentation, hands-on tutorials, and Luna Labs-based hands-on experience.
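To illustrate the style of microservice GCN is aimed at, here is a minimal Micronaut controller sketch in Java; it assumes the standard Micronaut HTTP server dependencies are on the classpath, and the endpoint itself is purely illustrative.

```java
import io.micronaut.http.annotation.Controller;
import io.micronaut.http.annotation.Get;
import io.micronaut.runtime.Micronaut;

// A minimal HTTP endpoint; compiled with GraalVM Native Image, such a service
// starts in milliseconds with a small memory footprint.
@Controller("/hello")
class HelloController {
    @Get("/{name}")
    String greet(String name) {
        return "Hello, " + name + "!";
    }
}

public class Application {
    public static void main(String[] args) {
        Micronaut.run(Application.class, args);
    }
}
```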
Possible Research Areas
- Deep integration with Multiple Clouds
- IDE-based tools for improving developer productivity
- Abstractions over Services available on Multiple Clouds
- Tools for dramatically improving tracing, logging, and debugging in Cloud Computing
Point of Contact
To apply, please send an email with the required information (see How to Apply above) to Labs-Hiring_ww@oracle.com.
ML for Security Applications – KeyBridge
Our mission is to improve the security posture of both Oracle cloud customers and the Oracle teams operating the cloud infrastructure.
Our research vision is the automated detection and mitigation of security events, allowing security experts to scale their efforts to large infrastructures such as the cloud. In the process, we work on data preparation and data exploration, statistical analysis and anomaly detection, alert generation, and the presentation of results.
Our analytics toolbelt includes techniques from data exploration, statistical analysis and deep learning. Working with application logs means that representation learning, machine learning for code and embedding techniques are fundamental research topics for us. So is explainable machine learning. For the scalable processing of the high volumes of logs we get, we are developing a machine learning pipeline framework that operates on top of various state-of-the-art libraries and follows a modular design.
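As a purely hypothetical sketch (not the team's actual framework), the modular-pipeline idea can be pictured as small, composable stages with typed inputs and outputs, roughly like this:

```java
import java.util.List;
import java.util.function.Function;

// A stage transforms one intermediate representation into the next; stages
// compose into a full log-processing pipeline. All names here are illustrative.
interface Stage<I, O> extends Function<I, O> {
    default <R> Stage<I, R> then(Stage<O, R> next) {
        return input -> next.apply(this.apply(input));
    }
}

public class PipelineSketch {
    public static void main(String[] args) {
        Stage<List<String>, List<String>> parse = lines ->
                lines.stream().map(String::trim).toList();
        Stage<List<String>, List<double[]>> embed = tokens ->
                tokens.stream().map(t -> new double[]{t.length()}).toList(); // toy "embedding"
        Stage<List<double[]>, Double> score = vectors ->
                vectors.stream().mapToDouble(v -> v[0]).average().orElse(0.0);

        Stage<List<String>, Double> pipeline = parse.then(embed).then(score);
        System.out.println("Anomaly score: " + pipeline.apply(List.of(" GET /index ", " POST /login ")));
    }
}
```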
Our team is a motley crew of researchers with diverse backgrounds in data analytics, machine learning, network and system design, and software development. We share a passion to develop reliable, innovative solutions to security problems with practical relevance. We place high value on a collaboration spirit, an inquisitive mindset and a drive to deliver high quality.
If you share our interests and values, we would be happy to welcome you to the family. Here are some ideas of directions in which an internship can go:
- Explorative: investigation of what machine learning techniques are applicable to a particular security problem.
- Research in ML: development of machine learning solutions for log encoding, source code generation, information retrieval, anomaly detection, and building behavior profiles.
- Research in HCI: investigation of approaches towards data presentation for operational teams.
- Software development: development of scalable machine learning pipelines.
Point of Contact
To apply, please send an email with the required information (see How to Apply above) to olabs-keybridge-hiring_ww@oracle.com.
Intelligent Application Security
The Intelligent Application Security team at Oracle Labs works on innovative projects in the application security space spanning areas like program analysis, program repair, machine learning, software composition analysis, malware detection, and runtime protection. The team is based in Brisbane, Australia with a few remote members based in Austria. Internships in the IAS team offer exciting opportunities to those who are passionate about improving application security. The ideal candidate will relish the challenge of developing techniques that are precise and can be applied at scale.
Our internships cater to a wide variety of students studying computer science or software engineering, including those who are in the final year of their undergraduate degree or are undertaking research at the master's or PhD level. As a research intern, you will have the opportunity to work alongside a world-class team of researchers and engineers on one of the projects below:
Project RASPunzel aims to deliver an automated and scalable runtime application self-protection (RASP) solution for Java. RASPunzel automatically synthesizes and enforces allowlists for various sensitive operations like Java deserialization, JNDI lookups, SQL operations and crypto usage (see the sketch after the topic list below).
Below is a selection of research topics that you'd potentially be working on:
- Synthesis of RASP security monitors
- Automated program repair based on RASP monitors
- Policy inference and enforcement for cloud native applications
- RASP-based threat intelligence gathering and analysis
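To make the allowlist idea concrete, here is a small Java sketch that uses the JDK's built-in ObjectInputFilter (Java 9+) to permit only expected classes during deserialization; RASPunzel synthesizes and enforces such policies automatically, and the filter pattern and classes below are purely illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class DeserializationAllowlist {
    // A harmless serializable class used only for this demo.
    static class Greeting implements Serializable {
        final String text;
        Greeting(String text) { this.text = text; }
    }

    public static void main(String[] args) throws Exception {
        // Serialize a demo object.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new Greeting("hello"));
        }

        // Allowlist: permit this demo's classes and java.lang.*, reject everything else ("!*").
        ObjectInputFilter allowlist = ObjectInputFilter.Config.createFilter(
                "DeserializationAllowlist*;java.lang.*;!*");

        try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            in.setObjectInputFilter(allowlist);
            Greeting g = (Greeting) in.readObject();
            System.out.println("Deserialized: " + g.text);
        }
    }
}
```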
Macaron is an extensible supply chain security analysis framework from Oracle Labs that supports a wide range of build systems and CI/CD services. It can be used to prevent supply chain attacks or check conformance to security frameworks, such as SLSA.
Below is a selection of research topics that you'd potentially be working on:
- Automated malware analysis
- Hardening build pipelines using Cloud Confidential Computing and keyless signing
- Automated build system analysis including containerized environments
- Policy enforcement in Kubernetes deployments
Learn more about Macaron on GitHub
Project Toffee is aimed at enabling automated program repair by leveraging program analysis techniques as well as the latest advancements in pre-trained and large language models (LLMs). The overall goal is to reduce the manual effort required in bug localization and repair by at least 50%. Automated bug localization is a stepping stone toward broader automated program repair. The objective here is to develop human-in-the-loop solutions that reduce the manual tasks involved in typical bug localization processes as much as possible. On the automated repair side, the objective is to combine program analysis with machine learning to fix bugs automatically, starting with pattern-driven bug fixing and progressing to more complex bugs that require deeper program analysis.
Below is a selection of research topics that you'd potentially be working on:
- Application of large language models (LLMs) for bug localization
- Automated bug reproduction leveraging LLMs
- Automated test prioritization, and LLM driven automated program repair
Intelligent Application Security explorations combine techniques and tools from the above projects to devise applied enhancements to DevSecOps processes, thereby delivering benefits in the form of developer and SecOps efficiencies as well as advancing the state of the art in application security. As an example, this includes “closing the loop” techniques, where security alerts produced using a tool or technique are also used to automatically synthesise targeted repairs for security issues that have been identified in code, build scripts, and CI pipelines.
Point of Contact
To apply, please send an email with the required information (see How to Apply above) to ias-internships-au_au_grp@oracle.com.