CSR++: A Fast, Scalable, Update-Friendly Graph Data Structure

The graph model enables a broad range of analysis, thus graph processing is an invaluable tool in data analytics. At the heart of every graph-processing system lies a concurrent graph data structure storing the graph. Such a data structure needs to be highly efficient for both graph algorithms and queries. Due to the continuous evolution, the sparsity, and the scale-free nature of real-world graphs, graph-processing systems face the challenge of providing an appropriate graph data structure that enables both fast analytic workloads and low-memory graph mutations. Existing graph structures offer a hard trade-off between read-only performance, update friendliness, and memory consumption upon updates. In this paper, we introduce CSR++, a new graph data structure that removes these trade-offs and enables both fast read-only analytics and quick and memory-friendly mutations. CSR++ combines ideas from CSR, the fastest read-only data structure, and adjacency lists to achieve the best of both worlds. We compare CSR++ to CSR, adjacency lists from the Boost Graph Library, and LLAMA, a state-of-the-art update-friendly graph structure. In our evaluation, which is based on popular graph-processing algorithms executed over real-world graphs, we show that CSR++ remains close to CSR in read-only concurrent performance (within 10% on average), while significantly outperforming CSR (by an order of magnitude) and LLAMA (by almost 2x) with frequent updates.



A Latina in Tech

Having started my Computer Science degree while growing up in Colombia and later completing it in Australia, I went from being an overrepresented Latina to being an underrepresented one. Further, the female to male ratio in CS in both countries was also rather different.

Being a mum, a wife, a teacher, a researcher, a manager and a leader, in this talk, I provide some of my lessons learnt throughout my career, with examples of successes and failures throughout my PhD, academic life, and industrial research life.



The University of Queensland and Oracle team up to develop world-class cyber security experts

The field of cyber security is coming of age, with more than a million job openings globally, including many in Australia, and a strong move from reactive to preventative security taking form. At The University of Queensland, teaming up with industry specialists like Oracle Labs – the research and development branch of global technology firm Oracle – will ensure both industry and researchers can focus on the real issues that businesses and users care about.



Coding Practices and Recommendations of Spring Security for Enterprise Applications

Spring security is tremendously popular among practitioners for its ease of use to secure enterprise applications. In this paper, we study the application framework misconfiguration vulnerabilities in the light of Spring security, which is relatively understudied in the existing literature. Towards that goal, we identify 6 types of security anti-patterns and 4 insecure vulnerable defaults by conducting a measurement-based approach on 28 Spring applications. Our analysis shows that security risks associated with the identified security anti-patterns and insecure defaults can leave the enterprise application vulnerable to a wide range of high-risk attacks. To prevent these high-risk attacks, we also provide recommendations for practitioners. Consequently, our study has contributed one update to the official Spring security documentation while other security issues identified in this study are being considered for future major releases by Spring security community.



Women in CS panel

While women were among the first programmers in the 20th century, and contributed substantially to the industry, over the years both the CS industry and CS academia got dominated by men. In this social hour, we explore the opportunities and challenges women encounter in Computer Science through a panel discussion. Our panelists are women who have leading roles in industry, academia, and industrial research. By sharing stories via Q&A, we look forward to inspiring younger women to fulfill their highest potentials, understand how women can make it to senior positions, and enjoy their career.



The NEBULA RPC-Optimized Architecture.

Large-scale online services are commonly structured as a network of software tiers, which communicate over the datacenter network using RPCs. Ongoing trends towards software decomposition have led to the prevalence of tiers receiving and generating RPCs with runtimes of only a few microseconds. With such small software runtimes, even the smallest latency overheads in RPC handling have a significant relative performance impact.In particular, we find that growing network bandwidth introduces queuing effects within a server’s memory hierarchy, considerably hurting the response latency of fine-grained RPCs. In this work we introduce NEBULA, an architecture optimized to accelerate the most challenging microsecond-scale RPCs, by leveraging two novel mechanisms to drastically improve server throughput under strict tail latency goals...



Efficient Multi-word Compare and Swap.

Atomic lock-free multi-word compare-and-swap (MCAS) is a powerful tool for designing concurrent algorithms. Yet, its widespread usage has been limited because lock-free implementations of MCAS make heavy use of expensive compare-and-swap (CAS) instructions. Existing MCAS implementations indeed use at least 2k+1 CASes per k-CAS. This leads to the natural desire to minimize the number of CASes required to implement MCAS. We first prove in this paper that it is impossible to "pack" the information required to perform a k-word CAS (k-CAS) in less than k locations to be CASed. Then we present the first algorithm that requires k+1 CASes per call to k-CAS in the common uncontended case. We implement our algorithm and show that it outperforms a state-of-the-art baseline in a variety of benchmarks in most considered workloads. We also present a durably linearizable (persistent memory friendly) version of our MCAS algorithm using only 2 persistence fences per call, while still only requiring k+1 CASes per k-CAS.



Industrial Experience of Finding Cryptographic Vulnerabilities in Large-scale Codebases

Enterprise environments need to screen large-scale (millions of lines of code) codebases for vulnerability detection, resulting in high requirements for precision and scalability of a static analysis tool. At Oracle, Parfait is one such bug checker, providing precision and scalability of results, including inter-procedural analyses. CryptoGuard is a precise static analyzer for detecting cryptographic vulnerabilities in Java code built on Soot. In this paper, we describe how to integrate CryptoGuard into Parfait, with changing intermediate representation and relying on a demand-driven IFDS framework in Parfait, resulting in a precise and scalable tool for cryptographic vulnerabilities detection. We evaluate our tool on several large real-world applications and a comprehensive Java cryptographic vulnerability benchmark, CryptoAPI-Bench. Initial results show that the new cryptographic vulnerability detection in Parfait can detect real-world cryptographic vulnerabilities in large-scale codebases with few false positives and low runtime.



Scalable, Near-Zero Loss Disaster Recovery for Distributed Data Stores.

This paper presents a new Disaster Recovery (DR) system, called Slogger, that differs from prior works in two principle ways: (i) Slogger enables DR for a linearizable distributed data store, and (ii) Slogger adopts the continuous backup approach that strives to maintain a tiny lag on the backup site relative to the primary site,thereby restricting the data loss window, due to disasters, to milliseconds.



Leveraging Extracted Model Adversaries for Improved Black Box Attacks

We present a method for adversarial input generation against black box models for reading comprehension based question answering. Our approach is composed of two steps. First, we approximate a victim black box model via model extraction. Second, we use our own white box method to generate input perturbations that cause the approximate model to fail. These perturbed inputs are used against the victim. In experiments we find that our method improves on the efficacy of the AddAny---a white box attack---performed on the approximate model by 25% F1, and the AddSent attack---a black box attack---by 11% F1.



Gelato: Feedback-driven and Guided Security Analysis of Client-side Applications

Even though a lot of effort has been invested in analyzing client-side web applications during the past decade, the existing tools often fail to deal with the complexity of modern JavaScript applications. However, from an attacker point of view, the client side of such web applications can reveal invaluable information about the server side. In this paper, first we study the existing tools and enumerate the most crucial features a security-aware client-side analysis should be supporting. Next, we propose GELATO to detect vulnerabilities in modern client-side JavaScript applications that are built upon complex libraries and frameworks. In particular, we take the first step in closing the gap between state-aware crawling and client-side security analysis by proposing a feedback-driven security-aware guided crawler that is able to analyze complex frameworks automatically, and increase the coverage of security-sensitive parts of the program efficiently. Moreover, we propose a new lightweight client-side taint analysis that outperforms the start-of-the-art tools, requires no modification to browsers, and reports non-trivial taint flows on modern JavaScript applications.



Microsecond Consensus for Microsecond Applications.

We consider the problem of making apps fault-tolerant through replication, when apps operate at the microsecond scale, as in finance, embedded computing, and microservices apps. These apps need a replication scheme that also operates at the microsecond scale, otherwise replication becomes a burden. We propose Mu, a system that takes less than 1.3 microseconds to replicate a (small) request in memory, and less than a millisecond to fail-over the system - this cuts the replication and fail-over latencies of the prior systems by at least 61% and 90%.
Mu implements bona fide state machine replication/consensus (SMR) with strong consistency for a generic app, but it really shines on microsecond apps, where even the smallest overhead is significant. To provide this performance, Mu introduces a new SMR protocol that carefully leverages RDMA. Roughly, in Mu a leader replicates a request by simply writing it directly to the log of other replicas using RDMA, without any additional communication. Doing so, however, introduces the challenge of handling concurrent leaders, changing leaders, garbage collecting the logs, and more - challenges that we address in this paper through a judicious combination of RDMA permissions and distributed algorithmic design.
We implemented Mu and used it to replicate several systems: a financial exchange app called Liquibook, Redis, Memcached, and HERD. Our evaluation shows that Mu incurs a small replication latency, in some cases being the only viable replication system that incurs an acceptable overhead.


Hardware and Software, Engineered to Work Together