Evaluating quality of security testing of the JDK.

In this position paper we describe how mutation testing can be used to evaluate the quality of test suites from a security viewpoint. Our focus is on measuring the quality of the test suite associated with the Java Development Kit (JDK) because it provides the core security properties for all applications. We describe the challenges of identifying security-specific mutation operators for the Java model and of ensuring that our solution can be automated for large code bases like the JDK.
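
To make the idea concrete, the following is a minimal sketch of how a security-oriented mutant can expose a gap in a test suite. It is a toy illustration in Python, not the paper's tooling: the function names, the permission model, and the single "remove the permission check" operator are all assumptions made for this example.

```python
# Toy illustration only: every name and the permission model are hypothetical,
# not taken from the JDK or from the paper.

def read_secret(user_permissions, resource):
    """Original code: access requires an explicit permission."""
    if resource not in user_permissions:
        raise PermissionError(resource)
    return "contents of " + resource

def read_secret_mutant(user_permissions, resource):
    """Security mutant: the permission check has been removed."""
    return "contents of " + resource

def weak_suite(impl):
    """Exercises only the permitted path, so a missing check goes unnoticed."""
    return impl({"config"}, "config") == "contents of config"

def strong_suite(impl):
    """Also asserts that access without permission is refused."""
    if not weak_suite(impl):
        return False
    try:
        impl({"config"}, "keystore")
    except PermissionError:
        return True
    return False

def mutation_score(suite, original, mutants):
    """Fraction of mutants killed: suite passes on original, fails on mutant."""
    assert suite(original), "suite must pass on the unmutated code"
    killed = sum(1 for mutant in mutants if not suite(mutant))
    return killed / len(mutants)

mutants = [read_secret_mutant]
weak = mutation_score(weak_suite, read_secret, mutants)
strong = mutation_score(strong_suite, read_secret, mutants)
```

Here the suite that only tests the permitted path leaves the mutant alive, while the suite that also tests the denied path kills it. The mutation score therefore separates the two suites on exactly the security property of interest.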



Behavior Based Approach to Misuse Detection of a Simulated SCADA System

This paper presents the initial findings in applying a behavior-based approach for detection of unauthorized activities in a simulated Supervisory Control and Data Acquisition (SCADA) system. Misuse detection of this type utilizes fault-free system telemetry to develop empirical models that learn normal system behavior. Subsequently monitored telemetry that shows statistically significant deviations from this learned behavior may indicate an attack or other unwanted actions. The experimental test bed consists of a set of Linux-based enterprise servers that were isolated from a larger university research cluster. All servers are connected to a private network and simulate several components and tasks seen in a typical SCADA system. Telemetry sources included kernel statistics, resource usage and internal system hardware measurements. For this study, the Auto Associative Kernel Regression (AAKR) and Auto Associative Multivariate State Estimation Technique (AAMSET) are employed to develop empirical models. The prognostic efficacy of these methods for computer security was evaluated using several groups of signals taken from the available telemetry classes. The Sequential Probability Ratio Test (SPRT) is used along with these models for intrusion detection purposes. The intrusion types shown include host/network discovery, DoS, brute force login, privilege escalation and malicious exfiltration actions. For this study, all intrusion types tested displayed alterations in the residuals of much of the monitored telemetry and were detected in all signal groups used by both model types. The methods presented can be extended to industries besides nuclear that use SCADA or business-critical networks.
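
The residual-based detection pipeline described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: it uses a textbook Gaussian-kernel AAKR and Wald's SPRT for a positive mean shift in Gaussian residuals, and the two-signal data, the bandwidth, the shift, the standard deviation and the error rates are all made-up values. AAMSET is not shown.

```python
import math

def aakr_predict(memory, query, bandwidth):
    """AAKR: reconstruct `query` as a Gaussian-kernel-weighted average
    of fault-free memory vectors."""
    def dist2(m):
        return sum((a - b) ** 2 for a, b in zip(m, query))
    weights = [math.exp(-dist2(m) / (2.0 * bandwidth ** 2)) for m in memory]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: fall back to the nearest memory vector.
        return list(min(memory, key=dist2))
    return [sum(w * m[j] for w, m in zip(weights, memory)) / total
            for j in range(len(query))]

def sprt_first_alarm(residuals, shift, sigma, alpha=0.01, beta=0.01):
    """Wald's SPRT for H0: mean 0 versus H1: mean `shift`, assuming
    Gaussian residuals with known `sigma`. Returns the index of the
    first alarm, or None if H1 is never accepted."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    llr = 0.0
    for i, r in enumerate(residuals):
        llr += (shift / sigma ** 2) * (r - shift / 2.0)
        if llr >= upper:
            return i
        if llr <= lower:
            llr = 0.0  # accept H0 and restart the test
    return None

# Hypothetical fault-free telemetry: the second signal is twice the first.
memory = [(x / 10.0, 2.0 * x / 10.0) for x in range(11)]
normal = [(0.5, 1.0)] * 40
attacked = [(0.5, 1.0)] * 20 + [(0.5, 2.0)] * 20  # deviation starts at index 20

def second_signal_residuals(stream):
    return [obs[1] - aakr_predict(memory, obs, bandwidth=0.3)[1]
            for obs in stream]

alarm_normal = sprt_first_alarm(second_signal_residuals(normal),
                                shift=0.2, sigma=0.1)
alarm_attack = sprt_first_alarm(second_signal_residuals(attacked),
                                shift=0.2, sigma=0.1)
```

In this toy setup the fault-free stream raises no alarm, while the stream whose second signal departs from the learned relationship raises one shortly after the deviation begins. Note that the AAKR reconstruction partly follows the faulty observation, so the residual is smaller than the injected deviation. That is one reason a sequential test on residuals is more practical than a fixed threshold.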



Simulation-based Code Duplication for Enhancing Compiler Optimizations

Compiler optimizations are often limited by control flow, which prohibits optimizations across basic block boundaries. Duplicating instructions from merge blocks to their predecessors enlarges basic blocks and can thus enable further optimizations. However, duplicating too many instructions leads to excessive code growth. Therefore, an approach is necessary that avoids code explosion and still finds beneficial duplication candidates. We present a novel approach to determine which code should be duplicated to improve peak performance. Therefore, we analyze duplication candidates for subsequent optimizations by simulating a duplication and analyzing its impact on the compilation unit. This allows a compiler to find those duplication candidates that have the maximum optimization potential.
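
As a rough illustration of simulating a duplication before committing to it, the sketch below uses a made-up three-address IR and considers constant folding as the only follow-up optimization. The IR, the function names, and the growth budget are assumptions for this example and do not reflect the compiler or cost model used by the authors.

```python
import operator

# Hypothetical toy IR: each instruction is (dest, op, lhs, rhs), where an
# operand is either an int constant or the name of another value.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def simulate_duplication(merge_block, phi_name, phi_input):
    """Pretend the merge block is copied into one predecessor in which the
    phi has the value `phi_input` (an int, or None if unknown there).
    Returns (folded, remaining): instructions that would constant-fold,
    and instructions that would actually have to be copied."""
    known = {}
    if phi_input is not None:
        known[phi_name] = phi_input
    folded = 0
    for dest, op, lhs, rhs in merge_block:
        a = lhs if isinstance(lhs, int) else known.get(lhs)
        b = rhs if isinstance(rhs, int) else known.get(rhs)
        if a is not None and b is not None:
            known[dest] = OPS[op](a, b)
            folded += 1
    return folded, len(merge_block) - folded

def choose_candidates(merge_block, phi_name, phi_inputs, max_growth):
    """Rank predecessors by simulated benefit and accept them while the
    total number of copied instructions stays within `max_growth`."""
    results = []
    for pred, value in phi_inputs.items():
        folded, remaining = simulate_duplication(merge_block, phi_name, value)
        if folded > 0:
            results.append((folded, remaining, pred))
    results.sort(key=lambda r: (-r[0], r[1], r[2]))
    chosen, growth = [], 0
    for folded, remaining, pred in results:
        if growth + remaining <= max_growth:
            chosen.append(pred)
            growth += remaining
    return chosen

# Merge block computing t3 = (x * 2 + 1) + y, where x is a phi.
block = [("t1", "mul", "x", 2), ("t2", "add", "t1", 1), ("t3", "add", "t2", "y")]
# x is the constant 0 on the "then" path and unknown on the "else" path.
inputs = {"then": 0, "else": None}

then_result = simulate_duplication(block, "x", 0)
chosen = choose_candidates(block, "x", inputs, max_growth=4)
```

Duplicating into the "then" predecessor would fold two of the three instructions, whereas duplicating into the "else" predecessor would only add code. The simulation therefore selects only the former.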




Poster about Simulation-based Code Duplication (abstract from associated DocSymp paper)

The scope of compiler optimizations is often limited by control flow, which prohibits optimizations across basic block boundaries. Code duplication can solve this problem by extending basic block sizes, thus enabling subsequent optimizations. However, duplicating code for every optimization opportunity may lead to excessive code growth. Therefore, a holistic approach is required that is capable of finding optimization opportunities and classifying their impact. This paper presents a novel approach to determine which code should be duplicated in order to improve peak performance. The approach analyzes duplication candidates for subsequent optimization opportunities. It does so by simulating a duplication operation and analyzing its impact on other optimizations. This allows a compiler to weigh up multiple success metrics in order to choose the code duplication operations with the maximum optimization potential. We further show how to map code duplication opportunities to an optimization cost model that allows us to maximize performance while minimizing code size increase.



Detecting Malicious JavaScript in PDFs Using Conservative Abstract Interpretation

To mitigate the risk posed by JavaScript-based PDF malware, we propose a static analysis technique based on abstract interpretation. Our evaluation shows that our approach can identify 100% of malware with a low rate of false positives.



Improving Parallelism in Hardware Transactional Memory

Hardware transactional memory (HTM) is supported by recent processors from Intel and IBM. HTM is attractive because it can enhance concurrency while simplifying programming. Today's HTM systems rely on existing coherence protocols, which implement a requester-wins strategy. This, in turn, leads to very poor performance when transactions frequently conflict, causing them to resort to a non-speculative fallback path. Often, such a path severely limits concurrency. In this paper, we propose very simple architectural changes to the existing requester-wins HTM implementations. The idea is to support a special mode of execution in HTM, called power mode, which can be used to enhance conflict resolution between regular and so-called power transactions. A power transaction can run concurrently with regular transactions that do not conflict with it. This permits higher levels of concurrency in cases when a (regular) transaction cannot make progress due to conflicts and would require a non-speculative fallback path otherwise. Our idea is backward-compatible with existing HTM systems, imposing no additional cost on transactions that do not use the power mode. Furthermore, using power transactions requires no changes to target applications that employ traditional lock synchronization. Using extensive evaluation of micro- and STAMP benchmarks in a transactional memory simulator and real hardware-based emulation, we show that our technique significantly improves the performance of the baseline that does not use power mode, and performs comparably with state-of-the-art related proposals that require more substantial architectural changes.
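
The abstract does not spell out the arbitration rule, so the following sketch shows only one plausible reading of it. It assumes that a power transaction survives conflicts with regular transactions, that every other case keeps the baseline requester-wins behavior, and that software escalates to power mode after a few failed regular attempts. All names and retry limits are hypothetical.

```python
REGULAR, POWER = "regular", "power"

def loser(requester_mode, holder_mode, power_mode_enabled=True):
    """Return which side aborts on a conflict: 'holder' or 'requester'.

    Baseline requester-wins: the transaction already holding the line
    aborts. Assumed power-mode rule: a regular requester cannot abort
    a power transaction, so the requester aborts instead."""
    if power_mode_enabled and holder_mode == POWER and requester_mode == REGULAR:
        return "requester"
    return "holder"

def next_step(failed_attempts, max_regular=3, max_power=1):
    """Assumed retry policy: regular HTM first, then power mode, and only
    then the non-speculative fallback lock."""
    if failed_attempts < max_regular:
        return "retry_regular"
    if failed_attempts < max_regular + max_power:
        return "retry_power"
    return "fallback_lock"
```

Under these assumptions, transactions that never enter power mode see the same behavior as on unmodified hardware. This matches the backward-compatibility claim in the abstract.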



Persistent Memcached: Bringing Legacy Code to Byte-Addressable Persistent Memory

We report our experience building and evaluating pmemcached, a version of memcached ported to byte-addressable persistent memory. Persistent memory is expected to not only improve overall performance of applications’ persistence tier, but also vastly reduce the “warm up” time needed for applications after a restart. We decided to test this hypothesis on memcached, a popular key-value store. We took the extreme view of persisting memcached’s entire state, resulting in a virtually instantaneous warm up phase. Since memcached is already optimized for DRAM, we expected our port to be a straightforward engineering effort. However, the effort turned out to be surprisingly complex, and we encountered several non-trivial problems that challenged the boundaries of memcached’s architecture. We detail these experiences and the corresponding lessons learned.



FastR update: Interoperability, Graphics, Debugging, Profiling, and other hot topics

This talk presents an overview of the areas of FastR that saw significant progress in the last year, e.g., Interoperability, Graphics, Debugging, Compatibility, etc.



Zero-overhead R and C/C++ integration with FastR

C and C++ are traditionally used to improve performance for R applications and packages. While this is usually not necessary when using FastR, because it can run R code at near-native performance, there is a large corpus of existing code that implements critical pieces of functionality in native code. Alternative implementations of R need to simulate the R native API, which is a complex API that exposes many implementation details. Simulating the API costs them significant engineering effort and performance overhead, and there is a compilation and optimization barrier between languages. FastR can employ the Truffle framework to run native code, available as LLVM bitcode, inside the optimization scope of the polyglot environment, and thus have it integrated with no optimization and integration barriers.

