
6th ACM SIGPLAN International Workshop on the State Of the Art in Program Analysis (SOAP 2017), June 18, 2017, Barcelona, Spain

SOAP 2017 – Proceedings



Frontmatter

Title Page


Message from the Chairs
The 6th ACM SIGPLAN International Workshop on the State Of the Art in Program Analysis (SOAP 2017) takes place on June 18, 2017, co-located with PLDI 2017, and is supported by Oracle.
Static and dynamic analysis techniques and tools for Java and other programming languages have received widespread attention for a long time. The application domains of these analyses range from core libraries to modern technologies such as web services and Android applications. Over time, analysis frameworks such as Soot, WALA, Chord, Doop, and Soufflé have been developed to better support techniques for optimizing programs, ensuring code quality, and assessing security and compliance. In cooperation with ACM SIGPLAN, the community around the Soot framework for Java analysis brought together its members and other researchers by organizing the SOAP workshop. SOAP has since broadened its scope to other analysis tools and other programming languages.
The SOAP 2017 workshop program contains three invited talks and presentations of eight accepted papers. We are grateful to our invited speakers, who are all leading experts in program analysis: Julian Dolby (IBM Research), Sukyoung Ryu (KAIST), and Jingling Xue (University of New South Wales). All submitted papers were thoroughly reviewed by at least two program committee members.

Points-to Analysis

Revisiting Recency Abstraction for JavaScript: Towards an Intuitive, Compositional, and Efficient Heap Abstraction
Jihyeok Park, Xavier Rival, and Sukyoung Ryu
(KAIST, South Korea; CNRS, France; ENS, France; INRIA, France; PSL Research University, France)
JavaScript is one of the most widely used programming languages. To understand the behaviors of JavaScript programs and to detect possible errors in them, researchers have developed several static analyzers based on the abstract interpretation framework. However, JavaScript provides various language features that are difficult to analyze statically and precisely, such as the dynamic addition and removal of object properties, first-class property names, and higher-order functions. To alleviate this problem, JavaScript static analyzers often use recency abstraction, which refines address abstraction by distinguishing recent objects from summaries of old objects. We observed that while recency abstraction enables more precise analysis results by allowing strong updates on recent objects, it is not monotone in the sense that it does not preserve the precision relationship between the underlying address abstraction techniques: for an address abstraction A and a more precise abstraction B, recency abstraction on B may not be more precise than recency abstraction on A. This unintuitive semantics also makes the composition of recency abstraction with various analysis sensitivity techniques unintuitive. In this paper, we propose a new singleton abstraction technique, which distinguishes singleton objects to allow strong updates on them without changing a given address abstraction. We formally define recency and singleton abstractions, and explain the unintuitive behaviors of recency abstraction. Our preliminary experiments show promising results for singleton abstraction.
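To make the role of strong updates concrete, consider a minimal Java sketch of the general idea (illustrative only; the paper itself targets JavaScript, and this example is not taken from it):

class RecencyDemo {
    static class Obj { String f; }

    public static void main(String[] args) {
        Obj o = null;
        for (int i = 0; i < 3; i++) {
            // A single allocation site produces many concrete objects, so a
            // plain allocation-site abstraction summarizes them under one
            // abstract address and every field write is a weak update: the
            // analysis only learns that o.f is "init" OR "done".
            o = new Obj();
            o.f = "init";
            // Recency abstraction keeps the most recently allocated object
            // separate from the summary of older ones, so this write can
            // strongly update the recent object: o.f is exactly "done".
            o.f = "done";
        }
        System.out.println(o.f); // concretely always "done"
    }
}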

A Datalog Model of Must-Alias Analysis
George Balatsouras, Kostas Ferles, George Kastrinis, and Yannis Smaragdakis
(University of Athens, Greece; University of Texas at Austin, USA)
We give a declarative model of a rich family of must-alias analyses. Our emphasis is on careful and compact modeling, while exposing the key points where the algorithm can adjust its inference power. The model is executable, in the Datalog language, and forms the basis of a full-fledged must-alias analysis of Java bytecode in the Doop framework.
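As a small illustration of the property such an analysis establishes (the example is ours, not from the paper):

class MustAliasDemo {
    static class Point { int x; }

    public static void main(String[] args) {
        Point p = new Point();
        Point q = p;  // from here on, p and q MUST alias on every execution
        q.x = 42;     // a must-alias analysis can therefore strongly update
                      // p's abstract state: p.x is known to be exactly 42
        System.out.println(p.x);
    }
}

In a declarative model, conclusions like these would be derived as a relation over pairs of variables at each program point.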

An Efficient Tunable Selective Points-to Analysis for Large Codebases
Behnaz Hassanshahi, Raghavendra Kagalavadi Ramesh, Padmanabhan Krishnan, Bernhard Scholz, and Yi Lu
(Oracle Labs, Australia; University of Sydney, Australia)
Points-to analysis is a fundamental static program analysis technique for tools including compilers and bug-checkers. Although object-based context sensitivity is known to improve precision of points-to analysis, scaling it for large Java codebases remains a challenge.
In this work, we develop a tunable, client-independent, object-sensitive points-to analysis framework in which heap cloning is applied selectively. This approach is aimed at large codebases where standard analysis is typically expensive. Our design includes a pre-analysis that determines the program points that contribute to the cost of an object-sensitive points-to analysis. A subsequent analysis then determines the context depth for each allocation site. While our framework can run standalone, it can also be tuned: the user of the framework can apply knowledge of the codebase being analysed to influence the selection of expensive program points as well as the choice of the required context depth. Overall, the approach determines where cloning is beneficial and where it is unlikely to pay off.
We have implemented our approach using Soufflé (a Datalog compiler) and an extension of the Doop framework. Our experiments on large programs, including OpenJDK, show that our technique is efficient and precise. For OpenJDK, our analysis reduces runtime by 27% and memory usage by 18% compared with a 2O1H points-to analysis, at a negligible loss of precision; for Jython from the DaCapo benchmark suite, the same analysis reduces runtime by 91% with no loss of precision.
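A hypothetical Java fragment of the kind where extra context depth pays off (our illustration, not taken from the paper):

class SelectiveDemo {
    static class Box {
        Object content;
        void put(Object o) { content = o; }
        Object get() { return content; }
    }

    public static void main(String[] args) {
        Box a = new Box();
        Box b = new Box();
        a.put("hello");              // flows only into a's content
        b.put(Integer.valueOf(1));   // flows only into b's content
        // Context-insensitively, both boxes share one abstract context, so
        // a.get() appears to return a String OR an Integer and the cast
        // looks unsafe. Cloning Box per allocation site (object sensitivity)
        // separates the two flows; a selective analysis spends this extra
        // context depth only on allocation sites where a pre-analysis
        // indicates it is worthwhile.
        String s = (String) a.get();
        System.out.println(s);
    }
}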

Modular Analysis

SootKeeper: Runtime Reusability for Modular Static Analysis
Florian Kübler, Patrick Müller, and Ben Hermann
(TU Darmstadt, Germany)
In order to achieve higher reusability and testability, static analyses are increasingly being built as modular pipelines of analysis components. However, to build, debug, test, and evaluate these components, the complete pipeline has to be executed every time. This recomputes intermediate results that were already computed in a previous run but were lost when the preceding process terminated and its memory was released. We propose to leverage runtime reusability for static analysis pipelines and introduce SootKeeper, a framework to modularize static analyses into OSGi (Open Service Gateway initiative) bundles, which takes care of automatically caching intermediate results. Little to no change to the original analysis is necessary to use SootKeeper, while it significantly speeds up code-build-debug cycles and evaluation pipelines.
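The caching idea can be pictured with a small generic sketch (the names are invented for illustration; this is not the SootKeeper API):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Keeps the results of a pipeline stage alive between runs, keyed by the
// input they were computed from; re-running the pipeline after changing a
// downstream component then reuses everything upstream of the change.
class StageCache<I, O> {
    private final Map<I, O> cache = new ConcurrentHashMap<>();
    private final Function<I, O> stage;

    StageCache(Function<I, O> stage) { this.stage = stage; }

    O run(I input) {
        // Recompute only on a cache miss; hits skip the stage entirely.
        return cache.computeIfAbsent(input, stage);
    }
}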

Porting Doop to Soufflé: A Tale of Inter-Engine Portability for Datalog-Based Analyses
Tony Antoniadis, Konstantinos Triantafyllou, and Yannis Smaragdakis
(University of Athens, Greece)
We detail our experience of porting the Doop static analysis framework to the recently introduced Soufflé Datalog engine. The port addresses the idiosyncrasies of the Datalog dialects involved (w.r.t. the type system, value construction, and fact updates) and differences in the runtime systems (w.r.t. parallelism, transactional execution, and optimization methodologies). The overall porting effort is interesting in many ways: as an instance of the benefits of specifying static analyses declaratively, where a mere port to a new runtime system yields benefits such as parallelism; and as a study of the effort required to migrate a substantial Datalog codebase (of several thousand rules) to a different dialect. By exploiting shared-memory parallelism, the ported version of the framework achieves speedups of up to 4x over already high single-threaded performance.

Soundness and Precision

Systematic Approaches for Increasing Soundness and Precision of Static Analyzers
Esben Sparre Andreasen, Anders Møller, and Benjamin Barslev Nielsen
(Aarhus University, Denmark)
Building static analyzers for modern programming languages is difficult. Often soundness is a requirement, perhaps with some well-defined exceptions, and precision must be adequate for producing useful results on realistic input programs. Formally proving such properties of a complex static analysis implementation is rarely an option in practice, which raises the challenge of how to identify the causes and importance of soundness and precision problems.
Through a series of examples, we present our experience with semi-automated methods based on delta debugging and dynamic analysis for increasing soundness and precision of a static analyzer for JavaScript. The individual methods are well known, but to our knowledge rarely used systematically and in combination.
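The flavor of the delta-debugging part can be sketched as follows (a deliberately simplified reduction loop, not the full ddmin algorithm and not the authors' implementation):

import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class Reduce {
    // Repeatedly drop parts of a failing input program, keeping the smaller
    // version whenever the analyzer still exhibits the soundness or
    // precision problem, until no single part can be removed.
    static <T> List<T> minimize(List<T> input, Predicate<List<T>> stillFails) {
        List<T> current = new ArrayList<>(input);
        boolean shrunk = true;
        while (shrunk) {
            shrunk = false;
            for (int i = 0; i < current.size(); i++) {
                List<T> candidate = new ArrayList<>(current);
                candidate.remove(i);
                if (stillFails.test(candidate)) { // problem still reproduces
                    current = candidate;
                    shrunk = true;
                    break;
                }
            }
        }
        return current; // a locally minimal failure-inducing input
    }
}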

On the Construction of Soundness Oracles
Jens Dietrich, Li Sui, Shawn Rasheed, and Amjed Tahir
(Massey University, New Zealand)
One of the inherent advantages of static analysis is that it can create and reason about models of an entire program. However, mainstream languages such as Java use numerous dynamic language features designed to boost programmer productivity, and these features are notoriously difficult to capture by static analysis, leading to unsoundness in practice. While existing research has focused on providing sound handling for selected language features (mostly reflection) based on anecdotal evidence and case studies, there is little empirical work investigating the extent to which particular features cause unsoundness of static analysis in practice. In this paper, we (1) discuss language features that may cause unsoundness and (2) present a methodology that can be used to check the (un)soundness of a particular static analysis, call-graph construction, based on soundness oracles. These oracles can also be used for hybrid analyses.
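For call-graph construction, a soundness oracle can be pictured like this (a minimal sketch under our own naming, not the authors' implementation):

import java.util.HashSet;
import java.util.Set;

class CallGraphOracle {
    record Edge(String caller, String callee) {}

    // Call edges observed during actual runs serve as ground truth: any
    // observed edge absent from the static call graph is a concrete witness
    // that the static analysis is unsound for this program.
    static Set<Edge> missedEdges(Set<Edge> observedAtRuntime,
                                 Set<Edge> staticCallGraph) {
        Set<Edge> missed = new HashSet<>(observedAtRuntime);
        missed.removeAll(staticCallGraph);
        return missed; // empty: no unsoundness witnessed by these runs
    }
}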

Benchmarking

Hermes: Assessment and Creation of Effective Test Corpora
Michael Reif, Michael Eichberg, Ben Hermann, and Mira Mezini
(TU Darmstadt, Germany)
An integral part of developing a new analysis is validating the correctness of its implementation and demonstrating its usefulness when applied to real-world code. As a foundation for addressing both challenges, developers typically use custom or well-established collections of Java projects. The hope is that the collected projects are representative of the analysis' target domain and therefore ensure a sound evaluation. But without proper means to understand how, and to which degree, the features relevant to an analysis occur in the projects, the evaluation necessarily remains inconclusive. Additionally, the collection is likely to contain many projects that are basically identical w.r.t. the developed analysis and therefore do not help the overall evaluation or testing of the analysis, but still cost evaluation time. To overcome these limitations we propose Hermes, a framework that enables the systematic assessment of given corpora and the creation of new corpora of Java projects. To show the usefulness of Hermes, we used it to comprehend the nature of the projects belonging to the Qualitas Corpus (QC) and then to compute a minimal subset of all QC projects useful for generic data- and control-flow analyses. This subset enables effective and efficient integration test suites.
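One way to picture the subset computation is as a set-cover problem over analysis-relevant features (a greedy sketch of the general idea, with invented names; this is not the Hermes implementation):

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CorpusSelection {
    // Greedily pick the project covering the most still-uncovered features
    // until every feature observed anywhere in the corpus is exercised by
    // some chosen project.
    static List<String> greedyCover(Map<String, Set<String>> featuresPerProject) {
        Set<String> uncovered = new HashSet<>();
        featuresPerProject.values().forEach(uncovered::addAll);
        List<String> chosen = new ArrayList<>();
        while (!uncovered.isEmpty()) {
            String best = null;
            int bestGain = 0;
            for (Map.Entry<String, Set<String>> e : featuresPerProject.entrySet()) {
                Set<String> gain = new HashSet<>(e.getValue());
                gain.retainAll(uncovered);
                if (gain.size() > bestGain) {
                    bestGain = gain.size();
                    best = e.getKey();
                }
            }
            uncovered.removeAll(featuresPerProject.get(best));
            chosen.add(best);
        }
        return chosen; // a small project set exercising every observed feature
    }
}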
