SPLASH Companion 2019
2019 ACM SIGPLAN International Conference on Systems, Programming, Languages, and Applications: Software for Humanity (SPLASH Companion 2019)

2019 ACM SIGPLAN International Conference on Systems, Programming, Languages, and Applications: Software for Humanity (SPLASH Companion 2019), October 20–25, 2019, Athens, Greece

SPLASH Companion 2019 – Preliminary Table of Contents



Title Page

Message from the Chairs




Component-Based Computation-Energy Modeling for Embedded Systems
Adam Seewald, Ulrik Pagh Schultz, Julius Roeder, Benjamin Rouxel, and Clemens Grelck
(University of Southern Denmark, Denmark; University of Amsterdam, Netherlands)
Computational energy efficiency is a critical aspect of many modern embedded devices, as it determines the level of autonomy in numerous scenarios. We present a component-based energy-modeling approach that abstracts per-component energy in a dataflow computational network executed according to a given scheduling policy. The approach is based on a modeling tool and ultimately relies on battery state to support a wider range of energy-optimization strategies for power-critical devices.
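The per-component abstraction described above can be illustrated with a minimal sketch; the component names, power figures, and schedule below are hypothetical, not taken from the paper:

```python
# Minimal sketch of per-component energy accounting: each component in a
# dataflow network draws a (hypothetical) constant power while active, and
# the schedule assigns it active time slices. Energy = power x active time.

def component_energy(power_w, active_intervals):
    """Energy in joules for one component, given (start, end) slices in seconds."""
    return sum((end - start) * power_w for start, end in active_intervals)

def total_energy(components, schedule):
    """components: name -> power draw in watts; schedule: name -> active intervals."""
    return {name: component_energy(power, schedule.get(name, []))
            for name, power in components.items()}

# Hypothetical two-component network under a simple static schedule.
components = {"sensor": 0.5, "dsp": 2.0}           # watts
schedule = {"sensor": [(0, 10)], "dsp": [(2, 6)]}  # seconds
energies = total_energy(components, schedule)
# sensor: 0.5 W * 10 s = 5.0 J; dsp: 2.0 W * 4 s = 8.0 J
```

A battery-aware optimizer, as hinted at in the abstract, would compare such per-component totals against the remaining battery budget when choosing a schedule.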

Toward a Benchmark Repository for Software Maintenance Tool Evaluations with Humans
Matúš Sulír
(Technical University of Košice, Slovakia)

NAB: Automated Large-Scale Multi-language Dynamic Program Analysis in Public Code Repositories
Alex Villazón, Haiyang Sun, Andrea Rosà, Eduardo Rosales, Daniele Bonetta, Isabella Defilippis, Sergio Oporto, and Walter Binder
(Universidad Privada Boliviana, Bolivia; USI Lugano, Switzerland; Oracle Labs, USA)
Analyzing today's large code repositories has become an important research area for understanding and improving different aspects of modern software systems. Despite the presence of a large body of work on mining code repositories through static analysis, studies applying dynamic analysis to open-source projects are scarce and of limited scale. Nonetheless, being able to apply dynamic analysis to the projects hosted in public code repositories is fundamental for large-scale studies on the runtime behavior of applications, which can greatly benefit the programming-language and software-engineering communities.
To enable large-scale studies in the wild requiring dynamic analysis, we propose NAB, a novel, distributed, container-based infrastructure for massive dynamic analysis on code repositories hosting open-source projects, which may be implemented in different programming languages. NAB automatically looks for available executable code in a repository, instruments it according to a user-defined dynamic analysis, and runs the instrumented code. Such executable code could correspond to existing benchmarks (e.g., workloads defined by the developers via the Java Microbenchmark Harness (JMH)) or software tests (e.g., defined in the default test entry of a Node.js project managed by Node Package Manager (NPM), or based on popular testing frameworks such as JUnit).
NAB relies on containerization for efficient sandboxing, for the parallelization of dynamic analysis execution, and for simplifying the deployment on clusters or in the cloud. Sandboxing is important to isolate the underlying execution environment and operating system, since NAB executes unverified projects that may contain buggy or even harmful code. Parallelizing dynamic analysis execution is also an important feature for massive analysis, as sequential analysis of a massive code repository would take prohibitive time.
At its core, NAB features a microservice architecture based on a master-worker pattern, relying on a publish-subscribe communication layer using the MQTT protocol that allows asynchronous events to be exchanged between its internal components. NAB uses existing containerized services and introduces four new components, three of them running in containers: NAB-Crawler, NAB-Analyzer, and NAB-Master; as well as one external service, NAB-Dashboard. The NAB-Crawler instances are responsible for mining and crawling code repositories, collecting metadata that allows deciding which projects to analyze. The NAB-Analyzer instances are responsible for downloading the code, applying filtering based on user-defined criteria, and finally running the dynamic analysis. The results generated by the dynamic analysis (such as profiles containing various dynamic metrics) are stored in a NoSQL MongoDB database. NAB provides a plugin mechanism to integrate different dynamic analyses in NAB-Analyzer instances.
NAB-Master orchestrates the distribution of crawling and dynamic analysis activities with NAB-Crawler and NAB-Analyzer instances. NAB-Dashboard is responsible for the deployment of NAB components through the Docker Swarm orchestration service and monitors the progress of an ongoing dynamic analysis. Finally, NAB supports different build systems, testing frameworks, and runtimes, thus enabling multi-language support. Moreover, it can easily integrate existing dynamic analyses.
We used NAB to conduct three large-scale case studies applying dynamic analysis on more than 56K open-source projects hosted on GitHub, leveraging unit tests that can be automatically executed and analyzed. We performed a novel analysis that sheds light on the usage of the Promise API in open-source Node.js projects. We found many projects with long promise chains, which can potentially be considered for benchmarking promises on Node.js. Moreover, the results of our analysis could be useful for Node.js developers to find projects and popular modules that use promises for asynchronous executions, whose optimization could be beneficial to several existing applications. We also conducted a large-scale study on the presence of JIT-unfriendly code in Node.js projects. Our study revealed that Node.js developers frequently use code patterns that could prevent or jeopardize dynamic optimizations and have a potential negative impact on application performance. Finally, we performed a large-scale analysis on Java and Scala projects, searching for task-parallel workloads suitable for inclusion in a benchmark suite. We identified five candidate workloads (two in Java and three in Scala) that may be used for benchmarking task parallelism on the JVM. Overall, our case studies confirm that NAB can be used for applying dynamic analysis massively on public code repositories, and that the large-scale analyses enabled by NAB provide insights that are of practical interest.
A preliminary version of NAB can be downloaded at http://dag.inf.usi.ch/software/nab/. More information on NAB and on the described use cases can be found in our previous ECOOP'19 publication. We are actively working on an open-source release of NAB.
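The master-worker split over a publish-subscribe layer can be sketched with a small in-process stand-in. NAB itself uses an MQTT broker and Docker containers; here a plain in-memory queue replaces the broker, and all topic, repository, and field names are illustrative:

```python
import queue

# Stand-in for NAB's publish-subscribe layer: a "master" publishes crawl and
# analysis tasks to topics; "workers" subscribe and report results back.
# A real deployment would use an MQTT broker instead of in-process queues.

class Broker:
    def __init__(self):
        self.topics = {}

    def publish(self, topic, msg):
        self.topics.setdefault(topic, queue.Queue()).put(msg)

    def consume(self, topic):
        q = self.topics.setdefault(topic, queue.Queue())
        return None if q.empty() else q.get()

broker = Broker()

# Master: enqueue repositories to crawl (illustrative names).
for repo in ["org/project-a", "org/project-b"]:
    broker.publish("crawl", repo)

# Crawler worker: collect metadata, hand runnable projects to analyzers.
while (repo := broker.consume("crawl")) is not None:
    broker.publish("analyze", {"repo": repo, "entry": "test"})

# Analyzer worker: "run" the instrumented tests and store a result record
# (NAB would persist such records to MongoDB instead).
results = []
while (task := broker.consume("analyze")) is not None:
    results.append({"repo": task["repo"], "metrics": {"events": 0}})
```

The asynchronous, topic-based decoupling is what lets NAB scale crawler and analyzer instances independently.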

Renaissance: A Modern Benchmark Suite for Parallel Applications on the JVM
Aleksandar Prokopec, Andrea Rosà, David Leopoldseder, Gilles Duboscq, Petr Tůma, Martin Studener, Lubomír Bulej, Yudi Zheng, Alex Villazón, Doug Simon, Thomas Würthinger, and Walter Binder
(Oracle Labs, n.n.; USI Lugano, Switzerland; JKU Linz, Austria; Charles University in Prague, Czechia; Universidad Privada Boliviana, Bolivia; Oracle Labs, Switzerland)
To demonstrate that a compiler optimization, a memory management algorithm, or a synchronization technique is useful, researchers need benchmarks that demonstrate the desired behavior and, at the same time, capture representative aspects of real-world applications. During the last decade, multiple new programming paradigms have appeared on the Java Virtual Machine (JVM), including functional programming, big-data processing, parallel and concurrent programming, message passing, stream processing, and machine learning. The JVM has evolved as a platform, too. New features---such as method handles, variable handles, the invokedynamic instruction, lambdas, atomic and relaxed memory operations---present new challenges for just-in-time (JIT) compilers and runtime environments.
Workloads exercising the above features potentially present new optimization opportunities for compilers and virtual machines. Unfortunately, existing benchmark suites (such as DaCapo, ScalaBench, or SPECjvm2008) do not capture these new features, because they were created at a time when such workloads did not exist. Moreover, such suites do not specifically focus on concurrency and parallelism.
To bridge this gap, we propose Renaissance, a new representative set of benchmarks that covers modern JVM concurrency and parallelism paradigms. Renaissance consists of 25 benchmarks representative of common modern patterns (including, but not limited to, big data, machine learning, and functional programming) relying on multiple existing state-of-the-art Java and Scala frameworks. The suite can be useful to optimize just-in-time (JIT) compilers, interpreters, garbage collectors, as well as tools such as profilers, debuggers, or static analyzers.
To obtain these benchmarks, we gathered more than 100 candidate workloads, both manually and by scanning an online corpus of GitHub projects. We then defined and collected a set of metrics able to capture the use of concurrency-related features as well as the use of code patterns commonly associated with abstractions of object-oriented programs. We used these metrics to detect potentially interesting workloads, and to ensure that the selection is sufficiently diverse. Using principal component analysis (PCA), we showed that the set of benchmarks selected for inclusion covers the metric space differently than the existing benchmark suites.
To confirm that the selected benchmarks are useful, we analyzed them for performance-critical patterns. We demonstrated that the proposed benchmarks reveal new opportunities for JIT compilers by implementing four new optimizations in the Graal JIT compiler. These optimizations have a considerably smaller impact on the existing suites, indicating that Renaissance helped in identifying new compiler optimizations. We also identified three existing optimizations whose performance impact is prominent. Furthermore, by comparing two production-quality JIT compilers, Graal and HotSpot C2, we determined that performance varies much more on Renaissance than on other benchmark suites.
We also compared the complexity of the Renaissance workloads with those of other suites, by evaluating six Chidamber and Kemerer metrics, and by inspecting the compiled code size and the hot method count of all the benchmarks. Our results show that the proposed benchmark suite is as complex as DaCapo and ScalaBench, and much more complex than SPECjvm2008.
Renaissance is intended to be an open-source, collaborative project, in which the community can propose and improve benchmark workloads. Renaissance is publicly available at https://renaissance.dev. More information on the suite and on the described analyses can be found in our previous PLDI’19 publication.

Distributed Object-Oriented Programming with Multiple Consistency Levels in ConSysT
Nafise Eskandani Masoule, Mirko Köhler, Guido Salvaneschi, and Alessandro Margara
(TU Darmstadt, Germany; Politecnico di Milano, Italy)

Towards a WebAssembly Standalone Runtime on GraalVM
Salim S. Salim, Andy Nisbet, and Mikel Luján
(University of Manchester, UK)

MetaDL: Declarative Program Analysis for the Masses
Alexandru Dura and Hampus Balldin
(Lund University, Sweden)

Language-Parametric Semantic Editor Services Based on Declarative Type System Specifications
Daniel A. A. Pelsmaeker, Hendrik van Antwerpen, and Eelco Visser
(Delft University of Technology, Netherlands)

A Domain-Specific Compiler for N-Body Problems
Shigeyuki Sato
(University of Tokyo, Japan)


Doctoral Symposium

Performance, Portability, and Productivity for Data-Parallel Applications on Multi- and Many-Core Architectures
Ari Rasch
(University of Münster, Germany)
We present a novel approach to performance, portability, and productivity of data-parallel computations on multi- and many-core architectures. Our approach is based on Multi-Dimensional Homomorphisms (MDHs), a formally defined class of functions that cover important data-parallel computations, e.g., linear algebra routines (BLAS) and stencil computations. For MDHs, we present a high-level Domain-Specific Language (DSL) that contributes to high user productivity, and we propose a corresponding DSL compiler which automatically generates optimized (auto-tuned) OpenCL code, thereby providing high, portable performance, over different architectures and input sizes, for programs in our DSL. Our experimental results, on Intel CPU and NVIDIA GPU, demonstrate competitive and often significantly better performance of our approach as compared to state-of-practice approaches, e.g., Intel MKL/MKL-DNN and NVIDIA cuBLAS/cuDNN.
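The homomorphism property underlying MDHs, namely that a function on a concatenation can be computed from its parts and then combined, is what licenses parallel code generation. A minimal one-dimensional sketch (function and chunk names are illustrative, not the DSL's):

```python
# Sketch of the (one-dimensional) homomorphism property exploited by MDHs:
# h(xs ++ ys) == combine(h(xs), h(ys)), which licenses evaluating h over
# independent chunks in parallel. Names below are illustrative.

def homomorphic_eval(h, combine, chunks):
    """Evaluate h over each chunk independently (in parallel in principle),
    then reduce the partial results with the combine operator."""
    partials = [h(c) for c in chunks]
    acc = partials[0]
    for p in partials[1:]:
        acc = combine(acc, p)
    return acc

# Example: dot product is homomorphic with '+' as the combine operator.
def dot(chunk):
    return sum(x * y for x, y in chunk)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [5.0, 6.0, 7.0, 8.0]
pairs = list(zip(xs, ys))
chunked = [pairs[:2], pairs[2:]]  # two independent "work groups"
result = homomorphic_eval(dot, lambda a, b: a + b, chunked)
# 1*5 + 2*6 + 3*7 + 4*8 = 70.0
```

An MDH generalizes this to multiple dimensions, with a combine operator per dimension, which is what the auto-tuned OpenCL code generation exploits.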

Exploiting Models for Scalable and High Throughput Distributed Software
Tim Soethout
(ING Bank, Netherlands)
Enterprise software systems are large, complex, and hard to maintain. Many applications communicate, operate independently, and need to change frequently. Domain-Specific Languages (DSLs) are an approach to controlling this complexity by capturing domain knowledge in a non-ambiguous, single-source, and traceable way. DSLs enable the automatic generation of optimized code, exploiting domain knowledge that is not available to a general-purpose programming language compiler.
The DSL Rebel describes state machines for enterprise products, which communicate using atomic synchronized actions. From these specifications, a horizontally scalable distributed application is generated, built on the Akka actor toolkit. Generating code for Rebel's distributed synchronization in a generic scalable fashion is hard, because high-contention specifications result in bottlenecks in throughput and latency. Atomic synchronized actions, formalized as Atomic Commit, guarantee that actions on multiple objects form a single atomic step, where all or none should happen. A well-known generic blocking atomic commitment protocol is Two-Phase Commit (2PC).
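The 2PC protocol mentioned above can be sketched in a few lines; the participant names and voting scenario are illustrative, not Rebel's generated code:

```python
# Minimal sketch of Two-Phase Commit (2PC), the blocking atomic-commitment
# protocol referenced in the abstract. Participant names are illustrative.

def two_phase_commit(participants):
    """participants: mapping name -> callable returning True (vote commit)
    or False (vote abort). Returns the global decision and the votes."""
    # Phase 1 (voting): ask every participant to prepare.
    votes = {name: prepare() for name, prepare in participants.items()}
    # The coordinator commits only if every participant voted to commit.
    decision = "commit" if all(votes.values()) else "abort"
    # Phase 2 (completion): the decision is broadcast to every participant;
    # here we simply return it.
    return decision, votes

# Two accounts agreeing on an atomic transfer (hypothetical scenario).
ok, _ = two_phase_commit({"acct-a": lambda: True, "acct-b": lambda: True})
fail, _ = two_phase_commit({"acct-a": lambda: True, "acct-b": lambda: False})
```

Under high contention, many concurrent 2PC rounds block one another on the same objects, which is exactly the throughput bottleneck the dissertation targets with application-specific knowledge.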
Improvements in the scalability and throughput of Atomic Commit implementations, as well as other optimizations related to consistency, are widely applicable to databases, programming languages, and distributed systems in general.
This dissertation research is done in close collaboration with ING Bank and focuses on the opportunity of leveraging application-specific knowledge, captured by model-driven engineering approaches, to increase application performance in high-contention scenarios while maintaining functional application-level consistency.

Debugging Support for Multi-paradigm Concurrent Programs
Dominik Aumayr
(JKU Linz, Austria)
With the widespread adoption of concurrent programming, debugging of non-deterministic failures becomes increasingly important. Record & replay debugging aids developers in this effort by reliably reproducing recorded bugs. Because each concurrency model (e.g., threads vs. actors) is particularly suited to different tasks, developers have started combining them within the same application. Record & replay solutions, however, are typically designed for one concurrency model only. In this paper we propose a novel multi-paradigm record & replay approach based on abstracting concurrency models to a common set of concepts and events.
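The core record & replay idea, logging non-deterministic decisions so a later run can follow the same trace, can be sketched as follows (the message-processing scenario is illustrative, not the paper's implementation):

```python
import random

# Sketch of record & replay for non-deterministic scheduling decisions:
# during recording, each choice (e.g., which pending message an actor
# processes next) is logged; during replay, the log dictates the choice.

def run(messages, trace=None, log=None):
    """Process messages in a (possibly random) order. If `trace` is given,
    replay exactly that order; if `log` is given, record the order taken."""
    pending = list(messages)
    processed = []
    step = 0
    while pending:
        if trace is not None:
            idx = trace[step]                     # replay: follow the log
        else:
            idx = random.randrange(len(pending))  # record: non-deterministic
        if log is not None:
            log.append(idx)
        processed.append(pending.pop(idx))
        step += 1
    return processed

log = []
first = run(["m1", "m2", "m3"], log=log)       # recording run
replayed = run(["m1", "m2", "m3"], trace=log)  # deterministic re-execution
# `replayed` equals `first`, whatever order was recorded.
```

Abstracting threads and actors to a common set of such recordable events is what would make a single trace format serve multiple concurrency models.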

Retaining Semantic Information in the Static Analysis of Real-World Software
Gábor Horváth
(Eötvös Loránd University, Hungary)
Static analysis is the analysis of a program through inspection of the source code, usually carried out by an automated tool. One of the greatest challenges posed by real-world applications is that the whole program is rarely available at any point of the analysis process. One reason for this is separate compilation, where the source code of some libraries might not be accessible in order to protect intellectual property. But even if we have a complete view of the source code including the underlying operating system, we might still have trouble fitting the representation of the entire software into memory. Thus, industrial tools need to deal with uncertainty due to the lack of information.
In my dissertation I discuss state-of-the-art methods to deal with this uncertainty and attempt to improve upon each method to retain information that would otherwise be unavailable. I also propose guidelines on which methods to choose to solve certain problems.

Improving Performance and Quality of Database-Backed Software
Junwen Yang
(University of Chicago, USA)
Database-backed software is now widely used for online shopping, social networking, and many other services in which an ever-growing amount of user data is managed and processed. Its performance and scalability are crucial to people's daily lives: a one-second delay in a web application can cause 11% fewer page views, a 16% decrease in customer satisfaction, and a 7% loss in conversions.
Unfortunately, high performance and scalability are challenging to achieve for these applications, often requiring cross-stack/server optimization that is difficult to do manually. Many of these applications consist of two parts: (1) a front-end application (e.g., a web-server application) that is developed in traditional object-oriented languages and handles the user interface and application computation logic, and (2) a database management system (DBMS) that runs on a separate server and maintains persistent data. Connecting these two parts, an Object-Relational Mapping (ORM) framework is often used, which provides APIs for operating on persistent database data as heap objects and dynamically translates method calls into database queries. Developers often use ORM APIs inefficiently, without knowing what database queries will be issued at run time; the DBMS, in turn, may not produce the most efficient execution plan, not knowing what queries will be issued later.
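A classic instance of the ORM inefficiency described above is the "N+1 query" pattern. The tiny in-memory "ORM" below is hypothetical (it only counts how many queries each access pattern would issue) and is not any real framework's API:

```python
# Sketch of the N+1 query anti-pattern that arises when ORM APIs hide the
# generated SQL. The FakeOrm class is hypothetical and merely counts queries.

class FakeOrm:
    def __init__(self, users, orders):
        self.users, self.orders = users, orders
        self.queries = 0

    def all_users(self):
        self.queries += 1            # one query: SELECT * FROM users
        return list(self.users)

    def orders_of(self, user):
        self.queries += 1            # one query per call: ... WHERE user_id = ?
        return [o for o in self.orders if o["user"] == user]

    def orders_of_all(self, users):
        self.queries += 1            # single query: ... WHERE user_id IN (...)
        return [o for o in self.orders if o["user"] in users]

users = ["u1", "u2", "u3"]
orders = [{"user": "u1", "item": "book"}, {"user": "u3", "item": "pen"}]

slow = FakeOrm(users, orders)
for u in slow.all_users():
    slow.orders_of(u)                    # 1 + N queries in total

fast = FakeOrm(users, orders)
fast.orders_of_all(fast.all_users())     # 2 queries regardless of N
```

The looped version issues one query per user, so its cost grows linearly with the data, while the batched version stays constant; detecting and rewriting such patterns is the kind of cross-stack optimization the abstract refers to.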

Practical Second Futamura Projection
Florian Latifi
(JKU Linz, Austria)
Partial evaluation, based on the first Futamura projection, allows compiling language interpreters with given user programs to efficient target programs. GraalVM is an example system that implements this mechanism: it combines partial evaluation with profiling information and dynamic compilation to transform interpreters into high-performance machine code at run time. However, partial evaluation is compile-time intensive, as it requires the abstract interpretation of interpreter implementations. Thus, optimizing partial evaluation remains an active research topic. We present an approach to speeding up partial evaluation by generating, ahead of time, source code that performs partial evaluation specific to interpreter implementations. When executed for a given user program at run time, the generated code directly emits partially evaluated interpreter instructions for the language constructs it knows and sees in the program. This yields the target program faster than performing the first Futamura projection. The generated source code behaves similarly to a specialized partial evaluator deduced by performing the second Futamura projection, although no self-applying partial evaluator is involved during code generation.
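The first Futamura projection, specializing an interpreter to a fixed user program, can be sketched with a toy expression language. The mini-language, its tuple-based AST encoding, and the closure-generating "partial evaluator" below are illustrative, not GraalVM's mechanism:

```python
# Sketch of the first Futamura projection: partially evaluating a tiny
# expression interpreter with respect to a fixed "user program" yields a
# residual closure that no longer re-dispatches on the program's AST.

def interpret(prog, env):
    """Plain interpreter: prog is a nested tuple AST, env maps variable names."""
    op = prog[0]
    if op == "lit":
        return prog[1]
    if op == "var":
        return env[prog[1]]
    if op == "add":
        return interpret(prog[1], env) + interpret(prog[2], env)

def specialize(prog):
    """Toy partial evaluator: walk the static AST once and emit a closure
    in which the dispatch on node tags has been compiled away."""
    op = prog[0]
    if op == "lit":
        v = prog[1]
        return lambda env: v
    if op == "var":
        name = prog[1]
        return lambda env: env[name]
    if op == "add":
        l, r = specialize(prog[1]), specialize(prog[2])
        return lambda env: l(env) + r(env)

prog = ("add", ("var", "x"), ("lit", 1))   # the fixed user program: x + 1
compiled = specialize(prog)                 # residual "target program"
# interpret(prog, {"x": 41}) and compiled({"x": 41}) both yield 42.
```

The abstract's approach goes one step further: it generates, ahead of time, code playing the role of `specialize` tailored to a concrete interpreter, approximating the second projection without a self-applying partial evaluator.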


Student Research Competition

An Empirical Study of Programming Language Effect on OSS Development Effort
Muna Altherwi
(Southampton University, UK)
Dozens of programming languages are in use today, and new languages and language features are introduced continuously. Research on the impact of programming languages on software development has revealed a divide over whether the choice of language has a significant effect on the development process. Some studies have concluded that languages do not have a considerable impact on development effort or practical programming, and that there is no hard evidence to support claims to the contrary. On the other hand, a number of empirical studies of programming languages and software development have shown that different languages have different impacts on software development, and that the choice of language has a noticeable effect on the development process. This study is thus another step toward examining, from an empirical perspective, the effect of languages on open-source software development.

Designing Immersive Virtual Training Environments for Experiential Learning
Kalliopi Evangelia Stavroulia
(Cyprus University of Technology, Cyprus)
Virtual reality (VR) is among the key and most promising emerging technologies in the field of education. This paper presents an innovative VR-based approach to teacher education. The development of the VR application followed a full design cycle, with active involvement of education experts throughout the development process. The evaluation results indicate a positive impact of the VR intervention on the cultivation of empathy skills. Moreover, the results are statistically significant with respect to other parameters under investigation, including the sense of presence, embodiment, and users' emotional experiences.

Gradual Program Analysis
Samuel Estep
(Liberty University, n.n.)
The designers of static analyses for null safety often try to reduce the number of false positives reported by the analysis through increased engineering effort, user-provided annotations, and/or weaker soundness guarantees. To produce a null-pointer analysis with little engineering effort, reduced false positives, and strong soundness guarantees in a principled way, we adapt the "Abstracting Gradual Typing" framework to the abstract-interpretation-based program analysis setting. In particular, a simple static dataflow analysis that relies on user-provided annotations and has nullability lattice N ⊑ ⊤ (where N means "definitely not null" and ⊤ means "possibly null") is gradualized, producing a new lattice N ⊑ ? ⊑ ⊤. The question mark explicitly represents "optimistic uncertainty" in the analysis itself, supporting a formal soundness property and the "gradual guarantees" laid out in the gradual typing literature. We then implement a prototype of our gradual null-pointer analysis as a Facebook Infer checker, and compare it to existing null-pointer analyses via a suite of GitHub repositories originally used by Uber to evaluate their NullAway tool. Our prototype has architecture and output very similar to these existing tools, suggesting the value of applying our approach to more sophisticated program analyses in the future.
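The gradualized lattice N ⊑ ? ⊑ ⊤ can be made concrete with a few lines; the encoding below (with "T" standing in for ⊤) is a sketch based only on the ordering stated in the abstract:

```python
# Sketch of the gradual nullability lattice N ⊑ ? ⊑ ⊤ described above:
# N = definitely not null, ⊤ = possibly null, and '?' marks the analysis's
# optimistic uncertainty. "T" stands in for ⊤ in the code.

ORDER = {"N": 0, "?": 1, "T": 2}

def leq(a, b):
    """The partial order a ⊑ b (here a total order on three elements)."""
    return ORDER[a] <= ORDER[b]

def join(a, b):
    """Least upper bound, used to merge facts at control-flow join points."""
    return a if ORDER[a] >= ORDER[b] else b

# A definitely-non-null branch meets a branch with unknown nullability:
merged = join("N", "?")
```

Because "?" sits strictly below ⊤, the analysis can stay optimistic about unannotated code instead of immediately reporting a possible null dereference, which is how false positives are reduced without abandoning soundness.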

Linear Capabilities for CHERI: An Exploration of the Design Space
Aaron Lippeveldts
(Vrije Universiteit Brussel, Belgium)
Capabilities can be used to replace pointers for referring to regions of memory; this is called capability-based addressing. CHERI is an instruction set extension that adds capability-based addressing, which enables privilege separation with fine memory granularity and large numbers of compartments. On a regular capability machine, it is not possible to temporarily grant authority (except by adding an extra level of indirection). One solution is the use of linear capabilities. In this work, we explore the design space for adding linear capabilities to CHERI by designing an actual instruction set extension, and we discuss the options for the default linearity types. There are several aspects we do not consider, such as concurrency.

Is Mutation Score a Fair Metric?
Beatriz Souza
(Federal University of Campina Grande, Brazil)
By comparing the mutation scores achieved by test suites, one can judge which test suite is more effective. However, it is not known whether the mutation score is a fair metric for such a comparison. In this paper, we present an empirical study that compares developer-written and automatically generated test suites in terms of mutation score and of the detection ratios of 7 mutation types. Our results indicate that the mutation score is a fair metric.
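How a mutation score is computed can be shown in a few lines; the program under test, the mutants, and the test suite below are illustrative, not taken from the study:

```python
# Sketch of mutation scoring: apply each mutant (a small program change),
# run the test suite, and count the mutants that are "killed" (detected).
# Score = killed / total. Program and mutants below are illustrative.

def original(a, b):
    return a + b

mutants = [
    lambda a, b: a - b,   # arithmetic-operator replacement
    lambda a, b: a * b,   # arithmetic-operator replacement
    lambda a, b: b + a,   # equivalent mutant: no test can ever kill it
]

def suite_passes(fn):
    """A (deliberately small) developer-written test suite."""
    return fn(2, 3) == 5 and fn(0, 4) == 4

killed = sum(1 for m in mutants if not suite_passes(m))
score = killed / len(mutants)
# Two of the three mutants are killed, so the score is 2/3.
```

The equivalent mutant in the example hints at why fairness is a real question: scores can be depressed by mutants that no suite, however good, could detect.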

Incremental Scannerless Generalized LR Parsing
Maarten P. Sijm
(Delft University of Technology, Netherlands)
We present the Incremental Scannerless Generalized LR (ISGLR) parsing algorithm, which combines the benefits of Incremental Generalized LR (IGLR) parsing and Scannerless Generalized LR (SGLR) parsing. The parser preprocesses the input by modifying the previously saved parse forest. This allows the input to the parser to be a stream of parse nodes, instead of a stream of characters. Scannerless parsing relies heavily on non-determinism during parsing, negatively impacting the incrementality of ISGLR parsing. We evaluated the ISGLR parsing algorithm using file histories from Git, achieving a speedup of up to 25 times over non-incremental SGLR.


Workshop Summaries

Summary of the 6th ACM SIGPLAN International Workshop on AI-Inspired and Empirical Methods for Software Engineering on Parallel Computing Systems (AI-SEPS 2019)
Ehsan Atoofian and Hiroyuki Takizawa
(Lakehead University, Canada; Tohoku University, Japan)

Summary of the 17th ACM SIGPLAN International Workshop on Domain-Specific Modeling (DSM 2019)
Jeff Gray, Matti Rossi, Jonathan Sprinkle, and Juha-Pekka Tolvanen
(University of Alabama, USA; Aalto University School of Business, Finland; University of Arizona, USA; MetaCase, Finland)

Summary of the 6th ACM SIGPLAN International Workshop on Reactive and Event-Based Languages and Systems (REBLS 2019)
Guido Salvaneschi, Wolfgang De Meuter, Patrick Eugster, Francisco Sant’Anna, and Lukasz Ziarek
(TU Darmstadt, Germany; Vrije Universiteit Brussel, Belgium; USI Lugano, Switzerland; Rio de Janeiro State University, Brazil; SUNY Buffalo, USA)

