Powered by
15th ACM SIGPLAN International Workshop on the State Of the Art in Program Analysis (SOAP 2026), June 15–19, 2026,
Boulder, CO, USA
15th ACM SIGPLAN International Workshop on the State Of the Art in Program Analysis (SOAP 2026)
Frontmatter
15th ACM SIGPLAN International Workshop on the State Of the Art in Program Analysis (SOAP 2026) Papers
Compile-Time Java Stream Fusion via mapMulti
Yegor Bugayenko,
Maxim Trunnikov, and
Vladimir Zakharov
(Huawei, China; Innopolis University, Russian Federation)
The Java Stream API, introduced in Java~8, makes data processing more expressive and concise compared to imperative loops. However, this abstraction can come with significant performance overhead, often due to the creation of multiple intermediate objects during pipeline execution. In functional languages such as Haskell, this problem is addressed through stream fusion, a compile-time optimization that eliminates unnecessary intermediate structures. Inspired by this idea, Streamliner was the first tool to perform ahead-of-time, bytecode-to-bytecode stream optimization for Java by unrolling stream pipelines into imperative loops. In this paper, we introduce an open-source optimizer that takes a different approach. Instead of unrolling streams into loops, it merges consecutive map() and filter() operations into a single mapMulti() call, available since Java 16. Our method avoids several limitations of Streamliner, including its sensitivity to escaping objects in lambda expressions and its restrictions on assigning or passing streams as variables. We evaluated our optimizer on nine benchmarks and observed superior performance in two cases and comparable results in most others. We also applied it to the bytecode of Apache Kafka, successfully executing all 31,799 unit tests without failures.
Article Search
Article: pldiws26soapmain-p1-p
On the Effectiveness of Modular Testing in EvoSuite
Elizabeth Dinella
(Bryn Mawr College, USA)
This paper explores the effectiveness of modular randomized testing for object oriented programs in Java. Modular testing involves testing individual components of a program in isolation. Often times, for effective test generation, a series of non-target setup calls must be included to obtain high coverage of the target component. In this work, we evaluate and improve modular testing with the EvoSuite test generator. We find that due to strict restrictions that disallow calls to non-target setup methods, EvoSuite’s modular testing mode is ineffective and often results in low branch coverage. We propose emote (Effective Modular Testing with EvoSuite): an enhancement to EvoSuite that relaxes this restriction, allowing non-target methods to be included in the test prefixes. This modification draws inspiration from developer-written fuzz drivers, which often invoke setup methods to properly initialize the state before testing the target method. To ensure meaningful test generation, we modify EvoSuite’s fitness function to focus branch coverage contributions on the call chain originating from the target method. emote is evaluated on a subset of the SF100 benchmark, showing a 15.15% improvement in coverage of the target methods.
Article Search
Article: pldiws26soapmain-p2-p
Scaling Static Code Analysis Adoption at WhatsApp iOS
Ákos Hajdu,
Jorge Mendez,
Sander van Valkenburg,
Artem Kupriianets,
Matteo Marescotti,
Dulma Churchill, and
Sopot Cela
(Meta, UK)
We present an industrial case study on deploying static code analysis to foster developer adoption, which indirectly supports the reliability of the WhatsApp iOS app, with a particular focus on memory management issues. Over the course of a year, we utilized Infer and Clang Static Analyzer to proactively prevent over 1700 regressions. In parallel, we mobilized 50+ engineers and harnessed generative AI to remediate over 1000 long-standing, pre-existing issues within the code base. We share insights on both the technical and people challenges faced, emphasizing the socio-technical strategies required for large-scale static analysis adoption.
Article Search
Article: pldiws26soapmain-p3-p
Ravencheck: Effectively-Propositional Reasoning for Rust
Kunha Kim,
Nicholas V. Lewchenko,
Bor-Yuh Evan Chang, and
Gowtham Kaki
(University of Colorado Boulder, USA; Amazon, USA)
SMT-based automated verification is a promising approach for ensuring
software correctness, but it often suffers from unpredictability and
proof instability due to the undecidability of the underlying logic.
While tools like Liquid Haskell or F* provide automation, they either
restrict logical expressivity to ensure decidability or sacrifice
decidability and predictability for richer logic. In this paper, we
present Ravencheck, a verification tool for Rust programs. Ravencheck
occupies a unique design space: it supports specifications within an
expressive fragment of Higher-Order Logic (HOL) while guaranteeing
decidable verification outcomes. To achieve this, Ravencheck
automatically encodes high-level Rust programs into the Extended
Effectively Propositional (EEPR) fragment of first-order logic with
relational abstraction.
Ravencheck builds on prior EEPR-based modeling
languages (e.g., Ivy) by integrating with the standard Rust toolchain.
This integration supports a one-language workflow in which verification
conditions are expressed using the same purely functional Rust syntax as
the verified executable code, rather than introducing a separate modeling
language.
Article Search
Article: pldiws26soapmain-p5-p
Conditional Execution of Transpiler Passes Based on Per-Script Feature Detection
Rishipal Singh Bhatia
(Google, USA)
As the ECMAScript specification evolves, industrial-scale JavaScript compilers face the challenge of supporting modern language syntax while maintaining compatibility for diverse execution environments. Traditionally, compilers solve this by running transpilation passes in a monolithic pipeline, where the transpilation passes are chosen to execute strictly based on a target language level. This results in significant computational waste, as compilers perform expensive Abstract Syntax Tree (AST) traversals to lower language features that may not exist in the actual input source code. We present a compiler improvement that conditionally executes transpiler passes based on accurately tracking and dynamically maintaining the exact set of language features present in the compilation unit throughout the transpilation process. It is implemented in the production open-sourced Google Closure Compiler. We also detail the architectural safeguards - including strategic pass ordering and dynamic validation of the transpiled code for feature-correctness. Evaluation on large-scale production applications and synthetic benchmarks produced a considerable reduction in compilation time and saved compute and memory usage.
Article Search
Artifacts Available
Article: pldiws26soapmain-p6-p
LLM-Integrated Declarative Program Analysis
Sara Baradaran,
Amirmohammad Nazari, and
Mukund Raghothaman
(University of Southern California, USA)
Program analysis tools such as CodeQL enable programmers to express their questions about codebases in the form of declarative queries, which are then evaluated over structured representations of the code. These versatile tools have broad applications in bug finding, vulnerability discovery, and codebase exploration. Still, they are limited in their ability to answer questions that rely on semantic judgments which cannot be expressed or decided using program analysis tools alone, e.g., identifying string literals that contain private information or violations of naming conventions.
In this paper, we present SemQL, a system which extends declarative program analysis frameworks with the ability to invoke an LLM as an external oracle. SemQL allows developers to write queries which combine structural reasoning with semantic (extra-analytic) judgments. We show the real-world value of such a system by collecting a set of analytic questions that require semantic reasoning beyond what is deducible simply from the structure of the code. We also present an algorithm which efficiently evaluates these queries while minimizing costly oracle invocations, and demonstrate its effectiveness in practical program analysis tasks.
Article Search
Article: pldiws26soapmain-p7-p
Exploring Locally Bounded Translation Validation
Alborz Jelvani,
Richard P. Martin, and
Santosh Nagarakatte
(Rutgers University, USA)
Bounded translation validation (TV) is a common method used to check for the equivalence
of two programs by unrolling all loops according to a user-supplied global bound.
Unfortunately, large unroll bounds with multiple loops can increase the state space
and the complexity of the TV. This paper proposes a systematic strategy for
applying local unroll bounds for bounded TV of program pairs with loops.
Our technique increases the coverage of the bounded TV in cases where the TV
would otherwise timeout with the existing global unroll bounds. We introduce a
lattice-based formulation of locally bounded program pairs and demonstrate that
random sampling of local unroll bounds from this lattice admits a search-and-prune
strategy that is effective for verification.
We have modified the Alive2 bounded TV to support local unroll bounds and evaluated
our approach on benchmarks such as TSVC. The results show an increase in
verification coverage with our lattice search-and-prune strategy combined with
local unroll bounds.
Article Search
Article: pldiws26soapmain-p8-p
Detecting Data Leaks in Multi-user LLM Apps via Automated User-Scoped Taint Analysis
Sanjib Kumar Sen and
Bozhen Liu
(Texas A&M University - Corpus Christi, USA)
LLM-Powered Apps (LPAs) are increasingly being deployed on shared cloud servers where multiple users interact at the same time. This creates data leak risks where one user's private data can flow into another user's session through shared caches, misconfigured access control policies, or concurrency bugs in multi-threaded and event-driven code. Existing taint analysis tools do not handle these risks well because they need manual source-sink marking, only track control and dataflow, and do not account for access control policies, complex app structures, or runtime environments.
In this paper, we propose Access Flow Guard (AFG), a framework that automatically identifies taint sources based on user identity and permission level to detect cross-user data leaks in LPAs. Instead of manually marking sources and sinks as existing taint approaches require, AFG introduces a Multi-User, Multi-Permission (MUMP) setup. MUMP automatically identifies user input entry points by matching function signatures against patterns derived from our dataset of popular open-source LPAs. Each identified input is tagged with the user's identity and permission level. AFG then applies Scoped Taint Pointer Analysis (STPA), building on existing pointer-based taint analysis techniques, to propagate user-tagged objects through a Pointer Assignment Graph. STPA detects where different users' data scopes overlap. We discuss key research challenges, including treating LLM components as black boxes and refining static results through dynamic testing. This approach aims to serve as a basis for systematic security testing of LLM-integrated software in shared multi-user environments.
Article Search
Article: pldiws26soapmain-p9-p
15th ACM SIGPLAN International Workshop on the State Of the Art in Program Analysis (SOAP 2026)
proc time: 8.58