Generating Inputs for Grammar Mining using Dynamic Symbolic Execution
Andreas Pointner, Josef Pichler, and Herbert Prähofer (University of Applied Sciences Upper Austria, Austria; Johannes Kepler University Linz, Austria)
Context
A vast number of software systems include components that parse and process structured input. In addition to programming languages, which are analyzed by compilers or interpreters, there are numerous components that process standardized or proprietary data formats of varying complexity. Even if such components were initially developed and tested based on a specification, such as a grammar, numerous modifications and adaptations over the course of software evolution can make it impossible to precisely determine which inputs they actually accept.
Inquiry
In this situation, grammar mining can be used to reconstruct the specification in the form of a grammar. Established approaches already produce useful results, provided that sufficient input data is available to fully cover the input language. However, achieving this completeness is a major challenge. In practice, only input data recorded during the operation of the software systems is available. If this data is used for grammar mining, the resulting grammar reflects only the inputs that were actually processed, not the complete input language accepted by the software component. As a result, edge cases or previously supported features that no longer appear in the available input data are missing from the generated grammar.
Approach
This work addresses this challenge by introducing a novel approach for the automatic generation of inputs for grammar mining. Although input generators have already been used for fuzz testing, it remains unclear whether they are also suitable for grammar miners. Building on the grammar miner Mimid, this work presents a fully automated approach to input generation. The approach leverages Dynamic Symbolic Execution (DSE) and extends it with two mechanisms to overcome the limitations of DSE regarding structured input parsers. First, the search for new inputs is guided by an iterative expansion that starts with a single-character input and gradually extends it. Second, input generation is structured into a novel three-phase approach, which separates the generation of inputs for parser functions.
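The paper itself contains no code in this abstract; purely to illustrate the iterative-expansion idea, the following minimal Python sketch shows how input candidates could be grown one character at a time. The names `solve_new_path` and `run_parser` are hypothetical stand-ins for a DSE back end and the parser under test, not the authors' actual implementation or any real API.

```python
# Hypothetical sketch of iterative input expansion for grammar mining.
# `solve_new_path` and `run_parser` are placeholders for a DSE engine and
# the instrumented parser under test; they are NOT real library functions.

def generate_inputs(run_parser, solve_new_path, max_len=32):
    """Collect accepted inputs by extending candidates one character at a time."""
    inputs, worklist = set(), [""]          # start from the empty prefix
    while worklist:
        prefix = worklist.pop()
        if len(prefix) >= max_len:
            continue
        # Ask the (hypothetical) DSE back end for characters that, appended to
        # `prefix`, drive the parser down a branch not covered so far.
        for ch in solve_new_path(prefix):
            candidate = prefix + ch
            accepted, covered_new_branch = run_parser(candidate)
            if accepted:
                inputs.add(candidate)       # keep as a sample for grammar mining
            if covered_new_branch:
                worklist.append(candidate)  # expand further from this candidate
    return inputs
```

The sketch deliberately omits the paper's three-phase separation of parser functions; it only conveys how starting from a single character and expanding step by step sidesteps the blow-up of exploring long inputs symbolically all at once.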
Grounding
The proposed method was evaluated against a diverse set of eleven benchmark applications from the existing literature. Results demonstrate that the approach achieves precision and recall for extracted grammars close to those derived from state-of-the-art grammar miners such as Mimid. Notably, it successfully uncovers subtle features and edge cases in parsers that are typically missed by such grammar miners. The effectiveness of the method is supported by empirical evidence, showing that it can achieve high performance in various domains without requiring prior input samples.
Importance
This contribution is significant for researchers and practitioners in software engineering, offering an automated, scalable, and precise solution for grammar mining. By eliminating the need for manual input generation, the approach not only reduces workload but also enhances the robustness and comprehensiveness of the extracted grammars. Following this approach, software engineers can reconstruct specifications from existing (legacy) parsers.
@Article{ProgrammingJournal-Volume25p16,
author = {Andreas Pointner and Josef Pichler and Herbert Prähofer},
title = {Generating Inputs for Grammar Mining using Dynamic Symbolic Execution},
journal = {},
volume = {},
number = {},
articleno = {16},
numpages = {29},
doi = {10.22152/programming-journal.org/2025/10/16},
year = {2025},
}
If-T: A Benchmark for Type Narrowing
Hanwen Guo and Ben Greenman (University of Utah, USA)
Context
The design of static type systems that can validate dynamically-typed programs (gradually) is an ongoing challenge. A key difficulty is that dynamic code rarely follows datatype-driven design. Programs instead use runtime tests to narrow down the proper usage of incoming data. Type systems for dynamic languages thus need a type narrowing mechanism that refines the type environment along individual control paths based on dominating tests, a form of flow-sensitive typing. In order to express refinements, the type system must have some notion of sets and subsets. Since set-theoretic types are computationally and ergonomically complex, the need for type narrowing raises design questions about how to balance precision and performance.
Inquiry
To date, the design of type narrowing systems has been driven by intuition, past experience, and examples from users in various language communities. There is no standard that captures desirable and undesirable behaviors. Prior formalizations of narrowing are also significantly more complex than a standard type system, and it is unclear how the extra complexity pays off in terms of concrete examples. This paper addresses these problems through If-T, a language-agnostic design benchmark for type narrowing that characterizes the abilities of implementations using simple programs that draw attention to fundamental questions. Unlike a traditional performance-focused benchmark, If-T measures a narrowing system’s ability to validate correct code and reject incorrect code. Unlike a test suite, systems are not required to fully conform to If-T. Deviations are acceptable provided they are justified by well-reasoned design considerations, such as compile-time performance.
Approach
If-T is guided by the literature on type narrowing, the documentation of gradual languages such as TypeScript, and experiments with typechecker implementations. We have identified a set of core technical dimensions for type narrowing. For each dimension, the benchmark contains a set of topics and (at least) two characterizing programs per topic: one that should typecheck and one that should not typecheck.
Knowledge
If-T provides a baseline to measure type narrowing systems. For researchers, it provides criteria to categorize future designs via its collection of positive and negative examples. For language designers, the benchmark demonstrates the payoff of typechecker complexity in terms of concrete examples. Designers can use the examples to decide whether supporting a particular example is worthwhile. Both the benchmark and its implementations are freely available online.
Grounding
We have implemented the benchmark for five typecheckers: TypeScript, Flow, Typed Racket, mypy, and Pyright. The results highlight important differences, such as the ability to track logical implications among program variables and typechecking for user-defined narrowing predicates.
Importance
Type narrowing is essential for gradual type systems, but the tradeoffs between systems with different complexity have been unclear. If-T clarifies these tradeoffs by illustrating the benefits and limitations of each level of complexity. With If-T as a way to assess implementations in a fair, cross-language manner, future type system designs can strive for a better balance among precision, annotation burden, and performance.
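As a flavour of what such paired characterizing programs look like, the Python sketch below shows one narrowing pattern that mypy and Pyright accept and one they reject. It is purely illustrative and is not taken from the If-T benchmark itself.

```python
# Illustrative narrowing examples in Python (not items from If-T).
# `should_typecheck` is expected to pass under mypy/Pyright;
# `should_not_typecheck` is expected to be flagged.
from typing import TypeGuard


def is_str_list(xs: list[object]) -> TypeGuard[list[str]]:
    """User-defined narrowing predicate."""
    return all(isinstance(x, str) for x in xs)


def should_typecheck(x: int | str) -> int:
    if isinstance(x, str):        # x is narrowed to str on this branch
        return len(x)
    return x + 1                  # and to int on this one


def should_not_typecheck(xs: list[object]) -> str:
    if is_str_list(xs):
        return ",".join(xs)       # ok: xs narrowed to list[str]
    return ",".join(xs)           # error: xs is still list[object] here
```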
@Article{ProgrammingJournal-Volume25p17,
author = {Hanwen Guo and Ben Greenman},
title = {If-T: A Benchmark for Type Narrowing},
journal = {},
volume = {},
number = {},
articleno = {17},
numpages = {31},
doi = {10.22152/programming-journal.org/2025/10/17},
year = {2025},
}
Published Artifact: Artifact Available (v2.0), Artifact Supports Claims (v2.0)
A Type System for Data Privacy Compliance in Active Object Languages
Chinmayi Prabhu Baramashetru, Paola Giannini, Silvia Lizeth Tapia Tarifa, and Olaf Owe (University of Oslo, Norway; Università del Piemonte Orientale, Italy)
Data protection laws such as GDPR aim to give users unprecedented control over their personal data. Compliance with these regulations requires systematically considering information flow and interactions among entities handling sensitive data. Privacy-by-design principles advocate embedding data protection into system architectures as a default. However, translating these abstract principles into concrete, explicit methods remains a significant challenge. This paper addresses this gap by proposing a language-based approach to privacy integration, combining static and runtime techniques. By employing type checking and type inference in an active object language, the framework enables the tracking of authorised data flows and the automatic generation of constraints checked at runtime based on user consent. This ensures that personal data is processed in compliance with GDPR constraints. The key contribution of this work is a type system that gathers the compliance checks and the changes to users' consent, and integrates data privacy compliance verification into system execution.
The paper demonstrates the feasibility of this approach through a soundness proof and several examples, illustrating how the proposed language addresses common GDPR requirements, such as user consent, purpose limitation, and data subject rights. This work advances the state of the art in privacy-aware system design by offering a systematic and automated method for integrating GDPR compliance into programming languages. This capability has implications for building trustworthy systems in domains such as healthcare or finance, where data privacy is crucial.
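The paper's contribution is a static type system for an active object language, which cannot be reproduced in a few lines here. Purely as a loose analogy, the Python sketch below (with entirely hypothetical names that do not come from the paper) illustrates only the runtime side of the idea: checking a purpose-tagged personal data item against the owner's current consent before processing it.

```python
# Loose, hypothetical analogy of the runtime consent check described above.
# None of these names are taken from the paper; the actual approach is a
# static type system with inference for an active object language.
from dataclasses import dataclass, field


@dataclass
class PersonalData:
    owner: str
    value: object
    purpose: str                      # e.g. "treatment", "billing", "marketing"


@dataclass
class ConsentRegistry:
    granted: dict[str, set[str]] = field(default_factory=dict)

    def grant(self, owner: str, purpose: str) -> None:
        self.granted.setdefault(owner, set()).add(purpose)

    def revoke(self, owner: str, purpose: str) -> None:
        self.granted.get(owner, set()).discard(purpose)

    def allows(self, data: PersonalData) -> bool:
        return data.purpose in self.granted.get(data.owner, set())


def process(data: PersonalData, consent: ConsentRegistry) -> object:
    # The generated runtime constraint: refuse to process data whose purpose
    # the owner has not consented to (or has since revoked).
    if not consent.allows(data):
        raise PermissionError(f"no consent for purpose {data.purpose!r}")
    return data.value
```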
@Article{ProgrammingJournal-Volume25p18,
author = {Chinmayi Prabhu Baramashetru and Paola Giannini and Silvia Lizeth Tapia Tarifa and Olaf Owe},
title = {A Type System for Data Privacy Compliance in Active Object Languages},
journal = {},
volume = {},
number = {},
articleno = {18},
numpages = {50},
doi = {10.22152/programming-journal.org/2025/10/18},
year = {2025},
}