
2017 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation (PEPM 2017), January 16–17, 2017, Paris, France

PEPM 2017 – Proceedings


Frontmatter

Title Page


Message from the Workshop Chairs
We are pleased to present the proceedings of the 2017 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation (PEPM'17), held in Paris, France, on January 16–17, 2017, in affiliation with the annual Symposium on Principles of Programming Languages (POPL'17). The PEPM series brings together researchers and practitioners working in the broad area of semantics-based program manipulation, an area that spans refactoring, partial evaluation, supercompilation, staged programming, fusion, and other forms of meta-programming, as well as model-driven development, program analysis, inductive programming, decompilation, program generation, and abstract interpretation.

Invited Paper

Compiling Untyped Lambda Calculus to Lower-Level Code by Game Semantics and Partial Evaluation (Invited Paper)
Daniil Berezun and Neil D. Jones
(JetBrains, Russia; St. Petersburg State University, Russia; University of Copenhagen, Denmark)
Any expression M in ULC (the untyped λ-calculus) can be compiled into a rather low-level language we call LLL, whose programs contain none of the traditional implementation devices for functional languages: environments, thunks, closures, etc. A compiled program is first-order functional and has a fixed set of working variables, whose number is independent of M. The generated LLL code in effect traverses the subexpressions of M.
We apply the techniques of game semantics to the untyped λ-calculus, but take a more operational viewpoint that uses less mathematical machinery than traditional presentations of game semantics. Further, the untyped lambda calculus ULC is compiled into LLL by partially evaluating a traversal algorithm for ULC.
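
To make the compilation target concrete, here is a minimal Haskell sketch (ours, not the paper's LLL): de Bruijn-indexed untyped λ-terms and an enumeration of their subexpressions, the objects the generated LLL code traverses. The actual compiled code performs a game-semantic traversal with a fixed set of first-order working variables rather than this recursive walk.

-- Minimal sketch (ours, not the paper's LLL).
data ULC = Var Int        -- de Bruijn index
         | Lam ULC        -- abstraction
         | App ULC ULC    -- application
  deriving Show

-- All subexpressions of a term, in traversal order.
subexprs :: ULC -> [ULC]
subexprs t = t : case t of
  Var _   -> []
  Lam b   -> subexprs b
  App f a -> subexprs f ++ subexprs a

main :: IO ()
main = mapM_ print (subexprs (App (Lam (Var 0)) (Lam (Var 0))))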

Programming Languages

Detecting Code Clones with Gaps by Function Applications
Tsubasa Matsushita and Isao Sasano
(Shibaura Institute of Technology, Japan)
Code clones are pairs or groups of code segments that are identical or similar to each other. The existence of code clones is generally considered to make source code cumbersome to maintain, so various code clone detection tools have been developed. Simple ones divide the source code into a sequence of lines or tokens and find identical or similar subsequences. Differences among code clones, called gaps, may prevent clones from being detected or may cause them to be detected only in fragments. To cope with gaps, various tools have been developed that use abstract syntax trees, compare metrics, or use program dependency graphs. In this paper we present a novel algorithm that detects clones with gaps by abstracting the gaps as function applications. Based on the algorithm we have implemented a tool for detecting code clones in programs written in Standard ML.
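
To illustrate a gap abstracted as a function application (our example, written in Haskell; the paper's tool targets Standard ML):

-- Two near-clones that differ only in one subexpression, the "gap".
sumSquares, sumCubes :: [Int] -> Int
sumSquares xs = sum [ x * x     | x <- xs ]
sumCubes   xs = sum [ x * x * x | x <- xs ]

-- Abstracting the gap as a function argument exposes the clone pair
-- as two applications of one shared template.
sumWith :: (Int -> Int) -> [Int] -> Int
sumWith f xs = sum [ f x | x <- xs ]

sumSquares', sumCubes' :: [Int] -> Int
sumSquares' = sumWith (\x -> x * x)
sumCubes'   = sumWith (\x -> x * x * x)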

Lightweight Soundness for Towers of Language Extensions
Alejandro Serrano and Jurriaan Hage
(Utrecht University, Netherlands)
It is quite natural to define a software language as an extension of a base language. A compiler builder usually prefers to work on a representation in the base language, while programmers prefer to program in the extended language. As we define a language extension, we want to ensure that desugaring it into the base language is provably sound.

We present a lightweight approach to verifying soundness by embedding the base language and its extensions in Haskell. The embedding uses the final tagless style, encoding each language as a type class. As a result, combination and enhancement of language extensions are expressed in a natural way. Soundness of the language extension corresponds to well-typedness of the Haskell terms, so no extra tool but the compiler is needed.
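
A minimal Haskell sketch of the final tagless recipe the abstract describes (our toy base language and extension, not the paper's): the base language is a type class, an extension is a subclass whose default method is the desugaring, and the desugaring is sound exactly when that definition type-checks.

-- Base language as a type class.
class Base repr where
  int :: Int -> repr Int
  add :: repr Int -> repr Int -> repr Int

-- Extension as a subclass; the default method is the desugaring into
-- the base language, and its well-typedness is the soundness check.
class Base repr => Doubling repr where
  dbl :: repr Int -> repr Int
  dbl e = add e e

-- One interpretation of the base language: evaluation.
newtype Eval a = Eval { eval :: a }

instance Base Eval where
  int = Eval
  add (Eval x) (Eval y) = Eval (x + y)

instance Doubling Eval   -- the default desugaring suffices

main :: IO ()
main = print (eval (dbl (int 21)))   -- prints 42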

PEG Parsing in Less Space using Progressive Tabling and Dynamic Analysis
Fritz Henglein and Ulrik Terp Rasmussen
(University of Copenhagen, Denmark)
Tabular top-down parsing and its lazy variant, Packrat, are linear-time execution models for the TDPL family of recursive descent parsers with limited backtracking. Exponential work due to backtracking is avoided by tabulating the result of each (nonterminal, offset)-pair at the expense of always using space proportional to the product of the input length and grammar size. Current methods for limiting the space usage rely either on manual annotations or on static analyses that are sensitive to the syntactic structure of the grammar.
We present progressive tabular parsing (PTP), a new execution model which progressively computes parse tables for longer prefixes of the input and simultaneously generates a leftmost expansion of the parts of the parse tree that can be resolved. Table columns can be discarded on the fly as the expansion progresses through the input string, providing best-case constant and worst-case linear memory use. Furthermore, semantic actions are scheduled before the parser has seen the end of the input. The scheduling is conservative in the sense that no action has to be “undone” in the case of backtracking.
The time complexity is O(dmn) where m is the size of the parser specification, n is the size of the input string, and d is either a configured constant or the maximum parser stack depth.
For common data exchange formats such as JSON, we demonstrate practically constant space usage.
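
A small Haskell sketch of the space-saving mechanism (our simplification; the names are not the paper's): the packrat table is keyed by input offset, and once the leftmost expansion has committed past an offset, all earlier columns can be dropped.

import qualified Data.Map.Strict as M

type Nonterminal = String
type Result      = Maybe Int              -- Just n: matched n characters
type Column      = M.Map Nonterminal Result
type Table       = M.Map Int Column       -- one column per input offset

-- Once parsing has committed past offset k, backtracking can never
-- revisit earlier offsets, so their columns can be discarded; in the
-- best case only a constant number of columns is ever live.
discardBefore :: Int -> Table -> Table
discardBefore k = snd . M.split (k - 1)   -- keep keys >= k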

Transformation

Interactive Data Representation Migration: Exploiting Program Dependence to Aid Program Transformation
Krishna Narasimhan, Christoph Reichenbach, and Julia Lawall
(Goethe University Frankfurt, Germany; R² Software & Systeme, Germany; Sorbonne, France; UPMC, France; Inria, France; LIP6, France)
Data representation migration is a program transformation that involves changing the type of a particular data structure, and then updating all of the operations that somehow depend on that data structure according to the new type. Changing the data representation can provide benefits such as improving efficiency and improving the quality of the computed results. Performing such a transformation is challenging, because it requires applying data-type-specific changes to code fragments that may be widely scattered throughout the source code, connected by dataflow dependencies. Refactoring systems are typically sensitive to dataflow dependencies, but are not programmable with respect to the features of particular data types. Existing program transformation languages provide the needed flexibility, but do not concisely support reasoning about dataflow dependencies.
To address the needs of data representation migration, we propose a new approach to program transformation that relies on a notion of semantic dependency: every transformation step propagates the transformation process onward to code that somehow depends on the transformed code. Our approach provides a declarative transformation-specification language for expressing type-specific transformation rules. We further provide scoped rules, a mechanism for guiding rule application, and tags, a device for simple program analysis within our framework, to enable more powerful program transformations.
We have implemented a prototype transformation system based on these ideas for C and C++ code and evaluate it against three example specifications: vectorization, transformation of integers to big integers, and transformation of array-of-structures data types to structure-of-arrays format. Our evaluation shows that our approach can improve program performance and the precision of the computed results, and that it scales to programs of up to 3700 lines.
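
For the third case study, the layouts before and after migration look roughly as follows (a Haskell illustration of the two representations; the paper's tool rewrites C and C++ sources):

-- Before: array of structures (AoS).
data Particle = Particle { px, py :: Double }
type ParticlesAoS = [Particle]

-- After: structure of arrays (SoA), a layout friendlier to
-- vectorization; a production version would use unboxed arrays.
data ParticlesSoA = ParticlesSoA { pxs, pys :: [Double] }

migrate :: ParticlesAoS -> ParticlesSoA
migrate ps = ParticlesSoA (map px ps) (map py ps)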

Verification of Code Generators via Higher-Order Model Checking
Takashi Suwa, Takeshi Tsukada, Naoki Kobayashi, and Atsushi Igarashi
(University of Tokyo, Japan; Kyoto University, Japan)
Dynamic code generation is useful for optimizing code with respect to information available only at run-time. Writing a code generator is, however, difficult and error prone. We consider a simple language for writing code generators and propose an automated method for verifying code generators. Our method is based on higher-order model checking, and can check that a given code generator can generate only closed, well-typed programs. Compared with typed multi-stage programming languages, our approach is less conservative on the typability of generated programs (i.e., can accept valid code generators that would be rejected by typical multi-stage languages) and can check a wider range of properties of code generators. We have implemented the proposed method and confirmed its effectiveness through experiments.
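
The verified property can be illustrated with a toy generator (our Haskell example, not the paper's input language): a generator that manipulates code as data can accidentally emit open code, and the point of higher-order model checking here is to exclude this statically for every possible run.

-- Code as a first-order datatype.
data Code = CVar String | CLam String Code | CApp Code Code

-- genPower n: code for  \f -> \x -> f (f (... (f x))), n applications.
genPower :: Int -> Code
genPower n = CLam "f" (CLam "x" (iterate (CApp (CVar "f")) (CVar "x") !! n))

-- A buggy generator: for n < 0 it emits the bare variable "x", i.e.,
-- open code -- the kind of defect the verification rules out.
genBad :: Int -> Code
genBad n = if n >= 0 then genPower n else CVar "x"

-- A dynamic closedness check; the paper establishes this property
-- statically, for all inputs, via higher-order model checking.
closed :: [String] -> Code -> Bool
closed env (CVar v)   = v `elem` env
closed env (CLam v b) = closed (v : env) b
closed env (CApp f a) = closed env f && closed env a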

A Functional Reformulation of UnCAL Graph-Transformations: Or, Graph Transformation as Graph Reduction
Kazutaka Matsuda and Kazuyuki Asada
(Tohoku University, Japan; University of Tokyo, Japan)
This paper proposes FUnCAL, a functional redesign of the graph transformation language UnCAL. Graph-structured data are widely used, including biological databases, XML with IDREFs, the WWW, and UML diagrams in software engineering. UnCAL is a language designed for graph transformations, i.e., extracting a subpart of a graph and converting it to a suitable form, as XQuery does for XML. A distinguishing feature of UnCAL is a semantics that respects bisimulation on graphs; this enables us to view UnCAL graph transformations as recursive functions, which is useful for reasoning as well as optimization. However, there is still a gap in applying the program-manipulation techniques studied in the programming language literature directly to UnCAL programs, due to some special features of UnCAL, especially markers. In this paper, following the observation that markers can be emulated by tuples and λ-abstractions, we transform UnCAL programs into a restricted class of usual (thus, marker-free) functional ones. By this translation, we can reason about, analyze, and optimize UnCAL programs as usual functional programs. Moreover, we introduce a type system showing that a small modification to the usual lazy semantics is enough to run well-typed functional programs as finite-graph transformations in a terminating way.
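
The marker observation can be made concrete in Haskell (our simplified encoding, not the paper's translation): a graph with two markers becomes a pair of mutually recursive bindings, and lazy evaluation ties the cycle.

-- A rooted graph as a (possibly infinite, lazily unfolded) tree.
data G = Node String [G]

-- An UnCAL-style cyclic graph with two markers &z1 and &z2: the tuple
-- of markers becomes a pair of mutually recursive Haskell bindings.
z1, z2 :: G
z1 = Node "a" [z2]
z2 = Node "b" [z1]

-- Bisimulation-respecting consumers explore only finite depth, which
-- is why a lazy semantics can run such programs in a terminating way.
takeG :: Int -> G -> [String]
takeG 0 _           = []
takeG n (Node l gs) = l : concatMap (takeG (n - 1)) gs
-- takeG 4 z1 == ["a","b","a","b"]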

Functional Parallels of Sequential Imperatives (Short Paper)
Tiark Rompf and Kevin J. Brown
(Purdue University, USA; Stanford University, USA)
Symbolic parallelism is a fresh look at the decades-old problem of turning sequential, imperative code into associative reduction kernels, based on the realization that map/reduce is at its core a staging problem: how can programs be separated so that part of the computation can be done before loop-carried dependencies become available? Previous work has investigated dynamic approaches that build symbolic summaries while the actual data is processed. In this short paper, we approach the problem from the static side, and show that with simple syntax- or type-driven transformations, we can readily transform large classes of imperative group-by-aggregate programs into map/reduce parallelism with deterministic overhead.
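
A classic worked instance of such a transformation (our example, not taken from the paper): the loop-carried update s := a·s + b is an affine map in s, and affine maps compose associatively, so the sequential fold becomes an associative reduction that can run as a parallel tree reduce.

-- An affine map  s -> mulC * s + addC.
data Affine = Affine { mulC :: Double, addC :: Double }

-- f <> g means "f, then g"; composition of affine maps is associative.
instance Semigroup Affine where
  Affine a b <> Affine c d = Affine (c * a) (c * b + d)

instance Monoid Affine where
  mempty = Affine 1 0        -- the identity map

apply :: Affine -> Double -> Double
apply (Affine a b) s = a * s + b

-- The sequential loop and its reduction form agree:
loopSeq, loopPar :: [(Double, Double)] -> Double
loopSeq        = foldl (\s (a, b) -> a * s + b) 0
loopPar coeffs = apply (foldMap (uncurry Affine) coeffs) 0
-- loopSeq [(2,3),(5,7)] == 22 == loopPar [(2,3),(5,7)]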

Types

Cost versus Precision for Approximate Typing for Python
Levin Fritz and Jurriaan Hage
(Utrecht University, Netherlands)
In this paper we describe a variation of monotone frameworks that enables us to perform approximate typing of Python, in particular for dealing with some of its more dynamic features such as first-class functions and Python's dynamic class system. We additionally introduce a substantial number of variants of the basic analysis in order to experimentally discover which configurations attain the best balance of cost and precision. For example, the analysis allows us to be selectively flow-insensitive for certain classes of identifiers, and the amount of call-site context is configurable. Our evaluation shows that adding call-site sensitivity and parameterized types has little effect on precision, while call-site sensitivity is very costly in terms of speed. On the other hand, flow-insensitive treatment of module-scope identifiers has a strongly positive effect, often both in terms of precision and speed.
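
As a hint of the abstract domain such a monotone-framework analysis works over (our toy simplification; the paper's domain is considerably richer), types can be approximated by finite unions of base types, joined at control-flow merge points:

import qualified Data.Set as S

-- A toy lattice of approximate Python types.
data Ty     = TInt | TStr | TNone | TFun deriving (Eq, Ord, Show)
data Approx = Union (S.Set Ty) | Top    deriving (Eq, Show)

-- The join taken where control-flow paths merge.
join :: Approx -> Approx -> Approx
join Top _               = Top
join _ Top               = Top
join (Union a) (Union b) = Union (S.union a b)

-- After  x = 1  on one branch and  x = "s"  on the other, x is
-- approximated by {int, str}:
example :: Approx
example = join (Union (S.singleton TInt)) (Union (S.singleton TStr))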

Predicting Resource Consumption of Higher-Order Workflows
Markus Klinik, Jurriaan Hage, Jan Martin Jansen, and Rinus Plasmeijer
(Radboud University Nijmegen, Netherlands; Utrecht University, Netherlands; Netherlands Defence Academy, Netherlands)
We present a type and effect system for static analysis of programs written in a simplified version of iTasks. iTasks is a workflow specification language embedded in Clean, a general-purpose functional programming language. Given costs for basic tasks, our analysis calculates an upper bound of the total cost of a workflow. The analysis has to deal with the domain-specific features of iTasks, in particular parallel and sequential composition of tasks, as well as the general-purpose features of Clean, in particular let-polymorphism, higher-order functions, recursion and lazy evaluation. Costs are vectors of natural numbers where every element represents some resource, either consumable or reusable.
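
One plausible reading of the cost algebra (an assumption on our part; the paper defines its own) treats each vector entry according to its resource kind when composing tasks:

data Kind = Consumable | Reusable   -- e.g. fuel vs. radio operators
type Cost = [Int]                   -- one entry per resource

-- Sequential composition: consumables are spent by both tasks, while
-- a reusable resource freed by the first task can serve the second.
seqC :: [Kind] -> Cost -> Cost -> Cost
seqC ks = zipWith3 f ks
  where f Consumable x y = x + y
        f Reusable   x y = max x y

-- Parallel composition: both tasks hold their resources at once, so
-- every entry adds up to give a safe upper bound.
parC :: [Kind] -> Cost -> Cost -> Cost
parC _ = zipWith (+)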

Refining Types using Type Guards in TypeScript
Ivo Gabe de Wolff and Jurriaan Hage
(Utrecht University, Netherlands)
We discuss two adaptations of the implementation of type guards and narrowing in the TypeScript compiler. The first is an improvement on the original syntax-directed implementation, and has now replaced the original one in the TypeScript compiler. It is specifically suited for the scenario in which an IDE requests the type of a particular variable in the program. The second implementation is defined as a whole program analysis, and is therefore able to compute more precise narrowed types, but at the price of a higher run-time cost.

Poster

Language-Integrated Query with Ordering, Grouping and Outer Joins (Poster Paper)
Tatsuya Katsushima and Oleg Kiselyov
(Tohoku University, Japan)
Language-integrated query systems like T-LINQ or QUEΛ make relational operations on (generally external) data feel like ordinary iteration over native arrays. As ordinary programs, queries are type-checked and can be abstracted over and composed. To access relational database systems, queries are eventually translated into well-formed, well-typed and efficient SQL. However, most existing language-integrated query systems implement only a small subset of the relational operations supported by modern databases.
To make QUEΛ full-featured, we add to it the operations corresponding to SQL’s ORDER BY, LIMIT, OUTER JOIN, GROUP BY and HAVING. We describe the type system and the normalization rules that produce efficient SQL code. The type system not only ensures the intricate SQL validity constraints by construction, but also prevents the accidental composition of hard-to-optimize queries.
Our extended QUEΛ is embedded in OCaml in the tagless-final style.
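
A Haskell rendering of the tagless-final embedding idea (QUEΛ itself is an OCaml library; the interface below is ours, not QUEΛ's):

-- Queries are built against an abstract interface ...
class Query repr where
  table   :: String -> repr
  orderBy :: String -> repr -> repr
  limit   :: Int -> repr -> repr

-- ... and one instance renders them to SQL text.
newtype SQL = SQL { sql :: String }

instance Query SQL where
  table t       = SQL ("SELECT * FROM " ++ t)
  orderBy col q = SQL (sql q ++ " ORDER BY " ++ col)
  limit n q     = SQL (sql q ++ " LIMIT " ++ show n)

-- Indexing repr with a phase (roughly what the paper's type system
-- does) would rule out ill-formed orderings such as LIMIT before
-- ORDER BY by construction.
main :: IO ()
main = putStrLn (sql (limit 10 (orderBy "name" (table "users"))))
-- SELECT * FROM users ORDER BY name LIMIT 10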
