Types for Tables: A Language Design Benchmark
Kuang-Chen Lu, Ben Greenman, and Shriram Krishnamurthi
(Brown University, USA)
Context
Tables are ubiquitous formats for data. Therefore, techniques for writing
correct programs over tables, and debugging incorrect ones, are vital. Our
specific focus in this paper is on rich types that articulate the properties
of tabular operations. We wish to study both their expressive power
and diagnostic quality.
Inquiry
There is no "standard library" of table operations. As a result, every paper
(and project) is free to use its own (sub)set of operations. This makes
artifacts very difficult to compare, and it can be hard to tell whether omitted
operations were left out by oversight or because they cannot actually be
expressed. Furthermore, virtually no papers discuss the quality of type error
feedback.
Approach
We combed through several existing languages and libraries to create a
"standard library" of table operations. Each entry is accompanied by a detailed
specification of its "type," expressed independent of (and hence not
constrained by) any type language. We also studied and categorized a corpus of
(student) program edits that resulted in table-related errors. We used this to
generate a suite of erroneous programs. Finally, we adapted the concept of a
datasheet to facilitate comparisons of different implementations.
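As a concrete illustration of what such a specification conveys, the following minimal sketch states one operation's requirements and guarantees as runtime checks; the operation name select_columns, the dict-of-lists table representation, and the example data are assumptions for exposition, not entries from the benchmark itself. A table-aware type system would enforce the same conditions statically.

    # Sketch of a "type" for one table operation, written as runtime checks.
    # A table is assumed to be a dict mapping column names to equal-length
    # lists of values.

    def select_columns(table, columns):
        # Requirement: every requested column must exist in the input table.
        missing = [c for c in columns if c not in table]
        if missing:
            raise TypeError(f"unknown columns: {missing}")
        # Guarantee: the result has exactly the requested columns, in the
        # requested order, with the same rows as the input.
        return {c: list(table[c]) for c in columns}

    grades = {"name": ["Ada", "Ben"], "hw1": [91, 78], "hw2": [88, 95]}
    print(select_columns(grades, ["name", "hw2"]))
    try:
        select_columns(grades, ["final"])
    except TypeError as err:
        print(err)  # unknown columns: ['final']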
Knowledge
Our benchmark creates a common ground to frame work in this area. Language
designers who claim to support typed programming over tables have a clear suite
against which to demonstrate their system's expressive power. Our family of
errors also gives them a chance to demonstrate the quality of feedback.
Researchers who improve one aspect---especially error reporting---without
changing the other can demonstrate their improvement, as can those who engage
in trade-offs between the two. The net result should be much better science in
both expressiveness and diagnostics. We also introduce a datasheet format for
presenting this knowledge in a methodical way.
Grounding
We have generated our benchmark from real languages, libraries, and programs,
as well as personal experience conducting and teaching data science. We have
drawn on experience in engineering and, more recently, in data science to
generate the datasheet.
Importance
Claims about type support for tabular programming are hard to evaluate.
However, tabular programming is ubiquitous, and the expressive power of type
systems keeps growing. Our benchmark and datasheet can help lead to more
orderly science. They also benefit programmers trying to choose a language.
@Article{ProgrammingJournal-Volume22p8,
author = {Kuang-Chen Lu and Ben Greenman and Shriram Krishnamurthi},
title = {Types for Tables: A Language Design Benchmark},
journal = {The Art, Science, and Engineering of Programming},
volume = {6},
number = {},
articleno = {8},
numpages = {30},
doi = {10.22152/programming-journal.org/2022/6/8},
year = {2022},
}
Automated, Targeted Testing of Property-Based Testing Predicates
Tim Nelson, Elijah Rivera, Sam Soucie, Thomas Del Vecchio, John Wrenn, and Shriram Krishnamurthi
(Brown University, USA; Massachusetts Institute of Technology, USA; Indiana University, USA)
Context This work is based on property-based testing (PBT). PBT is an increasingly important form of software testing. Furthermore, it serves as a concrete gateway into the abstract area of formal methods. Specifically, we focus on students learning PBT methods.
Inquiry How well do students do at PBT? Our goal is to assess the quality of the predicates they write as part of PBT. Prior work introduced the idea of decomposing the predicate’s property into a conjunction of independent subproperties. Testing the predicate against each subproperty gives a “semantic” understanding of their performance.
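As a rough illustration of that decomposition (the sorting example, the predicate names, and the use of the Hypothesis library are assumptions for exposition, not the paper's setup), the sketch below probes a buggy student-written predicate against two subproperties of "the output is a correct sort of the input":

    # Probing a student PBT predicate against individual subproperties.
    from collections import Counter
    from hypothesis import given, strategies as st

    def student_valid_sort(inp, out):
        # Buggy student predicate: checks ordering, but forgets that the
        # output must be a permutation of the input.
        return all(out[i] <= out[i + 1] for i in range(len(out) - 1))

    # Subproperty 1 (ordering): the predicate must reject unordered outputs.
    @given(st.lists(st.integers(), min_size=2))
    def test_flags_unordered_output(inp):
        out = sorted(inp, reverse=True)
        if any(out[i] > out[i + 1] for i in range(len(out) - 1)):
            assert not student_valid_sort(inp, out)  # holds: ordering is checked

    # Subproperty 2 (permutation): the predicate must reject outputs that
    # drop or duplicate elements.
    @given(st.lists(st.integers(), min_size=1))
    def test_flags_non_permutation(inp):
        out = sorted(inp)[:-1]  # drop one element
        if Counter(out) != Counter(inp):
            assert not student_valid_sort(inp, out)  # fails: bug exposed

    if __name__ == "__main__":
        test_flags_unordered_output()
        test_flags_non_permutation()  # raises AssertionError, exposing the bug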
Approach The notion of independence of subproperties both seems intuitive and was an important condition in prior work. First, we show that this condition is overly restrictive and might hide valuable information: it both undercounts errors and makes it hard to capture misconceptions. Second, we introduce two forms of automation, one based on PBT tools and the other on SAT-solving, to enable testing of student predicates. Third, we compare the output of these automated tools against manually constructed tests. Fourth, we measure the performance of those tools. Finally, we re-assess student performance reported in prior work.
Knowledge We show the difficulty caused by the independent subproperty requirement. We provide insight into how to use automation effectively to assess PBT predicates. In particular, we discuss the steps we had to take to beat human performance. We also provide insight into how to make the automation work efficiently. Finally, we present a much richer account than prior work of how students did.
Grounding Our methods are grounded in mathematical logic. We also make use of well-understood principles of test generation from more formal specifications. This combination ensures the soundness of our work. We use standard methods to measure performance.
Importance As both educators and programmers, we believe PBT is a valuable tool for students to learn, and its importance will only grow as more developers appreciate its value. Effective teaching requires a clear understanding of student knowledge and progress. Our methods enable a rich and automated analysis of student performance on PBT that yields insight into their understanding and can capture misconceptions. We therefore expect these results to be valuable to educators.
@Article{ProgrammingJournal-Volume22p10,
author = {Tim Nelson and Elijah Rivera and Sam Soucie and Thomas Del Vecchio and John Wrenn and Shriram Krishnamurthi},
title = {Automated, Targeted Testing of Property-Based Testing Predicates},
journal = {The Art, Science, and Engineering of Programming},
volume = {6},
number = {},
articleno = {10},
numpages = {29},
doi = {10.22152/programming-journal.org/2022/6/10},
year = {2022},
}