1st ACM SIGPLAN Workshop on Machine Learning and Programming Languages (MAPL 2017)
June 18, 2017, Barcelona, Spain
Frontmatter
Message from the Chairs
Welcome to the first ACM SIGPLAN workshop on Machine Learning and Programming Languages (MAPL), co-located with PLDI in Barcelona, Spain, June 18, 2017. The focus of MAPL is to advance interdisciplinary research across the fields of machine learning (ML) and programming languages (PL).
Although significant advances have been made in both ML and PL individually, we believe there are many open research problems that are likely to be best solved using a combination of the two. These include simplifying the expression of programmer intent, using ML for algorithmic invention, and providing infrastructure for the autonomous generation and optimization of software programs. It is for these reasons that we formed MAPL: to help usher in a new wave of research in these domains.
Languages and Frameworks
A Computational Model for TensorFlow: An Introduction
Martín Abadi, Michael Isard, and Derek G. Murray
(Google Brain, USA)
TensorFlow is a powerful, programmable system for machine learning. This paper aims to provide the basics of a conceptual framework for understanding the behavior of TensorFlow models during training and inference: it describes an operational semantics, of the kind common in the literature on programming languages. More broadly, the paper suggests that a programming-language perspective is fruitful in designing and in explaining systems such as TensorFlow.
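To make the dataflow view concrete, here is a minimal sketch (not from the paper) of a TensorFlow 1.x-era graph whose nodes are operations and whose edges carry tensors; all node names and values are illustrative.

```python
# A minimal sketch (not from the paper) of the dataflow-graph view that the
# paper's operational semantics describes: a TensorFlow program is a graph
# whose nodes are operations and whose edges carry tensors.
# Uses the TensorFlow 1.x graph API; names and values are illustrative.
import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    x = tf.placeholder(tf.float32, shape=[None, 2], name="x")   # input edge
    w = tf.Variable([[1.0], [-1.0]], name="w")                   # stateful node
    y = tf.matmul(x, w, name="y")                                # operation node
    loss = tf.reduce_sum(tf.square(y), name="loss")              # operation node

# Executing the graph: a session maps requested output nodes to values,
# firing each operation once its inputs are available.
with tf.Session(graph=graph) as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(loss, feed_dict={x: [[3.0, 1.0]]}))
```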
@InProceedings{MAPL17p1,
author = {Martín Abadi and Michael Isard and Derek G. Murray},
title = {A Computational Model for TensorFlow: An Introduction},
booktitle = {Proc.\ MAPL},
publisher = {ACM},
pages = {1--7},
doi = {},
year = {2017},
}
Dyna: Toward a Self-Optimizing Declarative Language for Machine Learning Applications
Tim Vieira, Matthew Francis-Landau, Nathaniel Wesley Filardo, Farzad Khorasani, and Jason Eisner
(Johns Hopkins University, USA; Rice University, USA)
Declarative programming is a paradigm that allows programmers to specify what they want to compute, leaving how to compute it to a solver. Our declarative programming language, Dyna, is designed to compactly specify computations like those that are frequently encountered in machine learning. As a declarative language, Dyna’s solver has a large space of (correct) strategies available to it. We describe a reinforcement learning framework for adaptively choosing among these strategies to maximize efficiency for a given workload. Adaptivity in execution is especially important for software that will run under a variety of workloads, where no fixed policy works well. We hope that reinforcement learning will identify good policies reasonably quickly—offloading the burden of writing efficient code from human programmers.
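Dyna's actual solver is not shown here; the sketch below only illustrates the underlying idea of adaptively choosing among several correct execution strategies, using a simple epsilon-greedy rule. The strategy names and cost model are hypothetical.

```python
# A loose illustration (not Dyna's solver) of using reinforcement learning to
# pick among correct execution strategies: an epsilon-greedy bandit that
# favors whichever strategy has been fastest on the observed workload.
# Strategy names and the cost model are hypothetical.
import random
import time

def forward_chaining(query):  pass        # hypothetical strategies; any of
def backward_chaining(query): pass        # them computes a correct answer
def memoized_search(query):   pass

def run_with(strategy, query):
    """Execute `query` under `strategy` and return the elapsed time."""
    start = time.perf_counter()
    strategy(query)
    return time.perf_counter() - start

strategies = [forward_chaining, backward_chaining, memoized_search]
avg_cost = {s: 0.0 for s in strategies}
count = {s: 0 for s in strategies}
EPSILON = 0.1

def choose():
    if random.random() < EPSILON or all(c == 0 for c in count.values()):
        return random.choice(strategies)                 # explore
    return min(strategies, key=lambda s: avg_cost[s])    # exploit fastest so far

def answer(query):
    s = choose()
    cost = run_with(s, query)
    count[s] += 1
    avg_cost[s] += (cost - avg_cost[s]) / count[s]       # running mean of cost
```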
@InProceedings{MAPL17p8,
author = {Tim Vieira and Matthew Francis-Landau and Nathaniel Wesley Filardo and Farzad Khorasani and Jason Eisner},
title = {Dyna: Toward a Self-Optimizing Declarative Language for Machine Learning Applications},
booktitle = {Proc.\ MAPL},
publisher = {ACM},
pages = {8--17},
doi = {},
year = {2017},
}
Debugging, Analysis, and Verification
Debugging Probabilistic Programs
Chandrakana Nandi, Dan Grossman, Adrian Sampson, Todd Mytkowicz, and Kathryn S. McKinley
(University of Washington, USA; Cornell University, USA; Microsoft Research, USA; Google, USA)
Many applications compute with estimated and uncertain data. While advances in probabilistic programming help developers build such applications, debugging them remains extremely challenging. New types of errors in probabilistic programs include 1) ignoring dependencies and correlation between random variables and in training data, 2) poorly chosen inference hyper-parameters, and 3) incorrect statistical models. A partial solution adopted by some languages prevents these errors by forbidding developers from explicitly invoking inference. While this prevents some dependence errors, it limits composition and control over inference, and does not guarantee the absence of other types of errors. This paper presents the FLEXI programming model, which supports constructs for invoking inference in the language and reusing the results in other statistical computations. We define a novel formalism for inference with a Decorated Bayesian Network and present a tool, DePP, that analyzes this representation to identify the above errors. We evaluate DePP on a range of prototypical examples to show how it helps developers detect errors.
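FLEXI and DePP are not reproduced here; the toy sketch below merely illustrates the first error class the abstract lists (treating correlated random variables as independent), with an invented model and parameters.

```python
# A toy illustration (not FLEXI or DePP) of the first error class the paper
# targets: ignoring correlation between random variables. Sampling height and
# weight independently loses the dependence a correct model would preserve.
# The distributions and parameters are invented for illustration.
import random

def sample_independent():
    height = random.gauss(170, 10)           # cm
    weight = random.gauss(70, 15)            # kg, wrongly independent of height
    return height, weight

def sample_correlated():
    height = random.gauss(170, 10)
    # weight depends on height, so the joint distribution keeps the correlation
    weight = random.gauss(0.9 * (height - 170) + 70, 8)
    return height, weight

# A checker in the spirit of DePP would flag downstream computations that rely
# on the joint behavior of (height, weight) but use sample_independent().
```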
@InProceedings{MAPL17p18,
author = {Chandrakana Nandi and Dan Grossman and Adrian Sampson and Todd Mytkowicz and Kathryn S. McKinley},
title = {Debugging Probabilistic Programs},
booktitle = {Proc.\ MAPL},
publisher = {ACM},
pages = {18--26},
doi = {},
year = {2017},
}
Combining the Logical and the Probabilistic in Program Analysis
Xin Zhang, Xujie Si, and Mayur Naik
(Georgia Institute of Technology, USA; University of Pennsylvania, USA)
Conventional program analyses have made great strides by leveraging logical reasoning. However, they cannot handle uncertain knowledge, and they lack the ability to learn and adapt. This in turn hinders the accuracy, scalability, and usability of program analysis tools in practice. We seek to address these limitations by proposing a methodology and framework for incorporating probabilistic reasoning directly into existing program analyses that are based on logical reasoning. We demonstrate that the combined approach can benefit a number of important applications of program analysis and thereby facilitate more widespread adoption of this technology.
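As a loose illustration of the combination the paper proposes, the sketch below attaches confidences to analysis facts and propagates them through a Horn-style rule so that alarms can be ranked rather than treated as all-or-nothing; the facts, rule, and numbers are invented and do not come from the paper's framework.

```python
# A toy sketch (not the paper's framework) of mixing logical rules with
# probabilities: analysis facts carry confidences, a Horn-style rule derives
# an alarm, and alarms are ranked by derived confidence.
# All facts, rules, and confidences are invented for illustration.

facts = {
    ("may_alias", "p", "q"): 0.6,     # uncertain points-to information
    ("writes", "q"): 1.0,             # certain fact from the program text
}

def derive_race(p, q):
    # Logical rule: may_alias(p, q) AND writes(q)  =>  possible_race(p, q)
    # Probabilistic reading: combine confidences (independence assumed here).
    alias = facts.get(("may_alias", p, q), 0.0)
    write = facts.get(("writes", q), 0.0)
    return alias * write

alarms = {("possible_race", "p", "q"): derive_race("p", "q")}
for alarm, confidence in sorted(alarms.items(), key=lambda kv: -kv[1]):
    print(alarm, round(confidence, 2))    # rank alarms instead of listing all
```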
@InProceedings{MAPL17p27,
author = {Xin Zhang and Xujie Si and Mayur Naik},
title = {Combining the Logical and the Probabilistic in Program Analysis},
booktitle = {Proc.\ MAPL},
publisher = {ACM},
pages = {27--34},
doi = {},
year = {2017},
}
Learning a Classifier for False Positive Error Reports Emitted by Static Code Analysis Tools
Ugur Koc, Parsa Saadatpanah, Jeffrey S. Foster, and Adam A. Porter
(University of Maryland at College Park, USA)
The large scale and high complexity of modern software systems make perfectly precise static code analysis (SCA) infeasible. Therefore, SCA tools often over-approximate so as not to miss any real problems. This, however, comes at the expense of raising false alarms, which, in practice, reduces the usability of these tools.
To partially address this problem, we propose a novel learning process whose goal is to discover program structures that cause a given SCA tool to emit false error reports, and then to use this information to predict whether a new error report is likely to be a false positive as well. To do this, we first preprocess code to isolate the locations that are related to the error report. Then, we apply machine learning techniques to the preprocessed code to discover correlations and to learn a classifier.
We evaluated this approach in an initial case study of a widely-used SCA tool for Java. Our results showed that for our dataset we could accurately classify a large majority of false positive error reports. Moreover, we identified some common coding patterns that led to false positive errors. We believe that SCA developers may be able to redesign their methods to address these patterns and reduce false positive error reports.
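As a rough illustration of the overall pipeline (not the paper's implementation), the sketch below trains an off-the-shelf classifier on hypothetical features extracted from the code around each report.

```python
# A minimal sketch (not the paper's pipeline) of the overall idea: extract
# features from the code surrounding each static-analysis report, then train
# a classifier that predicts whether a new report is a false positive.
# The feature set, data, and labels below are entirely hypothetical.
from sklearn.ensemble import RandomForestClassifier

# Hypothetical per-report features, e.g. counts of syntactic constructs near
# the reported location (loops, casts, null checks, library calls).
X_train = [
    [2, 0, 1, 3],
    [0, 1, 0, 1],
    [5, 2, 0, 0],
    [1, 0, 2, 4],
]
y_train = [1, 0, 1, 0]        # 1 = false positive, 0 = true report

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

new_report = [[3, 1, 0, 2]]
print(clf.predict(new_report))          # predicted label for a new report
print(clf.predict_proba(new_report))    # confidence in that prediction
```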
@InProceedings{MAPL17p35,
author = {Ugur Koc and Parsa Saadatpanah and Jeffrey S. Foster and Adam A. Porter},
title = {Learning a Classifier for False Positive Error Reports Emitted by Static Code Analysis Tools},
booktitle = {Proc.\ MAPL},
publisher = {ACM},
pages = {35--42},
doi = {},
year = {2017},
}
Verified Perceptron Convergence Theorem
Charlie Murphy, Patrick Gray, and Gordon Stewart
(Princeton University, USA; Ohio University, USA)
Frank Rosenblatt invented the perceptron algorithm in 1957 as part of an early attempt to build "brain models", artificial neural networks. In this paper, we apply tools from symbolic logic such as dependent type theory as implemented in Coq to build, and prove convergence of, one-layer perceptrons (specifically, we show that our Coq implementation converges to a binary classifier when trained on linearly separable datasets).
Our perceptron and proof are extensible, which we demonstrate by adapting our convergence proof to the averaged perceptron, a common variant of the basic perceptron algorithm. We perform experiments to evaluate the performance of our Coq perceptron vs. an arbitrary-precision C++ implementation and against a hybrid implementation in which separators learned in C++ are certified in Coq. We find that by carefully optimizing the extraction of our Coq perceptron, we can meet -- and occasionally exceed -- the performance of the arbitrary-precision C++ implementation. Our hybrid Coq certifier demonstrates an architecture for building high-assurance machine-learning systems that reuse existing codebases.
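The paper's artifact is a Coq implementation with a machine-checked convergence proof; for reference, the unverified Python sketch below shows the classic perceptron update rule that the theorem concerns, run on an invented linearly separable dataset.

```python
# The classic perceptron training loop, as an unverified Python reference.
# The paper's contribution is a Coq implementation of this procedure together
# with a machine-checked proof that it converges on linearly separable data.
# The dataset below is a tiny invented linearly separable example.

def perceptron(samples, epochs=100):
    """samples: list of (features, label) pairs with label in {-1, +1}."""
    dim = len(samples[0][0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        updated = False
        for x, y in samples:
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:                     # misclassified point
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                updated = True
        if not updated:                                 # all points classified
            return w, b                                 # correctly: converged
    return w, b

data = [([2.0, 1.0], 1), ([1.5, 2.0], 1), ([-1.0, -0.5], -1), ([-2.0, -1.5], -1)]
print(perceptron(data))
```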
@InProceedings{MAPL17p43,
author = {Charlie Murphy and Patrick Gray and Gordon Stewart},
title = {Verified Perceptron Convergence Theorem},
booktitle = {Proc.\ MAPL},
publisher = {ACM},
pages = {43--50},
doi = {},
year = {2017},
}