11th Working Conference on Mining Software Repositories (MSR 2014), May 31 – June 1, 2014, Hyderabad, India

MSR 2014 – Proceedings


Title Page
Welcome from the Chairs


Is Mining Software Repositories Data Science? (Keynote)
Audris Mockus
(Avaya Labs Research, USA)

Green Mining

Mining Energy-Greedy API Usage Patterns in Android Apps: An Empirical Study
Mario Linares-Vásquez, Gabriele Bavota, Carlos Bernal-Cárdenas, Rocco Oliveto, Massimiliano Di Penta, and Denys Poshyvanyk
(College of William and Mary, USA; University of Sannio, Italy; Universidad Nacional de Colombia, Colombia; University of Molise, Italy)
GreenMiner: A Hardware Based Mining Software Repositories Software Energy Consumption Framework
Abram Hindle, Alex Wilson, Kent Rasmussen, E. Jed Barlow, Joshua Charles Campbell, and Stephen Romansky
(University of Alberta, Canada)
Mining Questions about Software Energy Consumption
Gustavo Pinto, Fernando Castor, and Yu David Liu
(Federal University of Pernambuco, Brazil; SUNY Binghamton, USA)

Code Clones and Origin Analysis

Prediction and Ranking of Co-change Candidates for Clones
Manishankar Mondal, Chanchal K. Roy, and Kevin A. Schneider
(University of Saskatchewan, Canada)
Incremental Origin Analysis of Source Code Files
Daniela Steidl, Benjamin Hummel, and Elmar Juergens
(CQSE, Germany)
Oops! Where Did That Code Snippet Come From?
Lisong Guo, Julia Lawall, and Gilles Muller
(INRIA, France; LIP6, France; Sorbonne, France; UPMC, France)

Bug Characterizing

Works For Me! Characterizing Non-reproducible Bug Reports
Mona Erfani Joorabchi, Mehdi Mirzaaghaei, and Ali Mesbah
(University of British Columbia, Canada)
Characterizing and Predicting Blocking Bugs in Open Source Projects
Harold Valdivia Garcia and Emad Shihab
(Rochester Institute of Technology, USA)
An Empirical Study of Dormant Bugs
Tse-Hsun Chen, Meiyappan Nagappan, Emad Shihab, and Ahmed E. Hassan
(Queen's University, Canada; Rochester Institute of Technology, USA)

Mining Repos and QA Sites

The Promises and Perils of Mining GitHub
Eirini Kalliamvakou, Georgios Gousios, Kelly Blincoe, Leif Singer, Daniel M. German, and Daniela Damian
(University of Victoria, Canada; Delft University of Technology, Netherlands)
Mining StackOverflow to Turn the IDE into a Self-Confident Programming Prompter
Luca Ponzanelli, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, and Michele Lanza
(University of Lugano, Switzerland; University of Sannio, Italy; University of Molise, Italy)
Mining Questions Asked by Web Developers
Kartik Bajaj, Karthik Pattabiraman, and Ali Mesbah
(University of British Columbia, Canada)
Process Mining Multiple Repositories for Software Defect Resolution from Control and Organizational Perspective
Monika Gupta, Ashish Sureka, and Srinivas Padmanabhuni
(IIIT Delhi, India; Infosys, India)

Mining Applications

MUX: Algorithm Selection for Software Model Checkers
Varun Tulsian, Aditya Kanade, Rahul Kumar, Akash Lal, and Aditya V. Nori
(Indian Institute of Science, India; Microsoft Research, India)
Improving the Effectiveness of Test Suite through Mining Historical Data
Jeff Anderson, Saeed Salem, and Hyunsook Do
(Microsoft, USA; North Dakota State University, USA)
Finding Patterns in Static Analysis Alerts: Improving Actionable Alert Ranking
Quinn Hanam, Lin Tan, Reid Holmes, and Patrick Lam
(University of Waterloo, Canada)
Impact Analysis of Change Requests on Source Code Based on Interaction and Commit Histories
Motahareh Bahrami Zanjani, George Swartzendruber, and Huzefa Kagdi
(Wichita State University, USA)

Defect Prediction

An Empirical Study of Just-in-Time Defect Prediction using Cross-Project Models
Takafumi Fukushima, Yasutaka Kamei, Shane McIntosh, Kazuhiro Yamashita, and Naoyasu Ubayashi
(Kyushu University, Japan; Queen's University, Canada)
Towards Building a Universal Defect Prediction Model
Feng Zhang, Audris Mockus, Iman Keivanloo, and Ying Zou
(Queen's University, Canada; Avaya Labs Research, USA)

Code Review and Code Search

The Impact of Code Review Coverage and Code Review Participation on Software Quality: A Case Study of the Qt, VTK, and ITK Projects
Shane McIntosh, Yasutaka Kamei, Bram Adams, and Ahmed E. Hassan
(Queen's University, Canada; Kyushu University, Japan; Polytechnique Montréal, Canada)
Modern Code Reviews in Open-Source Projects: Which Problems Do They Fix?
Moritz Beller, Alberto Bacchelli, Andy Zaidman, and Elmar Juergens
(Delft University of Technology, Netherlands; CQSE, Germany)
Thesaurus-Based Automatic Query Expansion for Interface-Driven Code Search
Otávio A. L. Lemos, Adriano C. de Paula, Felipe C. Zanichelli, and Cristina V. Lopes
(Federal University of São Paulo, Brazil; University of California at Irvine, USA)

Effort Estimation and Reuse

Estimating Development Effort in Free/Open Source Software Projects by Mining Software Repositories: A Case Study of OpenStack
Gregorio Robles, Jesús M. González-Barahona, Carlos Cervigón, Andrea Capiluppi, and Daniel Izquierdo-Cortázar
(Universidad Rey Juan Carlos, Spain; Brunel University, UK; Bitergia, Spain)
An Industrial Case Study of Automatically Identifying Performance Regression-Causes
Thanh H. D. Nguyen, Meiyappan Nagappan, Ahmed E. Hassan, Mohamed Nasser, and Parminder Flora
(Queen's University, Canada; BlackBerry, Canada)
Revisiting Android Reuse Studies in the Context of Code Obfuscation and Library Usages
Mario Linares-Vásquez, Andrew Holtzhauer, Carlos Bernal-Cárdenas, and Denys Poshyvanyk
(College of William and Mary, USA; Universidad Nacional de Colombia, Colombia)

Mining Mix

Syntax Errors Just Aren't Natural: Improving Error Reporting with Language Models
Joshua Charles Campbell, Abram Hindle, and José Nelson Amaral
(University of Alberta, Canada)
Do Developers Feel Emotions? An Exploratory Analysis of Emotions in Software Artifacts
Alessandro Murgia, Parastou Tourani, Bram Adams, and Marco Ortu
(University of Antwerp, Belgium; Polytechnique Montréal, Canada; University of Cagliari, Italy)
How Does a Typical Tutorial for Mobile Development Look Like?
Rebecca Tiarks and Walid Maalej
(University of Hamburg, Germany)
Unsupervised Discovery of Intentional Process Models from Event Logs
Ghazaleh Khodabandelou, Charlotte Hug, Rebecca Deneckère, and Camille Salinesi
(Sorbonne, France)

Short Research/Practice Papers

Tracing Dynamic Features in Python Programs
Beatrice Åkerblom, Jonathan Stendahl, Mattias Tumlin, and Tobias Wrigstad
(Stockholm University, Sweden; Uppsala University, Sweden)
It's Not a Bug, It's a Feature: Does Misclassification Affect Bug Localization?
Pavneet Singh Kochhar, Tien-Duy B. Le, and David Lo
(Singapore Management University, Singapore)
Classifying Unstructured Data into Natural Language Text and Technical Information
Thorsten Merten, Bastian Mager, Simone Bürsner, and Barbara Paech
(Bonn-Rhein-Sieg University of Applied Sciences, Germany; University of Heidelberg, Germany)
Collaboration in Open-Source Projects: Myth or Reality?
Yuriy Tymchuk, Andrea Mocci, and Michele Lanza
(University of Lugano, Switzerland)
Improving the Accuracy of Duplicate Bug Report Detection using Textual Similarity Measures
Alina Lazar, Sarah Ritchey, and Bonita Sharif
(Youngstown State University, USA)
Undocumented and Unchecked: Exceptions That Spell Trouble
Maria Kechagia and Diomidis Spinellis
(Athens University of Economics and Business, Greece)
Innovation Diffusion in Open Source Software: Preliminary Analysis of Dependency Changes in the Gentoo Portage Package Database
Remco Bloemen, Chintan Amrit, Stefan Kuhlmann, and Gonzalo Ordóñez–Matamoros
(University of Twente, Netherlands)
A Dictionary to Translate Change Tasks to Source Code
Katja Kevic and Thomas Fritz
(University of Zurich, Switzerland)
New Features for Duplicate Bug Detection
Nathan Klein, Christopher S. Corley, and Nicholas A. Kraft
(Oberlin College, USA; University of Alabama, USA)
Mining Modern Repositories with Elasticsearch
Oleksii Kononenko, Olga Baysal, Reid Holmes, and Michael W. Godfrey
(University of Waterloo, Canada)

Mining Challenge

A Study of External Community Contribution to Open-Source Projects on GitHub
Rohan Padhye, Senthil Mani, and Vibha Singhal Sinha
(IBM Research, India)
Understanding "Watchers" on GitHub
Jyoti Sheoran, Kelly Blincoe, Eirini Kalliamvakou, Daniela Damian, and Jordan Ell
(University of Victoria, Canada)
Do Developers Discuss Design?
João Brunet, Gail C. Murphy, Ricardo Terra, Jorge Figueiredo, and Dalton Serey
(Federal University of Campina Grande, Brazil; University of British Columbia, Canada; Federal University of Lavras, Brazil)
Magnet or Sticky? An OSS Project-by-Project Typology
Kazuhiro Yamashita, Shane McIntosh, Yasutaka Kamei, and Naoyasu Ubayashi
(Kyushu University, Japan; Queen's University, Canada)
Security and Emotion: Sentiment Analysis of Security Discussions on GitHub
Daniel Pletea, Bogdan Vasilescu, and Alexander Serebrenik
(Eindhoven University of Technology, Netherlands)
Sentiment Analysis of Commit Comments in GitHub: An Empirical Study
Emitza Guzman, David Azócar, and Yang Li
(TU München, Germany)
Analysing the 'Biodiversity' of Open Source Ecosystems: The GitHub Case
Nicholas Matragkas, James R. Williams, Dimitris S. Kolovos, and Richard F. Paige
(University of York, UK)
Co-evolution of Project Documentation and Popularity within Github
Karan Aggarwal, Abram Hindle, and Eleni Stroulia
(University of Alberta, Canada)
An Insight into the Pull Requests of GitHub
Mohammad Masudur Rahman and Chanchal K. Roy
(University of Saskatchewan, Canada)

Data Showcase

A Dataset for Pull-Based Development Research
Georgios Gousios and Andy Zaidman
(Delft University of Technology, Netherlands)
The Bug Catalog of the Maven Ecosystem
Dimitris Mitropoulos, Vassilios Karakoidas, Panos Louridas, Georgios Gousios, and Diomidis Spinellis
(Athens University of Economics and Business, Greece; Delft University of Technology, Netherlands)
A Dataset of Feature Additions and Feature Removals from the Linux Kernel
Leonardo Passos and Krzysztof Czarnecki
(University of Waterloo, Canada)
Kataribe: A Hosting Service of Historage Repositories
Kenji Fujiwara, Hideaki Hata, Erina Makihara, Yusuke Fujihara, Naoki Nakayama, Hajimu Iida, and Kenichi Matsumoto
(NAIST, Japan)
Lean GHTorrent: GitHub Data on Demand
Georgios Gousios, Bogdan Vasilescu, Alexander Serebrenik, and Andy Zaidman
(Delft University of Technology, Netherlands; Eindhoven University of Technology, Netherlands)
A Code Clone Oracle
Daniel E. Krutz and Wei Le
(Rochester Institute of Technology, USA)
Generating Duplicate Bug Datasets
Alina Lazar, Sarah Ritchey, and Bonita Sharif
(Youngstown State University, USA)
FLOSS 2013: A Survey Dataset about Free Software Contributors: Challenges for Curating, Sharing, and Combining
Gregorio Robles, Laura Arjona Reina, Alexander Serebrenik, Bogdan Vasilescu, and Jesús M. González-Barahona
(Universidad Rey Juan Carlos, Spain; Universidad Politécnica de Madrid, Spain; Eindhoven University of Technology, Netherlands)
A Green Miner's Dataset: Mining the Impact of Software Change on Energy Consumption
Chenlei Zhang and Abram Hindle
(University of Alberta, Canada)
Gentoo Package Dependencies over Time
Remco Bloemen, Chintan Amrit, Stefan Kuhlmann, and Gonzalo Ordóñez–Matamoros
(University of Twente, Netherlands)
Models of OSS Project Meta-Information: A Dataset of Three Forges
James R. Williams, Davide Di Ruscio, Nicholas Matragkas, Juri Di Rocco, and Dimitris S. Kolovos
(University of York, UK; University of L'Aquila, Italy)
A Dataset of Clone References with Gaps
Hiroaki Murakami, Yoshiki Higo, and Shinji Kusumoto
(Osaka University, Japan)
A Dataset for Maven Artifacts and Bug Patterns Found in Them
Vaibhav Saini, Hitesh Sajnani, Joel Ossher, and Cristina V. Lopes
(University of California at Irvine, USA)
OpenHub: A Scalable Architecture for the Analysis of Software Quality Attributes
Gabriel Farah, Juan Sebastian Tejada, and Dario Correal
(Universidad de los Andes, Colombia)
Understanding Software Evolution: The Maisqual Ant Data Set
Boris Baldassari and Philippe Preux
(SQuORING Technologies, France; LIFL, France; CNRS, France; INRIA, France; University of Lille, France)
