2013 9th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE),
August 18–26, 2013,
Saint Petersburg, Russia
New Ideas
Analysis and Testing
Extracting URLs from JavaScript via Program Analysis
Qi Wang, Jingyu Zhou, Yuting Chen, Yizhou Zhang, and Jianjun Zhao
(Shanghai Jiao Tong University, China; Cornell University, USA)
With the extensive use of client-side JavaScript in web applications, web contents are becoming more dynamic than ever before. This poses significant challenges for search engines, because more web URLs are now embedded or hidden inside JavaScript code and most web crawlers are script-agnostic, significantly reducing the coverage of search engines. We present a hybrid approach that combines static analysis with dynamic execution, overcoming the weakness of a purely static or dynamic approach that either lacks accuracy or suffers from huge execution cost. We also propose to integrate program analysis techniques such as statement coverage and program slicing to improve the performance of URL mining.
@InProceedings{ESEC/FSE13p627,
author = {Qi Wang and Jingyu Zhou and Yuting Chen and Yizhou Zhang and Jianjun Zhao},
title = {Extracting URLs from JavaScript via Program Analysis},
booktitle = {Proc.\ ESEC/FSE},
publisher = {ACM},
pages = {627--630},
doi = {},
year = {2013},
}
Data Debugging with Continuous Testing
Kıvanç Muşlu, Yuriy Brun, and Alexandra Meliou
(University of Washington, USA; University of Massachusetts at Amherst, USA)
Today, systems rely as heavily on data as on the software that
manipulates those data. Errors in these systems are incredibly costly,
annually resulting in multi-billion dollar losses, and, on multiple
occasions, in death. While software debugging and testing have
received heavy research attention, less effort has been devoted to
data debugging: discovering system errors caused by well-formed but
incorrect data. In this paper, we propose continuous data testing:
using otherwise-idle CPU cycles to run test queries, in the
background, as a user or database administrator modifies a database.
This technique notifies the user or administrator about a data bug as
quickly as possible after that bug is introduced, leading to at least
three benefits: (1) The bug is discovered quickly and can be fixed
before it is likely to cause a problem. (2) The bug is discovered
while the relevant change is fresh in the user's or administrator's
mind, increasing the chance that the underlying cause of the bug, as
opposed to only the discovered side-effect, is fixed. (3) When poor
documentation or company policies contribute to bugs, discovering the
bug quickly is likely to identify these contributing factors,
facilitating updating documentation and policies to prevent similar
bugs in the future. We describe the problem space and potential
benefits of continuous data testing, our vision for the technique,
challenges we encountered, and our prototype implementation for
PostgreSQL. The prototype's low overhead shows promise that continuous
data testing can address the important problem of data debugging.
@InProceedings{ESEC/FSE13p631,
author = {Kıvanç Muşlu and Yuriy Brun and Alexandra Meliou},
title = {Data Debugging with Continuous Testing},
booktitle = {Proc.\ ESEC/FSE},
publisher = {ACM},
pages = {631--634},
doi = {},
year = {2013},
}
Iterative Test Suites Refinement for Elastic Computing Systems
Alessio Gambi, Antonio Filieri, and Schahram Dustdar
(University of Lugano, Switzerland; University of Stuttgart, Germany; Vienna University of Technology, Austria)
Elastic computing systems can dynamically scale to continuously and cost-effectively provide their required Quality of Service in the face of time-varying workloads, and they are usually implemented in the cloud. Despite their widespread adoption by industry, a formal definition of elasticity and suitable procedures for its assessment and verification are still missing. Both academia and industry are trying to adapt established testing procedures for functional and non-functional properties, with limited effectiveness with respect to elasticity. In this paper we propose a new methodology to automatically generate test suites for testing the elastic properties of systems. Elasticity, plasticity, and oscillations are first formalized through a convenient behavioral abstraction of the elastic system and then used to drive an iterative test suite refinement process. The outcomes of our approach are a test suite tailored to the violation of elasticity properties and a human-readable abstraction of the system behavior to further support diagnosis and fixing.
@InProceedings{ESEC/FSE13p635,
author = {Alessio Gambi and Antonio Filieri and Schahram Dustdar},
title = {Iterative Test Suites Refinement for Elastic Computing Systems},
booktitle = {Proc.\ ESEC/FSE},
publisher = {ACM},
pages = {635--638},
doi = {},
year = {2013},
}
Using Fault History to Improve Mutation Reduction
Laura Inozemtseva, Hadi Hemmati, and Reid Holmes
(University of Waterloo, Canada; University of Manitoba, Canada)
Mutation testing can be used to measure test suite quality in two
ways: by treating the kill score as a quality metric, or by treating
each surviving, non-equivalent mutant as an indicator of an inadequacy
in the test suite. The first technique relies on the assumption that
the mutation score is highly correlated with the suite's real fault
detection rate, which is not well supported by the literature. The
second technique relies only on the weaker assumption that the
"interesting" mutants (i.e., the ones that indicate an inadequacy in
the suite) are in the set of surviving mutants. Using the second
technique also makes improving the suite straightforward.
Unfortunately, mutation testing has a performance problem. At least
part of the test suite must be run on every mutant, meaning mutation
testing can be too slow for practical use. Previous work has
addressed this by reducing the number of mutants to evaluate in
various ways, including selecting a random subset of them. However,
reducing the set of mutants by random reduction is suboptimal for
developers using the second technique described above, since random
reduction will eliminate many of the interesting mutants.
We propose a new reduction method that supports the use of the second
technique by reducing the set of mutants to those generated by
altering files that have contained many faults in the past. We
performed a pilot study that suggests that this reduction method
preferentially chooses mutants that will survive mutation testing;
that is, it preserves a greater number of interesting mutants than
random reduction does.
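The reduction idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the `(file, mutant_id)` tuple representation, the `fault_counts` mapping, and the `top_k` cutoff are all hypothetical choices made for illustration. The abstract only says that mutants are kept when they come from files that have contained many faults in the past.

```python
from typing import Dict, List, Tuple

Mutant = Tuple[str, str]  # (source file, mutant id); hypothetical representation


def reduce_mutants_by_fault_history(
    mutants: List[Mutant],
    fault_counts: Dict[str, int],
    top_k: int,
) -> List[Mutant]:
    """Keep only mutants located in the top_k files with the most past faults.

    The top_k cutoff is an assumption; the abstract does not say how
    "many faults" is decided.
    """
    # Rank files by descending fault count; break ties by file name
    # so that the result is deterministic.
    ranked = sorted(fault_counts, key=lambda f: (-fault_counts[f], f))
    fault_prone = set(ranked[:top_k])
    return [m for m in mutants if m[0] in fault_prone]


# Hypothetical data, for illustration only.
history = {"a.py": 7, "b.py": 1, "c.py": 4}
all_mutants = [("a.py", "m1"), ("b.py", "m2"), ("c.py", "m3"), ("a.py", "m4")]
kept = reduce_mutants_by_fault_history(all_mutants, history, top_k=2)
print(kept)
```

With these made-up counts, the mutant in `b.py` is dropped because `b.py` has the fewest recorded faults.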
@InProceedings{ESEC/FSE13p639,
author = {Laura Inozemtseva and Hadi Hemmati and Reid Holmes},
title = {Using Fault History to Improve Mutation Reduction},
booktitle = {Proc.\ ESEC/FSE},
publisher = {ACM},
pages = {639--642},
doi = {},
year = {2013},
}
Hunting Bugs
A Cost-Effectiveness Criterion for Applying Software Defect Prediction Models
Hongyu Zhang and S. C. Cheung
(Tsinghua University, China; ISCAS, China; Hong Kong University of Science and Technology, China)
Ideally, software defect prediction models should help organize software quality assurance (SQA) resources and reduce the cost of finding defects by allowing the modules most likely to contain defects to be inspected first. In this paper, we study the cost-effectiveness of applying defect prediction models in SQA and propose a basic cost-effectiveness criterion. The criterion implies that defect prediction models should be applied with caution. We also propose a new metric, FN/(FN+TN), to measure the cost-effectiveness of a defect prediction model.
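As a small illustration of the metric named in this abstract, the sketch below computes FN/(FN+TN) from confusion-matrix counts. By the standard definitions of FN and TN, this ratio is the fraction of modules predicted defect-free that actually contain defects. The function name and the example counts are hypothetical, and the abstract does not state how the value should be interpreted or thresholded.

```python
def fn_over_predicted_clean(fn: int, tn: int) -> float:
    """Return FN / (FN + TN).

    fn: modules predicted defect-free that are actually defective
    tn: modules predicted defect-free that are actually defect-free
    """
    predicted_clean = fn + tn
    if predicted_clean == 0:
        raise ValueError("no modules were predicted defect-free")
    return fn / predicted_clean


# Hypothetical counts, for illustration only.
ratio = fn_over_predicted_clean(fn=5, tn=95)
print(ratio)
```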
@InProceedings{ESEC/FSE13p643,
author = {Hongyu Zhang and S. C. Cheung},
title = {A Cost-Effectiveness Criterion for Applying Software Defect Prediction Models},
booktitle = {Proc.\ ESEC/FSE},
publisher = {ACM},
pages = {643--646},
doi = {},
year = {2013},
}
BugMap: A Topographic Map of Bugs
Jiangtao Gong and Hongyu Zhang
(Tsinghua University, China; ISCAS, China)
A large and complex software system can contain a large number of bugs. It is desirable for developers to understand how these bugs are distributed across the system, so that they can have a better overview of software quality. In this paper, we describe BugMap, a tool we developed for visualizing large-scale bug location information. Taking source code and bug data as input, BugMap can display bug locations on a topographic map. By examining the topographic map, developers can understand how the components and files are affected by bugs. We apply this tool to visualize the distribution of Eclipse bugs across components/files. The results show that our tool is effective for understanding the overall quality status of a large-scale system and for identifying the problematic areas of the system.
@InProceedings{ESEC/FSE13p647,
author = {Jiangtao Gong and Hongyu Zhang},
title = {BugMap: A Topographic Map of Bugs},
booktitle = {Proc.\ ESEC/FSE},
publisher = {ACM},
pages = {647--650},
doi = {},
year = {2013},
}
Lexical Statistical Machine Translation for Language Migration
Anh Tuan Nguyen, Tung Thanh Nguyen, and Tien N. Nguyen
(Iowa State University, USA)
Prior research has shown that source code also exhibits naturalness, i.e., it is written by humans and is likely to be repetitive. The researchers also showed that the n-gram language model is useful in predicting the next token in a source file given a large corpus of existing source code. In this paper, we investigate how well statistical machine translation (SMT) models for natural languages could help in migrating source code from one programming language to another. We treat source code as a sequence of lexical tokens and apply a phrase-based SMT model on the lexemes of those tokens. Our empirical evaluation on migrating two Java projects into C# showed that lexical, phrase-based SMT could achieve high lexical translation accuracy (BLEU of 81.3-82.6%). Users would have to manually edit only 11.9-15.8% of the total number of tokens in the resulting code to correct it. However, a high percentage of all translated methods (49.5-58.6%) is syntactically incorrect. Therefore, our result calls for a more program-oriented SMT model that is capable of better integrating the syntactic and semantic information of a program to support language migration.
@InProceedings{ESEC/FSE13p651,
author = {Anh Tuan Nguyen and Tung Thanh Nguyen and Tien N. Nguyen},
title = {Lexical Statistical Machine Translation for Language Migration},
booktitle = {Proc.\ ESEC/FSE},
publisher = {ACM},
pages = {651--654},
doi = {},
year = {2013},
}
Code Fragment Summarization
Annie T. T. Ying and Martin P. Robillard
(McGill University, Canada)
Current research in software engineering has mostly focused on the retrieval accuracy aspect but little on the presentation aspect of code examples, e.g., how code examples are presented in a result page. We investigate the feasibility of summarizing code examples to present them better. Our machine-learning-based algorithm could approximate summaries in an oracle manually generated by humans with a precision of 0.71. This result is promising, as summaries with this level of precision achieved the same level of agreement as human annotators with each other.
@InProceedings{ESEC/FSE13p655,
author = {Annie T. T. Ying and Martin P. Robillard},
title = {Code Fragment Summarization},
booktitle = {Proc.\ ESEC/FSE},
publisher = {ACM},
pages = {655--658},
doi = {},
year = {2013},
}
Understanding Software Development
Understanding Gamification Mechanisms for Software Development
Daniel J. Dubois and Giordano Tamburrelli
(Massachusetts Institute of Technology, USA; University of Lugano, Switzerland)
In this paper we outline the idea of adopting gamification techniques to engage, train, monitor, and motivate all the players involved in the development of complex software artifacts, from inception to deployment and maintenance. The paper introduces the concept of gamification and proposes a research approach to understand how its principles may be successfully applied to the process of software development. Applying gamification to software engineering is not as straightforward as it may appear, since it has to be cast to the peculiarities of this domain. Existing literature in the area has already recognized the possible use of such technology in the context of software development; however, how to design and use gamification in this context is still an open question. This leads to several research challenges, which are organized in a fascinating research agenda that is part of the contribution of this paper. Finally, to support the proposed ideas we present a preliminary experiment that shows the effect of gamification on the performance of students involved in a software engineering project.
@InProceedings{ESEC/FSE13p659,
author = {Daniel J. Dubois and Giordano Tamburrelli},
title = {Understanding Gamification Mechanisms for Software Development},
booktitle = {Proc.\ ESEC/FSE},
publisher = {ACM},
pages = {659--662},
doi = {},
year = {2013},
}
Toward Understanding the Causes of Unanswered Questions in Software Information Sites: A Case Study of Stack Overflow
Ripon K. Saha, Avigit K. Saha, and Dewayne E. Perry
(University of Texas at Austin, USA; University of Saskatchewan, Canada)
Stack Overflow is a highly successful question-answering website in the programming community, which not only provides quick solutions to programmers’ questions but is also considered a large repository of valuable software engineering knowledge. However, despite having a very engaged and active user community, Stack Overflow currently has more than 300K unanswered questions. In this paper, we perform an initial investigation to understand why these questions remain unanswered by applying a combination of statistical and data mining techniques. Our preliminary results indicate that although there are some topics that were never answered, most questions remained unanswered because they apparently are of little interest to the user community.
@InProceedings{ESEC/FSE13p663,
author = {Ripon K. Saha and Avigit K. Saha and Dewayne E. Perry},
title = {Toward Understanding the Causes of Unanswered Questions in Software Information Sites: A Case Study of Stack Overflow},
booktitle = {Proc.\ ESEC/FSE},
publisher = {ACM},
pages = {663--666},
doi = {},
year = {2013},
}
Where Is the Business Logic?
Yael Dubinsky, Yishai Feldman, and Maayan Goldstein
(IBM Research, Israel)
One of the challenges in maintaining legacy systems is to be able to
locate business logic in the code, and isolate it for different
purposes, including implementing requested changes, refactoring,
eliminating duplication, unit testing, and extracting business logic
into a rule engine. Our new idea is an iterative method to identify
the business logic in the code and visualize this information to gain
a better understanding of the logic distribution in the code, as well as
to develop a domain-specific business vocabulary. This new method
combines and extends several existing technologies, including search,
aggregation, and visualization. We evaluated the visualization method
on a large-scale application and found that it yields useful results,
provided an appropriate vocabulary is available.
@InProceedings{ESEC/FSE13p667,
author = {Yael Dubinsky and Yishai Feldman and Maayan Goldstein},
title = {Where Is the Business Logic?},
booktitle = {Proc.\ ESEC/FSE},
publisher = {ACM},
pages = {667--670},
doi = {},
year = {2013},
}
Towards Emotional Awareness in Software Development Teams
Emitza Guzman and Bernd Bruegge
(TU Munich, Germany)
Emotions play an important role in determining work results and how team members collaborate within a project. When working in large, distributed teams, members can lose awareness of the emotional state of the project. We propose an approach to improve emotional awareness in software development teams by means of quantitative emotion summaries. Our approach automatically extracts and summarizes emotions expressed in collaboration artifacts by combining probabilistic topic modeling with lexical sentiment analysis techniques. We applied the approach to 1000 collaboration artifacts produced by three development teams in a three-month period. Interviews with the teams' project leaders suggest that the proposed emotion summaries have a good correlation with the emotional state of the project, and could be useful for improving emotional awareness. However, the interviews also indicate that the current state of the summaries is not detailed enough and further improvements are needed.
@InProceedings{ESEC/FSE13p671,
author = {Emitza Guzman and Bernd Bruegge},
title = {Towards Emotional Awareness in Software Development Teams},
booktitle = {Proc.\ ESEC/FSE},
publisher = {ACM},
pages = {671--674},
doi = {},
year = {2013},
}