Third International Workshop on Search-Driven Development: Users, Infrastructure, Tools, and Evaluation (SUITE 2011),
May 28, 2011,
Waikiki, Honolulu, HI, USA
Foreword
The third international workshop on Search-Driven Development: Users, Infrastructure, Tools and Evaluation (SUITE 2011) focuses on exploring the notion of search as a fundamental activity during software development. As software development is a process of both information creation and information gathering, software developers are constantly engaged in activities that search for the pertinent information to solve their problems at hand. The information needs of software developers range from those related to code (writing, changing, fixing, communicating code) to process (design) and people (colleagues). SUITE is a workshop series that seeks to understand and find solutions to address such a wide range of software developers’ information needs.
The goal of the workshop is to identify the search-driven nature of software development as a key research topic. Topics related to search-driven development range from core technical issues to human aspects. Therefore, SUITE 2011 aims to bring together researchers and practitioners with diverse interests and backgrounds in Software Engineering, Human Computer Interaction, Information Retrieval, etc. SUITE 2011 aims to attract and foster a community of researchers who are interested in understanding and fulfilling various information needs during software development. SUITE 2011 will be a venue to discuss the problems and state-of-the-art; share ideas and results; and set future directions in the area of Search-Driven Development.
Recommending API Methods Based on Identifier Contexts
Lars Heinemann and Benjamin Hummel
(Technische Universität München, Germany)
Reuse recommendation systems suggest functions or code snippets that are useful for the programming task at hand within the IDE. These systems utilize different aspects from the context of the cursor position within the source file being edited for inferring which functionality is needed next. Current approaches are based on structural information like inheritance relations or type/method usages. We propose a novel method that utilizes the knowledge embodied in the identifiers as a basis for the recommendation of API methods. This approach has the advantage that relevant recommendations can also be made in cases where no methods are called in the context or if contexts use distinct but semantically similar types or methods. First experiments show that the correct method is recommended in about one quarter to one third of the cases.
@InProceedings{SUITE11p1,
author = {Lars Heinemann and Benjamin Hummel},
title = {Recommending API Methods Based on Identifier Contexts},
booktitle = {Proc.\ SUITE},
publisher = {ACM},
pages = {1--4},
doi = {},
year = {2011},
}
Content-based Search of Model Repositories with Graph Matching Techniques
Bojana Bislimovska, Alessandro Bozzon, Marco Brambilla, and Piero Fraternali
(Politecnico di Milano, Italy)
Modern software project repositories provide support for both source code and design models that describe in detail the data structure, behavior, and components of an application. We propose a graph matching-based technique between software models to address content-based query (a.k.a. query by example) on project repositories so as to retrieve significant model fragments for reuse. This can be extremely valuable in a scenario where the designer has a rough idea of the model or pattern he needs, quickly sketches a coarse schema, and wants to retrieve projects that contain matching patterns (with all the details in place). Our approach encompasses the transformation of models into suitable graphs, the definition of a similarity function and an implementation within a search engine platform. In this paper we present the graph matching approach of the query model against the model repository and we evaluate different configurations of the similarity function.
@InProceedings{SUITE11p5,
author = {Bojana Bislimovska and Alessandro Bozzon and Marco Brambilla and Piero Fraternali},
title = {Content-based Search of Model Repositories with Graph Matching Techniques},
booktitle = {Proc.\ SUITE},
publisher = {ACM},
pages = {5--8},
doi = {},
year = {2011},
}
Finding Web Services via BPEL Fragment Search
Shingo Takada
(Keio University, Japan)
The development of service-oriented systems (SOS) is based on searching for services that are to be used. Much work has been done on finding individual services, and recently, work has also been done on searching for services by first searching for similar SOS, i.e., those having similar processes. But such work has focused on finding the entire process of an SOS. The developer may only want part of a process, but current work does not explicitly support it. This paper takes an approach of finding services by first finding process fragments. We take BPEL as an example of a behavioral process model that describes an SOS. We describe our approach to searching for BPEL fragments.
@InProceedings{SUITE11p9,
author = {Shingo Takada},
title = {Finding Web Services via BPEL Fragment Search},
booktitle = {Proc.\ SUITE},
publisher = {ACM},
pages = {9--12},
doi = {},
year = {2011},
}
An Algorithm Search Engine for Software Developers
Sumit Bhatia, Suppawong Tuarob, Prasenjit Mitra, and C. Lee Giles
(Pennsylvania State University, USA)
Efficient algorithms are extremely important and can be crucial for certain software projects. Even though many source code search engines have been proposed in the literature to help software developers find source code related to their needs, to our knowledge there has been no effort to develop systems that keep abreast of the latest algorithmic developments. In this paper, we describe our initial effort towards developing such an algorithm search engine. The proposed system extracts and indexes algorithms discussed in academic literature and their associated metadata. Users can search the index through a free text query interface. The source code of the proposed system, being developed as a part of a larger open source toolkit, SeerSuite, will be released in due course. We also provide directions for further research and improvements of the current system.
@InProceedings{SUITE11p13,
author = {Sumit Bhatia and Suppawong Tuarob and Prasenjit Mitra and C. Lee Giles},
title = {An Algorithm Search Engine for Software Developers},
booktitle = {Proc.\ SUITE},
publisher = {ACM},
pages = {13--16},
doi = {},
year = {2011},
}
A Spontaneous Code Recommendation Tool Based on Associative Search
Watanabe Takuya and Hidehiko Masuhara
(University of Tokyo, Japan)
We present Selene, a source code recommendation tool based on an
associative search engine. It spontaneously searches and displays
example programs while the developer is editing a program text. By
using an associative search engine, it can search a repository of
two million example programs within a few seconds. This paper
discusses issues that are revealed by our ongoing implementation of Selene,
in particular those of performance, similarity measures and user
interface.
@InProceedings{SUITE11p17,
author = {Watanabe Takuya and Hidehiko Masuhara},
title = {A Spontaneous Code Recommendation Tool Based on Associative Search},
booktitle = {Proc.\ SUITE},
publisher = {ACM},
pages = {17--20},
doi = {},
year = {2011},
}
Discrepancy Discovery in Search-Enhanced Testing
Werner Janjic, Florian Barth, Oliver Hummel, and Colin Atkinson
(University of Mannheim, Germany)
Automating software testing can significantly reduce the time and effort
required to assure the quality of software systems, and over recent years
significant strides have been made in test automation techniques. However, one
aspect of software testing that has always resisted full automation is the
determination of the expected results for given system states and input values
-- the so-called ``oracle problem''. Fortunately, the recent advent of a new
generation of software search engines containing millions of reusable software
artifacts offers an elegant solution to this dilemma. Once a search engine is
able to deliver multiple results that conform to a given specification (by
searching for and adapting preexisting components), multi-version testing of
software with ``harvested'' oracles becomes a feasible alternative to manual
oracle definition. In this paper we present an approach to Search-Enhanced
Testing with a focus on the discovery of discrepancies between the results
returned by harvested test oracles and a Component Under Test for randomly
generated test invocations. Our current research focuses on validating
the hypothesis that human test engineers will find more defects when
analyzing such automatically discovered discrepancies than when developing test
cases using traditional coverage criteria.
@InProceedings{SUITE11p21,
author = {Werner Janjic and Florian Barth and Oliver Hummel and Colin Atkinson},
title = {Discrepancy Discovery in Search-Enhanced Testing},
booktitle = {Proc.\ SUITE},
publisher = {ACM},
pages = {21--24},
doi = {},
year = {2011},
}
Towards Sharing Source Code Facts Using Linked Data
Iman Keivanloo, Christopher Forbes, Juergen Rilling, and Philippe Charland
(Concordia University, Canada; Defence R&D, Canada)
Linked Data is designed to support interoperability and sharing of open datasets by allowing on-the-fly inter-linking of data using the basic layers of the Semantic Web and the HTTP protocol. In our research, we focus on providing a Uniform Resource Locator (URL) generation schema and a supporting ontological representation for the inter-linking of data extracted from source code ecosystems. As a result, we created the Source code ECOsystem Linked Data (SECOLD) framework that adheres to the Linked Data publication standard. The framework not only provides source code and facts that are usable by both humans and machines for browsing or querying, but will also assist the research community at large in sharing and utilizing a standardized source code representation. The dataset has been submitted and registered to ckan.net, under the SECOLD project name, as the first source code Linked Data repository. In order to maintain its relevance to the research community, we plan to update the dataset every four months.
@InProceedings{SUITE11p25,
author = {Iman Keivanloo and Christopher Forbes and Juergen Rilling and Philippe Charland},
title = {Towards Sharing Source Code Facts Using Linked Data},
booktitle = {Proc.\ SUITE},
publisher = {ACM},
pages = {25--28},
doi = {},
year = {2011},
}
A Prolog-based Framework for Search, Integration and Empirical Analysis on Software Evolution Data
Pamela Bhattacharya and Iulian Neamtiu
(UC Riverside, USA)
Software projects use different repositories for storing project and evolution information such as source code, bugs and patches. An integrated system that combines these multiple repositories and can answer a broad range of queries regarding the project’s evolution history would be beneficial to both software developers and researchers. For example, the list of source code changes or the list of developers associated with a bug fix are frequent queries for both developers and researchers. Integrating and gathering this information is a tedious, cumbersome, error-prone process when done manually, especially for large projects. Previous approaches to this problem use frameworks that limit the user to a set of pre-defined query templates, or use query languages with limited power. In this paper, we argue the need for a framework built with recursively enumerable languages, that can answer temporal queries, and supports negation and recursion. As a first step toward such a framework, we present a Prolog-based system that we built, along with an evaluation of real-world integrated data from the Firefox project. Our system allows for elegant and concise, yet powerful queries, and can be used by developers and researchers for frequent development and empirical analysis tasks.
@InProceedings{SUITE11p29,
author = {Pamela Bhattacharya and Iulian Neamtiu},
title = {A Prolog-based Framework for Search, Integration and Empirical Analysis on Software Evolution Data},
booktitle = {Proc.\ SUITE},
publisher = {ACM},
pages = {29--32},
doi = {},
year = {2011},
}
What Do Developers Search for in Source Code and Why
Oleksandr Panchenko, Hasso Plattner, and Alexander Zeier
(Hasso Plattner Institute for Software Systems Engineering, Germany)
Source code search is an important tool used by software engineers. However, until now relatively little has been known about what developers search for in source code and why. This paper addresses this knowledge gap. We present the results of a log file analysis of a source code search engine. The data from the log file was analyzed together with the change history of four development and maintenance systems. The results show that most of the search targets were not changed after being downloaded; thus, we concluded that the developers conducted searches to find reusable components, to obtain coding examples or to perform impact analysis. In contrast, maintainers often change the code they have downloaded. Moreover, we automatically categorized the search queries. The most popular categories were: method name, structural pattern, and keyword. The major search target was a statement. Although the selected data set was small, the deviations between the systems were negligible; therefore, we conclude that our results are valid.
@InProceedings{SUITE11p33,
author = {Oleksandr Panchenko and Hasso Plattner and Alexander Zeier},
title = {What Do Developers Search for in Source Code and Why},
booktitle = {Proc.\ SUITE},
publisher = {ACM},
pages = {33--36},
doi = {},
year = {2011},
}
Investigating How to Effectively Combine Static Concern Location Techniques
Emily Hill, Lori Pollock, and K. Vijay-Shanker
(Montclair State University, USA; University of Delaware, USA)
As software systems continue to grow and evolve, locating code for maintenance tasks becomes increasingly difficult. Studies have shown that combining static global concern location techniques like search with more structure-based local techniques can improve effectiveness. However, no studies have yet investigated why this occurs. In this paper, we investigate why combining global and local techniques improves effectiveness, and under what conditions. We explore such questions as: “What are the limits of lexical information in locating concerns?”, “How far away does a local technique have to go to locate the remaining relevant elements?”, and “How sensitive are these results to the query or scoring thresholds of the techniques?”. The results of our study can inform design decisions to maximize effective global and local combinations in future concern location techniques.
@InProceedings{SUITE11p37,
author = {Emily Hill and Lori Pollock and K. Vijay-Shanker},
title = {Investigating How to Effectively Combine Static Concern Location Techniques},
booktitle = {Proc.\ SUITE},
publisher = {ACM},
pages = {37--40},
doi = {},
year = {2011},
}
What Kinds of Development Problems Can Be Solved by Searching the Web?: A Field Study
Rosalva E. Gallardo-Valencia and Susan Elliott Sim
(UC Irvine, USA)
@InProceedings{SUITE11p41,
author = {Rosalva E. Gallardo-Valencia and Susan Elliott Sim},
title = {What Kinds of Development Problems Can Be Solved by Searching the Web?: A Field Study},
booktitle = {Proc.\ SUITE},
publisher = {ACM},
pages = {41--40},
doi = {},
year = {2011},
}