Workshop SUITE 2012 – Author Index |
|
Akhin, Marat |
Marat Akhin, Nikolai Tillmann, Manuel Fähndrich, Jonathan de Halleux, and Michal Moskal (Saint-Petersburg State Polytechnical University, Russia; Microsoft Research, USA) Code search has always been essential to software development; it is the cornerstone of activities such as program comprehension and maintenance. Traditionally, code search required learning complex query languages with very steep learning curves. In contrast, programming environments for mobile devices targeting novice programmers are becoming popular and code search is becoming increasingly important. Yet, dedicated code query languages present a learning barrier for novice programmers. In this paper we consider search-by-example as a way of dealing with this problem. Given a query code snippet, we find all similar snippets in the codebase and present them to the user. This problem is a special instance of the clone detection problem, and, by using relevant techniques, we can perform precise code search with little to no configuration while remaining completely agnostic to code formatting, variable renaming, etc. These properties make search-by-example very easy for inexperienced programmers to use. We built a prototype of our approach in TouchDevelop, a novel mobile app development environment for Windows Phone. We will use it as a testing ground for future evaluation. |
|
Atkinson, Colin |
Werner Janjic and Colin Atkinson (University of Mannheim, Germany) Research on software reuse over the last decade has removed a lot of obstacles to its practical adoption. However, despite the claims in the software reuse literature of the 1990s, there are still some fundamental research challenges to be addressed, especially the problem of delivering "good" (i.e. high quality) search results with high precision and semantic recall. In terms of precision, one of the most promising approaches to have emerged in recent years is test-driven search, which only includes components in the result set that actually match a developer’s behavioral requirements as defined by a test case. However, the test-driven search prototypes available today have a low “semantic recall” because they are unable to find semantically matching components which have the wrong syntactic interface. In this paper we describe an automatic adaptation engine that alleviates this problem by automatically creating adapters to allow semantically mismatching components to be tested by test-driven search engines, thus significantly enhancing their semantic recall. |
|
Bhowmik, Tanmay |
Nan Niu, Sandeep Reddivari, Anas Mahmoud, Tanmay Bhowmik, and Songhua Xu (Mississippi State University, USA; Oak Ridge National Laboratory, USA) Clustering is of great practical value in retrieving reusable requirements artifacts from ever-growing software project repositories. Despite the development of automated cluster labeling techniques in information retrieval, little is understood about automatic labeling of requirements clusters. In this paper, we review the literature on cluster labeling, and conduct an experiment to evaluate how automated methods perform in labeling requirements clusters. The results show that differential labeling outperforms cluster-internal labeling, and that the hybrid method does not necessarily lead to the labels best matching human judgment. Our work sheds light on improving automated ways to support search-driven development. |
|
Bislimovska, Bojana |
Bojana Bislimovska, Alessandro Bozzon, Marco Brambilla, and Piero Fraternali (Politecnico di Milano, Italy) As the quantity of software artifacts, mainly source code and software models, stored in repositories increases, the need to search them efficiently becomes more important. In this paper we propose a content-based query (a.k.a. query-by-example) approach for searching software model repositories, in order to retrieve significant models or model fragments. The query-by-example search conveys the user need in the form of a model or pattern specified in a coarse way. Our approach incorporates analysis and indexing of models using textual information retrieval techniques, which exploit the knowledge of the metamodel the models conform to. This allows us to explore different segmentation granularities on models and different indexing techniques, ranging from a simple bag of words to index structures that integrate metamodel information. We detail the proposed theoretical framework and the implementation of the method upon open-source architectures, and we discuss the results of our experiments upon a public dataset of UML models. |
|
Bozzon, Alessandro |
Bojana Bislimovska, Alessandro Bozzon, Marco Brambilla, and Piero Fraternali (Politecnico di Milano, Italy) |
|
Brambilla, Marco |
Bojana Bislimovska, Alessandro Bozzon, Marco Brambilla, and Piero Fraternali (Politecnico di Milano, Italy) |
|
De Halleux, Jonathan |
Marat Akhin, Nikolai Tillmann, Manuel Fähndrich, Jonathan de Halleux, and Michal Moskal (Saint-Petersburg State Polytechnical University, Russia; Microsoft Research, USA) |
|
DeLine, Robert |
Adrian Kuhn and Robert DeLine (University of British Columbia, Canada; Microsoft Research, USA) Modern software development requires a large investment in learning application programming interfaces (APIs). Recent research found that the learning materials themselves are often inadequate: developers struggle to find answers beyond simple usage scenarios. Solving these problems requires a large investment in tool and search engine development. To understand where further investment would be most useful, we ran a study with 19 professional developers to understand what a solution might look like, free of technical constraints. In this paper, we report on design implications of tools for API learning, grounded in the reality of the professional developers themselves. The recurring themes in the participants' feedback were trustworthiness, confidentiality, information overload, and the need for code examples as first-class documentation artifacts. |
|
Fähndrich, Manuel |
Marat Akhin, Nikolai Tillmann, Manuel Fähndrich, Jonathan de Halleux, and Michal Moskal (Saint-Petersburg State Polytechnical University, Russia; Microsoft Research, USA) |
|
Forbes, Christopher |
Iman Keivanloo, Christopher Forbes, and Juergen Rilling (Concordia University, Canada) This paper presents an Eclipse plug-in that provides source code similarity search over source code available on the Internet. We show how our Linked Data repository (SeCold) and scalable clone search approach (SeClone) can provide the enabling technology for an open Internet-scale similarity search service. |
|
Fraternali, Piero |
Bojana Bislimovska, Alessandro Bozzon, Marco Brambilla, and Piero Fraternali (Politecnico di Milano, Italy) |
|
Janjic, Werner |
Werner Janjic and Colin Atkinson (University of Mannheim, Germany) |
|
Jridi, Jamel Eddine |
Jamel Eddine Jridi, Houari Sahraoui, and Philippe Langlais (University of Montreal, Canada) We propose an interactive querying approach for program analysis and comprehension tasks. In our approach, an analyst uses a set of basic filters (information retrieval, structural, quantitative, and user selection) to define complex queries. These queries are built following an interactive and iterative process where basic filters are selected and executed, and their results displayed, changed, and combined using predefined operators. |
|
Keivanloo, Iman |
Iman Keivanloo, Christopher Forbes, and Juergen Rilling (Concordia University, Canada) |
|
Kuhn, Adrian |
Adrian Kuhn and Robert DeLine (University of British Columbia, Canada; Microsoft Research, USA) |
|
Langlais, Philippe |
Jamel Eddine Jridi, Houari Sahraoui, and Philippe Langlais (University of Montreal, Canada) |
|
Mahmoud, Anas |
Nan Niu, Sandeep Reddivari, Anas Mahmoud, Tanmay Bhowmik, and Songhua Xu (Mississippi State University, USA; Oak Ridge National Laboratory, USA) |
|
Masuhara, Hidehiko |
Hidehiko Masuhara, Naoya Murakami, and Takuya Watanabe (University of Tokyo, Japan; Edirium, Japan) A search-based recommendation system looks in the code repository for programs that are relevant to the program being edited. Storing a large number of open source programs in the repository improves the search results, but also causes the code clone problem, i.e., recommending a set of program fragments that are almost identical. To tackle this problem, we propose a novel approach that ranks recommended programs by taking their "freshness" count into account. This short paper discusses the background of the problem and illustrates the proposed algorithm. |
|
Moskal, Michal |
Marat Akhin, Nikolai Tillmann, Manuel Fähndrich, Jonathan de Halleux, and Michal Moskal (Saint-Petersburg State Polytechnical University, Russia; Microsoft Research, USA) |
|
Murakami, Naoya |
Hidehiko Masuhara, Naoya Murakami, and Takuya Watanabe (University of Tokyo, Japan; Edirium, Japan) |
|
Niu, Nan |
Nan Niu, Sandeep Reddivari, Anas Mahmoud, Tanmay Bhowmik, and Songhua Xu (Mississippi State University, USA; Oak Ridge National Laboratory, USA) |
|
Reddivari, Sandeep |
Nan Niu, Sandeep Reddivari, Anas Mahmoud, Tanmay Bhowmik, and Songhua Xu (Mississippi State University, USA; Oak Ridge National Laboratory, USA) |
|
Reiss, Steven P. |
Steven P. Reiss (Brown University, USA) Automatically building programs has been a research goal for over 40 years. Code search technology, particularly code search combined with directed program transformations and validation, has the potential to address many of the problems related to automatic programming. In this position paper we outline an approach to using code search as a tool for generating moderate-sized programs, define three problems that will need to be addressed, and describe our first steps toward solving those problems. |
|
Rilling, Juergen |
Iman Keivanloo, Christopher Forbes, and Juergen Rilling (Concordia University, Canada) |
|
Sahraoui, Houari |
Jamel Eddine Jridi, Houari Sahraoui, and Philippe Langlais (University of Montreal, Canada) |
|
Tillmann, Nikolai |
Marat Akhin, Nikolai Tillmann, Manuel Fähndrich, Jonathan de Halleux, and Michal Moskal (Saint-Petersburg State Polytechnical University, Russia; Microsoft Research, USA) |
|
Watanabe, Takuya |
Hidehiko Masuhara, Naoya Murakami, and Takuya Watanabe (University of Tokyo, Japan; Edirium, Japan) |
|
Xu, Songhua |
Nan Niu, Sandeep Reddivari, Anas Mahmoud, Tanmay Bhowmik, and Songhua Xu (Mississippi State University, USA; Oak Ridge National Laboratory, USA) |
28 authors