SANER 2017

2017 IEEE 24th International Conference on Software Analysis, Evolution, and Reengineering (SANER), February 20-24, 2017, Klagenfurt, Austria


Software Development Support
Main Research
Scalable Tag Recommendation for Software Information Sites
Pingyi Zhou, Jin Liu, Zijiang Yang, and Guangyou Zhou
(Wuhan University, China; Western Michigan University, USA; Central China Normal University, China)
Abstract: Software developers can search, share, and learn about development experience, solutions, bug fixes, and open source projects on software information sites such as StackOverflow and Freecode. Many software information sites rely on tags to classify their contents, i.e., software objects, in order to improve the performance and accuracy of various operations on the sites. The quality of tags thus has a significant impact on the usefulness of these sites. High-quality tags are expected to be concise and to describe the most important features of the software objects. Unfortunately, tagging is inherently an uncoordinated process. The choice of tags made by individual software developers depends not only on a developer's understanding of the software object but also on the developer's English skills and preferences. As a result, the number of different tags grows rapidly as software objects are continuously added. With thousands of different tags, many of which introduce noise, software objects become poorly classified. This phenomenon negatively affects the speed and accuracy of developers' queries. In this paper, we propose a tool called TagMulRec to automatically recommend tags and classify software objects in evolving large-scale software information sites. Given a new software object, TagMulRec locates the software objects that are semantically similar to the new one and exploits their tags. We have evaluated TagMulRec on four software information sites: StackOverflow, AskUbuntu, AskDifferent, and Freecode. According to our empirical study, TagMulRec is not only accurate but also scalable: it can handle a large-scale software information site with millions of software objects and thousands of tags.
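
The general idea stated in the abstract (find existing software objects similar to a new one, then reuse their tags) can be illustrated with a minimal sketch. This is not TagMulRec itself: the abstract does not specify the similarity measure or ranking scheme, so the bag-of-words cosine similarity, the similarity-weighted voting, and the names `tokenize`, `cosine`, `recommend_tags`, `k`, and `top_n` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into tokens of letters, digits, '+' and '#'."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_tags(new_text, corpus, k=2, top_n=3):
    """Recommend tags for new_text from the k most similar tagged objects.

    corpus is a list of (text, tags) pairs. Each neighbour votes for its
    tags with a weight equal to its similarity (an assumed scheme, not
    the one used in the paper).
    """
    query = Counter(tokenize(new_text))
    neighbours = sorted(
        ((cosine(query, Counter(tokenize(text))), tags) for text, tags in corpus),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    votes = Counter()
    for similarity, tags in neighbours:
        for tag in tags:
            votes[tag] += similarity
    # Drop tags whose only support comes from objects with zero similarity.
    return [tag for tag, score in votes.most_common(top_n) if score > 0]
```

A linear scan over every stored object, as above, would not scale to millions of software objects; handling that scale is the problem the paper says it addresses, which this sketch does not attempt.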

