1st International Workshop on Crowd-based Software Development Methods and Technologies (CrowdSoft),
November 17, 2014,
Hong Kong, China
Frontmatter
Foreword
Welcome to CrowdSoft 2014, the International Workshop on Crowd-based Software Development Methods and Technologies. CrowdSoft 2014 is held in Hong Kong on November 17, 2014, co-located with FSE 2014. It aims to provide an interactive forum where researchers and professionals from multiple disciplines and domains meet and exchange ideas to explore and address the challenges brought by the crowd-based software development paradigm. CrowdSoft 2014 concentrates on two topics: (1) observations drawn from the data of different kinds of software communities, such as Linux Kernel, Mozilla Firefox, SourceForge, GitHub, StackOverflow, and OsChina, and (2) novel software development paradigms, models, technologies, and tools that better support collaborative software development and resource sharing in the crowd-based software development paradigm. The predecessor of CrowdSoft 2014 is TTA 2010 (Trustie Technologies and Applications), held in conjunction with the Chinese National Software Application Conference (NASAC 2010); its proceedings were published in the Chinese Journal of Frontiers of Computer Science and Technology, 2011, 5(10). Trustie is the abbreviation of a grand High-Tech Development project in China named “Highly Trustworthy Software Production Tools and Integration Environment”, which aims to develop novel software development models and technologies for the crowd-based software paradigm. The program committee of CrowdSoft 2014 selected 11 papers for presentation at the workshop and publication in the conference proceedings. The workshop has sessions on software crowdsourcing, crowd-based software development in GitHub, and crowd-based development for Web services and mobile apps. We hope that you will remember the new ideas, approaches and discussions in CrowdSoft 2014.
Software Crowdsourcing
Mon, Nov 17, 09:50 - 11:30, Hall 6 (Chair: Gang Yin)
How the Crowd Impacts Commercial Applications: A User-Oriented Approach
Huihong He, ZhiYi Ma, Hongjie Chen, and Weizhong Shao
(Peking University, China)
As crowdsourcing has been applied to a variety of disciplines, e.g., marketing and operationalization, more and more researchers are turning their attention to how the crowd can innovate software engineering to produce high-quality software. However, they mainly focus on the impact of domain experts or experienced developers on developing and managing open source software, whereas how software is influenced by ordinary people, e.g., end users, is seldom discussed and easily overlooked. To fill this research gap, we investigate a paradigm for improving commercial applications with the assistance of the user crowd. The approach focuses on end users by proposing a workflow loop that forms a healthy cycle between them and applications. In particular, the approach proposes a suggestion model to encourage users to participate in application runtime adaptation. So far, a prototype has been developed that enables the crowd to raise and modify their advice, and our prior work has shown the effectiveness of having applications adapt themselves in response to users' advice.
@InProceedings{CrowdSoft14p1,
author = {Huihong He and ZhiYi Ma and Hongjie Chen and Weizhong Shao},
title = {How the Crowd Impacts Commercial Applications: A User-Oriented Approach},
booktitle = {Proc.\ CrowdSoft},
publisher = {ACM},
pages = {1--6},
doi = {},
year = {2014},
}
Crowdsourcing in the Brazilian IT Industry: What We Know and What We Don’t Know
Leticia Machado, Graziela Pereira, Rafael Prikladnicki, Erran Carmel, and Cleidson R. B. de Souza
(PUCRS, Brazil; American University, USA; Federal University of Pernambuco, Brazil)
Crowdsourcing means outsourcing to a large network of people – a crowd. It has emerged as a new option for a global labor market; it is another valuable option in the 'make or buy' software decision and has been gaining attention in countries where global software engineering plays a significant role, such as Brazil. The adoption of this practice in the Brazilian IT industry is not yet well understood. For this reason, this paper presents findings from an empirical study on the topic, in the context of a multi-year study whose goal is to investigate how the Brazilian software labor and industry market is being transformed and disrupted by crowdsourcing. We interviewed professionals from several companies and identified how crowdsourcing is being adopted in Brazil, including possible benefits, main concerns, and factors that may prevent some companies from adopting it, from three different perspectives: the buyers, the platforms and the crowd. We also share our thoughts about the future of crowdsourcing in the country in the coming years.
@InProceedings{CrowdSoft14p7,
author = {Leticia Machado and Graziela Pereira and Rafael Prikladnicki and Erran Carmel and Cleidson R. B. de Souza},
title = {Crowdsourcing in the Brazilian IT Industry: What We Know and What We Don’t Know},
booktitle = {Proc.\ CrowdSoft},
publisher = {ACM},
pages = {7--12},
doi = {},
year = {2014},
}
Using Clustering and Transitivity to Reduce the Costs of Crowdsourced Entity Resolution
Lisha Guo, Hailong Sun, and Xudong Liu
(Beihang University, China)
Entity resolution (ER) is the process of identifying the data records that represent the same entity. ER is a highly important problem in software and application domains. For example, detecting duplicate bug reports with ER can save considerable development effort. In most cases, humans perform better than computer algorithms because of the complex semantic analysis involved in ER. In light of this, crowdsourcing has been successfully incorporated into ER to improve its accuracy. However, compared with computer methods, crowdsourcing incurs higher costs. In this work, we propose a method that uses clustering and transitivity analysis to reduce the number of questions asked to people. Firstly, with two appropriately chosen similarity thresholds, we use unsupervised machine learning to group records into multiple clusters on the basis of certain similarity metrics. In this way, we prune away the record pairs that do not need to be shown to people. Secondly, we design a cluster merging algorithm that efficiently selects crowdsourced questions and leverages data transitivity to detect records in different clusters that correspond to the same entity. Finally, we conduct extensive experiments with two real-world datasets, and the results show that our method significantly outperforms existing methods in terms of incurred costs and the F1 metric.
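The transitivity idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it omits the clustering and threshold steps and only shows how known answers let some pair questions be skipped. The `resolve` function and the `ask_crowd` callback are hypothetical names introduced for illustration.

```python
from itertools import combinations


class DisjointSet:
    """Union-find over record ids; records in one set are known matches."""

    def __init__(self, items):
        self.parent = {item: item for item in items}

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)


def resolve(records, ask_crowd):
    """Group records by entity, asking only when the answer cannot be inferred.

    ask_crowd(a, b) -> bool is a hypothetical stand-in for a crowd question.
    Returns (sorted groups, number of questions actually asked).
    """
    matched = DisjointSet(records)
    known_different = []  # record pairs the crowd judged to be different
    asked = 0
    for a, b in combinations(records, 2):
        ra, rb = matched.find(a), matched.find(b)
        if ra == rb:
            continue  # a = c and c = b already imply a = b
        if any({matched.find(x), matched.find(y)} == {ra, rb}
               for x, y in known_different):
            continue  # a = c and c != b already imply a != b
        asked += 1
        if ask_crowd(a, b):
            matched.union(a, b)
        else:
            known_different.append((a, b))
    groups = {}
    for record in records:
        groups.setdefault(matched.find(record), []).append(record)
    return sorted(sorted(group) for group in groups.values()), asked
```

For four records of which three refer to the same entity, this sketch asks 3 of the 6 possible pair questions; the remaining answers follow from transitivity.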
@InProceedings{CrowdSoft14p13,
author = {Lisha Guo and Hailong Sun and Xudong Liu},
title = {Using Clustering and Transitivity to Reduce the Costs of Crowdsourced Entity Resolution},
booktitle = {Proc.\ CrowdSoft},
publisher = {ACM},
pages = {13--18},
doi = {},
year = {2014},
}
iTest: Testing Software with Mobile Crowdsourcing
Minzhi Yan, Hailong Sun, and Xudong Liu
(Beihang University, China)
In recent years, many crowdsourcing systems have emerged, including successful ones such as Wikipedia, Amazon Mechanical Turk and Waze. In the field of software engineering, crowdtesting has attracted increasing interest and adoption, especially among individual developers and smaller companies. In this paper, we present iTest, which combines mobile crowdsourcing and software testing to support the testing of mobile applications and web services. iTest is a framework that lets software developers submit their software and conveniently get the test results from the crowd testers. Firstly, we analyze the key problems that need to be solved in a mobile crowdtesting platform. Secondly, we present the architecture of the iTest framework. Thirdly, we introduce the workflow of testing web services in iTest and propose an algorithm for solving the tester selection problem mentioned in Section 2. Then we explain the development kit that supports testing mobile applications. Finally, we perform two experiments to illustrate that both the network access method and the tester's location influence the performance of a web service.
@InProceedings{CrowdSoft14p19,
author = {Minzhi Yan and Hailong Sun and Xudong Liu},
title = {iTest: Testing Software with Mobile Crowdsourcing},
booktitle = {Proc.\ CrowdSoft},
publisher = {ACM},
pages = {19--24},
doi = {},
year = {2014},
}
Crowd-Based Software Development in GitHub
Mon, Nov 17, 11:30 - 12:30, Hall 6 (Chair: Wei Wang)
Recommending Relevant Projects via User Behaviour: An Exploratory Study on Github
Lingxiao Zhang, Yanzhen Zou, Bing Xie, and Zixiao Zhu
(Peking University, China)
Social coding sites (e.g., GitHub) provide various features like Forking and Sending Pull-requests to support crowd-based software engineering. When these features are used, a large amount of user behavior data is recorded. User behavior data can reflect developers' preferences and interests in software development activities. Online service providers in many fields have been using user behavior data to discover user preferences and interests for various purposes. In the field of software engineering, however, there have been few studies on mining large amounts of user behavior data. Our goal is to design an approach based on user behavior data to recommend relevant open source projects to developers, which can be helpful in activities like searching for the right open source solutions to quickly build prototypes. In this paper, we explore the possibilities of such a method by conducting a set of experiments on selected data sets from GitHub. We find that mining projects' relevance from user behavior data is a promising direction. Our study also identifies some important issues that are worth considering in this method.
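One simple way to derive project relevance from behavior data is to compare the sets of users who acted on each project. The sketch below is an illustrative assumption, not the method evaluated in the paper: it scores projects by Jaccard similarity of their user sets, and the `recommend` function and the `(user, project)` event format are hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(events, target, top_n=3):
    """Rank projects by how much their user set overlaps with the target's.

    events: iterable of (user, project) pairs, e.g. one pair per fork or
    watch action (a hypothetical input format).
    Returns up to top_n (project, score) pairs with a non-zero score.
    """
    users_by_project = defaultdict(set)
    for user, project in events:
        users_by_project[project].add(user)
    target_users = users_by_project.get(target, set())
    scores = {
        project: jaccard(target_users, users)
        for project, users in users_by_project.items()
        if project != target
    }
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(project, score) for project, score in ranked[:top_n] if score > 0]
```

Calling `recommend(events, "some-project")` returns the projects whose audiences overlap most with that project's audience; weighting different behaviors (fork, watch, pull-request) differently would be a natural refinement.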
@InProceedings{CrowdSoft14p25,
author = {Lingxiao Zhang and Yanzhen Zou and Bing Xie and Zixiao Zhu},
title = {Recommending Relevant Projects via User Behaviour: An Exploratory Study on Github},
booktitle = {Proc.\ CrowdSoft},
publisher = {ACM},
pages = {25--30},
doi = {},
year = {2014},
}
Exploring the Patterns of Social Behavior in GitHub
Yue Yu, Gang Yin, Huaimin Wang, and Tao Wang
(National University of Defense Technology, China)
The social coding paradigm has been reshaping distributed software development at a surprising speed in recent years. GitHub, a remarkable social coding community, has attracted a huge number of developers in a short time. Various kinds of social networks are formed based on social activities among developers. Why this new paradigm has been so successful in attracting external developers, and how they are connected in such a massive community, are interesting questions for revealing the power of the social coding paradigm. In this paper, we first compare the growth curves of projects and users in GitHub with those of three traditional open source software communities to explore differences in their growth modes. We find an explosive growth of users in GitHub and introduce the Diffusion of Innovation theory to illustrate the intrinsic sociological basis of this phenomenon. Secondly, we construct follow-networks according to the follow behaviors among developers in GitHub. Finally, by mining follow-networks, we present four typical social behavior patterns: independence-pattern, group-pattern, star-pattern and hub-pattern. This study can offer newcomers some guidance on crowd collaboration. Based on the typical behavior patterns, the community manager could design corresponding assistive tools for developers.
@InProceedings{CrowdSoft14p31,
author = {Yue Yu and Gang Yin and Huaimin Wang and Tao Wang},
title = {Exploring the Patterns of Social Behavior in GitHub},
booktitle = {Proc.\ CrowdSoft},
publisher = {ACM},
pages = {31--36},
doi = {},
year = {2014},
}
Investigating Social Media in GitHub’s Pull-Requests: A Case Study on Ruby on Rails
Yang Zhang, Gang Yin, Yue Yu, and Huaimin Wang
(National University of Defense Technology, China)
In GitHub, the pull-request mechanism is an outstanding social development method that integrates many social media. Many studies have reported that social media has an important effect on software development. The @-mention, a typical social medium, is a useful tool in social platforms. In this paper, we present a quantitative analysis of @-mentions in pull-requests of the Ruby on Rails project. First, we present convincing statistics on the popularity of the pull-request mechanism in GitHub. Then we investigate the current use of @-mentions in Ruby on Rails. Our empirical analysis yields some insights into @-mentions.
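A quantitative analysis of @-mentions starts with extracting them from comment text. The sketch below is a hypothetical illustration rather than the authors' tooling. The pattern only approximates GitHub's username rules (letters, digits and hyphens, at most 39 characters), and the lookbehind is an assumption meant to skip e-mail addresses.

```python
import re
from collections import Counter

# Approximate username pattern; not an official GitHub specification.
# The lookbehind rejects an "@" preceded by a word character, "@" or ".",
# which filters out e-mail addresses such as me@example.com.
MENTION = re.compile(r"(?<![\w@.])@([A-Za-z0-9][A-Za-z0-9-]{0,38})")


def count_mentions(comments):
    """Count how often each user is @-mentioned across comment texts.

    Usernames are lowercased because they are treated as case-insensitive
    here (an assumption of this sketch).
    """
    counts = Counter()
    for text in comments:
        counts.update(name.lower() for name in MENTION.findall(text))
    return counts
```

Applied to the comments of each pull-request, the resulting counters can be aggregated to see who is mentioned most often and how mention frequency varies across pull-requests.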
@InProceedings{CrowdSoft14p37,
author = {Yang Zhang and Gang Yin and Yue Yu and Huaimin Wang},
title = {Investigating Social Media in GitHub’s Pull-Requests: A Case Study on Ruby on Rails},
booktitle = {Proc.\ CrowdSoft},
publisher = {ACM},
pages = {37--41},
doi = {},
year = {2014},
}
Crowd-Based Development for Web Services and Mobile Apps
Mon, Nov 17, 14:40 - 16:30, Hall 6 (Chair: Tao Wang)
SmartHR: A Resume Query and Management System Based on Semantic Web
Yeqing Ke, Zhirou Ma, Haijiang Wu, Jie Liu, Hua Zhong, and Jun Wei
(Institute of Software at Chinese Academy of Sciences, China)
Organizations are always confronted with the challenge of efficiently finding suitable candidates among massive numbers of resumes. Traditional human resource management based on information management systems usually adopts SQL queries or keyword search, which cannot capture implicit information, while manual work is always time-consuming. To fill this gap, this paper presents SmartHR, a resume query and management system based on the semantic web. Benefiting from a knowledge base, it can understand users' intentions more intelligently and search for suitable candidates more accurately. In this paper, we identify two key technical difficulties that SmartHR faces, namely the complexity of knowledge base construction and the time-consuming semantic search, and then give appropriate solutions for each. Four channels are adopted to construct the knowledge base, which are well illustrated. Furthermore, a variety of performance optimizations are employed; their effectiveness is evaluated on real datasets of up to millions of triples, and the results show a great improvement. As a representative application of the semantic web, our practice in SmartHR provides useful experience and conclusions for developers.
@InProceedings{CrowdSoft14p42,
author = {Yeqing Ke and Zhirou Ma and Haijiang Wu and Jie Liu and Hua Zhong and Jun Wei},
title = {SmartHR: A Resume Query and Management System Based on Semantic Web},
booktitle = {Proc.\ CrowdSoft},
publisher = {ACM},
pages = {42--48},
doi = {},
year = {2014},
}
Personalized Mobile Application Discovery
Cheng Yang, Tao Wang, Gang Yin, Huaimin Wang, Ming Wu, and Ming Xiao
(National University of Defense Technology, China)
With the dramatic growth of mobile application markets, users can find apps with any function they desire in these markets. However, the huge number of apps makes it quite a challenge for users to discover good applications efficiently. Previous studies recommend applications based on download history, user ratings or app usage records. Most of these studies fail to capture users' personal interests in mobile applications precisely. In this paper, we leverage apps as features for describing a user's personal interests and propose a novel approach to personalized recommendation. We introduce a Small-Crowd model to distinguish how well apps reflect users' personal interests, and design a weighting method that ranks a user's installed apps by combining global download information with fine-grained app usage records. Extensive experiments validate the effectiveness of our approach, which outperforms a state-of-the-art method.
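The abstract does not give the weighting formula, so the following is only a guess at the general shape of such a method, not the paper's Small-Crowd model. It is a TF-IDF-style sketch in which an app counts more toward a user's interests when the user launches it often and when few people overall have downloaded it. All names and the formula itself are assumptions.

```python
from math import log


def rank_installed_apps(usage_counts, global_downloads, total_downloads):
    """Rank a user's installed apps by an assumed interest weight.

    usage_counts: {app: launches by this user}
    global_downloads: {app: downloads across all users}
    total_downloads: normalizing constant, e.g. the downloads of the most
        popular app in the market (hypothetical choice).
    weight(app) = usage share * log(total_downloads / downloads of app)
    """
    total_usage = sum(usage_counts.values())
    if total_usage == 0:
        return []
    weights = {}
    for app, launches in usage_counts.items():
        usage_share = launches / total_usage
        rarity = log(total_downloads / global_downloads[app])
        weights[app] = usage_share * rarity
    # Highest weight first; ties broken alphabetically.
    return sorted(weights, key=lambda app: (-weights[app], app))
```

Under this assumed weighting, a niche app that a user opens regularly ranks above a ubiquitous app that the user opens more often, which is the intuition behind treating small-audience apps as stronger interest signals.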
@InProceedings{CrowdSoft14p49,
author = {Cheng Yang and Tao Wang and Gang Yin and Huaimin Wang and Ming Wu and Ming Xiao},
title = {Personalized Mobile Application Discovery},
booktitle = {Proc.\ CrowdSoft},
publisher = {ACM},
pages = {49--54},
doi = {},
year = {2014},
}
A Novel Multilayered Context Awareness Technology for Internetware Evolution
Yan Hu, Qimin Peng, and Xiaohui Hu
(Institute of Software at Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China)
With the rapid development of the Internet, Internetware has attracted a lot of attention as a promising new computing paradigm. Internetware can evolve continuously at runtime as its environment changes, thereby achieving higher software performance and customer satisfaction. It is therefore important to develop effective Internetware context awareness technologies. In this paper, we propose a novel multilayered context awareness technology. First, we build a three-layered architecture for Internetware to avoid disorderly context information transmission. Then, we use the ontology modeling language OWL to formalize Internetware context, so as to form a unified understanding of context information across the whole Internetware system. Finally, an extensible context management framework is proposed to drive different components in the Internetware system to access and operate on related context information. In this context management framework, we use the publish/subscribe mechanism to distribute context information and employ ontology libraries for context persistence. With these mechanisms, context awareness technology can facilitate Internetware evolution effectively.
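The publish/subscribe mechanism mentioned in this abstract can be sketched in a few lines. This is a generic in-process illustration, not the paper's framework: the `ContextBus` class and the topic names are hypothetical, and the OWL modeling and ontology-based persistence are left out.

```python
from collections import defaultdict


class ContextBus:
    """Minimal in-process publish/subscribe hub for context updates."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register handler(topic, value) for updates on one context topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, value):
        """Deliver a context update to every subscriber of the topic.

        Returns the number of handlers that were notified.
        """
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(topic, value)
        return len(handlers)
```

A component interested in, say, bandwidth changes would call `bus.subscribe("network.bandwidth", handler)`. Sensors publish without knowing who consumes the data, which is what keeps context producers and consumers decoupled.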
@InProceedings{CrowdSoft14p55,
author = {Yan Hu and Qimin Peng and Xiaohui Hu},
title = {A Novel Multilayered Context Awareness Technology for Internetware Evolution},
booktitle = {Proc.\ CrowdSoft},
publisher = {ACM},
pages = {55--60},
doi = {},
year = {2014},
}
Estimating the Dynamic Performance of Composed Services: A Probability Theory Based Approach
Mingkun Yang, Qimin Peng, and Xiaohui Hu
(Institute of Software at Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China)
QoS-aware service composition has been widely addressed. When the optimal execution plan is determined, previous work generally assumes, for simplicity, that the QoS attributes of component services are static. Because the dynamic nature of web services is ignored, the performance of a composed service is only partially evaluated. When QoS attributes deviate, execution plans have to be adapted at runtime in an unmanageable manner. Since it is unclear how the QoS attributes will vary, the effort of runtime adaptation can hardly be assessed in advance. To solve these problems, we provide a probability theory based approach to estimate the dynamic performance of composed services. The main idea is to capture the dynamic nature of web services with random variables. Following predefined aggregation rules, some crucial dynamic aspects of a composed service can be efficiently derived from those of its component services even before the composed service actually executes, which makes it possible to determine the optimal execution plan in terms of dynamic performance metrics. We also demonstrate some potential applications based on the better understanding of performance issues provided by our method.
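To make the idea of aggregating random QoS attributes concrete, the sketch below applies two standard probability results to response time. It does not reproduce the paper's aggregation rules, which the abstract does not list. It assumes component response times are independent; the `ResponseTime`, `sequence` and `choice` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponseTime:
    """Response time modeled as a random variable by its first two moments."""
    mean: float      # expected response time, e.g. in ms
    variance: float  # variance, e.g. in ms^2


def sequence(*parts):
    """Services invoked one after another: the times add up.

    For independent random variables, means and variances both sum.
    """
    return ResponseTime(
        mean=sum(p.mean for p in parts),
        variance=sum(p.variance for p in parts),
    )


def choice(branches):
    """Exactly one branch runs, chosen with the given probability.

    branches: list of (probability, ResponseTime), probabilities summing to 1.
    The result is a mixture: E[T] = sum p_i * E[T_i] and
    Var[T] = sum p_i * (Var[T_i] + E[T_i]^2) - E[T]^2.
    """
    mean = sum(p * rt.mean for p, rt in branches)
    second_moment = sum(p * (rt.variance + rt.mean ** 2) for p, rt in branches)
    return ResponseTime(mean=mean, variance=second_moment - mean ** 2)
```

With such rules, two candidate execution plans can be compared not only by expected response time but also by variability, before either plan is executed.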
@InProceedings{CrowdSoft14p61,
author = {Mingkun Yang and Qimin Peng and Xiaohui Hu},
title = {Estimating the Dynamic Performance of Composed Services: A Probability Theory Based Approach},
booktitle = {Proc.\ CrowdSoft},
publisher = {ACM},
pages = {61--66},
doi = {},
year = {2014},
}