A Study on the Interplay between Pull Request Review and Continuous Integration Builds by Fiorella Zampetti, Gabriele Bavota, Gerardo Canfora and Massimiliano Di Penta

pubblicato 11 feb 2019, 12:38 da Gerardo Canfora

Modern code review (MCR) is nowadays well- adopted in industrial and open source projects. Recent studies have investigated how developers perceive its ability to foster code quality, developers' code ownership, and team building. MCR is often being used with automated quality checks through static analysis tools, testing or, ultimately, through automated builds on a Continuous Integration (CI) infrastructure. With the aim of understanding how developers use the outcome of CI builds during code review and, more specifically, during the discussion of pull requests, this paper empirically investigates the interplay between pull request discussion and the use of CI by means of 64,865 pull request discussions belonging to 69 open source projects. After having analyzed to what extent a build outcome influences the pull request merger, we qualitatively analyze the content of 857 pull request discussions. Also, we complement such an analysis with a survey involving 13 developers. While pull requests with passed build have a higher chance of being merged than failed ones, and while survey participants confirmed this quantitative finding, other process-related factors play a more important role in the pull request merge decision. Also, the survey participants point out cases where a pull request can be merged in presence of a CI failure, e.g., when a new pull request is opened to cope with the failure, when the failure is due to minor static analysis warnings. The study also indicates that CI introduces extra complexity, as in many pull requests developers have to solve non-trivial CI configuration issues.
Proc. of 26th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER'19) Zhejiang University, Hangzhou, China - February 24-27, 2019

A Nlp-based Solution to Prevent from Privacy Leaks in Social Network Posts by G. Canfora, A. Di Sorbo, E. Emanuele, S. Forootani, C.A. Visaggio

pubblicato 7 set 2018, 14:04 da Gerardo Canfora

Private and sensitive information is often revealed in posts appearing in Social Networks (SN). This is due to the users’ willingness to increase their interactions within specific social groups, but also to a poor knowledge about the risks for privacy. We argue that technologies able to evaluate the sensitiveness of information while it is being published could enhance privacy protection by warning the user about the risks deriving from the disclosure of a certain information. To this aim, we propose a method, and an accompanying tool, to automatically intercept the sensitive information which is delivered in a social network post, through the exploitation of recurrent natural language patterns that are often used by users to disclose private data. A comparison with several machine learning techniques reveals that our method outperforms them, since it is more precise, accurate and not dependent on (i) a specific training set, or (ii) the selection of particular features.
Proc. of 13th International ARES Conference on Availability, Reliability and Security (ARES 2018) - Hamburg, Germany, August 27-30, 2018

LEILA: formaL tool for idEntifying mobIle maLicious behAviour by Gerardo Canfora, Fabio Martinelli, Francesco Mercaldo, Vittoria Nardone, Antonella Santone, Corrado Aaron Visaggio

pubblicato 4 mag 2018, 05:35 da Gerardo Canfora

With the increasing diffusion of mobile technologies, nowadays mobile devices represent an irreplaceable tool to perform several operations, from posting a status on a social network to transfer money between bank accounts. As a consequence, mobile devices store a huge amount of private and sensitive information and this is the reason why attackers are developing very sophisticated techniques to extort data and money from our devices. This paper presents the design and the implementation of LEILA (formaL tool for idEntifying mobIle maLicious behAviour), a tool targeted at Android malware families detection. LEILA is based on a novel approach that exploits model checking to analyse and verify the Java Bytecode that is produced when the source code is compiled. After a thorough description of the method used for Android malware families detection, we report the experiments we have conducted using LEILA. The experiments demonstrated that the tool is effective in detecting malicious behaviour and, especially, in localizing the payload within the code: we evaluated real-world malware belonging to several widespread families obtaining an accuracy ranging between 0.97 and 1.
IEEE Transactions on Software Engineering (accepted)

The Relation between Developers’ Communication and Fix-Inducing Changes: An Empirical Study by Mario Luca Bernardi, Gerardo Canfora, Giuseppe A. Di Lucca, Massimiliano Di Penta, Damiano Distante

pubblicato 1 mar 2018, 06:29 da Gerardo Canfora

Background Many open source and industrial projects involve several developers spread around the world and working in different timezones. Such developers usually communicate through mailing lists, issue tracking systems or chats. Lack of adequate communication can create misunderstanding and could possibly cause the introduction of bugs.
Aim This paper aims at investigating the relation between the bug inducing and fixing phenomenon and the lack of written communication between committers in open source projects.
Method We performed an empirical study that involved four open source projects, namely Apache httpd, GNU GCC, Mozilla Firefox, and Xorg Xserver. For each project change history data, issue tracker comments, mailing list messages, and chat logs were analyzed in order to answer four research questions about the relation between the social importance and communication level of committers and their proneness to induce bug xes.
Results and implications Results indicate that the majority of bugs are fixed by committers who did not induce them, a smaller but substantial percentage of bugs is xed by committers that induced them, and very few bugs are fixed by committers that were not directly involved in previous changes on the same les of the x. More importantly, committers inducing xes tend to have a lower level of communication between each other than that of other committers. This last finding suggests that increasing the level of communication between x-inducing committers could reduce the number of xes induced in a software project.
Journal of Systems and Software (to appear)

Android Apps and User Feedback: A Dataset for Software Evolution and Quality Improvement by Giovanni Grano, Andrea Di Sorbo, Francesco Mercaldo, Corrado A. Visaggio, Sebastiano Panichella, Gerardo Canfora

pubblicato 24 lug 2017, 10:24 da Gerardo Canfora   [ aggiornato in data 24 lug 2017, 10:24 ]

Nowadays, Android represents the most popular mobile platform with a market share of around 80%. Previous research showed that data contained in user reviews and code change history of mobile apps represent a rich source of information for reducing software maintenance and development effort, increasing customers’ satisfaction. Stemming from this observation, we present in this paper a large dataset of Android applications belonging to 23 different apps categories, which provides an overview of the types of feedback users report on the apps and documents the evolution of the related code metrics. The dataset contains about 395 applications of the F-Droid repository, including around 600 versions, 280,000 user reviews and more than 450,000 user feedback (extracted with specific text mining approaches). Furthermore, for each app version in our dataset, we employed the Paprika tool and developed several Python scripts to detect 8 different code smells and compute 22 code quality indicators. The paper discusses the potential usefulness of the dataset for future research in the field.
Dataset URL:
2nd International Workshop on App Market Analytics, WAMA 2017 (in conjunction with ESEC/FSE 2017)

Beacon-based context-aware architecture for crowd sensing public transportation scheduling and user habits by Danilo Cianciulli, Gerardo Canfora and Eugenio Zimeo

pubblicato 19 mag 2017, 00:05 da Gerardo Canfora

Crowd sourcing and sensing are relatively recent paradigms that, enabled by the pervasiveness of mobile devices, allow users to transparently contribute in complex problem solving. Their effectiveness depends on people voluntarism, and this could limit their adoption. Recent technologies for automating context-awareness could give a significant impulse to spread crowdsourcing paradigms. In this paper, we propose a distributed software system that exploits mobile devices to improve public transportation efficiency. It takes advantage of the large number of deployed personal mobile devices and uses them as both mule sensors, in cooperation with beacon technology for geofecing, and clients for getting information about bus positions and estimated arrival times. The paper discusses the prototype architecture, its basic application for getting dynamic bus information, and the long-term scope in supporting transportation companies and municipalities, reducing costs, improving bus lines, urban mobility and planning.
The International Workshop on Smart Cities Systems Engineering (SCE 2017)

How Open Source Projects use Static Code Analysis Tools in Continuous Integration Pipelines by Fiorella Zampetti, Simone Scalabrino, Rocco Oliveto, Gerardo Canfora, Massimiliano Di Penta

pubblicato 12 mag 2017, 02:18 da Gerardo Canfora

Static analysis tools are often used by software devel- opers to entail early detection of potential faults, vulnerabilities, code smells, or to assess the source code adherence to coding standards and guidelines. Also, their adoption within Continuous Integration (CI) pipelines has been advocated by researchers and practitioners. This paper studies the usage of static analysis tools in 20 Java open source projects hosted on GitHub and using Travis CI as continuous integration infrastructure. Specifically, we investigate (i) which tools are being used and how they are configured for the CI, (ii) what types of issues make the build fail or raise warnings, and (iii) whether, how, and after how long are broken builds and warnings resolved. Results indicate that in the analyzed projects build breakages due to static analysis tools are mainly related to adherence to coding standards, and there is also some attention to missing licenses. Build failures related to tools identifying potential bugs or vulnerabilities occur less frequently, and in some cases such tools are activated in a “softer” mode, without making the build fail. Also, the study reveals that build breakages due to static analysis tools are quickly fixed by actually solving the problem, rather than by disabling the warning, and are often properly documented.
Proc. of 14th International Conference on Mining Software Repositories (MSR 2017) - May 20-21, 2017. Buenos Aires, Argentina.

ARENA: An Approach for the Automated Generation of Release Notes by Laura Moreno, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, Andrian Marcus and Gerardo Canfora

pubblicato 23 mar 2017, 03:24 da Gerardo Canfora

Release notes document corrections, enhancements, and, in general, changes that were implemented in a new release of a software project. They are usually created manually and may include hundreds of different items, such as descriptions of new features, bug fixes, structural changes, new or deprecated APIs, and changes to software licenses. Thus, producing them can be a time-consuming and daunting task. This paper describes ARENA (Automatic RElease Notes generAtor), an approach for the automatic generation of release notes. ARENA extracts changes from the source code, summarizes them, and integrates them with information from versioning systems and issue trackers. ARENA was designed based on the manual analysis of 990 existing release notes. In order to evaluate the quality of the release notes automatically generated by ARENA, we performed four empirical studies involving a total of 56 participants (48 professional developers and 8 students). The obtained results indicate that the generated release notes are very good approximations of the ones manually produced by developers and often include important information that is missing in the manually created release notes.
IEEE Trans. Software Eng. 43(2): 106-127 (2017)

Data Leakage in Mobile Malware: the what, the why and the how by Corrado Aaron Visaggio, Gerardo Canfora, Luigi Gentile, Francesco Mercaldo

pubblicato 7 feb 2017, 13:14 da Gerardo Canfora   [ aggiornato in data 7 feb 2017, 13:15 ]

Mobile technologies are spreading at a very quick pace. Differently from desktop PCs, smartphones, tablets and wearable devices, manage a lot of sensitive information of the device’s owner. For this reason they represent a very appealing opportunity for attackers to write malicious apps that are able to steal such information. In this paper we analyse a huge set of Android malwares in order to discover which kind of data is exfiltrated from mobile devices and which are the mechanisms that malware writers leverage. For this analysis three tools were employed which are considered the state of the art of the available technology: Flowdroid, Amandroid, and Epicc. Our results show that mobile malware usually exposes users to a massive data leakage.
In: "Intrusion Detection and Prevention for Mobile Ecosystems" (Taylor and Francis publisher), edited by George Kambourakis, Asaf Shabtai, Konstantinos Kolias, and Dimitrios Damopoulos.

SURF: Summarizer of User Reviews Feedback by Andrea Di Sorbo, Sebastiano Panichella, Carol V. Alexandru, Corrado A. Visaggio, Gerardo Canfora

pubblicato 7 feb 2017, 13:00 da Gerardo Canfora   [ aggiornato in data 7 feb 2017, 13:03 ]

Continuous Delivery (CD) enables mobile developers to release small, high quality chunks of working software in a rapid manner. However, faster delivery and a higher software quality do neither guarantee user satisfaction nor positive business outcomes. Previous work demonstrates that app reviews may contain crucial information that can guide developer’s software maintenance efforts to obtain higher customer satisfaction. However, previous work also proves the difficulties encountered by developers in manually analyzing this rich source of data, namely (i) the huge amount of reviews an app may receive on a daily basis and (ii) the unstructured nature of their content. In this paper, we propose SURF (Summarizer of User Reviews Feedback), a tool able to (i) analyze and classify the information contained in app reviews and (ii) distill actionable change tasks for improving mobile applications. Specifically, SURF performs a systematic summarization of thousands of user reviews through the generation of an interactive, structured and condensed agenda of recommended software changes. An endto-end evaluation of SURF, involving 2622 reviews related to 12 different mobile applications, demonstrates the high accuracy of SURF in summarizing user reviews content. In evaluating our approach we also involve the original developers of some apps, who confirm the practical usefulness of the software change recommendations made by SURF.
A tool demo at 39th International Conference on Software Engineering (ICSE 2017)

Video di YouTube

Demo URL:
Demo webpage:

1-10 of 73