Android Apps and User Feedback: A Dataset for Software Evolution and Quality Improvement by Giovanni Grano, Andrea Di Sorbo, Francesco Mercaldo, Corrado A. Visaggio, Sebastiano Panichella, Gerardo Canfora

posted 24 Jul 2017, 10:24 by Gerardo Canfora [ updated 24 Jul 2017, 10:24 ]

Nowadays, Android represents the most popular mobile platform with a market share of around 80%. Previous research showed that data contained in user reviews and code change history of mobile apps represent a rich source of information for reducing software maintenance and development effort and for increasing customer satisfaction. Stemming from this observation, we present in this paper a large dataset of Android applications belonging to 23 different app categories, which provides an overview of the types of feedback users report on the apps and documents the evolution of the related code metrics. The dataset contains about 395 applications of the F-Droid repository, including around 600 versions, 280,000 user reviews and more than 450,000 user feedback items (extracted with specific text mining approaches). Furthermore, for each app version in our dataset, we employed the Paprika tool and developed several Python scripts to detect 8 different code smells and compute 22 code quality indicators. The paper discusses the potential usefulness of the dataset for future research in the field.
Dataset URL:
2nd International Workshop on App Market Analytics, WAMA 2017 (in conjunction with ESEC/FSE 2017)

Beacon-based context-aware architecture for crowd sensing public transportation scheduling and user habits by Danilo Cianciulli, Gerardo Canfora and Eugenio Zimeo

posted 19 May 2017, 00:05 by Gerardo Canfora

Crowdsourcing and crowd sensing are relatively recent paradigms that, enabled by the pervasiveness of mobile devices, allow users to transparently contribute to complex problem solving. Their effectiveness depends on people's voluntary participation, and this could limit their adoption. Recent technologies for automating context-awareness could give a significant impulse to the spread of crowdsourcing paradigms. In this paper, we propose a distributed software system that exploits mobile devices to improve public transportation efficiency. It takes advantage of the large number of deployed personal mobile devices and uses them both as mule sensors, in cooperation with beacon technology for geofencing, and as clients for getting information about bus positions and estimated arrival times. The paper discusses the prototype architecture, its basic application for getting dynamic bus information, and the long-term scope in supporting transportation companies and municipalities, reducing costs, and improving bus lines, urban mobility and planning.
The International Workshop on Smart Cities Systems Engineering (SCE 2017)

How Open Source Projects use Static Code Analysis Tools in Continuous Integration Pipelines by Fiorella Zampetti, Simone Scalabrino, Rocco Oliveto, Gerardo Canfora, Massimiliano Di Penta

posted 12 May 2017, 02:18 by Gerardo Canfora

Static analysis tools are often used by software developers to enable early detection of potential faults, vulnerabilities, and code smells, or to assess the source code's adherence to coding standards and guidelines. Also, their adoption within Continuous Integration (CI) pipelines has been advocated by researchers and practitioners. This paper studies the usage of static analysis tools in 20 Java open source projects hosted on GitHub and using Travis CI as continuous integration infrastructure. Specifically, we investigate (i) which tools are being used and how they are configured for the CI, (ii) what types of issues make the build fail or raise warnings, and (iii) whether, how, and after how long broken builds and warnings are resolved. Results indicate that in the analyzed projects build breakages due to static analysis tools are mainly related to adherence to coding standards, and there is also some attention to missing licenses. Build failures related to tools identifying potential bugs or vulnerabilities occur less frequently, and in some cases such tools are activated in a “softer” mode, without making the build fail. Also, the study reveals that build breakages due to static analysis tools are quickly fixed by actually solving the problem, rather than by disabling the warning, and are often properly documented.
Proc. of 14th International Conference on Mining Software Repositories (MSR 2017) - May 20-21, 2017. Buenos Aires, Argentina.

ARENA: An Approach for the Automated Generation of Release Notes by Laura Moreno, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, Andrian Marcus and Gerardo Canfora

posted 23 Mar 2017, 03:24 by Gerardo Canfora

Release notes document corrections, enhancements, and, in general, changes that were implemented in a new release of a software project. They are usually created manually and may include hundreds of different items, such as descriptions of new features, bug fixes, structural changes, new or deprecated APIs, and changes to software licenses. Thus, producing them can be a time-consuming and daunting task. This paper describes ARENA (Automatic RElease Notes generAtor), an approach for the automatic generation of release notes. ARENA extracts changes from the source code, summarizes them, and integrates them with information from versioning systems and issue trackers. ARENA was designed based on the manual analysis of 990 existing release notes. In order to evaluate the quality of the release notes automatically generated by ARENA, we performed four empirical studies involving a total of 56 participants (48 professional developers and 8 students). The obtained results indicate that the generated release notes are very good approximations of the ones manually produced by developers and often include important information that is missing in the manually created release notes.
IEEE Trans. Software Eng. 43(2): 106-127 (2017)

Data Leakage in Mobile Malware: the what, the why and the how by Corrado Aaron Visaggio, Gerardo Canfora, Luigi Gentile, Francesco Mercaldo

posted 7 Feb 2017, 13:14 by Gerardo Canfora [ updated 7 Feb 2017, 13:15 ]

Mobile technologies are spreading at a very quick pace. Unlike desktop PCs, smartphones, tablets and wearable devices manage a lot of sensitive information about the device’s owner. For this reason they represent a very appealing opportunity for attackers to write malicious apps that are able to steal such information. In this paper we analyse a huge set of Android malware samples in order to discover which kinds of data are exfiltrated from mobile devices and which mechanisms malware writers leverage. For this analysis we employed three tools that are considered the state of the art of the available technology: Flowdroid, Amandroid, and Epicc. Our results show that mobile malware usually exposes users to a massive data leakage.
In: "Intrusion Detection and Prevention for Mobile Ecosystems" (Taylor and Francis publisher), edited by George Kambourakis, Asaf Shabtai, Konstantinos Kolias, and Dimitrios Damopoulos.

SURF: Summarizer of User Reviews Feedback by Andrea Di Sorbo, Sebastiano Panichella, Carol V. Alexandru, Corrado A. Visaggio, Gerardo Canfora

posted 7 Feb 2017, 13:00 by Gerardo Canfora [ updated 7 Feb 2017, 13:03 ]

Continuous Delivery (CD) enables mobile developers to release small, high quality chunks of working software in a rapid manner. However, faster delivery and higher software quality guarantee neither user satisfaction nor positive business outcomes. Previous work demonstrates that app reviews may contain crucial information that can guide developers' software maintenance efforts to obtain higher customer satisfaction. However, previous work also shows the difficulties encountered by developers in manually analyzing this rich source of data, namely (i) the huge number of reviews an app may receive on a daily basis and (ii) the unstructured nature of their content. In this paper, we propose SURF (Summarizer of User Reviews Feedback), a tool able to (i) analyze and classify the information contained in app reviews and (ii) distill actionable change tasks for improving mobile applications. Specifically, SURF performs a systematic summarization of thousands of user reviews through the generation of an interactive, structured and condensed agenda of recommended software changes. An end-to-end evaluation of SURF, involving 2622 reviews related to 12 different mobile applications, demonstrates the high accuracy of SURF in summarizing user reviews content. In evaluating our approach we also involve the original developers of some apps, who confirm the practical usefulness of the software change recommendations made by SURF.
A tool demo at 39th International Conference on Software Engineering (ICSE 2017)

Demo URL:
Demo webpage:

Estimating the number of remaining links in traceability recovery by Davide Falessi, Massimiliano Di Penta, Gerardo Canfora & Giovanni Cantone

posted 10 Nov 2016, 13:40 by Gerardo Canfora

Although very important in software engineering, establishing traceability links between software artifacts is extremely tedious and error-prone, and it requires significant effort. Even when approaches for automated traceability recovery exist, these provide the requirements analyst with a ranked list of candidate links, usually a very long one, that needs to be manually inspected. In this paper we introduce an approach called Estimation of the Number of Remaining Links (ENRL), which aims at estimating, via Machine Learning (ML) classifiers, the number of remaining positive links in a ranked list of candidate traceability links produced by a recovery approach based on Natural Language Processing (NLP) techniques. We have evaluated the accuracy of the ENRL approach by considering several ML classifiers and NLP techniques on three datasets from industry and academia, concerning traceability links among different kinds of software artifacts including requirements, use cases, design documents, source code, and test cases. Results from our study indicate that: (i) specific estimation models are able to provide accurate estimates of the number of remaining positive links; (ii) the estimation accuracy depends on the choice of the NLP technique; and (iii) univariate estimation models outperform multivariate ones.
Empirical Software Engineering, DOI 10.1007/s10664-016-9460-6 

What Would Users Change in My App? Summarizing App Reviews for Recommending Software Changes by Andrea Di Sorbo, Sebastiano Panichella, Carol V. Alexandru, Junji Shimagaki, Corrado A. Visaggio, Gerardo Canfora, Harald Gall

posted 11 Sep 2016, 10:57 by Gerardo Canfora [ updated 11 Sep 2016, 10:58 ]

Mobile app developers constantly monitor feedback in user reviews with the goal of improving their mobile apps and better meeting user expectations. Thus, automated approaches have been proposed in the literature with the aim of reducing the effort required for analyzing feedback contained in user reviews via automatic classification/prioritization according to specific topics. In this paper, we introduce SURF (Summarizer of User Reviews Feedback), a novel approach to condense the enormous amount of information that developers of popular apps have to manage due to user feedback received on a daily basis. SURF relies on a conceptual model for capturing user needs useful for developers performing maintenance and evolution tasks. Then it uses sophisticated summarization techniques for summarizing thousands of reviews and generating an interactive, structured and condensed agenda of recommended software changes. We performed an end-to-end evaluation of SURF on user reviews of 17 mobile apps (5 of them developed by Sony Mobile), involving 23 developers and researchers in total. Results demonstrate high accuracy of SURF in summarizing reviews and the usefulness of the recommended changes. In evaluating our approach we found that SURF helps developers better understand user needs, substantially reducing the time required by developers compared to manually analyzing user (change) requests and planning future software changes.
Proc. of 24th ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE 2016)

ARdoc: App Reviews Development Oriented Classifier by Sebastiano Panichella, Andrea Di Sorbo, Emitza Guzman, Corrado A. Visaggio, Gerardo Canfora, Harald Gall

posted 11 Sep 2016, 10:46 by Gerardo Canfora [ updated 17 Nov 2016, 15:02 ]

Google Play, Apple App Store and Windows Phone Store are well known distribution platforms where users can download mobile apps, rate them and write review comments about the apps they are using. Previous research studies demonstrated that these reviews contain important information to help developers improve their apps. However, analyzing reviews is challenging due to the large number of reviews posted every day, the unstructured nature of reviews and their varying quality. In this demo we present ARdoc, a tool which combines three techniques, (1) Natural Language Parsing, (2) Text Analysis and (3) Sentiment Analysis, to automatically classify useful feedback contained in app reviews that is important for performing software maintenance and evolution tasks. Our quantitative and qualitative analysis (involving professional mobile developers) demonstrates that ARdoc correctly classifies feedback useful for maintenance perspectives in user reviews with high precision (ranging between 84% and 89%), recall (ranging between 84% and 89%), and F-Measure (ranging between 84% and 89%). While evaluating our tool, the developers in our study confirmed the usefulness of ARdoc in extracting important maintenance tasks for their mobile applications.
A tool demo at 24th ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE'16)

Download the final version:
ARdoc: app reviews development oriented classifier
Sebastiano Panichella, Andrea Di Sorbo, Emitza Guzman, Corrado A. Visaggio, Gerardo Canfora, Harald C. Gall
FSE 2016 Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 2016

How I met your mother? An empirical study about Android Malware Phylogenesis by Gerardo Canfora, Francesco Mercaldo, Antonio Pirozzi and Corrado Aaron Visaggio

posted 31 May 2016, 14:51 by Gerardo Canfora

New malware is often not really new: malware writers tend to add functionality to existing malware, or merge different pieces of existing malware code. This leads to a proliferation of variants of the same malware, which are logically grouped in “malware families”. Being able to recognize the family a malware sample belongs to is useful for malware analysis, fast infection response, and quick incident resolution. In this paper we introduce DescentDroid, a tool that traces back the malware descendant family. We evaluate our technique on a real-world dataset of malicious applications labelled with the family they belong to, obtaining high precision in recognizing the malware family membership.
Proc. of 13th International Joint Conference on Security and Cryptography (SECRYPT-2016)
