Preprints

LEILA: formaL tool for idEntifying mobIle maLicious behAviour by Gerardo Canfora, Fabio Martinelli, Francesco Mercaldo, Vittoria Nardone, Antonella Santone, Corrado Aaron Visaggio

posted 4 May 2018, 05:35 by Gerardo Canfora

With the increasing diffusion of mobile technologies, mobile devices nowadays represent an irreplaceable tool for performing many operations, from posting a status on a social network to transferring money between bank accounts. As a consequence, mobile devices store a huge amount of private and sensitive information, which is why attackers are developing very sophisticated techniques to extort data and money from our devices. This paper presents the design and implementation of LEILA (formaL tool for idEntifying mobIle maLicious behAviour), a tool for detecting Android malware families. LEILA is based on a novel approach that exploits model checking to analyse and verify the Java Bytecode produced when the source code is compiled. After a thorough description of the method used for detecting Android malware families, we report the experiments we conducted using LEILA. The experiments demonstrate that the tool is effective in detecting malicious behaviour and, especially, in localizing the payload within the code: we evaluated real-world malware belonging to several widespread families, obtaining an accuracy ranging between 0.97 and 1.
IEEE Transactions on Software Engineering (accepted)
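As a rough illustration of the kind of behavioural property such an analysis can check (not LEILA's actual model-checking formalism, which works formally on the Java Bytecode), the Python sketch below flags a call trace in which a sensitive identifier is read before an SMS is sent; the API names and the trace are illustrative assumptions.

```python
# Toy sketch, not LEILA: a two-state check over an (assumed) Android API-call
# trace that flags the pattern "read device identifier, later send an SMS",
# a behaviour exhibited by several SMS-Trojan families.
SUSPICIOUS_SOURCE = "TelephonyManager.getDeviceId"
SUSPICIOUS_SINK = "SmsManager.sendTextMessage"

def violates_property(trace):
    """Return True if the source call is observed before the sink call."""
    seen_source = False
    for call in trace:
        if call == SUSPICIOUS_SOURCE:
            seen_source = True
        elif call == SUSPICIOUS_SINK and seen_source:
            return True
    return False

# Hypothetical call trace extracted from an app.
example_trace = [
    "TelephonyManager.getDeviceId",
    "StringBuilder.append",
    "SmsManager.sendTextMessage",
]
print(violates_property(example_trace))  # True: the trace matches the malicious pattern
```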

The Relation between Developers’ Communication and Fix-Inducing Changes: An Empirical Study by Mario Luca Bernardi, Gerardo Canfora, Giuseppe A. Di Lucca, Massimiliano Di Penta, Damiano Distante

posted 1 Mar 2018, 06:29 by Gerardo Canfora

Background: Many open source and industrial projects involve several developers spread around the world and working in different time zones. Such developers usually communicate through mailing lists, issue tracking systems, or chats. Lack of adequate communication can create misunderstandings and could possibly cause the introduction of bugs.
Aim: This paper aims at investigating the relation between the bug-inducing and fixing phenomenon and the lack of written communication between committers in open source projects.
Method: We performed an empirical study that involved four open source projects, namely Apache httpd, GNU GCC, Mozilla Firefox, and Xorg Xserver. For each project, change history data, issue tracker comments, mailing list messages, and chat logs were analyzed in order to answer four research questions about the relation between the social importance and communication level of committers and their proneness to induce bug fixes.
Results and implications: Results indicate that the majority of bugs are fixed by committers who did not induce them, a smaller but substantial percentage of bugs is fixed by committers that induced them, and very few bugs are fixed by committers that were not directly involved in previous changes on the same files of the fix. More importantly, committers inducing fixes tend to have a lower level of communication between each other than that of other committers. This last finding suggests that increasing the level of communication between fix-inducing committers could reduce the number of fixes induced in a software project.
Journal of Systems and Software (to appear)
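One conventional way to locate the change that induced a later fix (not necessarily the exact procedure used in this study) is an SZZ-style step: blame, at the parent of the fixing commit, the lines the fix removed. The sketch below assumes a local git repository; the repository path, commit hash, file, and line numbers are placeholders.

```python
import subprocess

def blame_deleted_lines(repo, fix_commit, path, line_numbers):
    """SZZ-style step: for lines removed by a fix, ask `git blame` (at the
    parent of the fixing commit) which earlier commits last touched them;
    those commits are candidate fix-inducing changes."""
    inducing = set()
    for line in line_numbers:
        out = subprocess.run(
            ["git", "-C", repo, "blame", "-L", f"{line},{line}",
             f"{fix_commit}^", "--", path],
            capture_output=True, text=True, check=True,
        ).stdout
        inducing.add(out.split()[0])  # the first token is the blamed commit hash
    return inducing

# Hypothetical usage: lines 10 and 42 of src/module.c were deleted by the fix.
# print(blame_deleted_lines("/path/to/repo", "abc1234", "src/module.c", [10, 42]))
```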

Android Apps and User Feedback: A Dataset for Software Evolution and Quality Improvement by Giovanni Grano, Andrea Di Sorbo, Francesco Mercaldo, Corrado A. Visaggio, Sebastiano Panichella, Gerardo Canfora

posted 24 Jul 2017, 10:24 by Gerardo Canfora   [ updated 24 Jul 2017, 10:24 ]

Nowadays, Android represents the most popular mobile platform, with a market share of around 80%. Previous research showed that the data contained in user reviews and in the code change history of mobile apps are a rich source of information for reducing software maintenance and development effort and increasing customer satisfaction. Stemming from this observation, in this paper we present a large dataset of Android applications belonging to 23 different app categories, which provides an overview of the types of feedback users report on the apps and documents the evolution of the related code metrics. The dataset contains about 395 applications from the F-Droid repository, including around 600 versions, 280,000 user reviews, and more than 450,000 user feedback entries (extracted with specific text mining approaches). Furthermore, for each app version in our dataset, we employed the Paprika tool and developed several Python scripts to detect 8 different code smells and compute 22 code quality indicators. The paper discusses the potential usefulness of the dataset for future research in the field.
Dataset URL: https://github.com/sealuzh/user_quality
2nd International Workshop on App Market Analytics, WAMA 2017 (in conjunction with ESEC/FSE 2017)
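A minimal sketch of how one might start exploring such a dataset with pandas; the file and column names below are assumptions for illustration only, as the actual layout is documented in the repository linked above.

```python
import pandas as pd

# File and column names are assumptions; see https://github.com/sealuzh/user_quality
# for the actual layout of the dataset.
reviews = pd.read_csv("user_reviews.csv")

# Example exploration: reviews per app and the share of feedback classified as bug reports.
per_app = reviews.groupby("app_name").size().sort_values(ascending=False)
bug_share = (reviews["feedback_category"] == "bug report").mean()

print(per_app.head(10))
print(f"Share of bug-report feedback: {bug_share:.2%}")
```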

Beacon-based context-aware architecture for crowd sensing public transportation scheduling and user habits by Danilo Cianciulli, Gerardo Canfora and Eugenio Zimeo

posted 19 May 2017, 00:05 by Gerardo Canfora

Crowdsourcing and crowd sensing are relatively recent paradigms that, enabled by the pervasiveness of mobile devices, allow users to transparently contribute to complex problem solving. Their effectiveness depends on people's voluntarism, and this could limit their adoption. Recent technologies for automating context-awareness could give a significant impulse to the spread of crowdsourcing paradigms. In this paper, we propose a distributed software system that exploits mobile devices to improve public transportation efficiency. It takes advantage of the large number of deployed personal mobile devices and uses them both as mule sensors, in cooperation with beacon technology for geofencing, and as clients for getting information about bus positions and estimated arrival times. The paper discusses the prototype architecture, its basic application for getting dynamic bus information, and the long-term scope in supporting transportation companies and municipalities by reducing costs and improving bus lines, urban mobility, and planning.
The International Workshop on Smart Cities Systems Engineering (SCE 2017)
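A minimal sketch of the kind of estimate a client could compute from crowd-sensed beacon sightings; the data and the simple averaging rule are illustrative assumptions, not the system's actual scheduling logic.

```python
from datetime import datetime, timedelta

# Hypothetical beacon sightings reported by riders' phones:
# (stop index, time the bus was detected near that stop's beacon).
sightings = [
    (0, datetime(2017, 5, 19, 8, 0, 0)),
    (1, datetime(2017, 5, 19, 8, 4, 30)),
    (2, datetime(2017, 5, 19, 8, 9, 10)),
]

def estimate_arrival(sightings, target_stop):
    """Estimate arrival at a later stop from the average inter-stop travel time."""
    deltas = [
        (t2 - t1).total_seconds()
        for (_, t1), (_, t2) in zip(sightings, sightings[1:])
    ]
    avg = sum(deltas) / len(deltas)
    last_stop, last_time = sightings[-1]
    return last_time + timedelta(seconds=avg * (target_stop - last_stop))

print(estimate_arrival(sightings, target_stop=5))  # estimated arrival at stop 5
```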

How Open Source Projects use Static Code Analysis Tools in Continuous Integration Pipelines by Fiorella Zampetti, Simone Scalabrino, Rocco Oliveto, Gerardo Canfora, Massimiliano Di Penta

posted 12 May 2017, 02:18 by Gerardo Canfora

Static analysis tools are often used by software developers to enable early detection of potential faults, vulnerabilities, and code smells, or to assess the source code's adherence to coding standards and guidelines. Also, their adoption within Continuous Integration (CI) pipelines has been advocated by researchers and practitioners. This paper studies the usage of static analysis tools in 20 Java open source projects hosted on GitHub and using Travis CI as continuous integration infrastructure. Specifically, we investigate (i) which tools are being used and how they are configured for the CI, (ii) what types of issues make the build fail or raise warnings, and (iii) whether, how, and after how long broken builds and warnings are resolved. Results indicate that in the analyzed projects build breakages due to static analysis tools are mainly related to adherence to coding standards, and there is also some attention to missing licenses. Build failures related to tools identifying potential bugs or vulnerabilities occur less frequently, and in some cases such tools are activated in a “softer” mode, without making the build fail. Also, the study reveals that build breakages due to static analysis tools are quickly fixed by actually solving the problem, rather than by disabling the warning, and are often properly documented.
Proc. of 14th International Conference on Mining Software Repositories (MSR 2017) - May 20-21, 2017. Buenos Aires, Argentina.
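As an illustration of the build-breaking configurations the study looks at, the sketch below is a minimal CI gate that fails the build when a Checkstyle XML report contains error-severity violations; the report path and the exit-code policy are assumptions, not taken from the studied projects.

```python
import sys
import xml.etree.ElementTree as ET

# Checkstyle writes <checkstyle><file name="..."><error severity="..." line="..."
# message="..."/></file></checkstyle>; the report path below is an assumption.
def count_errors(report_path):
    root = ET.parse(report_path).getroot()
    errors = 0
    for file_elem in root.findall("file"):
        for err in file_elem.findall("error"):
            if err.get("severity") == "error":
                print(f"{file_elem.get('name')}:{err.get('line')}: {err.get('message')}")
                errors += 1
    return errors

if __name__ == "__main__":
    n = count_errors("target/checkstyle-result.xml")
    sys.exit(1 if n > 0 else 0)  # a non-zero exit status makes the CI build fail
```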

ARENA: An Approach for the Automated Generation of Release Notes by Laura Moreno, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, Andrian Marcus and Gerardo Canfora

posted 23 Mar 2017, 03:24 by Gerardo Canfora

Release notes document corrections, enhancements, and, in general, changes that were implemented in a new release of a software project. They are usually created manually and may include hundreds of different items, such as descriptions of new features, bug fixes, structural changes, new or deprecated APIs, and changes to software licenses. Thus, producing them can be a time-consuming and daunting task. This paper describes ARENA (Automatic RElease Notes generAtor), an approach for the automatic generation of release notes. ARENA extracts changes from the source code, summarizes them, and integrates them with information from versioning systems and issue trackers. ARENA was designed based on the manual analysis of 990 existing release notes. In order to evaluate the quality of the release notes automatically generated by ARENA, we performed four empirical studies involving a total of 56 participants (48 professional developers and 8 students). The obtained results indicate that the generated release notes are very good approximations of the ones manually produced by developers and often include important information that is missing in the manually created release notes.
IEEE Trans. Software Eng. 43(2): 106-127 (2017)
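A deliberately simple sketch of the general idea of assembling release notes from a versioning system (far less than what ARENA does, which also summarizes source code changes and integrates issue-tracker information); the repository path, tag name, and keyword rules are assumptions.

```python
import subprocess
from collections import defaultdict

def draft_release_notes(repo, since_tag):
    """Group commit subjects since the last release tag into rough sections
    based on keywords found in the commit message."""
    subjects = subprocess.run(
        ["git", "-C", repo, "log", f"{since_tag}..HEAD", "--pretty=%s"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    sections = defaultdict(list)
    for subject in subjects:
        lower = subject.lower()
        if "fix" in lower or "bug" in lower:
            sections["Bug fixes"].append(subject)
        elif "add" in lower or "feature" in lower:
            sections["New features"].append(subject)
        else:
            sections["Other changes"].append(subject)
    return "\n".join(
        title + "\n" + "\n".join("- " + s for s in items)
        for title, items in sections.items()
    )

# Hypothetical usage:
# print(draft_release_notes("/path/to/repo", "v1.2.0"))
```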

Data Leakage in Mobile Malware: the what, the why and the how by Corrado Aaron Visaggio, Gerardo Canfora, Luigi Gentile, Francesco Mercaldo

posted 7 Feb 2017, 13:14 by Gerardo Canfora   [ updated 7 Feb 2017, 13:15 ]

Mobile technologies are spreading at a very quick pace. Unlike desktop PCs, smartphones, tablets, and wearable devices manage a great deal of sensitive information about the device's owner. For this reason they represent a very appealing opportunity for attackers to write malicious apps able to steal such information. In this paper we analyse a huge set of Android malware in order to discover which kinds of data are exfiltrated from mobile devices and which mechanisms malware writers leverage. For this analysis we employed three tools considered the state of the art of the available technology: Flowdroid, Amandroid, and Epicc. Our results show that mobile malware usually exposes users to massive data leakage.
In: "Intrusion Detection and Prevention for Mobile Ecosystems" (Taylor and Francis publisher), edited by George Kambourakis, Asaf Shabtai, Konstantinos Kolias, and Dimitrios Damopoulos.
https://www.crcpress.com/Intrusion-Detection-and-Prevention-for-Mobile-Ecosystems/Kambourakis-Shabtai-Kolias-Damopoulos/p/book/9781138033573
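The tools mentioned above perform static taint analysis, tracking flows from sensitive sources (e.g., the device identifier) to sinks (e.g., SMS or network APIs). The toy sketch below propagates taint over a straight-line sequence of statements to illustrate the idea; it ignores everything that makes the real analyses hard (branches, aliasing, inter-component communication), and the statement list is made up.

```python
# Toy forward taint propagation, in the spirit of (but far simpler than)
# Flowdroid/Amandroid/Epicc: values produced by a source taint the variables
# they flow into; tainted data reaching a sink is a potential leak.
SOURCES = {"getDeviceId", "getLastKnownLocation"}
SINKS = {"sendTextMessage", "httpPost"}

# Each statement: (target variable or None, called API, argument variables).
program = [
    ("imei", "getDeviceId", []),
    ("msg", "concat", ["imei"]),
    (None, "sendTextMessage", ["msg"]),
]

tainted = set()
for target, api, args in program:
    if api in SOURCES:
        tainted.add(target)
    elif any(a in tainted for a in args):
        if api in SINKS:
            print(f"potential leak: tainted data reaches sink {api}({', '.join(args)})")
        elif target:
            tainted.add(target)  # taint flows through ordinary calls
```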

SURF: Summarizer of User Reviews Feedback by Andrea Di Sorbo, Sebastiano Panichella, Carol V. Alexandru, Corrado A. Visaggio, Gerardo Canfora

posted 7 Feb 2017, 13:00 by Gerardo Canfora   [ updated 7 Feb 2017, 13:03 ]

Continuous Delivery (CD) enables mobile developers to release small, high-quality chunks of working software in a rapid manner. However, faster delivery and higher software quality guarantee neither user satisfaction nor positive business outcomes. Previous work demonstrates that app reviews may contain crucial information that can guide developers' software maintenance efforts towards higher customer satisfaction. However, previous work also highlights the difficulties developers encounter in manually analyzing this rich source of data, namely (i) the huge amount of reviews an app may receive on a daily basis and (ii) the unstructured nature of their content. In this paper, we propose SURF (Summarizer of User Reviews Feedback), a tool able to (i) analyze and classify the information contained in app reviews and (ii) distill actionable change tasks for improving mobile applications. Specifically, SURF performs a systematic summarization of thousands of user reviews through the generation of an interactive, structured, and condensed agenda of recommended software changes. An end-to-end evaluation of SURF, involving 2622 reviews related to 12 different mobile applications, demonstrates the high accuracy of SURF in summarizing user review content. In evaluating our approach we also involved the original developers of some of the apps, who confirmed the practical usefulness of the software change recommendations made by SURF.
Tool demo at the 39th International Conference on Software Engineering (ICSE 2017)

Demo URL: https://youtu.be/Yf-U5ylJXvo
Demo webpage: http://www.ifi.uzh.ch/en/seal/people/panichella/tools/SURFTool.html
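As a stand-in for the kind of per-review classification that SURF's agenda is built from (its actual pipeline uses NLP-based intention classification and summarization, not keyword matching), the sketch below sorts review sentences into coarse categories; the keyword lists and example reviews are assumptions.

```python
# Keyword-based stand-in for review classification; categories and keywords are illustrative.
CATEGORIES = {
    "bug report": ("crash", "bug", "error", "freeze"),
    "feature request": ("please add", "would be nice", "wish", "feature"),
}

def classify(sentence):
    lower = sentence.lower()
    for category, keywords in CATEGORIES.items():
        if any(k in lower for k in keywords):
            return category
    return "other"

reviews = [
    "The app crashes every time I open a PDF.",
    "Would be nice to have a dark theme.",
    "Great app, five stars!",
]
for sentence in reviews:
    print(f"{classify(sentence):>15}: {sentence}")
```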

Estimating the number of remaining links in traceability recovery by Davide Falessi, Massimiliano Di Penta, Gerardo Canfora & Giovanni Cantone

posted 10 Nov 2016, 13:40 by Gerardo Canfora

Although very important in software engineering, establishing traceability links between software artifacts is extremely tedious, error-prone, and requires significant effort. Even when approaches for automated traceability recovery exist, these provide the requirements analyst with a (usually very long) ranked list of candidate links that needs to be manually inspected. In this paper we introduce an approach called Estimation of the Number of Remaining Links (ENRL), which aims at estimating, via Machine Learning (ML) classifiers, the number of remaining positive links in a ranked list of candidate traceability links produced by a recovery approach based on Natural Language Processing (NLP) techniques. We have evaluated the accuracy of the ENRL approach by considering several ML classifiers and NLP techniques on three datasets from industry and academia, concerning traceability links among different kinds of software artifacts including requirements, use cases, design documents, source code, and test cases. Results from our study indicate that: (i) specific estimation models are able to provide accurate estimates of the number of remaining positive links; (ii) the estimation accuracy depends on the choice of the NLP technique; and (iii) univariate estimation models outperform multivariate ones.
Empirical Software Engineering, DOI 10.1007/s10664-016-9460-6 
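A hedged sketch of the underlying idea: estimate how many true links remain from features of the already-inspected part of a ranked list. The features, the synthetic training data, and the choice of a plain linear regressor are assumptions and do not reproduce the models evaluated in the paper.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic training data. Features per (partially inspected) ranked list:
# [links found so far, mean similarity of inspected candidates, fraction inspected].
# Target: number of positive links remaining below the inspection point.
X_train = rng.random((200, 3)) * [50, 1.0, 1.0]
y_train = 40 * X_train[:, 1] * (1 - X_train[:, 2]) + rng.normal(0, 2, 200)

model = LinearRegression().fit(X_train, y_train)

inspected = np.array([[12, 0.63, 0.30]])  # hypothetical state of one recovery session
print(f"Estimated remaining positive links: {model.predict(inspected)[0]:.1f}")
```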

What Would Users Change in My App? Summarizing App Reviews for Recommending Software Changes by Andrea Di Sorbo, Sebastiano Panichella, Carol V. Alexandru, Junji Shimagaki, Corrado A. Visaggio, Gerardo Canfora, Harald Gall

posted 11 Sep 2016, 10:57 by Gerardo Canfora   [ updated 11 Sep 2016, 10:58 ]

Mobile app developers constantly monitor feedback in user reviews with the goal of improving their mobile apps and better meeting user expectations. Thus, automated approaches have been proposed in the literature with the aim of reducing the effort required for analyzing feedback contained in user reviews via automatic classification/prioritization according to specific topics. In this paper, we introduce SURF (Summarizer of User Reviews Feedback), a novel approach to condense the enormous amount of information that developers of popular apps have to manage due to user feedback received on a daily basis. SURF relies on a conceptual model for capturing user needs useful for developers performing maintenance and evolution tasks. It then uses sophisticated summarization techniques to summarize thousands of reviews and generate an interactive, structured, and condensed agenda of recommended software changes. We performed an end-to-end evaluation of SURF on user reviews of 17 mobile apps (5 of them developed by Sony Mobile), involving 23 developers and researchers in total. Results demonstrate the high accuracy of SURF in summarizing reviews and the usefulness of the recommended changes. In evaluating our approach we found that SURF helps developers in better understanding user needs, substantially reducing the time required compared to manually analyzing user (change) requests and planning future software changes.
Proc. of the 24th ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE 2016)
