
Mining Source Code Descriptions from Developer Communications by Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora

Posted 16 Apr 2013, 06:50 by Gerardo Canfora
Very often, source code lacks comments that adequately describe its behavior. In such situations developers need to infer knowledge from the source code itself or to search for source code descriptions in external artifacts.
We argue that messages exchanged among contributors/developers, in the form of bug reports and emails, are a useful source of information to help in understanding source code. However, such communications are unstructured and usually not explicitly meant to describe specific parts of the source code. Developers searching for code descriptions within communications face the challenge of filtering large amounts of data to extract the pieces of information that are important to them. We propose an approach to automatically extract method descriptions from communications in bug tracking systems and mailing lists.
We have evaluated the approach on bug reports and mailing lists from two open source systems (Lucene and Eclipse). The results indicate that mailing lists and bug reports contain relevant descriptions of about 36% of the methods from Lucene and 7% from Eclipse, and that the proposed approach is able to extract such descriptions with a precision of up to 79% for Eclipse and 87% for Lucene. The extracted method descriptions can help developers understand the code and could also be used as a starting point for source code re-documentation.
20th IEEE International Conference on Program Comprehension (ICPC 2012)
Gerardo Canfora, 16 Apr 2013, 06:51