Much of what is written about a software project is soon forgotten. Software repositories are full of valuable information about the project: Bug descriptions, check-in messages, email and newsgroup archives, specifications, design documents, product documentation, and product support logs contain a wealth of information that can potentially help software developers resolve crucial questions about the history, rationale, and future plans for source code. For a variety of reasons, developers rarely turn to these resources when trying to answer these questions. We are building a full-text search that encompasses multiple repositories. To effectively implement full-text search in the absence of hyperlinks, we propose detecting textual allusions to software artifacts in natural-language prose. Allusions are shown to contribute a significant portion of the relationships represented in the graph.
Published in: MSR '06: Proceedings of the 2006 International Workshop on Mining Software Repositories
Address: New York, NY, USA
Publisher: Association for Computing Machinery, Inc.
Copyright © 2007 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or firstname.lastname@example.org. The definitive version of this paper can be found at ACM’s Digital Library: http://www.acm.org/dl/.
Gina Danielle Venolia. Textual Allusions to Artifacts in Software-related Repositories, Microsoft Research, May 2006.