Code Similarity in TouchDevelop: Harnessing Clones

MSR-TR-2011-103

The number of applications available in mobile marketplaces is increasing rapidly, and it is easy to become overwhelmed by the sheer size of their collective codebase. We propose to use code clone analysis to help manage existing applications and develop new ones. First, we propose an automatic application ranking scheme based on (dis)similarity. Traditionally, applications in app stores are ranked manually, through user or moderator input. We argue that automatically computed (dis)similarity information can reinforce this ranking and help deal with possible application cloning. Second, we consider code snippet search, a task commonly performed by application developers. We view it as a special instance of the clone detection problem, which allows us to perform precise search with little to no configuration while remaining agnostic to code formatting, variable renaming, and similar superficial differences. We built a prototype of our approach in TouchDevelop, a novel application development environment for Windows Phone, and will use it as a testing ground for future evaluation.
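
To make the two ideas concrete, the following is a minimal sketch in Python, not the TouchDevelop prototype itself. It assumes a simple regex tokenizer, a small hypothetical keyword set, and Jaccard distance over token shingles as the (dis)similarity measure; the function names `find_snippet` and `dissimilarity` are illustrative choices, and the report's actual algorithms may differ. Identifiers are renamed in order of first appearance within each compared window, so matches ignore whitespace and variable names.

```python
import re

# Hypothetical keyword set; a real tool would use the target language's grammar.
KEYWORDS = {"if", "else", "while", "for", "return", "def", "in"}
# Identifiers, integer literals, or any single non-space character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokens(source):
    """Split source text into tokens, discarding all whitespace."""
    return TOKEN.findall(source)


def canonical(toks):
    """Rename identifiers to id0, id1, ... in order of first appearance."""
    names = {}
    out = []
    for t in toks:
        is_identifier = (t[0].isalpha() or t[0] == "_") and t not in KEYWORDS
        if is_identifier:
            if t not in names:
                names[t] = f"id{len(names)}"
            out.append(names[t])
        else:
            out.append(t)
    return tuple(out)


def find_snippet(snippet, program):
    """Return token offsets in program where snippet occurs up to renaming."""
    needle = tokens(snippet)
    hay = tokens(program)
    n = len(needle)
    if n == 0:
        return []
    target = canonical(needle)
    # Re-canonicalize every window so renaming is local to the candidate match.
    return [i for i in range(len(hay) - n + 1)
            if canonical(hay[i:i + n]) == target]


def dissimilarity(a, b, k=3):
    """Jaccard distance between the sets of canonical k-token shingles."""
    def shingles(src):
        toks = tokens(src)
        return {canonical(toks[i:i + k]) for i in range(len(toks) - k + 1)}
    sa, sb = shingles(a), shingles(b)
    union = sa | sb
    if not union:
        return 0.0
    return 1.0 - len(sa & sb) / len(union)


program = "count = 0\nfor item in items:\n    count = count + item\n"
matches = find_snippet("total=total+x", program)
renamed_distance = dissimilarity("a = b + c", "x=y+z")
```

Here `matches` holds the token offset of `count = count + item`, found even though the query uses different names and spacing, and `renamed_distance` is zero because the two fragments differ only in naming and layout. A ranking scheme along the lines of the first proposal could, under these assumptions, demote applications whose distance to an earlier application falls below some threshold.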