Code Completion with Statistical Language Models

Speaker  Veselin Raychev

Affiliation  Microsoft

Host  Madan Musuvathi

Duration  00:56:37

Date recorded  27 August 2014

In this talk, I present the problem of synthesizing code completions for programs using APIs. Given a program with holes, we synthesize completions for the holes with the most likely sequences of method calls that we learn from existing code.

The main idea of our approach is to reduce the problem of code completion to a natural-language processing problem: predicting the probabilities of sentences. We design a simple and scalable static analysis that extracts sequences of method calls from large codebases and indexes them in a statistical language model. We then use the language model to find the highest-ranked sentences and synthesize a code completion from them.
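The reduction can be illustrated with a minimal sketch. It assumes a bigram model with add-one smoothing, a toy corpus of already-extracted call sequences, and a fixed list of candidate fills for a single hole; these are illustrative assumptions, not the system presented in the talk. The names `BigramModel` and `rank_completions` are hypothetical, and the static analysis that would produce the sequences is not shown.

```python
from collections import Counter
from math import log

START, END = "<s>", "</s>"  # sentence boundary markers (an assumption of this sketch)


class BigramModel:
    """Bigram language model over method-call 'sentences', with add-one smoothing."""

    def __init__(self, sequences):
        self.unigrams = Counter()  # how often each token appears as a predecessor
        self.bigrams = Counter()   # how often each (previous, current) pair appears
        self.vocab = {END}
        for seq in sequences:
            tokens = [START, *seq, END]
            self.vocab.update(seq)
            for prev, cur in zip(tokens, tokens[1:]):
                self.unigrams[prev] += 1
                self.bigrams[prev, cur] += 1

    def log_prob(self, seq):
        """Log-probability of a whole call sequence under the smoothed model."""
        tokens = [START, *seq, END]
        v = len(self.vocab)
        return sum(
            log((self.bigrams[prev, cur] + 1) / (self.unigrams[prev] + v))
            for prev, cur in zip(tokens, tokens[1:])
        )


def rank_completions(model, prefix, suffix, candidates, k=3):
    """Score each candidate fill for the hole between prefix and suffix; keep the top k."""
    scored = [(model.log_prob([*prefix, *fill, *suffix]), fill) for fill in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [fill for _, fill in scored[:k]]


# Toy corpus standing in for call sequences extracted from existing code.
corpus = [
    ["MediaRecorder.<init>", "setAudioSource", "setOutputFormat",
     "setOutputFile", "prepare", "start"],
    ["MediaRecorder.<init>", "setAudioSource", "setOutputFormat",
     "setOutputFile", "prepare", "start", "stop"],
    ["MediaRecorder.<init>", "setOutputFile", "release"],
]
model = BigramModel(corpus)

# A program with one hole between setOutputFile(...) and start().
prefix = ["MediaRecorder.<init>", "setAudioSource", "setOutputFormat", "setOutputFile"]
suffix = ["start"]
candidates = [["prepare"], ["release"], ["setAudioSource"], ["prepare", "setOutputFormat"]]

ranked = rank_completions(model, prefix, suffix, candidates)
print(ranked)
```

In this sketch each candidate is scored as part of the full sentence, so both the calls before the hole and the calls after it influence the ranking. A practical system would also need to generate the candidates and check that they typecheck, which this example leaves out.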

Our technique can synthesize sequences of method calls that may span multiple objects and methods, together with their arguments. We implemented our approach for Java programs that use Android APIs, and our results show that the system is fast and effective: virtually all computed completions typecheck, and the desired completion appears among the first 3 results for 90% of the test cases.

©2014 Microsoft Corporation. All rights reserved.