Leveraging Interlingual Classification to Improve Web Search

MSR-TR-2012-11 |

In this paper we address the problem of improving accuracy of web search in a smaller, data-limited search market (search language) using behavioral data from a larger, data-rich market (assist language). Specifically, we use interlingual classification to infer the search language query’s intent using the assist language click-through data. We use these improved estimates of query intent, along with the query intent based on the search language data, to compute features that encode the similarity between a search result (URL) and the query. These features are subsequently fed into the ranking model to improve the relevance ranking of the documents. Our experimental results on German and French languages show the effectiveness of using assist language behavioral data – especially, when the search language queries have small click-through data.