Learning Semantic Representations Using Convolutional Neural Networks for Web Search

  • Yelong Shen
  • Xiaodong He
  • ,
  • Li Deng
  • Gregoire Mesnil

Published by WWW 2014

This paper presents a series of new latent semantic models based on a convolutional neural network (CNN) to learn low-dimensional semantic vectors for search queries and Web documents. By using the convolution-max pooling operation, local contextual information at the word n-gram level is modeled first. Then, salient local features in a word sequence are combined to form a global feature vector. Finally, the high-level semantic information of the word sequence is extracted to form a global vector representation. The proposed models are trained on clickthrough data by maximizing the conditional likelihood of clicked documents given a query, using stochastic gradient ascent. The new models are evaluated on a Web document ranking task using a large-scale, real-world data set. Results show that our model significantly outperforms other semantic models, which were state-of-the-art in retrieval performance prior to this work.
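The pipeline summarized above (convolution over word n-gram windows, max pooling into a global feature vector, a final semantic projection, and a conditional likelihood of the clicked document) can be illustrated with a minimal NumPy sketch. Several details here are assumptions not stated in the abstract: the dense word vectors used as input, the `tanh` non-linearity, cosine similarity as the relevance score, the softmax scaling factor `gamma`, and all function and parameter names (`encode`, `clicked_log_likelihood`, `W_conv`, `W_sem`). It is meant as an illustration of the idea, not as the paper's implementation.

```python
import numpy as np

def encode(word_vecs, W_conv, W_sem, window=3):
    """Map a word sequence (T x d) to one low-dimensional semantic vector.

    Assumes an odd `window`, W_conv of shape (window * d, k), and
    W_sem of shape (k, m); these shapes are illustrative choices.
    """
    T, d = word_vecs.shape
    # Zero-pad so that every word is the center of one n-gram window.
    pad = window // 2
    padded = np.vstack([np.zeros((pad, d)), word_vecs, np.zeros((pad, d))])
    # Convolution: concatenate each window of consecutive word vectors
    # and project it to a local contextual feature vector.
    windows = np.stack([padded[t:t + window].ravel() for t in range(T)])
    local = np.tanh(windows @ W_conv)      # shape (T, k)
    # Max pooling: keep the strongest activation of each feature
    # across all positions, giving a fixed-size global feature vector.
    pooled = local.max(axis=0)             # shape (k,)
    # Semantic layer: non-linear projection to the final representation.
    return np.tanh(pooled @ W_sem)         # shape (m,)

def cosine(a, b):
    """Cosine similarity (assumed here as the query-document score)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def clicked_log_likelihood(q_vec, doc_vecs, clicked=0, gamma=10.0):
    """log P(clicked document | query) via a softmax over scaled scores."""
    scores = gamma * np.array([cosine(q_vec, d) for d in doc_vecs])
    scores = scores - scores.max()         # shift for numerical stability
    return float(scores[clicked] - np.log(np.exp(scores).sum()))
```

Training by stochastic gradient ascent would adjust `W_conv` and `W_sem` to increase this log-likelihood over sampled (query, clicked document, unclicked documents) groups; the gradient step itself is omitted here and would typically come from an automatic differentiation library.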