Regularized Mapping to Latent Structures and Its Application to Web Search

  • Wei Wu,
  • Zhengdong Lu,
  • Hang Li

MSR-TR-2012-54

Projection to Latent Structures (PLS), also known as Partial Least Squares, is a method for matching objects from two heterogeneous domains. Although PLS has been empirically shown to be effective for matching queries and documents, its poor scalability is a major hurdle to its application in real-world web search. In this paper, we study a general framework for matching heterogeneous objects, which yields a rich family of matching models when different regularizations are enforced, with PLS as a special case. In particular, with ℓ1- and ℓ2-type regularization on the mapping functions, we obtain a model called Regularized Mapping to Latent Structures (RMLS). RMLS has several advantages over PLS, including lower time complexity and easy parallelization. As another contribution, we give a generalization analysis of this matching framework and apply it to both PLS and RMLS. In experiments, we compare the effectiveness and efficiency of RMLS and PLS on large-scale web search problems. The results show that RMLS performs as well as PLS on relevance ranking while significantly speeding up the learning process.
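
As a rough illustration of matching through a latent space, the sketch below maps a query vector and a document vector into a shared low-dimensional space with two linear mappings and scores the pair by the dot product of the mapped vectors. The linear form of the mappings, the names `L_query`, `L_doc`, `project`, and `match_score`, and all numeric values are assumptions made for this example; the abstract does not specify the objective or the learning procedure, and nothing is learned here.

```python
# Hypothetical sketch: names, dimensions, and values are illustrative,
# not taken from the report. The mappings are fixed by hand, not learned.

def project(matrix, vector):
    """Map a feature vector into the latent space (matrix^T times vector)."""
    latent_dim = len(matrix[0])
    return [
        sum(matrix[i][k] * vector[i] for i in range(len(vector)))
        for k in range(latent_dim)
    ]

def match_score(L_query, L_doc, query, doc):
    """Score a query-document pair as the dot product of their latent vectors."""
    q_latent = project(L_query, query)
    d_latent = project(L_doc, doc)
    return sum(a * b for a, b in zip(q_latent, d_latent))

# Toy setup: 3 query features, 4 document features, 2 latent dimensions.
L_query = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
L_doc = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [1.0, 1.0]]

query = [1.0, 0.0, 2.0]
doc_a = [1.0, 0.0, 0.0, 1.0]
doc_b = [0.0, 0.0, 1.0, 0.0]

score_a = match_score(L_query, L_doc, query, doc_a)
score_b = match_score(L_query, L_doc, query, doc_b)
print(score_a, score_b)
```

In this toy setup the third row of `L_doc` is all zeros, so a document that uses only that feature maps to the origin and receives a score of zero. Sparse mappings of this kind are what ℓ1-type regularization generally tends to encourage, though the values here are chosen by hand.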