Fan Guo, Chao Liu, and Yi-Min Wang
Many tasks that leverage web search users’ implicit feedback rely on a proper and unbiased interpretation of user clicks. Previous eye-tracking experiments and studies on explaining position-bias of user clicks provide a spectrum of hypotheses and models on how an average user examines and possibly clicks web documents returned by a search engine with respect to the submitted query. In this paper, we attempt to close the gap between previous work, which studied how to model a single click, and the reality that multiple clicks on web documents in a single result page are not uncommon. Specifically, we present two multiple-click models: the independent click model (ICM), which is reformulated from previous work, and the dependent click model (DCM), which takes into consideration dependencies between multiple clicks. Both models can be efficiently learned with linear time and space complexities. More importantly, they can be incrementally updated as new click logs flow in. These properties are in high demand in practice.
We systematically evaluate the two models on click logs obtained in July 2008 from a major commercial search engine. The data set, after preprocessing, contains over 110 thousand distinct queries and 8.8 million query sessions. Extensive experimental studies demonstrate the gain of modeling multiple clicks and their dependencies. Finally, we note that since our experimental setup does not rely on tweaking search result rankings, it can be easily adopted by future studies.
In WSDM'09: Proceedings of the Second ACM International Conference on Web Search and Data Mining
Publisher: Association for Computing Machinery, Inc.
Copyright © 2007 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or firstname.lastname@example.org. The definitive version of this paper can be found at ACM’s Digital Library --http://www.acm.org/dl/.