Large Scale Log Analysis of Individuals’ Domain Preferences in Web Search

  • Sarah K. Tyler,
  • ,
  • Peter Bailey,
  • Sebastian de la Chica,
  • Nikhil Dandekar

MSR-TR-2015-048

Information on almost any given topic can be found on the Web, often accessible via many different websites. But even when the topical content is similar across websites, the websites can have different characteristics that appeal to different people. As a result, individuals can develop preferred websites to visit for certain topics. While it has long been speculated that such preferences exist, little is understood about how prevalent, clear, and stable these preferences actually are. We characterize website preference in search by looking at repeat domain use in two months of large-scale query and webpage visitation logs. We show that while people sometimes provide explicit cues in their queries to indicate their domain preferences, there is a significant opportunity to identify implicit preferences expressed via user behavior. Although domain preferences vary across users, within a user they are consistent and stable over time, even during events that typically disrupt normal search behavior. People’s preferences do, however, vary given the topic of their search. We observe that people exhibit stronger domain preferences while searching than browsing, but that search-based preferences often extend to pages browsed to after the initial search result click. Since domain preferences are common for search and stable over time, the rich understanding of them that we present here will be valuable for personalizing search.