Complex Network Analysis Reveals Kernel-Periphery Structure in Web Search Queries

Rishiraj Saha Roy, Niloy Ganguly, Monojit Choudhury, and Navin Kumar Singh

Abstract

Web search queries have evolved into a language of their own. In this paper, we substantiate this claim by analyzing complex networks constructed from query logs. As in natural language, a two-regime degree distribution in word or phrase co-occurrence networks of queries reveals the existence of a small kernel and a very large periphery. But unlike natural language, where a large fraction of sentences are formed using only kernel words, most queries consist of units from both the kernel and the periphery. The long mean shortest path for these networks further shows that peripheral units are typically connected through nodes in the kernel, which in turn are connected through multiple hops within the kernel. The extremely large periphery implies that the likelihood of encountering a new word or segment is much higher for queries than for natural language, making the processing of unseen queries much harder than that of unseen sentences.
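As a rough illustration of the quantities named in the abstract, the sketch below builds a word co-occurrence network from a handful of queries and computes its degree distribution and mean shortest path. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the whitespace tokenization, the rule that two words are linked when they share a query, and the toy queries are all hypothetical choices made for this example.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_cooccurrence_network(queries):
    """Link two words if they appear together in at least one query.

    Assumption: words are obtained by lowercasing and splitting on whitespace.
    """
    graph = defaultdict(set)
    for query in queries:
        words = set(query.lower().split())
        for word in words:
            graph[word]  # register the node even if it has no neighbours
        for u, v in combinations(sorted(words), 2):
            graph[u].add(v)
            graph[v].add(u)
    return graph


def degree_distribution(graph):
    """Map each degree value to the number of nodes having that degree."""
    counts = defaultdict(int)
    for neighbours in graph.values():
        counts[len(neighbours)] += 1
    return dict(counts)


def mean_shortest_path(graph):
    """Average breadth-first-search distance over all reachable node pairs."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0


# Hypothetical toy queries, invented for illustration only.
toy_queries = [
    "free music download",
    "free movie download",
    "music lyrics",
    "movie trailer",
]
network = build_cooccurrence_network(toy_queries)
print(degree_distribution(network))
print(mean_shortest_path(network))
```

In this toy network, "lyrics" and "trailer" each have a single neighbour and reach each other only through the densely linked words "free", "download", "music", and "movie", which loosely mirrors the kernel and periphery picture described above. On real query logs, the degree distribution would typically be examined on a log-log plot to look for the two regimes.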

Details

Publication type: Inproceedings
Published in: Proceedings of the 2nd International ACM SIGIR (Association for Computing Machinery Special Interest Group on Information Retrieval) Workshop on Query Representation and Understanding 2011 (QRU 2011)
URL: http://ciir.cs.umass.edu/sigir2011/qru/roy+al.pdf
Pages: 5-8
Publisher: Association for Computing Machinery, Inc.