Sreenivas Gollapudi, Samuel Ieong, Alexandros Ntoulas, and Stelios Paparizos
Web search engines incorporate results from structured data sources to answer semantically rich user queries, i.e. ‘Samsung 50 inch led tv’ can be answered from a table of television data. However, users are not domain experts and quite often enter values that do not match precisely the underlying data, so a literal execution will return zero results. A search engine would prefer to return at least a minimum number of results as close to the original query as possible while providing a time-bound execution guarantee. In this paper, we formalize these requirements, show the problem is NP-Hard and present approximation algorithms that produce rewrites that work in practice. We empirically validate our algorithms on large-scale data from a major search engine.
|Published in||ACM Conference on Information and Knowledge Management (CIKM)|