Kevyn Collins-Thompson
November 2009
We introduce a new theoretical derivation, evaluation methods,
and extensive empirical analysis for an automatic query
expansion framework in which model estimation is cast as a
robust constrained optimization problem. This framework
provides a powerful method for modeling and solving complex
expansion problems, by allowing multiple sources of
domain knowledge or evidence to be encoded as simultaneous
optimization constraints. Our robust optimization approach
provides a clean theoretical way to model not only
expansion benefit, but also expansion risk, by optimizing
over uncertainty sets for the data. In addition, we introduce
risk-reward curves to visualize expansion algorithm performance
and analyze parameter sensitivity. We show that a
robust approach significantly reduces the number and magnitude
of expansion failures for a strong baseline algorithm,
with no loss in average gain. Our approach is implemented
as a highly efficient post-processing step that assumes little
about the baseline expansion method used as input, making
it easy to apply to existing expansion methods. We provide
analysis showing that this approach is a natural and effective
way to do selective expansion, automatically reducing
or avoiding expansion in risky scenarios, and successfully
attenuating noise in poor baseline methods.
In Proceedings of CIKM 2009
Publisher Association for Computing Machinery, Inc.
Copyright © 2009 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org. The definitive version of this paper can be found at ACM’s Digital Library --http://www.acm.org/dl/.
| Type | Inproceedings |