Automatic Categorization of Query Results

Kaushik Chakrabarti, Surajit Chaudhuri, and Seung-won Hwang


Exploratory ad-hoc queries can return too many answers, a phenomenon commonly referred to as “information overload”. In this paper, we propose to automatically categorize the results of SQL queries to address this problem. We dynamically generate a labeled, hierarchical category structure: a user can determine whether a category is relevant simply by examining its label, and can then explore just the relevant categories and ignore the remaining ones, thereby reducing information overload. We first develop analytical models to estimate the information overload faced by a user for a given exploration. Based on those models, we formulate the categorization problem as a cost optimization problem and develop heuristic algorithms to compute the min-cost categorization.
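
To make the idea of a cost-based categorization concrete, the following is a minimal illustrative sketch, not the analytical model or the heuristic algorithms of the paper. It assumes a toy cost measure (the number of labels and result rows a user has to read), a hypothetical relevance predicate supplied by the caller, and a brute-force search over attribute orderings. All names (`categorize`, `exploration_cost`, `cheapest_ordering`, the sample rows) are invented for illustration.

```python
from collections import defaultdict
from itertools import permutations


def categorize(rows, attributes):
    """Build a nested category tree by partitioning rows on each attribute in turn."""
    if not attributes:
        return rows  # leaf: plain list of result rows
    first, rest = attributes[0], attributes[1:]
    groups = defaultdict(list)
    for row in rows:
        groups[row[first]].append(row)
    return {f"{first}={value}": categorize(members, rest)
            for value, members in groups.items()}


def exploration_cost(tree, is_relevant):
    """Toy cost (an assumption): every label read at a visited level,
    plus every row read in a visited leaf category."""
    if isinstance(tree, list):
        return len(tree)
    cost = len(tree)  # the user reads all labels at this level
    for label, subtree in tree.items():
        if is_relevant(label):  # only relevant categories are opened
            cost += exploration_cost(subtree, is_relevant)
    return cost


def cheapest_ordering(rows, attributes, is_relevant):
    """Brute-force stand-in for a heuristic: try every attribute ordering."""
    return min(
        permutations(attributes),
        key=lambda order: exploration_cost(categorize(rows, list(order)), is_relevant),
    )


# Hypothetical query result: ten home listings.
rows = [
    {"city": c, "bedrooms": b}
    for c, b in [("Seattle", 2), ("Seattle", 2), ("Seattle", 3), ("Seattle", 3),
                 ("Seattle", 4), ("Bellevue", 2), ("Bellevue", 3), ("Bellevue", 4),
                 ("Redmond", 2), ("Redmond", 3)]
]

# Hypothetical user interest: three-bedroom homes in Seattle.
wanted = {"city=Seattle", "bedrooms=3"}
tree = categorize(rows, ["city", "bedrooms"])

flat_cost = len(rows)                                        # scan every row
tree_cost = exploration_cost(tree, lambda label: label in wanted)
```

Under these assumptions, the user reads 3 city labels, 3 bedroom labels under Seattle, and 2 rows, for a cost of 8 instead of the 10 rows of the uncategorized result. Exhaustive search over orderings is only feasible for a handful of attributes, which is one reason a heuristic would be needed in practice.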


Publication type: Inproceedings
Published in: ACM SIGMOD Conference