Mohan Yang, Bolin Ding, Surajit Chaudhuri, and Kaushik Chakrabarti
We aim to provide table answers to keyword queries using a knowledge base. For queries referring to multiple entities, like “Washington cities population” and “Mel Gibson movies”, it is better to represent each relevant answer as a table which aggregates a set of entities or joins of entities within the same table scheme or pattern. In this paper, we study how to find highly relevant patterns in a knowledge base for user-given keyword queries to compose table answers. A knowledge base is modeled as a directed graph called a knowledge graph, where nodes represent its entities and edges represent the relationships among them. Each node/edge is labeled with a type and text. A pattern is an aggregation of subtrees which contain all keywords in their texts and have the same structure and the same types on nodes/edges. We propose efficient algorithms to find patterns that are relevant to the query for a class of scoring functions. We show the hardness of the problem in theory, and propose path-based indexes that are affordable in memory. Two query-processing algorithms are proposed: one is fast in practice for small queries (with small numbers of patterns as answers) by utilizing the indexes; the other is better in theory, with running time linear in the sizes of indexes and answers, and can handle large queries better. We also conduct an extensive experimental study to compare our approaches with a naive adaptation of known techniques.
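To make the data model above concrete, the following is a minimal Python sketch of a typed knowledge graph and of grouping keyword-matching subtrees by their type structure. It is a naive enumeration for illustration only, not the indexed algorithms of the paper. The toy data, the names `NODES`, `EDGES`, `paths_from`, `matches`, `type_signature`, and `find_patterns`, and the simplifications (a subtree is one root-to-node path per keyword, keywords match only node text or node type, path length is bounded) are all assumptions introduced here.

```python
from collections import defaultdict
from itertools import product

# Toy knowledge graph; all data is made up for illustration.
# node id -> (type, text)
NODES = {
    "m1": ("Movie", "Braveheart"),
    "m2": ("Movie", "Mad Max"),
    "p1": ("Person", "Mel Gibson"),
    "b1": ("Book", "Mel Gibson biography"),
}
# directed edges: (source, edge type, target)
EDGES = [
    ("m1", "actor", "p1"),
    ("m2", "actor", "p1"),
    ("b1", "subject", "p1"),
]

OUT = defaultdict(list)
for src, etype, dst in EDGES:
    OUT[src].append((etype, dst))


def paths_from(root, max_len):
    """Yield directed paths from root as tuples (node, edge type, node, ...)."""
    stack = [(root,)]
    while stack:
        path = stack.pop()
        yield path
        if len(path) // 2 < max_len:  # number of edges used so far
            for etype, dst in OUT[path[-1]]:
                if dst not in path[::2]:  # avoid revisiting a node
                    stack.append(path + (etype, dst))


def matches(node, keyword):
    """Assumed matching rule: keyword is a substring of the node's text or type."""
    ntype, text = NODES[node]
    kw = keyword.lower()
    return kw in text.lower() or kw in ntype.lower()


def type_signature(path):
    """Replace each node on the path by its type; edge types stay as they are."""
    return tuple(NODES[x][0] if i % 2 == 0 else x for i, x in enumerate(path))


def find_patterns(keywords, max_len=2):
    """Group subtrees (one path per keyword, shared root) by type structure."""
    patterns = defaultdict(list)
    for root in NODES:
        all_paths = list(paths_from(root, max_len))
        per_kw = [[p for p in all_paths if matches(p[-1], kw)] for kw in keywords]
        for combo in product(*per_kw):  # empty if some keyword has no match
            pattern = tuple(type_signature(p) for p in combo)
            row = tuple(NODES[p[-1]][1] for p in combo)
            patterns[pattern].append(row)
    return dict(patterns)
```

Calling `find_patterns(["gibson", "movie"])` on this toy graph groups the two movie-rooted subtrees under a single pattern, so each subtree becomes one row of the same table. The book node produces no row because no path from it matches "movie". Scoring and ranking of patterns, which the abstract mentions, is deliberately left out of this sketch.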
|Published in||Proceedings of the VLDB Endowment, the 41st International Conference on Very Large Data Bases (VLDB 2015)|
|Publisher||VLDB – Very Large Data Bases|