On the Integration of Structure Indexes and Inverted Lists

  • ,
  • Rajasekar Krishnamurthy ,
  • Jeffrey F. Naughton ,
  • Raghu Ramakrishnan

SIGMOD |

Published by Association for Computing Machinery, Inc.

Several methods have been proposed to evaluate queries over a native XML DBMS, where the queries specify both path and keyword constraints. These broadly consist of graph traversal approaches, optimized with auxiliary structures known as structure indexes; and approaches based on information-retrieval style inverted lists. We propose a strategy that combines the two forms of auxiliary indexes, and a query evaluation algorithm for branching path expressions based on this strategy. Our technique is general and applicable for a wide range of choices of structure indexes and inverted list join algorithms. Our experiments over the Niagara XML DBMS show the benefi t of integrating the two forms of indexes. We also consider algorithmic issues in evaluating path expression queries when the notion of relevance ranking is incorporated. By integrating the above techniques with the Threshold Algorithm proposed by Fagin et al., we obtain instance optimal algorithms to push down top k computation.