Towards Complex Query Processing over Key-Value Cloud Stores

Facts: Cloud infrastructures bear an ever-increasing responsibility for storing and maintaining massive volumes of data for different types of data-intensive applications. Key-value cloud stores have become a premium choice as the storage back-end for such applications. We need complex query processing capability to access and analyze this data.

Questions: Do we have adequate solutions to support complex queries over data residing in such storage infrastructures? Do standard, “cloud-friendly” approaches, such as MapReduce-based algorithms, offer a satisfactory solution? What additional support, in the form of indexing and query processing algorithms, would expedite query processing? Can we do so while benefiting from the simplicity of the key-value systems’ interface and free-riding on their inherent scalability, elasticity, and reliability?

Answers: In this talk I will present novel indexing structures and processing algorithms for complex query types. Specifically, I will first cover interval queries in depth, presenting indices and associated query processing algorithms. I will also give an overview of indexing and query processing approaches for rank-join queries. Our contributions include key-value representations of our index and statistical structures, MapReduce algorithms to build and populate them, and query processing algorithms that use them, catering to the idiosyncrasies of key-value stores while inheriting their advantages. Our implementation and experimentation are over the popular HBase key-value store. I will report on the results of extensive performance evaluations, which show large performance improvements. En route, I will touch upon differences in existing key-value system architectures and their implications. The talk will conclude with the lessons we have learned, pointing to key design decisions and promising ideas for outstanding challenges.

Speaker Details

Peter Triantafillou received his PhD in Computer Science from the University of Waterloo in 1991. Since then he has held faculty positions at Simon Fraser University (1991-1996), the Technical University of Crete (1996-2002), and the University of Patras (2002-present). He was a visiting professor at the Max-Planck Institute for Informatics in 2004-2005 and has been one again since February 2012. He has co-authored more than 100 papers, served as a Program Committee member for nearly 100 conferences, and received a best-paper award; 2 of his papers were selected among the best papers in conferences. His research and publications have spanned a number of areas including Distributed Databases, Distributed Filesystems, Multimedia Systems, Storage Servers, Peer-to-Peer Data Systems, Publish-Subscribe Systems, Decentralized Search Engines, and Social Networks/Systems. Currently, his interests lie in Cloud Data Management and Human-Centric Data Management.

Date:
Speakers:
Peter Triantafillou
Affiliation:
University of Patras
