Herding “small” streaming queries

MSR-TR-2015-26 |

We study the problem of placing streaming queries into servers. Unlike previous work, we focus on queries that consume events of relative low rates, each computed in a single server (i.e. no scaling-out per query). However, we need to place a very large and dynamic number of queries in relatively few servers. Our focus is motivated by the need to support a platform for hosting end-user streaming queries that may come from a variety of applications, such as the Cortana personal assistant.

The placement strives to reduce network and computational overheads. It exploits the observation that a large number of queries consume the same sources of events, and, hence, placing them in the same server in the platform reduces network overheads. However, the placement also needs to balance the load among the servers. A further complication arises from the requirement to allow
the queries to read events from multiple sources concurrently (i.e., to join multiple streams).

In this paper, we formulate the problem of placing queries into the servers of a streaming platform. We propose approximation algorithms and derive approximation bounds for the following cases (a) the offline case where queries are stable and known ahead of time, akin to an “oracle”, and (b) the online case without departures and known query popularities. For the general online problem, we propose effective heuristic algorithms. An extensive set of experiments demonstrates that the proposed algorithms provide good performance in a wide-range of scenarios.