Monitoring Distributed Data Streams

Monitoring data streams in a distributed system has been the focus of much research in recent years. Most of the proposed schemes, however, deal with monitoring simple aggregated values, such as the frequency of appearance of items in the streams. More involved challenges, such as the important task of feature selection (e.g., by monitoring the information gain of various features), still incur very high communication overhead when handled by naive, centralized algorithms. We present a novel geometric approach by which an arbitrary global monitoring task can be split into a set of constraints applied locally on each of the streams. The constraints are used to locally filter out data increments that do not affect the monitoring outcome, thus avoiding unnecessary communication. As a result, our approach enables efficient monitoring of arbitrary threshold functions over distributed data streams. We present experimental results on real-world data which demonstrate that our algorithms are highly scalable and reduce the communication load by orders of magnitude in comparison to centralized algorithms.
Joint work with Tsachi Sharfman and Daniel Keren.
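
The abstract does not spell out what a local constraint looks like. As a rough illustration only, the sketch below follows one ball-based construction from the geometric monitoring literature: each node keeps the last shared estimate of the global average vector and its own drift away from it, and stays silent as long as the monitored function cannot cross the threshold anywhere in the ball whose diameter runs from the estimate to the drifted point. The function names, the choice of f(x) = ||x||^2 as the monitored function, and the sample numbers are assumptions made for this example, not details taken from the talk.

```python
import math


def ball_range_sq_norm(center, radius):
    """Exact (min, max) of f(x) = ||x||^2 over the closed ball (center, radius).

    The squared norm is chosen for illustration because its extrema over a
    ball have a closed form; an arbitrary f would need its own bound.
    """
    dist = math.sqrt(sum(c * c for c in center))
    lowest = max(0.0, dist - radius) ** 2
    highest = (dist + radius) ** 2
    return lowest, highest


def local_constraint_holds(estimate, last_sent, current, threshold):
    """Return True if this node may stay silent (hypothetical helper).

    estimate  -- last agreed estimate of the global average vector
    last_sent -- local statistics vector this node last reported
    current   -- this node's local statistics vector right now
    """
    # Drift vector: the estimate shifted by the local change since last report.
    drift = [e + (c - s) for e, c, s in zip(estimate, current, last_sent)]
    # Ball whose diameter is the segment from the estimate to the drift vector.
    center = [(e + d) / 2 for e, d in zip(estimate, drift)]
    radius = math.dist(estimate, drift) / 2
    lowest, highest = ball_range_sq_norm(center, radius)
    # The constraint holds if f stays strictly on one side of the threshold.
    return highest < threshold or lowest > threshold


if __name__ == "__main__":
    estimate = [1.0, 0.0]
    # Small local change: the ball stays below the threshold, no message needed.
    print(local_constraint_holds(estimate, [1.0, 0.0], [1.2, 0.1], 4.0))
    # Large local change: the ball straddles the threshold, the node must report.
    print(local_constraint_holds(estimate, [1.0, 0.0], [3.0, 0.0], 4.0))
```

In this construction the balls of all nodes together cover the convex hull of the drift vectors, which contains the true global average, so if every node's check passes, the global function value is on the same side of the threshold as the estimate and no communication is required.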

Speaker Details

The research interests of Prof. Assaf Schuster (http://www.cs.technion.ac.il/~assaf) are in the areas of data streams, data mining, and parallel, distributed, and grid computing. He has been with the Computer Science department at the Technion – Israel Institute of Technology since 1991. At the Technion he established and heads the Distributed Systems Laboratory (DSL, http://dsl.cs.technion.ac.il/). He has published over 140 papers in his areas of expertise in prestigious conferences and high-quality journals. He regularly participates in program committees for conferences on knowledge discovery in large systems and conferences on parallel and distributed computing. He consults for the hi-tech industry and government agencies on related issues and is the inventor of several patents. He serves as an associate editor of two distinguished journals: the Journal of Parallel and Distributed Computing and IEEE Transactions on Computers. He supervises fifteen master's and doctoral students, and takes part in large national and international projects as an expert on data management, knowledge discovery in databases, and grid and distributed computing. His group participates as a partner (specializing in core HPC, grid, and data mining technologies) in several national and European projects.

Date:
Speakers:
Assaf Schuster
Affiliation:
Computer Science department at the Technion – Israel Institute of Technology
    • Jeff Running