Exploiting redundancy for robust sensing

PhD Thesis: Carnegie Mellon University

Adviser: Srinivasan Seshan

In this thesis, we explore the challenges in making an Internet-scale heterogeneous sensing system more robust. We target "end-to-end" robustness: we address both failures in collecting data from a large collection of wired and wireless sensors and problems in making sensor readings available to end users from storage on Internet-connected nodes. Although often overlooked, robustness is crucial for such systems because they are often deployed in harsh environments and are typically not well maintained. Traditional robustness techniques generally trade resource efficiency for robustness; i.e., they mask failures by using additional resources (e.g., energy, storage). Unfortunately, these traditional tradeoffs are not well suited to the resource constraints and large scale of typical sensing systems.

This dissertation puts forth the claim that more practical solutions can be developed by exploiting several unique deployment- and application-specific properties of typical sensing systems. We show that by slightly relaxing the requirement of exact or fresh answers, we can significantly improve the robustness of a system without additional resource overhead. We argue that this approach is well suited to sensing systems because optimizing resource usage is an important goal of their design and their applications can often tolerate approximate or slightly stale data. We support the above claim by proposing efficient solutions for robust data collection and storage in a sensing system.

For robust collection of data from wireless sensors, we present Synopsis Diffusion, a novel data aggregation scheme that exploits wireless sensors’ broadcast communication and sensing applications’ tolerance for approximate aggregate answers. Synopsis Diffusion, unlike previous schemes, decouples aggregation algorithms from underlying aggregation topologies, enabling highly robust aggregation with energy-efficient multipath routing. We also present Tributary-Delta, a novel adaptive aggregation scheme that efficiently combines the benefits of existing schemes and uses application-aware adaptation to cope with the dynamics of deployment environments. Under typical loss rates, our techniques can provide five times more accurate results than existing energy-efficient schemes, without additional energy overhead.
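To illustrate why an order- and duplicate-insensitive summary lets aggregation run over multiple paths, the following is a minimal sketch of a Flajolet-Martin style counting bitmap. It is an assumption made for illustration, not necessarily the summary used in the thesis; the function names (`synopsis_generate`, `synopsis_fuse`, `synopsis_evaluate`) and the 32-bit width are hypothetical. Because merging is a bitwise OR, a reading that reaches the base station along several paths changes nothing in the final summary.

```python
import hashlib

NUM_BITS = 32  # assumed bitmap width, chosen only for illustration


def _hash(item: str) -> int:
    """Map an item to a 64-bit integer using a stable hash."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def synopsis_generate(sensor_id: str) -> int:
    """Return a bitmap with one bit set, at the index of the lowest set
    bit of the hash. Index i is chosen with probability about 2^-(i+1)."""
    h = _hash(sensor_id)
    pos = (h & -h).bit_length() - 1 if h else NUM_BITS - 1
    return 1 << min(pos, NUM_BITS - 1)


def synopsis_fuse(a: int, b: int) -> int:
    """Merge two synopses. Bitwise OR is commutative, associative, and
    idempotent, so merge order and duplicates do not matter."""
    return a | b


def synopsis_evaluate(bitmap: int) -> float:
    """Estimate the number of distinct contributors from the index of
    the lowest zero bit (the standard Flajolet-Martin correction)."""
    r = 0
    while bitmap & (1 << r):
        r += 1
    return (2 ** r) / 0.77351


if __name__ == "__main__":
    ids = [f"sensor-{i}" for i in range(200)]

    # Each reading delivered exactly once, as on a lossless tree.
    once = 0
    for s in ids:
        once = synopsis_fuse(once, synopsis_generate(s))

    # The same readings delivered redundantly and in a different order,
    # as could happen with multipath broadcast.
    multipath = 0
    for s in ids[::-1] + ids[:120]:
        multipath = synopsis_fuse(multipath, synopsis_generate(s))

    print(once == multipath)
    print(synopsis_evaluate(once))
```

A single bitmap gives only a coarse estimate; practical uses of this kind of sketch average many independent bitmaps to reduce the error, which is where the tolerance for approximate answers comes in.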

For storing sensor readings on Internet-connected nodes, we show that existing design principles used to build highly available storage systems do not work well for an Internet-scale system where failures are often correlated. Our results show that, for sensing applications, weak quorum systems are more suitable than traditional strict quorum systems because weak quorum systems are more effective in tolerating correlated failures and sensing applications can tolerate the small data inconsistency caused by such quorum systems. We also show that configuring a system with parameters derived by using the correlation model we develop is more effective than existing techniques in optimizing resource usage and target availability. Finally, we show how several data and query characteristics of a typical sensing system can be exploited to design efficient self-repairing and load-balancing techniques. Our techniques can improve the availability of a sensing system by orders of magnitude without any additional resource overhead.
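The difference between strict and weak quorums can be made concrete with a small calculation. The sketch below assumes one common formalization of a weak quorum, a probabilistic quorum in which each operation contacts a uniformly random subset of replicas; the source does not specify this model, and the function name `intersect_probability` and the replica counts are hypothetical. It computes the chance that a read quorum overlaps a preceding write quorum, which is the source of the small inconsistency the text mentions.

```python
from math import comb


def intersect_probability(n: int, write_size: int, read_size: int) -> float:
    """Probability that a uniformly random read quorum shares at least
    one replica with a given write quorum, out of n replicas."""
    if write_size + read_size > n:
        # Quorums this large always overlap (the strict-quorum case).
        return 1.0
    # Ways to pick the read quorum entirely outside the write quorum,
    # divided by all ways to pick the read quorum.
    miss = comb(n - write_size, read_size) / comb(n, read_size)
    return 1.0 - miss


if __name__ == "__main__":
    n = 15
    # Strict majority quorums: overlap is guaranteed, but each operation
    # needs 8 of the 15 replicas to be reachable.
    print(intersect_probability(n, 8, 8))
    # Smaller quorums: only 5 replicas must be reachable, at the cost of
    # a nonzero chance of reading stale data.
    print(intersect_probability(n, 5, 5))
```

Under these assumptions, shrinking the quorum lowers the number of replicas that must be reachable at once, which helps when many replicas fail together, while keeping the overlap probability high.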

We show the feasibility of our techniques through a combination of analysis, simulation, and implementation within IrisNet, an Internet-scale sensing infrastructure that we have developed as part of this dissertation.