Demystifying the Dark Side of the Middle: A Field Study of Middlebox Failures in Datacenters

  • Rahul Potharaju
  • Navendu Jain

IMC '13 Proceedings of the 2013 conference on Internet measurement conference

Published by ACM - Association for Computing Machinery

Nominated for best paper award, 2013.


Network appliances or middleboxes such as firewalls, intrusion detection and prevention systems (IDPS), load balancers, and VPNs form an integral part of datacenters and enterprise networks. Realizing their importance and shortcomings, the research community has proposed software implementations, policy-aware switching, consolidation appliances, moving middlebox processing to VMs and end hosts, and even offloading it to the cloud. While such efforts can use middlebox failure characteristics to improve their reliability, management, and cost-effectiveness, little has been reported on these failures in the field.

In this paper, we make one of the first attempts to perform a large-scale empirical study of middlebox failures over two years in a service provider network comprising thousands of middleboxes across tens of datacenters. We find that middlebox failures are prevalent and that they can significantly impact hosted services. Several of our findings differ in key aspects from commonly held views: (1) Most failures are grey, dominated by connectivity errors and link flaps that exhibit intermittent connectivity, (2) Hardware faults and overload problems are present but are not in the majority, (3) Middleboxes experience a variety of misconfigurations such as incorrect rules, VLAN misallocation, and mismatched keys, and (4) Middlebox failover is ineffective in about 33% of the cases for load balancers and firewalls due to configuration bugs, faulty failovers, and software version mismatches. Finally, we analyze current middlebox proposals based on our study and discuss directions for future research.