Providing Private Packet Analysis
August 29, 2010 8:30 PM PT

Frank McSherry and Ratul Mahajan probably will be jet-lagged at any Labor Day cookouts. That’s because McSherry, of Microsoft Research Silicon Valley, and Mahajan, of Microsoft Research Redmond, will have just returned from half a world away, after presenting a major paper in New Delhi, India, during SIGCOMM 2010.

From Aug. 30 to Sept. 3, the world’s foremost experts on data communication will share original research on the applications, technologies, architectures, and protocols for computer communication during the event, the Association for Computing Machinery’s annual conference for the Special Interest Group on Data Communication, being held in India for the first time.

One of seven accepted papers authored or co-authored by Microsoft researchers, Differentially-Private Network Trace Analysis, has elicited great interest, as it sheds light on one of the most vexing problems in conducting research on network data: how to analyze the contents of network packets without compromising the privacy of the underlying data.

Their objective, McSherry explains, is to “address the tension inherent in conducting experiments on networking data, such as network-packet traces, that are rife with sensitive information. Researchers want to be able to evaluate techniques that, for example, look at the contents of network packets to see whether there are lots of worms running around on a network, but they do not want to accidentally see people's passwords, email, or other private data.”

McSherry and Mahajan report on their experiences in conducting a diverse set of analyses under differential privacy, a technique that makes it nearly impossible to infer the presence or absence of individual records from the analysis output. But the strong guarantees of differential privacy do impose a cost: privacy is preserved by adding noise to the output of the analysis, which reduces its accuracy. McSherry and Mahajan found, though, that the error introduced for the sake of privacy was often, though not always, low, even at high levels of privacy, and they concluded that differential privacy shows promise for a broad class of network analyses. Their results indicate that differential privacy could enable data owners to let other analysts extract statistical information in a provably private manner.
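
To make the idea of adding noise to an analysis output concrete, here is a minimal Python sketch of the Laplace mechanism, a standard way of releasing a count under differential privacy. It is an illustration only: it is not the PINQ API and not the researchers' trace-analysis tools. The record layout (`dst_port`), the function names, and the choice of `epsilon` are all hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two
    independent exponential samples with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(records, predicate, epsilon, rng):
    """Return the number of records matching `predicate`, plus Laplace noise.

    Adding or removing one record changes a count by at most 1, so noise
    with scale 1/epsilon is the usual calibration for epsilon-differential
    privacy. Smaller epsilon means stronger privacy and more noise.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical packet records: only a destination port is modeled here.
packets = [{"dst_port": 80}] * 120 + [{"dst_port": 445}] * 30

rng = random.Random(2010)  # seeded only so the sketch is repeatable
estimate = noisy_count(packets, lambda p: p["dst_port"] == 445, 0.5, rng)
print(f"noisy count of port-445 packets: {estimate:.1f}")
```

The analyst sees only the noisy estimate, never the individual packets. For a count in the thousands, noise of this size is a small relative error, which is in line with the article's point that accuracy loss is often, though not always, low.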

Unproductive Cycles

“The community has tried several things, such as anonymity, but they haven’t worked,” Mahajan explains. “For instance, soon after someone invents a new method to erase private information, a bunch of papers comes along that demonstrate why that method does not really work. This cycle occurs because the problem is difficult and the methods that folks have been trying are somewhat ad hoc. Differential privacy—which was invented by Frank and others—provides an opportunity to get out of this cycle and to enable safe data sharing, because it starts with a very strict and formal definition of privacy.”

A collaborative effort between a privacy researcher, McSherry, and a systems and networking researcher, Mahajan, the paper brings together two essential viewpoints.

“Our findings,” McSherry says, “show the promise of marrying differential privacy with networking analyses.”

Their research builds on the technology in Privacy Integrated Queries (PINQ), which McSherry pioneered. The release of an updated PINQ tool kit will coincide with SIGCOMM, as will a separate release of the network-trace-analysis tools devised by McSherry and Mahajan.

Both researchers are quick to add that this is far from the final word, but they are excited that their work expands the boundaries of possible network analyses.

“There are a lot of research projects that one could look at and say: ‘Whoa! No way can you get that data. Way too sensitive,’ ” McSherry says, “and yet, you can use the algorithms and tools we have put together to write computer programs to measure the aspects of the data that you are actually interested in, without revealing parts of the data that people certainly shouldn't be looking at.”

As Mahajan succinctly adds: “The coolest thing about our research is that it shows the light at the end of the tunnel. It suggests that the seemingly conflicting goals of privacy and data-driven research can be reconciled. While we are not there yet, the potential is clear.”

Workshop Keynote

Another prime-time SIGCOMM moment for Microsoft Research comes Aug. 30, when Victor Bahl, principal researcher and manager of the Networking Research Group, delivers the keynote address, “A Software Perspective to Energy Management,” for the first SIGCOMM Workshop on Green Networking. The workshop will be focused on issues in designing green networking infrastructures in computing and non-computing domains, including the home, the enterprise, and data-center environments, and exploring interdisciplinary approaches for reducing energy consumption. In his keynote, Bahl will discuss why energy management makes sense economically and environmentally, and he will describe how software solutions such as Microsoft Hohm, Joulemeter, Somniloquy, Sleep Proxy, LiteGreen, Data Center Genome, Gargoyle, and Unified Communications can be used to reduce energy usage, lower greenhouse gas emissions, and cut costs. Jitendra Padhye of Microsoft Research Redmond served as co-chair of this workshop.

Venkat Padmanabhan, principal researcher at Microsoft Research India, served as general co-chair of the conference. Ranjita Bhagwan of Microsoft Research India was publicity co-chair, and Chuanxiong Guo of Microsoft Research Asia, Ram Ramjee of Microsoft Research India, Ant Rowstron of Microsoft Research Cambridge, and Yinglian Xie of Microsoft Research Silicon Valley served on the technical program committee, as did Padhye. Stefan Saroiu of Microsoft Research Redmond and Vishnu Navda of Microsoft Research India were on the Poster and Demo Committee.

In addition to Padhye, several other Microsoft researchers served in leadership positions for SIGCOMM 2010 workshops: Alec Wolman of Microsoft Research Redmond, co-chair of the Workshop on Networking, Systems, and Applications on Mobile Handhelds; Lidong Zhou of Microsoft Research Asia and Chandu Thekkath of Microsoft Research Silicon Valley, co-chair and general chair, respectively, of the Asia-Pacific Workshop on Systems; and Christos Gkantsidis of Microsoft Research Cambridge, co-chair of the Workshop on Home Networks.

SIGCOMM 2010 papers written in whole or in part by Microsoft researchers:

Cloudward Bound: Planning for Beneficial Migration of Enterprise Applications to the Cloud
Mohammad Hajjat, Purdue University; Xin Sun, Purdue University; Yu-Wei Eric Sung, Purdue University; David Maltz, Microsoft Research Redmond; Sanjay Rao, Purdue University; Kunwadee Sripanidkulchai, IBM Research; and Mohit Tawarmalani, Purdue University.

Data Center TCP
Mohammad Alizadeh, Stanford University; Albert Greenberg, Microsoft Research Redmond; David Maltz, Microsoft Research Redmond; Jitu Padhye, Microsoft Research Redmond; Parveen Patel, Microsoft Research Redmond; Balaji Prabhakar, Stanford University; Sudipta Sengupta, Microsoft Research Redmond; and Murari Sridharan, Microsoft.

Differentially-Private Network Trace Analysis
Frank McSherry, Microsoft Research Silicon Valley, and Ratul Mahajan, Microsoft Research Redmond.

Fine-Grained Channel Access in Wireless LAN
Kun Tan, Microsoft Research Asia; Ji Fang, Beijing Jiaotong University; Yuanyang Zhang, Beihang University; Shouyuan Chen, Tsinghua University; Lixin Shi, Tsinghua University; Jiansong Zhang, Microsoft Research Asia; and Yongguang Zhang, Microsoft Research Asia.

Generic and Automatic Address Configuration for Data Center Networks
Kai Chen, Northwestern University; Chuanxiong Guo, Microsoft Research Asia; Haitao Wu, Microsoft Research Asia; Jing Yuan, Tsinghua University; Zhenqian Feng, National University of Defense Technology; Yan Chen, Northwestern University; Songwu Lu, UCLA; and Wenfei Wu, Beihang University.

How Secure are Secure Interdomain Routing Protocols?
Sharon Goldberg, Microsoft Research New England; Michael Schapira, Yale University; Peter Hummon, Princeton University; and Jennifer Rexford, Princeton University.

Symbiotic Routing in Future Data Centers
Hussam Abu-Libdeh, Cornell University; Paolo Costa, Microsoft Research Cambridge; Antony Rowstron, Microsoft Research Cambridge; Greg O'Shea, Microsoft Research Cambridge; and Austin Donnelly, Microsoft Research Cambridge.