|
|
Systems & Networking
Interns and Visiting Researchers
Each summer the Systems and
Networking Research Group at Microsoft Research has several
outstanding students from top schools join the group for research internships.
Many of the research projects they have pursued have resulted in refereed
research publications, including those winning best paper awards.
This page lists the interns who have worked with us in 2003.
2003 Research Interns
This summer, I have been working with Venkat Padmanabhan and Helen
Wang on the CoopNet project. The primary goal of the CoopNet project is
to enable robust and resilient peer-to-peer data dissemination in the
face of peer transience. To this end, it employs Multiple Description
Coding for redundancy and multiple P2P trees for robustness.
The focus this summer was to apply these techniques to a real and
novel application, viz., streaming live game data to a large number of
"spectators" for an online game. We have prototyped an online game which
supports a spectator mode. We hope to deploy the game within a cluster
of machines and measure the quality of the spectating data stream when
peer nodes fail/leave. In an orthogonal direction, we are also exploring
distributed tree management algorithms for building trees which account
for node heterogeneity (fanout constraints).
This summer I worked on three different projects with Atul Adya. The
first project aims to come up with a unified Transmit Power Control (TPC)
scheme that reduces the power consumption of IEEE 802.11 devices. In the
second project I worked with the next generation of IEEE 802.11 cards,
called Native Wi-Fi cards. I built a user-level diagnostic utility that
allows users to set the card in promiscuous mode, set filters for
capturing packets, and exposes these packets to the user. The third
project extends the diagnostic utility to build a management framework
for wireless networks.
This summer I'm helping John Douceur and Jon Howell add some
automated proof-checking/theorem-proving support for Leslie Lamport's TLA+ specification language. The eventual goal is to verify
the specification of the distributed directory service component of the
Farsite DFS. While there have been some attempts at embedding the
temporal component of TLA in existing theorem proving frameworks, such
embeddings fail to account for the expressive extensions to TLA present
in TLA+. In fact, since TLA+ claims to subsume ordinary mathematical
reasoning in typeless set-theory, there can never be a complete
embedding into current theorem proving systems that heavily depend on
the discipline imposed by types. Interpretations of untyped set-theory
in such type-systems, while trivial, does discard the extra semantic
information available for sets like the naturals, and instead use some
axiomatization of naturals in untyped ZF, directly ruling out any
general support for natural numbers present in theorem provers. My
approach is, instead, to discover a typed sub-language of TLA+ that is
expressive enough, albeit with some user interaction, to meet the needs
of the specification; concretely, I translate sets in TLA+ to types in
the Calculus of Inductive Constructions (CIC), an expressive
type-theory, and translate proofs in TLA+ to (incomplete) proof-terms of
the Coq language that implements CIC.
This summer I worked with Dan Simon to design and implement an
audit-enhanced authentication protocol based on Windows Kerberos
Authentication Package. The idea is to modify Windows kernel and
Kerberos DLL so that a piece of extra information can be piggybacked by
authentication traffic. This message-passing infrastructure is
transparent to applications, and thus is a useful mechanism of security
auditing and event-tracking. Major challenges of the implementation are:
to marshal and unmarshal the extra information with Kerberos AP_REQUEST
data, to change the process token structure in the NT kernel, and to
develop a demo of passing information along IE, IIS and SQL without
modifying a single line of application source code. In the demo, we can
set a request ID in the token of IE. When the IE make a request to IIS,
which connects to a back-end SQL server, the request ID is carried all
the way through these tiers. In other words, all threads and processes
involved in handling the request are aware of this information. A
natural extension of the work is to incorporate digital signature with
the extra information so that it is trustworthy and suitable for
security auditing. Many people in the systems and networking group gave
me help throughout the summer, especially Chad Verbowski and Atul Adya.
I am working on collecting and categorizing registry problems for the
Strider Troubleshooter. I created an initial version of the PC Genomics
database which now has a collection of 85 known registry problems as
well as over 200 basic registry entry-to-function mappings. The known
problems have been classified based on their cause, manifestation and
fix. All this information will eventually be used by Strider to improve
its analysis and root cause ranking.
This summer, I'm working with Yi-Min Wang in the Strider group, focusing
on automatic methods of inferring correctness constraints on the
relatively unstructured data stored in the Windows registry. Our goal
is to discover rules that describe the invariants and the dependencies
among the various keys and values stored in the registry. Our approach
is to analyze snapshots of "believed-good" registries in two stages:
- First, we use data-clustering to discover classes of configurations,
such as ActiveX object registrations. We recognize these configuration
classes by looking for repeated patterns of subkeys and values in our
registry snapshots.
- Second, we generate a (large number of) hypotheses about valid
configurations based on the data in our registry snapshots, then
simplify and generalize these hypotheses based on our discovered
configuration classes.
The final output of this analysis is a rule such as (in almost plain
English) "The key located at [.../defaultprinter/] must reference an
instance of the [printer configuration class] defined at
[.../printers/]". Eventually, these rules are meant to be fed into a
monitoring system that constantly checks the registry for correctness,
detecting possible problems before they manifest themselves to the user.
This summer, I am having a great time working on the Herald project.
Herald is a scalable, fault-tolerant, dynamically self-configuring event
notification service. By harnessing the power of geographically
distributed machines, the system aims to provide better performance,
availability and manageability when compared to more traditional
approaches that rely on the small number of well-managed,
well-provisioned clusters. Herald builds upon the existing scalable
peer-to-peer infrastructure, and constructs an application-level
multicast group for each topic. Since Herald allows a large number of
quiescent topics in the system, it is crucial for the scalability of the
system that there is very little non-data-related network traffic.
During the first portion of my internship, I worked on a new
distributed systems "tool" called Delegated Liveness Checking (DLC),
which will be the subject of an upcoming publication. Later on, I
plan to help evaluate the Herald prototype live using the ModelNet
wide-area large-scale emulation environment.
I am working with Venkat Padmanabhan and Dan Simon on implementing a
"secure traceroute" tool for multi-hop wireless mesh networks.
The threat of faulty and/or malicious routers hampering network
communication becomes real in such a setting since there are no physical
barriers protecting the wireless network. A malicious node may insert
itself as a router in the network and then proceed to drop packets
selectively while remaining immune to detection via traditional
techniques. Secure traceroute is a new technique that can be used to
detect such routers, thereby enabling corrective action such as routing
around the suspect routers.
TBD
Wireless Tracing Project with mentor Alec Wolman
This project involves collecting data from access points all over
Microsoft involving around 3200 access points. The information is
collected via SNMP and stored in a SQL database. The data is collected
as frequently as possible and several times minute. The collected
information is then analyzed to find out the usage characteristics of
wireless networks in a real world corporate scenario. Also, once the
data is collected analysis tools have to be written to find out
interesting information hidden in the huge amount of data. The project
right now does not emphasis on packet level traces though that is not
outside the purview of this project. The entire project is being done in
C#.
I am working on converting the trace data that the Farsite group has
been collecting for the past year and half into a file system benchmark.
I will be running statistics across the data and building a database to
query this information, and then use the information to choose
"representative" pieces of the trace for the benchmark itself. I will
also modify the existing replay tools to match a more general benchmark
scenario.
This summer I worked with Victor Bahl in the Mesh networking project.
While the scope of the project is broad, my focus was on the aspect of
improving capacity and connectivity in wireless networks.
I started my internship with implementing a model of Mesh-Box in the
Qualnet simulator. Such a node is equipped with multiple interfaces,
directional antennas and a meshbox network layer. Later I worked on
implementing location discovery protocol for directional antennas. The
use of directional antennas to improve range and connectivity has been
suggested for a long time now. However, we lack protocols to identify
the direction of peers where we can point the directional antenna. The
worked aimed at providing location discovery information of peer nodes,
which can be used by MAC layer. The protocol assumes minimal features
from hardware, and can be realized with the existing technology.
Finally, I worked on implementing a MAC protocol that uses multiple
interfaces for access control. Having multiple channels offers a lot of
potential in improving capacity of network, and we explored the case
when a dedicated control interface is used for control traffic. I
implemented this protocol in the simulator and compared the performance
with 802.11 and other MAC protocols.
Systems and Networking Research Group | Interns and visitors from all years
|