YI-MIN WANG

Director, Search Quality & Cyber-Intelligence Lab

Microsoft Corporation, Redmond, Washington

Last Updated: February 17, 2008

Work Address
Microsoft Corporation, One Microsoft Way, Redmond, WA 98052
Phone: (425) 706-3467; Fax: (425) 936-7329
URL: http://research.microsoft.com/~ymwang
E-mail: ymwang ‘at’ microsoft ‘dot’ com

Vita: http://research.microsoft.com/~ymwang/vita/vita.htm






R&D Areas of Expertise

Search and Ads Quality, Cybersecurity, Systems Management and Diagnostics, Fault Tolerance, Distributed Systems, and Networking.


Highlights

Yi-Min Wang in the New York Times: “Researchers Track Down a Plague of Fake Web Pages,” by John Markoff, March 19, 2007

In eWeek: "Microsoft Unwraps HoneyMonkey Detection Project," eWeek.com, August 5, 2005


Education

University of Illinois at Urbana-Champaign, Urbana, IL
Jan. 1990 -- Aug. 1993
Ph.D. in Electrical and Computer Engineering (G.P.A. 5.00)

University of Illinois at Urbana-Champaign, Urbana, IL
Aug. 1988 -- Jan. 1990
M.S. in Electrical and Computer Engineering (G.P.A. 5.00)

National Taiwan University, Taipei, Taiwan
Aug. 1982 -- May 1986
B.S. in Electrical Engineering (Ranked #1 in a class of 168)


Work Experience

Microsoft Research, Redmond, WA
July 2007 – Present
Director / Principal Researcher, Internet Services Research Center (ISRC)

Microsoft Research, Redmond, WA
March 2006 – July 2007
Principal Researcher, Systems and Networking Research Area

Microsoft Research, Redmond, WA
May 2005 -- July 2007
Group Manager, Cybersecurity and Systems Management Research Group

Microsoft Research, Redmond, WA
March 2004 – May 2005
Group Manager, Systems Management Research Group

Microsoft Research, Redmond, WA
Sep. 2003 – March 2004
Senior Researcher, Systems and Networking Research Group

Microsoft Research, Redmond, WA
Jan. 1998 – Aug. 2003
Researcher, Systems and Networking Research Group

AT&T Labs, Florham Park, NJ
Jan. 1996 -- Jan. 1998
Principal Technical Staff Member, Distributed Systems

AT&T Bell Laboratories, Murray Hill, NJ
Aug. 1993 -- Jan. 1996
Member of Technical Staff, Reliable Distributed Systems

University of Illinois at Urbana-Champaign, Urbana, IL
Jan. 1990 -- Aug. 1993
Research Assistant, Fault-Tolerant Computing

University of Illinois at Urbana-Champaign, Urbana, IL
Aug. 1988 -- Jan. 1990
Research Assistant, Digital Signal Processing

R.O.C. (Taiwan) Navy, Penghu, Taiwan
Oct. 1986 -- Aug. 1988
Electrical Engineer, Industrial Electronics

Long Shine Electronics, Taipei, Taiwan
Mar. 1986 -- Oct. 1986
Hardware Engineer, PC Graphic Systems Design


Projects

  • 2007 – Present: Research & Advanced Development in Search, Advertising, and Online Services
    • Automated Relevance Diagnostics System (ARDS)
    • Live-site Metrics
    • Dynamic Crawler
  • 2002 – 2007: Research & Advanced Development in Cybersecurity and Systems Management
    • Founded the Systems Management Research Group, Microsoft Research, Redmond in 2004
    • Expanded it into the Cybersecurity & Systems Management Research Group in 2005
    • Systems Management Projects

      • Strider Troubleshooter
        • Systematic configuration troubleshooter based on state differencing, access tracing, and statistical analysis (see June 2003 DSN paper, Oct. 2003 LISA Best Paper & Dec. 2004 OSDI paper)
        • Strider File and Registry tracers were shipped as part of Windows Vista.
      • Flight Data Recorder (FDR)
        • Highly efficient and highly compressed always-on tracing of persistent-state accesses for configuration monitoring (see 2006 OSDI paper & 2006 LISA paper)
        • FDR is now deployed on 1,000+ Microsoft production servers and 500+ desktop machines.
      • Patch Impact Analyzer
        • Intersecting always-on persistent-state access trace with patch manifest to predict potential stability impact due to patch installation (see May 2004 ICAC paper)
        • Shipped as part of Windows Vista Application Compatibility Toolkit (ACT)
      • Strider Security Tracer
        • A black-box tracing technique that identifies the causes for least privilege incompatibilities (i.e., application dependencies on Admin privileges) (see Feb. 2005 NDSS paper)
        • Shipped as part of Windows Vista Application Compatibility Toolkit (ACT)

    • Cybersecurity Projects

      • Strider Gatekeeper
        • Proposed a characterization of spyware based on the concept of Auto-Start Extensibility Points (ASEPs) (see Nov. 2004 LISA paper)
        • This project jump-started Microsoft anti-spyware product planning, and the ASEP concept influenced the actual product.
      • Strider GhostBuster
        • Proposed a cross-view diff-based approach to rootkit detection (see June 2005 DSN paper & Dec. 2005 LISA paper)
        • This project jump-started Microsoft anti-rootkit product planning. The GhostBuster tool was deployed on 200,000 internal machines.
      • Strider HoneyMonkey
        • Proposed a black-box, state-change-based, signature-free approach to detecting malicious websites that exploit known and zero-day browser vulnerabilities (see Feb. 2006 NDSS paper)
        • This technology was transferred to the Microsoft security unit, which now operates a production HoneyMonkey system.
      • Strider Typo-Patrol
        • Proposed a traffic redirection-based analysis for detecting large-scale, systematic domain cybersquatters (see July 2006 SRUTI paper)
        • The tool was released at http://research.microsoft.com/URLTracer and has been used by many trademark domain owners to identify cybersquatters.
      • Strider Search Ranger
        • Proposed a “Follow the Money” approach to detecting large-scale search spammers who are corrupting the Web with junk content and websites in order to promote their links to spam content into top search results (see Feb. 2007 NDSS paper, May 2007 WWW paper, and June 2007 ICAC paper)
        • This technology was transferred to the Microsoft Live Search organization and proved very effective in reducing spam in search results.

  • 1997 – 2002: Research in Distributed Systems and Networking
    • COMERA
      • Component Object Model (COM) Extensible Remoting Architecture (see April 1998 COOTS paper)
    • Millennium Falcon
      • Extensive marshaling, runtime, and transport optimizations for Distributed Component Object Model (DCOM) over Virtual Interface Architecture (VIA); achieved 72-microsecond round-trip time & 86-megabytes/sec application bandwidth (see July 1999 NT Symposium paper)
    • Aladdin & Simba
      • Extensible remote home automation and sensing (see August 2000 Windows Systems Symposium paper & July 2001 DSN paper)
    • Panorama – Theory
      • Distributed topology control for wireless multi-hop ad hoc networks, based on directional information; proved that "150 degrees" is a tight upper bound for guaranteeing global network connectivity (see April 2001 INFOCOM paper & August 2001 PODC paper)
    • Masquerade
      • Conducted a 100,000-site experiment to show that traffic information consisting of the number of Web objects and object sizes suffices to identify many Web sites (see May 2002 IEEE SSP paper)
  • 1993 – 1997: Research in Fault Tolerance
    • Progressive Retry
      • Proposed the concept of software error recovery through staged rollbacks with increasing scopes and introduced non-determinism (see June 1993 FTCS paper & April 1995 PDS paper)
    • Checkpointing Library
      • Implemented the first checkpointing library, named LibCkp, that supports rollback of file updates consistent with rollback of memory state (see June 1995 FTCS paper)
      • Deployed inside AT&T Bell Labs as part of the CosMiC idle-workstation hunter and process migration system (see Dec. 1997 paper)
    • Checkpointing Survey
      • Initiated and co-authored the well-known checkpointing survey paper titled “A Survey of Rollback-Recovery Protocols in Message-passing Systems”, published in Sep. 2002 ACM Computing Surveys
    • Rollback Dependency Trackability (RDT) – Theory
      • Introduced the concept of RDT, which captures the important property of allowing transitive dependency tracking to carry full information on rollback dependency set in an online fashion (see April 1997 IEEE TC paper)
      • Opened a brand new research direction with many follow-up papers
    • Software Rejuvenation
      • Proposed and demonstrated the concept of proactively terminating and restarting applications to remove potential bad-state accumulation that could lead to failures and service downtime (see March 1996 paper in AT&T Technical Journal & June 1995 FTCS paper)
    • One-IP
      • An IP routing-based cluster platform for scalable and fault-tolerant Web services (see April 1997 WWW paper)
    • Xept
      • An object-code instrumentation tool for intercepting function calls based on a small C-like specification language (see Nov. 1997 ISSRE paper)

 


Honors and Awards

  • Windows Vista Ship-It Award, 2007
  • Microsoft Early Stock Award, 2006
  • Microsoft Achievement Award, 2005
  • Microsoft Gold Star Award, 2005
  • Microsoft Gold Star Award, 2004
  • Best Paper Award, 17th USENIX Large Installation System Administration (LISA) Conference, 2003
  • Microsoft Gold Star Award, 2000
  • Information Principles Laboratory Award for the work on CosMiC, 1994
  • Robert T. Chien Memorial Award for excellence in research, Graduate College, University of Illinois at Urbana-Champaign, 1993
  • Best Student Paper Award, 12th IEEE Symposium on Reliable Distributed Systems, 1993
  • Best Presentation Award, Center for Reliable and High-Performance Computing Seminars, 1992
  • Dean's list, National Taiwan University, 1983, 1984, 1985 and 1986
  • Member of Phi Tau Phi Honor Society
  • Member of Phi Kappa Phi Honor Society

Patents Issued

1.     "Progressive retry method and apparatus having reusable software modules for software failure recovery in multi-process message-passing applications," U. S. Patent Number 5,440,726, issued on August 8, 1995.

2.     "Input sequence reordering method for software failure recovery," U. S. Patent Number 5,530,802, issued on June 25, 1996.

3.     "Progressive retry method and apparatus for software failure recovery in multi-process message-passing applications," U. S. Patent Number 5,590,277, issued on December 31, 1996.

4.     "Method for software error recovery using consistent global checkpoints," U. S. Patent Number 5,630,047, issued on May 13, 1997.

5.     "Method for deadlock recovery using consistent global checkpoints," U. S. Patent Number 5,664,088, issued on September 2, 1997.

6.     "Distributed recovery with K-optimistic logging," U. S. Patent Number 5,938,775, issued on August 17, 1999.

7.     "Apparatus and methods for sharing idle workstations," U. S. Patent Number 5,978,829, issued on Nov. 2, 1999.

8.     "Client-side parallel requests for network services using group name association," U. S. Patent Number 6,012,090, issued on Jan. 4, 2000.

9.     "Optimistic distributed simulation based on transitive dependency tracking," U. S. Patent Number 6,031,987, issued on Feb. 29, 2000.

10. "Checkpoint and restoration systems for execution control," U. S. Patent Number 6,044,475, issued on March 28, 2000.

11. "Persistent state checkpoint and restoration systems," U. S. Patent Number 6,105,148, issued on August 15, 2000.

12. "Optimistic distributed simulation based on transitive dependency tracking," U. S. Patent Number 6,341,262, issued on January 22, 2002.

13. "Hosting a network service on a cluster of servers using a single-address image," U. S. Patent Number 6,470,389, issued on October 22, 2002.

14. "Device Adapter for Automation System," U. S. Patent Number 6,535,110, issued on March 18, 2003.

15. “Accelerating a distributed component architecture over a network using a modified RPC communication,” U. S. Patent Number 6,708,223, issued on March 16, 2004.

16. “Accelerating a distributed component architecture over a network using a direct marshaling,” U.S. Patent Number 6,826,763, issued on November 30, 2004.

17. “Automation system for controlling and monitoring devices and sensors,” U.S. Patent Number 6,961,763, issued on November 1, 2005.

18. “System and method for protecting privacy and anonymity of parties of network communications,” U.S. Patent Number 6,986,036, issued on January 10, 2006.

19. “Distributed topology control for wireless multi-hop sensor networks,” U.S. Patent Number 6,990,080, issued on January 24, 2006.

20. “Method and system for providing reliability and availability in a distributed component object model (DCOM) object oriented system,” U.S. Patent Number 7,082,553, issued on July 25, 2006.

21. “System and method for evaluating and enhancing source anonymity for encrypted web traffic,” U.S. Patent Number 7,096,200, issued on August 22, 2006.

22. “Pattern- and model-based power line monitoring,” U.S. Patent Number 7,133,729, issued on November 7, 2006.

23. “Weak leader election,” U.S. Patent Number 7,139,790, issued on November 21, 2006.

24. “Event-based Automated Diagnosis of Known Problems,” U.S. Patent Number 7,171,337, issued on January 30, 2007.

 


Professional Activities


Invited Talks

  • Spyware
    • Stanford University, California, November 10, 2005
  • Automated Web Patrol with Strider HoneyMonkeys: Finding Web Sites That Exploit Browser Vulnerabilities
    • University of Illinois, Urbana-Champaign, October 19, 2005
  • STRIDER: A New Approach to Configuration and Security Management
    • UC Berkeley, California, October 28, 2004
    • Georgia Tech, Atlanta, Georgia, November 18, 2004 
  • STRIDER: A Computer Genomics Approach to Systems Management and Support
    • Stanford University, California, May 6, 2003

Publications

Journal papers

1.     Xuxian Jiang, Florian Buchholz, AAron Walters, Dongyan Xu, Yi-Min Wang, Eugene H. Spafford, "Tracing Worm Break-in and Contaminations via Process Coloring: A Provenance-Preserving Approach", in IEEE Transactions on Parallel and Distributed Systems, 19(6), 2008

2.     Ming-Wei Wu, Yi-Min Wang, Yennun Huang, and Sy-Yen Kuo, “Self-Healing Spyware: Detection and Remediation,” in IEEE Transactions on Reliability, Vol. 56, No. 4, December 2007

3.     Xuxian Jiang, Dongyan Xu, Yi-Min Wang, "Collapsar: A VM-Based Honeyfarm and Reverse Honeyfarm Architecture for Network Attack Capture and Detention", Journal of Parallel and Distributed Computing, Special Issue on Security In Grid and Distributed Systems, 66(9), 2006

4.     L. Li, J. Y. Halpern, V. Bahl, Y. M. Wang and R. Wattenhofer, “Analysis of a Cone-Based Distributed Topology Control Algorithm for Wireless Multi-hop Networks,” IEEE/ACM Transactions on Networking, February 2005.

5.     Yi-Min Wang, Chad Verbowski, John Dunagan, Yu Chen, Yuan Chun, Helen J. Wang, and Zheng Zhang, “STRIDER: A Black-box, State-based Approach to Change and Configuration Management and Support,” Science of Computer Programming, Topics in System Administration, Volume 53, Issue 2, November 2004, pp. 143-164.

6.     Yi-Min Wang, Lili Qiu, Chad Verbowski, Dimitris Achlioptas, Gautam Das, and Paul Larson, “Summary-based Routing for Content-based Event Distribution Networks,” Computer Communication Review (CCR), October 2004.

7.     O. P. Damani, Y. M. Wang, and V. K. Garg, “Distributed Recovery with K-optimistic Logging,” in J. Parallel and Distributed Computing, 63, pp. 1193-1218, 2003.

8.     E. N. Elnozahy, L. Alvisi, Y. M. Wang, and D. B. Johnson, “A Survey of Rollback-Recovery Protocols in Message-passing Systems,” ACM Computing Surveys, Vol. 34, Issue 3, pp. 375 – 408, Sept. 2002.

9.     J. Tsai, S.-Y. Kuo, and Y. M. Wang, "Theoretical Analysis for Communication-Induced Checkpointing Protocols with Rollback-Dependency Trackability," in IEEE Trans. on Parallel and Distributed Systems, Vol. 9, No. 10, pp. 963-971, Oct. 1998.

10. O. P. Damani, P. Y. Chung, Y. Huang, C. Kintala, and Y. M. Wang, "ONE-IP: Techniques for hosting a service on a cluster of machines," in Journal of Computer Networks and ISDN Systems, 29, 1019-1027, 1997.

11. Y. M. Wang, Y. Huang, W. K. Fuchs, C. Kintala, and G. Suri, "Progressive retry for software failure recovery in message-passing applications," in IEEE Trans. on Computers, Vol. 46, No. 10, pp. 1137-1141, Oct. 1997.

12. Y. M. Wang, "Consistent global checkpoints that contain a given set of local checkpoints," in IEEE Trans. on Computers, Vol. 46, No. 4, pp. 456-468, April 1997.

13. Y. Huang, C. Kintala, L. Bernstein and Y. M. Wang, "Components for software fault tolerance and rejuvenation," in AT&T Technical Journal, pp. 29-37, March 1996.

14. Y. M. Wang, P. Y. Chung, I. J. Lin, and W. K. Fuchs, "Checkpoint space reclamation for uncoordinated checkpointing in message-passing systems," in IEEE Trans. on Parallel and Distributed Systems, Vol. 6, No. 5, pp. 546-554, May 1995.

15. Y. M. Wang, P. Y. Chung, and W. K. Fuchs, "Scheduling for periodic concurrent error detection in processor arrays," in J. Parallel and Distributed Computing, Vol. 23, No. 3, pp. 306-313, Dec. 1994.

16. P. Y. Chung, Y. M. Wang, and I. N. Hajj, "Logic design errors diagnosis and correction," in IEEE Trans. on VLSI Systems, Vol. 2, No. 3, pp. 320-332, Sep. 1994.

17. Y. M. Wang, H. Lee, and D. V. Apte, "Quantitative NMR spectroscopy by matrix pencil methods," in International Journal of Imaging Systems and Technology, Vol. 4, pp. 201-206, 1992.


Bulletins, letters, and magazine papers

1.     J. Tsai, S. Y. Kuo and Y. M. Wang, "Evaluations on Domino-Free Communication-Induced Checkpointing Protocols," Information Processing Letters, Vol. 69, No. 1, pp. 31-37, Jan. 1999.

2.     P. Bahl, A. Balachandran, A. Miu, W. Russell, G. Voelker and Y. M. Wang, "PAWNs: Satisfying the Need for Ubiquitous Connectivity and Location Services," IEEE Personal Communications Magazine (PCS), Vol. 9, No. 1, February 2002.

3.     Y. M. Wang and P.-Y. E. Chung, "Customization of Distributed Systems Using COM," IEEE Concurrency Magazine, Vol. 6, No. 3, pp. 8-12, July-Sep. 1998.

4.     P. E. Chung, Y. Huang, S. Yajnik, D. Liang, J. C. Shih, C.-Y. Wang, and Y. M. Wang, "DCOM and CORBA Side by Side, Step By Step, and Layer by Layer," C++ Report, Vol. 10, No. 1, pp. 18-29,40, Jan. 1998.

5.     Y. Huang, C. Kintala, and Y. M. Wang, "Software Tools and Libraries for Fault Tolerance," in Bulletin of the Technical Committee on Operating Systems and Application Environment (TCOS), Vol. 7, No. 4, pp. 5-9, Winter, 1995.

6.     Y. M. Wang, A. Lowry, and W. K. Fuchs, "Consistent global checkpoints based on direct dependency tracking," Information Processing Letters, Vol. 50, No. 4, pp. 223-230, May 1994.


Conference papers

1.     Xuxian Jiang, Helen Wang, Dongyan Xu and Yi-Min Wang, “RandSys: A Two-Dimensional Randomization Approach to Thwarting Code Injection Attacks,” in Proc. SRDS, 2007

2.     Yi-Min Wang and Ming Ma, “Strider Search Ranger: Towards an Autonomic Anti-Spam Search Engine,” in Proc. ICAC, June 2007

3.     Yi-Min Wang, Ming Ma, Yuan Niu, and Hao Chen, “Spam Double-Funnel: Connecting Web Spammers with Advertisers,” in Proc. WWW, May 2007

4.     Yuan Niu, Yi-Min Wang, Hao Chen, Ming Ma, and Francis Hsu, “A Quantitative Study of Forum Spamming Using Context-based Analysis,” in Proc. NDSS, February 2007

5.     Ming-Wei Wu, Yennun Huang, Yi-Min Wang, Sy-Yen Kuo, “A Stateful Approach to Spyware Detection and Removal,” in Proc. PRDC, Dec. 2006.

6.     Chad Verbowski, Juhan Lee, Xiaogang Liu, Roussi Roussev, and Yi-Min Wang, “LiveOps: Systems Management as a Service,” in Proc. Large Installation System Administration (LISA) Conference, 2006.

7.