Statistical Identification of Encrypted Web Browsing Traffic

Encryption is often proposed as a tool for protecting the privacy of World Wide Web browsing. However, encryption - particularly as typically implemented in, or in concert with, popular Web browsers - does not hide all information about the encrypted plaintext. Specifically, the number and sizes of HTTP objects are often revealed (or at least incompletely concealed). We investigate the identifiability of World Wide Web traffic based on this unconcealed information in a large sample of Web pages, and show that it suffices to identify a significant fraction of them quite reliably. We also suggest some possible countermeasures against the exposure of this kind of information and experimentally evaluate their effectiveness.
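To make the idea of identifying a page from its object count and sizes concrete, here is a minimal sketch in Python. It is not the method evaluated in the paper: the similarity measure (Jaccard similarity over multisets of object sizes), the threshold value, the function names, and the sample fingerprints are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical fingerprints: page label -> sizes (bytes) of its HTTP objects.
# These values are invented for illustration only.
FINGERPRINTS = {
    "news": [5120, 834, 834, 12040],
    "mail": [2048, 977],
}

def multiset_similarity(a, b):
    """Jaccard similarity between two multisets of object sizes."""
    ca, cb = Counter(a), Counter(b)
    inter = sum((ca & cb).values())  # sizes present in both, with multiplicity
    union = sum((ca | cb).values())  # sizes present in either, with multiplicity
    return inter / union if union else 1.0

def identify(observed, fingerprints, threshold=0.7):
    """Return (best_page, score), or (None, score) if below the assumed threshold."""
    best_page, best_score = None, 0.0
    for page, sizes in fingerprints.items():
        score = multiset_similarity(observed, sizes)
        if score > best_score:
            best_page, best_score = page, score
    if best_score >= threshold:
        return best_page, best_score
    return None, best_score
```

Under these assumptions, an observer who sees encrypted transfers of 5120, 834, and 12040 bytes would call `identify([5120, 834, 12040], FINGERPRINTS)` and obtain the label `"news"` with a score of 0.75, even though one object was missed. A real attack would also have to cope with encryption overhead, padding, and caching, which this sketch ignores.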

Publisher: Institute of Electrical and Electronics Engineers, Inc.
© 2002 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.


Institution: Microsoft Research
