Knowledge Tools
The Knowledge Tools group started in July 2004. Our mission is to improve
the data/human interface. People at work have to cope with large, confusing
data sets in order to get their jobs done and make decisions. We want to build
tools to help these people cope with data complexity. To build these tools, we
must make advances in three areas:
- Programming languages/tools
- Machine learning algorithms that scale to large data sets
- User interfaces and visualization for interacting with data
We believe that several different types of knowledge workers can benefit
from our tools:
- Security analysts must monitor large event logs and gigantic flows
of network traffic
- System administrators need to understand the complex activity of
large server farms
- Decision makers want to have relevant information at their
fingertips, without having to search
To make research progress, we build prototype tools and get them into the
hands of these types of users. We build many of our prototype tools on top
of IronPython, a version of Python for .NET.
We have analyzed system and network behavior, in order to build more
secure and efficient networks and computers:
-
Fast Variational Inference for Large-Scale Internet Diagnosis by J.C. Platt, E. Kiciman, D.A. Maltz, Advances in Neural Information Processing Systems 20, (2008).
-
Why did my PC suddenly slow down? by S. Basu, J. Dunagan, G. Smith, Proc. SysML (2007).
-
Analyzing and Improving a BitTorrent Network's Performance Mechanisms
by A. Bharambe, C. Herley, V.N. Padmanabhan, Proc. InfoCom, (2006).
-
Mining Web Logs to Debug Distant Connectivity Problems by E.
Kiciman, D.A. Maltz, M. Goldszmidt, J.C. Platt, SIGCOMM Workshop on
Mining Network Data, (2006).
-
Automatically Extracting Fields from Unknown Network Protocols
by K. Gopalratnam, S. Basu, J. Dunagan, H. Wang, Proc. Systems and
Machine Learning Workshop, (2006).
-
Some Observations on BitTorrent by A. Bharambe, C. Herley, V.N.
Padmanabhan, ACM Sigmetrics, (2005).
-
Automatic Misconfiguration Troubleshooting with PeerPressure
by H.J. Wang, J. Platt, Y. Chen, R. Zhang, Y.-M. Wang, Proc. 6th Symposium on Operating System Design and Implementation, (2004).
We have published papers on security at scale: exploiting data statistics
to make systems and networks more secure. Publications in this area include:
-
Protecting Financial Institutions from Brute-Force Attacks by C. Herley and D. Florêncio, Proc. SEC, (2008).
-
Can Something-You-Know be Saved? by B. Coskun, C. Herley, Proc. ISC, (2008).
-
A Large Scale Study of Automated Web Search Traffic by G. Buehrer, J. Stokes, K. Chellapilla, Proc. Int'l Workshop on Adversarial Information Retrieval on the Web, (2008).
-
Do Strong Web Passwords Accomplish Anything? by D. Florêncio, C. Herley, and B. Coskun, Proc. USENIX HotSEC, (2007).
-
Evaluating Password Re-Use for Phishing Prevention by D. Florêncio, C. Herley, Proc. APWG eCrime, (2007).
-
A Large Scale Study of Web Password Habits by D. Florêncio,
C. Herley, Proc. WWW, (2007).
-
KLASSP: Entering Passwords on a Spyware Infected Machine using a
Shared-Secret Proxy by D. Florêncio,
C. Herley, Proc. ACSAC, (2006).
-
How to Login from an Internet Cafe without Worrying about Keyloggers
by D. Florêncio, C. Herley, Symp. on Usable
Privacy and Security, (2006).
-
Password Rescue: A New Approach to Phishing Prevention by
D. Florêncio, C. Herley, 1st USENIX
Workshop on Hot Topics in Security, pp. 7-11, (2006).
-
Analyzing and Improving Anti-Phishing Schemes by D. Florêncio,
C. Herley, Proc. SEC, (2006).
We have studied how to help knowledge workers maintain awareness of important
information:
-
BLEWS: Using Blogs to Provide Context for News Articles
by M. Gamon, S. Basu, D. Belenko, D. Fisher, M. Hurst, A.C. König, Proc. Int'l Conf. on Weblogs and Social Media, (2008).
-
Scalable Summaries of Spoken Conversations
by S. Basu, S. Gupta, M. Mahajan, P. Nguyen, J.C. Platt,
Proc. Intelligent User Interfaces, (2008).
-
Selective Supervision: Guiding Supervised Learning with Decision-Theoretic Active Learning by A. Kapoor, E. Horvitz, S. Basu, IJCAI, (2007).
-
Parsing Ink Annotations on Heterogeneous Documents by X. Wang,
M. Shilman, S. Raghupathy, Eurographics Workshop on Sketch-Based Interfaces and Modeling, (2006).
-
Incremental Aspect Models for Mining Document Streams by A.C. Surendran,
S. Sra, ECML/PKDD, (2006).
-
SWISH: Semantic Analysis of Window Titles and Switching History by N.
Oliver, G. Smith, C. Thakkar, A.C. Surendran, Int'l Conference on Intelligent User Interfaces, (2006).
-
Automatic Discovery of Personal
Topics to Organize Email,
by A.C. Surendran, J.C. Platt, E. Renshaw, 2nd Conference on Email and Anti-Spam, (2005).
-
Modeling Conversational Dynamics as a Mixed-Memory Markov Process by T. Choudhury, S. Basu, Proc. NIPS, Vol. 17 (2004).
Finally, we have developed many general-purpose machine learning algorithms
that help us build these applications:
-
Fast Low-Rank Semidefinite Programming for Embedding and Clustering by B. Kulis, A.C. Surendran, J.C. Platt, AISTATS, (2007).
-
Online Decoding of Markov Models under Latency Constraints by M. Narasimhan, P. Viola, M. Shilman, ICML, (2006).
-
Multiple Instance Boosting for Object Detection by P. Viola, J.C. Platt,
C. Zhang, NIPS, Vol. 18, pp. 1417-1426, (2006).
-
Redundant Bit Vectors for Quickly Searching High-Dimensional Regions,
by J. Goldstein, J.C. Platt, C.J.C. Burges, Proc. Sheffield Machine Learning Workshop, Springer Lecture Notes in Computer Science 3635, (2005).
-
Extensions of the Informative Vector Machine by
N.D. Lawrence, J.C. Platt, M.I. Jordan, Proc. Sheffield Machine Learning Workshop, Springer Lecture Notes in Computer Science 3635, (2005).
-
Learning to Rank using Gradient Descent, by
C.J.C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, G. Hullender, 22nd International Conference on Machine Learning, (2005).
-
FastMap, MetricMap, and Landmark MDS are all Nystrom Algorithms, by J.C. Platt, Proc. 10th International Workshop on Artificial Intelligence and Statistics, pp. 261-268, (2005).
-
Learning to Learn with the Informative Vector Machine, by N.D. Lawrence, J.C. Platt, International Conference on Machine Learning,
Paper No. 65, (2004).
For our publications on machine learning related to media, please see our Statistical Media Processing Publications page.