Text Mining, Search, and Navigation Research
The Text Mining, Search, and Navigation group (TMSN) works with large text collections, such as the Web, help and support sites, discussion groups, e-mail, and intranets. We aim to extract and use knowledge from and about these collections and to help people find information within them more effectively.
A key area of our research is improving the efficiency of a user's interaction with large collections of information, so they can more easily find/explore/learn/share/buy/sell/meet/edit/express.
Our long-term goal is to develop a new foo, where foo is to today's search engine as a car is to a horse and buggy (where the horse is super old, and the buggy is headache-inducingly squeaky and has two bad tires).
We also work on somewhat shorter-term projects, such as: improving the relevance and the reliability of our search functionality, gaining more understanding of the way search is used, building really cool infrastructure to make it easier to process zillions of bytes of data efficiently, etc., and of course, top secret project ZGX-103.
Primary Contact: Eric Brill
Galen
Misha
Eric
Chris J.C.
Ronnie
Raman
Ken
Silviu-Petru
Imig, Scott
John
Bill
Matthew
Robert
Krysta
Ryen
Qiang
Dengyong

Affiliate Members
Susan
Tie-Yan
Bob
-
Jacob R. Lorch, Atul Adya, William J. Bolosky, Ronnie Chaiken, John R. Douceur, Jon Howell.
The SMART way to migrate replicated stateful services
April 2006
Leuven, Belgium
Proceedings of the 2006 EuroSys Conference
103-115
-
Ryen W. White, Diane Kelly.
A study on the effects of personalization and task information on implicit feedback performance
November 2006
New York, NY, USA
CIKM '06: Proceedings of the 15th ACM international conference on Information and knowledge management
297-306
-
M. Taylor, H. Zaragoza, N. Craswell, S. Robertson, C. Burges.
Optimisation methods for ranking functions with multiple parameters
2006
New York
CIKM 2006: Proceedings of the 15th ACM Conference on Information and Knowledge Management
585-593
-
C.J.C. Burges.
Geometric Methods for Feature Selection and Dimensional Reduction
2005, to appear
Data Mining and Knowledge Discovery Handbook: A Complete Guide for Practitioners and Researchers
L. Rokach and O. Maimon
-
C.J.C. Burges, D. Plastina, J.C. Platt, E. Renshaw, H. Malvar.
Using Audio Fingerprinting for Duplicate Detection and Audio Thumbnail Generation
2005, to appear
Proc. IEEE Conference on Acoustics, Speech and Signal Processing
-
R. Soricut, E. Brill.
A Unified Framework for Automatic Evaluation using N-gram Co-Occurrence Statistics
2004
Proceedings of ACL 2004
-
R. Soricut, E. Brill.
Automatic Question-Answering: Beyond the Factoid
2004
Proceedings of HLT 2004
-
W. Xi, J. Lind, E. Brill.
Learning Effective Ranking Functions for Newsgroup Search
2004
Proceedings of SIGIR 2004
-
S. Cucerzan, E. Brill.
Spelling correction as an iterative process that exploits the collective knowledge of web users
2004
Proceedings of EMNLP 2004
293-300
-
R. Chandrasekar, H. Chen, S. Corston-Oliver, E. Brill.
Subwebs for Specialized Search
2004
Proceedings of SIGIR 2004
-
H. Daume, E. Brill.
Web Search Intent Induction via Automatic Query Reformulation
2004
Proceedings of HLT 2004
-
C.J.C. Burges.
Some Notes on Applied Mathematics for Machine Learning
2004
Advanced Lectures on Machine Learning
O. Bousquet and U. von Luxburg and G. Rätsch
21-40
-
D. Azari, E. Horvitz, S. Dumais, E. Brill.
Web-based question answering: A decision making perspective
2003
Proceedings of UAI 2003
-
S. Cucerzan, D. Yarowsky.
Minimally Supervised Induction of Grammatical Gender
2003
Proceedings of HLT-NAACL 2003: Main Conference
40-47
-
Matthew Richardson, Pedro Domingos.
Building Large Knowledge Bases by Mass Collaboration
2003
Sanibel Island, FL
Proceedings of the Second International Conference on Knowledge Capture
129-137
-
M. Richardson, P. Domingos.
Learning with Knowledge from Multiple Experts
2003
Washington, DC
Proceedings of the Twentieth International Conference on Machine Learning
624-631
-
Matthew Richardson, Rakesh Agrawal, Pedro Domingos.
Trust Management for the Semantic Web
2003
Sanibel Island, FL
Proceedings of the Second International Semantic Web Conference
351-368
-
C.J.C. Burges, J.C. Platt, S. Jana.
Distortion Discriminant Analysis for Audio Fingerprinting
2003
IEEE Transactions on Speech and Audio Processing
Volume 11, Number 3, pages 165-174
-
C.J.C. Burges, D. Crisp.
Uniqueness Theorems for Kernel Methods
2003
Neurocomputing
Volume 55, Number 1-2, pages 187-220
-
E. Brill, S. Dumais, M. Banko.
An Analysis of the AskMSR Question-Answering System
2002
Proceedings of EMNLP 2002
-
S. Dumais, M. Banko, E. Brill, J. Lin, A. Ng.
Web question answering: Is more always better?
2002
Proceedings of SIGIR 2002
-
S. Cucerzan, D. Yarowsky.
Augmented Mixture Models for Lexical Disambiguation
2002
Proceedings of EMNLP 2002
33-40
-
S. Cucerzan, D. Yarowsky.
Bootstrapping a Multilingual Part-of-speech Tagger in One Person-day
2002
Proceedings of CoNLL 2002
132-138
-
S. Cucerzan, D. Yarowsky.
Language Independent NER using a Unified Model of Internal and Contextual Evidence
2002
Proceedings of CoNLL 2002
171-175
-
Matthew Richardson, Pedro Domingos.
The Intelligent Surfer: Probabilistic Combination of Link and Content Information in PageRank
2002
Cambridge, MA
Advances in Neural Information Processing Systems 14
T. G. Dietterich and S. Becker and Z. Ghahramani
1441-1448
-
Matthew Richardson, Pedro Domingos.
Mining Knowledge-Sharing Sites for Viral Marketing
2002
Edmonton, Canada
Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
61-70
-
C.J.C. Burges, J.C. Platt, S. Jana.
Extracting Noise-Robust Features from Audio Data
2002
Proc. IEEE Conference on Acoustics, Speech and Signal Processing
1021-1024
-
C.J.C. Burges.
Factoring as Optimization
2002
Microsoft Research
MSR-TR-2002-83
-
Atul Adya, William J. Bolosky, Miguel Castro, Gerald Cermak, Ronnie Chaiken, John R. Douceur, Jon Howell, Jacob R. Lorch, Marvin Theimer, Roger P. Wattenhofer.
FARSITE: Federated, available, and reliable storage for an incompletely trusted environment
Dec. 2002
Boston, MA
Proceedings of the 5th Symposium on Operating Systems Design and Implementation (OSDI)
1-14
-
E. Brill, J. Lin, M. Banko, S. Dumais, A. Ng.
Data-intensive question answering
2001
Proceedings of TREC 2001
-
E. Brill, G. Kacmarcik, C. Brockett.
Learning to Extract Katakana-English Word Pairs from Non-Aligned Web Queries Using a Noisy-Channel Model of Back-Transliteration
2001
Proceedings of NLPRS 2001
-
M. Banko, E. Brill.
Mitigating the Paucity-of-Data Problem: Exploring the Effect of Training Corpus Size on Classifier Performance for Natural Language Processing
2001
Proceedings of HLT 2001
-
M. Banko, E. Brill.
Scaling to Very Very Large Corpora for Natural Language Disambiguation
2001
Proceedings of ACL 2001
-
Pedro Domingos, Matthew Richardson.
Mining the Network Value of Customers
2001
San Francisco, CA
Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
57-66
-
E. Brill, R. C. Moore.
An Improved Error Model for Noisy Channel Spelling Correction
2000
Proceedings of ACL 2000
-
E. Brill.
Pattern-Based Disambiguation for Natural Language Processing
2000
Proceedings of EMNLP/VLC 2000
-
D.J. Crisp, C.J.C. Burges.
A Geometric Interpretation of ν-SVM Classifiers
2000
NIPS
Volume 12, pages 244-250
-
C.J.C. Burges, D.J. Crisp.
Uniqueness of the SVM Solution
2000
NIPS
Volume 12, pages 223-229
-
C.J.C. Burges.
Geometry and Invariance in Kernel Based Methods
1999
Advances in Kernel Methods: Support Vector Learning
B. Schölkopf and C.J.C. Burges and A.J. Smola
89-116
-
C.J.C. Burges.
A Tutorial on Support Vector Machines for Pattern Recognition
1998
Data Mining and Knowledge Discovery
Volume 2, Number 2, pages 121-167
-
C. J. C. Burges, B. Schölkopf.
Improving the accuracy and speed of support vector learning machines
1997
Cambridge, MA
Advances in Neural Information Processing Systems 9
M. Mozer and M. Jordan and T. Petsche
375-381
-
C.J.C. Burges.
Simplified Support Vector Decision Rules
1996
Bari, Italy
Proceedings of the Thirteenth International Conference on Machine Learning
Lorenza Saitta
71-77
-
C.J.C. Burges, O. Matan, Y. Le Cun, J.S. Denker, L.D. Jackel, C.E. Stenard, C.R. Nohl, J.I. Ben.
Shortest Path Segmentation: A Method For Training a Neural Network to Recognize Character Strings
1992
IJCNN Conference Proceedings
Volume 3, pages 165-172