Text Mining, Search, and Navigation Research
The Text Mining, Search, and Navigation group (TMSN) works with large text collections, such as the Web, help and support sites, discussion groups, e-mail, and intranets. We aim to extract and use knowledge from and about these collections and to help people find information in them more effectively.
A key area of our research is improving the efficiency of a user's interaction with large collections of information, so they can more easily find/explore/learn/share/buy/sell/meet/edit/express.
Our long-term goal is to develop a new foo, where foo is to today's search engine as a car is to a horse and buggy (where the horse is super old, and the buggy is headache-inducingly squeaky and has two bad tires).
We also work on somewhat shorter-term projects, such as: improving the relevance and the reliability of our search functionality, gaining more understanding of the way search is used, building really cool infrastructure to make it easier to process zillions of bytes of data efficiently, etc., and of course, top secret project ZGX-103.
Primary Contact: Chris J.C. Burges
Galen
Misha
Eric
Chris J.C.
Ronnie
Raman
Ken
Silviu-Petru
Ofer
Imig, Scott
John
Pastusiak, Andrzej
Bill
Matthew
Robert
Krysta
Ryen
Qiang
Dengyong
Affiliate Members
Susan
Tie-Yan
Bob
-
W. Dakka, S. Cucerzan.
Augmenting Wikipedia with Named Entity Tags
2008
Proceedings of IJCNLP 2008
-
P. Li, C.J.C. Burges, Q. Wu.
Learning to Rank Using Classification and Gradient Boosting
2008
Advances in Neural Information Processing Systems 20
-
M. Bilenko, R.W. White, M. Richardson, G.C. Murray.
Talking the Talk vs. Walking the Walk: Salience of Information Needs in Querying vs. Browsing
2008
Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR-2008)
-
M. Bilenko, R.W. White.
Mining the Search Trails of Surfing Crowds: Identifying Relevant Websites From User Activity
2008
Proceedings of the 17th International World Wide Web Conference (WWW-2008)
-
R.W. White, M. Bilenko, S. Cucerzan.
Leveraging Popular Destinations to Enhance Web Search Interaction
2008
ACM Transactions on the Web
-
P. Singla, M. Richardson.
Yes, There is a Correlation - From Social Networks to Personal Behavior on the Web
2008
Proceedings of the 17th International World Wide Web Conference (WWW-2008)
-
S. Cucerzan.
Large-Scale Named Entity Disambiguation Based on Wikipedia Data
2007
Proceedings of EMNLP-CoNLL 2007
708--716
-
A. Jain, S. Cucerzan, S. Azzam.
Acronym-Expansion Recognition and Ranking on the Web
2007
Proceedings of IEEE-IRI 2007
-
R.W. White, M. Bilenko, S. Cucerzan.
Studying the Use of Popular Destinations to Enhance Web Search Interaction
2007
Proceedings of SIGIR 2007
159--166
-
S. Cucerzan, R.W. White.
Query Suggestion based on User Landing Pages
2007
Proceedings of SIGIR 2007
875-876
-
R.W. White, C.L.A. Clarke, S. Cucerzan.
Comparing Query Logs and Pseudo-Relevance Feedback for Web-Search Query Refinement
2007
Proceedings of SIGIR 2007
831--832
-
K.M. Svore, L. Vanderwende, C.J.C. Burges.
Enhancing Single-document Summarization by Combining RankNet and Third-party Sources
2007
Proceedings of Empirical Methods in Natural Language Processing (EMNLP)
-
K.M. Svore, Q. Wu, C.J.C. Burges, A. Raman.
Improving Web Spam Classification using Rank-time Features
2007
Proceedings of Adversarial Information Retrieval on the Web (AIRWeb)
-
D. Zhou, C.J.C. Burges.
Spectral Clustering and Transductive Learning with Multiple Views
2007
Proceedings of the 24th International Conference on Machine Learning
-
D. Zhou, C.J.C. Burges, Tao.
Transductive Link Spam Detection
2007
Proceedings of Adversarial Information Retrieval on the Web (AIRWeb)
-
C.J.C. Burges, R. Ragno, Q.V. Le.
Learning to Rank with Non-Smooth Cost Functions
2007
Advances in Neural Information Processing Systems 19
-
G. Andrew, J. Gao.
Scalable training of L1-regularized log-linear models
2007
Proceedings of the 24th International Conference on Machine Learning (ICML)
-
J. Gao, G. Andrew, M. Johnson, K. Toutanova.
A comparative study of parameter estimation methods for statistical natural language processing
2007
Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL)
-
A. Edmonds, R.W. White, D. Morris, S.M. Drucker.
Instrumenting the Dynamic Web
2007
Journal of Web Engineering
243--260
-
G. Marchionini, R.W. White.
Find What You Need, Understand What You Find
2007
International Journal of Human-Computer Interaction
205-237
-
R.W. White, G. Marchionini.
Examining the Effectiveness of Real-Time Query Expansion
2007
Information Processing and Management
685-704
-
R.W. White, S.M. Drucker.
Investigating the Querying and Browsing Behavior of Advanced Search Engine Users
2007
Proceedings of SIGIR
-
C.L.A. Clarke, E. Agichtein, S. Dumais, R.W. White.
The Influence of Caption Features on Clickthrough Patterns in Web Search
2007
Proceedings of SIGIR
-
R.W. White, S.M. Drucker.
Investigating Behavioral Variability in Web Search
2007
Proceedings of the 16th International World Wide Web Conference (WWW)
-
R.W. White, S.M. Drucker, G. Marchionini, M. Hearst, mc schraefel.
Exploratory Search and HCI: Designing and Evaluating Interfaces to Support Exploratory Search Interaction
2007
ACM SIGCHI Conference on Human Factors in Computing Systems (CHI)
-
M. Melucci, R.W. White.
Utilizing a Geometry of Context For Enhanced Implicit Feedback
2007
Proceedings of the 16th Annual ACM CIKM Conference on Information and Knowledge Management (CIKM)
-
M. Melucci, R.W. White.
Discovering Hidden Contextual Factors for Implicit Feedback
2007
Workshop on Contextual Information Retrieval (part of the 6th International and Interdisciplinary Conference on Modeling and Using Context)
-
M. Richardson, E. Dominowska, R. Ragno.
Predicting Clicks: Estimating the Click-Through Rate for New Ads
2007
Proceedings of the 16th International World Wide Web Conference (WWW-2007)
-
Jacob R. Lorch, Atul Adya, William J. Bolosky, Ronnie Chaiken, John R. Douceur, Jon Howell.
The SMART way to migrate replicated stateful services
April 2006
Leuven, Belgium
Proceedings of the 2006 EuroSys Conference
103--115
-
Ryen W. White, Diane Kelly.
A study on the effects of personalization and task information on implicit feedback performance
November 2006
New York, NY, USA
CIKM '06: Proceedings of the 15th ACM international conference on Information and knowledge management
297--306
-
M. Taylor, H. Zaragoza, N. Craswell, S. Robertson, C. Burges.
Optimisation methods for ranking functions with multiple parameters
2006
New York
CIKM 2006: Proceedings of the 15th ACM Conference on Information and Knowledge Management
585--593
-
I. Matveeva, C.J.C. Burges, T. Burkard, A. Laucius, L. Wong.
High accuracy retrieval with multiple nested rankers
2006
Proceedings of SIGIR
-
G. Andrew.
A hybrid Markov/semi-Markov conditional random field for sequence segmentation
2006
Proceedings of Empirical Methods in Natural Language Processing (EMNLP)
-
M. Bilenko, B. Kamath, R.J. Mooney.
Adaptive Blocking: Learning to Scale Up Record Linkage and Clustering
2006
Proceedings of the 6th IEEE International Conference on Data Mining (ICDM-2006)
-
M. Richardson, A. Prakash, E. Brill.
Beyond PageRank: Machine Learning for Static Ranking
2006
Proceedings of the 15th International World Wide Web Conference (WWW-2006)
-
C. Konig, E. Brill.
Reducing the human overhead in text categorization
2006
Proceedings of KDD 2006
-
S. Vassilvitskii, E. Brill.
Using Web-Graph Distance for Relevance Feedback in Web Search
2006
Proceedings of SIGIR 2006
-
E. Agichtein, E. Brill, S. Dumais.
Improving Web Search Ranking by Incorporating User Behavior
2006
Proceedings of SIGIR 2006
-
E. Agichtein, E. Brill, S. Dumais, R. Ragno.
Learning User Interaction Models for Predicting Web Search Result Preferences
2006
Proceedings of SIGIR 2006
-
C.J.C. Burges.
Geometric Methods for Feature Selection and Dimensional Reduction
2005, to appear
Data Mining and Knowledge Discovery Handbook: A Complete Guide for Practitioners and Researchers
L. Rokach and O. Maimon
-
C.J.C. Burges, D. Plastina, J.C. Platt, E. Renshaw, H. Malvar.
Using Audio Fingerprinting for Duplicate Detection and Audio Thumbnail Generation
2005, to appear
Proc. IEEE Conference on Acoustics, Speech and Signal Processing
-
E. Agichtein, S. Cucerzan, E. Brill.
Analysis of Factoid Questions for Effective Relation Extraction
2005
Proceedings of SIGIR 2005
-
R. Soricut, E. Brill.
Automatic Question Answering Using the Web: Beyond the Factoid
2005
Journal of Information Retrieval - Special Issue on Web Information Retrieval
-
R. Soricut, E. Brill.
A Unified Framework for Automatic Evaluation using N-gram Co-Occurrence Statistics
2004
Proceedings of ACL 2004
-
R. Soricut, E. Brill.
Automatic Question-Answering: Beyond the Factoid
2004
Proceedings of HLT 2004
-
W. Xi, J. Lind, E. Brill.
Learning Effective Ranking Functions for Newsgroup Search
2004
Proceedings of SIGIR 2004
-
S. Cucerzan, E. Brill.
Spelling correction as an iterative process that exploits the collective knowledge of web users
2004
Proceedings of EMNLP 2004
293-300
-
R. Chandrasekar, H. Chen, S. Corston-Oliver, E. Brill.
Subwebs for Specialized Search
2004
Proceedings of SIGIR 2004
-
H. Daume, E. Brill.
Web Search Intent Induction via Automatic Query Reformulation
2004
Proceedings of HLT 2004
-
C.J.C. Burges.
Some Notes on Applied Mathematics for Machine Learning
2004
Advanced Lectures on Machine Learning
O. Bousquet and U. von Luxburg and G. Rätsch
21--40
-
D. Azari, E. Horvitz, S. Dumais, E. Brill.
Web-based question answering: A decision making perspective
2003
Proceedings of UAI 2003
-
S. Cucerzan, D. Yarowsky.
Minimally Supervised Induction of Grammatical Gender
2003
Proceedings of HLT-NAACL 2003: Main Conference
40-47
-
Matthew Richardson, Pedro Domingos.
Building Large Knowledge Bases by Mass Collaboration
2003
Sanibel Island, FL
Proceedings of the Second International Conference on Knowledge Capture
129-137
-
M. Richardson, P. Domingos.
Learning with Knowledge from Multiple Experts
2003
Washington, DC
Proceedings of the Twentieth International Conference on Machine Learning
624-631
-
Matthew Richardson, Rakesh Agrawal, Pedro Domingos.
Trust Management for the Semantic Web
2003
Sanibel Island, FL
Proceedings of the Second International Semantic Web Conference
351--368
-
C.J.C. Burges, J.C. Platt, S. Jana.
Distortion Discriminant Analysis for Audio Fingerprinting
2003
IEEE Transactions on Speech and Audio Processing
Vol. 11, No. 3, pp. 165--174
-
C.J.C. Burges, D. Crisp.
Uniqueness Theorems for Kernel Methods
2003
Neurocomputing
Vol. 55, No. 1-2, pp. 187--220
-
E. Brill, S. Dumais, M. Banko.
An Analysis of the AskMSR Question-Answering System
2002
Proceedings of EMNLP 2002
-
S. Dumais, M. Banko, E. Brill, J. Lin, A. Ng.
Web question answering: Is more always better?
2002
Proceedings of SIGIR 2002
-
S. Cucerzan, D. Yarowsky.
Augmented Mixture Models for Lexical Disambiguation
2002
Proceedings of EMNLP-2002
33-40
-
S. Cucerzan, D. Yarowsky.
Bootstrapping a Multilingual Part-of-speech Tagger in One Person-day
2002
Proceedings of CoNLL 2002
132-138
-
S. Cucerzan, D. Yarowsky.
Language Independent NER using a Unified Model of Internal and Contextual Evidence
2002
Proceedings of CoNLL 2002
171-175
-
Matthew Richardson, Pedro Domingos.
The Intelligent Surfer: Probabilistic Combination of Link and Content Information in PageRank
2002
Cambridge, MA
Advances in Neural Information Processing Systems 14
T. G. Dietterich and S. Becker and Z. Ghahramani
1441-1448
-
Matthew Richardson, Pedro Domingos.
Mining Knowledge-Sharing Sites for Viral Marketing
2002
Edmonton, Canada
Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
61-70
-
C.J.C. Burges, J.C. Platt, S. Jana.
Extracting Noise-Robust Features from Audio Data
2002
Proc. IEEE Conference on Acoustics, Speech and Signal Processing
1021--1024
-
C.J.C. Burges.
Factoring as Optimization
2002
Microsoft Research
MSR-TR-2002-83
-
Atul Adya, William J. Bolosky, Miguel Castro, Gerald Cermak, Ronnie Chaiken, John R. Douceur, Jon Howell, Jacob R. Lorch, Marvin Theimer, Roger P. Wattenhofer.
FARSITE: Federated, available, and reliable storage for an incompletely trusted environment
Dec. 2002
Boston, MA
Proceedings of the 5th Symposium on Operating Systems Design and Implementation (OSDI)
1--14
-
E. Brill, J. Lin, M. Banko, S. Dumais, A. Ng.
Data-intensive question answering
2001
Proceedings of TREC 2001
-
E. Brill, G. Kacmarcik, C. Brockett.
Learning to Extract Katakana-English Word Pairs from Non-Aligned Web Queries Using a Noisy-Channel Model of Back-Transliteration
2001
Proceedings of NLPRS 2001
-
M. Banko, E. Brill.
Mitigating the Paucity-of-Data Problem: Exploring the Effect of Training Corpus Size on Classifier Performance for Natural Language Processing
2001
Proceedings of HLT 2001
-
M. Banko, E. Brill.
Scaling to Very Very Large Corpora for Natural Language Disambiguation
2001
Proceedings of ACL 2001
-
Pedro Domingos, Matthew Richardson.
Mining the Network Value of Customers
2001
San Francisco, CA
Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
57-66
-
E. Brill, R. C. Moore.
An Improved Error Model for Noisy Channel Spelling Correction
2000
Proceedings of ACL 2000
-
E. Brill.
Pattern-Based Disambiguation for Natural Language Processing
2000
Proceedings of EMNLP/VLC 2000
-
D.J. Crisp, C.J.C. Burges.
A Geometric Interpretation of ν-SVM Classifiers
2000
NIPS
Vol. 12, pp. 244-250
-
C.J.C. Burges, D.J. Crisp.
Uniqueness of the SVM Solution
2000
NIPS
Vol. 12, pp. 223-229
-
C.J.C. Burges.
Geometry and Invariance in Kernel Based Methods
1999
Advances in Kernel Methods: Support Vector Learning
B. Schölkopf and C.J.C. Burges and A.J. Smola
89--116
-
C.J.C. Burges.
A Tutorial on Support Vector Machines for Pattern Recognition
1998
Data Mining and Knowledge Discovery
Vol. 2, No. 2, pp. 121-167
-
C. J. C. Burges, B. Schölkopf.
Improving the accuracy and speed of support vector learning machines
1997
Cambridge, MA
Advances in Neural Information Processing Systems 9
M. Mozer and M. Jordan and T. Petsche
375-381
-
C.J.C. Burges.
Simplified Support Vector Decision Rules
1996
Bari, Italy
Proceedings of the Thirteenth International Conference on Machine Learning
Lorenza Saitta
71-77
-
C.J.C. Burges, O. Matan, Y. Le Cun, J.S. Denker, L.D. Jackel, C.E. Stenard, C.R. Nohl, J.I. Ben.
Shortest Path Segmentation: A Method For Training a Neural Network to Recognize Character Strings
1992
IJCNN Conference Proceedings
Vol. 3, pp. 165--172