Data Exploration
Goal
This project investigates data exploration techniques that offer flexible ways to query, browse, and aggregate data. One goal is to support approximate matching and ranked search in the database context. We also aim to enable data browsing and querying services for XML that interoperate across text, structured, and semi-structured (e.g., mail messages) data. In addition, we investigate efficient approximate query processing techniques for answering ad hoc aggregate queries (e.g., decision support or OLAP queries).
People
Publications
The following papers are in pdf format.
Chaudhuri S., Church K., König A.C. and Sui L., Heavy-Tailed Distributions and Multi-Keyword Queries. Proceedings of ACM SIGIR, Amsterdam, Netherlands, 2007. [pdf version]
Chakrabarti K., Ganti V., Han J. and Xin D., Ranking Objects Based on Relationships: Computing Top-K Over Aggregation. Proceedings of ACM SIGMOD, Chicago, 2006. [pdf version]
König A.C. and Brill E., Reducing the Human Overhead in Text Categorization. Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Philadelphia, USA, 2006. [pdf version]
Chakrabarti K., Chaudhuri S. and Hwang S., Automatic Categorization of Query Results. Proceedings of ACM SIGMOD, Paris, France, 2004. [pdf version]
Chaudhuri S., Das G. and Srivastava U., Efficient Use of Block-Level Sampling in Statistics Estimation. Proceedings of ACM SIGMOD, Paris, France, 2004. [pdf version]
Babcock B., Chaudhuri S. and Das G., Dynamic Sample Selection for Approximate Query Processing. Proceedings of ACM SIGMOD, San Diego, USA, 2003. [pdf version]
Agrawal S., Chaudhuri S., Das G. and Gionis A., Automated Ranking of Database Query Results. Proceedings of the First Biennial Conference on Innovative Data Systems Research (CIDR), Asilomar, USA, 2003. [pdf version]
Agrawal S., Chaudhuri S. and Das G., DBExplorer: A System for Keyword Search over Relational Databases. Proceedings of the 18th International Conference on Data Engineering, San Jose, USA, 2002. [pdf version]
Chaudhuri S., Das G. and Narasayya V., A Robust, Optimization-Based Approach for Approximate Answering of Aggregate Queries. Proceedings of ACM SIGMOD, Santa Barbara, USA, 2001. [pdf version of conference paper] [full paper (Microsoft Technical Report)]
Chaudhuri S., Das G., Datar M., Motwani R. and Narasayya V., Overcoming Limitations of Sampling for Aggregation Queries. Proceedings of the 17th International Conference on Data Engineering, Heidelberg, Germany, 2001. [pdf version]
Charikar M., Chaudhuri S., Motwani R. and Narasayya V., Towards Estimation Error Guarantees for Distinct Values. 19th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, Dallas, USA, 2000. [Postscript version]
Gravano L., Evaluating Top-k Selection Queries. Proceedings of the 25th VLDB Conference, Edinburgh, Scotland, UK, 1999. [pdf version]
Chaudhuri S., Motwani R. and Narasayya V., On Random Sampling over Joins. Proceedings of ACM SIGMOD, Philadelphia, USA, 1999. [pdf version]
Chaudhuri S., Motwani R. and Narasayya V., Random Sampling for Histogram Construction: How Much is Enough? Proceedings of ACM SIGMOD, Seattle, USA, 1998. [pdf version]
If you have questions about this project, please contact Surajit Chaudhuri (surajitc@microsoft.com).