Where there is a will, there is a way (有志者事竟成)
Building 2, No. 5 Danling Street,
Haidian District, Beijing, P.R. China 100080
Zhicheng Dou is a researcher in the Web Search and Data Management Group at Microsoft Research Asia. He joined Microsoft in July 2008. He received his Ph.D. and B.S. degrees in computer science and technology from Nankai University, China, in 2008 and 2003, respectively. His research interests span several topics in Web search and data mining, including personalized web search, anchor text and click-through data mining, query understanding, and search result diversification. He has recently become interested in the temporal Web and is now working on the extraction and management of time-series data from the Web.
Besides research, Zhicheng Dou is also a good developer. He enjoys turning cool ideas into real systems.
- Intent and Diversity (INDI): Users who submit the same query may have different intents. For an ambiguous query, users may seek different interpretations. For a faceted topic, users may be interested in different subtopics. In this project, we investigate how many queries are ambiguous in real search logs; we propose methods to diversify search results; we experiment with new metrics to measure diversity; we also organize the NTCIR INTENT and IMINE tasks to provide common data for the IR community.
- Project Q: Search has a long document-centric tradition, where “searching information” is equivalent to “searching documents”. Project Q is our recent effort to explore a new query-centric search paradigm, which treats the query as an object and shifts search from “searching documents” to “searching queries”. We have developed several effective query mining technologies and shown that deeply mining queries “without” time constraints can greatly improve search relevance and user experience.
- Web Page Analysis (WEPA): A Web page is not atomic but rich in structure. In this project, we take advantage of the HTML DOM structure and associated visual features, such as the font size, width, and height of a DOM element, to understand the author's purpose in creating a page. We model the importance of blocks in a page; we extract structured data from pages across websites; we learn templates from a set of mixed pages from a website; we also identify the article title, body, and images in pages to improve the reading experience.
- WebSensor (InformationSensor): With the rapid growth of the web, making sense of web data poses grand challenges: big volume, high velocity, high variety, and unknown veracity. In the physical world, a sensor is a converter that measures a physical quantity and converts it into a signal that can be read by an observer or by an instrument (today, mostly electronic). This project creates a virtual WebSensor layer atop the web.
- WebStudio: WebStudio is an end-to-end (E2E) experimental search system for facilitating search experiments on specific web data collections. WebStudio implements a set of default components, and users can customize the major operations of the E2E search engine (including document parsing, page classification, index building, index serving, and front-end processing) by adding their own experimental logic to test ideas.
- Xiao Ding, Zhicheng Dou, Bing Qin, Ting Liu, and Ji-Rong Wen, Improving Web Search Ranking by Incorporating Structured Annotation of Queries, in EMNLP 2013, October 2013
- Tetsuya Sakai and Zhicheng Dou, Summaries, Ranked Retrieval and Sessions: A Unified Framework for Information Access Evaluation, in Proceedings of SIGIR 2013, ACM, 2013
- Tetsuya Sakai, Zhicheng Dou, Takehiro Yamamoto, Yiqun Liu, Min Zhang, Makoto Kato, Ruihua Song, and Mayu Iwata, Summary of the NTCIR-10 INTENT-2 Task: Subtopic Mining and Search Result Diversification, in Proceedings of SIGIR 2013, ACM, 2013
- Qinglei Wang, Yanan Qian, Zhicheng Dou, Fan Zhang, and Tetsuya Sakai, Mining Search Intents from Text Fragments, in Information Retrieval, 2013
- Tetsuya Sakai, Zhicheng Dou, and Charles Clarke, The Impact of Intent Selection on Diversified Search Evaluation, in Proceedings of SIGIR 2013, ACM, 2013
- Kosetsu Tsukuda, Tetsuya Sakai, Zhicheng Dou, and Katsumi Tanaka, Estimating Intent Types for Search Result Diversification, in Proceedings of AIRS 2013, 2013
- Ke Zhou, Tetsuya Sakai, Mounia Lalmas, Zhicheng Dou, and Joemon M. Jose, Evaluating Heterogeneous Information Access, in ACM SIGIR 2013 Workshop on Modeling User Behavior for Information Access Evaluation, 2013
- Tetsuya Sakai, Zhicheng Dou, Ruihua Song, and Noriko Kando, The Reusability of a Diversified Search Test Collection, in Asia Information Retrieval Societies (AIRS 2012), Lecture Notes in Computer Science, 20 December 2012
- Zhicheng Dou, Sha Hu, Kun Chen, Ruihua Song, and Ji-Rong Wen, Multi-dimensional Search Result Diversification, in Proceedings of WSDM'11, Association for Computing Machinery, Inc., February 2011
- Zhicheng Dou, Finding Dimensions for Queries, in Proceedings of CIKM2011, ACM, 2011
- PC, SIGIR 2013, CIKM 2013, IEEE BIG Data 2013, OAIR 2013, WWW 2013, SDM 2013, KDD 2012, WIDM 2009
- Organizer, NTCIR-10 INTENT-2 task
- Reviewer, TKDE, KAIS, KDD'08, WWW'07, KDD'07, APWeb'07, ICDM'06