I am a lead researcher in the Knowledge Mining Group at Microsoft Research Asia. I joined Microsoft in August 2004, shortly after receiving my PhD from Tsinghua University in July 2004. My main research interests are natural language understanding and knowledge mining.
Research Projects and Prototypes
NeedleSeek is a project for open-domain semantic search and mining. It aims at automatically extracting and aggregating semantic knowledge from tera-scale, open-domain data sources, and providing semantic search based on the knowledge. Try our online demo (needleseek.msra.cn) to access the NeedleSeek knowledge-base.
- PC member: WWW'2014, ACL'2014, EACL'2014, EMNLP'2014
- Co-chair of CIKM'2013 demo track
- PC member: WWW'2013, ACL'2013, EMNLP'2013
- Co-organizer of the RITE track: NTCIR-9 (2011) and NTCIR-10 (2012/2013).
- PC member: WWW'2012, ACL'2012, NAACL'2012, COLING'2012, SIGIR'2012
- PC member: WWW'2011, EMNLP'2011, APWeb'2011
- PC member: AAAI'2010, EMNLP'2010, WISE'2010
- PC member: IUI'2009
- Journal reviewer: Computational Linguistics (CL), 2014
- Journal reviewer: ACM Transactions on Information Systems (TOIS), 2011, 2012, 2013
- Journal reviewer: Information Processing & Management (IPM), 2010
- Journal reviewer: IEEE Trans. Parallel & Distributed Systems (TPDS), 2009
- Journal reviewer: Pattern Recognition Letters, 2005~2008
Selected Conference and Journal Publications
- Lei Shi, Shuming Shi, Chin-Yew Lin, Yi-Dong Shen, and Yong Rui. Unsupervised Template Mining for Semantic Category Understanding. In Proceedings of EMNLP 2014. [paper]
- Bonan Min, Shuming Shi, Ralph Grishman and Chin-Yew Lin. Towards Large-Scale Unsupervised Relation Extraction from the Web. In International Journal on Semantic Web and Information Systems (IJSWIS), Volume 8 Issue 3, 2012.
- Bonan Min, Shuming Shi, Ralph Grishman and Chin-Yew Lin. Ensemble Semantics for Large-scale Unsupervised Relation Extraction. In Proceedings of EMNLP-CoNLL 2012.
- Fan Zhang, Shuming Shi, Jing Liu, Shuqi Sun, and Chin-Yew Lin. Nonlinear Evidence Fusion and Propagation for Hyponymy Relation Mining. In ACL'11.
- Hao Yan, Shuming Shi, Fan Zhang, Torsten Suel and Ji-Rong Wen. Efficient Term Proximity Search with Term-Pair Indexes. In CIKM'10.
- Shuming Shi, Huibin Zhang, Xiaojie Yuan, and Ji-Rong Wen. Corpus-based Semantic Class Mining: Distributional vs. Pattern-Based Approaches. In the 23rd International Conference on Computational Linguistics (COLING'10), Beijing, August, 2010.
- Fan Zhang, Shuming Shi, Hao Yan, and Ji-Rong Wen. Revisiting Globally Sorted Indexes for Efficient Document Retrieval. Third ACM International Conference on Web Search and Data Mining (WSDM'10), New York, 2010.
- Shuming Shi, Bin Lu, Yunxiao Ma, Ji-Rong Wen. Nonlinear Static-Rank Computation. In ACM 18th Conference on Information and Knowledge Management (CIKM'09). Hong Kong, China, Nov. 2~6, 2009.
- Huibin Zhang, Mingjie Zhu, Shuming Shi, and Ji-Rong Wen. Employing Topic Models for Pattern-based Semantic Class Discovery. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL'09), Singapore, August 2009. [paper][slides][related research prototype]
- Mingjie Zhu, Shuming Shi, Mingjing Li, and Ji-Rong Wen. Effective Top-k Computation with Term-Proximity Support. In Information Processing and Management (IPM Journal), 2009.
- Mingjie Zhu, Shuming Shi, Nenghai Yu, and Ji-Rong Wen. Can Phrase Indexing Help to Process Non-Phrase Queries? In ACM 17th Conference on Information and Knowledge Management (CIKM'08). Napa Valley, California, USA, 2008.
- Shuming Shi, Xiaokang Liu, Ji-Rong Wen. Pattern-based Semantic Class Discovery with Multi-Membership Support. In ACM 17th Conference on Information and Knowledge Management (CIKM'08). Napa Valley, California, USA, 2008 (Poster). [related research prototype]
- Zhiwei Li, Shuming Shi, Lei Zhang. Improving Relevance Judgment of Web Search Results with Image Excerpts. In Proceedings of the 17th International World Wide Web Conference (WWW'08). Beijing, China, 2008.
- Mingjie Zhu, Shuming Shi, Mingjing Li, and Ji-Rong Wen. Effective Top-K Computation in Retrieving Structured Documents with Term-Proximity Support. In ACM 16th Conference on Information and Knowledge Management (CIKM'07). Lisbon, Portugal, Nov. 6-9, 2007.
- Zaiqing Nie, Yunxiao Ma, Shuming Shi, Ji-Rong Wen, and Wei-Ying Ma. Web Object Retrieval. In Proceedings of the 16th International World Wide Web Conference (WWW'07). May 8-12, 2007.
- Shuming Shi, Fei Xing, Mingjie Zhu, Zaiqing Nie, Ji-Rong Wen. Pseudo-Anchor Text Extraction for Searching Vertical Objects. In Proceedings of the 2006 ACM 15th Conference on Information and Knowledge Management (CIKM'06). Arlington, USA, Nov. 6-11, 2006 (Poster).
- Shuming Shi, Ji-Rong Wen, Qing Yu, Rui-Hua Song, Wei-Ying Ma. Gravitation-based Model for Information Retrieval. In Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'05). Salvador, Brazil, August 15-19, 2005. [paper] [slides]
- Yunhua Hu, Guomao Xin, Ruihua Song, Guoping Hu, Shuming Shi, Yunbo Cao, and Hang Li. Title Extraction from Bodies of HTML Documents and its Application to Web Page Retrieval. In Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'05). Salvador, Brazil, August 15-19, 2005.
- Ruihua Song, Ji-Rong Wen, Shuming Shi, Guomao Xin, Tie-Yan Liu, et al. Microsoft Research Asia at Web Track and Terabyte Track of TREC 2004. In 2004 Text REtrieval Conference (TREC'04). [paper]
- Shuming Shi, Guangwen Yang, Dingxing Wang, Jin Yu, Shaogang Qu, Ming Chen. Making Peer-to-Peer Keyword Searching Feasible Using Multi-level Partitioning. In Proceedings of the 3rd International Workshop on Peer-to-Peer Systems (IPTPS'04) San Diego, CA, USA. February 26-27, 2004. [paper]
- Shuming Shi, Jin Yu, Guangwen Yang, Dingxing Wang. Distributed Page Ranking in Structured P2P Networks. International Conference on Parallel Processing (ICPP'03), 2003. [paper]
- Zheng Zhang, Shuming Shi, Jing Zhu. SOMO: self-organized metadata overlay for resource management in P2P DHT. In Proceedings of the 2nd International Workshop on Peer-to-Peer Systems (IPTPS'03). 20-21 February 2003. Berkeley, CA, USA. [paper]
- Shuming Shi, Ruihua Song, Ji-Rong Wen. Latent Additivity: Combining Homogeneous Evidence. Technical report, MSR-TR-2006-110, Microsoft Research, August 2006. [pdf]
[This report provides a simple and effective approach for combining evidence.]
- Shuming Shi, Fei Xing, Mingjie Zhu, Zaiqing Nie, Ji-Rong Wen. Pseudo-Anchor Text Extraction for Vertical Search. Technical report, MSR-TR-2006-122, Microsoft Research, August 2006. [pdf]
- Shuming Shi, Ji-Rong Wen, Qing Yu, Rui-Hua Song, Wei-Ying Ma. Gravitation-based model for information retrieval (extended version). Technical report, MSR-TR-2005-65, Microsoft Research, May 2005. [pdf]
Knowledge Mining Group
Microsoft Research Asia
No. 5 Danling Street, Haidian District
Beijing, P.R. China, 100080
Email: shumings AT microsoft.com