Marc joined Microsoft Research Silicon Valley in October 2001. He is currently working on link-based ranking algorithms for web search results. Past projects at Microsoft include heuristics for detecting spam web pages; PageTurner, a large-scale study of the evolution of web pages; and Boxwood, a distributed B-Tree system.
Before joining MSR, Marc spent eight years at the Systems Research Center (SRC), which belonged first to DEC, then to Compaq, and now to HP. Projects at SRC included Mercator, a high-performance distributed web crawler; JCAT, a web-based algorithm animation system; and Obliq-3D, a scripting system for 3D animations.
Marc is the editor-in-chief of ACM TWEB and is co-chairing the news section of CACM. He served as conference chair of WSDM 2008 and program co-chair of WWW 2004.
Marc received a Ph.D. in Computer Science from UIUC for his work on Cube, a 3D visual programming language.
- Nick Craswell, Bodo Billerbeck, Dennis Fetterly, and Marc Najork, Robust Query Rewriting using Anchor Data, in 6th ACM International Conference on Web Search and Data Mining (WSDM), ACM, February 2013
- Marc Najork, Detecting Quilted Web Pages at Scale, in 35th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), Association for Computing Machinery, Inc., August 2012
- Marc Najork, Dennis Fetterly, Alan Halverson, Krishnaram Kenthapadi, and Sreenivas Gollapudi, Of Hammers and Nails: An Empirical Comparison of Three Paradigms for Processing Large Graphs, in 5th ACM International Conference on Web Search and Data Mining (WSDM), ACM, February 2012
- Rina Panigrahy, Marc Najork, and Yinglian Xie, How User Behavior is Related to Social Affinity, in 5th ACM International Conference on Web Search and Data Mining (WSDM), ACM, February 2012
- Bodo Billerbeck, Nick Craswell, Dennis Fetterly, and Marc Najork, Microsoft Research at TREC 2011 Web Track, in Proc. of the 20th Text Retrieval Conference (TREC), National Institute of Standards and Technology, November 2011
- Nick Craswell, Dennis Fetterly, and Marc Najork, The Power of Peers, in 33rd European Conference on IR Research (ECIR), Springer-Verlag, April 2011
- Nick Craswell, Dennis Fetterly, and Marc Najork, Microsoft Research at TREC 2010 Web Track, in Proc. of the 19th Text Retrieval Conference (TREC), National Institute of Standards and Technology, November 2010
- Marc Najork, Querying the Web Graph (Invited Talk), in 17th International Symposium on String Processing and Information Retrieval (SPIRE), Springer-Verlag, October 2010
- Atish Das Sarma, Sreenivas Gollapudi, Marc Najork, and Rina Panigrahy, A Sketch-Based Distance Oracle for Web-Scale Graphs, in 3rd ACM International Conference on Web Search and Data Mining (WSDM), Association for Computing Machinery, Inc., February 2010
- Christopher Olston and Marc Najork, Web Crawling, in Foundations and Trends in Information Retrieval, vol. 4, no. 3, pp. 175-246, NOW Publishers, 2010
- Nick Craswell, Dennis Fetterly, Marc Najork, Stephen Robertson, and Emine Yilmaz, Microsoft Research at TREC 2009: Web and Relevance Feedback Tracks, in Proc. of the 18th Text Retrieval Conference (TREC), National Institute of Standards and Technology, November 2009
- Marc Najork, Web Crawler Architecture, in Encyclopedia of Database Systems, Springer-Verlag, September 2009
- Hugo Zaragoza and Marc Najork, Web Search Relevance Ranking, in Encyclopedia of Database Systems, Springer-Verlag, September 2009
- Marc Najork, Web Spam Detection, in Encyclopedia of Database Systems, Springer-Verlag, September 2009
- Marc Najork, The Scalable Hyperlink Store, in 20th ACM Conference on Hypertext and Hypermedia, Association for Computing Machinery, Inc., June 2009
- Marc Najork, Sreenivas Gollapudi, and Rina Panigrahy, Less is More: Sampling the Neighborhood Graph Makes SALSA Better and Faster, in 2nd ACM International Conference on Web Search and Data Mining (WSDM), Association for Computing Machinery, Inc., February 2009
- Marc Najork and Nick Craswell, Efficient and Effective Link Analysis with Precomputed SALSA Maps, in 17th ACM Conference on Information and Knowledge Management (CIKM), Association for Computing Machinery, Inc., October 2008
- Frank McSherry and Marc Najork, Computing Information Retrieval Performance Measures Efficiently in the Presence of Tied Scores, in 30th European Conference on IR Research (ECIR), Springer-Verlag, April 2008
- Sreenivas Gollapudi, Marc Najork, and Rina Panigrahy, Using Bloom Filters to Speed Up HITS-like Ranking Algorithms, in 5th Workshop on Algorithms and Models for the Web Graph (WAW), Springer-Verlag, December 2007
- Marc Najork, Comparing the Effectiveness of HITS and SALSA, in 16th ACM Conference on Information and Knowledge Management (CIKM), Association for Computing Machinery, Inc., November 2007
- Marc Najork, Hugo Zaragoza, and Michael Taylor, HITS on the Web: How does it Compare?, in 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), Association for Computing Machinery, Inc., Amsterdam, Netherlands, July 2007
- Brian D. Davison, Marc Najork, and Tim Converse, SIGIR Workshop Report: Adversarial Information Retrieval on the Web (AIRWeb 2006), in ACM SIGIR Forum, vol. 40, no. 2, pp. 27-30, ACM, December 2006
- Alexandros Ntoulas, Marc Najork, Mark Manasse, and Dennis Fetterly, Detecting Spam Web Pages Through Content Analysis, in 15th International World Wide Web Conference (WWW), Association for Computing Machinery, Inc., Edinburgh, Scotland, May 2006
- Dennis Fetterly, Mark Manasse, and Marc Najork, Detecting Phrase-Level Duplication on the World Wide Web, in 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), Association for Computing Machinery, Inc., Salvador, Brazil, August 2005
- John MacCormick, Nick Murphy, Marc Najork, Chandramohan A. Thekkath, and Lidong Zhou, Boxwood: Abstractions as the Foundation for Storage Infrastructure, in Symposium on Operating System Design and Implementation (OSDI), USENIX, December 2004
- Dennis Fetterly, Mark Manasse, and Marc Najork, On the Evolution of Clusters of Near-Duplicate Web Pages, in Journal of Web Engineering, vol. 2, no. 4, pp. 228-246, Institute of Electrical and Electronics Engineers, Inc., October 2004
- Dennis Fetterly, Mark Manasse, and Marc Najork, Spam, Damn Spam, and Statistics: Using statistical analysis to locate spam web pages, in 7th International Workshop on the Web and Databases (WebDB), Association for Computing Machinery, Inc., June 2004
- Dennis Fetterly, Mark Manasse, Marc Najork, and Janet Wiener, A Large-Scale Study of the Evolution of Web Pages, in Software: Practice & Experience, vol. 34, no. 2, pp. 213-237, Wiley, February 2004
- Dennis Fetterly, Mark Manasse, and Marc Najork, On the Evolution of Clusters of Near-Duplicate Web Pages, in Proceedings of the 1st Latin American Web Congress (LA-WEB), IEEE Computer Society, Washington, DC, USA, November 2003
- Dennis Fetterly, Mark Manasse, Marc Najork, and Janet Wiener, A large-scale study of the evolution of web pages, in Proceedings of the 12th International World Wide Web Conference (WWW), ACM, New York, NY, USA, May 2003
- Andrei Z. Broder, Marc Najork, and Janet L. Wiener, Efficient URL caching for World Wide Web crawling, in Proceedings of the 12th International World Wide Web Conference (WWW), Budapest, Hungary, May 2003
Issued Patents
- Marc A. Najork. Changing number of machines running distributed hyperlink database. US Patent 8,392,366, issued 3/5/2013.
- Marc A. Najork. Incremental update scheme for hyperlink database. US Patent 8,209,305, issued 6/26/2012.
- Marc A. Najork, Dennis C. Fetterly, Mark S. Manasse, Alexandros Ntoulas. Using content analysis to detect spam web pages. US Patent 7,962,510, issued 6/14/2011.
- Marc A. Najork. Query dependant link-based ranking using authority scores. US Patent 7,818,334, issued 10/19/2010.
- Marc A. Najork. Query dependent link-based ranking. US Patent 7,792,854, issued 9/7/2010.
- Marc A. Najork. Deletion and compaction using versioned nodes. US Patent 7,783,671, issued 8/24/2010.
- Marc A. Najork. Systems and methods for ranking documents based upon structurally interrelated information. US Patent 7,739,281, issued 6/15/2010.
- Marc A. Najork. Systems and methods for inferring uniform resource locator (URL) normalization rules. US Patent 7,680,785, issued 3/16/2010.
- Marc A. Najork. Fault tolerance scheme for distributed hyperlink database. US Patent 7,627,777, issued 12/1/2009.
- Marc A. Najork. System and method for maintaining a distributed database of hyperlinks. US Patent 7,340,467, issued 3/4/2008.
- Marc A. Najork. System and method for distributed web crawling. US Patent 7,139,747, issued 11/21/2006.
- Marc A. Najork and Chandramohan A. Thekkath. Algorithm for tree traversals using left links. US Patent 7,082,438, issued 7/25/2006.
- Marc A. Najork and Chandramohan A. Thekkath. Deletion and compaction using versioned nodes. US Patent 7,072,904, issued 7/4/2006.
- Marc A. Najork and Chandramohan A. Thekkath. Algorithm for tree traversals using left links. US Patent 7,007,027, issued 2/28/2006.
- Marc A. Najork and Clark A. Heydon. System and method for efficient filtering of data set addresses in a web crawler. US Patent 6,952,730, issued 10/4/2005.
- Marc A. Najork. System and method for identifying cloaked web servers. US Patent 6,910,077, issued 6/21/2005.
- Marc A. Najork, Clark A. Heydon, Michael Mitzenmacher, and Monika H. Henzinger. System and method for near-uniform sampling of web page addresses. US Patent 6,594,694, issued 7/15/2003.
- Marc A. Najork and Clark A. Heydon. Web crawler system using parallel queues for queuing data sets having common address and concurrently downloading data associated with data set in each queue. US Patent 6,377,984, issued 4/23/2002.
- Marc A. Najork and Clark A. Heydon. System and method for associating an extensible set of data with documents downloaded by a web crawler. US Patent 6,351,755, issued 2/26/2002.
- Marc A. Najork and Clark A. Heydon. System and method for enforcing politeness while scheduling downloads in a web crawler. US Patent 6,321,265, issued 11/20/2001.
- Marc A. Najork and Clark A. Heydon. System and method for efficient representation of data set addresses in a web crawler. US Patent 6,301,614, issued 10/9/2001.
- Marc A. Najork, Clark A. Heydon, and Janet L. Wiener. Web crawler system using plurality of parallel priority level queues having distinct associated download priority levels for prioritizing document downloading and maintaining document freshness. US Patent 6,263,364, issued 7/17/2001.
