Marc joined Microsoft Research Silicon
Valley in October 2001. He is currently working on heuristics for
detecting "spam" web pages, and on link-based ranking algorithms for
web search results. Past projects at Microsoft include Boxwood, a distributed B-Tree system,
and PageTurner, a large-scale
study of the evolution of web pages.
Before joining MSR, Marc spent eight years at the Systems Research Center (SRC),
which belonged to DEC, then Compaq, and now HP.
Projects at SRC included
Mercator,
a high-performance distributed web crawler,
JCAT,
a web-based algorithm animation system, and
Obliq-3D,
a scripting system for 3D animations.
Marc served as program co-chair of
WWW2004
and conference chair of WSDM 2008.
He is co-chairing the news section of
CACM,
and he is an associate editor of ACM TWEB
and of JVLC.
Marc received a Ph.D. in Computer Science from
UIUC
for his work on Cube, a 3D visual programming language.
Papers and patents
Book chapters
Marc Najork and Allan Heydon.
High-Performance Web Crawling.
Chapter 2 in J. Abello et al. (editors),
Handbook of Massive Data Sets, Kluwer Academic Publishers, 2002.
Marc H. Brown and Marc A. Najork.
Algorithm Animation Using Interactive 3D Graphics.
Chapter 9 in J. Stasko et al. (editors),
Software Visualization - Programming as a Multimedia Experience, MIT Press, 1998.
Journal articles
Brian Davison, Marc Najork, and Tim Converse.
SIGIR Workshop Report: Adversarial Information Retrieval on the Web (AIRWeb 2006).
ACM SIGIR Forum 40(2):27-30, December 2006.
[Abstract]
Dennis Fetterly, Mark Manasse, and Marc Najork.
On the Evolution of Clusters of Near-Duplicate Web Pages.
Journal of Web Engineering, 2(4):228-246, October 2004.
[Abstract]
Dennis Fetterly, Mark Manasse, Marc Najork, and Janet Wiener.
A Large-Scale Study of the Evolution of Web Pages.
Software: Practice & Experience, 34(2):213-237, February 2004.
[Abstract]
Allan Heydon and Marc Najork.
Performance Limitations of the Java Core Libraries.
Concurrency: Practice & Experience, 12(6):363-373, May 2000.
[Abstract]
Allan Heydon and Marc Najork.
Mercator: A Scalable, Extensible Web Crawler.
World Wide Web 2(4):219-229, December 1999.
[Abstract, Draft]
Marc Brown and Marc Najork.
Collaborative Active Textbooks.
Journal of Visual Languages and Computing 8(4):453-486, August 1997.
[Abstract]
Marc Najork.
Programming in Three Dimensions.
Journal of Visual Languages and Computing 7(2):219-242, June 1996.
[Abstract]
Marc Najork and Marc Brown.
Obliq-3D: A High-Level, Fast-Turnaround 3D Animation System.
IEEE Transactions on Visualization and Computer Graphics 1(2):175-193,
June 1995.
[Abstract, bad scan]
Sharon Kuck, Roland John, Arnd Lewe, and Marc Najork.
Roles and their role in posing recursive queries.
Information Systems 15(2):173-186 (1990).
[Abstract]
Conference papers
Marc Najork, Sreenivas Gollapudi and Rina Panigrahy.
Less is More: Sampling the Neighborhood Graph Makes SALSA Better and Faster.
To appear in 2nd ACM International Conference on Web Search and Data Mining (February 2009).
Marc Najork and Nick Craswell.
Efficient and Effective Link Analysis with Precomputed SALSA Maps.
17th ACM Conference on Information and Knowledge Management (October 2008),
pages 53-61.
[PDF]
[Slides]
Frank McSherry and Marc Najork.
Computing Information Retrieval Performance Measures Efficiently in the Presence of Tied Scores.
30th European Conference on Information Retrieval (April 2008), pages 414-421.
[PDF]
Sreenivas Gollapudi, Marc Najork and Rina Panigrahy.
Using Bloom Filters to Speed Up HITS-like Ranking Algorithms.
5th Workshop on Algorithms and Models for the Web Graph (December 2007), pages 195-201.
[PDF]
[Slides]
Marc Najork.
Comparing the Effectiveness of HITS and SALSA.
16th ACM Conference on Information and Knowledge Management (November 2007), pages 157-164.
[PDF]
[Slides]
Marc Najork, Hugo Zaragoza, and Michael Taylor.
HITS on the Web: How does it Compare?
30th Annual International ACM SIGIR Conference on Research and
Development in Information Retrieval (July 2007), pages 471-478.
[PDF]
[Slides]
Alexandros Ntoulas, Marc Najork, Mark Manasse, and Dennis Fetterly.
Detecting Spam Web Pages Through Content Analysis.
15th International World Wide Web Conference (May 2006), pages 83-92.
[PDF]
Dennis Fetterly, Mark Manasse, and Marc Najork.
Detecting Phrase-Level Duplication on the World Wide Web.
28th Annual International ACM SIGIR Conference on Research and
Development in Information Retrieval (August 2005), pages 170-177.
[PDF]
John MacCormick, Nick Murphy, Marc Najork, Chandramohan A. Thekkath, and Lidong Zhou.
Boxwood: Abstractions as the Foundation for Storage Infrastructure.
6th Symposium on Operating Systems Design and Implementation (December 2004),
pages 105-120.
[PDF]
Dennis Fetterly, Mark Manasse, and Marc Najork.
Spam, Damn Spam, and Statistics: Using Statistical Analysis to Locate Spam Web Pages.
7th International Workshop on the Web and Databases (June 2004), pages 1-6.
[PS, PDF]
Dennis Fetterly, Mark Manasse, and Marc Najork.
On the Evolution of Clusters of Near-Duplicate Web Pages.
1st Latin American Web Congress (November 2003), pages 37-45.
[PS, PDF]
Andrei Z. Broder, Marc Najork, and Janet L. Wiener.
Efficient URL caching for World Wide Web crawling.
12th International World Wide Web Conference (May 2003), pages 679-689.
[HTML, PS, PDF]
Dennis Fetterly, Mark Manasse, Marc Najork, and Janet Wiener.
A Large-Scale Study of the Evolution of Web Pages.
12th International World Wide Web Conference (May 2003), pages 669-678.
[HTML, PS, PDF]
Marc Najork.
Web-Based Algorithm Animation.
38th Design Automation Conference (June 2001), pages 506-511.
[PDF]
Marc Najork and Janet L. Wiener.
Breadth-First Search Crawling Yields High-Quality Pages.
10th International World Wide Web Conference (May 2001), pages 114-118.
[HTML, PDF]
Monika Henzinger, Allan Heydon, Michael Mitzenmacher, and Marc Najork.
On Near-Uniform URL Sampling.
9th International World Wide Web Conference (May 2000), pages 295-308.
[HTML]
Allan Heydon and Marc Najork.
Performance Limitations of the Java Core Libraries.
ACM 1999 Java Grande Conference (June 1999), pages 35-41.
[PDF]
Monika Henzinger, Allan Heydon, Michael Mitzenmacher, and Marc Najork.
Measuring Index Quality Using Random Walks on the Web.
8th International World Wide Web Conference (May 1999), pages 213-225.
[HTML, PDF, PS]
Marc H. Brown, Marc A. Najork, and Roope Raisamo.
A Java-Based Implementation of Collaborative Active Textbooks.
IEEE Symposium on Visual Languages (September 1997), pages 372-379.
[PDF, PS]
Marc H. Brown and Marc A. Najork.
Distributed Applets.
CHI'97 Conference Companion (March 1997), pages 204-205.
[HTML, PDF, PS]
Marc H. Brown and Marc A. Najork.
Collaborative Active Textbooks: A Web-based Algorithm Animation System for
an Electronic Classroom.
IEEE Symposium on Visual Languages (September 1996), pages 266-275.
[PDF, PS]
Marc Brown and Marc Najork.
Distributed Active Objects.
5th International World Wide Web Conference (May 1996), pages 1037-1052.
[HTML]
Marc Najork and Marc Brown.
A Library for Visualizing Combinatorial Structures.
IEEE Visualization'94 (October 1994), pages 164-171.
[PS]
Marc Brown and Marc Najork.
Algorithm Animation Using 3D Interactive Graphics.
ACM Symposium on User Interface Software and Technology (November 1993),
pages 93-100.
Marc Najork and Simon Kaplan.
Cube: Eine dreidimensionale visuelle Programmiersprache.
Informatik, Wirtschaft, Gesellschaft (September 1993), pages 340-345.
Marc A. Najork and Simon M. Kaplan.
Specifying Visual Languages with Conditional Set Rewrite Systems.
IEEE Symposium on Visual Languages (August 1993), pages 12-18.
[PS]
Marc A. Najork and Simon M. Kaplan.
A Prototype Implementation of the Cube Language.
IEEE Workshop on Visual Languages (September 1992), pages 270-272.
[PS]
Marc A. Najork and Simon M. Kaplan.
The Cube Language.
IEEE Workshop on Visual Languages (October 1991), pages 218-224.
Marc A. Najork and Eric J. Golin.
Enhancing Show-and-Tell with a polymorphic type system and higher-order
functions.
IEEE Workshop on Visual Languages (October 1990), pages 215-220.
Popular magazines
Marc Brown and Marc Najork.
Distributed Active Objects.
Dr. Dobb's Journal (March 1997), pages 34-41.
Marc Najork.
Visual Programming in 3D.
Dr. Dobb's Journal (December 1995), pages 18-31.
Technical reports
Marc Najork and Allan Heydon.
High-Performance Web Crawling.
SRC Research Report 173, Compaq Systems Research Center (September 2001).
[PDF, PS]
Marc Najork and Marc H. Brown.
Three-Dimensional Web-Based Algorithm Animations.
SRC Research Report 170, Compaq Systems Research Center, Palo Alto (July 2001).
[PDF, PS]
Marc H. Brown, Hannes Marais, Marc A. Najork, and William E. Weihl.
Focus + Context Displays of Web Pages: Implementation Alternatives.
SRC Technical Note 1997-010, Digital Systems Research Center (May 1997).
[HTML]
Marc H. Brown and Marc A. Najork.
Collaborative Active Textbooks: A Web-based Algorithm Animation System for
an Electronic Classroom.
SRC Research Report 142, Digital Systems Research Center (May 1996).
[PDF, PS]
Marc H. Brown and Marc A. Najork.
Distributed Active Objects.
SRC Research Report 141a, Digital Systems Research Center, Palo Alto (April 1996).
[PDF, PS]
Marc A. Najork.
Obliq-3D Tutorial and Reference Manual.
SRC Research Report 129, Digital Systems Research Center, Palo Alto (December 1994).
[PDF, PS]
Marc A. Najork and Marc H. Brown.
A Library for Visualizing Combinatorial Structures.
SRC Research Report 128a, Digital Systems Research Center, Palo Alto (September 1994).
[PDF, PS]
Marc A. Najork.
Programming in Three Dimensions.
Technical Report UIUCDCS-R-93-1838,
Dept. of Computer Science, Univ. of Illinois (October 1993).
[PDF, PS]
Marc H. Brown and Marc A. Najork.
Algorithm Animation Using 3D Interactive Graphics.
SRC Research Report 110a, Digital Systems Research Center, Palo Alto
(September 1993).
[PDF, PS]
Marc Najork.
Funktionale, logik-basierte und objektorientierte Sprachstile und Wege zur
Vereinheitlichung.
Thesis, Fachbereich Informatik, Technical University of
Darmstadt, Germany (1989).
Marc Najork.
Enhanced ER-Easy: A Database Scheme Designer.
Technical Report UIUCDCS-R-88-1464,
Dept. of Computer Science, Univ. of Illinois (May 1988).
Roland John, Sharon Kuck, Arnd Lewe, and Marc Najork.
Roles and their role in posing recursive queries over the universal relation.
Technical Report UIUCDCS-R-88-1463,
Dept. of Computer Science, Univ. of Illinois (May 1988).
Videos
Marc H. Brown and Marc A. Najork.
Distributed Active Objects.
SRC Video 141b, Digital Systems Research Center, Palo Alto (April 1996).
[YouTube]
Marc A. Najork and Marc H. Brown.
A Library for Visualizing Combinatorial Structures.
SRC Video 128b, Digital Systems Research Center, Palo Alto (September 1994).
[YouTube]
Marc H. Brown and Marc A. Najork.
Algorithm Animation Using 3D Interactive Graphics.
SRC Video 110b, Digital Systems Research Center, Palo Alto (September 1993).
[YouTube]
Patents
Marc A. Najork.
System and method for maintaining a distributed database of hyperlinks.
US Patent 7,340,467, issued 3/4/2008.
Marc A. Najork.
System and method for distributed web crawling.
US Patent 7,139,747, issued 11/21/2006.
Marc A. Najork and Chandramohan A. Thekkath.
Algorithm for tree traversals using left links.
US Patent 7,082,438, issued 7/25/2006.
Marc A. Najork and Chandramohan A. Thekkath.
Deletion and compaction using versioned nodes.
US Patent 7,072,904, issued 7/4/2006.
Marc A. Najork and Chandramohan A. Thekkath.
Algorithm for tree traversals using left links.
US Patent 7,007,027, issued 2/28/2006.
Marc A. Najork and Clark A. Heydon.
System and method for efficient filtering of data set addresses in a web crawler.
US Patent 6,952,730, issued 10/4/2005.
Marc A. Najork.
System and method for identifying cloaked web servers.
US Patent 6,910,077, issued 6/21/2005.
Marc A. Najork, Clark A. Heydon, Michael Mitzenmacher, and Monika H. Henzinger.
System and method for near-uniform sampling of web page addresses.
US Patent 6,594,694, issued 7/15/2003.
Marc A. Najork and Clark A. Heydon.
Web crawler system using parallel queues for queing data sets having common
address and concurrently downloading data associated with data set in each
queue.
US Patent 6,377,984, issued 4/23/2002.
Marc A. Najork and Clark A. Heydon.
System and method for associating an extensible set of data with documents
downloaded by a web crawler.
US Patent 6,351,755, issued 2/26/2002.
Marc A. Najork and Clark A. Heydon.
System and method for enforcing politeness while scheduling downloads in a
web crawler.
US Patent 6,321,265, issued 11/20/2001.
Marc A. Najork and Clark A. Heydon.
System and method for efficient representation of data set addresses
in a web crawler.
US Patent 6,301,614, issued 10/9/2001.
Marc A. Najork, Clark A. Heydon, and Janet L. Wiener.
Web crawler system using plurality of parallel priority level queues having
distinct associated download priority levels for prioritizing document
downloading and maintaining document freshness.
US Patent 6,263,364, issued 7/17/2001.