User Agent 'MSR-ISRCCrawler'
(typically from 131.107.65.41) is used by the
Microsoft Research (MSR) Internet Services Research Center (ISRC) to analyze
the web for Microsoft's Search and Ads services. Having a low impact on web sites
on the Internet and maintaining a positive relationship with webmasters are
extremely important to us. If you have any questions or concerns, please
contact us at isrc-bot@microsoft.com.
Our crawler strictly adheres to the HTTP 1.1 protocol
and fully respects robots.txt files as documented at http://www.robotstxt.org/robotstxt.html.
If a robots.txt file is malformed, has ambiguous specifications, or does not
exist on a given site, we conservatively avoid crawling subsets of the site and
wait at least 1 second between requests to be 'polite'. If
a site has a robots.txt file, the crawler uses its “crawl-delay” parameter as
the wait between requests. Politeness is enforced per IP address rather than per
hostname, which lets us be polite to servers that host multiple hosts /
domains on a single IP address.
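
The politeness rules above can be sketched in Python roughly as follows. This is an illustrative sketch only, not the crawler's actual implementation: the names load_rules, is_allowed, delay_for and IpPoliteness are hypothetical, and it relies on Python's standard urllib.robotparser module, which only recognizes whole-number crawl-delay values.

```python
from urllib.robotparser import RobotFileParser

USER_AGENT = "MSR-ISRCCrawler"
DEFAULT_DELAY = 1.0  # seconds; the minimum wait used when no crawl-delay is given


def load_rules(robots_lines):
    """Parse the lines of a robots.txt file (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser


def is_allowed(parser, url):
    """Return True if robots.txt permits this user agent to fetch the URL."""
    return parser.can_fetch(USER_AGENT, url)


def delay_for(parser):
    """Return the wait between requests: crawl-delay if present, else the default."""
    delay = parser.crawl_delay(USER_AGENT)
    return float(delay) if delay is not None else DEFAULT_DELAY


class IpPoliteness:
    """Tracks, per IP address, the earliest time the next request may be sent.

    Keying on the IP address (not the hostname) means that several domains
    hosted on one address share a single request budget.
    """

    def __init__(self):
        self._next_allowed = {}

    def wait_time(self, ip, now):
        """Seconds still to wait before contacting this IP address."""
        return max(0.0, self._next_allowed.get(ip, 0.0) - now)

    def record_request(self, ip, now, delay):
        """Note that a request was sent at time `now` with the given delay."""
        self._next_allowed[ip] = now + delay
```

A caller would resolve each hostname to its IP address, call wait_time before every request, and call record_request afterwards with the value returned by delay_for.
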
Past uses of our crawler have helped Live Search
understand the rate of change of web pages across the Internet in order to optimize
its crawling policy, understand non-404 error pages returned by web sites for
missing pages, and survey robots.txt files across the Internet. The crawler
is sometimes used to download all files required to render a page in a web
browser, such as the referenced .css and .js files.
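
As a rough illustration of what finding the referenced .css and .js files can involve, the sketch below collects stylesheet and script URLs from a page's HTML using Python's standard html.parser module. It is an assumption-laden example, not a description of how the crawler does this; the class name ResourceCollector is hypothetical, and it ignores resources referenced from inside CSS or loaded by scripts.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collects absolute URLs of external scripts and stylesheets in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("src"):
            # External JavaScript file: <script src="...">
            self.resources.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            # Stylesheet: <link rel="stylesheet" href="...">
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                self.resources.append(urljoin(self.base_url, attrs["href"]))
```

Feeding a page's HTML to an instance via its feed method fills the resources list in document order, with relative references resolved against the page URL.
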
Note: the “ISRC” in “MSR-ISRCCrawler”
does not stand for International Standard Recording Code.