Our Site Audit tool is built on a website crawler bot that crawls a site's pages to collect the data behind its website analytics.
We obey the robots.txt protocol and will not crawl your site if you exclude the Similarweb user-agent token. For example:
User-agent: similarweb
Disallow: /
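If you want to confirm that a rule like the one above actually blocks the similarweb user-agent before deploying it, you can test it locally with Python's standard-library robots.txt parser. This is a hedged sketch for verification only; the inline rule text and URLs are illustrative:

```python
from urllib.robotparser import RobotFileParser

# The two-line rule from the article, parsed locally instead of fetched.
rules = [
    "User-agent: similarweb",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(rules)

# The similarweb user-agent is blocked from every path...
print(parser.can_fetch("similarweb", "https://example.com/any/page"))  # False
# ...while crawlers not named in the rule are unaffected.
print(parser.can_fetch("googlebot", "https://example.com/any/page"))   # True
```

Because the rule names only the similarweb token, it excludes our crawler without changing how any other bot treats your site.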
By default, our crawler's requests come from one of two IP addresses: 52.5.118.182 or 52.86.188.211. However, if the crawler is running in stealth mode, or you have specifically asked for requests to originate from a different region, the IP address might differ.
Please also note that we do not support the crawl-delay directive. We aim to match the way Google crawls as closely as possible, and Google does not support crawl-delay either. Crawl-delay also makes it difficult to enforce domain-level crawl rate limits, which is why most DevOps teams use a bot management system to keep complete control over crawl rates.
If you have already excluded the Similarweb user-agent but your site is still being crawled without your permission and you would like it to stop, please contact our Support team.