Analyze any website's robots.txt file. See parsed rules grouped by user-agent, sitemap declarations, and crawl-delay settings — all for free.
A robots.txt file is a plain text file located at the root of a website (e.g. example.com/robots.txt) that provides instructions to web crawlers. It uses the Robots Exclusion Protocol to specify which paths crawlers are allowed to access and which are disallowed.
The file groups rules by user-agent (crawler name). Common user-agents include Googlebot, Bingbot, and the wildcard *, which applies to all crawlers. The robots.txt file can also declare sitemap locations using Sitemap: directives.
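A minimal robots.txt illustrating these directives might look like this (the paths and URL are illustrative):

```
User-agent: Googlebot
Disallow: /private/

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
```

Here Googlebot is asked to skip /private/, every other crawler may access everything, and one sitemap location is declared for all crawlers.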
We fetch the robots.txt file from the domain you provide and display the raw content.
We parse each line and group rules by user-agent, extracting Allow, Disallow, and other directives.
We extract sitemap declarations and display them as clickable links, so you can verify they resolve correctly.
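The parsing step above can be sketched in a few lines of Python. This is a simplified illustration, not the checker's actual implementation: it treats each User-agent line as starting a new group (the real protocol lets consecutive User-agent lines share one rule group), and the rule text is illustrative.

```python
from collections import defaultdict

# Illustrative robots.txt content (not fetched from a real site).
robots_txt = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
"""

groups = defaultdict(list)
current_agent = None
for raw in robots_txt.splitlines():
    line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
    if not line or ":" not in line:
        continue
    field, _, value = line.partition(":")
    field, value = field.strip().lower(), value.strip()
    if field == "user-agent":
        current_agent = value          # simplified: one agent per group
        groups.setdefault(value, [])
    elif field in ("allow", "disallow", "crawl-delay") and current_agent:
        groups[current_agent].append((field, value))

print(dict(groups))
```

The result maps each user-agent to its directives, which is essentially what the grouped view displays.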
A robots.txt file is a text file at the root of a website that tells search engine crawlers which pages they can and cannot access. It follows the Robots Exclusion Protocol and is one of the first files a crawler checks before indexing a site.
Misconfigured robots.txt files can accidentally block search engines from crawling important pages, or fail to block pages you want hidden. Regularly checking your robots.txt ensures your crawl directives match your SEO strategy.
Sitemap declarations (lines starting with 'Sitemap:') tell search engines where to find your XML sitemaps. This is one of the primary ways search engines discover sitemaps, alongside checking /sitemap.xml directly.
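Python's standard library can extract these declarations too. A minimal sketch using urllib.robotparser (the site_maps() method requires Python 3.8+; the rules and URLs below are illustrative):

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
# parse() accepts the raw lines of a robots.txt file.
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
    "Sitemap: https://example.com/sitemap.xml",
    "Sitemap: https://example.com/news-sitemap.xml",
])

# site_maps() returns the declared sitemap URLs, or None if there are none.
print(rp.site_maps())
```

Sitemap: lines are independent of any user-agent group, so they apply to all crawlers regardless of where they appear in the file.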
Crawl-delay is a directive that asks crawlers to wait a specified number of seconds between requests. While Google ignores this directive, other search engines like Bing may respect it. A high crawl-delay can slow down indexing.
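The same standard-library parser also reads Crawl-delay. A short sketch with illustrative rules:

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: Bingbot",
    "Crawl-delay: 5",
])

# crawl_delay() returns the delay for a matching group, else None.
print(rp.crawl_delay("Bingbot"))    # 5
print(rp.crawl_delay("Googlebot"))  # None (no matching group)
```

A well-behaved crawler that honors the directive would wait that many seconds between requests to the site.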
Need to discover sitemaps at scale?
Get a free API key to discover and extract sitemaps programmatically. 100 requests/month included.