Analyze any website's robots.txt file. See parsed rules grouped by user-agent, sitemap declarations, and crawl-delay settings — all for free.
A robots.txt file is a plain text file located at the root of a website (e.g. example.com/robots.txt) that provides instructions to web crawlers. It uses the Robots Exclusion Protocol to specify which paths crawlers are allowed to access and which are disallowed.
The file groups rules by user-agent (crawler name). Common user-agents include Googlebot, Bingbot, and the wildcard *, which applies to all crawlers. The robots.txt file can also declare sitemap locations using Sitemap: directives.
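A minimal robots.txt illustrating these directives might look like this (the paths and URL are illustrative):

```
User-agent: Googlebot
Disallow: /private/

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
```

Here Googlebot is asked to skip /private/, every other crawler may access everything, and one sitemap location is declared for all crawlers.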
We fetch the robots.txt file from the domain you provide and display the raw content.
We parse each line and group rules by user-agent, extracting Allow, Disallow, and other directives.
We extract sitemap declarations and display them as clickable links, so you can verify they resolve correctly.
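The parsing step above can be sketched in a few lines of Python. This is a simplified illustration, not the checker's actual implementation: it treats each User-agent line as starting a new group (the real protocol lets consecutive User-agent lines share one rule group), and the rule text is illustrative.

```python
from collections import defaultdict

# Illustrative robots.txt content (not fetched from a real site).
robots_txt = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
"""

groups = defaultdict(list)
current_agent = None
for raw in robots_txt.splitlines():
    line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
    if not line or ":" not in line:
        continue
    field, _, value = line.partition(":")
    field, value = field.strip().lower(), value.strip()
    if field == "user-agent":
        current_agent = value          # simplified: one agent per group
        groups.setdefault(value, [])
    elif field in ("allow", "disallow", "crawl-delay") and current_agent:
        groups[current_agent].append((field, value))

print(dict(groups))
```

The result maps each user-agent to its directives, which is essentially what the grouped view displays.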
A robots.txt file is a text file at the root of a website that tells search engine crawlers which pages they can and cannot access. It follows the Robots Exclusion Protocol and is one of the first files a crawler checks before indexing a site.
Misconfigured robots.txt files can accidentally block search engines from crawling important pages, or fail to block pages you want hidden. Regularly checking your robots.txt ensures your crawl directives match your SEO strategy.
Sitemap declarations (lines starting with 'Sitemap:') tell search engines where to find your XML sitemaps. This is one of the primary ways search engines discover sitemaps, alongside checking /sitemap.xml directly.
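Python's standard library can extract these declarations too. A minimal sketch using urllib.robotparser (the site_maps() method requires Python 3.8+; the rules and URLs below are illustrative):

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
# parse() accepts the raw lines of a robots.txt file.
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
    "Sitemap: https://example.com/sitemap.xml",
    "Sitemap: https://example.com/news-sitemap.xml",
])

# site_maps() returns the declared sitemap URLs, or None if there are none.
print(rp.site_maps())
```

Sitemap: lines are independent of any user-agent group, so they apply to all crawlers regardless of where they appear in the file.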
Crawl-delay is a directive that asks crawlers to wait a specified number of seconds between requests. While Google ignores this directive, other search engines like Bing may respect it. A high crawl-delay can slow down indexing.
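The same standard-library parser also reads Crawl-delay. A short sketch with illustrative rules:

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: Bingbot",
    "Crawl-delay: 5",
])

# crawl_delay() returns the delay for a matching group, else None.
print(rp.crawl_delay("Bingbot"))    # 5
print(rp.crawl_delay("Googlebot"))  # None (no matching group)
```

A well-behaved crawler that honors the directive would wait that many seconds between requests to the site.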
Need to discover sitemaps at scale?
Get a free API key to discover and extract sitemaps programmatically. 100 requests/month included.