Free tool

Robots and Sitemap Checker

Robots.txt and sitemap.xml are small files, but they decide which URLs crawlers can discover and which canonical pages they trust.

Read-only public scan: no login, no crawler to install, and no private URLs are fetched.

Surface: Free tool
Scope: Public web evidence
Auth: None required
Schema: SoftwareApplication

Answer first

What it checks

The scan fetches robots.txt and sitemap.xml, verifies response status and content type, samples listed URLs, and checks canonical agreement.
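
A minimal sketch of those steps in Python, using only the standard library. The origin, user agent, and five-URL sample size are illustrative assumptions, not the tool's actual implementation:

```python
# Minimal sketch of the scan steps; origin, agent, and limits are illustrative.
import re
import urllib.request
import xml.etree.ElementTree as ET

ORIGIN = "https://example.com"  # placeholder: the site under test

def fetch(url):
    """Return (status, content_type, body) for a public URL."""
    req = urllib.request.Request(url, headers={"User-Agent": "robots-sitemap-check"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status, resp.headers.get("Content-Type", ""), resp.read()

# 1. Discovery files must respond 200 with sane content types.
status, ctype, _ = fetch(f"{ORIGIN}/robots.txt")
assert status == 200 and "text/plain" in ctype, "robots.txt missing or mistyped"

status, ctype, sitemap = fetch(f"{ORIGIN}/sitemap.xml")
assert status == 200 and "xml" in ctype, "sitemap.xml missing or mistyped"

# 2. Sample listed URLs and confirm each page's canonical tag agrees
#    with its sitemap entry (namespace per sitemaps.org).
SM = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
urls = [loc.text.strip() for loc in ET.fromstring(sitemap).iter(SM + "loc")]

for url in urls[:5]:  # small sample; a real scan budgets more
    status, _, body = fetch(url)
    # Simplified canonical extraction: assumes rel precedes href in the tag.
    m = re.search(r'rel=["\']canonical["\'][^>]*href=["\']([^"\']+)',
                  body.decode("utf-8", "replace"))
    canonical = m.group(1) if m else None
    if status != 200 or canonical != url:
        print(f"disagreement: {url} -> status {status}, canonical {canonical}")
```

A fuller scan would also need to follow sitemap index files, which this sketch skips.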

Detail 01

Common blockers

The usual launch failures are Disallow rules left over from staging, stale sitemap entries, entries on the wrong host, a missing homepage URL, and discovery files blocked by the robots policy itself.
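
The first two, a staging disallow and wrong-host entries, can be spotted mechanically. A hedged sketch with the stdlib robot parser; the origin and the entry list are placeholders:

```python
# Sketch: flag two frequent launch failures with the stdlib robot parser.
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

ORIGIN = "https://example.com"  # placeholder: the production origin

parser = RobotFileParser(f"{ORIGIN}/robots.txt")
parser.read()  # fetches and parses the live file

# Staging disallow left in place: a blanket rule blocks the homepage.
if not parser.can_fetch("*", f"{ORIGIN}/"):
    print("blocker: robots.txt disallows the homepage for all agents")

# Wrong host: a sitemap entry points at a different origin than the site.
for entry in ["https://staging.example.com/page"]:  # stand-in for parsed <loc> values
    if urlsplit(entry).netloc != urlsplit(ORIGIN).netloc:
        print(f"blocker: sitemap entry on a foreign host: {entry}")
```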

Detail 02

How to fix

Keep robots rules intentional, list only canonical production URLs, and make the sitemap, canonical tags, and llms.txt all point at the same preferred origin.
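
For reference, a production robots.txt along these lines keeps the rules intentional and advertises the sitemap on the same preferred origin; example.com and the /admin/ rule are placeholders:

```
# Production robots.txt; example.com and the /admin/ rule are placeholders.
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
```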

FAQ

Common questions

Do I need both robots.txt and sitemap.xml?
Yes for most public sites. Robots.txt declares access policy; sitemap.xml declares canonical URLs and update hints. They solve different problems.
Should the sitemap include private pages?
No. Sitemaps should list canonical public URLs that you want crawlers to discover and evaluate.
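
For reference, a minimal sitemap.xml that lists only canonical public URLs; the host and date are placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal sitemap: canonical public URLs only; host and date are placeholders. -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2025-01-01</lastmod>
  </url>
</urlset>
```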