llms.txt is a short, agent-readable map of the surfaces an AI assistant
should look at first on your site — docs, tools, policies, canonical
URLs. It doesn't replace your sitemap, robots.txt, or HTML metadata.
It points assistants at the handful of pages you actually want them
quoting, before they spend tokens on anything else.
What it is (and isn't)
llms.txt is a plain-text Markdown file served from your canonical
origin at /llms.txt. It's a curated index — think "top of the file
cabinet" — not a crawl manifest. Two rules:
Short. A human should be able to read it in under a minute.
Opinionated. Link the pages you want an assistant to trust, not every page that exists.
A longer variant, /llms-full.txt, is optional and holds the same
list plus one-sentence summaries per link. Both files are public.
Neither replaces sitemap.xml: sitemaps enumerate every canonical
URL for crawlers; llms.txt curates a small set for answer engines.
Placement and linking
Serve both files from the canonical origin with Content-Type:
text/plain; charset=utf-8. Don't redirect, don't gate behind a
login, and keep the URLs stable so assistants can cache the file. Discoverability checklist:
Link to /llms.txt from robots.txt (a regular comment line is fine) and from your HTML <head> via <link rel="alternate" type="text/plain" href="/llms.txt">.
Reference the same canonical URLs in /llms.txt that your sitemap.xml emits. Disagreement between the two is the most common audit finding.
If you ship /llms-full.txt, point at it from /llms.txt so the short file stays scannable.
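Concretely, the two discoverability links might look like this — example.com stands in for your canonical origin:

```
# robots.txt — a plain comment line is enough to advertise the file
# llms.txt: https://example.com/llms.txt

<!-- and in the HTML <head> of the homepage: -->
<link rel="alternate" type="text/plain" href="/llms.txt">
```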
A minimal example
# isitready.dev
> Public website scanning for AI readiness, SEO, security,
> performance, and production quality.
## Docs
- [Methodology](https://isitready.dev/methodology)
- [FAQ](https://isitready.dev/faq)
## Tools
- [llms.txt checker](https://isitready.dev/tools/llms-txt-checker)
## Policy
- [robots.txt](https://isitready.dev/robots.txt)
- [sitemap.xml](https://isitready.dev/sitemap.xml)

That's it. Level-two headings group surfaces; bullet list items point at canonical URLs. An assistant reading this file knows which docs are authoritative, which tools are runnable, and which policy files to cross-check.
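As a sketch of how little machinery an assistant needs to consume this shape (H2 section headings, bullet links beneath them), a minimal parser fits in a few lines of Python — `parse_llms_txt` here is illustrative, not part of any standard:

```python
import re

H2 = re.compile(r"^## (.+)$")
LINK = re.compile(r"^- \[(.+?)\]\((.+?)\)")

def parse_llms_txt(text: str) -> dict[str, list[tuple[str, str]]]:
    """Group the bullet links under their H2 section headings."""
    sections: dict[str, list[tuple[str, str]]] = {}
    current = None
    for line in text.splitlines():
        if m := H2.match(line):
            current = m.group(1).strip()
            sections[current] = []
        elif (m := LINK.match(line)) and current is not None:
            sections[current].append((m.group(1), m.group(2)))
    return sections
```

Everything outside the H2 sections — the H1 title and the blockquote summary — is deliberately ignored; only the curated links matter to a consumer.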
Common failure modes
Drift. The URLs in llms.txt diverge from the sitemap or the canonical tags. Fix with a build-time check: fail the build if a URL listed in llms.txt returns non-200 or points at a non-canonical route.
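One way to wire up that build-time check, sketched with only the standard library — `listed_urls` and `check_urls` are hypothetical helper names, and note that `urlopen` follows redirects, so a stricter version would also compare each final URL against the listed one to catch non-canonical routes:

```python
import re
import urllib.error
import urllib.request

LINK = re.compile(r"\[[^\]]+\]\((https?://[^)\s]+)\)")

def listed_urls(text: str) -> list[str]:
    """Extract every absolute URL linked from llms.txt."""
    return LINK.findall(text)

def check_urls(urls: list[str]) -> list[str]:
    """Return the URLs that do not answer 200 to a HEAD request."""
    bad = []
    for url in urls:
        req = urllib.request.Request(url, method="HEAD")
        try:
            with urllib.request.urlopen(req, timeout=10) as resp:
                if resp.status != 200:
                    bad.append(url)
        except urllib.error.URLError:
            bad.append(url)
    return bad

# In CI: read llms.txt, run check_urls(listed_urls(text)),
# and exit non-zero if the returned list is non-empty.
```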
Bloat. The file grows to include every page. At that point it's a worse sitemap. Prune to the surfaces you'd hand to a new engineer on day one.
Hidden behind auth. Agents need unauthenticated access. If the origin requires login, serve llms.txt from a public subdomain or CDN origin and point the canonical host at it.
Markdown that isn't Markdown. Some generators emit HTML or escape the headings. Validate the rendered output with any Markdown linter before shipping.
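Short of a full linter, a few structural assertions already catch the usual generator mistakes — `looks_like_llms_txt` below is a minimal sketch, not an official validator:

```python
import re

def looks_like_llms_txt(text: str) -> list[str]:
    """Cheap structural checks on a rendered llms.txt.

    Returns a list of problems; an empty list means it passed.
    """
    problems = []
    lines = text.splitlines()
    if not lines or not lines[0].startswith("# "):
        problems.append("first line is not an H1 title")
    if not any(line.startswith("## ") for line in lines):
        problems.append("no H2 section headings")
    if re.search(r"</?\w+[^>]*>", text):
        problems.append("raw HTML tags present")
    if "\\#" in text:
        problems.append("escaped heading markers")
    return problems
```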
Verify before shipping
Run the llms.txt checker against the public origin. It checks /llms.txt, /llms-full.txt, the robots.txt policy for AI crawlers, sitemap alignment, and Markdown content negotiation in one pass, and returns the same evidence rows as the full isitready.dev report. Fix what it flags, re-run, ship.
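For a quick local spot-check before running the full checker, something like this verifies status, redirects, and the Content-Type header — `spot_check` is an illustrative helper, not part of any tool:

```python
import urllib.request

def content_type_ok(value: str) -> bool:
    """The file should be served as text/plain with an explicit UTF-8 charset."""
    main, _, params = value.partition(";")
    return main.strip().lower() == "text/plain" and "utf-8" in params.lower()

def spot_check(origin: str) -> None:
    """Fetch /llms.txt from the public origin and verify the basics."""
    with urllib.request.urlopen(f"{origin}/llms.txt", timeout=10) as resp:
        assert resp.status == 200, "llms.txt did not return 200"
        # urlopen follows redirects; a changed final URL means a redirect fired.
        assert resp.url == f"{origin}/llms.txt", "llms.txt was redirected"
        ct = resp.headers.get("Content-Type", "")
        assert content_type_ok(ct), f"wrong Content-Type: {ct!r}"
```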