llms.txt is a proposed, agent-readable map of the surfaces a compatible
AI assistant can look at first on your site — docs, tools, policies, canonical
URLs. It doesn't replace your sitemap, robots.txt, or HTML metadata.
It points assistants at the handful of pages you actually want them
quoting, before they spend tokens on broader crawling.
What it is (and isn't)
llms.txt is a plain-text Markdown file served from your canonical
origin at /llms.txt. It's a curated index — think "top of the file
cabinet" — not a crawl manifest. Two rules:
Short. A human should be able to read it in under a minute.
Opinionated. Link the pages you want an assistant to trust, not every page that exists.
Longer companion files are optional and can hold expanded context or
summaries for agents that need more than the short index. This site
uses /llms-full.txt for that role. Both files are public.
Neither replaces sitemap.xml: sitemaps enumerate every canonical
URL for crawlers; llms.txt curates a small set for compatible agents.
Placement and linking
Serve both files from the canonical origin with Content-Type:
text/plain; charset=utf-8. Don't redirect, don't gate behind a
login, and keep the URLs stable so assistants can cache the
files. Discoverability checklist:
Link to /llms.txt from robots.txt (a regular comment line is fine) and from your HTML <head> via <link rel="alternate" type="text/plain" href="/llms.txt">.
Reference the same canonical URLs in /llms.txt that your sitemap.xml emits. Disagreement between the two is the most common audit finding.
If you ship /llms-full.txt, point at it from /llms.txt so the short file stays scannable.
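The serving rules above (a 200 response, no redirect, no login wall, text/plain with a UTF-8 charset) can be checked mechanically. The following is a minimal sketch, not the checker this site runs: `FetchResult` and `delivery_problems` are hypothetical names, and the sketch assumes you fill in the status and headers from whatever HTTP client you already use, with redirect-following turned off.

```python
from dataclasses import dataclass, field


@dataclass
class FetchResult:
    """Hypothetical record of one HTTP fetch (redirects NOT followed)."""
    url: str
    status: int
    headers: dict[str, str] = field(default_factory=dict)


def delivery_problems(result: FetchResult) -> list[str]:
    """Return human-readable problems with how an llms.txt file is served."""
    # Anything other than a direct 200 is reported and ends the check,
    # because headers on an error or redirect response say little about the file.
    if 300 <= result.status < 400:
        return [f"{result.url} redirects (status {result.status})"]
    if result.status in (401, 403):
        return [f"{result.url} is gated (status {result.status})"]
    if result.status != 200:
        return [f"{result.url} returned status {result.status}"]

    # Header names are case-insensitive, so normalise before the lookup.
    headers = {name.lower(): value for name, value in result.headers.items()}
    content_type = headers.get("content-type", "")
    media_type, _, params = content_type.partition(";")

    if media_type.strip().lower() != "text/plain":
        return [f"Content-Type is {content_type!r}, expected text/plain"]

    # Accept both charset=utf-8 and charset="utf-8".
    normalised = params.replace(" ", "").replace('"', "").lower()
    if "charset=utf-8" not in normalised:
        return ["Content-Type is missing charset=utf-8"]
    return []
```

An empty list means the response satisfies the rules in this section. Keeping the function free of network calls makes it easy to run against recorded responses in a build step.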
A minimal example
# isitready.dev
> Public website scanning for AI readiness, SEO, security,
> performance, and production quality.
## Docs
- [Methodology](https://isitready.dev/methodology)
- [FAQ](https://isitready.dev/faq)
## Tools
- [llms.txt checker](https://isitready.dev/tools/llms-txt-checker)
## Policy
- [robots.txt](https://isitready.dev/robots.txt)
- [sitemap.xml](https://isitready.dev/sitemap.xml)
That's it. Level-two headings group surfaces; bullet items point at canonical URLs. An assistant reading this file knows which docs to trust and which policy files to cross-check before quoting anything.
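To make the structure concrete, here is a rough sketch of how a consumer might read a file shaped like the example above. It assumes only the layout shown there (one level-one title, a blockquote summary, level-two headings, and `- [label](url)` bullets); `parse_llms_txt` and `SAMPLE` are illustrative names, not part of any published parser.

```python
import re

# Matches a bullet of the form "- [label](url)" at the start of a line.
LINK = re.compile(r"^- \[(?P<label>[^\]]+)\]\((?P<url>[^)\s]+)\)")

SAMPLE = """\
# isitready.dev
> Public website scanning for AI readiness, SEO, security,
> performance, and production quality.
## Docs
- [Methodology](https://isitready.dev/methodology)
- [FAQ](https://isitready.dev/faq)
## Policy
- [robots.txt](https://isitready.dev/robots.txt)
"""


def parse_llms_txt(text: str) -> dict:
    """Parse the title, summary, and sectioned links from llms.txt-style text."""
    title = None
    summary_lines = []
    sections = {}  # heading -> list of (label, url), in file order
    current = None

    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            # A level-two heading opens a new group of links.
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith(">"):
            summary_lines.append(line[1:].strip())
        elif current is not None:
            match = LINK.match(line)
            if match:
                sections[current].append((match["label"], match["url"]))

    return {
        "title": title,
        "summary": " ".join(summary_lines),
        "sections": sections,
    }
```

Calling `parse_llms_txt(SAMPLE)` yields the title `isitready.dev`, a one-line summary, and two sections (`Docs` and `Policy`) with their links in file order. Lines that match none of the patterns are ignored, which keeps the sketch tolerant of extra prose.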
Common failure modes
Drift. The URLs in llms.txt diverge from the sitemap or the canonical tags. Fix with a build-time check: fail the build if a URL listed in llms.txt returns non-200 or points at a non-canonical route.
Bloat. The file grows to include every page. At that point it's a worse sitemap. Prune to the surfaces you'd hand to a new engineer on day one.
Hidden behind auth. Agents need unauthenticated access. If the origin requires login, serve llms.txt from a public subdomain or CDN origin and point the canonical host at it.
Markdown that isn't Markdown. Some generators emit HTML or escape the headings. Validate the rendered output with any Markdown linter before shipping.
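The drift check described above can be approximated at build time by comparing the URLs linked from llms.txt against the `<loc>` entries in sitemap.xml. This is a sketch under stated assumptions: `find_drift` and its helpers are hypothetical names, the `allow` parameter is an assumption added so policy files such as robots.txt (which a sitemap normally does not list) can be exempted, and the HTTP status check mentioned in the text is left out.

```python
import re
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Captures the absolute URL inside a Markdown link: [label](https://...)
MD_LINK_URL = re.compile(r"\]\((https?://[^)\s]+)\)")


def llms_urls(llms_text: str) -> list[str]:
    """Return every absolute URL linked from the llms.txt text, in file order."""
    return MD_LINK_URL.findall(llms_text)


def sitemap_urls(sitemap_xml: str) -> set[str]:
    """Return the set of <loc> values from a sitemap.xml document."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    }


def find_drift(llms_text: str, sitemap_xml: str, allow=()) -> list[str]:
    """List llms.txt URLs that the sitemap does not emit and that are not allowed."""
    known = sitemap_urls(sitemap_xml)
    return [
        url for url in llms_urls(llms_text)
        if url not in known and url not in allow
    ]
```

A build script could call `find_drift` with the generated files and exit non-zero when the returned list is not empty. The comparison is an exact string match, so a trailing-slash or host mismatch is reported as drift, which is usually the point of the check.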
Verify before shipping
Run the llms.txt checker against the
public origin. It checks /llms.txt, /llms-full.txt, the robots.txt
policy for AI crawlers, sitemap alignment, and Markdown negotiation
in one pass, and returns the same evidence rows as the full
isitready.dev report. Fix what it flags, re-run, ship.
If the report says the file is missing, use the focused "missing llms.txt" fix guide before changing anything else. It keeps the fix narrow: publish the file at the canonical root, return plain text, link the same production URLs as the sitemap, and verify with a fresh scan.