llms.txt is a short, agent-readable map of the surfaces an AI assistant should look at first on your site — docs, tools, policies, canonical URLs. It doesn't replace your sitemap, robots.txt, or HTML metadata. It points assistants at the handful of pages you actually want them quoting, before they spend tokens on anything else.

What it is (and isn't)

llms.txt is a plain-text Markdown file served from your canonical origin at /llms.txt. It's a curated index — think "top of the file cabinet" — not a crawl manifest. Two rules:

  1. Short. A human should be able to read it in under a minute.

  2. Opinionated. Link the pages you want an assistant to trust, not every page that exists.

A longer variant, /llms-full.txt, is optional and holds the same list plus one-sentence summaries per link. Both files are public. Neither replaces sitemap.xml: sitemaps enumerate every canonical URL for crawlers; llms.txt curates a small set for answer engines.

Placement and linking

Serve both files from the canonical origin with Content-Type: text/plain; charset=utf-8. Don't redirect, don't gate them behind a login, and keep the URLs stable so assistants can cache the responses. Discoverability checklist:

  • Link to /llms.txt from robots.txt (a regular comment line is fine) and from your HTML <head> via <link rel="alternate" type="text/plain" href="/llms.txt">.

  • Reference the same canonical URLs in /llms.txt that your sitemap.xml emits. Disagreement between the two is the most common audit finding.

  • If you ship /llms-full.txt, point at it from /llms.txt so the short file stays scannable.
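The checklist above can be automated in a build step. A minimal sketch, assuming a simple "does the surface mention /llms.txt" test is enough — the function name and the sample strings below are illustrative, not part of any standard:

```python
import re

def references_llms_txt(robots_txt: str, html_head: str) -> dict:
    """Check that both discovery surfaces point at /llms.txt."""
    return {
        # robots.txt: a plain comment line mentioning the file is enough
        "robots": "/llms.txt" in robots_txt,
        # HTML head: look for a rel="alternate" link whose href is /llms.txt
        "head": bool(re.search(
            r'<link[^>]+rel="alternate"[^>]+href="/llms\.txt"', html_head)),
    }

robots = "User-agent: *\nAllow: /\n# AI assistants: see /llms.txt\n"
head = '<link rel="alternate" type="text/plain" href="/llms.txt">'
print(references_llms_txt(robots, head))  # {'robots': True, 'head': True}
```

The regex assumes the attributes appear in the order shown; a real check would parse the HTML instead of pattern-matching it.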

A minimal example

# isitready.dev

> Public website scanning for AI readiness, SEO, security,
> performance, and production quality.

## Docs
- [Methodology](https://isitready.dev/methodology)
- [FAQ](https://isitready.dev/faq)

## Tools
- [llms.txt checker](https://isitready.dev/tools/llms-txt-checker)

## Policy
- [robots.txt](https://isitready.dev/robots.txt)
- [sitemap.xml](https://isitready.dev/sitemap.xml)

That's it. Level-two headings group surfaces; bullet list items point at canonical URLs. An assistant reading this file knows which docs are authoritative, which tools are runnable, and which policy files to cross-check.
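An assistant-side reader for that structure fits in a few lines. This is a sketch of how the H2-plus-bullets layout parses, not a reference parser — it assumes one link per bullet in the exact `- [Name](url)` form:

```python
def parse_llms_txt(text: str) -> dict[str, list[str]]:
    """Group Markdown link URLs under their level-two section headings."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith("- [") and current:
            # the URL sits between '(' and ')' in a Markdown link
            url = line[line.index("(") + 1 : line.rindex(")")]
            sections[current].append(url)
    return sections

sample = (
    "## Docs\n- [FAQ](https://isitready.dev/faq)\n"
    "## Tools\n- [Checker](https://isitready.dev/tools/llms-txt-checker)\n"
)
print(parse_llms_txt(sample))
# {'Docs': ['https://isitready.dev/faq'],
#  'Tools': ['https://isitready.dev/tools/llms-txt-checker']}
```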

Common failure modes

  • Drift. The URLs in llms.txt diverge from the sitemap or the canonical tags. Fix with a build-time check: fail the build if a URL listed in llms.txt returns non-200 or points at a non-canonical route.

  • Bloat. The file grows to include every page. At that point it's a worse sitemap. Prune to the surfaces you'd hand to a new engineer on day one.

  • Hidden behind auth. Agents need unauthenticated access. If the origin requires login, serve llms.txt from a public subdomain or CDN origin and link to it from the canonical host.

  • Markdown that isn't Markdown. Some generators emit HTML or escape the headings. Validate the rendered output with any Markdown linter before shipping.
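The drift check described above can be sketched as a build-time gate. Everything here is illustrative — the link regex, the function names, and the sample URLs are assumptions, and a CI job would additionally fetch each URL and fail on a non-200 response:

```python
import re

# matches the URL half of a Markdown link: [Name](https://...)
LINK_RE = re.compile(r'\]\((https?://[^)\s]+)\)')

def extract_urls(llms_txt: str) -> list[str]:
    """Pull every linked URL out of the llms.txt body."""
    return LINK_RE.findall(llms_txt)

def check_drift(llms_txt: str, sitemap_urls: set[str]) -> list[str]:
    """Return problems; an empty list means llms.txt and the sitemap agree."""
    return [f"{url} not in sitemap"
            for url in extract_urls(llms_txt)
            if url not in sitemap_urls]

sample = "## Docs\n- [FAQ](https://example.com/faq)\n"
print(check_drift(sample, {"https://example.com/faq"}))  # []
print(check_drift(sample, set()))  # ['https://example.com/faq not in sitemap']
```

Wire the non-empty case to a non-zero exit code and the build fails exactly when the two files disagree.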

Verify before shipping

Run the llms.txt checker against the public origin. It checks /llms.txt, /llms-full.txt, robots.txt policy for AI crawlers, sitemap alignment, and Markdown content negotiation in one pass, and returns the same evidence rows as the full isitready.dev report. Fix what it flags, re-run, ship.
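A local pre-ship smoke test can cover the transport-level rules from this article (200 status, text/plain content type) before reaching for the checker. A sketch — the function is hypothetical, and in practice you would feed it from a real fetch such as urllib.request.urlopen:

```python
def headers_ok(status: int, content_type: str) -> list[str]:
    """Flag problems with the response a public /llms.txt fetch returns."""
    problems = []
    if status != 200:
        # redirects and auth walls both surface as non-200 here
        problems.append(f"expected 200, got {status}")
    if not content_type.lower().startswith("text/plain"):
        problems.append(f"expected text/plain, got {content_type}")
    return problems

print(headers_ok(200, "text/plain; charset=utf-8"))  # []
print(headers_ok(302, "text/html"))  # two problems: redirect and wrong type
```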