// Reference
What is llms.txt?
llms.txt is an emerging plain-text convention, placed at the site root like robots.txt, that points AI engines at the canonical content on your site you want them to ingest.
// Format
How llms.txt is structured.
llms.txt lives at the root of your domain and is served as text/plain. It is written in Markdown, intentionally simple, and assumes an LLM is the primary reader. A typical file opens with an H1 of your site or product name, a short summary paragraph, and then a sequence of H2 sections grouping links to canonical pages: documentation, glossary entries, pricing, comparisons, policies. Each link gets a short label. The point is to give the model a hand-curated map instead of letting it guess. Vizelo’s own file is at /llms.txt if you want a working example.
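To make the layout concrete, here is a minimal Python sketch that renders a file in the shape described above: an H1, a summary paragraph, then H2 sections of labeled links. The `Link` and `Section` types, the `render_llms_txt` helper, and the example.com URLs are all hypothetical names invented for illustration, and the "link, colon, label" item format is one reasonable choice, not a requirement.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str
    note: str = ""  # optional short label shown after the link


@dataclass
class Section:
    heading: str
    links: list[Link] = field(default_factory=list)


def render_llms_txt(name: str, summary: str, sections: list[Section]) -> str:
    """Render an H1, a summary paragraph, then one H2 per group of links."""
    lines = [f"# {name}", "", summary, ""]
    for section in sections:
        lines.append(f"## {section.heading}")
        lines.append("")
        for link in section.links:
            item = f"- [{link.title}]({link.url})"
            if link.note:
                item += f": {link.note}"
            lines.append(item)
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site content, for illustration only.
example = render_llms_txt(
    name="Example Co",
    summary="Example Co makes a hypothetical widget analytics product.",
    sections=[
        Section("Docs", [Link("Product overview",
                              "https://example.com/docs/overview",
                              "What the product does")]),
        Section("Policies", [Link("Privacy", "https://example.com/privacy")]),
    ],
)
print(example)
```

Generating the file from structured data, rather than editing it by hand, keeps the index in step with the pages it points to.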
// Variants
llms.txt vs llms-full.txt.
Two files, two jobs.
- llms.txt is the index. Short. Scannable. Links to your most important canonical pages with one-line descriptions. Treat it like a sitemap for LLMs.
- llms-full.txt is the bundle. It concatenates the full text of those pages into a single file an LLM can ingest in one read — useful for ingestion pipelines that prefer not to crawl many URLs.
Vizelo publishes both: /llms.txt and /llms-full.txt. Most sites should ship the index first and add the full bundle once the canonical pages stabilize.
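The bundle can be produced from the same list of pages that feeds the index. The sketch below assumes you already have each page's text as Markdown. The `build_llms_full` helper, the `Source:` line, and the sample pages are hypothetical choices made for illustration, not part of any spec.

```python
def build_llms_full(name: str, pages: list[tuple[str, str, str]]) -> str:
    """Concatenate (title, url, body) pages into one Markdown document."""
    parts = [f"# {name}"]
    for title, url, body in pages:
        # One H2 per page, with its source URL, followed by the page text.
        parts.append(f"## {title}\n\nSource: {url}\n\n{body.strip()}")
    return "\n\n".join(parts) + "\n"


# Hypothetical pages, for illustration only.
pages = [
    ("Product overview", "https://example.com/docs/overview",
     "Overview text goes here.\n"),
    ("Pricing", "https://example.com/pricing",
     "  Pricing text goes here.  "),
]
bundle = build_llms_full("Example Co", pages)
print(bundle)
```

Keeping the source URL next to each page's text lets a reader of the bundle trace any passage back to its canonical page.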
// Contents
What to put in yours.
- Identity. Your brand name, one-line description, and canonical URL. Resolve the entity before you list anything else.
- Core docs. Product overview, pricing, comparisons, how the thing works. The pages an evaluator would actually need.
- Definitional content. Your glossary, your category-defining pages, your category FAQs. The stuff engines reach for when asked “what is X.”
- Policies. Privacy and terms, so an engine summarizing your trust posture has the right source.
- Skip. Login pages, paginated archives, marketing experiments, anything you wouldn’t link a senior buyer to.
// Reality check
How AI engines actually use it (today).
Honest read: adoption is early and uneven. Developer-facing ingestion pipelines and several smaller crawlers already read llms.txt. The largest consumer-facing engines have not formally committed to a spec, and their behavior is inconsistent. The case for shipping one anyway rests on three points: the cost is near zero; the file improves at least one downstream surface today (developer tools, agents, anyone consuming your docs through an LLM); and the directional bet is clear, because as crawling gets more expensive and ingestion gets more structured, a hand-curated content map gets more valuable, not less. Ship it now, update it quarterly, and treat support from the largest engines as upside rather than the reason to do it. Vizelo generates and maintains both files automatically as part of your GEO program.
// FAQ
Common questions.
What is llms.txt?
llms.txt is a plain-text file placed at the root of your site that points AI engines and ingestion pipelines at the canonical content you want them to use. It is conceptually similar to robots.txt and sitemap.xml, but written in human-readable Markdown for LLM consumption.
What’s the difference between llms.txt and llms-full.txt?
llms.txt is the lightweight index — short, scannable, with links to your most important pages. llms-full.txt is the long-form version, containing the actual text of those pages bundled into one file an LLM can ingest in a single read.
Do AI engines actually read llms.txt today?
Adoption is early and uneven. Some ingestion pipelines and developer-facing crawlers read it; the largest consumer-facing engines have not formally committed. The file is still worth shipping because the cost is near zero, it improves at least one downstream path today, and adoption is heading the right way.
Where should llms.txt live?
At the root of your domain — https://example.com/llms.txt — served as text/plain. Same convention as robots.txt. The llms-full.txt sibling sits next to it.
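A quick way to confirm placement is to check the URL path and the Content-Type header your server returns. The sketch below is a pure function so it runs without network access. `check_llms_location` is a hypothetical helper name, and the exact-match rule on the path is an assumption drawn from the convention described here.

```python
from urllib.parse import urlsplit


def check_llms_location(url: str, content_type: str) -> list[str]:
    """Return a list of problems; an empty list means the basics look right."""
    problems = []
    path = urlsplit(url).path
    if path not in ("/llms.txt", "/llms-full.txt"):
        problems.append(f"expected a root-level path, got {path!r}")
    # Ignore parameters such as "; charset=utf-8" when comparing media types.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "text/plain":
        problems.append(f"expected text/plain, got {media_type!r}")
    return problems


print(check_llms_location("https://example.com/llms.txt",
                          "text/plain; charset=utf-8"))
print(check_llms_location("https://example.com/docs/llms.txt", "text/html"))
```

In practice the header value would come from a real response, for example `urllib.request.urlopen(url).headers.get("Content-Type")`.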
Does llms.txt block AI engines from training on my site?
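No. llms.txt is descriptive, not restrictive. It tells engines what you want them to use. To opt out of training, use robots.txt user-agent rules for the relevant crawler tokens, such as GPTBot (OpenAI) and Google-Extended (Google), plus any platform-specific controls a vendor offers. OAI-SearchBot, by contrast, is OpenAI’s search crawler, so blocking it affects search visibility rather than training.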
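Because the opt-out lives in robots.txt rather than llms.txt, it helps to test the rules before deploying them. The sketch below uses Python's standard `urllib.robotparser` to evaluate a sample robots.txt. The sample rules and the `allowed` helper are illustrative assumptions: GPTBot is used here as OpenAI's training crawler token and Google-Extended as Google's AI-training token. Check each vendor's current crawler documentation before relying on them, and remember that compliance with robots.txt is voluntary.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules: block two training-related tokens, allow everything else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str = "/docs/overview") -> bool:
    """Report whether the sample rules let `agent` fetch `path`."""
    return parser.can_fetch(agent, f"https://example.com{path}")


for agent in ("GPTBot", "Google-Extended", "OAI-SearchBot"):
    print(agent, allowed(agent))
```

Under these sample rules the two training tokens are blocked, while a search crawler such as OAI-SearchBot falls through to the wildcard rule and stays allowed.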