llms.txt is a plain Markdown file you place at `/llms.txt` on your domain. It gives AI language models a curated map of your most important content - what your site is, what it does, and which pages are worth reading. It takes under two hours to create and requires no ongoing infrastructure.
What Is llms.txt?
llms.txt is a Markdown-formatted file served at the root of a domain (yourdomain.com/llms.txt) that provides AI language models with a structured, author-curated index of a website's most important content.
The specification was proposed by Jeremy Howard - co-founder of Answer.AI and fast.ai - on September 3, 2024. Howard published the original proposal at answer.ai/posts/2024-09-03-llmstxt.html. The canonical specification now lives at llmstxt.org, and the project is maintained at github.com/answerdotai/llms-txt.
The problem Howard identified is straightforward: LLMs have finite context windows. When an AI assistant or agent needs to understand a product, service, or documentation site, it cannot ingest every page. HTML pages are cluttered with navigation, footers, cookie banners, and marketing copy that consume tokens without adding meaning. Even sitemaps just list hundreds of URLs with no context about which pages matter most.
llms.txt solves this by letting site owners write, in plain language, exactly what their site is and exactly which pages the model should prioritize. It is purely additive - it does not restrict access the way robots.txt does. It provides context.
Why llms.txt Matters for AI Visibility
Roughly one in ten domains has now published an llms.txt file. An SE Ranking study of approximately 300,000 domains found a 10.13% adoption rate as of early 2026. Adoption is concentrated in developer tools, documentation sites, and technical SaaS products.
Notably, adoption is fairly even across traffic tiers:
- Low-traffic sites: 9.88%
- Mid-traffic sites: 10.54%
- High-traffic sites: 8.27%
The largest consumer platforms (Google, Facebook, YouTube, Amazon) have not adopted it. llms.txt is primarily a tool for the technical and professional web - the segment most likely to be discovered through AI assistants rather than social feeds.
For GEO practitioners, the relevance is practical rather than speculative. AI tools - from ChatGPT to Perplexity to IDE coding assistants - increasingly serve as the first point of contact between a buyer or developer and your product. If an AI model cannot accurately describe what you do, you lose that touchpoint. llms.txt gives you a direct line to write that description yourself.
The strongest confirmed use case is developer-facing products: coding assistants like Cursor and GitHub Copilot actively consume documentation context. A well-structured llms.txt directly improves how those tools answer developer questions about your API.
How llms.txt Works
When an AI tool, agent, or IDE needs to understand a website, it can fetch /llms.txt as a first step before deciding which deeper pages to read. The file's H1 and blockquote give the model an immediate summary. The linked sections tell it where to go next.
The file is designed for three distinct use cases:
- Inference-time use - A user asks an AI assistant about your product. The assistant fetches your llms.txt to quickly orient itself before answering.
- Agent-based workflows - An AI agent autonomously navigates your site. The llms.txt file acts as a table of contents, reducing unnecessary fetches.
- IDE and tooling ingestion - Coding assistants like Cursor index your documentation. Your llms.txt tells them which pages contain the most relevant technical content.
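To make the "summary first, links second" flow concrete, here is a minimal sketch of how a consuming tool might read the file. The `parseLlmsTxt` function and its types are illustrative names, not part of the specification or any real library, and the parser only handles the simple shapes described in this article.

```typescript
// Illustrative reader for llms.txt. Names and types are assumptions for this sketch.
interface LlmsLink {
  title: string;
  url: string;
  description: string; // empty string when the link has no ": description" part
}

interface LlmsFile {
  title: string; // text of the H1
  summary: string; // blockquote lines joined with spaces
  sections: { [heading: string]: LlmsLink[] };
}

function parseLlmsTxt(text: string): LlmsFile {
  const result: LlmsFile = { title: "", summary: "", sections: {} };
  const summaryLines: string[] = [];
  let current: string | null = null;
  const linkPattern = /^-\s*\[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.*))?$/;

  // Drop a leading byte-order mark, then walk the file line by line.
  const lines = text.replace(/^\uFEFF/, "").split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].trim();
    const h1 = /^#\s+(.+)$/.exec(line);
    const h2 = /^##\s+(.+)$/.exec(line);
    const quote = /^>\s?(.*)$/.exec(line);
    const link = linkPattern.exec(line);

    if (h1 && result.title === "") {
      result.title = h1[1].trim();
    } else if (h2) {
      current = h2[1].trim();
      result.sections[current] = [];
    } else if (current === null && quote) {
      // Blockquote lines before the first section form the summary.
      summaryLines.push(quote[1].trim());
    } else if (current !== null && link) {
      result.sections[current].push({
        title: link[1],
        url: link[2],
        description: link[3] ? link[3].trim() : "",
      });
    }
  }

  result.summary = summaryLines.join(" ");
  return result;
}
```

A tool could read `title` and `summary` to orient itself, then use the per-link `description` values to decide which URLs are worth fetching.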
Which crawlers actually use it? As of mid-2026, confirmed support is limited:
| Crawler / Platform | Status | Notes |
|---|---|---|
| Anthropic / Claude | Confirmed | Publishes its own llms.txt; Claude-based tools report using it |
| Perplexity | Confirmed | Public support statement |
| OpenAI / ChatGPT | Observed, unconfirmed | GPTBot fetches the file but no public statement |
| Mistral | Maturing | Listed as having partial support |
| Google / Gemini | No support | Explicitly stated no plans to use it |
| Bing / Copilot | No confirmed support | Not documented |
| LangChain, LlamaIndex | Variable | Plugins exist; depends on developer configuration |
A 90-day server log study by OtterlyAI found that /llms.txt received 84 of more than 62,100 AI bot visits - roughly 0.1% of AI traffic. Even where support exists, the file is not being heavily prioritized yet.
llms.txt vs robots.txt: Key Differences
These two files are often confused but serve opposite purposes and should both exist on your domain.
| Dimension | robots.txt | llms.txt |
|---|---|---|
| Purpose | Access control - tells crawlers what NOT to fetch | Semantic guidance - tells AI what IS most worth reading |
| Direction | Restrictive | Additive |
| Mechanism | Allow/disallow rules with user-agent targeting | Curated Markdown index with descriptions |
| Scope | All crawlers (search engines, AI bots, scrapers) | AI models and agents specifically |
| Enforcement | Technical + legal weight; widely enforced | Proposal only; no enforcement mechanism |
| Support | 25+ years, near-universal implementation | ~18 months, ~10% adoption |
| Effect on Google | Controls indexing and crawl budget | No effect (Google does not use it) |
Think of robots.txt as a bouncer who controls which rooms people can enter. llms.txt is a tour guide who explains which exhibits are worth seeing. You need both.
robots.txt blocks content you do not want crawled. llms.txt promotes content you want AI to understand. Use robots.txt to exclude low-value paths (admin, search results, pagination). Use llms.txt to point AI toward your best content.
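One consistency check follows from this split, though it is an inference rather than something either specification prescribes: pages you promote in llms.txt should not be blocked in robots.txt. The sketch below is deliberately simplified. It only reads `Disallow` rules in groups that apply to `User-agent: *`, and it ignores `Allow` rules and wildcards. The function names are hypothetical.

```typescript
// Simplified robots.txt check. Not a full parser: no Allow rules, no wildcards.
function disallowedPrefixes(robotsTxt: string): string[] {
  const prefixes: string[] = [];
  let appliesToAll = false; // true while inside a group that includes "User-agent: *"
  let inAgentRun = false; // true while reading consecutive User-agent lines

  const lines = robotsTxt.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].replace(/#.*$/, "").trim(); // strip comments
    const match = /^([A-Za-z-]+)\s*:\s*(.*)$/.exec(line);
    if (!match) continue;

    const field = match[1].toLowerCase();
    const value = match[2].trim();
    if (field === "user-agent") {
      // Consecutive User-agent lines share one group of rules.
      appliesToAll = (inAgentRun ? appliesToAll : false) || value === "*";
      inAgentRun = true;
    } else {
      inAgentRun = false;
      if (field === "disallow" && appliesToAll && value !== "") {
        prefixes.push(value);
      }
    }
  }
  return prefixes;
}

// Returns the paths from llms.txt that a "User-agent: *" group disallows.
function blockedLinks(robotsTxt: string, paths: string[]): string[] {
  const prefixes = disallowedPrefixes(robotsTxt);
  return paths.filter((path) =>
    prefixes.some((prefix) => path.indexOf(prefix) === 0)
  );
}
```

An empty result from `blockedLinks` means none of the listed paths matched a blanket disallow rule. Bot-specific groups would need a fuller parser.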
How to Create Your llms.txt File (Step by Step)
- Audit your content hierarchy. Identify the 5-20 pages an AI most needs to understand your product. For a SaaS product, this typically includes: product overview, key features page, pricing, getting started guide, API reference, and FAQ. Do not list every page - curation is the entire point.
- Write the H1. Use your product or company name exactly as you brand it. Nothing else on this line.
- Write the blockquote. One to three sentences describing what your product does, who it serves, and what problem it solves. Write this carefully - AI models often use this text verbatim when describing your product to users.
- Organize into sections. Group your links under H2 headings. Common patterns: `## Docs`, `## API Reference`, `## Pricing`, `## Blog`, `## Legal`, `## Optional`.
- Write descriptions for every link. One sentence per link, placed after a colon following the URL. Models use these to decide whether to fetch a given URL.
- Create the file. For static sites, place a plain text file named `llms.txt` in your public root. For Next.js App Router, use a route handler (see format section below).
- Optionally create llms-full.txt. For extensive documentation, concatenate the full Markdown content of all referenced pages into `/llms-full.txt`.
- Do not add it to sitemap.xml. llms.txt is not an indexable HTML page. Do not list it in your sitemap.
- Verify. Confirm your URL returns HTTP 200 with `Content-Type: text/plain` and valid Markdown.
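If you keep your curated page list as data, the writing steps above (H1, blockquote, sections, one description per link) can be sketched as a small render function. The `SiteIndex` shape and `renderLlmsTxt` name are assumptions made for illustration, not an official generator.

```typescript
// Hypothetical data shape for a curated site index.
interface LinkEntry {
  title: string;
  url: string;
  description: string; // one sentence, shown after the colon
}

interface Section {
  heading: string; // rendered as an H2
  links: LinkEntry[];
}

interface SiteIndex {
  name: string; // rendered as the H1
  summary: string; // rendered as the blockquote
  sections: Section[];
}

// Renders the index as llms.txt Markdown: H1, blockquote, then H2 link lists.
function renderLlmsTxt(site: SiteIndex): string {
  const out: string[] = ["# " + site.name, "", "> " + site.summary, ""];
  site.sections.forEach((section) => {
    out.push("## " + section.heading);
    section.links.forEach((link) => {
      out.push(`- [${link.title}](${link.url}): ${link.description}`);
    });
    out.push("");
  });
  return out.join("\n");
}
```

Generating the file from data keeps it easy to regenerate whenever your page list changes, which suits the "no ongoing burden if automated" goal.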
llms.txt Format and Syntax
The specification defines a strict element order:
- Optional UTF-8 byte-order mark (BOM)
- H1 heading (REQUIRED) - the only mandatory element
- Blockquote (optional but strongly recommended) - short summary of the site
- Optional unstructured Markdown (no H2/H3 at this level)
- H2-delimited sections each containing a Markdown list of links
A section titled ## Optional has special meaning: tools processing the file may safely skip those links when building a shorter context window.
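The element order above lends itself to a quick lint before you deploy. This is a rough sketch that checks only the rules listed in this article. `validateLlmsTxt` is a hypothetical helper, not a reference validator, and a clean result does not guarantee that every consuming tool will accept the file.

```typescript
// Returns a list of human-readable problems; an empty list means no issues were found.
function validateLlmsTxt(text: string): string[] {
  const problems: string[] = [];
  const lines = text
    .replace(/^\uFEFF/, "") // an optional byte-order mark is allowed
    .split(/\r?\n/)
    .map((l) => l.trim())
    .filter((l) => l !== "");

  // The H1 is the only required element and must come first.
  if (lines.length === 0 || !/^#\s+\S/.test(lines[0])) {
    problems.push("first non-empty line must be an H1");
  }

  let h1Count = 0;
  let inSection = false;
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    if (/^#\s+\S/.test(line)) {
      h1Count++;
    } else if (/^##\s+\S/.test(line)) {
      inSection = true;
    } else if (/^#{3,}\s/.test(line) && !inSection) {
      problems.push("heading before the first H2 section: " + line);
    } else if (/^>/.test(line) && inSection) {
      problems.push("blockquote after sections begin: " + line);
    } else if (
      inSection &&
      /^[-*]\s/.test(line) &&
      !/^[-*]\s*\[[^\]]+\]\([^)]+\)/.test(line)
    ) {
      problems.push("list item is not a link: " + line);
    }
  }

  if (h1Count > 1) problems.push("more than one H1");
  return problems;
}
```

Running this in CI against the generated file is one way to catch a missing H1 or a stray plain-text bullet before it ships.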
Here is a complete, real-world example:
```markdown
# CitedSpy

> CitedSpy is a GEO (Generative Engine Optimization) tracking platform for brands and agencies.
> It monitors how often your brand is cited, mentioned, and recommended across AI engines -
> ChatGPT, Perplexity, Gemini, Claude, and Copilot.

## Product

- [How CitedSpy Works](https://citedspy.com/features): Overview of the monitoring, analysis, and reporting features
- [Pricing](https://citedspy.com/pricing): Plan tiers, feature comparison, and enterprise options
- [Changelog](https://citedspy.com/changelog): Recent product updates and new engine support

## Blog

- [What is GEO?](https://citedspy.com/blog/generative-engine-optimization): Introduction to Generative Engine Optimization for marketers
- [AI Citation Guide](https://citedspy.com/blog/ai-citation): How AI engines decide what to cite and how to measure it

## Optional

- [Privacy Policy](https://citedspy.com/privacy): Data handling and GDPR compliance
- [Terms of Service](https://citedspy.com/terms): Usage terms and acceptable use policy
- [About](https://citedspy.com/about): Company background and founding story
```

For Next.js App Router, serve the file via a route handler to avoid static file issues:
```typescript
// app/llms.txt/route.ts
export const dynamic = "force-static";

export async function GET() {
  const content = [
    "# CitedSpy",
    "",
    "> CitedSpy monitors brand citations across AI engines including",
    "> ChatGPT, Perplexity, Gemini, Claude, and Copilot.",
    "",
    "## Product",
    "- [Features](https://citedspy.com/features): Full feature overview and engine coverage",
    "- [Pricing](https://citedspy.com/pricing): Plan comparison and pricing details",
    "",
    "## Optional",
    "- [Privacy Policy](https://citedspy.com/privacy): Data handling practices",
  ].join("\n");

  return new Response(content, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}
```

The `dynamic = "force-static"` directive generates the file at build time, eliminating runtime overhead.
llms-full.txt: The Extended Version
llms-full.txt is an optional companion file served at /llms-full.txt. Where llms.txt is a navigation index (links and descriptions), llms-full.txt is a single flat document containing the complete prose content of every referenced page.
The two files serve different use cases:
- llms.txt is for conversational AI tools that need a quick map to decide which URL to fetch. Small, fast, fits easily in a context window.
- llms-full.txt is for IDE integrations, agent frameworks, and RAG pipelines that want to index your entire knowledge base without making individual HTTP requests per page.
Anthropic's documentation exemplifies this pattern: docs.claude.com/llms.txt is a slim index, while docs.claude.com/llms-full.txt is a large export of their complete documentation. Mintlify automatically generates both files for every documentation site they host.
When to publish both:
- Your documentation is extensive enough to not fit in a single context window
- You serve developer users who work with AI coding assistants
- You want to support agent-based workflows needing offline ingestion
llms-full.txt has no formal specification beyond being flat Markdown. By convention, concatenate pages with H1 or H2 dividers between sections so models can parse where one document ends and the next begins.
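As a rough illustration of that convention, the sketch below joins pages under H1 dividers. The `Page` shape, the `buildLlmsFull` name, and the `Source:` line are assumptions of this example. Nothing in the llms.txt proposal mandates them.

```typescript
// Hypothetical input: one entry per page referenced in llms.txt.
interface Page {
  title: string;
  url: string;
  markdown: string; // page body as Markdown, without its own title heading
}

// Concatenates pages into one flat document, using an H1 per page as the divider.
function buildLlmsFull(pages: Page[]): string {
  return (
    pages
      .map(
        (page) =>
          `# ${page.title}\n\nSource: ${page.url}\n\n${page.markdown.trim()}`
      )
      .join("\n\n") + "\n"
  );
}
```

If your page bodies contain their own H1 headings, you would likely want to demote them first so the dividers stay unambiguous. This sketch does not do that.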
Does llms.txt Actually Work?
Honestly? The evidence for direct citation impact is weak. Here is what the research shows:
Studies finding no effect:
- SE Ranking's analysis of 300,000 domains found no statistically significant correlation between having llms.txt and being cited by AI engines. Removing the variable actually improved their model's prediction accuracy.
- IndexLab's before/after study in late 2025 found no measurable effect on citation rates.
- Search Engine Land tracked 10 sites and found no change in AI citation behavior after adding the file.
- OtterlyAI's 90-day server log study: llms.txt received 84 AI bot visits out of 62,100+ - about 3x fewer visits than a typical content page.
Moderately positive finding:
- Presenc AI research found a "moderately positive correlation" between well-curated llms.txt files and citation uplift - but specifically on Anthropic and Perplexity platforms, and specifically for sites with complex navigation structures where the file provides genuine disambiguation.
The practitioner consensus as of 2026 is clear: AI citation visibility is driven primarily by topical authority, consistent mentions across high-quality external sources, well-structured content that directly answers questions, and strong entity signals in structured data. llms.txt does not substitute for these factors.
The strongest ROI from llms.txt is not citation ranking - it is developer tooling. Coding assistants like Cursor, GitHub Copilot, and IDE integrations actively consume this content. If your product has an API or SDK, a well-structured llms.txt and llms-full.txt meaningfully improves how those tools explain your product to developers.
Treat llms.txt as low-cost infrastructure with a specific confirmed use case, not as a GEO silver bullet. It takes 1-2 hours to create and imposes no ongoing burden if automated.
llms.txt Best Practices
Keep it curated, not exhaustive:
- Aim for 5-20 links total; 50+ links defeats the purpose
- One thoughtful description per link beats no descriptions
- Mirror your actual information architecture, not your sitemap
Write the blockquote for AI, not for humans:
- This text frequently gets used verbatim by models describing your product
- Include: what you do, who you serve, what problem you solve
- Avoid marketing superlatives; write for accuracy
Content to include:
- Product overview and key feature pages
- Getting started guides and onboarding paths
- API reference and developer documentation
- Pricing page
- High-quality blog articles that establish topical authority
Content to exclude:
- Pagination, tag archives, search result pages
- Pages behind authentication (models cannot fetch them)
- UTM-tagged or duplicate-content URLs
- Admin paths, internal tooling, staging URLs
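The exclusion list above can be partly automated. The patterns in this sketch are guesses at common URL conventions (for example `/page/2`, `/tag/...`, a `staging.` subdomain), so treat them as a starting point and adjust them to your own site. `exclusionReason` and `lintLinks` are hypothetical helpers. Pages behind authentication cannot be detected from the URL alone and are not covered.

```typescript
// Returns why a URL probably does not belong in llms.txt, or null if nothing matched.
function exclusionReason(url: string): string | null {
  if (/[?&]utm_[a-z]+=/i.test(url)) return "UTM-tagged URL";
  if (/[?&]page=\d+/i.test(url) || /\/page\/\d+/i.test(url)) return "pagination";
  if (/\/tags?\//i.test(url)) return "tag archive";
  if (/\/search(\/|\?|$)/i.test(url)) return "search results";
  if (/\/(admin|internal)(\/|$)/i.test(url)) return "admin or internal path";
  if (/^https?:\/\/(staging|dev)\./i.test(url)) return "staging host";
  return null;
}

// Collects every flagged URL together with the reason it was flagged.
function lintLinks(urls: string[]): { url: string; reason: string }[] {
  const flagged: { url: string; reason: string }[] = [];
  urls.forEach((url) => {
    const reason = exclusionReason(url);
    if (reason !== null) flagged.push({ url: url, reason: reason });
  });
  return flagged;
}
```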
Use the `## Optional` section: Mark secondary content under ## Optional. Tools that process the file are permitted to skip these links when building a shorter context. Legal pages, changelog entries, and older blog posts typically belong here.
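To show what "skipping" could look like on the consuming side, here is a minimal sketch that drops the `## Optional` section before the file is placed in a context window. `dropOptionalSection` is an illustrative name, and real tools may implement this differently or not at all.

```typescript
// Removes the "## Optional" section (heading and links) and keeps everything else.
function dropOptionalSection(text: string): string {
  const kept: string[] = [];
  let skipping = false;

  const lines = text.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    const heading = /^##\s+(.+?)\s*$/.exec(line);
    if (heading) {
      // Every H2 starts a new section; skip only the one titled exactly "Optional".
      skipping = heading[1] === "Optional";
    }
    if (!skipping) kept.push(line);
  }

  // Normalize to a single trailing newline.
  return kept.join("\n").replace(/\n+$/, "") + "\n";
}
```

Sections that follow `## Optional` are kept, so the position of the Optional section in the file does not matter to this sketch.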
Maintenance:
- Update when you publish major new documentation sections
- Remove redirected or deleted pages promptly
- Validate that every URL returns a 200 response
- Do not add to sitemap.xml
- Serve with `Content-Type: text/plain; charset=utf-8`
How to Track Whether llms.txt Is Working
Measuring llms.txt impact requires monitoring what actually matters - not file fetches, but whether AI engines are citing you accurately and frequently.
Server log analysis: Check your web server logs for AI bot user-agents (ClaudeBot, GPTBot, PerplexityBot) fetching /llms.txt. A baseline before and after publishing lets you measure whether bots are discovering and fetching the file. Per OtterlyAI's research, expect very low absolute numbers.
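A minimal sketch of that log check follows. It assumes access logs in the common "combined" format, where the request line and user-agent appear in double quotes, and it matches the three bot names above as plain substrings. `countBotHits` and `BotStats` are hypothetical names, and the real user-agent strings these bots send may change over time.

```typescript
// Bot names taken from this article; matched as substrings of each log line.
const AI_BOTS = ["ClaudeBot", "GPTBot", "PerplexityBot"];

interface BotStats {
  llmsTxtHits: number; // requests for /llms.txt
  totalHits: number; // all requests by this bot
}

// Counts per-bot requests from access-log lines (combined log format assumed).
function countBotHits(logLines: string[]): { [bot: string]: BotStats } {
  const stats: { [bot: string]: BotStats } = {};
  AI_BOTS.forEach((bot) => {
    stats[bot] = { llmsTxtHits: 0, totalHits: 0 };
  });

  logLines.forEach((line) => {
    const request = /"(?:GET|HEAD) (\S+) HTTP\/[\d.]+"/.exec(line);
    if (!request) return;
    const path = request[1].split("?")[0]; // ignore query strings

    AI_BOTS.forEach((bot) => {
      if (line.indexOf(bot) === -1) return;
      stats[bot].totalHits++;
      if (path === "/llms.txt") stats[bot].llmsTxtHits++;
    });
  });

  return stats;
}
```

Comparing `llmsTxtHits` against `totalHits` for the weeks before and after publishing gives you the baseline described above.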
Citation monitoring: The more meaningful metric is whether your brand is being cited correctly across AI engines - and whether your framing from the llms.txt blockquote shows up in AI-generated descriptions of your product. Running a citation baseline before publishing llms.txt and monitoring for 60-90 days after gives you actual before/after data rather than assumptions. For this, you need a tool that runs your tracked prompts across ChatGPT, Perplexity, Gemini, Claude, and Copilot on a consistent schedule - exactly what CitedSpy automates.
Developer tool feedback: If you have an API or SDK, ask your developer community whether coding assistants are accurately describing your product. This is anecdotal but often the clearest signal that llms.txt is having its intended effect.
What not to expect: Do not expect a spike in AI bot traffic to your site. llms.txt reduces unnecessary fetches rather than increasing them - that is partly the point.
Frequently Asked Questions
llms.txt is worth implementing - not because the evidence for citation uplift is strong (it is not, yet), but because it is low-effort infrastructure that will matter more as AI tool support matures and as the developer tooling use case compounds. Stripe, Vercel, Cloudflare, and Anthropic all publish it. The cost is two hours. The downside is zero.
If you want to measure whether any of this is actually moving the needle on your AI visibility, CitedSpy tracks brand citations, sentiment, and mention frequency across every major AI engine - giving you the before/after data to evaluate what actually works.