llms.txt is a plain Markdown file you place at `/llms.txt` on your domain. It gives AI language models a curated map of your most important content - what your site is, what it does, and which pages are worth reading. It takes under two hours to create and requires no ongoing infrastructure.

[Figure: a website serving its llms.txt file to multiple AI engines, including Claude, Perplexity, and coding assistants]

What Is llms.txt?

llms.txt is a Markdown-formatted file served at the root of a domain (yourdomain.com/llms.txt) that provides AI language models with a structured, author-curated index of a website's most important content.

The specification was proposed by Jeremy Howard - co-founder of Answer.AI and fast.ai - on September 3, 2024. Howard published the original proposal at answer.ai/posts/2024-09-03-llmstxt.html. The canonical specification now lives at llmstxt.org, and the project is maintained at github.com/answerdotai/llms-txt.

The problem Howard identified is straightforward: LLMs have finite context windows. When an AI assistant or agent needs to understand a product, service, or documentation site, it cannot ingest every page. HTML pages are cluttered with navigation, footers, cookie banners, and marketing copy that consume tokens without adding meaning. Even sitemaps just list hundreds of URLs with no context about which pages matter most.

llms.txt solves this by letting site owners write, in plain language, exactly what their site is and exactly which pages the model should prioritize. It is purely additive - it does not restrict access the way robots.txt does. It provides context.

Why llms.txt Matters for AI Visibility

Roughly one in ten domains in a large sample has now published an llms.txt file: an SE Ranking study of approximately 300,000 domains found a 10.13% adoption rate as of early 2026. Adoption is concentrated in developer tools, documentation sites, and technical SaaS products.

Adoption is fairly even across traffic tiers, though the highest-traffic sites trail slightly:

Low-traffic sites: 9.88%
Mid-traffic sites: 10.54%
High-traffic sites: 8.27%

The largest consumer platforms (Google, Facebook, YouTube, Amazon) have not adopted it. llms.txt is primarily a tool for the technical and professional web - the segment most likely to be discovered through AI assistants rather than social feeds.

For GEO (Generative Engine Optimization) practitioners, the relevance is practical rather than speculative. AI tools - from ChatGPT to Perplexity to IDE coding assistants - increasingly serve as the first point of contact between a buyer or developer and your product. If an AI model cannot accurately describe what you do, you lose that touchpoint. llms.txt gives you a direct line to write that description yourself.

The strongest confirmed use case is developer-facing products: coding assistants like Cursor and GitHub Copilot actively consume documentation context. A well-structured llms.txt directly improves how those tools answer developer questions about your API.

How llms.txt Works

When an AI tool, agent, or IDE needs to understand a website, it can fetch /llms.txt as a first step before deciding which deeper pages to read. The file's H1 and blockquote give the model an immediate summary. The linked sections tell it where to go next.

The file is designed for three distinct use cases:

Inference-time use - A user asks an AI assistant about your product. The assistant fetches your llms.txt to quickly orient itself before answering.
Agent-based workflows - An AI agent autonomously navigates your site. The llms.txt file acts as a table of contents, reducing unnecessary fetches.
IDE and tooling ingestion - Coding assistants like Cursor index your documentation. Your llms.txt tells them which pages contain the most relevant technical content.
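As a rough illustration of the "orient first" step, here is a minimal sketch of how a consuming tool might pull the H1 and blockquote out of a fetched file. The helper name `readLlmsSummary` and the `LlmsSummary` shape are hypothetical, not part of the specification or of any particular tool.

```typescript
// Hypothetical shape for the two elements a model reads first.
interface LlmsSummary {
  title: string | null; // text of the first "# " heading, if any
  summary: string;      // blockquote lines joined into one string
}

// Sketch: read the H1 and blockquote that appear before the first H2 section.
function readLlmsSummary(text: string): LlmsSummary {
  let title: string | null = null;
  const quoteLines: string[] = [];
  // Drop a leading byte-order mark if present, then walk line by line.
  for (const rawLine of text.replace(/^\uFEFF/, "").split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith("## ")) break; // link sections start here
    if (title === null && line.startsWith("# ")) {
      title = line.slice(2).trim();
    } else if (line.startsWith(">")) {
      quoteLines.push(line.slice(1).trim());
    }
  }
  return { title, summary: quoteLines.join(" ") };
}
```

In practice the input would be the body of a request to `/llms.txt`; the parsing itself needs no network access, which keeps it easy to test.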

[Figure: which AI crawlers and tools currently support llms.txt, with confirmed vs unconfirmed status]

Which crawlers actually use it? As of mid-2026, confirmed support is limited:

| Crawler / Platform | Status | Notes |
| --- | --- | --- |
| Anthropic / Claude | Confirmed | Publishes its own llms.txt; Claude-based tools report using it |
| Perplexity | Confirmed | Public support statement |
| OpenAI / ChatGPT | Observed, unconfirmed | GPTBot fetches the file but no public statement |
| Mistral | Maturing | Listed as having partial support |
| Google / Gemini | No support | Explicitly stated no plans to use it |
| Bing / Copilot | No confirmed support | Not documented |
| LangChain, LlamaIndex | Variable | Plugins exist; depends on developer configuration |

A 90-day server log study by OtterlyAI found that /llms.txt received 84 AI bot visits out of more than 62,100 in total - about 0.14% of AI bot traffic (84 / 62,100). Even where support exists, the file is not being heavily prioritized yet.

llms.txt vs robots.txt: Key Differences

These two files are often confused, but they serve opposite purposes, and both should exist on your domain.

[Figure: side-by-side comparison of robots.txt and llms.txt showing their different purposes and mechanisms]

| Dimension | robots.txt | llms.txt |
| --- | --- | --- |
| Purpose | Access control - tells crawlers what NOT to fetch | Semantic guidance - tells AI what IS most worth reading |
| Direction | Restrictive | Additive |
| Mechanism | Allow/disallow rules with user-agent targeting | Curated Markdown index with descriptions |
| Scope | All crawlers (search engines, AI bots, scrapers) | AI models and agents specifically |
| Enforcement | Advisory, but standardized (RFC 9309) and widely honored by major crawlers | Proposal only; no enforcement mechanism |
| Support | 30+ years (introduced in 1994), near-universal implementation | ~18 months, ~10% adoption |
| Effect on Google | Controls crawling and crawl budget | No effect (Google does not use it) |

Think of robots.txt as a bouncer who controls which rooms people can enter. llms.txt is a tour guide who explains which exhibits are worth seeing. You need both.

robots.txt tells crawlers which content you do not want fetched. llms.txt promotes content you want AI to understand. Use robots.txt to exclude low-value paths (admin, search results, pagination). Use llms.txt to point AI toward your best content.

How to Create Your llms.txt File (Step by Step)

1. Audit your content hierarchy. Identify the 5-20 pages an AI most needs to understand your product. For a SaaS product, this typically includes: product overview, key features page, pricing, getting started guide, API reference, and FAQ. Do not list every page - curation is the entire point.

2. Write the H1. Use your product or company name exactly as you brand it. Nothing else on this line.

3. Write the blockquote. One to three sentences describing what your product does, who it serves, and what problem it solves. Write this carefully - AI models often use this text verbatim when describing your product to users.

4. Organize into sections. Group your links under H2 headings. Common patterns: ## Docs, ## API Reference, ## Pricing, ## Blog, ## Legal, ## Optional.

5. Write descriptions for every link. One sentence per link, placed after a colon following the URL. Models use these to decide whether to fetch a given URL.

6. Create the file. For static sites, place a plain text file named llms.txt in your public root. For Next.js App Router, use a route handler (see format section below).

7. Optionally create llms-full.txt. For extensive documentation, concatenate the full Markdown content of all referenced pages into /llms-full.txt.

8. Do not add it to sitemap.xml. llms.txt is not an indexable HTML page. Do not list it in your sitemap.

9. Verify. Confirm your URL returns HTTP 200 with Content-Type: text/plain and valid Markdown.
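If you would rather generate the file than hand-write it, the H1, blockquote, section, and link-description steps above can be sketched as a small renderer. This is a minimal sketch under stated assumptions: the `LlmsDoc`, `LlmsSection`, and `LlmsLink` types and the `renderLlmsTxt` function are hypothetical names chosen for illustration.

```typescript
// Hypothetical data model mirroring the steps: name, summary, sections, links.
interface LlmsLink {
  title: string;
  url: string;
  description: string; // one sentence, placed after the colon
}

interface LlmsSection {
  heading: string; // e.g. "Docs", "Pricing", "Optional"
  links: LlmsLink[];
}

interface LlmsDoc {
  siteName: string; // becomes the H1
  summary: string;  // becomes the blockquote
  sections: LlmsSection[];
}

// Render the structure as llms.txt Markdown: H1, blockquote, then H2 link lists.
function renderLlmsTxt(doc: LlmsDoc): string {
  const out: string[] = ["# " + doc.siteName, "", "> " + doc.summary, ""];
  for (const section of doc.sections) {
    out.push("## " + section.heading);
    for (const link of section.links) {
      out.push("- [" + link.title + "](" + link.url + "): " + link.description);
    }
    out.push(""); // blank line after each section
  }
  return out.join("\n");
}
```

Keeping the content in a typed structure makes it harder to ship a link without a description, since the compiler will complain about the missing field.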

llms.txt Format and Syntax

The specification defines a strict element order:

1. Optional UTF-8 byte-order mark (BOM)
2. H1 heading (REQUIRED) - the only mandatory element
3. Blockquote (optional but strongly recommended) - short summary of the site
4. Optional unstructured Markdown (no H2/H3 at this level)
5. H2-delimited sections each containing a Markdown list of links

A section titled ## Optional has special meaning: tools processing the file may safely skip those links when building a shorter context window.
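To make the `## Optional` behavior concrete, here is a minimal sketch of how a tool might parse the H2 sections and drop the optional one when it needs a shorter context. The function names, the types, and the link-matching regular expression are assumptions for illustration; real consumers may parse the file differently.

```typescript
interface ParsedLink {
  title: string;
  url: string;
  description: string;
}

interface ParsedSection {
  heading: string;
  links: ParsedLink[];
}

// Matches "- [Title](https://example.com/page): Description" (description optional).
const LINK_LINE = /^-\s*\[([^\]]+)\]\(([^)]+)\)(?::\s*(.*))?$/;

// Collect every H2 section together with the links listed under it.
function parseLlmsSections(text: string): ParsedSection[] {
  const sections: ParsedSection[] = [];
  let current: ParsedSection | null = null;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith("## ")) {
      current = { heading: line.slice(3).trim(), links: [] };
      sections.push(current);
      continue;
    }
    const m = LINK_LINE.exec(line);
    if (m !== null && current !== null) {
      current.links.push({ title: m[1] || "", url: m[2] || "", description: m[3] || "" });
    }
  }
  return sections;
}

// A tool building a shorter context may skip the section titled "Optional".
function withoutOptional(sections: ParsedSection[]): ParsedSection[] {
  return sections.filter((s) => s.heading.toLowerCase() !== "optional");
}
```

Links that appear before the first H2 are ignored here, on the assumption that only H2-delimited sections carry link lists.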

[Figure: annotated diagram of a valid llms.txt file showing each element and its purpose]

Here is a complete, real-world example:

```markdown
# CitedSpy

> CitedSpy is a GEO (Generative Engine Optimization) tracking platform for brands and agencies.
> It monitors how often your brand is cited, mentioned, and recommended across AI engines -
> ChatGPT, Perplexity, Gemini, Claude, and Copilot.

## Product
- [How CitedSpy Works](https://citedspy.com/features): Overview of the monitoring, analysis, and reporting features
- [Pricing](https://citedspy.com/pricing): Plan tiers, feature comparison, and enterprise options
- [Changelog](https://citedspy.com/changelog): Recent product updates and new engine support

## Blog
- [What is GEO?](https://citedspy.com/blog/generative-engine-optimization): Introduction to Generative Engine Optimization for marketers
- [AI Citation Guide](https://citedspy.com/blog/ai-citation): How AI engines decide what to cite and how to measure it

## Optional
- [Privacy Policy](https://citedspy.com/privacy): Data handling and GDPR compliance
- [Terms of Service](https://citedspy.com/terms): Usage terms and acceptable use policy
- [About](https://citedspy.com/about): Company background and founding story
```

For Next.js App Router, you can serve the file from a route handler instead of a static file, which keeps the content in code alongside the rest of the app:

```typescript
// app/llms.txt/route.ts
// Render this route once at build time instead of on every request.
export const dynamic = "force-static";

export async function GET() {
  // Each array entry is one line of the file; empty strings are blank lines.
  const content = [
    "# CitedSpy",
    "",
    "> CitedSpy monitors brand citations across AI engines including",
    "> ChatGPT, Perplexity, Gemini, Claude, and Copilot.",
    "",
    "## Product",
    "- [Features](https://citedspy.com/features): Full feature overview and engine coverage",
    "- [Pricing](https://citedspy.com/pricing): Plan comparison and pricing details",
    "",
    "## Optional",
    "- [Privacy Policy](https://citedspy.com/privacy): Data handling practices",
  ].join("\n");

  // Serve as plain text so clients do not treat the file as HTML.
  return new Response(content, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}
```

The `export const dynamic = "force-static"` setting asks Next.js to render the response at build time, so serving the file involves no per-request work.

llms-full.txt: The Extended Version

llms-full.txt is an optional companion file served at /llms-full.txt. Where llms.txt is a navigation index (links and descriptions), llms-full.txt is a single flat document containing the complete prose content of every referenced page.

The two files serve different use cases:

llms.txt is for conversational AI tools that need a quick map to decide which URL to fetch. Small, fast, fits easily in a context window.
llms-full.txt is for IDE integrations, agent frameworks, and RAG pipelines that want to index your entire knowledge base without making individual HTTP requests per page.

Anthropic's documentation exemplifies this pattern: docs.claude.com/llms.txt is a slim index, while docs.claude.com/llms-full.txt is a large export of their complete documentation. Mintlify automatically generates both files for every documentation site they host.

When to publish both:

Your documentation is too extensive to fit in a single context window
You serve developer users who work with AI coding assistants
You want to support agent-based workflows needing offline ingestion

llms-full.txt has no formal specification beyond being flat Markdown. By convention, concatenate pages with H1 or H2 dividers between sections so models can parse where one document ends and the next begins.
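Since there is no formal specification, the following is only one possible convention, sketched under stated assumptions: the `FullPage` type, the `buildLlmsFull` function, and the `Source:` line under each divider are hypothetical choices, not something the proposal requires.

```typescript
// Hypothetical input: one entry per page referenced from llms.txt.
interface FullPage {
  title: string;
  url: string;
  markdown: string; // the page's content, already converted to Markdown
}

// Concatenate pages into one flat document: an H1 for the site, then one
// H2 divider per page so a model can tell where each document begins.
function buildLlmsFull(siteName: string, pages: FullPage[]): string {
  const parts: string[] = ["# " + siteName];
  for (const page of pages) {
    parts.push(
      "## " + page.title + "\n\n" +
      "Source: " + page.url + "\n\n" + // assumption: record where the text came from
      page.markdown.trim()
    );
  }
  return parts.join("\n\n") + "\n";
}
```

If the source pages contain their own H1 or H2 headings, you would likely want to demote them before concatenating so the dividers stay unambiguous; that step is left out of this sketch.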

Does llms.txt Actually Work?

Honestly? The evidence for direct citation impact is weak. Here is what the research shows:

Studies finding no effect:

SE Ranking's analysis of 300,000 domains found no statistically significant correlation between having llms.txt and being cited by AI engines. Removing the variable actually improved their model's prediction accuracy.
IndexLab's before/after study in late 2025 found no measurable effect on citation rates.
Search Engine Land tracked 10 sites and found no change in AI citation behavior after adding the file.
OtterlyAI's 90-day server log study found that /llms.txt received 84 of 62,100+ AI bot visits - about a third of the visits a typical content page received.

Moderately positive finding:

Presenc AI research found a "moderately positive correlation" between well-curated llms.txt files and citation uplift - but specifically on Anthropic and Perplexity platforms, and specifically for sites with complex navigation structures where the file provides genuine disambiguation.

The practitioner consensus as of 2026 is clear: AI citation visibility is driven primarily by topical authority, consistent mentions across high-quality external sources, well-structured content that directly answers questions, and strong entity signals in structured data. llms.txt does not substitute for these factors.

The strongest ROI from llms.txt is not citation ranking - it is developer tooling. Coding assistants like Cursor, GitHub Copilot, and IDE integrations actively consume this content. If your product has an API or SDK, a well-structured llms.txt and llms-full.txt meaningfully improve how those tools explain your product to developers.

Treat llms.txt as low-cost infrastructure with a specific confirmed use case, not as a GEO silver bullet. It takes 1-2 hours to create and imposes no ongoing burden if automated.

llms.txt Best Practices

Keep it curated, not exhaustive:

Aim for 5-20 links total; 50+ links defeats the purpose
One thoughtful description per link beats a bare list of URLs
Mirror your actual information architecture, not your sitemap
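The first two points lend themselves to an automated check. Below is a minimal sketch that counts link lines and flags links with no description; the `lintLlmsLinks` name and the `CurationReport` shape are hypothetical, and the 5-20 range is simply the guideline given above.

```typescript
interface CurationReport {
  linkCount: number;
  missingDescriptions: string[]; // URLs of links that have no description
  withinSuggestedRange: boolean; // true when the file has 5-20 links
}

// Count Markdown link list items and flag the ones without a description.
function lintLlmsLinks(text: string): CurationReport {
  const linkLine = /^-\s*\[([^\]]+)\]\(([^)]+)\)(.*)$/;
  let linkCount = 0;
  const missingDescriptions: string[] = [];
  for (const rawLine of text.split(/\r?\n/)) {
    const m = linkLine.exec(rawLine.trim());
    if (m === null) continue; // not a link list item
    linkCount += 1;
    // Whatever follows the closing parenthesis, minus the leading colon.
    const rest = (m[3] || "").replace(/^:/, "").trim();
    if (rest.length === 0) missingDescriptions.push(m[2] || "");
  }
  return {
    linkCount,
    missingDescriptions,
    withinSuggestedRange: linkCount >= 5 && linkCount <= 20,
  };
}
```

A check like this could run in CI whenever the file changes, though the thresholds are a matter of judgment rather than a rule in the specification.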

Write the blockquote for AI, not for humans:

This text frequently gets used verbatim by models describing your product
Include: what you do, who you serve, what problem you solve
Avoid marketing superlatives; write for accuracy

Content to include:

Product overview and key feature pages
Getting started guides and onboarding paths
API reference and developer documentation
Pricing page
High-quality blog articles that establish topical authority

Content to exclude:

Pagination, tag archives, search result pages
Pages behind authentication (models cannot fetch them)
UTM-tagged or duplicate-content URLs
Admin paths, internal tooling, staging URLs

Use the `## Optional` section: Mark secondary content under ## Optional. Tools that process the file are permitted to skip these links when building a shorter context. Legal pages, changelog entries, and older blog posts typically belong here.

Maintenance:

Update when you publish major new documentation sections
Remove redirected or deleted pages promptly
Validate that every URL returns a 200 response
Do not add to sitemap.xml
Serve with Content-Type: text/plain; charset=utf-8
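The status and Content-Type checks can be scripted. Here is a minimal sketch that validates an already-fetched response; the `FetchedFile` shape and `checkLlmsTxtResponse` function are hypothetical, and the default word limit is the 5,000-word guideline from the FAQ in this guide rather than anything in the specification.

```typescript
// Hypothetical container for what you got back when requesting /llms.txt.
interface FetchedFile {
  httpStatus: number;
  contentType: string; // raw Content-Type header value
  body: string;
}

// Return a list of human-readable problems; an empty list means all checks passed.
function checkLlmsTxtResponse(file: FetchedFile, maxWords = 5000): string[] {
  const problems: string[] = [];
  if (file.httpStatus !== 200) {
    problems.push("expected HTTP 200, got " + file.httpStatus);
  }
  if (!/^text\/plain\b/i.test(file.contentType.trim())) {
    problems.push("expected Content-Type text/plain, got " + file.contentType);
  }
  // The first non-empty line must be the H1.
  let firstLine = "";
  for (const rawLine of file.body.replace(/^\uFEFF/, "").split(/\r?\n/)) {
    if (rawLine.trim() !== "") {
      firstLine = rawLine.trim();
      break;
    }
  }
  if (!firstLine.startsWith("# ")) {
    problems.push("first non-empty line should be an H1 heading");
  }
  const wordCount = file.body.split(/\s+/).filter((w) => w.length > 0).length;
  if (wordCount > maxWords) {
    problems.push("file has " + wordCount + " words; guideline is " + maxWords);
  }
  return problems;
}
```

With the standard `fetch` API, the three fields would come from `res.status`, `res.headers.get("content-type")`, and `await res.text()`.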

How to Track Whether llms.txt Is Working

Measuring llms.txt impact requires monitoring what actually matters - not file fetches, but whether AI engines are citing you accurately and frequently.

Server log analysis: Check your web server logs for AI bot user-agents (ClaudeBot, GPTBot, PerplexityBot) fetching /llms.txt. A baseline before and after publishing lets you measure whether bots are discovering and fetching the file. Per OtterlyAI's research, expect very low absolute numbers.
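A minimal sketch of that log check follows, assuming access logs in the common "combined" format where the request line and user-agent appear in double quotes. The `countLlmsTxtHits` name is hypothetical, and the bot list is just the three user-agents named above; extend it to match what you see in your own logs.

```typescript
// User-agent substrings to look for (the three bots named in the text).
const AI_BOT_AGENTS: string[] = ["ClaudeBot", "GPTBot", "PerplexityBot"];

// Count requests for /llms.txt per AI bot across raw access-log lines.
function countLlmsTxtHits(logLines: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const bot of AI_BOT_AGENTS) counts[bot] = 0;
  for (const line of logLines) {
    // Only GET/HEAD requests whose path is exactly /llms.txt (query string allowed).
    if (!/"(?:GET|HEAD) \/llms\.txt[ ?]/.test(line)) continue;
    for (const bot of AI_BOT_AGENTS) {
      if (line.indexOf(bot) !== -1) {
        counts[bot] = (counts[bot] || 0) + 1;
      }
    }
  }
  return counts;
}
```

Running this over a window before and after publishing gives the baseline comparison described above; requests for /llms-full.txt are deliberately not counted here.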

Citation monitoring: The more meaningful metric is whether your brand is being cited correctly across AI engines - and whether your framing from the llms.txt blockquote shows up in AI-generated descriptions of your product. Running a citation baseline before publishing llms.txt and monitoring for 60-90 days after gives you actual before/after data rather than assumptions. For this, you need a tool that runs your tracked prompts across ChatGPT, Perplexity, Gemini, Claude, and Copilot on a consistent schedule - exactly what CitedSpy automates.
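Whatever tool collects the data, the before/after comparison itself is simple arithmetic. The sketch below assumes you have a record per prompt run noting whether your brand was cited; the `PromptRun` shape and both function names are hypothetical and do not reflect any particular product's data model.

```typescript
// Hypothetical record of one tracked prompt run on one AI engine.
interface PromptRun {
  prompt: string;
  engine: string;      // e.g. "ChatGPT", "Perplexity"
  brandCited: boolean; // did the answer cite or mention the brand?
}

// Share of runs in which the brand was cited (0 when there are no runs).
function citationRate(runs: PromptRun[]): number {
  if (runs.length === 0) return 0;
  const cited = runs.filter((r) => r.brandCited).length;
  return cited / runs.length;
}

// Difference in citation rate between the post-launch and baseline windows.
function citationUplift(before: PromptRun[], after: PromptRun[]): number {
  return citationRate(after) - citationRate(before);
}
```

A positive difference alone does not show that llms.txt caused the change; other content and authority changes during the same 60-90 days will affect the number too.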

Developer tool feedback: If you have an API or SDK, ask your developer community whether coding assistants are accurately describing your product. This is anecdotal but often the clearest signal that llms.txt is having its intended effect.

What not to expect: Do not expect a spike in AI bot traffic to your site. llms.txt reduces unnecessary fetches rather than increasing them - that is partly the point.

Frequently Asked Questions

Does Google use llms.txt?

No. Google publicly stated in 2025 that it has no plans to use llms.txt for any product, including Gemini or AI Overviews. Google's John Mueller stated that no AI system currently uses it. Google continues to rely on traditional signals: structured data, authority, links. This is the single most important fact to understand about llms.txt's current limitations.

Will llms.txt affect my SEO or how search engines crawl my site?

No. The file is plain text, not linked from your site's navigation, and not intended for search engine indexing. It has no effect on how Googlebot or Bingbot crawl your site. Simply do not add it to your sitemap.xml and ensure your robots.txt does not block it.

Does llms.txt replace robots.txt?

No. robots.txt controls which crawlers can access which URLs. llms.txt provides semantic guidance for AI models about which content is most worth understanding. They serve opposite purposes and should both exist on your domain.

How long should llms.txt be?

Keep it under 5,000 words. If it is longer, you are listing too many pages. The purpose is curation - a model should be able to read your entire llms.txt in a single context window and have a clear picture of your site.

Do I need llms-full.txt as well?

Only if your documentation is extensive and you serve a developer audience using AI coding tools. For a standard SaaS marketing site, llms.txt alone is sufficient. llms-full.txt is primarily for documentation-heavy products where IDEs and agent frameworks need to index your full knowledge base.

What Content-Type should llms.txt be served with?

Serve it as text/plain; charset=utf-8. Do not serve it as text/html or application/json.

Should I add llms.txt to my sitemap.xml?

No. llms.txt and llms-full.txt are not indexable HTML pages. Do not add them to sitemap.xml, and do not include any robots.txt disallow rules for them - you want AI tools to be able to fetch these files freely.

llms.txt is worth implementing - not because the evidence for citation uplift is strong (it is not, yet), but because it is low-effort infrastructure that will matter more as AI tool support matures and as the developer tooling use case compounds. Stripe, Vercel, Cloudflare, and Anthropic all publish it. The cost is two hours. The downside is zero.

If you want to measure whether any of this is actually moving the needle on your AI visibility, CitedSpy tracks brand citations, sentiment, and mention frequency across every major AI engine - giving you the before/after data to evaluate what actually works.

llms.txt: The Complete Guide for Marketers (2026)