Definition
llms.txt is a proposed standard for telling large language models which parts of a website to read. It’s a Markdown file placed at the root of a domain — https://example.com/llms.txt — that gives an LLM a concise, curated index of a site’s most important content, with a one-line description for each link. The idea is to hand a model a clean reading list in plain text, stripped of the navigation, ads, and JavaScript that clutter a normal HTML page and eat into a model’s limited context.
Australian technologist Jeremy Howard, co-founder of Answer.AI, proposed the format in September 2024. His reasoning was practical: context windows are too small to hold most websites in full, and converting messy HTML into LLM-friendly text is both hard and imprecise. A short, expert-level summary in one predictable location solves that, especially for documentation and developer tools where a model needs fast access to API references and guides.
One caveat up front: llms.txt is a community proposal, not a ratified standard in the way robots.txt is (the Robots Exclusion Protocol was published as RFC 9309 in 2022), and as of 2026 no major AI provider has confirmed using it in production. That gap between the idea and its real-world use shapes almost every honest discussion of the file, and it’s covered in the trends and FAQ sections below.
How It Relates to Marketing
For marketers, llms.txt sits inside the broader push toward agent readiness and AI discoverability. The pitch is that an LLM asked about your brand or product could read a clean, curated file instead of guessing from raw page markup, and come away with an accurate picture of what you offer and where the important content lives. That maps neatly onto generative engine optimization and the general goal of being legible to AI systems.
The honest version of the marketing story is more measured. There’s no confirmed evidence that llms.txt currently improves citation rates in ChatGPT, Perplexity, or Gemini, or that it influences Google’s AI Overviews. Search Engine Land reported that eight of nine sites saw no measurable traffic change after adding the file. So the realistic framing is that llms.txt is a low-cost, low-risk step that may pay off as the agentic web matures — not a lever that moves visibility today. Documentation-heavy and developer-facing brands get the clearest near-term use; general ecommerce and consumer brands see less proven benefit so far.
How to Create an llms.txt File
The spec defines a specific Markdown structure. A file starts with an H1 naming the project or brand, followed by a blockquote that summarizes it in a sentence or two. After that, optional plain Markdown can add context. Then H2 headers organize lists of links, with each link followed by a colon and a short description. An H2 section titled “Optional” flags resources a model can skip when it’s short on context.
A minimal example looks like this:
# Acme Running
> Acme makes trail-running shoes and gear for ultramarathoners and weekend runners alike.
Key information for understanding Acme's catalog and policies.
## Products
- [Trail Shoes](https://example.com/trail): Grip-focused shoes for technical terrain, sizes 6–15.
- [Hydration Vests](https://example.com/vests): Race vests with 1.5L–2L capacity and soft flasks.
## Policies
- [Returns](https://example.com/returns): 60-day free returns, including worn shoes.
- [Shipping](https://example.com/shipping): Delivery windows, costs, and carrier options.
## Optional
- [Company History](https://example.com/about): Background, not needed for product questions.
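To make the structure concrete, here is a minimal Python sketch of how a consumer might read a file like the one above. It is an illustration under assumptions, not a reference parser: the names `parse_llms_txt`, `LlmsTxt`, `Link`, and `required_links` are hypothetical, and the regular expression only handles the simple `- [Title](url): description` form shown in the example.

```python
import re
from dataclasses import dataclass, field

# Matches "- [Title](https://url): optional description"
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


@dataclass
class Link:
    title: str
    url: str
    description: str = ""


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    sections: dict[str, list[Link]] = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Parse the H1 / blockquote / H2 link-list layout (hypothetical helper)."""
    doc = LlmsTxt()
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith(">") and current is None:
            # Blockquote lines before the first H2 form the summary.
            doc.summary = f"{doc.summary} {line[1:].strip()}".strip()
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append(
                    Link(match["title"], match["url"], (match["desc"] or "").strip())
                )
    return doc


def required_links(doc: LlmsTxt) -> list[Link]:
    """Links a model should keep even when context is tight (skips 'Optional')."""
    return [
        link
        for name, links in doc.sections.items()
        if name != "Optional"
        for link in links
    ]


SAMPLE = """\
# Acme Running

> Acme makes trail-running shoes and gear for ultramarathoners and weekend runners alike.

## Products

- [Trail Shoes](https://example.com/trail): Grip-focused shoes for technical terrain, sizes 6–15.

## Policies

- [Returns](https://example.com/returns): 60-day free returns, including worn shoes.

## Optional

- [Company History](https://example.com/about): Background, not needed for product questions.
"""

doc = parse_llms_txt(SAMPLE)
print(doc.title, [link.title for link in required_links(doc)])
```

Treating “Optional” as just another section name keeps the parser simple; the decision to skip it lives in one small helper that a caller can ignore when context is plentiful.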
Two files are common. The base /llms.txt is the index for orientation, and /llms-full.txt dumps the full text of the linked content for deep ingestion. Anthropic, Vercel, and LangGraph all publish both. The descriptions matter more than people expect — “Payment Intents API: create, confirm, and capture payments” does far more work than a bare “API Reference.” Tools have made generation easy: Mintlify auto-creates both files for the docs sites it hosts, the Yoast plugin adds one to WordPress with a click, and Firecrawl can build one from an existing site.
How to Utilize llms.txt
The clearest fit is documentation. A site like Stripe or Cloudflare uses llms.txt to point models and IDE agents at the exact API references and guides that answer developer questions, which keeps a coding assistant from re-deriving what each page is about from its URL. Software companies use it to make their products usable by agents in tools like Cursor, Claude Code, and Copilot. Beyond docs, brands use it as a curated summary layer so that an AI browsing the live web for an answer is steered toward the cleanest, most relevant pages.
A few patterns work better than others. For a multi-product site, group links by major product area, with quickstarts, concepts, and references under each. For a single-product site, a flat list of the handful of pages that matter is enough. The throughline is curation: the file is a recommended reading list, not a second sitemap, so it should hold the pages you’d most want a model to rely on and leave the rest out.
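The grouping pattern for a multi-product site can be sketched as a small generator. This is a hedged illustration only: `render_llms_txt`, the product names, and the example.com URLs are all made up for the example, and real generators such as the tools mentioned earlier may lay files out differently.

```python
def render_llms_txt(name, summary, sections, optional=None):
    """Render an llms.txt string from curated, grouped links.

    sections: dict mapping an H2 heading to a list of
              (title, url, description) tuples, in reading order.
    optional: links a model may skip when short on context.
    """
    groups = dict(sections)
    if optional:
        groups["Optional"] = optional  # always rendered last

    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in groups.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, description in links:
            lines.append(f"- [{title}]({url}): {description}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical two-product docs site: quickstart, concepts, reference per area.
out = render_llms_txt(
    name="Example Docs",
    summary="Developer documentation for the Example payments and billing products.",
    sections={
        "Payments": [
            ("Quickstart", "https://example.com/payments/quickstart",
             "Accept a first test payment in about ten minutes."),
            ("Concepts", "https://example.com/payments/concepts",
             "How payment states, retries, and refunds fit together."),
            ("API Reference", "https://example.com/payments/api",
             "Create, confirm, and capture payments."),
        ],
        "Billing": [
            ("Quickstart", "https://example.com/billing/quickstart",
             "Create a subscription and send a first invoice."),
        ],
    },
    optional=[
        ("Changelog", "https://example.com/changelog",
         "Release history, not needed for integration questions."),
    ],
)
print(out)
```

Because the input is a hand-maintained mapping rather than a crawl, the output stays a curated reading list; pages that are not listed simply never appear.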
It’s also worth noting where llms.txt stops. It describes content for reading. It doesn’t let an agent take actions like checking inventory or placing an order — that’s the job of the Model Context Protocol. The two are complementary, and Mintlify and others now generate both an llms.txt and an MCP server from the same docs.
Comparison to Similar Approaches
| File / Standard | Purpose | Who it’s for | Status |
|---|---|---|---|
| llms.txt | Curated index of priority content for AI to read at inference time | LLMs and AI agents | Proposed; no major provider has confirmed production use |
| robots.txt | Exclusion: tells crawlers where they may not go | Search and AI crawlers | Established (RFC 9309); widely honored, including AI-bot User-Agent rules |
| sitemap.xml | Discovery: lists every page that exists | Search engine crawlers | Established standard |
| llms-full.txt | Full-text dump of linked content for deep ingestion | LLMs needing complete context | Companion to llms.txt; same proposal |
| Model Context Protocol (MCP) | Lets agents take actions, not just read | AI agents performing tasks | Active, widely adopted in agentic tooling |
The cleanest way to keep these straight: robots.txt says where bots can’t go, sitemap.xml lists everything that exists, and llms.txt curates what matters most for a model to read. The one that actually controls how AI crawlers treat a site today is still robots.txt, where the major providers say their crawlers honor AI-specific User-Agent rules (GPTBot, Google-Extended, ClaudeBot, and others).
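Since robots.txt is the file that carries the actual access rules, it can help to test what a given policy permits before publishing it. The sketch below uses Python’s standard-library `urllib.robotparser`; the robots.txt content and the `allowed` helper are hypothetical, chosen only to show per-agent rules next to an llms.txt path.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: GPTBot is kept out of /private/, ClaudeBot is
# blocked entirely, and every other crawler may fetch anything.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if this policy lets `agent` fetch `path` on example.com."""
    return parser.can_fetch(agent, f"https://example.com{path}")


for agent in ("GPTBot", "ClaudeBot", "SomeOtherBot"):
    print(agent, allowed(agent, "/llms.txt"), allowed(agent, "/private/plan"))
```

One consequence worth noticing: a crawler blocked in robots.txt is also blocked from /llms.txt, so publishing the index does nothing for an agent the exclusion rules already turn away.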
Best Practices
- Keep the file curated and short. It’s a priority reading list, so include only the pages you’d most want a model to rely on.
- Write a real description for every link. Specific, content-rich glosses help a model far more than page titles.
- Lead with a clear value proposition in the H1 and blockquote, so a model that reads only the top already knows what you are and who you serve.
- Pair llms.txt with llms-full.txt when the linked content is worth ingesting in full, as Anthropic and Vercel do.
- Don’t build hidden, bot-only Markdown versions of pages. A public index file is fine; a separate version no human sees risks crossing into cloaking.
- Keep expectations grounded. Treat it as cheap insurance for an agentic future, not a confirmed visibility or ranking signal.
- Verify in your server logs whether AI crawlers actually request the file, rather than assuming they do.
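The log check in the last practice can be sketched in a few lines of Python. This assumes access logs in the common “combined” format and matches crawlers by substring; the token list, the `count_llms_requests` name, and the sample log lines are illustrative assumptions, so adjust them to your own server and to each provider’s published user-agent strings.

```python
import re
from collections import Counter

# Substrings to look for in the User-Agent header (extend as needed).
AI_BOT_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot")
TARGET_PATHS = {"/llms.txt", "/llms-full.txt"}

# Combined log format: ... "METHOD path PROTO" status bytes "referer" "agent"
LOG_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_llms_requests(lines):
    """Count requests for llms.txt files, keyed by (bot token, path)."""
    hits = Counter()
    for line in lines:
        match = LOG_RE.search(line)
        if not match:
            continue  # not a combined-format line
        path = match["path"].split("?", 1)[0]  # ignore query strings
        if path not in TARGET_PATHS:
            continue
        agent = match["agent"].lower()
        bot = next((t for t in AI_BOT_TOKENS if t.lower() in agent), None)
        if bot:
            hits[(bot, path)] += 1
    return hits


# Made-up sample lines, for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jan/2026:00:00:01 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.5 - - [01/Jan/2026:00:05:09 +0000] "GET /llms.txt HTTP/1.1" 304 0 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '198.51.100.7 - - [01/Jan/2026:01:00:00 +0000] "GET /llms-full.txt?v=2 HTTP/1.1" 200 90211 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '198.51.100.7 - - [01/Jan/2026:01:00:02 +0000] "GET /pricing HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '192.0.2.44 - - [01/Jan/2026:02:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0) Firefox/130.0"',
]

hits = count_llms_requests(SAMPLE_LOG)
for (bot, path), n in sorted(hits.items()):
    print(f"{bot:15} {path:16} {n}")
```

User-agent strings can be spoofed, so a count like this is a rough signal; comparing it with each bot’s total request volume gives a better sense of whether the file is being read at all.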
Future Trends
The debate over llms.txt is unusually public, and it cuts both ways. On the skeptical side, Google has been blunt. John Mueller compared the file to the long-discredited keywords meta tag — site-owner-controlled and therefore easy to manipulate — and noted that server logs show AI crawlers don’t even request it. Gary Illyes confirmed at Google Search Central Live that Google doesn’t support llms.txt and has no plans to. When the file briefly appeared in Google’s own developer docs in December 2025, it was removed within the day and explained as a CMS quirk, not an endorsement. Log studies back the skeptics on usage: one analysis of more than 500 million AI bot visits over 90 days found only 408 that targeted llms.txt directly.
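To put the log-study figure in proportion, the arithmetic is short enough to check directly. The sketch treats 500 million as the total, which the study describes as a lower bound (“more than”), so the computed share is an upper bound on the true one.

```python
llms_txt_hits = 408
total_bot_visits = 500_000_000  # study reports "more than" this many visits

share = llms_txt_hits / total_bot_visits           # fraction of all AI bot visits
visits_per_hit = total_bot_visits / llms_txt_hits  # bot visits per llms.txt request

print(f"share of visits: {share:.7%}")
print(f"about 1 in {visits_per_hit:,.0f} visits")
```

That works out to fewer than one llms.txt request per million AI bot visits.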
Adoption is real but modest. An SE Ranking study of 300,000 domains put the adoption rate at around 10% after eighteen months of discussion. The bull case rests on direction rather than current results. Supporters argue that shipping the file is nearly free and harmless, that it’s the first business-to-agent convention to gain meaningful adoption, and that IDE agents and MCP servers already make use of structured context like it. There’s an institutional foothold, too: in May 2026 Google added an llms.txt check to Chrome Lighthouse’s new “Agentic Browsing” audit, which evaluates how ready a site is for machine interaction. Notably, that is a Chrome and developer-tools signal, separate from Google Search. Whether llms.txt becomes a durable standard or fades like other site-owner-controlled formats remains genuinely open.
Frequently Asked Questions
1. Do AI systems actually use llms.txt? Not in production, as far as anyone has confirmed. No major provider — OpenAI, Anthropic, Google, Meta, Mistral — has committed to using it as a signal in their search or answer surfaces, and server logs show AI crawlers rarely even request the file. IDE and documentation agents are the main place it’s read today.
2. Is llms.txt an official standard? No. It’s a community proposal from Jeremy Howard of Answer.AI, not a ratified standard the way robots.txt is. The IETF has an AI Preferences Working Group for related standards, but llms.txt isn’t part of it.
3. How is llms.txt different from robots.txt and sitemap.xml? robots.txt controls where crawlers may go. sitemap.xml lists every page. llms.txt curates the most important content for an AI to read, in clean Markdown. They solve different problems.
4. Will adding llms.txt improve my SEO or AI citations? There’s no evidence it does today. Google has said it isn’t a ranking signal, and traffic tests have shown no measurable change. It may matter more as agentic systems mature.
5. What’s the difference between llms.txt and llms-full.txt? llms.txt is a short index with links and descriptions. llms-full.txt contains the full text of that content for deeper ingestion. Many sites publish both.
6. How is llms.txt different from MCP? llms.txt describes content for an AI to read. The Model Context Protocol lets an agent take actions, like querying inventory or completing a task. Reading versus doing.
7. Should I add one to my site? It’s cheap and low-risk, so many teams ship one as a step toward agent readiness — particularly documentation and developer-facing sites. Just keep expectations realistic and don’t treat it as a confirmed visibility tactic.
8. Can llms.txt be manipulated? Because the site owner writes it, it can overstate what a site contains, which is the basis for Google’s comparison to the keywords meta tag. An AI can always check the live pages instead, which is part of why uptake has been slow.
Related Terms
- Agentic Commerce
- Shopping Agent
- Brand Visibility for Agentic Commerce (BVAC)
- Generative Engine Optimization (GEO)
- Model Context Protocol (MCP)
- Answer Engine Optimization (AEO)
- Product Feed Optimization for AI
- Protocol Readiness
- Large Language Model (LLM)
- Multi-Agent System (MAS)
- Human-in-the-Loop (HITL)
- Large Action Model (LAM)
- Retrieval-Augmented Generation (RAG)
- Agent Orchestration
- Tool Use / Function Calling
- Structured Data for Agents
- Agent Discoverability
- AI Search Optimization
Sources
- Search Engine Land — Meet llms.txt, a proposed standard for AI website content crawling: https://searchengineland.com/llms-txt-proposed-standard-453676
- Mintlify — What is llms.txt? Breaking down the skepticism: https://www.mintlify.com/blog/what-is-llms-txt
- Firecrawl — How to Create an llms.txt File for Any Website: https://www.firecrawl.dev/blog/How-to-Create-an-llms-txt-File-for-Any-Website
- Codersera — llms.txt Explained (May 2026): The Honest Guide to the Spec, Adoption, and How to Ship One: https://codersera.com/blog/llms-txt-complete-guide-2026/
- Yotpo — What Is LLMs.txt? The Guide To AI Search & GEO: https://www.yotpo.com/blog/what-is-llms-txt/
- Search Engine Roundtable — Google Says No AI System Currently Uses LLMs.txt: https://www.seroundtable.com/google-ai-llms-txt-39607.html
- Limy — LLMs.txt in 2026: The Full Guide: https://limy.ai/blog/llms.txt-in-2026-the-full-guide
- Webyes — Does llms.txt Improve AI Visibility and Citations? (Expert Take): https://www.webyes.com/blogs/does-llms-txt-improve-rankings/
