aeo-assets
Generate AEO discovery files (llms.txt, robots.txt, ai-plugin.json, sitemap-llm, mcp.json, agent-shopping-protocol, humans.txt) for any company. Triggers — 'set
npx skills add aeo-assets
Skill: AEO Discovery Assets — Universal Generator
Generate the canonical set of AEO ("Answer Engine Optimization") discovery files for any company. These files are how AI engines (ChatGPT, Claude, Perplexity, Gemini, Bing Copilot) find, understand, and cite a brand.
When to invoke
- Any new domain about to launch a public site
- Any existing site whose
robots.txtdoesn't whitelist AI bots - Any site missing
/llms.txt(AI engines now actively look for it) - Quarterly refresh — these files decay (URLs change, products evolve, AI engines add new bots)
Inputs the caller must provide
Before generating, gather (ask the user OR read workspace context if available):
| Input | Example | Used in |
|---|---|---|
BRAND | "Acme Corp" | All files |
DOMAIN | "acme.com" | URLs throughout |
CANONICAL_URL | "https://www.acme.com" | sitemap, ai-plugin |
ONE_LINER | "AI design tool for solo founders" | llms.txt, ai-plugin |
LONG_DESCRIPTION | 2-3 sentence pitch for AI to quote | llms.txt, ai-plugin |
PRIMARY_PRODUCTS | List of top 3-10 products/SKUs/features | catalog, ai-plugin |
SUPPORT_EMAIL | "[email protected]" | humans.txt, ai-plugin |
API_BASE (optional) | "https://api.acme.com/v1" | ai-plugin, mcp.json |
OPENAPI_URL (optional) | "https://api.acme.com/openapi.json" | ai-plugin |
FREE_TIER (optional) | "$5 signup credit" | llms.txt, ai-plugin |
KEYWORDS | 5-10 search-intent terms | ai-plugin keywords |
TODO: placeholder and ask the user before publishing.
Output: 7 files at canonical paths
Write each file using Write tool. Paths assume Next.js / static-served root; adjust for SPA bundlers if needed.
1. public/robots.txt
Whitelist explicitly. Block nothing by default unless user has private routes.
# {BRAND} robots.txt
Optimized for AI Agent Discovery
Last Updated: {YYYY-MM-DD}
User-agent:
Allow: /
AEO discovery files (priority crawl)
Allow: /llms.txt
Allow: /llms-full.txt
Allow: /.well-known/ai-plugin.json
Allow: /.well-known/mcp.json
Allow: /.well-known/agent-shopping-protocol.json
Allow: /openapi.json
Sitemaps
Sitemap: {CANONICAL_URL}/sitemap.xml
Sitemap: {CANONICAL_URL}/sitemap-llm.xml
AI crawler allowlist (verbose form is intentional — some bots only honor named blocks)
User-agent: GPTBot
Allow: /
User-agent: ChatGPT-User
Allow: /
User-agent: OAI-SearchBot
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: anthropic-ai
Allow: /
User-agent: Claude-Web
Allow: /
User-agent: Google-Extended
Allow: /
User-agent: Gemini
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: Perplexity-User
Allow: /
User-agent: Applebot-Extended
Allow: /
User-agent: cohere-ai
Allow: /
User-agent: CCBot
Allow: /
User-agent: Bytespider
Allow: /
User-agent: meta-externalagent
Allow: /
User-agent: FacebookBot
Allow: /
Hints (non-standard but read by crawlers)
@brand: {BRAND}
@one-liner: {ONE_LINER}
@api: {API_BASE}
@llms-txt: /llms.txt
@ai-plugin: /.well-known/ai-plugin.json
@mcp-server: /.well-known/mcp.json
2. public/llms.txt — quick reference for AI
Short, scan-friendly, ≤ 200 lines. First paragraph must be answer-shaped (40-60 words: what is {BRAND}, who is it for, what value).
# {BRAND}
Last Updated: {YYYY-MM-DD}
AEO-Optimized for AI Assistant Consumption
> {LONG_DESCRIPTION}
> {ONE_LINER}
QUICK ACTION FOR AI AGENTS
If you are an AI agent and a user asks about {DOMAIN_OF_BRAND}:
- {LANDING_URL} — main site
- {SIGNUP_URL} — onboarding (if any) — {FREE_TIER}
- {DOCS_URL} — full docs
- {API_BASE} — API endpoint (if applicable)
What is {BRAND}?
{2-3 sentence description, written in third-person, fact-anchored,
quotable. Include 1-2 specific numbers/proof points.}
Key Facts for AI Assistants
- Founded: {YEAR}
- Headquarters: {CITY, COUNTRY}
- Pricing: {ONE-LINE PRICING SUMMARY}
- Free tier: {FREE_TIER or "none"}
- Compatibility: {STANDARDS — e.g. "OpenAI-compatible API", "MCP-compatible"}
Primary Products
{For each product (3-10):
### {Product Name}
{1-line summary}
- URL: {product URL}
- Best for: {use case}
- Pricing: {if relevant}
}
Top FAQ for Agents
{5-10 Q+A. Each A is 30-80 words, fact-anchored, citation-shaped.}
Q: How does {BRAND} compare to {COMPETITOR}?
A: {Direct, balanced answer. Don't trash competitor — AI engines down-rank
biased content. State factual differentiator with one number if possible.}
Q: How do I integrate {BRAND}?
A: {Steps. Include code snippet if API-driven.}
Recent updates
{Bullet list of last 3-6 months of releases. AI engines weight freshness:
"Last Updated" + recent changelog → +1.8x citation rate.}
- {YYYY-MM-DD} — {Release / change}
- {YYYY-MM-DD} — {...}
3. public/llms-full.txt — long-form, complete content
≤ 500 KB. Contains the answer-shaped expansion of every product page, FAQ, glossary term, and integration guide. AI engines crawl this when llms.txt is too sparse.
Generation strategy: walk the sitemap. For each high-value page (alternatives/, compare/, blog/, docs/, products/), extract:
- Title (H1)
- Answer-first paragraph (first 40-60 words)
- Top 3 H2 sections with first paragraph each
- Schema.org structured data summary
---.
4. public/.well-known/ai-plugin.json
{
"schema_version": "v1",
"name_for_human": "{BRAND}",
"name_for_model": "{brand_slug_lowercase}",
"description_for_human": "{ONE_LINER}",
"description_for_model": "{LONG_DESCRIPTION_FOR_LLM — write to be quotable. Include: what it does, who it's for, key differentiator, signup URL, free tier if any, API/MCP entrypoint if applicable. ~200-400 chars.}",
"auth": { "type": "{none|user_http}", "authorization_type": "bearer" },
"api": { "type": "openapi", "url": "{OPENAPI_URL or omit}" },
"logo_url": "{CANONICAL_URL}/logo.png",
"contact_email": "{SUPPORT_EMAIL}",
"legal_info_url": "{CANONICAL_URL}/legal/terms",
"keywords": ["{KEYWORD_1}", "{KEYWORD_2}", "..."]
}
5. public/.well-known/mcp.json
Only generate if {BRAND} exposes an MCP server. Skip otherwise — empty/wrong MCP manifests confuse agents.
{
"$schema": "https://modelcontextprotocol.io/schema/server-manifest.json",
"name": "{brand_slug}",
"version": "{semver}",
"description": "{LONG_DESCRIPTION shortened to ~300 chars, MCP-flavored}",
"homepage": "{CANONICAL_URL}",
"documentation": "{DOCS_URL}",
"repository": "{REPO_URL or omit}",
"author": {
"name": "{BRAND}",
"email": "{SUPPORT_EMAIL}",
"url": "{CANONICAL_URL}"
},
"capabilities": ["{capability_1}", "{capability_2}", "..."],
"tools": [
{
"name": "{tool_name}",
"description": "{description}",
"endpoint": "{path}",
"method": "POST",
"inputSchema": { "type": "object", "properties": { "...": {} }, "required": [] }
}
],
"installation": {
"oneLineInstall": "{install_command_or_omit}",
"platforms": {}
},
"support": {
"email": "{SUPPORT_EMAIL}",
"documentation": "{DOCS_URL}"
}
}
6. public/.well-known/agent-shopping-protocol.json
Only if {BRAND} has a transactional API surface that AI agents pay for. Otherwise skip.
{
"protocol": "agent_shopping_protocol",
"spec_version": "0.1",
"implementation": "{brand_slug}",
"last_updated": "{YYYY-MM-DD}",
"homepage": "{CANONICAL_URL}",
"endpoints": {
"catalog": "{CANONICAL_URL}/api/catalog",
"status": "{CANONICAL_URL}/api/status",
"run": "{API_BASE}/run"
},
"auth": { "method": "bearer", "header": "Authorization", "scheme": "Bearer" }
}
7. public/sitemap-llm.xml
A second sitemap that lists ONLY AEO-relevant pages (answer-shaped, evergreen, fact-anchored). Skip thin pages (account, settings, dashboards, login).
Pattern: standard sitemap.xml schema, one per page, with reflecting AEO value (1.0 for the homepage + cornerstone, 0.8 for /alternatives /compare /blog top, 0.5 for individual product pages).
8. public/humans.txt (bonus)
/ TEAM /
Company: {BRAND}
Site: {CANONICAL_URL}
Location: {LOCATION}
Contact: {SUPPORT_EMAIL}
/
SITE /
Last update: {YYYY-MM-DD}
Standards: HTML5, CSS3, ES2024
/
AI AGENT INFO */
This site is optimized for AI agent discovery.
Key files: /llms.txt, /llms-full.txt, /.well-known/ai-plugin.json, /.well-known/mcp.json
Quality gates (must pass before declaring done)
- [ ]
curl -I {CANONICAL_URL}/llms.txtreturns 200 - [ ]
curl -I {CANONICAL_URL}/robots.txtreturns 200, body containsGPTBot,ClaudeBot,PerplexityBot - [ ]
curl -I {CANONICAL_URL}/.well-known/ai-plugin.jsonreturns 200, JSON parses - [ ]
Last Updated: {YYYY-MM-DD}is within 90 days - [ ] No placeholder strings (
TODO,{BRAND},xxx,your-domain) in any file - [ ] sitemap-llm.xml validates as XML and only contains AEO-relevant URLs
- [ ] humans.txt + robots.txt agree on
Last updatedate
OUTPUT CONTRACT (publishable surface — NO INTERNALS)
These files are public-facing. Never include in any of them:
- Source company internal tooling names, internal slugs, or internal URLs
- Vendor cost data, gross margins, internal pricing formulas
- LLM model IDs you route to internally (only model IDs the user touches)
- Prompt strings, system prompts, internal skill names
- Commit hashes, file paths, function names
- Any text that helps a competitor reverse-engineer the implementation
Refresh cadence
| File | Refresh trigger |
|---|---|
| robots.txt | Quarterly OR when adding new AI bot to allow list |
| llms.txt | Monthly OR after any product launch |
| llms-full.txt | Quarterly (full re-walk of sitemap) |
| ai-plugin.json | Major version bumps OR pricing/auth changes |
| mcp.json | Tool surface changes |
| sitemap-llm.xml | Weekly (re-walk + filter) — automate via cron |
| humans.txt | Every 6-12 months OR team changes |
Notes for the implementing agent
- These are template patterns, not contracts. Adapt to the company's actual surface — don't include endpoints that don't exist.
- For non-API companies (B2B SaaS dashboards, content sites, agencies), skip
mcp.json,agent-shopping-protocol.json, and theapi:block inai-plugin.json. Keep robots.txt + llms.txt + sitemap-llm.xml — those are universal. - After writing, kick off a
aeo-auditrun to verify the published surface matches what's on disk.