aeo-markdown-render
Serve a markdown version of every page for LLM crawlers. The +300% AI citation trick (G2 experiment, AEO Conf 2026). Triggers — 'markdown for LLMs', 'AI crawler
```shell
npx skills add aeo-markdown-render
```
Skill: AEO Markdown Rendering
Serve a parallel markdown version of every public HTML page so LLM crawlers can ingest content without parsing HTML, JavaScript, or CSS noise.
Why this works (the +300% number)
G2 ran this experiment in 2025 and reported a +300% increase in AI citations and AI context-fetch requests after publishing markdown versions of their pages. Same content, basic markdown rendering, no rich text required — the gain came purely from making content easier for LLM crawlers to parse.
Mechanism: when an LLM engine (Claude, ChatGPT, Perplexity) needs to read a page in depth (versus quoting a search snippet), it issues a context-fetch request. Markdown is typically 5-10× smaller than the corresponding HTML and far cleaner to parse. Crawlers prefer it, and some refuse to deeply ingest pages above a certain HTML/JS payload threshold.
Approach: 3 viable patterns
Pick whichever fits the stack. They're equivalent for AEO purposes.
Pattern A: .md route alongside .html
For every page at /foo/bar, also serve /foo/bar.md (or /foo/bar/index.md). LLM crawlers learn the convention and prefer the .md URL.
- Next.js: a dynamic route (`app/[locale]/[...slug]/route.ts`) returns `text/markdown` based on the `Accept` header OR a `.md` extension in the path.
- Static site / Hugo / Jekyll: emit `.md` files alongside `.html` at build time.
- Custom CMS: middleware that intercepts `.md` requests, looks up the page, and renders.
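A minimal sketch of Pattern A as a Next.js App Router route handler. `renderMarkdown()` is a hypothetical stand-in for your real page→markdown lookup; wire it to your actual content source.

```typescript
// Sketch only: app/[...slug]/route.ts (Pattern A).
// `renderMarkdown` is a placeholder, not part of this skill.
async function renderMarkdown(slug: string): Promise<string | null> {
  const pages: Record<string, string> = {
    pricing: "# Pricing\n\nAcme costs $10/month.", // placeholder content
  };
  return pages[slug] ?? null;
}

export async function GET(
  req: Request,
  { params }: { params: { slug: string[] } },
): Promise<Response> {
  const path = params.slug.join("/");
  // Serve markdown for .md paths or an explicit Accept: text/markdown.
  const wantsMd =
    path.endsWith(".md") ||
    (req.headers.get("accept") ?? "").includes("text/markdown");
  if (!wantsMd) return new Response("Not found", { status: 404 });

  const md = await renderMarkdown(path.replace(/\.md$/, ""));
  if (md === null) return new Response("Not found", { status: 404 });

  return new Response(md, {
    headers: {
      "Content-Type": "text/markdown; charset=utf-8",
      "Cache-Control": "public, max-age=3600, stale-while-revalidate=86400",
    },
  });
}
```

In a real app the non-`.md` branch would fall through to the normal HTML renderer instead of returning 404.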
Pattern B: Content negotiation on the same URL
Same URL `/foo/bar` returns either HTML or markdown depending on the `Accept: text/markdown` request header. Cleaner URL space, but some crawlers don't send the header.
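The negotiation check can be isolated in a small helper. This framework-agnostic sketch only tests for the media type's presence; a full implementation would honor `q` values and wildcard precedence.

```typescript
// Returns true when the Accept header asks for markdown (Pattern B).
// Simplified: ignores q-values and wildcards.
export function prefersMarkdown(acceptHeader: string | null): boolean {
  if (!acceptHeader) return false;
  return acceptHeader
    .split(",")
    .map((part) => part.split(";")[0].trim().toLowerCase())
    .includes("text/markdown");
}
```

If you negotiate on the same URL, also send `Vary: Accept` so shared caches store the HTML and markdown variants separately.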
Pattern C: Single dump in llms-full.txt
If the site is small (< 500 pages), skip per-page markdown and instead generate one consolidated llms-full.txt (see aeo-assets skill). Less granular but lower engineering cost.
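Pattern C can be a single build step. The `Page` shape below is a hypothetical stand-in for whatever your build pipeline emits per page.

```typescript
// Concatenate per-page markdown into one llms-full.txt (Pattern C sketch).
interface Page {
  url: string;
  title: string;
  markdown: string;
}

export function buildLlmsFullTxt(pages: Page[]): string {
  // One section per page, separated by a horizontal rule,
  // each carrying its source URL so engines can attribute citations.
  return pages
    .map((p) => `# ${p.title}\nSource: ${p.url}\n\n${p.markdown}`)
    .join("\n\n---\n\n");
}
```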
Markdown generation rules
Markdown should be:
- Answer-first: First 40-60 words must directly answer the page's primary question. If the HTML lede is fluff, rewrite for the markdown.
- Schema preserved as YAML frontmatter:
- Headings preserved exactly — H1 / H2 / H3 must match the rendered HTML. AI engines use heading hierarchy to index sub-questions.
- Tables preserved as markdown tables — don't convert to text. AI parses markdown tables as structured data.
- Code blocks preserved with language tags — improves citation in technical queries.
- No JS/CSS/navigation chrome — strip nav, footer, sidebars, cookie banners, modals. Just main content.
- Internal links preserved — `[text](url)` → AI follows links to build site context.
- Images: alt text only — `![alt text]()` (omit the src). LLMs only need the alt text for context.
- No duplicate content — if the HTML page is mostly programmatically generated boilerplate, the markdown will be too. Make sure each page has unique signal (≥ 300 unique words).
Example frontmatter:

```yaml
---
title: "Best AI Design Tools 2026"
description: "Six tools compared by price, output quality, and use case."
url: "https://acme.com/best-ai-design-tools"
updated: "2026-04-26"
schema_type: "Article"
author: "Jane Doe"
---
```
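The stripping rules above can be sketched with a few regex passes. This is illustrative only — a production converter should use a DOM-aware tool, since regexes on arbitrary HTML are fragile.

```typescript
// Illustrative chrome-stripping pass (NOT production-grade HTML parsing).
export function stripChrome(html: string): string {
  return html
    // Drop nav/footer/aside/script/style blocks entirely (no chrome, no JS/CSS).
    .replace(/<(nav|footer|aside|script|style)[\s\S]*?<\/\1>/gi, "")
    // Images: keep alt text only, per the rules above.
    .replace(/<img[^>]*\balt="([^"]*)"[^>]*>/gi, "![$1]()");
}
```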
Quality template
```markdown
---
title: "{H1 from HTML}"
description: "{meta description}"
url: "{canonical URL}"
updated: "{YYYY-MM-DD from page Last-Modified or schema}"
schema_type: "{Article|Product|FAQPage|HowTo|...}"
---

# {H1}

{Answer-first paragraph, 40-60 words, fact-anchored, citation-shaped.
First sentence directly answers the title's implied question.}

## {H2 — first major section}

{Content. Preserve tables, lists, bullet structure.}

| Header | Header |
| --- | --- |
| Cell | Cell |

## {H2 — second section}

...

## FAQ

### {Question 1}

{30-80 word answer.}

### {Question 2}

{...}

Last updated: {YYYY-MM-DD} • Read this on the web: {URL}
```
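Generating the frontmatter for that template can be mechanical. `PageMeta` is a hypothetical shape, and `JSON.stringify` stands in for proper YAML escaping (JSON string literals are valid YAML double-quoted scalars).

```typescript
// Render the YAML frontmatter block from page metadata (sketch).
interface PageMeta {
  title: string;
  description: string;
  url: string;
  updated: string;    // YYYY-MM-DD
  schemaType: string; // Article | Product | FAQPage | HowTo | ...
}

export function frontmatter(m: PageMeta): string {
  return [
    "---",
    `title: ${JSON.stringify(m.title)}`,
    `description: ${JSON.stringify(m.description)}`,
    `url: ${JSON.stringify(m.url)}`,
    `updated: ${JSON.stringify(m.updated)}`,
    `schema_type: ${JSON.stringify(m.schemaType)}`,
    "---",
  ].join("\n");
}
```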
Robots.txt + sitemap glue
Wire the markdown into the AEO discovery surface:
- Add to `robots.txt`:

  ```
  Allow: /*.md$
  ```

- Add to `sitemap-llm.xml`: list both `/foo` AND `/foo.md` for high-value pages, OR list only `.md` to nudge crawlers.
- Cross-reference in `llms.txt`:
```markdown
## Markdown versions for LLMs
Every page on this site is also available in markdown by appending .md:
- {CANONICAL_URL}/about → {CANONICAL_URL}/about.md
- {CANONICAL_URL}/pricing → {CANONICAL_URL}/pricing.md
```
Cache & headers
For each .md response:
```http
Content-Type: text/markdown; charset=utf-8
Cache-Control: public, max-age=3600, stale-while-revalidate=86400
Last-Modified: {date}
X-Robots-Tag: index, follow
Link: <{html-canonical-url}>; rel="canonical"
```
The `Link: rel="canonical"` header tells LLM engines that the markdown is a representation of the HTML page, not a duplicate. Critical: without it, Google's canonical chooser may de-duplicate the two versions and pick the wrong one.
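Those headers can be centralized in one helper. This sketch uses the standard `Headers` class (Node 18+ / browsers); the function name is my own, not part of any framework.

```typescript
// Build the response headers for a .md representation (sketch).
export function markdownHeaders(
  canonicalHtmlUrl: string,
  lastModified: Date,
): Headers {
  return new Headers({
    "Content-Type": "text/markdown; charset=utf-8",
    "Cache-Control": "public, max-age=3600, stale-while-revalidate=86400",
    "Last-Modified": lastModified.toUTCString(),
    "X-Robots-Tag": "index, follow",
    // Point back at the HTML canonical so the .md isn't treated as a duplicate.
    "Link": `<${canonicalHtmlUrl}>; rel="canonical"`,
  });
}
```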
Implementation checklist
- [ ] Build a markdown renderer (page → `.md` string)
- [ ] Wire route handler (Pattern A) OR content negotiation (Pattern B) OR static dump (Pattern C)
- [ ] Add `Allow: /*.md$` to robots.txt (via `aeo-assets`)
- [ ] List markdown URLs in sitemap-llm.xml
- [ ] Reference markdown convention in llms.txt
- [ ] Verify: `curl -H "Accept: text/markdown" {URL}` returns `text/markdown`
- [ ] Verify: `curl {URL}.md` returns clean markdown (no HTML chrome)
- [ ] Track impact via `aeo-citation-track` — measure mention rate before / 30 days after
Measurement (proves this is worth doing)
Before launching the markdown layer, capture baseline:
- 4-engine citation rate via `aeo-citation-track`
- AI bot user-agent traffic from server logs (look for `GPTBot`, `ClaudeBot`, `PerplexityBot`, `Perplexity-User`, `OAI-SearchBot`, `ChatGPT-User`)
- AI-referred sessions in analytics (Referer header `https://chat.openai.com`, `https://perplexity.ai`, `https://claude.ai`, `https://gemini.google.com`)
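The bot-traffic baseline can be pulled from access logs with a simple user-agent scan; the substrings below match the bot list above, and the log format is whatever your server emits.

```typescript
// Count AI-crawler hits per bot from raw access-log lines (sketch).
const AI_BOTS = [
  "GPTBot", "ClaudeBot", "PerplexityBot",
  "Perplexity-User", "OAI-SearchBot", "ChatGPT-User",
];

export function countAiBotHits(logLines: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of logLines) {
    for (const bot of AI_BOTS) {
      if (line.includes(bot)) {
        counts.set(bot, (counts.get(bot) ?? 0) + 1);
        break; // attribute each line to at most one bot
      }
    }
  }
  return counts;
}
```

Run it before launch and again 30 days after to see whether `.md` URLs changed crawler behavior.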
Common gotchas
- Don't double-publish thin content — if a page is < 300 unique words in HTML, the markdown is also thin. Fix the page first.
- Don't use markdown to publish content not on the HTML page — Google may treat the divergence as cloaking. Markdown should be a clean representation of HTML, not a longer/different version.
- Don't forget locale prefix —
/zh-CN/foo.mdfor international sites; pair with hreflang. - Don't lose schema — frontmatter must include the schema fields the original HTML carries.