← All Skills
AI Skill

seo-postmortem

Last updated: 2026-05-17

Incident postmortem for SEO/AEO regressions — triggered automatically when a metric drops ≥30% week-over-week or an experiment fails with delta < -0.5. Runs bla

Quick Install
npx skills add seo-postmortem

seo-postmortem — Blameless Root Cause, Durable Prevention

Regressions happen. What matters is whether we learn from them. A postmortem is NOT a blame exercise; it's the mechanism that turns one loss into a durable guardrail.

When to run

Triggered automatically:

  • By seo-retro when an experiment verdict is a severe loss (delta_vs_expected < -0.5)
  • By seo-rhythm-weekly when a top-level KPI drops ≥30% week-over-week (clicks, impressions, indexed pages, citation rate)
  • By seo-rhythm-monthly when a page's traffic drops ≥50% month-over-month
Manually:
  • User says "postmortem on /use/gpt-5" or "what happened to our indexing?"

Inputs

  • The triggering event (experiment ID, page URL, or metric name + magnitude)
  • Recent change history (git log on content dir, experiments logged in the window)
  • Data source (GSC/GA4/AEO probe) for the affected metric
  • External context (Google algo updates, AEO platform changes)

Workflow

Phase 1: Scope the incident

Define precisely:

  • What metric changed, by how much, over what window
  • Which pages/queries are affected (single page? site-wide? cluster?)
  • When the change started (date of first decline)
Example scope:
Indexed pages dropped from 324 to 97 between 2026-03-15 and 2026-03-18. Affects the /use/ cluster primarily (72% of de-indexed pages). First observed in GSC Coverage report on 2026-03-19.

Phase 2: Timeline reconstruction

Build a timeline from multiple sources:

  • git log --since="X weeks ago" -- {affected paths} for our changes
  • experiments/.yaml deployed in the window
  • Google algo update trackers (SearchEngineRoundtable, semrush sensor)
  • AEO platform status (OpenAI, Anthropic, Perplexity release notes if citation-related)
  • Site-level events (deploy logs, DNS changes, robots.txt history)
Present as a table:
Date        Source       Event
2026-03-14  git          Merged PR #2847: canonical URL restructure
2026-03-15  deploy       Deployed to prod
2026-03-16  GSC          First indexed-page drop detected
2026-03-17  GSC          Drop accelerates, /use/ cluster worst hit
2026-03-17  google_algo  Minor core update rolled out (unlikely cause — limited scope)
2026-03-19  retro        Flagged for postmortem

Phase 3: Root cause analysis

Use the "5 whys" discipline, but stop when you hit something you can act on:

Why did indexed pages drop?
  → Because /use/ URLs returned 404 for Google's crawler.
Why did they return 404?
  → Because we changed canonical from /use/X to /use/X/ without redirects.
Why did we not add redirects?
  → Because the seo-create skill doesn't flag canonical structure changes
    as needing redirect pairs.
Why doesn't it flag this?
  → Because we didn't previously have a lesson documenting the risk.
(Stop — this is the actionable layer.)

Differentiate root causes from contributing factors. A Google algo update in the same week might be noise; prove it's the cause before blaming it.

Phase 4: Fix the immediate damage

Postmortem doesn't just document — it also ships the fix if one is obvious:

  • Restore redirects for the 227 de-indexed URLs
  • Request re-indexing via GSC URL Inspection API
  • Log a new experiment to measure recovery
Capture:
  • What was fixed
  • When fix deployed
  • Expected recovery timeline (days until re-indexation)

Phase 5: Write the lesson

Write {workspace}/reports/seo/memory/lessons/lesson-{YYYY-MM-DD}-{short_id}.md:

# Lesson: Canonical URL changes require paired 301 redirects

Incident

2026-03-15 → 2026-03-18. Indexed pages dropped from 324 to 97 (-70%). Cause: changed canonical from /use/X to /use/X/ in PR #2847 without adding 301 redirects from old URLs.

Root cause

Missed the redirect step. Canonical change alone is not enough — Google treats old URLs as 404 and de-indexes.

Rule going forward

Any change that modifies URL structure (canonical, path, locale prefix, trailing slash) MUST include paired 301 redirects deployed in the same PR.

Concretely:

  • If changing href or canonical tag structure → audit all old URLs → add redirects
  • Pre-flight check: grep -r 'canonical' app/ before and after, diff must be paired with redirects
  • If using Next.js, check next.config.js redirects array before merging

Enforcement

  • Added pre-flight checklist item in seo-create skill: "URL structure change?"
  • Added to seo-experiment-log warnings: experiments of type=canonical or type=redirect
  • prompt "is the redirect pair included?"

Recovery

  • Redirects deployed 2026-03-19
  • GSC re-indexing requested 2026-03-20
  • Measurement experiment: exp-2026-03-20-rec tracks recovery (measure 2026-04-20)

Related

  • Triggered by: postmortem-2026-03-19
  • Updates: seo-create/preflight-checklist.md
  • Updates: seo-experiment-log/SKILL.md warnings section

Phase 6: Update guardrails (THE KEY STEP)

A lesson that doesn't change behavior is a lesson that will re-teach itself. For each lesson:

  1. Identify which skill should carry the guardrail
  2. Edit that skill to explicitly reference the lesson
  3. If there's a pre-flight checklist or warning section, add the specific check
Without this step, the postmortem is just a journal entry. With it, the agent is durably smarter.

Phase 7: Write postmortem summary

{workspace}/reports/seo/postmortems/pm-{YYYY-MM-DD}-{short_id}.md:
# Postmortem — {incident name}

Severity: high | medium | low
Affected: {what broke}
Detected: {date + source}
Resolved: {date + action}
Duration: {hours/days}
Data loss / impact: {numeric summary}

Timeline

{table from Phase 2}

Root cause

{concise from Phase 3}

Fix

{Phase 4 actions}

Lesson written

lessons/lesson-{date}-{id}.md

Guardrails updated

  • {skill 1 + section}
  • {skill 2 + section}

Follow-up tasks

  • {recovery measurement experiment}
  • {review related pages for same issue}
  • {anything else}

Could we have prevented this?

{Honest assessment. Was this predictable with existing knowledge? If yes, why didn't we? If no, this is the class of incident we couldn't avoid.}

Could we detect it faster next time?

{What signal would have caught this on day 1 instead of day 4? Add detection — alert thresholds, dashboard views, etc.}

Quality bar

  • [ ] Root cause is actionable, not a description ("lost topical authority" is not a root cause; "aggressive content rewrite removed cited sections" is)
  • [ ] Lesson file exists with a concrete "rule going forward"
  • [ ] At least one skill file was updated to enforce the lesson
  • [ ] Immediate fix is deployed OR tracked as a P0 task if it's larger than a quick patch
  • [ ] Recovery is being measured (not just "fixed, moving on")

What I refuse

  • To write a postmortem without a "rule going forward" section. That makes the whole exercise performative.
  • To attribute everything to "Google algo update" without evidence. That's how we learn nothing.
  • To skip Phase 6 (updating guardrails). Without it, we WILL repeat the mistake.
  • To write blame-laden prose. The goal is durable prevention, not accountability theater.

Integration

Reads:

  • The triggering experiment or metric
  • git log on affected paths
  • experiments/.yaml in window
  • External algo/platform update trackers
Writes:
  • lessons/lesson-{date}-{id}.md
  • postmortems/pm-{date}-{id}.md
  • Edits to upstream skill files (adding guardrail references)
Triggers:
  • New recovery experiment via seo-experiment-log
  • Follow-up tasks via tyctl task create (or workspace todo if standalone)