seo-postmortem
Incident postmortem for SEO/AEO regressions — triggered automatically when a metric drops ≥30% week-over-week or an experiment fails with delta < -0.5. Runs bla
npx skills add seo-postmortem
seo-postmortem — Blameless Root Cause, Durable Prevention
Regressions happen. What matters is whether we learn from them. A postmortem is NOT a blame exercise; it's the mechanism that turns one loss into a durable guardrail.
When to run
Triggered automatically:
- By
seo-retrowhen an experiment verdict is a severe loss (delta_vs_expected < -0.5) - By
seo-rhythm-weeklywhen a top-level KPI drops ≥30% week-over-week (clicks, impressions, indexed pages, citation rate) - By
seo-rhythm-monthlywhen a page's traffic drops ≥50% month-over-month
- User says "postmortem on /use/gpt-5" or "what happened to our indexing?"
Inputs
- The triggering event (experiment ID, page URL, or metric name + magnitude)
- Recent change history (git log on content dir, experiments logged in the window)
- Data source (GSC/GA4/AEO probe) for the affected metric
- External context (Google algo updates, AEO platform changes)
Workflow
Phase 1: Scope the incident
Define precisely:
- What metric changed, by how much, over what window
- Which pages/queries are affected (single page? site-wide? cluster?)
- When the change started (date of first decline)
Indexed pages dropped from 324 to 97 between 2026-03-15 and 2026-03-18. Affects the /use/ cluster primarily (72% of de-indexed pages). First observed in GSC Coverage report on 2026-03-19.
Phase 2: Timeline reconstruction
Build a timeline from multiple sources:
git log --since="X weeks ago" -- {affected paths}for our changesexperiments/.yamldeployed in the window- Google algo update trackers (SearchEngineRoundtable, semrush sensor)
- AEO platform status (OpenAI, Anthropic, Perplexity release notes if citation-related)
- Site-level events (deploy logs, DNS changes, robots.txt history)
Date Source Event
2026-03-14 git Merged PR #2847: canonical URL restructure
2026-03-15 deploy Deployed to prod
2026-03-16 GSC First indexed-page drop detected
2026-03-17 GSC Drop accelerates, /use/ cluster worst hit
2026-03-17 google_algo Minor core update rolled out (unlikely cause — limited scope)
2026-03-19 retro Flagged for postmortem
Phase 3: Root cause analysis
Use the "5 whys" discipline, but stop when you hit something you can act on:
Why did indexed pages drop?
→ Because /use/ URLs returned 404 for Google's crawler.
Why did they return 404?
→ Because we changed canonical from /use/X to /use/X/ without redirects.
Why did we not add redirects?
→ Because the seo-create skill doesn't flag canonical structure changes
as needing redirect pairs.
Why doesn't it flag this?
→ Because we didn't previously have a lesson documenting the risk.
(Stop — this is the actionable layer.)
Differentiate root causes from contributing factors. A Google algo update in the same week might be noise; prove it's the cause before blaming it.
Phase 4: Fix the immediate damage
Postmortem doesn't just document — it also ships the fix if one is obvious:
- Restore redirects for the 227 de-indexed URLs
- Request re-indexing via GSC URL Inspection API
- Log a new experiment to measure recovery
- What was fixed
- When fix deployed
- Expected recovery timeline (days until re-indexation)
Phase 5: Write the lesson
Write {workspace}/reports/seo/memory/lessons/lesson-{YYYY-MM-DD}-{short_id}.md:
# Lesson: Canonical URL changes require paired 301 redirects
Incident
2026-03-15 → 2026-03-18. Indexed pages dropped from 324 to 97 (-70%).
Cause: changed canonical from /use/X to /use/X/ in PR #2847 without
adding 301 redirects from old URLs.
Root cause
Missed the redirect step. Canonical change alone is not enough —
Google treats old URLs as 404 and de-indexes.
Rule going forward
Any change that modifies URL structure (canonical, path, locale
prefix, trailing slash) MUST include paired 301 redirects deployed
in the same PR.
Concretely:
- If changing
href or canonical tag structure → audit all old URLs → add redirects
- Pre-flight check:
grep -r 'canonical' app/ before and after, diff must be paired with redirects
- If using Next.js, check
next.config.js redirects array before merging
Enforcement
- Added pre-flight checklist item in seo-create skill: "URL structure change?"
- Added to seo-experiment-log warnings: experiments of type=canonical or type=redirect
prompt "is the redirect pair included?"
Recovery
- Redirects deployed 2026-03-19
- GSC re-indexing requested 2026-03-20
- Measurement experiment: exp-2026-03-20-rec tracks recovery (measure 2026-04-20)
Related
- Triggered by: postmortem-2026-03-19
- Updates: seo-create/preflight-checklist.md
- Updates: seo-experiment-log/SKILL.md warnings section
Phase 6: Update guardrails (THE KEY STEP)
A lesson that doesn't change behavior is a lesson that will re-teach itself. For each lesson:
- Identify which skill should carry the guardrail
- Edit that skill to explicitly reference the lesson
- If there's a pre-flight checklist or warning section, add the specific check
Phase 7: Write postmortem summary
{workspace}/reports/seo/postmortems/pm-{YYYY-MM-DD}-{short_id}.md:
# Postmortem — {incident name}
Severity: high | medium | low
Affected: {what broke}
Detected: {date + source}
Resolved: {date + action}
Duration: {hours/days}
Data loss / impact: {numeric summary}
Timeline
{table from Phase 2}
Root cause
{concise from Phase 3}
Fix
{Phase 4 actions}
Lesson written
lessons/lesson-{date}-{id}.md
Guardrails updated
- {skill 1 + section}
- {skill 2 + section}
Follow-up tasks
- {recovery measurement experiment}
- {review related pages for same issue}
- {anything else}
Could we have prevented this?
{Honest assessment. Was this predictable with existing knowledge?
If yes, why didn't we? If no, this is the class of incident we
couldn't avoid.}
Could we detect it faster next time?
{What signal would have caught this on day 1 instead of day 4?
Add detection — alert thresholds, dashboard views, etc.}
Quality bar
- [ ] Root cause is actionable, not a description ("lost topical authority" is not a root cause; "aggressive content rewrite removed cited sections" is)
- [ ] Lesson file exists with a concrete "rule going forward"
- [ ] At least one skill file was updated to enforce the lesson
- [ ] Immediate fix is deployed OR tracked as a P0 task if it's larger than a quick patch
- [ ] Recovery is being measured (not just "fixed, moving on")
What I refuse
- To write a postmortem without a "rule going forward" section. That makes the whole exercise performative.
- To attribute everything to "Google algo update" without evidence. That's how we learn nothing.
- To skip Phase 6 (updating guardrails). Without it, we WILL repeat the mistake.
- To write blame-laden prose. The goal is durable prevention, not accountability theater.
Integration
Reads:
- The triggering experiment or metric
git logon affected pathsexperiments/.yamlin window- External algo/platform update trackers
lessons/lesson-{date}-{id}.mdpostmortems/pm-{date}-{id}.md- Edits to upstream skill files (adding guardrail references)
- New recovery experiment via
seo-experiment-log - Follow-up tasks via
tyctl task create(or workspace todo if standalone)