seo-retro
Weekly SEO retrospective — measure experiments that hit their measure_at date, score win/loss/inconclusive, and write a 1-screen retro summary. TRIGGER every Fr
npx skills add seo-retro
seo-retro — Measure What You Predicted
The retro is the load-bearing phase of the learning loop. Without it, experiments are just a graveyard of untested hypotheses. With it, every week produces signal.
Rule
Run every Friday at 16:00 local. Never skip. Skipping a retro breaks the compounding loop — experiments past their measure_at become stale and unmeasurable.
Inputs
None required. Skill reads the workspace:
- {workspace}/seo-config.yaml
- {workspace}/reports/seo/memory/experiments/*.yaml
- {workspace}/reports/seo/raw/gsc-*.json (and ga4, aeo-probe raw files)
Workflow
Phase 1: Find experiments due
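# All experiments with status=open AND measure_at <= today, oldest first.
# Assumes measure_at is an unquoted ISO date (YYYY-MM-DD), so string comparison orders correctly.
today=$(date +%F)
find reports/seo/memory/experiments -name "exp-*.yaml" \
  -exec grep -l "status: open" {} + \
  | xargs awk -v today="$today" '$1 == "measure_at:" && $2 <= today { print $2, FILENAME }' \
  | sort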
Sort by measure_at ascending — oldest first, so the retro always catches stale ones.
If an experiment is > 7 days past its measure_at with no measurement, flag it as a discipline failure in the retro summary. This creates back-pressure against letting experiments rot.
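The due-date filter and the 7-day discipline rule can also be sketched in Python. This is a minimal illustration, not the skill's actual implementation: the dict keys `id`, `status`, and `measure_at` and the function name `due_experiments` are assumptions made for the example, and it expects experiments to be parsed already.

```python
from datetime import date

OVERDUE_GRACE_DAYS = 7  # from the rule above: more than 7 days past measure_at


def due_experiments(experiments, today):
    """Return open experiments with measure_at <= today, oldest first.

    `experiments` is a list of dicts with hypothetical keys "id", "status",
    and "measure_at" (a datetime.date). Each result gains "days_overdue"
    and a "discipline_failure" flag.
    """
    due = [
        e for e in experiments
        if e["status"] == "open" and e["measure_at"] <= today
    ]
    due.sort(key=lambda e: e["measure_at"])  # oldest first catches stale ones
    tagged = []
    for e in due:
        days = (today - e["measure_at"]).days
        tagged.append({
            **e,
            "days_overdue": days,
            "discipline_failure": days > OVERDUE_GRACE_DAYS,
        })
    return tagged
```

Entries with `discipline_failure` set are the ones to list under the retro's discipline check.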
Phase 2: Measure each experiment
For each due experiment:
- Re-read its expected.metric, expected.baseline, and expected.target.
- Pull the current value from the appropriate data source:
  - ctr_14d / clicks_30d / position_30d → GSC
  - traffic_90d / conversions_90d → GA4
  - citation_rate_60d → latest aeo-probe result for the page
  - lcp_ms / inp_ms → PageSpeed / CrUX
- Compute actual vs expected as the fraction of the planned baseline-to-target move that actually happened:
  delta_vs_expected = (actual - expected.baseline) / (expected.target - expected.baseline)
  - >= 1.0 → hit target or better
  - 0.3 – 1.0 → partial win (directionally right)
  - -0.3 – 0.3 → inconclusive (noise range)
  - < -0.3 → loss (moved wrong direction or got worse)
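A minimal sketch of the ratio and its bands, assuming progress is measured from baseline toward target so that reaching the target yields exactly 1.0, which is what the bands above require. The function names are hypothetical, and assigning the boundary values 0.3 and -0.3 to the upper band is an assumption, since the ranges above do not say which side they fall on.

```python
def delta_vs_expected(actual, baseline, target):
    """Fraction of the planned baseline-to-target move that was achieved.

    Also works for lower-is-better metrics (position, lcp_ms), because the
    sign of (target - baseline) flips along with the direction of improvement.
    """
    if target == baseline:
        raise ValueError("target equals baseline; progress is undefined")
    return (actual - baseline) / (target - baseline)


def band(delta):
    """Map a delta_vs_expected value onto the four bands listed above."""
    if delta >= 1.0:
        return "hit target or better"
    if delta >= 0.3:   # assumption: exactly 0.3 counts as a partial win
        return "partial win"
    if delta >= -0.3:  # assumption: exactly -0.3 counts as inconclusive
        return "inconclusive"
    return "loss"
```

For example, an average position that was expected to improve from 8 to 4 and landed at 5 gives (5 - 8) / (4 - 8) = 0.75, a partial win.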
Phase 3: Score the verdict
Rules for assigning verdict:
| Condition | Verdict |
|---|---|
| actual hit or exceeded expected.target | win |
| actual moved in expected direction by ≥ 50% of target delta | win (partial) |
| actual within ±10% of baseline (noise) | inconclusive |
| actual moved in wrong direction AND delta > 20% | loss |
| Data source unavailable / page not yet indexed | inconclusive (note why) |
- Confounding events: If a Google algo update happened in the window, note it in result.notes and still record the verdict. Don't use confounds as an excuse to avoid measurement.
- Seasonality: For pages with known seasonality (e.g., tax software in April), compare year-over-year where possible.
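One way to read the verdict table is as an ordered rule list where the first matching row wins. The sketch below makes several assumptions that the table itself does not settle: "delta > 20%" is taken as change relative to baseline, `actual=None` stands for an unavailable data source, and any case no row covers falls back to inconclusive, in line with the "if I'm not sure, it's inconclusive" stance. The function name is hypothetical.

```python
def score_verdict(actual, baseline, target):
    """Apply the verdict table top to bottom; the first matching rule wins.

    Assumes baseline != 0 and target != baseline. Pass actual=None when the
    data source is unavailable or the page is not yet indexed.
    """
    if actual is None:
        return "inconclusive"  # record the reason in result.notes

    # Share of the planned baseline-to-target move that happened.
    progress = (actual - baseline) / (target - baseline)
    # Size of the move relative to baseline, ignoring direction.
    relative_change = abs(actual - baseline) / abs(baseline)

    if progress >= 1.0:
        return "win"
    if progress >= 0.5:
        return "win (partial)"
    if relative_change <= 0.10:
        return "inconclusive"
    if progress < 0 and relative_change > 0.20:
        return "loss"
    return "inconclusive"  # assumption: uncovered cases are not called either way
```

Evaluating rows in table order means a small but on-target move is still scored as a win before the noise check is reached.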
Phase 4: Write result back to experiment YAML
Update the experiment file in place:
status: measured  # was: open
result:
  measured_at: 2026-04-30T16:12:00Z
  actual:
    metric: ctr_14d
    value: 0.034
  verdict: win
  delta_vs_expected: 1.32  # exceeded target by 32%
  notes: >
    Clean lift. No Google algo events in window. Position held steady
    at 3.1 ± 0.2, so the CTR gain is attributable to the title change,
    not to a ranking shift.
Do NOT delete or move the experiment file. Append the result block and flip status to measured; leave every other existing field untouched.
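The write-back can be sketched as a plain-text edit that leaves the original fields byte-for-byte intact. This is an illustration using only the standard library; the function name `append_result` and its parameters are hypothetical, and it assumes `status: open` is a top-level key and that the file has no result block yet.

```python
def append_result(yaml_text, metric, value, verdict, delta, notes, measured_at):
    """Return yaml_text with status flipped to measured and a result block appended.

    Refuses to run twice on the same experiment so an earlier measurement
    is never overwritten.
    """
    lines = yaml_text.rstrip("\n").split("\n")
    if any(line.startswith("result:") for line in lines):
        raise ValueError("experiment already has a result block")

    updated = [
        "status: measured  # was: open" if line.rstrip() == "status: open" else line
        for line in lines
    ]
    block = [
        "result:",
        f"  measured_at: {measured_at}",
        "  actual:",
        f"    metric: {metric}",
        f"    value: {value}",
        f"  verdict: {verdict}",
        f"  delta_vs_expected: {delta}",
        "  notes: >",
    ]
    block += [f"    {note_line}" for note_line in notes.splitlines()]
    return "\n".join(updated + block) + "\n"
```

Editing text rather than round-tripping through a YAML parser keeps comments and key order in the experiment file unchanged.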
Phase 5: Write retro summary
Write {workspace}/reports/seo/retros/retro-{YYYY-MM-DD}.md:
# SEO Retro — Week of {start_date} – {end_date}
Experiments measured this week: {N}
Wins ({K})
- [exp-...-a7k] /alternatives/cursor — title +8 Tools → CTR +62% (target +20%)
- [exp-...-b9q] /compare/x-vs-y — schema FAQPage → impressions +14%
Losses ({M})
- [exp-...-p2n] /blog/X — content refresh → traffic -8% (expected +15%)
- Note: coincided with Google March core update
Inconclusive ({L})
- [exp-...-t4m] /use/gpt-5 — new page → 30d too early for ranking signal
- Keeping open, remeasure at 60d mark
Patterns emerging
_{from seo-learn scan — only noted, not promoted yet unless ≥3 wins}_
- Title numbers continue to win (4/5 recent experiments)
- Schema markup on /compare/ pages consistently lifts impressions
Discipline check
- Experiments overdue > 7d: 0 ✅
- Changes shipped without exp log this week: 0 ✅
(check: git log --since="1 week ago" -- 'content/' 'app//page.tsx'
then diff against count of experiments with deployed_at in range)
Recommended actions for next week (≤ 3)
- Apply add-numbers-to-title.md playbook to top 5 remaining P4-10 pages without numbers
- Investigate /blog/X regression: was the refresh too aggressive?
- Run AEO probe refresh: citation rate measured 4 weeks ago, need new baseline
Next retro: {YYYY-MM-DD}
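As a rough sketch of how the counted sections of the summary could be generated, the snippet below groups measured experiments by verdict and builds the output path. The dict keys `id`, `verdict`, and `summary`, both function names, and the choice to list partial wins under Wins are assumptions made for illustration.

```python
from datetime import date


def retro_path(workspace: str, day: date) -> str:
    """Path of the retro file for a given date, following the pattern above."""
    return f"{workspace}/reports/seo/retros/retro-{day.isoformat()}.md"


def retro_sections(results):
    """Render the measured count plus Wins / Losses / Inconclusive sections.

    `results` is a list of dicts with hypothetical keys "id", "verdict",
    and "summary" (the one-line description shown after the experiment ID).
    """
    groups = {"Wins": [], "Losses": [], "Inconclusive": []}
    for r in results:
        if r["verdict"].startswith("win"):  # assumption: partial wins count as wins
            groups["Wins"].append(r)
        elif r["verdict"] == "loss":
            groups["Losses"].append(r)
        else:
            groups["Inconclusive"].append(r)

    lines = [f"Experiments measured this week: {len(results)}"]
    for title, items in groups.items():
        lines.append(f"{title} ({len(items)})")
        lines.extend(f"- [{r['id']}] {r['summary']}" for r in items)
    return "\n".join(lines)
```

Every entry carries its experiment ID, which is what the quality bar asks for.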
Phase 6: Trigger follow-ups
- If any loss has delta_vs_expected < -0.5 → trigger seo-postmortem on that experiment
- If any pattern shows ≥ 3 wins in a category → flag for seo-learn (runs monthly, but note the candidate)
- If discipline check fails (any experiments overdue OR changes without logs) → note it in owner's weekly report
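The three follow-up rules can be expressed as a small decision function. This is a sketch under stated assumptions: the keys `id`, `verdict`, `delta_vs_expected`, and `category`, the action labels, and the function name are all hypothetical, partial wins are counted as wins, and `results` is assumed to cover whatever window patterns are judged over.

```python
from collections import Counter


def follow_ups(results, overdue_count, unlogged_changes):
    """Return (action, subject) pairs implied by the follow-up rules above."""
    actions = []

    # Severe losses get a postmortem, not just a note.
    for r in results:
        if r["verdict"] == "loss" and r["delta_vs_expected"] < -0.5:
            actions.append(("seo-postmortem", r["id"]))

    # Categories with at least 3 wins become seo-learn candidates.
    wins = Counter(
        r["category"] for r in results if r["verdict"].startswith("win")
    )
    for category, count in sorted(wins.items()):
        if count >= 3:
            actions.append(("seo-learn-candidate", category))

    # Any overdue experiment or unlogged change fails the discipline check.
    if overdue_count > 0 or unlogged_changes > 0:
        actions.append(("weekly-report-note", "discipline check failed"))

    return actions
```

Returning the actions as data keeps the decision separate from however the other skills are actually invoked.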
Quality bar
Retro is done only when:
- [ ] Every open experiment with measure_at <= today has a verdict assigned
- [ ] Retro markdown exists at retros/retro-{date}.md
- [ ] Each win/loss/inconclusive entry cites its experiment ID (linkable)
- [ ] Discipline check ran
- [ ] Recommended actions ≤ 3 (more = unfocused)
- [ ] If a loss had Δ < -0.5, postmortem is triggered (not just noted)
What I refuse
- To mark an experiment "win" because the owner wants a win. The measurement is the measurement.
- To retroactively change expected.target to make a loss look like a win.
- To delete experiments that came up short. Losses are the training data for lessons.
- To skip a week because "nothing interesting happened". The discipline check IS the interesting thing.
- To bundle "probably a win" verdicts. If I'm not sure, it's inconclusive.
Integration
Called by:
- seo-rhythm-weekly (every Friday)
- Ad-hoc when owner says "run the retro" or "measure last week's experiments"
Reads:
- experiments/*.yaml where status=open
- Latest data files in reports/seo/raw/
- seo-config.yaml for cadence overrides
Writes:
- Updates each measured experiment file (appends result block)
- Creates retros/retro-{date}.md
Triggers:
- seo-postmortem on severe losses
- seo-learn eligibility flag on patterns with ≥ 3 wins