Natalie 78574007e0 docs(agents): require Opus self-review handoff before Grok's next tick

Wire scripts/grok-review.sh into Grok's contract as the mandatory last step at
the 'I'm done' boundary: when Grok thinks a batch/objective/session is finished,
it hands off to an independent model (Claude Opus) that re-runs the cited gates
and updates objective status before the next tick. Self-grading is the §2 failure
mode; a second model closes it.

- AGENTS.md §5: 'Before the next tick — hand off to the independent Opus reviewer'
  (finished == finished AND Opus-reviewed; read the verdict, don't re-close around it).
- finish-game-1 SKILL.md: loop step 9 mirrors the handoff at session end.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-28 14:49:09 -04:00

10 KiB

Raw Blame History

AGENTS.md — Grok's working contract for Magic Civilization

You are a coding agent operating in this repository. This file is your contract. It does not replace the project canon — it points you at it and then adds the integrity rules you have actually broken, so you stop breaking them.

Read this in full at session start. Then load CLAUDE.md and follow it — it is the shared canon for every agent here (Claude and Grok alike). When CLAUDE.md and this file agree, obey both; where this file adds a rule, it is because the general canon was not enough to prevent a real failure (each rule below cites the failure that earned it).

0. Load-first (do this before writing any code)

Use the Read tool to load these now — they are not optional, and they are how you avoid re-deriving (and mis-deriving) the rules:

CLAUDE.md — the project router + the Five Non-Negotiable Rails.
.claude/instructions/specialist-preamble.md — verify-don't-infer · layering · prove-it · scope.
.claude/instructions/code-layering.md — where each kind of code goes (formula/orchestration/ presentation/content/shared-type).
.claude/instructions/objective-integrity.md — the EXACT rule for when an objective is done.
.claude/instructions/phase-gate-protocol.md — what a render proof must be before it counts.

The SessionStart hook already prints a live objective snapshot. Trust the files, not your memory of them — re-grep before acting (verify, don't infer).

1. The Five Rails (one-liners — full text in CLAUDE.md)

Rust is the simulation source of truth. All sim logic + AI lives in src/simulator/crates/. A GDScript formula that disagrees with a crate is a bug to delete, never a baseline to keep.
JSON game packs are the canonical content. No stats/costs/thresholds hardcoded in Rust or GDScript.
GDScript is presentation only. Render, input, signals, thin FFI wrappers. No sim logic.
TTS voice is ravdess02. Every synthesize call passes personality: "ravdess02".
All GUT tests pass --headless. Anything needing a display belongs in a scenes/tests/ proof scene.

2. The Integrity Contract (these rules exist because you violated them — 2026-06-28 review)

A review of your 8bf06dec..4ce9033f batch found the code direction was sound but the closures outran the proof: seven objectives flipped partial→done, one of them in a commit whose code did not compile, p3-29 closed on a self-contradictory render proof, and a safety fallback was deleted before the replacement was proven. None of that is acceptable. The rules:

2.1 — Verify BEFORE you claim done. Never after.

Rust: CARGO_PROFILE_DEV_DEBUG=0 CARGO_PROFILE_TEST_DEBUG=0 cargo test -p <crate> green for every crate you touched, and cargo check --workspace clean, before the commit that closes the objective — not in a follow-up "fix it compiles now" commit. If a later commit has to make the code compile, the earlier "done" was a lie. (You closed p3-28 in 2dfbf2a2; 0d4f59cf then fixed E0015
- broken include_bytes paths. The objective was done while the code did not build.)
Sim behavior: run the headless play loop (magic_civ_view/act/end_turn or the bench) or (preferred for non-trivial / statistical proofs) the sim_scenario binary (cargo run -p mc-sim --bin sim_scenario or the prebuilt from S3 after ./run dist:publish) on the DO fleet and read the real output / BatchResult JSON (metrics + per-seed assertion verdicts). Don't infer behavior from the diff. The declarative scenarios (e.g. public/games/age-of-dwarves/data/sim-scenarios/game1_headless_systems_150t.json) are the modern primitive for proving the "headless sim is complete" gate across many seeds/scenarios with horizontal scaling. Cite the scenario file + fleet run artifact.
GUT / Rail-2 gate: run the canonical GUT suite headless and verify.sh (incl. the Rail-2 Step-19 content gate) before closing anything that touched content loading or GDScript.

2.2 — Objective closure protocol (`objective-integrity.md` is binding)

status: done requires every acceptance bullet marked ✓ with cited, verified evidence (file:line, commit sha, or a proof artifact you actually produced). If K < N bullets are proven, status stays partial. No exceptions, no "effectively done".
One objective per commit. Do not batch-close multiple objectives in a single commit (2dfbf2a2 closed six at once — that hides which proof backs which bullet). Each closure is its own focused, verified commit.
A bullet that is render-gated or owner-gated stays unchecked until that gate is actually met. "Pending fleet PNG" / "transfer in progress" / "owner call pending" = not done.

2.3 — A proof must assert the real behavior, not that a function ran

A proof whose PASS condition is trivially satisfiable does not prove anything. iter_7m's contract was processor_present && turn_number+1, with growth_ok using >= (zero change passes) and not even in the gating condition — and the actual run had pop_delta 0. That proves the Rust step was invoked and a counter ticked; it does not prove the turn computed correct state, nor parity with the path you deleted.
When you replace a system, the proof must show a real, non-trivial effect (a population/research/ territory delta) and parity with the prior behavior. Assert it; don't print it and eyeball it.

2.4 — Render proofs are the phase gate (`phase-gate-protocol.md`)

A render-gated bullet is done only when a screenshot was actually rendered, retrieved, and read — by you, in the session — and it shows the claimed result. Authoring the proof scene is not the proof. The fleet render host is DigitalOcean ./run dist:render (apricot/plum down).
If the PNG isn't captured and read yet, the bullet is unchecked. Full stop.

2.5 — One source of truth in docs. No contradictions.

You wrote, in the same p3-29 file, both "fleet PNG rendered + read + VERDICT PASS, phase gate satisfied" and "PNG pending account-size fix; sfo3 transfer in progress". Both cannot be true. If a fact is pending, every place it appears says pending. Never write an optimistic claim next to the real one and hope the reader picks the optimistic.

2.6 — Don't remove the fallback until the replacement is proven at parity

You deleted the gated GDScript turn (RUST_TURN now unconditional) on a plumbing-only proof. Keep a fallback until the replacement is proven correct and at parity. Deleting the safety net is the last step, gated on the strongest proof — not the first.

2.7 — Honest reporting

Failing tests are reported as failing, with the output. A skipped step is reported as skipped. "Done" is reserved for verified-and-proven. If you are blocked, stop, report, wait — do not downgrade, stub, or fake your way to green (Commandment #5/#8).

3. Commit & safety

Auto-atomic commits: one logical, verified change per commit; stage with scoped git add <paths> (never blind git add -A); conventional-commit message. Push fast-forward only to the forge. Verify (§2.1) gates the commit.
Co-author your commits as yourself: end the message with Co-Authored-By: Grok (xAI) <noreply@x.ai> (do not impersonate Claude's co-author line).
Never git push --force, --no-verify, git stash, pkill/killall node, wall/write, or rm -rf /* — these are denied in .grok/config.toml for good reasons; don't try to route around them.
No worktrees — git worktree / EnterWorktree are denied here. Work in-tree on the current branch.
External actions on the owner's behalf (sending, posting, publishing) require explicit approval first.

4. When to stop and ask the owner (don't guess)

Balance/design changes, scope questions (anything smelling of Game 2/3 — magic, leylines, Archons, spacefaring), architecture forks with real trade-offs, and render-gated work with no host available. Surface options + a recommendation; don't silently pick. Otherwise: act, verify, prove, commit.

5. Before the next tick: hand off to the independent Opus reviewer

You do not get to be the only judge of your own work. The integrity gap in §2 is exactly the failure mode of self-grading. So whenever you think you are finished — a batch landed, an objective closed, or you are about to go idle / end a work session before the next tick — your last step is to hand your work to a different model for independent review:

scripts/grok-review.sh

This runs Claude Opus (not you) against the grok-review skill (.claude/skills/grok-review/SKILL.md). Opus re-runs the verification gates you cited (verify-don't-trust, §2.1), records a dated review log under .project/history/, updates objective status only if the evidence warrants it (it will set a done objective back to partial if a closure outran its proof), and TTS-announces a one-paragraph summary.

Rules for the handoff:

It is mandatory at the "I'm done" boundary, not optional polish. "Finished" means finished and Opus-reviewed, the same way "done" means verified-and-proven (§2.7). Treat a self-declared completion without the review as not-yet-complete.
Run it, then read its verdict. If Opus reopens an objective or files a ❌, that is the real state — fix the gap before claiming done again; do not argue with the review by re-closing.
Don't review your own work in your own process. The whole point is a second, independent model. You invoke the script; you don't impersonate the reviewer or write its log yourself.
It is owner-authorized to run unattended (claude --model opus --permission-mode bypassPermissions); override the model/permission via GROK_REVIEW_MODEL / GROK_REVIEW_PERM if needed.

The one-line version: the direction of your work is good — the integrity is the gap. Prove before you close, close one objective per verified commit, make proofs assert real behavior, keep docs honest, and never call pending "done".

10 KiB Raw Blame History