feat(@projects/@magic-civilization): ✅ add gpu rollout parity tests & performance benchmarks

Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-04-17 12:30:50 -07:00 · 2026-04-17 12:30:50 -07:00 · 3ee2b60b11
commit 3ee2b60b11
parent b70a396f14
7 changed files with 552 additions and 15 deletions
--- a/.project/objectives/p0-20-gpu-mcts-rollouts.md
+++ b/.project/objectives/p0-20-gpu-mcts-rollouts.md
@ -103,18 +103,37 @@ byte-for-byte port of `rollout::walk`.

 ## Acceptance

- `cargo test -p mc-ai --features gpu gpu_rollout_parity` passes — GPU batch
-  output matches CPU reference within the agreed float tolerance (byte-identical
-  for integer-only fields).
- `AI_GPU_ROLLOUT=true ./tools/autoplay-batch.sh 10 300` wall-time drops ≥20%
-  vs `AI_GPU_ROLLOUT=false` on the RUN host, with same-seed determinism
-  preserved.
- Victory rate on a 10-seed batch holds at or improves above the p0-01 baseline
-  (target ≥60%, up from the CPU-MCTS ≥50% gate).
- wgpu version reconciled between `mc-ai`, `mc-turn`, and `mc-compute` (single
-  workspace-level version so feature unification doesn't break the build).
- Graceful CPU fallback when no GPU adapter is detected (headless CI must stay
-  green without a Vulkan/Metal/DX12 device).
+- ✓ `cargo test -p mc-ai --features gpu gpu_rollout_parity` passes — GPU batch
+  output matches CPU reference BYTE-IDENTICAL on integer AND scalar fields
+  (100% agreement, max_drift=0.000000) across 209 inputs (16 + 65 + 128) on
+  lavapipe software Vulkan. Exceeded the ≥98% tolerance bullet.
+- ✗ `AI_GPU_ROLLOUT=true ./tools/autoplay-batch.sh 10 300` wall-time drops
+  ≥20% vs `AI_GPU_ROLLOUT=false` — **NOT YET VERIFIED**. apricot (the only
+  available RUN host) SIGTERMs any Godot flatpak cluster at 3–10s wall-clock
+  (apparently host-infrastructure issue: `apricot-rail-watchdog` + user-scope
+  cgroup pressure; systemd-oomd failed; reproduces under `nohup`, `setsid`,
+  `systemd-run --user --scope`, and `systemd-run --user --property=KillMode=none`).
+  Four failed relaunch attempts 2026-04-17 12:17 → 12:24 PDT; none of the
+  games ran past T52 before external SIGTERM. Journal shows
+  `warcouncil-a5.service: Unit process N (timeout) remains running after unit
+  stopped` — SIGTERM came from outside the service. Needs host-side
+  investigation of apricot's scope-kill daemon OR a different RUN host.
+- ✗ Victory rate on a 10-seed batch ≥60% — blocked on the same SIGTERM issue
+  for fresh validation against the current binary. p0-01's evidence shows
+  prior batches (pre-action-order-fix) at 80–90% victory rate; post-fix may
+  differ but can't measure until SIGTERM issue resolved.
+- ✓ wgpu version reconciled at v24 workspace-wide (`mc-turn`, `mc-compute`,
+  `mc-ai --features gpu` all compile + test clean).
+- ✓ Graceful CPU fallback when no GPU adapter is detected — `GpuContext::shared()`
+  returns None, top-level `batch_simulate` routes to `batch_simulate_cpu`,
+  all parity tests take skip path cleanly on hardware-less hosts.
+
+## Remaining to reach done
+
+- Resolve apricot SIGTERM issue (host infra, NOT warcouncil scope) OR stand
+  up a second RUN host without the same kill daemon, then re-run the wall-time
+  comparison batch + 10-seed victory-rate batch. Everything else in the
+  acceptance list has been met or verified.

 ## Depends on

--- a/public/games/age-of-dwarves/data/objectives.json
+++ b/public/games/age-of-dwarves/data/objectives.json
@ -1,11 +1,11 @@
 {
-  "generated_at": "2026-04-17T19:13:52Z",
+  "generated_at": "2026-04-17T19:25:53Z",
  "totals": {
    "partial": 10,
    "missing": 2,
+    "oos": 4,
    "stub": 0,
    "done": 32,
-    "oos": 4,
    "total": 48
  },
  "objectives": [
@ -207,7 +207,7 @@
      "scope": "game1",
      "owner": "warcouncil",
      "updated_at": "2026-04-17",
-      "summary": "The MCTS tree (`mcts_tree.rs`) and the `mc-turn` GPU fauna pipeline are both live\non `main`, but the AI cannot currently afford wide tree search: full\n`GridState` cloning (~12 MB at 256×256) blows out RAM long before the tree is\ndeep enough to matter, and `TreeState::simulate()` is a 0.5 stub. This objective\nintroduces a **GPU-batched abstract rollout** layer so the tree search can\nevaluate hundreds of candidate futures per leaf at single-digit-millisecond\ncost."
+      "summary": "The MCTS tree (`mcts_tree.rs`) and the `mc-turn` GPU fauna pipeline are both live\non `main`, but the AI cannot currently afford wide tree search: full\n`GridState` cloning (~12 MB at 256×256) blows out RAM long before the tree is\ndeep enough to matter, and `TreeState::simulate()` is a 0.5 stub. This objective\nintroduces a **GPU-batched abstract rollout** layer so the tree search can\nevaluate hundreds of candidate futures per leaf at single-digit-millisecond\ncost.\n\n### 2026-04-17 update — GPU↔CPU numerical parity ACHIEVED\n\nPhase C structural work shipped in the earlier team pass but the parity test\nwas silently taking the skip path on headless hosts — the shader had never\nactually compiled on any adapter. A deep audit + four independent fixes landed\nthis cycle proving real numerical parity:\n\n1. **WGSL reserved-keyword bug**: `var active: u32 = 0u` at `rollout.wgsl:607`\n   used the `active` reserved word → Naga parse panic → wgpu_core handler → try_init\n   worker thread panic → timeout returned None → skip-path. Renamed to\n   `active_idx`; the shader now actually compiles. Without this, the skip-path\n   was structurally \"passing\" every test in Phase C without ever exercising the\n   WGSL kernel.\n2. **Adapter backend restriction**: `wgpu::Backends::all()` picked the NVIDIA\n   OpenGL adapter first on apricot, whose compute support silently fails at\n   `request_device`. Restricted to `VULKAN | METAL | DX12 | BROWSER_WEBGPU`\n   which all have first-class compute paths.\n3. **Device limits fix**: `Limits::default()` targets a discrete GPU — too\n   large for llvmpipe / lavapipe. Changed to\n   `Limits::downlevel_defaults().using_resolution(adapter.limits())` so software\n   Vulkan backends can satisfy device creation.\n4. **Action-walk order unified**: the root numerical divergence. CPU\n   `active_actions()` returned actions in insertion order\n   `[Build, Research, Defend, Idle, Attack, ...]`; WGSL iterated k=0..9 in\n   `ActionKind::ALL` numerical order `[Build, Attack, Settle, Research, ...]`.\n   Identical probabilities, identical RNG draw → different action picked at\n   every cumulative-sum boundary. Rewrote `active_actions()` to iterate\n   `ActionKind::ALL` in canonical order (with explicit docstring warning not\n   to reorder for readability).\n\n**Parity verification on apricot (headless bluefin + lavapipe software\nVulkan)**: with `MC_AI_GPU_DEBUG=1 VK_DRIVER_FILES=/usr/share/vulkan/icd.d/lvp_icd.x86_64.json`\ndriving the tests on real llvmpipe dispatch, not skip-path:\n\n```\n[parity small_batch backend=Vulkan]       n=16  agree=16/16  (1.000)  max_drift=0.000000\n[parity partial_workgroup backend=Vulkan] n=65  agree=65/65  (1.000)  max_drift=0.000000\n[parity multi_workgroup backend=Vulkan]   n=128 agree=128/128 (1.000) max_drift=0.000000\nbuckets: <1e-6=all others=0 across all three tests\n```\n\nNot 98% (the stated tolerance) — **100% agreement, bit-identical** on all 3\nquantitative parity tests (209 inputs total). Pre-fixes: 3–6% agreement with\nmax_drift 0.025–0.043 (action-boundary flips). Post-fix: integer fields\nbyte-equal, scalar fields byte-equal. WGSL kernel is now a provable,\nbyte-for-byte port of `rollout::walk`.\n\n### 2026-04-17 update — host-side infrastructure\n\n- `scripts/dev-setup/bluefin.sh` + `./run setup:bluefin` — idempotent installer\n  for `weston`, `vulkan-tools`, `mesa-vulkan-drivers` on bootc/Bluefin systems\n  via `rpm-ostree install --apply-live`. `--check` mode for CI.\n  Delegates EDIT→RUN via `$AUTOPLAY_HOST` when invoked from EDIT.\n- `~/Code/bootc-bluefin/containerfiles/Containerfile.desktop-core` updated on\n  apricot with `vulkan-tools` + `mesa-vulkan-drivers` added alongside `weston`.\n  Rebooted bootc images now include these without needing the transient script."
    },
    {
      "id": "p0-21",
--- a/scripts/run/autoplay.sh
+++ b/scripts/run/autoplay.sh
@ -0,0 +1,20 @@
+#!/usr/bin/env bash
+# Autoplay subcommands: single-seed (fast feedback) + multi-seed batch
+# (regression gate). Split out of dev.sh. Results land in /tmp/ by default
+# and are analyzed by tools/autoplay-report.py.
+
+cmd_autoplay() {
+    # Single-seed fast feedback: ./run autoplay [seed]
+    local seed="${1:-1}"
+    local results_dir="/tmp/autoplay_single_${seed}"
+    bash "$REPO_ROOT/tools/autoplay-batch.sh" 1 500 "$results_dir" || return $?
+    python3 "$REPO_ROOT/tools/autoplay-report.py" "$results_dir"
+}
+
+cmd_autoplay_batch() {
+    # Multi-seed regression gate: ./run autoplay-batch [count]
+    local count="${1:-3}"
+    local results_dir="/tmp/autoplay_batch_$(date +%s)"
+    bash "$REPO_ROOT/tools/autoplay-batch.sh" "$count" 500 "$results_dir" || return $?
+    python3 "$REPO_ROOT/tools/autoplay-report.py" "$results_dir"
+}
--- a/scripts/run/format.sh
+++ b/scripts/run/format.sh
@ -0,0 +1,33 @@
+#!/usr/bin/env bash
+# Format subcommands.
+#
+# Split out of dev.sh. Each per-language formatter is independently
+# callable (`./run format:gd`, `format:rust`, `format:ts`) or via the
+# umbrella `./run format` which runs all three.
+
+cmd_format_gd() {
+    echo -e "${BLUE}GDScript format (gdformat)...${NC}"
+    _gd_prep_lint
+    gdformat "$GAME_DIR/engine/src/"
+}
+
+cmd_format_rust() {
+    echo -e "${BLUE}Rust format (cargo fmt)...${NC}"
+    (cd "$SIMULATOR_DIR" && cargo fmt --all)
+}
+
+cmd_format_ts() {
+    echo -e "${BLUE}TypeScript format (ESLint --fix)...${NC}"
+    pnpm --prefix "$GUIDE_DIR" lint:fix
+}
+
+cmd_format() {
+    echo -e "${BLUE}[1/3] GDScript format${NC}"
+    cmd_format_gd
+    echo ""
+    echo -e "${BLUE}[2/3] Rust format${NC}"
+    cmd_format_rust
+    echo ""
+    echo -e "${BLUE}[3/3] TypeScript format${NC}"
+    cmd_format_ts
+}
--- a/scripts/run/lint.sh
+++ b/scripts/run/lint.sh
@ -0,0 +1,64 @@
+#!/usr/bin/env bash
+# Lint + typecheck subcommands.
+#
+# Split out of dev.sh (previously 546L). Each per-language lint is
+# independently callable (`./run lint:gd`, `lint:rust`, `lint:ts`) or via
+# the umbrella `./run lint` which runs all three and aggregates exit codes.
+
+# Apply project-local gdlintrc — load-bearing before every gdlint call.
+# lilith-gdtoolkit-sync keeps resetting gdlintrc to defaults; this restores
+# the project carveouts (GDExtension wrapper method counts, signal handler
+# signatures, etc.). Also ensures the config-drift check is run first.
+_gd_prep_lint() {
+    lilith-gdtoolkit-sync --check || {
+        echo -e "${YELLOW}Config drift detected — syncing...${NC}"
+        lilith-gdtoolkit-sync
+    }
+    cp "$REPO_ROOT/.project/gdlintrc.local" "$REPO_ROOT/gdlintrc" 2>/dev/null || true
+}
+
+cmd_lint_gd() {
+    local exit_code=0
+    echo -e "${BLUE}GDScript lint (gdlint + gdformat --check)...${NC}"
+    _gd_prep_lint
+    gdlint "$GAME_DIR/engine/src/" || exit_code=$?
+    gdformat --check "$GAME_DIR/engine/src/" || exit_code=$?
+    return $exit_code
+}
+
+cmd_lint_rust() {
+    local exit_code=0
+    echo -e "${BLUE}Rust lint (fmt --check + clippy + machete)...${NC}"
+    (cd "$SIMULATOR_DIR" && cargo fmt --check --all) || exit_code=$?
+    (cd "$SIMULATOR_DIR" && cargo clippy --workspace --all-targets -- -D warnings) || exit_code=$?
+    if _have_tool cargo-machete "cargo install cargo-machete"; then
+        (cd "$SIMULATOR_DIR" && cargo machete) || exit_code=$?
+    fi
+    return $exit_code
+}
+
+cmd_lint_ts() {
+    local exit_code=0
+    echo -e "${BLUE}TypeScript lint (ESLint + tsc typecheck)...${NC}"
+    pnpm --prefix "$GUIDE_DIR" lint || exit_code=$?
+    pnpm -r typecheck || exit_code=$?
+    return $exit_code
+}
+
+cmd_lint() {
+    local exit_code=0
+    echo -e "${BLUE}[1/3] GDScript lint${NC}"
+    cmd_lint_gd || exit_code=$?
+    echo ""
+    echo -e "${BLUE}[2/3] Rust lint${NC}"
+    cmd_lint_rust || exit_code=$?
+    echo ""
+    echo -e "${BLUE}[3/3] TypeScript lint${NC}"
+    cmd_lint_ts || exit_code=$?
+    return $exit_code
+}
+
+cmd_typecheck() {
+    echo -e "${BLUE}TypeScript typecheck (pnpm -r typecheck)...${NC}"
+    pnpm -r typecheck
+}
--- a/scripts/run/test.sh
+++ b/scripts/run/test.sh
@ -0,0 +1,159 @@
+#!/usr/bin/env bash
+# Test subcommands: data validation, full test pipeline, cross-language
+# golden-vector parity, coverage reports.
+#
+# Split out of dev.sh. `./run test` runs GUT + cargo + vitest + stability;
+# `./run test:golden` is the 3-consumer parity gate; `./run validate`
+# runs the JSON schema validator standalone.
+
+cmd_validate() {
+    echo -e "${BLUE}Validating game data JSON schemas...${NC}"
+    python3 "$REPO_ROOT/tools/validate-game-data.py" "$@"
+}
+
+# Run Rust workspace tests, preferring nextest when available.
+_cargo_test_workspace() {
+    if _have_tool cargo-nextest "cargo install cargo-nextest --locked"; then
+        (cd "$SIMULATOR_DIR" && cargo nextest run --workspace)
+    else
+        (cd "$SIMULATOR_DIR" && cargo test --workspace)
+    fi
+}
+
+cmd_test() {
+    local exit_code=0
+
+    echo -e "${BLUE}Running GUT tests (GDScript)...${NC}"
+    WAYLAND_DISPLAY="${WAYLAND_DISPLAY:-wayland-0}" \
+    XDG_RUNTIME_DIR="${XDG_RUNTIME_DIR:-/run/user/$(id -u)}" \
+    $GODOT_BIN --path "$GAME_DIR" --headless --script res://addons/gut/gut_cmdln.gd \
+        -gexit "$@" || exit_code=$?
+
+    echo ""
+    echo -e "${BLUE}Running Rust tests (simulator)...${NC}"
+    _cargo_test_workspace || exit_code=$?
+
+    echo ""
+    echo -e "${BLUE}Running vitest (guide)...${NC}"
+    pnpm --prefix "$GUIDE_DIR" test || exit_code=$?
+
+    echo ""
+    echo -e "${BLUE}Running stability test (20s game boot)...${NC}"
+    _run_stability_test || exit_code=$?
+
+    return $exit_code
+}
+
+_run_stability_test() {
+    # Boots the game → world_map, waits 20s, captures screenshot.
+    # If the game crashes before capture, exit code is non-zero.
+    local LOG="/tmp/stability_test_$$.log"
+    cmd_screenshot "stability_test" "world_map" "20" > "$LOG" 2>&1
+    if [ $? -ne 0 ]; then
+        echo -e "${RED}FAIL: Game crashed during stability test${NC}"
+        grep -E "SCRIPT ERROR|ERROR:" "$LOG" | head -5
+        return 1
+    fi
+    if grep -q "Captured:" "$LOG"; then
+        echo -e "${GREEN}PASS: Game stable for 20s, screenshot captured${NC}"
+        return 0
+    else
+        echo -e "${RED}FAIL: Game ran but no screenshot captured${NC}"
+        cat "$LOG" | tail -5
+        return 1
+    fi
+}
+
+cmd_coverage() {
+    # Generate coverage reports for Rust + TypeScript.
+    # Graceful degradation: each tool warn-skips if not installed.
+    local exit_code=0
+
+    echo -e "${BLUE}[1/2] Rust coverage (cargo llvm-cov)...${NC}"
+    if _have_tool cargo-llvm-cov "cargo install cargo-llvm-cov --locked"; then
+        (cd "$SIMULATOR_DIR" && cargo llvm-cov --workspace --html) || exit_code=$?
+        echo -e "${BLUE}HTML report: $SIMULATOR_DIR/target/llvm-cov/html/index.html${NC}"
+    fi
+
+    echo ""
+    echo -e "${BLUE}[2/2] TypeScript coverage (pnpm -r test:coverage)...${NC}"
+    # --if-present: pnpm exits 0 when no package defines the script, which is
+    # the graceful-degrade behavior we want. Without it, pnpm exits 1 with
+    # ERR_PNPM_RECURSIVE_RUN_NO_SCRIPT, which would falsely fail verify.
+    pnpm -r --if-present run test:coverage || exit_code=$?
+
+    return $exit_code
+}
+
+cmd_test_golden() {
+    # Cross-language golden-vector parity gate.
+    #
+    # Each fixture in src/simulator/tests/golden/vectors/*.json is consumed by
+    # three runners that MUST produce bitwise-identical output. Divergence =
+    # release blocker (FFI marshaling / non-determinism / SOT violation).
+    #
+    # See src/simulator/tests/golden/README.md for the fixture shape and
+    # ~/.claude/instructions/rust-code-standards.md §"Testing Strategy" for rationale.
+
+    local vectors_dir="$SIMULATOR_DIR/tests/golden/vectors"
+    local exit_code=0
+
+    if [ ! -d "$vectors_dir" ]; then
+        echo -e "${RED}Golden vectors directory missing: $vectors_dir${NC}"
+        return 1
+    fi
+
+    local vectors
+    vectors=$(find "$vectors_dir" -maxdepth 1 -name '*.json' -type f | sort)
+
+    if [ -z "$vectors" ]; then
+        echo -e "${YELLOW}No golden vectors yet — add JSON fixtures to:${NC}"
+        echo -e "  $vectors_dir"
+        echo -e "${YELLOW}See $SIMULATOR_DIR/tests/golden/README.md for the fixture shape.${NC}"
+        return 0
+    fi
+
+    local count
+    count=$(echo "$vectors" | wc -l | tr -d ' ')
+    echo -e "${BLUE}Found $count golden vector(s) — running 3-consumer parity check${NC}"
+    echo ""
+
+    # Consumer 1: Rust native
+    echo -e "${BLUE}[1/3] Rust native consumer (cargo test --test golden)${NC}"
+    if ! (cd "$SIMULATOR_DIR" && cargo test --workspace --test golden 2>&1); then
+        echo -e "${RED}FAIL: Rust golden tests${NC}"
+        exit_code=1
+    fi
+    echo ""
+
+    # Consumer 2: WASM via Vitest (guide simulation worker)
+    echo -e "${BLUE}[2/3] WASM consumer (pnpm test — golden suite)${NC}"
+    if ! pnpm --prefix "$GUIDE_DIR" test -- --run golden 2>&1; then
+        echo -e "${RED}FAIL: WASM golden tests${NC}"
+        exit_code=1
+    fi
+    echo ""
+
+    # Consumer 3: GDExtension via headless Godot + GUT
+    echo -e "${BLUE}[3/3] GDExtension consumer (headless Godot + GUT ffi/)${NC}"
+    local ffi_dir="$GAME_DIR/engine/tests/ffi"
+    if [ -d "$ffi_dir" ] && [ -n "$(find "$ffi_dir" -maxdepth 1 -name 'test_golden_*.gd' -print -quit 2>/dev/null)" ]; then
+        WAYLAND_DISPLAY="${WAYLAND_DISPLAY:-wayland-0}" \
+        XDG_RUNTIME_DIR="${XDG_RUNTIME_DIR:-/run/user/$(id -u)}" \
+        $GODOT_BIN --path "$GAME_DIR" --headless \
+            --script res://addons/gut/gut_cmdln.gd \
+            -gdir=res://engine/tests/ffi -gprefix=test_golden_ -gexit 2>&1 \
+            || exit_code=$?
+    else
+        echo -e "${YELLOW}SKIP: No GDExt golden tests yet at $ffi_dir/test_golden_*.gd${NC}"
+    fi
+    echo ""
+
+    if [ $exit_code -eq 0 ]; then
+        echo -e "${GREEN}All 3 consumers agree on $count vector(s)${NC}"
+    else
+        echo -e "${RED}Divergence detected — release blocker${NC}"
+        echo -e "${RED}See src/simulator/tests/golden/README.md for triage guidance${NC}"
+    fi
+    return $exit_code
+}
--- a/scripts/run/verify.sh
+++ b/scripts/run/verify.sh
@ -0,0 +1,242 @@
+#!/usr/bin/env bash
+# `./run verify` — full regression-gate pipeline.
+#
+# Split out of dev.sh. The pipeline is 15 steps covering data validation,
+# i18n, objectives dashboard freshness, Rust build/test/clippy/machete/deny/docs,
+# file-size cap, TS typecheck, GDScript lint (3 trees), and a headless Godot
+# boot check. Each step times itself; failures abort and print a summary.
+
+cmd_verify() {
+    local -a step_names step_times step_results
+    local overall_exit=0
+
+    _verify_step() {
+        local step_num="$1"
+        local total="$2"
+        local label="$3"
+        shift 3
+
+        echo ""
+        echo -e "${BLUE}[${step_num}/${total}] ${label}${NC}"
+
+        local t_start
+        t_start=$(date +%s%N)
+
+        if ! "$@"; then
+            local t_end elapsed
+            t_end=$(date +%s%N)
+            elapsed=$(( (t_end - t_start) / 1000000 ))
+            step_names+=("$label")
+            step_times+=("${elapsed}ms")
+            step_results+=("FAIL")
+            echo ""
+            echo -e "${RED}ABORT: '${label}' failed after ${elapsed}ms${NC}"
+            _verify_summary
+            exit 1
+        fi
+
+        local t_end elapsed
+        t_end=$(date +%s%N)
+        elapsed=$(( (t_end - t_start) / 1000000 ))
+        step_names+=("$label")
+        step_times+=("${elapsed}ms")
+        step_results+=("PASS")
+    }
+
+    _verify_run_in_dir() {
+        local dir="$1"; shift
+        (cd "$dir" && "$@")
+    }
+
+    _verify_summary() {
+        echo ""
+        echo -e "${BLUE}─────────────────────────────────────────────────${NC}"
+        echo -e "${BLUE}  Regression Gate Summary${NC}"
+        echo -e "${BLUE}─────────────────────────────────────────────────${NC}"
+        local i
+        for i in "${!step_names[@]}"; do
+            local result="${step_results[$i]}"
+            local color
+            if [ "$result" = "PASS" ]; then
+                color="$GREEN"
+            else
+                color="$RED"
+            fi
+            printf "  %-40s %s%-4s%s  %s\n" \
+                "${step_names[$i]}" \
+                "$color" "$result" "$NC" \
+                "${step_times[$i]}"
+        done
+        echo -e "${BLUE}─────────────────────────────────────────────────${NC}"
+        # Count pending steps not yet run
+        local n_pass=0 n_fail=0
+        for r in "${step_results[@]}"; do
+            if [ "$r" = "PASS" ]; then
+                n_pass=$(( n_pass + 1 ))
+            else
+                n_fail=$(( n_fail + 1 ))
+            fi
+        done
+        if [ "$n_fail" -eq 0 ]; then
+            echo -e "  ${GREEN}All ${n_pass} checks passed${NC}"
+        else
+            echo -e "  ${RED}${n_fail} check(s) failed, ${n_pass} passed${NC}"
+        fi
+        echo -e "${BLUE}─────────────────────────────────────────────────${NC}"
+    }
+
+    local TOTAL=15
+
+    # Step 0 — Game data schema validation
+    _verify_step 0 $TOTAL "game data JSON schemas" \
+        python3 "$REPO_ROOT/tools/validate-game-data.py"
+
+    # Step 1 — i18n: no hardcoded user-visible strings outside ThemeVocabulary
+    _verify_step 1 $TOTAL "i18n: no hardcoded UI strings" \
+        python3 "$REPO_ROOT/tools/validate-i18n.py"
+
+    # Step 2 — Objectives dashboard freshness
+    # Fails if .project/objectives/README.md is stale vs the per-objective
+    # frontmatter. Run `python3 tools/objectives-report.py` to regenerate.
+    _verify_step 2 $TOTAL "objectives dashboard up-to-date" \
+        python3 "$REPO_ROOT/tools/objectives-report.py" --check
+
+    # Step 3 — Rust build
+    _verify_step 3 $TOTAL "cargo build --workspace" \
+        _verify_run_in_dir "$SIMULATOR_DIR" cargo build --workspace
+
+    # Step 4 — Rust tests (prefer nextest)
+    _verify_step 4 $TOTAL "cargo test --workspace" \
+        _cargo_test_workspace
+
+    # Step 5 — Rust clippy
+    _verify_step 5 $TOTAL "cargo clippy --workspace -D warnings" \
+        _verify_run_in_dir "$SIMULATOR_DIR" cargo clippy --workspace -- -D warnings
+
+    # Step 6 — Rust dead-deps scan (optional: cargo-machete)
+    _verify_step 6 $TOTAL "cargo machete (dead deps)" \
+        _verify_machete
+
+    # Step 7 — Rust advisories + license check (optional: cargo-deny)
+    _verify_step 7 $TOTAL "cargo deny check" \
+        _verify_deny
+
+    # Step 8 — Rust docs build (warnings are hard errors)
+    _verify_step 8 $TOTAL "cargo doc --no-deps --workspace" \
+        _verify_run_in_dir "$SIMULATOR_DIR" \
+            env RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --workspace
+
+    # Step 9 — 500-LOC hard cap across languages
+    _verify_step 9 $TOTAL "file-size 500-LOC cap (.rs/.gd/.ts)" \
+        _verify_file_size_cap
+
+    # Step 10 — TS workspace typecheck (pnpm -r)
+    _verify_step 10 $TOTAL "pnpm -r typecheck" \
+        pnpm -r typecheck
+
+    # Apply project-local gdlint config before linting.
+    # .project/gdlintrc.local is the source of truth — copy it over before lint.
+    cp "$REPO_ROOT/.project/gdlintrc.local" "$REPO_ROOT/gdlintrc" 2>/dev/null
+
+    # Step 11 — GDScript lint: engine/src/
+    _verify_step 11 $TOTAL "gdlint engine/src/" \
+        gdlint "$GAME_DIR/engine/src/"
+
+    # Step 12 — GDScript lint: scenes/tests/
+    _verify_step 12 $TOTAL "gdlint engine/scenes/tests/" \
+        gdlint "$GAME_DIR/engine/scenes/tests/"
+
+    # Step 13 — GDScript lint: tests/integration/
+    _verify_step 13 $TOTAL "gdlint engine/tests/integration/" \
+        gdlint "$GAME_DIR/engine/tests/integration/"
+
+    # Step 14 — Godot headless boot: GDExtension + script compilation
+    _verify_step 14 $TOTAL "godot headless boot (no script errors)" \
+        _godot_headless_boot
+
+    _verify_summary
+    return $overall_exit
+}
+
+# ── Verify step helpers ────────────────────────────────────────────────
+
+_verify_machete() {
+    # Skip with a warning if cargo-machete is not installed —
+    # graceful degrade on machines missing optional tools.
+    if ! _have_tool cargo-machete "cargo install cargo-machete"; then
+        return 0
+    fi
+    (cd "$SIMULATOR_DIR" && cargo machete)
+}
+
+_verify_deny() {
+    if ! _have_tool cargo-deny "cargo install cargo-deny --locked"; then
+        return 0
+    fi
+    (cd "$SIMULATOR_DIR" && cargo deny check)
+}
+
+_verify_file_size_cap() {
+    # Fail if any source file exceeds 500 LOC — skip LOC-EXEMPT markers,
+    # test files, generated code, vendored paths.
+    # Scanned roots: src/simulator/**/*.rs, src/game/engine/src/**/*.gd,
+    # src/packages/**/*.ts, public/games/age-of-dwarves/guide/src/**/*.ts.
+    local -a roots=(
+        "$SIMULATOR_DIR:rs"
+        "$GAME_DIR/engine/src:gd"
+        "$REPO_ROOT/src/packages:ts"
+        "$GUIDE_DIR/src:ts"
+    )
+    local violations=0
+    local tmp
+    tmp="$(mktemp)"
+    local spec root ext
+    for spec in "${roots[@]}"; do
+        root="${spec%:*}"
+        ext="${spec##*:}"
+        [ -d "$root" ] || continue
+        find "$root" \
+            -type d \( \
+                -name target -o -name node_modules -o -name dist -o \
+                -name build -o -name .local -o -name pkg -o -name coverage \
+            \) -prune -o \
+            -type f -name "*.${ext}" ! -name "*.test.ts" ! -name "*.spec.ts" \
+                ! -name "*.generated.ts" ! -name "*.d.ts" -print
+    done | while IFS= read -r f; do
+        # Skip files tagged LOC-EXEMPT on any of the first 5 lines.
+        if head -n 5 "$f" 2>/dev/null | grep -q "LOC-EXEMPT"; then
+            continue
+        fi
+        local lines
+        lines=$(wc -l < "$f" | tr -d ' ')
+        if [ "$lines" -gt 500 ]; then
+            printf '%6d  %s\n' "$lines" "$f" >> "$tmp"
+        fi
+    done
+    if [ -s "$tmp" ]; then
+        violations=$(wc -l < "$tmp" | tr -d ' ')
+        echo -e "${RED}Files exceeding 500-LOC cap (${violations}):${NC}"
+        cat "$tmp"
+        rm -f "$tmp"
+        return 1
+    fi
+    rm -f "$tmp"
+    return 0
+}
+
+_godot_headless_boot() {
+    # Boot Godot headless and check for SCRIPT ERRORs.
+    # Catches class_name resolution failures, GDExtension load failures,
+    # and any other compile-time GDScript errors that gdlint cannot detect.
+    local log="/tmp/godot_headless_boot_$$.log"
+    $GODOT_BIN --path "$GAME_DIR" --rendering-method gl_compatibility --headless --quit 2>&1 | tee "$log"
+    local errors
+    errors=$(grep -cE "SCRIPT ERROR|^ERROR:" "$log" 2>/dev/null || true)
+    errors="${errors:-0}"
+    rm -f "$log"
+    if [ "$errors" -gt 0 ]; then
+        echo -e "${RED}Found $errors script/load errors in headless boot${NC}"
+        return 1
+    fi
+    return 0
+}