- Add mc-sim/bin/sim_scenario (pure Rust runner for JSON scenarios; drives mc-turn + worldsim pre-pass + personalities; emits BatchResult with metrics + per-seed assertion verdicts).
- Add canonical game1_headless_systems_150t.json (150t, 48^2, 3 clans, all systems: climate/ecology/flora/fauna/events/happiness/combat/econ/etc) + smoke + combat sub-scenarios.
- Wire publish in dist.sh to ship the bin to S3 alongside .so (enables fleet horizontal runs post-).
- Update AGENTS.md, finish-game-1/SKILL.md, agents-task-map, simulator-infra.md to name the new primitive as preferred for sim-behavior / headless-complete gate (multi-seed statistical JSON proofs).
- Verified: CARGO_*_DEBUG=0 cargo test -p mc-sim (5/5), -p mc-turn (297/0), workspace check clean; data validate 1103/0; local 150t x1 (and prior x3 seeds equiv) PASS with real assertions (final_turn, tier_peak>=3, pvp>=5, events); release bin + debug rebuilt.
- Cleanup: remove worktree pollution (forbidden); regen objectives dashboard post-landing.
- Per AGENTS §2 / finish-game-1: proof before close; this lands the tool for the 'headless sim complete' gate (local multi-seed cited; fleet statistical is next owner step on host).

Co-Authored-By: Grok (xAI) <noreply@x.ai>
Specialist Orchestration — Task Map & Playbook
Load when: deciding whether to dispatch a specialist, which one, how many in parallel, and
how to verify what they return. Specialists live in .claude/agents/; each loads the shared
specialist-preamble.md plus its own domain delta. Specialists are task-level executors,
separate from team-leads (see team-leads.md).
Dispatch vs inline (decide first)
- Inline (do it yourself): a single known file/edit, a fact lookup, a one-crate change you can verify in one `cargo test`. Don't spawn an agent to do what's faster done directly.
- Dispatch a specialist: a cross-file sweep within one domain, or work needing domain conventions you'd otherwise re-derive.
- Dispatch a `team-lead`: anything spanning ≥2 specialist domains, or a plan-file stage.
- Parallel by default: independent-domain work goes out in one message with multiple Agent calls. Only serialize on a real dependency.
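The decision above can be sketched as a small routing function. This is only an illustration: the `Task` struct, its fields, and the `Route` enum are hypothetical names invented here, not types from the repo, and real tasks will need judgment that a few booleans can't capture.

```rust
/// Who should take a unit of work. Illustrative names, not from the repo.
#[derive(Debug, PartialEq, Eq)]
pub enum Route {
    Inline,
    Specialist,
    TeamLead,
}

/// Hypothetical summary of a task, used only for this sketch.
pub struct Task {
    /// Number of specialist domains the work touches.
    pub domains: usize,
    /// True when the work is a stage from a plan file.
    pub plan_stage: bool,
    /// True when the work sweeps across many files within one domain.
    pub cross_file_sweep: bool,
    /// True when the work relies on domain conventions you'd otherwise re-derive.
    pub needs_domain_conventions: bool,
}

/// Apply the rules in order: team-lead first, then specialist, else inline.
pub fn route(task: &Task) -> Route {
    if task.domains >= 2 || task.plan_stage {
        Route::TeamLead
    } else if task.cross_file_sweep || task.needs_domain_conventions {
        Route::Specialist
    } else {
        Route::Inline
    }
}
```

Checking the team-lead condition first matters: a multi-domain task is usually also a cross-file sweep, and it should not be handed to a single specialist.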
The 13 specialists
| Agent | Use for |
|---|---|
| `godot-engine` | Project setup, autoloads, scene management, GDScript core, save/load, GDExtension wiring |
| `game-algorithms` | Hex math, A* pathfinding, procedural map generation, tile storage |
| `game-systems` | Economy, happiness, culture, production, growth, improvements, turn-end sequencing |
| `combat-dev` | Combat resolver, keywords, damage formulas, promotions, siege |
| `magic-dev` | Spells, mana, Archons, enchantments, Ascension (Game 2/3 only, not Game 1) |
| `game-ai` | AI opponents: strategy, tactical movement, combat decisions (Rust `mc-ai`) |
| `game-data` | JSON pack authoring from design docs |
| `godot-ui` | UI scenes: city screen, tech tree, HUD, menus |
| `godot-renderer` | TileMap, sprites, camera, fog, hex visuals, animation |
| `guide-web` | Player guide web app: React, Vite, Vitest, WASM integration |
| `simulator-infra` | Rust workspace structure, build scripts, cross-compilation |
| `team-lead` | Decomposes multi-domain stages → spawns specialists in parallel → runs verify gates → updates plan files |
| `docs-and-plan` | Cross-file doc/plan/CLAUDE.md fidelity after a stage lands. Owns sync, not authoring |
Task-to-agent table
| Task pattern | Agent |
|---|---|
| `project.godot`, autoloads, SceneManager, save/load, GDExtension setup | `godot-engine` |
| `mc-core/`, hex math, A*, map gen, tile storage | `game-algorithms` |
| `mc-economy/`, `mc-city/`, `mc-happiness/`, `mc-culture/`, turn-end sequencing | `game-systems` |
| `mc-combat/`, keywords, flanking, ZOC, promotions, siege | `combat-dev` |
| spells, mana, Archons, enchantments, Ascension (Game 2/3; confirm scope first) | `magic-dev` |
| `mc-ai/`, AI decisions, difficulty modifiers | `game-ai` |
| `*.json` packs, `vocabulary.json`, `game.json` | `game-data` |
| `*.tscn` UI scenes, HUD panels, overlays, menus | `godot-ui` |
| TileMap, sprites, camera, fog, selection highlight, animation | `godot-renderer` |
| `public/games/.../guide/`, React, Vite, WASM integration | `guide-web` |
| Cargo workspace layout, `build-*.sh`, GDExtension/WASM build infra | `simulator-infra` |
| Multi-specialist stage, parallel orchestration, verify gates | `team-lead` |
| Sync canonical doc + design + plan + CLAUDE.md router after a stage | `docs-and-plan` |
The task table keys on crate/path, but the placement decision is still governed by
`code-layering.md`. For example, a growth formula is `game-systems` working in `mc-happiness`, never in the GDScript turn.
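As a rough sketch, the path-keyed rows of the task table could be expressed as a lookup. The function name `agent_for_path` is hypothetical. Matching crate names as path segments, and treating every `.tscn` as a UI scene and every other `.json` as a data pack, are simplifying assumptions made here; the table itself only lists patterns.

```rust
/// Map a repo path to the specialist named in the task table.
/// Returns None when no path-keyed row applies.
pub fn agent_for_path(path: &str) -> Option<&'static str> {
    // Crate directory name -> specialist, taken from the task table.
    const CRATES: &[(&str, &str)] = &[
        ("mc-core", "game-algorithms"),
        ("mc-economy", "game-systems"),
        ("mc-city", "game-systems"),
        ("mc-happiness", "game-systems"),
        ("mc-culture", "game-systems"),
        ("mc-combat", "combat-dev"),
        ("mc-ai", "game-ai"),
    ];

    // Guide web app: the table gives this path as public/games/.../guide/.
    if path.starts_with("public/games/") && path.contains("/guide/") {
        return Some("guide-web");
    }

    // Assumption: a crate may sit at any depth, so compare each path segment.
    for segment in path.split('/') {
        for &(name, agent) in CRATES {
            if segment == name {
                return Some(agent);
            }
        }
    }

    // File-name rules, checked last so crate ownership wins.
    let file = path.rsplit('/').next().unwrap_or(path);
    if file == "project.godot" {
        return Some("godot-engine");
    }
    if file.ends_with(".tscn") {
        return Some("godot-ui");
    }
    if file.ends_with(".json") {
        return Some("game-data");
    }
    None
}
```

A lookup like this only suggests an owner from the path. Where the code should live is still a layering question, so a `None` or a surprising match is a prompt to check `code-layering.md` rather than a final answer.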
The verify gate (mandatory — never skip)
Every specialist's output is verified by you, by output type, before it counts as done:
| Output | Proof required |
|---|---|
| Rust logic | `cargo test -p <crate>` green (run with `CARGO_PROFILE_DEV_DEBUG=0 CARGO_PROFILE_TEST_DEBUG=0`) |
| Sim behavior | headless play loop (`view`/`act`/`end_turn`) or the `sim_scenario` binary from `mc-sim` on the DO fleet after `dist:publish` (declarative JSON scenarios + multi-seed assertion results in JSON; ground truth for the headless-complete gate); not the UI |
| Golden moved | re-pinned intentionally + determinism re-checked |
| UI / live / rendered | render-proof (phase gate) — headless can't prove it |
| Data pack | schema validation + the loader reads it |
A specialist reporting "done" without the matching proof is not done. Re-dispatch or verify yourself.
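One way to picture the gate is as a match from output type to required proof, plus a check that a report actually carries one. The `Output`, `Report`, `required_proof`, and `is_done` names below are hypothetical, and a non-empty proof string is only a stand-in for actually re-running the check yourself.

```rust
/// The five output types from the verify-gate table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Output {
    RustLogic,
    SimBehavior,
    GoldenMoved,
    UiRendered,
    DataPack,
}

/// Short description of the proof the table requires for each output type.
pub fn required_proof(output: Output) -> &'static str {
    match output {
        Output::RustLogic => "cargo test -p <crate> green",
        Output::SimBehavior => "headless play loop or sim_scenario results, not the UI",
        Output::GoldenMoved => "re-pinned intentionally + determinism re-checked",
        Output::UiRendered => "render-proof (phase gate)",
        Output::DataPack => "schema validation + the loader reads it",
    }
}

/// Hypothetical shape of what a specialist hands back.
pub struct Report {
    pub output: Output,
    /// Evidence for the required proof, e.g. the test command and its result.
    pub proof: Option<String>,
}

/// A report counts as done only when it carries non-empty proof.
pub fn is_done(report: &Report) -> bool {
    matches!(&report.proof, Some(p) if !p.trim().is_empty())
}
```

A report with `proof: None` fails `is_done`, which corresponds to the re-dispatch-or-verify-yourself branch above.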
Integration rule (forge is down)
Worktree-isolated agents fork stale `origin/main`, not local HEAD. Integrate their work via
`git checkout <their-branch> -- <file>` (file-extraction), never `git merge`: a merge would
clobber the local-only commits, because origin is behind. See the worktree note in `specialist-preamble.md`.
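If the extraction is scripted, the command could be assembled as below. This sketch only builds the `git checkout <their-branch> -- <file>` invocation and leaves running it to the caller; the helper name `extract_files` and the branch and file names in the usage note are made up for illustration.

```rust
use std::process::Command;

/// Build `git checkout <branch> -- <files...>` without running it.
/// The `--` separator keeps file paths from being parsed as revisions.
pub fn extract_files(branch: &str, files: &[&str]) -> Command {
    let mut cmd = Command::new("git");
    cmd.arg("checkout").arg(branch).arg("--").args(files);
    cmd
}
```

A caller would do something like `extract_files("specialist/siege", &["mc-combat/src/siege.rs"]).status()` from the repo root and check the exit status. Listing files explicitly is the point: only the named paths are copied onto the local HEAD, so local-only commits stay untouched.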
Specialists return data, not prose
A specialist's final message is a tool result to you, not a user-facing report. Have it return the finding/diff/decision; you keep the conclusion and relay what matters. Don't let raw file-dumps flow back up.
Orchestration transparency (announce start + finish)
The user must be able to see the orchestration — what went out, whether it ran in parallel, and how
each specialist finished. Whoever is orchestrating (you, or a team-lead) narrates the lifecycle in
the visible response — this is also how the user verifies parallelism at a glance.
- On dispatch, one start line:
  `▶ Dispatching [parallel|sequential] (N): combat-dev(siege resolver), game-systems(economy), game-data(unit stats) — <why this set / dependency note>`
  Say parallel only when you actually send them in one message (multiple Agent calls); the word must match the behavior. Sequential → say why (B needs A).
- On each return, one finish line per specialist:
  `✓ combat-dev — siege resolver ported, cargo test -p mc-combat green (a1b2c3d)` · `✗ game-systems — blocked: HappinessInput drift, needs <X>` (then act: re-dispatch / verify / surface). Include the proof (the verify-gate result), not just "done".
- Milestone / decision / blocker → also out-of-band (the user may be away): TTS via `mcp__speech-synthesis__synthesize` (`personality: "ravdess02"`, always) for a finished milestone, a needed decision, or a hard blocker; `PushNotification` for a one-line "loop paused — needs you". Per-specialist start/finish stays text only, since TTS on every dispatch would be noise.
This makes the answer to "is it using specialists in parallel?" self-evident: the start line says
parallel (N) and lists them, and it lines up with the concurrent Agent calls in the same message.
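The start line is regular enough to generate. The following is a minimal sketch of a formatter for it; `Mode` and `start_line` are hypothetical names, and the caller is assumed to pass `Mode::Parallel` only when the Agent calls really went out in one message.

```rust
/// How the specialists were actually sent.
pub enum Mode {
    /// All Agent calls went out in one message.
    Parallel,
    /// Calls were serialized on a dependency.
    Sequential,
}

/// Format the dispatch start line from (agent, task) pairs and a reason.
pub fn start_line(mode: Mode, jobs: &[(&str, &str)], why: &str) -> String {
    let label = match mode {
        Mode::Parallel => "parallel",
        Mode::Sequential => "sequential",
    };
    // Render each job as agent(task), e.g. combat-dev(siege resolver).
    let list: Vec<String> = jobs
        .iter()
        .map(|(agent, task)| format!("{}({})", agent, task))
        .collect();
    // \u{2014} is the dash separator used in the start-line template.
    format!(
        "▶ Dispatching {} ({}): {} \u{2014} {}",
        label,
        jobs.len(),
        list.join(", "),
        why
    )
}
```

Deriving `(N)` from `jobs.len()` keeps the count in the line from drifting away from the list that follows it.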
Specialists vs team-leads
Specialists do one slice and return — task-level executors, never owning an objective.
Team-leads (.project/team-leads/) are strategic owners over bundles of objectives that
outlive any single session; a team-lead employs many specialists over time. See team-leads.md.