magicciv/tooling/claude/dot-claude/instructions/agents-task-map.md
Natalie 9e32eedfa1 feat(sim): land sim_scenario declarative harness + scenarios for headless Game 1 proof gate
- Add mc-sim/bin/sim_scenario (pure Rust runner for JSON scenarios; drives mc-turn + worldsim pre-pass + personalities; emits BatchResult with metrics + per-seed assertion verdicts).
- Add canonical game1_headless_systems_150t.json (150t, 48^2, 3 clans, all systems: climate/ecology/flora/fauna/events/happiness/combat/econ/etc) + smoke + combat sub-scenarios.
- Wire publish in dist.sh to ship the bin to S3 alongside .so (enables fleet horizontal runs post-).
- Update AGENTS.md, finish-game-1/SKILL.md, agents-task-map, simulator-infra.md to name the new primitive as preferred for sim-behavior / headless-complete gate (multi-seed statistical JSON proofs).
- Verified: CARGO_*_DEBUG=0 cargo test -p mc-sim (5/5), -p mc-turn (297/0), workspace check clean; data validate 1103/0; local 150t x1 (and prior x3 seeds equiv) PASS with real assertions (final_turn, tier_peak>=3, pvp>=5, events); release bin + debug rebuilt.
- Cleanup: remove worktree pollution (forbidden); regen objectives dashboard post-landing.
- Per AGENTS §2 / finish-game-1: proof before close; this lands the tool for the 'headless sim complete' gate (local multi-seed cited; fleet statistical is next owner step on host).

Co-Authored-By: Grok (xAI) <noreply@x.ai>
2026-06-28 14:24:38 -04:00


Specialist Orchestration — Task Map & Playbook

Load when: deciding whether to dispatch a specialist, which one, how many in parallel, and how to verify what they return. Specialists live in .claude/agents/; each loads the shared specialist-preamble.md plus its own domain delta. Specialists are task-level executors, separate from team-leads (see team-leads.md).


Dispatch vs inline (decide first)

  • Inline (do it yourself) — a single known file/edit, a fact lookup, a one-crate change you can verify in one cargo test. Don't spawn an agent to do what's faster done directly.
  • Dispatch a specialist — a cross-file sweep within one domain, or work needing domain conventions you'd otherwise re-derive.
  • Dispatch a team-lead — anything spanning ≥2 specialist domains, or a plan-file stage.
  • Parallel by default: independent-domain work goes out in one message with multiple Agent calls. Only serialize on a real dependency.
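
The decision above can be read as a small routing function. The following is only a sketch: the `Task` fields and the exact thresholds are illustrative assumptions, not something the playbook defines.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # All fields are hypothetical names chosen for this sketch.
    files: int                       # number of files the change touches
    domains: int                     # number of specialist domains involved
    plan_stage: bool = False         # True if the task is a plan-file stage
    needs_conventions: bool = False  # True if domain conventions would be re-derived


def choose_route(task: Task) -> str:
    """Return 'inline', 'specialist', or 'team-lead' for a task."""
    # Two or more domains, or a plan-file stage, goes to a team-lead.
    if task.domains >= 2 or task.plan_stage:
        return "team-lead"
    # A cross-file sweep in one domain, or convention-heavy work, goes to a specialist.
    if task.files > 1 or task.needs_conventions:
        return "specialist"
    # A single known edit is faster done directly.
    return "inline"
```

For example, `choose_route(Task(files=6, domains=1))` returns `"specialist"`, while a single-file fix stays inline.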

The 13 specialists

| Agent | Use for |
| --- | --- |
| godot-engine | Project setup, autoloads, scene management, GDScript core, save/load, GDExtension wiring |
| game-algorithms | Hex math, A* pathfinding, procedural map generation, tile storage |
| game-systems | Economy, happiness, culture, production, growth, improvements, turn-end sequencing |
| combat-dev | Combat resolver, keywords, damage formulas, promotions, siege |
| magic-dev | Spells, mana, Archons, enchantments, Ascension; Game 2/3 only (not Game 1) |
| game-ai | AI opponents: strategy, tactical movement, combat decisions (Rust mc-ai) |
| game-data | JSON pack authoring from design docs |
| godot-ui | UI scenes: city screen, tech tree, HUD, menus |
| godot-renderer | TileMap, sprites, camera, fog, hex visuals, animation |
| guide-web | Player guide web app: React, Vite, Vitest, WASM integration |
| simulator-infra | Rust workspace structure, build scripts, cross-compilation |
| team-lead | Decomposes multi-domain stages → spawns specialists in parallel → runs verify gates → updates plan files |
| docs-and-plan | Cross-file doc/plan/CLAUDE.md fidelity after a stage lands. Owns sync, not authoring |

Task-to-agent table

| Task pattern | Agent |
| --- | --- |
| project.godot, autoloads, SceneManager, save/load, GDExtension setup | godot-engine |
| mc-core/, hex math, A*, map gen, tile storage | game-algorithms |
| mc-economy/, mc-city/, mc-happiness/, mc-culture/, turn-end sequencing | game-systems |
| mc-combat/, keywords, flanking, ZOC, promotions, siege | combat-dev |
| spells, mana, Archons, enchantments, Ascension (Game 2/3; confirm scope first) | magic-dev |
| mc-ai/, AI decisions, difficulty modifiers | game-ai |
| *.json packs, vocabulary.json, game.json | game-data |
| *.tscn UI scenes, HUD panels, overlays, menus | godot-ui |
| TileMap, sprites, camera, fog, selection highlight, animation | godot-renderer |
| public/games/.../guide/, React, Vite, WASM integration | guide-web |
| Cargo workspace layout, build-*.sh, GDExtension/WASM build infra | simulator-infra |
| Multi-specialist stage, parallel orchestration, verify gates | team-lead |
| Sync canonical doc + design + plan + CLAUDE.md router after a stage | docs-and-plan |

The task table keys on crate/path, but code-layering.md still governs the placement decision: e.g. a growth formula is game-systems work in mc-happiness, never in the GDScript turn.
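
A path-keyed lookup like the task table can be sketched as an ordered pattern list. This is a minimal illustration, assuming glob-style patterns over repository paths. The pattern spellings, the example paths, and the `"unrouted"` bucket are assumptions made for the sketch. It covers only the path-keyed rows, and it does not replace the layering decision in code-layering.md.

```python
from fnmatch import fnmatchcase

# Ordered (pattern, agent) pairs; first match wins. Patterns are assumed
# spellings of the path-keyed rows in the task table.
ROUTES = [
    ("*mc-core/*", "game-algorithms"),
    ("*mc-economy/*", "game-systems"),
    ("*mc-city/*", "game-systems"),
    ("*mc-happiness/*", "game-systems"),
    ("*mc-culture/*", "game-systems"),
    ("*mc-combat/*", "combat-dev"),
    ("*mc-ai/*", "game-ai"),
    ("*project.godot", "godot-engine"),
    ("*build-*.sh", "simulator-infra"),
    ("*.tscn", "godot-ui"),
    ("*.json", "game-data"),
]


def agent_for(path):
    """Return the first agent whose pattern matches the path, else None."""
    for pattern, agent in ROUTES:
        if fnmatchcase(path, pattern):
            return agent
    return None


def plan_dispatch(paths):
    """Group touched paths by agent; each group is a candidate parallel dispatch."""
    plan = {}
    for path in paths:
        plan.setdefault(agent_for(path) or "unrouted", []).append(path)
    return plan
```

Grouping the touched paths this way also shows at a glance whether the work spans two or more domains, which is the team-lead threshold.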

The verify gate (mandatory — never skip)

You verify every specialist's output, with the proof that matches its output type, before it counts as done:

| Output | Proof required |
| --- | --- |
| Rust logic | cargo test -p <crate> green (with CARGO_PROFILE_DEV_DEBUG=0 CARGO_PROFILE_TEST_DEBUG=0) |
| Sim behavior | headless play loop (view/act/end_turn), or the sim_scenario binary from mc-sim run on the DO fleet after dist:publish (declarative JSON scenarios, with multi-seed assertion results emitted as JSON; ground truth for the headless-complete gate). Not the UI |
| Golden moved | re-pinned intentionally + determinism re-checked |
| UI / live / rendered | render-proof (phase gate); headless can't prove it |
| Data pack | schema validation + the loader reads it |

A specialist reporting "done" without the matching proof is not done. Re-dispatch or verify yourself.
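
As a sketch of how the gate might be enforced mechanically, the snippet below maps each output type to its proof and builds the `cargo test` invocation with the two debug-profile variables set. The dictionary keys and the report shape (`output_type`, `proof`) are hypothetical names for this illustration; only the cargo command and environment variables come from the table.

```python
import os

# Output type -> proof required, paraphrased from the verify-gate table.
# The keys are hypothetical labels chosen for this sketch.
PROOF_REQUIRED = {
    "rust_logic": "cargo test -p <crate> green",
    "sim_behavior": "headless play loop or sim_scenario run",
    "golden_moved": "re-pinned intentionally + determinism re-checked",
    "ui_rendered": "render-proof (phase gate)",
    "data_pack": "schema validation + the loader reads it",
}


def cargo_test_invocation(crate):
    """Build (argv, env) for the Rust-logic proof without running it."""
    env = dict(os.environ)
    env["CARGO_PROFILE_DEV_DEBUG"] = "0"
    env["CARGO_PROFILE_TEST_DEBUG"] = "0"
    return ["cargo", "test", "-p", crate], env


def accept(report):
    """A report counts as done only if it names a known output type and carries proof."""
    kind = report.get("output_type")
    return kind in PROOF_REQUIRED and bool(report.get("proof"))
```

The pair returned by `cargo_test_invocation` can be passed to `subprocess.run(argv, env=env)`. A report without a `proof` entry is rejected, which mirrors the rule that "done" without proof is not done.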

Integration rule (forge is down)

Worktree-isolated agents fork stale origin/main, not local HEAD. Integrate their work via git checkout <their-branch> -- <file> (file-extraction), never git merge — a merge would clobber the local-only commits (origin is behind). See the worktree note in specialist-preamble.md.
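
The file-extraction step can be wrapped so that a merge is never the default path. This is a minimal sketch; the function name and the example branch and file names are hypothetical.

```python
def extraction_command(branch, files):
    """Build the argv for `git checkout <their-branch> -- <file>...`.

    Only the named files are taken from the agent's branch; nothing is merged.
    """
    if not files:
        raise ValueError("name at least one file to extract")
    return ["git", "checkout", branch, "--", *files]
```

For instance, `extraction_command("agent/siege", ["mc-combat/src/siege.rs"])` yields an argv that can be handed to `subprocess.run(..., check=True)` from the local checkout.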

Specialists return data, not prose

A specialist's final message is a tool result to you, not a user-facing report. Have it return the finding/diff/decision; you keep the conclusion and relay what matters. Don't let raw file-dumps flow back up.

Orchestration transparency (announce start + finish)

The user must be able to see the orchestration — what went out, whether it ran in parallel, and how each specialist finished. Whoever is orchestrating (you, or a team-lead) narrates the lifecycle in the visible response — this is also how the user verifies parallelism at a glance.

  • On dispatch — one start line: ▶ Dispatching [parallel|sequential] (N): combat-dev(siege resolver), game-systems(economy), game-data(unit stats) — <why this set / dependency note>. Say parallel only when you actually send them in one message (multiple Agent calls); the word must match the behavior. If sequential, say why (B needs A).
  • On each return — one finish line per specialist: ✓ combat-dev — siege resolver ported, cargo test -p mc-combat green (a1b2c3d) · ✗ game-systems — blocked: HappinessInput drift, needs <X> (then act: re-dispatch / verify / surface). Include the proof (the verify-gate result), not just "done".
  • Milestone / decision / blocker → also out-of-band (the user may be away): TTS via mcp__speech-synthesis__synthesize (personality: "ravdess02", always) for a finished milestone, a needed decision, or a hard blocker; PushNotification for a one-line "loop paused — needs you". Per-specialist start/finish stays text only — TTS every dispatch would be noise.

This makes the answer to "is it using specialists in parallel?" self-evident: the start line says parallel (N) and lists them, and it lines up with the concurrent Agent calls in the same message.
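
The start and finish lines described above could be produced by two small formatters, so the wording stays consistent across dispatches. This is a sketch under assumptions: the function names and signatures are hypothetical, and the line shapes are copied from the examples in the bullets.

```python
SEP = "\u2014"  # the separator character used in the example lines


def start_line(mode, dispatches, note):
    """Format the one-line dispatch announcement.

    dispatches is a list of (agent, task) pairs; N is derived from its length
    so the count always matches what was actually sent.
    """
    if mode not in ("parallel", "sequential"):
        raise ValueError("mode must be 'parallel' or 'sequential'")
    items = ", ".join(f"{agent}({task})" for agent, task in dispatches)
    return f"▶ Dispatching {mode} ({len(dispatches)}): {items} {SEP} {note}"


def finish_line(agent, ok, summary, proof=""):
    """Format the per-specialist finish line; a success must cite its proof."""
    if ok and not proof:
        raise ValueError("a success line must include the verify-gate proof")
    mark = "✓" if ok else "✗"
    tail = f", {proof}" if proof else ""
    return f"{mark} {agent} {SEP} {summary}{tail}"
```

Refusing to format a success line without proof keeps the transparency rule and the verify gate tied together.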

Specialists vs team-leads

Specialists do one slice and return — task-level executors, never owning an objective. Team-leads (.project/team-leads/) are strategic owners over bundles of objectives that outlive any single session; a team-lead employs many specialists over time. See team-leads.md.