The sim_scenario fullgame driver stepped the turn loop but never boot-loaded
the content packs the live harness loads, so process_science ran research-less
(tier-1 fallback) and process_trade_phase saw no resource categories — the
strategic systems were inert. The four strategic assertions (median_tier_peak,
trades_formed, border_growth, clan_winrate) were therefore skipped, leaving
trade_forms / time_to_tier / culture_borders_expand / clan_fairness_band
vacuously green (passing on `terminates` alone).

This wires the systems for real and measures them:

- drive_fullgame boot-loads the tech web (concatenated public/resources/techs/
*.json) and the resource→category map (public/resources/resources.json), the
same payloads GdPlayerApi feeds set_tech_web_json / set_resource_categories_json.
Now: median tier reaches 10, trades form, culture borders expand for real, and
outcomes vary by seed (previously combat/founding were terrain-blind).
- Extract real metrics: tier_peak_p{i} + median_tier_peak (max tier among a
player's researched techs), trades_formed (traded luxuries+strategics),
owned_tiles_p{i} (culture-claimed territory), and the per-seed winner.
- Un-skip MedianTierPeak / TradesFormed / BorderGrowth; they now evaluate against
the run's real metrics. ClanWinrateMax is wired as a batch-level assertion (win fraction of
the most-winning clan across the seed set) with the measured value surfaced in
the JSON output.
- Strengthen the game1_headless_systems_150t umbrella with median_tier_peak>=4
and trades_formed>=1, and re-calibrate final_turn 120->90: a winner now emerges
~98-113t once the systems actually drive the game, instead of running flat to
the cap (calibration-rule: lock the threshold to the real all-systems run).
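
A minimal sketch of how the tier metrics described above could be derived, assuming a hypothetical `tier_of` lookup from tech id to tier. The function names, the fallback of 1 for a player with no researched techs, and the lower-middle median for even player counts are illustrative assumptions, not the harness's actual code:

```rust
use std::collections::{HashMap, HashSet};

/// Highest tier among a player's researched techs.
/// Assumption: a player with no known techs reports tier 1.
fn tier_peak(researched: &HashSet<String>, tier_of: &HashMap<String, u32>) -> u32 {
    researched
        .iter()
        .filter_map(|id| tier_of.get(id).copied())
        .max()
        .unwrap_or(1)
}

/// Median of the per-player peaks (the tier_peak_p{i} values).
/// Assumption: for an even player count, take the lower middle value.
fn median_tier_peak(peaks: &[u32]) -> Option<u32> {
    if peaks.is_empty() {
        return None;
    }
    let mut sorted = peaks.to_vec();
    sorted.sort_unstable();
    Some(sorted[(sorted.len() - 1) / 2])
}
```

Under these assumptions, an assertion such as median_tier_peak>=4 compares the returned value against the threshold.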

Determinism fix: PlayerTechState.researched (HashSet) now serializes sorted, so
GameState serialization — and the determinism_same_seed end_state_hash check —
is stable run-to-run regardless of hash iteration order. The set has no
meaningful order; the in-memory type and researched_techs() accessor are
unchanged.
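
The idea behind the determinism fix can be sketched with the standard library alone. The real change lives in the serialization path of PlayerTechState; the helper names and the tech ids used in any example input here are hypothetical, and `DefaultHasher` only stands in for whatever end_state_hash actually uses:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Canonical view of the set: the same ids, in sorted order.
/// The in-memory HashSet is left untouched.
fn canonical_researched(researched: &HashSet<String>) -> Vec<&str> {
    let mut ids: Vec<&str> = researched.iter().map(String::as_str).collect();
    ids.sort_unstable();
    ids
}

/// Stand-in for an end-state hash: it hashes the sorted view, so the
/// result does not depend on the HashSet's iteration order.
fn end_state_hash(researched: &HashSet<String>) -> u64 {
    let mut hasher = DefaultHasher::new();
    canonical_researched(researched).hash(&mut hasher);
    hasher.finish()
}
```

Two sets holding the same ids hash identically regardless of insertion order or per-instance hash seed, which is the property the determinism_same_seed check relies on.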

Full suite: 19/20 green. clan_fairness_band is the single honest FAIL: over 50
seeds / 6 clans only 3 ever win (winrates 0.14 / 0.46 / 0.40; clans 1,2,3 never
win), max 0.46 > the 0.4 band. That is a real fairness gap from the bench's fixed
asymmetric start positions + personality balance — surfaced, not tuned away
(owner decision).
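
For reference, the batch-level winrate check can be sketched as follows. The function name and the `u8` clan ids are assumptions for illustration. The win counts 7 / 23 / 20 follow from the reported winrates over 50 seeds, but which clan holds which rate is not stated above:

```rust
use std::collections::BTreeMap;

/// Win fraction of the most-winning clan across a batch of per-seed winners.
/// Returns (clan id, winrate), or None for an empty batch.
fn clan_winrate_max(winners: &[u8]) -> Option<(u8, f64)> {
    if winners.is_empty() {
        return None;
    }
    // Count wins per clan; BTreeMap keeps iteration order deterministic.
    let mut wins: BTreeMap<u8, usize> = BTreeMap::new();
    for &clan in winners {
        *wins.entry(clan).or_insert(0) += 1;
    }
    let total = winners.len() as f64;
    wins.into_iter()
        .max_by_key(|&(_, n)| n)
        .map(|(clan, n)| (clan, n as f64 / total))
}
```

With 7, 23, and 20 wins across 50 seeds, the maximum is 23 / 50 = 0.46, which exceeds the 0.4 band and so fails the assertion.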

Verified: `cargo test -p mc-tech` (28 passed); full sim_scenario suite run locally
on plum (release), with determinism, canonical, and the three strategic scenarios
green on real metrics.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>