magicciv

History

Natalie 78945e9df1 feat(sim): make the headless fullgame runner exercise tech/trade/culture for real

The sim_scenario fullgame driver stepped the turn loop but never boot-loaded the content packs the live harness loads, so process_science ran research-less (tier-1 fallback) and process_trade_phase saw no resource categories: the strategic systems were inert. The four strategic assertions (median_tier_peak, trades_formed, border_growth, clan_winrate) were therefore skipped, leaving trade_forms / time_to_tier / culture_borders_expand / clan_fairness_band vacuously green (passing on `terminates` alone).

This wires the systems for real and measures them:

- drive_fullgame boot-loads the tech web (concatenated public/resources/techs/*.json) and the resource→category map (public/resources/resources.json), the same payloads GdPlayerApi feeds set_tech_web_json / set_resource_categories_json. Now: median tier reaches 10, trades form, culture borders expand for real, and outcomes vary by seed (previously combat/founding were terrain-blind).
- Extract real metrics: tier_peak_p{i} + median_tier_peak (max tier among a player's researched techs), trades_formed (traded luxuries+strategics), owned_tiles_p{i} (culture-claimed territory), and the per-seed winner.
- Un-skip MedianTierPeak / TradesFormed / BorderGrowth; they evaluate against the run. ClanWinrateMax is wired as a batch-level assertion (win fraction of the most-winning clan across the seed set) with the measured value surfaced in the JSON output.
- Strengthen the game1_headless_systems_150t umbrella with median_tier_peak>=4 and trades_formed>=1, and re-calibrate final_turn 120->90: a winner now emerges ~98-113t once the systems actually drive the game, instead of running flat to the cap (calibration-rule: lock the threshold to the real all-systems run).

Determinism fix: PlayerTechState.researched (HashSet) now serializes sorted, so GameState serialization (and the determinism_same_seed end_state_hash check) is stable run-to-run regardless of hash iteration order. The set has no meaningful order; the in-memory type and researched_techs() accessor are unchanged.

Full suite: 19/20 green. clan_fairness_band is the single honest FAIL: over 50 seeds / 6 clans only 3 ever win (winrates 0.14 / 0.46 / 0.40; clans 1,2,3 never win), max 0.46 > the 0.4 band. That is a real fairness gap from the bench's fixed asymmetric start positions + personality balance, surfaced rather than tuned away (owner decision).

Verified: cargo test -p mc-tech (28 passed); full sim_scenario suite run locally on plum (release), determinism + canonical + the three strategic scenarios green on real metrics.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-28 23:20:13 -04:00
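The determinism fix described in the commit message can be illustrated with a minimal, standard-library-only sketch. It is not the repository's code: `TechState`, `encode_researched`, `state_from`, and the tech ids are hypothetical stand-ins, and the real `PlayerTechState` presumably goes through its own serializer. The point is only that sorting the ids at encode time makes the output independent of `HashSet` iteration order while the in-memory type stays a set.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a player's tech state (not the real `PlayerTechState`).
struct TechState {
    researched: HashSet<String>,
}

impl TechState {
    /// Encode the researched set in sorted order so the result does not
    /// depend on the hash iteration order of the underlying `HashSet`.
    fn encode_researched(&self) -> String {
        let mut ids: Vec<&str> = self.researched.iter().map(String::as_str).collect();
        ids.sort_unstable();
        ids.join(",")
    }
}

/// Build a state from a slice of (hypothetical) tech ids.
fn state_from(ids: &[&str]) -> TechState {
    TechState {
        researched: ids.iter().map(|s| s.to_string()).collect(),
    }
}
```

Two states built from the same ids in different insertion orders encode to the same string, which is the property a same-seed end-state hash comparison relies on.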
..
game	fix(tests): mark wild-creature-ai private _method tests pending after Rail-1 Rust port	2026-06-28 12:29:02 -04:00
packages
simulator	feat(sim): make the headless fullgame runner exercise tech/trade/culture for real	2026-06-28 23:20:13 -04:00