magicciv/public/games
Natalie 57b326b670
feat(ai): clan-conditioned training pipeline (harness + env + reward overlays)
Wires up per-clan trained AI. Each training episode samples a clan, stamps it on
the LEARNER slot so the observation one-hot encodes it, and scales the SHAPING
rewards by that clan's overlay (terminal win/loss rewards stay universal):

- player_api_main.gd: CP_LEARNER_CLAN stamps the learner slot's clan via
  set_player_personality_json -> PlayerState.clan_id -> PlayerView.clan_index ->
  obs clan one-hot. (Previously only non-learner slots got a clan.)
- reward_overlays.json: per-clan group multipliers (combat/expansion/production/
  economy/tech) derived from ai_personalities.json strategic_axes and normalized per
  clan to mean 1.0, so every clan's multipliers average to the same scale (no
  fairness confound). Archetypes emerge: blackhammer combat 1.5,
  goldvein economy 1.64, deepforge expansion 0.42.
- magic_civ_env.py: samples the clan per episode (seeded), passes CP_LEARNER_CLAN,
  scales the 8 shaping reward terms by self._ov(group).
- harness_client.py: HarnessConfig.learner_clan -> CP_LEARNER_CLAN.
- train.py: --clan ('' = generalist | 'all' = sample from every clan |
  comma-separated list of clans).
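
The overlay normalization and reward scaling described above can be sketched
roughly as follows. This is a minimal illustration, not the code in
reward_overlays.json or magic_civ_env.py: the function names, the raw weights,
and the (group, value) shape of the shaping terms are all assumptions made for
the example, and the numbers are invented rather than taken from
ai_personalities.json.

```python
from statistics import fmean

# The five overlay groups named in the commit message.
GROUPS = ("combat", "expansion", "production", "economy", "tech")


def normalize_overlay(raw):
    """Rescale one clan's raw group weights so their mean is exactly 1.0."""
    mean = fmean(raw[g] for g in GROUPS)
    if mean <= 0:
        raise ValueError("raw weights must have a positive mean")
    return {g: raw[g] / mean for g in GROUPS}


def shaped_reward(terms, overlay):
    """Sum shaping terms, each scaled by its group's multiplier.

    `terms` is a list of (group, value) pairs (hypothetical shape).
    Groups missing from the overlay fall back to a neutral 1.0.
    """
    return sum(overlay.get(group, 1.0) * value for group, value in terms)


def total_reward(terms, overlay, terminal):
    """Terminal win/loss reward is added unscaled, so it stays universal."""
    return shaped_reward(terms, overlay) + terminal


# Invented raw weights for a combat-leaning clan (illustration only).
raw = {"combat": 3.0, "expansion": 1.0, "production": 2.0,
       "economy": 2.0, "tech": 2.0}
overlay = normalize_overlay(raw)

# Two shaping terms this step plus a terminal win bonus of 10.
step_terms = [("combat", 2.0), ("economy", 1.0)]
reward = total_reward(step_terms, overlay, terminal=10.0)
```

Dividing by the per-clan mean keeps the average multiplier at 1.0 for every
clan, so a clan can favor one group only by giving up weight in another.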

Local checks: py_compile clean; overlays cover all 6 clans. Next: a fleet smoke
test (clan_index present in the learner view, plus a tiny training run) before
scaling out.
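
For the smoke run, the --clan handling and the seeded per-episode sampling
might look something like the sketch below. The helper names, the seed-mixing
scheme, and the clan tuple are assumptions for illustration; only the three
clan names mentioned in this commit are listed, and the real train.py and
magic_civ_env.py may do this differently.

```python
import random

# Placeholder pool: only the three clans named in this commit message.
# The real list has 6 clans.
ALL_CLANS = ("blackhammer", "goldvein", "deepforge")


def parse_clan_arg(value, all_clans=ALL_CLANS):
    """Turn a --clan string into a sampling pool.

    ''            -> []  (generalist: no clan is stamped on the learner)
    'all'         -> every known clan
    'a,b' (list)  -> the listed clans, validated against all_clans
    """
    value = value.strip()
    if value == "":
        return []
    if value == "all":
        return list(all_clans)
    clans = [c.strip() for c in value.split(",") if c.strip()]
    unknown = [c for c in clans if c not in all_clans]
    if unknown:
        raise ValueError(f"unknown clan(s): {', '.join(unknown)}")
    return clans


def sample_clan(pool, seed, episode):
    """Pick the learner clan for one episode, reproducibly.

    A fresh RNG is derived from (seed, episode), so the same pair always
    yields the same clan regardless of how many episodes ran before.
    Returns None for an empty pool (generalist run).
    """
    if not pool:
        return None
    rng = random.Random(seed * 1_000_003 + episode)
    return rng.choice(pool)


pool = parse_clan_arg("goldvein, deepforge")
clan_for_episode_3 = sample_clan(pool, seed=7, episode=3)
```

Deriving the RNG from the seed and episode index, rather than sharing one
generator, is a design choice for this sketch: it keeps a resumed or
parallel run on the same clan schedule.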

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-30 13:06:02 -04:00
age-of-dwarves feat(ai): clan-conditioned training pipeline (harness + env + reward overlays) 2026-06-30 13:06:02 -04:00
age-of-elves/guide chore(mc): npmrc registry config + claude settings 2026-06-29 11:47:33 -04:00
age-of-kzzkyt/guide chore(mc): npmrc registry config + claude settings 2026-06-29 11:47:33 -04:00
sandbox