Configuration

PolicyArena experiments are defined in YAML config files, validated with Pydantic at load time.

YAML Schema

name: "Experiment name"           # optional, default: "Unnamed Scenario"
game: prisoners_dilemma           # required — game ID from the registry
rounds: 200                       # optional, default: 100
seed: 42                          # optional — makes the run reproducible
agents:                           # required — at least one agent group
  - name: tft                     # label prefix for this group
    strategy: tit_for_tat         # brain factory key (see `policy-arena info <game>`)
    count: 3                      # optional, default: 1
    parameters:                   # optional — passed to the brain factory
      learning_rate: 0.15
      epsilon: 0.2
game_params:                      # optional — passed to the Mesa model constructor
  payoff_matrix:
    cc: [3, 3]
    cd: [0, 5]
    dc: [5, 0]
    dd: [1, 1]

Fields

| Field | Type | Default | Description |
|---|---|---|---|
| name | string | "Unnamed Scenario" | Human-readable experiment name |
| game | string | required | Game ID from the registry |
| rounds | int | 100 | Number of simulation steps |
| seed | int \| null | null | Random seed for reproducibility |
| agents | list | required | Agent group definitions |
| game_params | dict | {} | Extra parameters passed to the game model |

Agent Fields

| Field | Type | Default | Description |
|---|---|---|---|
| name | string | required | Label prefix for agents in this group |
| strategy | string | required | Brain factory key |
| count | int | 1 | Number of agents with this strategy |
| parameters | dict | {} | Parameters passed to the brain factory |
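
The two field tables above can be read as a pair of nested record types. The following is a minimal sketch of that shape using standard-library dataclasses rather than Pydantic, which is what PolicyArena itself uses. The class names `AgentGroup` and `ScenarioConfig`, the `from_dict` helper, and the exact error raised are assumptions made for illustration; they are not the project's actual models.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class AgentGroup:
    # Hypothetical stand-in for one entry of the `agents` list.
    name: str                                   # required: label prefix
    strategy: str                               # required: brain factory key
    count: int = 1                              # documented default
    parameters: dict[str, Any] = field(default_factory=dict)


@dataclass
class ScenarioConfig:
    # Hypothetical stand-in for the top-level config.
    game: str                                   # required: game ID
    agents: list[AgentGroup]                    # required: agent groups
    name: str = "Unnamed Scenario"              # documented default
    rounds: int = 100                           # documented default
    seed: Optional[int] = None                  # null means no fixed seed
    game_params: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # The schema asks for at least one agent group.
        if not self.agents:
            raise ValueError("agents must contain at least one agent group")

    @classmethod
    def from_dict(cls, raw: dict[str, Any]) -> "ScenarioConfig":
        data = dict(raw)  # shallow copy so the caller's dict is not mutated
        groups = [AgentGroup(**g) for g in data.pop("agents", [])]
        return cls(agents=groups, **data)


raw = {
    "game": "prisoners_dilemma",
    "agents": [{"name": "tft", "strategy": "tit_for_tat", "count": 3}],
}
cfg = ScenarioConfig.from_dict(raw)
print(cfg.name, cfg.rounds, cfg.seed)
```

Omitted optional fields fall back to the defaults listed in the tables, while a missing required field (`game`, `agents`, or a group's `name` or `strategy`) fails at construction time.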

Example: Prisoner's Dilemma

name: "PD - RL vs Rule-Based"
game: prisoners_dilemma
rounds: 200
seed: 42
agents:
  - name: tft
    strategy: tit_for_tat
    count: 3
  - name: always_defect
    strategy: always_defect
    count: 3
  - name: q_learner
    strategy: q_learning
    count: 2
    parameters:
      learning_rate: 0.15
      epsilon: 0.2
game_params:
  payoff_matrix:
    cc: [3, 3]
    cd: [0, 5]
    dc: [5, 0]
    dd: [1, 1]
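
The `payoff_matrix` keys in the example above appear to follow the usual Prisoner's Dilemma convention: the first letter is the first player's move (`c` for cooperate, `d` for defect), the second letter is the second player's move, and the list holds their payoffs in the same order. That reading is an assumption, not something the config reference states. Under it, a lookup can be sketched as follows; the `payoffs` helper is hypothetical.

```python
# Payoff matrix copied from the example config above.
payoff_matrix = {
    "cc": [3, 3],
    "cd": [0, 5],
    "dc": [5, 0],
    "dd": [1, 1],
}


def payoffs(move_a: str, move_b: str) -> tuple[int, int]:
    """Return (payoff_a, payoff_b) for moves given as 'c' or 'd'.

    Assumes key = move_a + move_b and value = [payoff_a, payoff_b].
    """
    a, b = payoff_matrix[move_a + move_b]
    return a, b


# Classic Prisoner's Dilemma ordering, read from the first player's side.
temptation = payoff_matrix["dc"][0]   # defect against a cooperator
reward = payoff_matrix["cc"][0]       # mutual cooperation
punishment = payoff_matrix["dd"][0]   # mutual defection
sucker = payoff_matrix["cd"][0]       # cooperate against a defector

is_pd = (
    temptation > reward > punishment > sucker
    and 2 * reward > temptation + sucker
)

print(payoffs("c", "d"))  # (0, 5)
print(is_pd)              # True
```

With these values the matrix satisfies both standard conditions, so defecting is individually tempting while sustained mutual cooperation still beats taking turns exploiting each other.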

Example: LLM Agents

name: "PD - LLM vs Rule-Based"
game: prisoners_dilemma
rounds: 50
seed: 42
agents:
  - name: claude_greedy
    strategy: llm
    parameters:
      provider: anthropic
      model: claude-sonnet-4-6
      persona: greedy
  - name: tft
    strategy: tit_for_tat
    count: 3

Discovering Available Strategies

# List all games
policy-arena games

# Show strategies for a specific game
policy-arena info prisoners_dilemma

# From Python: list the strategy keys registered for a game
import policy_arena as pa

registry = pa.get_registry()
reg = registry.get("prisoners_dilemma")
print(sorted(reg.brain_factories.keys()))

Overriding at Runtime

Both the Python API and the CLI support overriding `seed` and `rounds`:

# Python API
import policy_arena as pa

results = pa.run("config.yaml", seed=123, rounds=500)

# CLI
policy-arena run config.yaml --seed 123
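
One plausible reading of the override behavior is that explicitly passed values win over the file's values, and anything not passed falls through unchanged. The sketch below models that on a plain dict. `apply_overrides` is a hypothetical helper written for this illustration; it is not part of the PolicyArena API, and the real merge logic may differ.

```python
from typing import Any, Optional


def apply_overrides(
    config: dict[str, Any],
    seed: Optional[int] = None,
    rounds: Optional[int] = None,
) -> dict[str, Any]:
    """Return a copy of config with any non-None overrides applied."""
    merged = dict(config)  # leave the loaded config untouched
    if seed is not None:
        merged["seed"] = seed
    if rounds is not None:
        merged["rounds"] = rounds
    return merged


file_config = {"game": "prisoners_dilemma", "rounds": 200, "seed": 42}

# Only the seed is overridden; rounds keeps the value from the file.
print(apply_overrides(file_config, seed=123))
```

Returning a copy keeps the loaded config reusable, which is convenient when sweeping over several seeds from one file.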

Validation

Validate a config without running it:

# CLI
policy-arena validate config.yaml

# Python API
import policy_arena as pa

config = pa.load_config("config.yaml")  # raises ConfigValidationError on invalid input