Configuration
PolicyArena experiments are defined in YAML config files, validated with Pydantic at load time.
YAML Schema

name: "Experiment name"       # optional, default: "Unnamed Scenario"
game: prisoners_dilemma       # required: game ID from the registry
rounds: 200                   # optional, default: 100
seed: 42                      # optional: makes the run reproducible

agents:                       # required: at least one agent group
  - name: tft                 # label prefix for this group
    strategy: tit_for_tat     # brain factory key (see `policy-arena info <game>`)
    count: 3                  # optional, default: 1
    parameters:               # optional: passed to the brain factory
      learning_rate: 0.15
      epsilon: 0.2

game_params:                  # optional: passed to the Mesa model constructor
  payoff_matrix:
    cc: [3, 3]
    cd: [0, 5]
    dc: [5, 0]
    dd: [1, 1]
Fields

| Field | Type | Default | Description |
|---|---|---|---|
| `name` | string | `"Unnamed Scenario"` | Human-readable experiment name |
| `game` | string | required | Game ID from the registry |
| `rounds` | int | `100` | Number of simulation steps |
| `seed` | int or null | `null` | Random seed for reproducibility |
| `agents` | list | required | Agent group definitions |
| `game_params` | dict | `{}` | Extra parameters passed to the game model |
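
To illustrate what a fixed `seed` provides, here is a minimal standard-library sketch. It is not PolicyArena's own code, and the helper `draw_moves` is hypothetical. Two generators seeded with the same value produce the same draws, while `None` (the analogue of `seed: null`) leaves the generator unseeded, so results vary between runs.

```python
import random


def draw_moves(seed, n=5):
    """Draw n random cooperate/defect moves from a generator seeded with `seed`.

    Hypothetical helper for illustration; seed=None mimics `seed: null`.
    """
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.choice("CD") for _ in range(n)]


# Same seed, same sequence: the run can be repeated exactly.
first = draw_moves(42)
second = draw_moves(42)
print(first == second)
```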
Agent Fields

| Field | Type | Default | Description |
|---|---|---|---|
| `name` | string | required | Label prefix for agents in this group |
| `strategy` | string | required | Brain factory key |
| `count` | int | `1` | Number of agents with this strategy |
| `parameters` | dict | `{}` | Parameters passed to the brain factory |
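
The two tables can be read as a pair of nested records. The sketch below mirrors them with standard-library dataclasses as a stand-in for the Pydantic models, which are not shown here. The class names `AgentGroup` and `Scenario` and the helper `scenario_from_dict` are assumptions made for illustration, and the real models may validate more strictly.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class AgentGroup:
    name: str                      # required: label prefix
    strategy: str                  # required: brain factory key
    count: int = 1
    parameters: dict[str, Any] = field(default_factory=dict)


@dataclass
class Scenario:
    game: str                      # required: game ID
    agents: list[AgentGroup]       # required: at least one group
    name: str = "Unnamed Scenario"
    rounds: int = 100
    seed: Optional[int] = None
    game_params: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self):
        if not self.agents:
            raise ValueError("agents must contain at least one agent group")


def scenario_from_dict(raw: dict) -> Scenario:
    """Build a Scenario from a parsed YAML mapping (hypothetical helper)."""
    data = dict(raw)  # shallow copy so the caller's dict is not mutated
    groups = [AgentGroup(**g) for g in data.pop("agents", [])]
    return Scenario(agents=groups, **data)


# Only the required fields are given; everything else falls back to defaults.
minimal = scenario_from_dict({
    "game": "prisoners_dilemma",
    "agents": [{"name": "tft", "strategy": "tit_for_tat"}],
})
print(minimal.name, minimal.rounds, minimal.seed, minimal.agents[0].count)
# prints: Unnamed Scenario 100 None 1
```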
Example: Prisoner's Dilemma

name: "PD — RL vs Rule-Based"
game: prisoners_dilemma
rounds: 200
seed: 42

agents:
  - name: tft
    strategy: tit_for_tat
    count: 3
  - name: always_defect
    strategy: always_defect
    count: 3
  - name: q_learner
    strategy: q_learning
    count: 2
    parameters:
      learning_rate: 0.15
      epsilon: 0.2

game_params:
  payoff_matrix:
    cc: [3, 3]
    cd: [0, 5]
    dc: [5, 0]
    dd: [1, 1]
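
One way to read the `payoff_matrix` above is sketched below. It assumes each key is the first player's move followed by the second player's move (`c` for cooperate, `d` for defect) and each value is `[first player's payoff, second player's payoff]`. That reading matches the standard Prisoner's Dilemma but is not stated in this page, and the function `payoffs` is hypothetical.

```python
# Payoff matrix copied from the example config above.
PAYOFF_MATRIX = {
    "cc": [3, 3],
    "cd": [0, 5],
    "dc": [5, 0],
    "dd": [1, 1],
}


def payoffs(move_a: str, move_b: str) -> tuple[int, int]:
    """Return (payoff_a, payoff_b) for one round.

    Assumes keys are move_a + move_b and values are [payoff_a, payoff_b].
    """
    a, b = PAYOFF_MATRIX[move_a + move_b]
    return a, b


# A cooperator meeting a defector receives the lowest payoff.
print(payoffs("c", "d"))  # (0, 5)

# The values satisfy the usual Prisoner's Dilemma ordering T > R > P > S.
T, R, P, S = (PAYOFF_MATRIX[k][0] for k in ("dc", "cc", "dd", "cd"))
print(T > R > P > S)  # True
```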
Example: LLM Agents

name: "PD — LLM vs Rule-Based"
game: prisoners_dilemma
rounds: 50
seed: 42

agents:
  - name: claude_greedy
    strategy: llm
    parameters:
      provider: anthropic
      model: claude-sonnet-4-6
      persona: greedy
  - name: tft
    strategy: tit_for_tat
    count: 3
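
In this example `claude_greedy` omits `count`, so the default of 1 applies and the run has four agents in total. The sketch below expands the groups into per-agent labels. The `<name>_<index>` label format and the helper `expand_labels` are assumptions made for illustration, since this page only says that `name` is a label prefix.

```python
# Agent groups from the example above, reduced to the fields used here.
agents = [
    {"name": "claude_greedy", "strategy": "llm"},  # no count, so default of 1
    {"name": "tft", "strategy": "tit_for_tat", "count": 3},
]


def expand_labels(groups):
    """Expand agent groups into one label per agent.

    The `<name>_<index>` format is an assumption made for this sketch.
    """
    labels = []
    for group in groups:
        for i in range(group.get("count", 1)):
            labels.append(f"{group['name']}_{i}")
    return labels


print(expand_labels(agents))
# prints: ['claude_greedy_0', 'tft_0', 'tft_1', 'tft_2']
```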
Discovering Available Strategies

From the CLI:

# List all games
policy-arena games

# Show strategies for a specific game
policy-arena info prisoners_dilemma

From Python:

import policy_arena as pa

# List the brain factory keys registered for a game
registry = pa.get_registry()
reg = registry.get("prisoners_dilemma")
print(sorted(reg.brain_factories.keys()))
Overriding at Runtime

Both the Python API and the CLI support overriding seed and rounds:

# Python API
results = pa.run("config.yaml", seed=123, rounds=500)

# CLI
policy-arena run config.yaml --seed 123
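
A plausible reading of the override behaviour is that an explicitly passed value replaces the corresponding config field, while an omitted one leaves the file's value in place. The sketch below shows that precedence with a hypothetical helper, `apply_overrides`. It is not part of the `policy_arena` API, and the real merge logic may differ.

```python
def apply_overrides(config: dict, seed=None, rounds=None) -> dict:
    """Return a copy of `config` with any non-None overrides applied.

    Hypothetical helper; the real merge logic inside policy_arena may differ.
    """
    merged = dict(config)  # leave the loaded config untouched
    if seed is not None:
        merged["seed"] = seed
    if rounds is not None:
        merged["rounds"] = rounds
    return merged


loaded = {"game": "prisoners_dilemma", "rounds": 200, "seed": 42}

# Only the seed is overridden; rounds keeps the value from the file.
print(apply_overrides(loaded, seed=123))
# prints: {'game': 'prisoners_dilemma', 'rounds': 200, 'seed': 123}
```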
Validation

Validate a config without running it:

# CLI
policy-arena validate config.yaml

# Python API
config = pa.load_config("config.yaml")  # raises ConfigValidationError on invalid input