Hidden Agenda

hidden_agenda/hidden_agenda is a native PettingZoo port of Melting Pot's Hidden Agenda substrate. Four crewmates try to collect and deposit gems while one impostor can freeze crewmates and influence deliberation votes.

Screenshot

Figure: Hidden Agenda global state.

API

See the generated Hidden Agenda API reference for signatures and public objects.

Use the family dispatcher:

from mp.hidden_agenda import env, parallel_env

parallel = parallel_env("hidden_agenda", render_mode="rgb_array")
aec = env("hidden_agenda")

Or import the variant directly:

from mp.hidden_agenda.hidden_agenda import HiddenAgendaConfig, parallel_env

config = HiddenAgendaConfig(observation_mode="global")
env = parallel_env(config=config)

Agents are named player_0 through player_4. The default roles are four crewmates and one impostor.
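As a rough illustration of driving the environment, the sketch below steps any PettingZoo parallel environment with random actions and sums rewards per agent. It relies only on the standard Parallel API (reset, step, agents, action_space). The helper name random_rollout and its parameters are hypothetical and not part of this package.

```python
from typing import Any, Dict


def random_rollout(parallel_env: Any, max_steps: int = 100, seed: int = 0) -> Dict[str, float]:
    """Step a PettingZoo parallel env with random actions and sum rewards per agent.

    Hypothetical helper: it only assumes the standard PettingZoo Parallel API.
    """
    observations, infos = parallel_env.reset(seed=seed)
    returns = {agent: 0.0 for agent in parallel_env.agents}

    for _ in range(max_steps):
        # PettingZoo empties `agents` once every agent is terminated or truncated.
        if not parallel_env.agents:
            break
        actions = {
            agent: parallel_env.action_space(agent).sample()
            for agent in parallel_env.agents
        }
        observations, rewards, terminations, truncations, infos = parallel_env.step(actions)
        for agent, reward in rewards.items():
            returns[agent] = returns.get(agent, 0.0) + float(reward)

    return returns
```

With the package installed, this could be called as random_rollout(parallel_env("hidden_agenda"), max_steps=200), using the dispatcher import shown above.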

Actions and observations

The action space is Discrete(14):

Action  Meaning
0       no-op
1       forward
2       backward
3       step left
4       step right
5       turn left
6       turn right
7       tag
8-12    vote for players 1-5
13      no-vote
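To avoid magic numbers in agent code, the table above can be mirrored as named constants. This is a sketch only: the Action enum, vote_action, and describe are hypothetical names, not objects exported by the package. It also assumes that action 8 targets the first player (presumably player_0) and that the remaining vote actions follow in order.

```python
from enum import IntEnum

NUM_ACTIONS = 14  # size of the Discrete(14) action space


class Action(IntEnum):
    """Hypothetical names for the action indices listed in the table."""
    NOOP = 0
    FORWARD = 1
    BACKWARD = 2
    STEP_LEFT = 3
    STEP_RIGHT = 4
    TURN_LEFT = 5
    TURN_RIGHT = 6
    TAG = 7
    VOTE_FIRST = 8   # vote actions occupy 8..12
    VOTE_LAST = 12
    NO_VOTE = 13


_LABELS = {
    0: "no-op", 1: "forward", 2: "backward", 3: "step left", 4: "step right",
    5: "turn left", 6: "turn right", 7: "tag", 13: "no-vote",
}


def vote_action(player_index: int) -> int:
    """Map a zero-based player index (0..4) to a vote action (8..12).

    Assumption: action 8 votes for the first player, 9 for the second, and so on.
    """
    if not 0 <= player_index <= 4:
        raise ValueError("player_index must be in 0..4")
    return int(Action.VOTE_FIRST) + player_index


def describe(action: int) -> str:
    """Return the table's label for an action index."""
    if not 0 <= action < NUM_ACTIONS:
        raise ValueError("action must be in 0..13")
    if Action.VOTE_FIRST <= action <= Action.VOTE_LAST:
        # Uses the table's 1-based player numbering.
        return f"vote for player {action - Action.VOTE_FIRST + 1}"
    return _LABELS[action]
```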

Default per-agent observations are:

Key             Shape        Type
RGB             (88, 88, 3)  uint8
INVENTORY       (1,)         float64
READY_TO_SHOOT  ()           float64
VOTING          (5, 7)       float64

state() and render() (with render_mode="rgb_array") both return the global world RGB frame with shape (176, 264, 3). Pass observation_mode="global" in HiddenAgendaConfig to return a global RGB observation for each agent.
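A quick sanity check on observations can catch wrapper or preprocessing mistakes early. The sketch below assumes each default per-agent observation is a mapping from the keys in the table to array-like values exposing shape and dtype, as NumPy arrays do. DEFAULT_OBSERVATION_SPEC and check_observation are hypothetical names used for illustration.

```python
from typing import Any, Dict, List, Tuple

# (shape, dtype name) per key, copied from the default observation table.
DEFAULT_OBSERVATION_SPEC: Dict[str, Tuple[Tuple[int, ...], str]] = {
    "RGB": ((88, 88, 3), "uint8"),
    "INVENTORY": ((1,), "float64"),
    "READY_TO_SHOOT": ((), "float64"),
    "VOTING": ((5, 7), "float64"),
}

# Shape of the global world frame described in the text.
GLOBAL_FRAME_SHAPE = (176, 264, 3)


def check_observation(obs: Dict[str, Any]) -> List[str]:
    """Return a list of mismatches between `obs` and the documented default spec.

    An empty list means every documented key is present with the expected
    shape and dtype. Values only need `.shape` and `.dtype` attributes.
    """
    problems: List[str] = []
    for key, (shape, dtype) in DEFAULT_OBSERVATION_SPEC.items():
        if key not in obs:
            problems.append(f"missing key {key}")
            continue
        value = obs[key]
        if tuple(value.shape) != shape:
            problems.append(f"{key}: shape {tuple(value.shape)} != {shape}")
        if str(value.dtype) != dtype:
            problems.append(f"{key}: dtype {value.dtype} != {dtype}")
    return problems
```

For example, after observations, infos = parallel.reset(), calling check_observation(observations["player_0"]) would be expected to return an empty list in the default observation mode.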

Mechanics

Crewmates collect one gem at a time and deposit gems at the central grate. Reaching 32 deposited gems gives crewmates a win. The impostor can tag active crewmates with a cooldown-limited beam; tagged crewmates freeze and become inactive.

Deliberation starts on a timer or from a visible tag. Active players teleport to the voting room and can vote for any player or no-vote. A majority removes the target. Voting out the impostor gives crewmates a win; reducing active crewmates to one or fewer gives the impostor a win.
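The voting and win rules above can be sketched in a few lines. This is an illustrative model, not the environment's implementation. It assumes "majority" means strictly more than half of the active voters (no-votes count toward the total), and that a crewmate win takes priority if several end conditions hold at once. The names resolve_vote and winner are hypothetical.

```python
from collections import Counter
from typing import Dict, Optional

GEM_GOAL = 32  # deposited gems needed for a crewmate win


def resolve_vote(votes: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the player removed by the vote, or None if nobody is removed.

    `votes` maps each active voter to a target name, or None for no-vote.
    Assumption: removal needs strictly more than half of all active voters.
    """
    tally = Counter(target for target in votes.values() if target is not None)
    if not tally:
        return None
    target, count = tally.most_common(1)[0]
    return target if count > len(votes) / 2 else None


def winner(deposited_gems: int, impostor_voted_out: bool, active_crewmates: int) -> Optional[str]:
    """Return "crewmates", "impostor", or None while the episode continues."""
    if impostor_voted_out or deposited_gems >= GEM_GOAL:
        return "crewmates"
    if active_crewmates <= 1:
        return "impostor"
    return None
```

Under these assumptions, three of five active players voting for the same target removes it, while a 2-2-1 split or a round dominated by no-votes removes nobody, which is the opening the impostor exploits.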

Reward strategy and failure modes

Crewmates maximize reward by dividing gem collection routes, depositing quickly, watching for visible tags, and coordinating votes only when evidence is strong. The impostor maximizes reward by freezing isolated crewmates, avoiding visible tags that trigger deliberation, and exploiting split votes or excessive no-votes to survive until the active crewmate count is low.

Bad equilibria include crewmates over-focusing on voting while gem progress stalls, no-vote norms that let the impostor remove the crew one by one, and panic majorities that vote out crewmates after ambiguous evidence. The impostor can also lose by tagging too publicly, causing fast deliberation and coordinated removal.

Playable notebook

Launch the playable notebook with:

uv run marimo run notebooks/hidden_agenda/hidden_agenda_mo.py

Controls:

  • WASD or arrow keys move the active player.
  • Q/E turn the active player.
  • SPACE tags.
  • Number keys 1-5 vote for players.
  • 0 submits no-vote.
  • TAB switches the controlled player.
  • ESC closes the pygame window.