Skip to content

Factory Commons

factory_commons/either_or is a native PettingZoo port of Melting Pot's factory_commons__either_or substrate. Three players collect apples, move blue cubes into factory hoppers, and choose between machines that preserve the blue cube economy or convert cubes into short-term apple rewards.

Screenshot

Factory Commons Either-Or global state

API

See the generated Factory Commons API reference for signatures and public objects.

Use the family dispatcher:

from mp.factory_commons import env, parallel_env

parallel = parallel_env("either_or", render_mode="rgb_array")
aec = env("factory_commons__either_or")

Or import the variant directly:

from mp.factory_commons.either_or import FactoryCommonsEitherOrConfig, parallel_env

config = FactoryCommonsEitherOrConfig(observation_mode="global")
env = parallel_env(config=config)

Agents are named player_0, player_1, and player_2.

Actions and observations

The action space is Discrete(12):

Action Meaning
0 no-op
1 forward
2 backward
3 step left
4 step right
5 turn left
6 turn right
7 pickup
8 grasp or drop
9 hold
10 hold and shove
11 hold and pull

Default per-agent observations are:

Key Shape Type
RGB (88, 88, 3) uint8
READY_TO_SHOOT () float64
STAMINA () float64

state() and render_mode="rgb_array" return the global world RGB frame with shape (128, 184, 3). Pass observation_mode="global" to return a global RGB observation for each agent.

Mechanics

Apples reward the entering avatar by +1 and then disappear into the wait pool. Blue cubes can be grasped and dropped into open hoppers. The left machines output a blue cube plus an apple, while the right machines output two apples.

Grappling actions use a short hold beam. A held avatar briefly cannot move or hold, and shove/pull moves that avatar if the destination is passable. Stamina falls with movement and recovers while resting; low stamina creates movement freezes.

Reward strategy and failure modes

Optimal long-run play balances immediate apples with keeping blue cubes in circulation. Dropping cubes into cube-preserving hoppers maintains future production, while two-apple machines can be useful when the cube stock is high or a player needs quick reward. Agents also need to budget stamina and use grappling to clear paths rather than grief productive carriers.

Bad equilibria include consuming the cube economy for short-term apples until few cubes remain, hoarding or blocking cubes near hoppers, and grappling loops where players shove each other instead of feeding machines. Stamina exhaustion can lock agents into slow, low-productivity movement if policies never rest.

Playable notebook

Launch the playable notebook with:

uv run marimo run notebooks/factory_commons/either_or_mo.py

Controls:

  • WASD or arrow keys move the active player.
  • Q/E turn the active player.
  • C sends the pickup action.
  • SPACE grasps or drops objects.
  • Shift holds another player.
  • F shoves and R pulls a held player.
  • TAB switches the controlled player.
  • ESC closes the pygame window.