Chemistry

mp.chemistry provides native PettingZoo ports of Melting Pot's Chemistry substrates. Eight players move over a bounded molecule grid, swap one molecule at a time into a carried vesicle, and receive rewards when useful reactions transition to their product inside that vesicle.

Screenshots

| Variant | Screenshot |
| --- | --- |
| Two metabolic cycles | (image: Chemistry two metabolic cycles global state) |
| Two metabolic cycles with distractors | (image: Chemistry two metabolic cycles with distractors global state) |
| Three metabolic cycles | (image: Chemistry three metabolic cycles global state) |
| Three metabolic cycles with plentiful distractors | (image: Chemistry three metabolic cycles with plentiful distractors global state) |

API

Use the family dispatcher:

from mp.chemistry import env, parallel_env

parallel = parallel_env("two_metabolic_cycles")
aec = env("two_metabolic_cycles")

Every short name also accepts its upstream Melting Pot config name, for example "chemistry__two_metabolic_cycles".
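For the variants listed on this page, the upstream name is the short name with `chemistry__` in front. The sketch below normalizes either spelling to the short name. `to_short_name`, `SHORT_NAMES`, and `UPSTREAM_PREFIX` are hypothetical helpers written for illustration; they are not part of `mp.chemistry`, and the dispatcher may resolve names differently internally.

```python
# Hypothetical helper, not part of mp.chemistry: map either spelling of a
# variant name to the short name used on this page.
UPSTREAM_PREFIX = "chemistry__"

SHORT_NAMES = (
    "two_metabolic_cycles",
    "two_metabolic_cycles_with_distractors",
    "three_metabolic_cycles",
    "three_metabolic_cycles_with_plentiful_distractors",
)


def to_short_name(name: str) -> str:
    """Return the short variant name for a short or upstream-style name."""
    short = name
    if name.startswith(UPSTREAM_PREFIX):
        # Drop the upstream family prefix, keeping the variant part.
        short = name[len(UPSTREAM_PREFIX):]
    if short not in SHORT_NAMES:
        raise ValueError(f"unknown Chemistry variant: {name!r}")
    return short
```

A helper like this can be handy when experiment configs mix both spellings and results need to be grouped under one key per variant.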

Direct imports are also available:

from mp.chemistry.two_metabolic_cycles import TwoMetabolicCyclesConfig, parallel_env

env = parallel_env(config=TwoMetabolicCyclesConfig(), render_mode="rgb_array")
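Once an environment is constructed, a quick smoke test is a random rollout. The sketch below assumes the returned object follows the current PettingZoo Parallel API (`reset` returning `(observations, infos)`, `step` returning a five-tuple, plus `agents` and `action_space(agent)`). `random_rollout` is a hypothetical helper name, and the commented usage lines assume `mp.chemistry` is installed.

```python
from typing import Any, Dict


def random_rollout(penv: Any, max_steps: int = 100, seed: int = 0) -> Dict[str, float]:
    """Step a PettingZoo-style parallel env with random actions.

    Returns the summed reward per agent over the rollout.
    """
    observations, infos = penv.reset(seed=seed)
    returns: Dict[str, float] = {agent: 0.0 for agent in penv.agents}
    for _ in range(max_steps):
        if not penv.agents:  # every agent has terminated or been truncated
            break
        # One uniformly sampled action per live agent.
        actions = {agent: penv.action_space(agent).sample() for agent in penv.agents}
        observations, rewards, terminations, truncations, infos = penv.step(actions)
        for agent, reward in rewards.items():
            returns[agent] = returns.get(agent, 0.0) + float(reward)
    return returns


# Usage (requires mp.chemistry):
# from mp.chemistry import parallel_env
# print(random_rollout(parallel_env("two_metabolic_cycles"), max_steps=200))
```

Random play is unlikely to complete many reactions, so low or zero returns here do not indicate a broken installation.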

Variants

| Short name | Upstream alias | Players | RGB observation shape |
| --- | --- | --- | --- |
| two_metabolic_cycles | chemistry__two_metabolic_cycles | 8 | local (88, 88, 3), global (112, 200, 3) |
| two_metabolic_cycles_with_distractors | chemistry__two_metabolic_cycles_with_distractors | 8 | local (88, 88, 3), global (112, 200, 3) |
| three_metabolic_cycles | chemistry__three_metabolic_cycles | 8 | local (88, 88, 3), global (112, 200, 3) |
| three_metabolic_cycles_with_plentiful_distractors | chemistry__three_metabolic_cycles_with_plentiful_distractors | 8 | local (88, 88, 3), global (112, 200, 3) |
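The documented shapes can double as a sanity check in a training pipeline. The sketch below is a hypothetical helper (`check_rgb_shape` and `EXPECTED_RGB_SHAPES` are not part of `mp.chemistry`) that compares a frame's `shape` attribute, as a NumPy array would expose it, against the values in the table.

```python
# Shapes copied from the variants table; they are the same for all four variants.
EXPECTED_RGB_SHAPES = {
    "local": (88, 88, 3),
    "global": (112, 200, 3),
}


def check_rgb_shape(frame, view: str = "local") -> None:
    """Raise ValueError if `frame.shape` differs from the documented shape."""
    expected = EXPECTED_RGB_SHAPES[view]
    actual = tuple(frame.shape)
    if actual != expected:
        raise ValueError(f"{view} RGB frame has shape {actual}, expected {expected}")
```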

Actions and observations

The action space is Discrete(8):

| Action | Meaning |
| --- | --- |
| 0 | no-op |
| 1 | forward |
| 2 | backward |
| 3 | step left |
| 4 | step right |
| 5 | turn left |
| 6 | turn right |
| 7 | swap the molecule under the avatar with the carried vesicle |
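Naming the indices can make policies and logs easier to read. The enum below mirrors the action table; `ChemistryAction` and `describe` are illustrative names chosen here, not identifiers exported by `mp.chemistry`.

```python
from enum import IntEnum


class ChemistryAction(IntEnum):
    """Illustrative names for the Discrete(8) action indices."""

    NOOP = 0
    FORWARD = 1
    BACKWARD = 2
    STEP_LEFT = 3
    STEP_RIGHT = 4
    TURN_LEFT = 5
    TURN_RIGHT = 6
    SWAP = 7  # swap the molecule under the avatar with the carried vesicle


def describe(action: int) -> str:
    """Return a lowercase label for an action index; ValueError if out of range."""
    return ChemistryAction(action).name.lower()
```

Because `IntEnum` members compare equal to plain integers, `ChemistryAction.SWAP` can be passed anywhere an action index is expected.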

By default, observations contain RGB only. Set observation_mode="global" to receive the global RGB view, or set observation_format to "objects" or "both" to expose the COMPOUND_GRID, AVATAR_GRID, VESICLE, and READY_TO_IO observations.

Mechanics

Compounds react stochastically when required reactants are within radius one of one another across the ground and vesicle layers. Reactions first produce an activated molecule, then transition to the product on a later frame.
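The two-phase behaviour can be pictured as a small state machine. The sketch below is a toy model, not the substrate's implementation: it assumes "radius one" means Chebyshev distance (the 8-neighbourhood), treats ground and vesicle molecules as sharing grid coordinates, and uses a hypothetical per-frame probability `rate`. `within_radius_one` and `ReactionSite` are names invented for this example.

```python
import random
from dataclasses import dataclass
from typing import Tuple

Cell = Tuple[int, int]


def within_radius_one(a: Cell, b: Cell) -> bool:
    """Assumption: 'radius one' is Chebyshev distance <= 1 (8-neighbourhood)."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1])) <= 1


@dataclass
class ReactionSite:
    """Toy two-phase reaction: idle -> activated -> product."""

    state: str = "idle"

    def step(self, reactants_near: bool, rate: float, rng: random.Random) -> str:
        if self.state == "idle":
            # Stochastic start: fires with probability `rate` when reactants are close.
            if reactants_near and rng.random() < rate:
                self.state = "activated"
        elif self.state == "activated":
            # The product appears on a later frame than the activation.
            self.state = "product"
        return self.state
```

With `rate=1.0` the site activates on the first frame the reactants are adjacent and yields the product on the next call; lower rates spread activation over several frames.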

Rewards are paid only for reactions that transition inside an avatar vesicle: metabolizing food gives reward, combining x and y gives energy and high reward, and distractor variants reward holding the distractor.
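The gating rule for reaction rewards can be sketched as a lookup that pays out only for transitions inside a vesicle. Everything here is hypothetical: the `Transition` record, the `transition_reward` function, and the product keys are illustrative, reward magnitudes are supplied by the caller rather than taken from the substrate, and the distractor holding reward is not modeled.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Transition:
    """A completed reaction transition (hypothetical record, for illustration)."""

    product: str
    in_vesicle: bool  # True if the transition happened inside an avatar's vesicle


def transition_reward(event: Transition, reward_table: Mapping[str, float]) -> float:
    """Pay the table's reward only for transitions inside a vesicle."""
    if not event.in_vesicle:
        return 0.0  # the same reaction on the ground layer pays nothing
    return float(reward_table.get(event.product, 0.0))
```

The design point is that the location of the transition, not just the product, decides whether a reward is paid, which is why carrying reactants matters more than merely standing near them.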

Reward strategy and failure modes

Agents maximize reward by carrying a vesicle that completes productive cycles: pick up scarce reactants, position near complementary molecules, wait through activation, and swap away products only when that improves the next reaction. In three-cycle and distractor variants, good play also means triaging which cycle currently has the limiting molecule and whether holding a distractor is worth the opportunity cost.

Bad equilibria include everyone chasing the same high-value molecule, agents hoarding reactants that others need to complete cycles, and distractor-heavy policies that earn small holding rewards while starving the metabolic reactions that produce larger payoffs.

Playable notebooks

Launch a playable notebook with:

uv run marimo run notebooks/chemistry/two_metabolic_cycles_mo.py

Controls:

  • WASD or arrow keys move the active player.
  • Q/E turn the active player.
  • SPACE swaps the molecule under the avatar with the carried vesicle.
  • TAB switches the controlled player.
  • ESC closes the pygame window.
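The movement controls line up with the Discrete(8) action indices. The mapping below is a sketch under the assumption that W/S move forward/backward, A/D step left/right, and the arrow keys mirror WASD; the notebook's actual bindings may differ. `KEY_TO_ACTION` and `action_for_key` are hypothetical names, and TAB and ESC are treated as interface keys that produce no environment action.

```python
# Assumed layout: W/S move forward/backward, A/D strafe; arrow keys mirror WASD.
KEY_TO_ACTION = {
    "w": 1, "up": 1,        # forward
    "s": 2, "down": 2,      # backward
    "a": 3, "left": 3,      # step left
    "d": 4, "right": 4,     # step right
    "q": 5,                 # turn left
    "e": 6,                 # turn right
    "space": 7,             # swap molecule with the carried vesicle
}


def action_for_key(key: str) -> int:
    """Return the action index for a key name, or 0 (no-op) for unmapped keys."""
    return KEY_TO_ACTION.get(key.lower(), 0)
```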