Chemistry
`mp.chemistry` provides native PettingZoo ports of Melting Pot's Chemistry
substrates. Eight players move over a bounded molecule grid, swap one molecule
at a time into a carried vesicle, and receive rewards when useful reactions
transition inside their vesicle.
Screenshots
| Variant | Screenshot |
|---|---|
| Two metabolic cycles | ![]() |
| Two metabolic cycles with distractors | ![]() |
| Three metabolic cycles | ![]() |
| Three metabolic cycles with plentiful distractors | ![]() |
API
Use the family dispatcher:
```python
from mp.chemistry import env, parallel_env

parallel = parallel_env("two_metabolic_cycles")  # PettingZoo Parallel API
aec = env("two_metabolic_cycles")                # PettingZoo AEC API
```
The dispatcher also accepts each variant's upstream Melting Pot config name in
place of the short name, for example `"chemistry__two_metabolic_cycles"`.
Direct imports are also available:
```python
from mp.chemistry.two_metabolic_cycles import TwoMetabolicCyclesConfig, parallel_env

env = parallel_env(config=TwoMetabolicCyclesConfig(), render_mode="rgb_array")
```
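As a usage sketch, the loop below drives an environment with uniformly random actions. It assumes, without confirmation from this page, that the object returned by `parallel_env(...)` follows the standard PettingZoo Parallel API (`reset`, `step`, `agents`, `action_space`). `random_rollout` is a hypothetical helper and not part of `mp.chemistry`.

```python
def random_rollout(parallel, max_steps=100, seed=0):
    """Step a PettingZoo-style parallel env with random actions.

    Returns the summed reward per agent. Assumes the standard Parallel API:
    reset() -> (observations, infos) and
    step(actions) -> (observations, rewards, terminations, truncations, infos).
    """
    observations, infos = parallel.reset(seed=seed)
    totals = {agent: 0.0 for agent in parallel.agents}
    for _ in range(max_steps):
        if not parallel.agents:  # every agent has terminated or been truncated
            break
        actions = {
            agent: parallel.action_space(agent).sample()
            for agent in parallel.agents
        }
        observations, rewards, terminations, truncations, infos = parallel.step(actions)
        for agent, reward in rewards.items():
            totals[agent] = totals.get(agent, 0.0) + float(reward)
    return totals


# Intended usage (requires the mp package to be installed):
#   from mp.chemistry import parallel_env
#   totals = random_rollout(parallel_env("two_metabolic_cycles"))
```

A random policy is only a smoke test here; it is unlikely to complete many reactions, so low totals should not be read as a bug.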
Variants
| Short name | Upstream alias | Players | RGB |
|---|---|---|---|
| `two_metabolic_cycles` | `chemistry__two_metabolic_cycles` | 8 | local (88, 88, 3), global (112, 200, 3) |
| `two_metabolic_cycles_with_distractors` | `chemistry__two_metabolic_cycles_with_distractors` | 8 | local (88, 88, 3), global (112, 200, 3) |
| `three_metabolic_cycles` | `chemistry__three_metabolic_cycles` | 8 | local (88, 88, 3), global (112, 200, 3) |
| `three_metabolic_cycles_with_plentiful_distractors` | `chemistry__three_metabolic_cycles_with_plentiful_distractors` | 8 | local (88, 88, 3), global (112, 200, 3) |
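The RGB shapes in the table can serve as a quick sanity check on rendered frames. The sketch below is illustrative only: `EXPECTED_RGB_SHAPE` and `check_rgb_shape` are hypothetical names, and the `"local"` key is taken from the table label, so whether it is also a valid `observation_mode` value is an assumption.

```python
# Shapes copied from the variants table; all four variants share them.
EXPECTED_RGB_SHAPE = {
    "local": (88, 88, 3),
    "global": (112, 200, 3),
}


def check_rgb_shape(frame, observation_mode="local"):
    """Raise ValueError if an RGB frame does not have the documented shape.

    `frame` is anything with a `.shape` attribute, such as a NumPy array.
    """
    expected = EXPECTED_RGB_SHAPE[observation_mode]
    actual = tuple(frame.shape)
    if actual != expected:
        raise ValueError(
            f"{observation_mode} RGB frame has shape {actual}, expected {expected}"
        )
    return frame
```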
Actions and observations
The action space is Discrete(8):
| Action | Meaning |
|---|---|
| 0 | no-op |
| 1 | forward |
| 2 | backward |
| 3 | step left |
| 4 | step right |
| 5 | turn left |
| 6 | turn right |
| 7 | swap the molecule under the avatar with the carried vesicle |
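For readability in agent code, the action indices above can be wrapped in an enumeration. The `Action` class and its member names below are hypothetical conveniences, not identifiers exported by `mp.chemistry`; only the integer values come from the table.

```python
from enum import IntEnum


class Action(IntEnum):
    """Named indices for the Discrete(8) action space documented above."""

    NOOP = 0
    FORWARD = 1
    BACKWARD = 2
    STEP_LEFT = 3
    STEP_RIGHT = 4
    TURN_LEFT = 5
    TURN_RIGHT = 6
    SWAP = 7  # swap the molecule under the avatar with the carried vesicle
```

Because `IntEnum` members compare equal to plain integers, `Action.SWAP` can be passed wherever the environment expects the integer `7`.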
Default observations are RGB only. Set `observation_mode="global"` for global
RGB, or `observation_format="objects"` / `"both"` to expose `COMPOUND_GRID`,
`AVATAR_GRID`, `VESICLE`, and `READY_TO_IO`.
Mechanics
Compounds react stochastically when required reactants are within radius one of
one another across the ground and vesicle layers. Reactions first produce an
activated molecule, then transition to the product on a later frame.
Rewards are paid only for reactions that transition inside an avatar's vesicle:
metabolizing food gives a reward, combining x and y gives energy and a high
reward, and distractor variants reward holding the distractor.
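The two-stage timing described above can be pictured with a toy model. This is a sketch under stated assumptions, not the substrate's implementation: it assumes "radius one" means Chebyshev distance on the grid, assumes the activated state lasts exactly one frame, uses a made-up per-frame `rate`, and ignores layers and rewards entirely. `ToyReaction` and `within_radius_one` are hypothetical names.

```python
import random


def within_radius_one(a, b):
    """True if two (row, col) cells are at most one step apart (Chebyshev)."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1])) <= 1


class ToyReaction:
    """Two-stage reaction: reactants -> activated -> product."""

    def __init__(self, rate, rng=None):
        self.rate = rate  # assumed per-frame chance of firing when eligible
        self.rng = rng or random.Random(0)
        self.state = "reactants"

    def step(self, pos_x, pos_y):
        """Advance one frame and return the state after it."""
        if self.state == "activated":
            # The product appears on a frame after activation.
            self.state = "product"
        elif self.state == "reactants":
            if within_radius_one(pos_x, pos_y) and self.rng.random() < self.rate:
                self.state = "activated"
        return self.state
```

The practical consequence for agents is the same as in the text: after reactants meet, at least one further frame must pass before the product exists, so swapping immediately can forfeit the reaction.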
Reward strategy and failure modes
Agents maximize reward by carrying a vesicle that completes productive cycles: pick up scarce reactants, position near complementary molecules, wait through activation, and swap away products only when that improves the next reaction. In three-cycle and distractor variants, good play also means triaging which cycle currently has the limiting molecule and whether holding a distractor is worth the opportunity cost.
Bad equilibria include everyone chasing the same high-value molecule, agents hoarding reactants that others need to complete cycles, and distractor-heavy policies that earn small holding rewards while starving the metabolic reactions that produce larger payoffs.
Playable notebooks
Launch a playable notebook with:
Controls:
- WASD or arrow keys move the active player.
- Q/E turn the active player.
- SPACE swaps the molecule under the avatar with the carried vesicle.
- TAB switches the controlled player.
- ESC closes the pygame window.
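The controls above line up with the documented action indices roughly as sketched below. The exact pairing is an assumption (W as forward, A/D as sidesteps, arrow keys mirroring WASD), and the key names are plain hypothetical strings rather than pygame constants. TAB and ESC are interface commands, not environment actions, so they are left out.

```python
# Assumed mapping from key names to action indices in the Discrete(8) space.
KEY_TO_ACTION = {
    "w": 1, "up": 1,        # forward
    "s": 2, "down": 2,      # backward
    "a": 3, "left": 3,      # step left
    "d": 4, "right": 4,     # step right
    "q": 5,                 # turn left
    "e": 6,                 # turn right
    "space": 7,             # swap molecule with the carried vesicle
}


def action_for_key(key):
    """Return the action index for a key name, or 0 (no-op) if unbound."""
    return KEY_TO_ACTION.get(key.lower(), 0)
```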



