
Collaborative Cooking

mp.collaborative_cooking contains native PettingZoo ports of all seven Melting Pot Collaborative Cooking layouts: asymmetric, circuit, cramped, crowded, figure_eight, forced, and ring.

Players coordinate to cook tomato soup: collect tomatoes, load a pot with three tomatoes, wait for it to cook, collect soup with a dish, and deliver it for a shared reward.

Screenshots

Variant Screenshot
Asymmetric [image: Collaborative Cooking asymmetric global state]
Circuit [image: Collaborative Cooking circuit global state]
Cramped [image: Collaborative Cooking cramped global state]
Crowded [image: Collaborative Cooking crowded global state]
Figure Eight [image: Collaborative Cooking figure-eight global state]
Forced [image: Collaborative Cooking forced global state]
Ring [image: Collaborative Cooking ring global state]

API

Use the family dispatcher:

from mp.collaborative_cooking import env, parallel_env

parallel = parallel_env("figure_eight", render_mode="rgb_array")
aec = env("figure_eight")

Upstream aliases also work:

from mp.collaborative_cooking import parallel_env

env = parallel_env("collaborative_cooking__figure_eight")

Or import a variant directly:

from mp.collaborative_cooking.cramped import (
    CollaborativeCookingCrampedConfig,
    parallel_env,
)

config = CollaborativeCookingCrampedConfig(cooking_time=20)
env = parallel_env(config=config, render_mode="rgb_array")
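
Once an environment is constructed, it can be driven like any other PettingZoo parallel environment. The following is a minimal sketch of a random-action rollout, assuming the environment follows the standard PettingZoo parallel API (reset returning observations and infos, step returning a five-tuple, and env.agents emptying when the episode ends). The helper name random_rollout is hypothetical and not part of mp.collaborative_cooking.

```python
from typing import Any, Dict


def random_rollout(env: Any, max_steps: int = 100, seed: int = 0) -> Dict[str, float]:
    """Step a PettingZoo-style parallel env with random actions.

    Returns the summed reward per agent. `env` is assumed to expose the
    standard parallel API: reset(seed=...), agents, action_space(agent), step().
    """
    observations, infos = env.reset(seed=seed)
    returns = {agent: 0.0 for agent in env.agents}
    for _ in range(max_steps):
        if not env.agents:  # every agent has terminated or been truncated
            break
        # One action per live agent, sampled from that agent's action space.
        actions = {agent: env.action_space(agent).sample() for agent in env.agents}
        observations, rewards, terminations, truncations, infos = env.step(actions)
        for agent, reward in rewards.items():
            returns[agent] = returns.get(agent, 0.0) + float(reward)
    return returns
```

With the package installed, something like random_rollout(parallel_env("figure_eight")) should work, though random play will rarely complete a soup.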

Variants

Variant Players Global RGB shape
asymmetric 2 (40, 72, 3)
circuit 2 (40, 72, 3)
cramped 2 (40, 72, 3)
crowded 9 (72, 104, 3)
figure_eight 6 (72, 128, 3)
forced 2 (40, 72, 3)
ring 2 (40, 72, 3)
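
For scripts that sweep over layouts, it can help to have the table above as data. The sketch below mirrors the table and resolves upstream-style names to short variant names. VariantSpec, VARIANTS, and normalize_variant are illustrative names, not the package's own, and the code assumes every upstream alias follows the collaborative_cooking__<variant> pattern shown for figure_eight.

```python
from typing import Dict, NamedTuple, Tuple


class VariantSpec(NamedTuple):
    players: int
    global_rgb_shape: Tuple[int, int, int]


# Values copied from the variants table; not read from the package itself.
VARIANTS: Dict[str, VariantSpec] = {
    "asymmetric": VariantSpec(2, (40, 72, 3)),
    "circuit": VariantSpec(2, (40, 72, 3)),
    "cramped": VariantSpec(2, (40, 72, 3)),
    "crowded": VariantSpec(9, (72, 104, 3)),
    "figure_eight": VariantSpec(6, (72, 128, 3)),
    "forced": VariantSpec(2, (40, 72, 3)),
    "ring": VariantSpec(2, (40, 72, 3)),
}

UPSTREAM_PREFIX = "collaborative_cooking__"  # assumed alias pattern


def normalize_variant(name: str) -> str:
    """Strip the upstream prefix, if present, and validate the variant name."""
    short = name[len(UPSTREAM_PREFIX):] if name.startswith(UPSTREAM_PREFIX) else name
    if short not in VARIANTS:
        raise ValueError(f"unknown Collaborative Cooking variant: {name!r}")
    return short
```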

Actions and observations

The action space is Discrete(8):

Action Meaning
0 no-op
1 forward
2 backward
3 step left
4 step right
5 turn left
6 turn right
7 interact
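
Raw integers are easy to mix up in scripted policies and tests. A small enum, sketched below from the table above, gives the eight actions readable names. CookingAction and describe are hypothetical helpers and are not exported by the package.

```python
from enum import IntEnum


class CookingAction(IntEnum):
    """Names for the Discrete(8) action IDs listed in the action table."""

    NOOP = 0
    FORWARD = 1
    BACKWARD = 2
    STEP_LEFT = 3
    STEP_RIGHT = 4
    TURN_LEFT = 5
    TURN_RIGHT = 6
    INTERACT = 7


def describe(action: int) -> str:
    """Return a lowercase label for an action ID; raises ValueError if out of range."""
    return CookingAction(action).name.lower()
```

Because IntEnum members compare equal to plain integers, CookingAction.INTERACT can be passed wherever an action ID is expected.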

Default per-agent observations are local RGB:

Key Shape Type
RGB (40, 40, 3) uint8

Set observation_mode="global" on the config, or in the dispatcher overrides, to give each agent the full world frame instead of the local view. state() and render_mode="rgb_array" always return global RGB, regardless of the observation mode.

Mechanics

Each avatar can hold one item. Tomato and dish dispensers are infinite, counters hold one item, and pots accept exactly three tomatoes before cooking. Interacting with a cooked pot while holding a dish produces soup. Delivering soup gives every player +20, so the reward is fully shared.
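
The pot and reward rules above can be sketched as a small state machine. This is an illustrative model, not the environment's implementation: the Pot class and delivery_rewards helper are hypothetical, cooking is assumed to start as soon as the third tomato is added, and the default cooking_time of 20 simply echoes the config example rather than a confirmed default.

```python
from dataclasses import dataclass
from typing import Dict, List

TOMATOES_PER_SOUP = 3
DELIVERY_REWARD = 20.0  # shared by every player on delivery


@dataclass
class Pot:
    cooking_time: int = 20  # assumed value; mirrors the cooking_time config field
    tomatoes: int = 0
    progress: int = 0

    @property
    def is_cooking(self) -> bool:
        return self.tomatoes == TOMATOES_PER_SOUP and self.progress < self.cooking_time

    @property
    def is_cooked(self) -> bool:
        return self.tomatoes == TOMATOES_PER_SOUP and self.progress >= self.cooking_time

    def add_tomato(self) -> bool:
        """Accept a tomato unless the pot already holds three."""
        if self.tomatoes >= TOMATOES_PER_SOUP:
            return False
        self.tomatoes += 1
        return True

    def tick(self) -> None:
        """Advance cooking by one step; a pot that is not full does nothing."""
        if self.is_cooking:
            self.progress += 1

    def collect_with_dish(self) -> bool:
        """Take the soup if it is cooked, emptying the pot for the next batch."""
        if not self.is_cooked:
            return False
        self.tomatoes = 0
        self.progress = 0
        return True


def delivery_rewards(agents: List[str]) -> Dict[str, float]:
    """Every player receives the same reward for a delivered soup."""
    return {agent: DELIVERY_REWARD for agent in agents}
```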

Layouts change the coordination problem. Small maps emphasize blocking and handoffs, forced separates roles through the map geometry, figure_eight and ring reward route planning, and crowded tests whether nine players can avoid turning a shared kitchen into a traffic jam.

Reward strategy and failure modes

Optimal teams specialize implicitly: some players feed tomatoes into pots, some stage dishes or soup handoffs, and one player delivers as soon as soup is ready. Good policies keep counters useful, avoid standing in corridors, and time dish pickup so cooked soup is never left idle.

Bad equilibria include everyone chasing the same dispenser, players blocking the pot or delivery lane, and excessive counter shuffling that creates motion without finishing soups. In crowded layouts, locally reasonable movements can converge to gridlock unless agents learn conventions for yielding and one-way traffic.

Playable notebooks

Launch any variant notebook with:

uv run marimo run notebooks/collaborative_cooking/figure_eight_mo.py

Controls are WASD or arrows to move, Q/E to turn, SPACE to interact, TAB to switch active player, and ESC to close the pygame window.
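
For reference, the controls can be read as a key-to-action table. The mapping below is an assumption inferred from the control description and the action IDs in the action table (for example, that W and the up arrow both mean forward, and A/D strafe rather than turn); the notebooks may bind keys differently. KEY_TO_ACTION and action_for_key are hypothetical names, and keys are plain strings rather than pygame constants.

```python
from typing import Dict

# Assumed bindings: movement keys -> actions 1-4, Q/E -> turns, SPACE -> interact.
KEY_TO_ACTION: Dict[str, int] = {
    "w": 1, "up": 1,        # forward
    "s": 2, "down": 2,      # backward
    "a": 3, "left": 3,      # step left
    "d": 4, "right": 4,     # step right
    "q": 5,                 # turn left
    "e": 6,                 # turn right
    "space": 7,             # interact
}


def action_for_key(key: str) -> int:
    """Map a key name to an action ID, falling back to no-op (0) for unbound keys."""
    return KEY_TO_ACTION.get(key.lower(), 0)
```

TAB and ESC are left out because they change the active player or close the window rather than selecting an environment action.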