Gift Refinements¶
gift_refinements/gift_refinements is a native PettingZoo port of Melting Pot's
Gift Refinements substrate. Six players collect tokens that regrow from a
dormant state, hold them in a three-type inventory, and either consume them for
reward or refine and gift them to another player with a forward beam.
API¶
See the generated Gift Refinements API reference for signatures and public objects.
Use the family dispatcher:
```python
from mp.gift_refinements import env, parallel_env

# Select the variant by name through the family dispatcher.
parallel = parallel_env("gift_refinements")
aec = env("gift_refinements")
```
Or import the variant directly:
```python
from mp.gift_refinements.gift_refinements import GiftRefinementsConfig, parallel_env

# Build the environment from an explicit (default) configuration.
config = GiftRefinementsConfig()
env = parallel_env(config=config, render_mode="rgb_array")
```
Agents are named player_0 through player_5. Each infos[agent] includes
meltingpot_player_index, preserving Melting Pot's 1-based player ID.
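As a minimal sketch of the naming convention, the helper below lists the six agent names and derives the 1-based index one would expect to find under meltingpot_player_index. It assumes player_0 corresponds to Melting Pot player 1, which the text implies but does not state outright; the function names are hypothetical and not part of the package.

```python
import re

NUM_PLAYERS = 6  # agents are player_0 through player_5
_AGENT_RE = re.compile(r"player_(\d+)")


def agent_names(num_players: int = NUM_PLAYERS) -> list:
    """Return the 0-based PettingZoo agent names."""
    return [f"player_{i}" for i in range(num_players)]


def expected_meltingpot_index(agent: str) -> int:
    """Map a 0-based agent name to a 1-based player ID.

    Assumption: player_0 is Melting Pot player 1, player_1 is player 2, etc.
    """
    match = _AGENT_RE.fullmatch(agent)
    if match is None:
        raise ValueError(f"unexpected agent name: {agent!r}")
    return int(match.group(1)) + 1


for name in agent_names():
    print(name, expected_meltingpot_index(name))
```

A check like this can be compared against infos[agent]["meltingpot_player_index"] after a reset or step to confirm the mapping in practice.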
Actions and observations¶
The action space is Discrete(9):
| Action | Meaning |
|---|---|
| 0 | no-op |
| 1 | forward |
| 2 | backward |
| 3 | step left |
| 4 | step right |
| 5 | turn left |
| 6 | turn right |
| 7 | refine and gift |
| 8 | consume tokens |
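For readability in policy or test code, the action IDs above can be wrapped in an enum. This is only a convenience sketch: the GiftAction name and its member names are hypothetical and are not exported by the package; the integer values are taken from the action table.

```python
from enum import IntEnum


class GiftAction(IntEnum):
    """Hypothetical names for the Discrete(9) action IDs."""

    NOOP = 0
    FORWARD = 1
    BACKWARD = 2
    STEP_LEFT = 3
    STEP_RIGHT = 4
    TURN_LEFT = 5
    TURN_RIGHT = 6
    REFINE_AND_GIFT = 7
    CONSUME = 8


NUM_ACTIONS = len(GiftAction)  # matches Discrete(9)

# IntEnum members behave like plain ints, so they can be used as actions.
example_actions = {"player_0": int(GiftAction.REFINE_AND_GIFT)}
print(NUM_ACTIONS, example_actions)
```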
Default per-agent observations are:
| Key | Shape | Type |
|---|---|---|
| RGB | (88, 88, 3) | uint8 |
| READY_TO_SHOOT | () | float64 |
| INVENTORY | (3,) | float64 |
state() and render_mode="rgb_array" return the global world RGB frame with
shape (216, 216, 3).
Mechanics¶
Live tokens regrow on unoccupied token sites. Collecting one adds a type-1 token to the collector's inventory without immediate reward. The consume action converts all held tokens into reward at one point per token.
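The collect and consume rules can be sketched as pure functions over a three-slot inventory. This is an illustrative model, not the environment's implementation: the function names are hypothetical, and the inventory is assumed to be ordered (type-1, type-2, type-3).

```python
def collect(inventory: tuple) -> tuple:
    """Collecting a token adds one type-1 token and yields no reward."""
    t1, t2, t3 = inventory
    return (t1 + 1, t2, t3)


def consume(inventory: tuple) -> tuple:
    """Consume every held token for one point each, regardless of type.

    Returns the emptied inventory and the reward.
    """
    reward = float(sum(inventory))
    return (0, 0, 0), reward


inv = (0, 0, 0)
inv = collect(inv)
inv = collect(inv)
inv, reward = consume(inv)
print(inv, reward)
```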
The gift beam has cooldown 3 and length 5. A successful gift removes one token of the highest available type from the gifter. Type-1 and type-2 tokens become five tokens of the next type for the recipient. Type-3 tokens transfer as one type-3 token.
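The gift rule can be sketched in the same style. The code below models only the inventory transfer of a successful gift and ignores beam geometry and cooldown; the function name is hypothetical, the inventory is assumed to be ordered (type-1, type-2, type-3), and treating a gift from an empty inventory as a no-op is an assumption.

```python
REFINE_MULTIPLIER = 5  # type-1 and type-2 tokens become five of the next type


def gift(gifter: tuple, recipient: tuple) -> tuple:
    """Apply one successful gift and return (new_gifter, new_recipient)."""
    giver = list(gifter)
    receiver = list(recipient)
    # Search from the highest type (index 2) down to the lowest (index 0).
    for idx in (2, 1, 0):
        if giver[idx] > 0:
            giver[idx] -= 1
            if idx < 2:
                # Refinement: one token becomes five of the next type.
                receiver[idx + 1] += REFINE_MULTIPLIER
            else:
                # Type-3 tokens transfer one-for-one.
                receiver[idx] += 1
            break
    return tuple(giver), tuple(receiver)


print(gift((1, 0, 0), (0, 0, 0)))
print(gift((2, 1, 0), (0, 0, 0)))
```

Because the highest available type is always given first, a player holding both type-1 and type-2 tokens gives a type-2 token before any type-1 token.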
Episodes can terminate stochastically after step 1000, checked every 100 steps, and truncate at step 5000 by default.
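One way to read this schedule is sketched below. The step thresholds come from the defaults above, but several details are assumptions: the per-check termination probability is not given here, so terminate_prob is a hypothetical parameter; the check is assumed to run on steps that are multiples of 100 starting at step 1000; and truncation is assumed to take priority at step 5000.

```python
import random
from typing import Optional


def episode_end(step: int, rng: random.Random, terminate_prob: float,
                min_steps: int = 1000, interval: int = 100,
                max_steps: int = 5000) -> Optional[str]:
    """Return "truncated", "terminated", or None for the given step."""
    if step >= max_steps:
        return "truncated"
    on_check_step = step >= min_steps and step % interval == 0
    if on_check_step and rng.random() < terminate_prob:
        return "terminated"
    return None


def run_length(rng: random.Random, terminate_prob: float) -> tuple:
    """Step until the episode ends; return (final_step, status)."""
    step = 0
    while True:
        step += 1
        status = episode_end(step, rng, terminate_prob)
        if status is not None:
            return step, status


print(run_length(random.Random(0), terminate_prob=0.0))
print(run_length(random.Random(0), terminate_prob=1.0))
```

With a probability of 0 the episode always runs to truncation, and with a probability of 1 it ends at the first check, which brackets the possible episode lengths under these assumptions.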
Reward strategy and failure modes¶
Agents receive immediate reward by consuming inventory, but the high-growth path is to refine low-type tokens into larger bundles for partners. Productive groups build gift chains: collect type-1 tokens, gift them into type-2 bundles, gift again into type-3 value, then consume when refinement opportunities are temporarily scarce.
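The payoff of a gift chain follows from the five-for-one refinement rule: a type-1 token consumed directly is worth 1 point, while the same token refined twice ends up as 25 type-3 tokens worth 25 points to whoever finally consumes them. The sketch below works through that arithmetic; the function names are hypothetical, and it assumes every gift lands and that all refined tokens are eventually consumed.

```python
REFINE_MULTIPLIER = 5  # each refinement turns one token into five
MAX_REFINEMENTS = 2    # type-1 -> type-2 -> type-3


def tokens_after_refinement(type1_tokens: int, refinements: int) -> int:
    """Token count after refining every token `refinements` times."""
    if not 0 <= refinements <= MAX_REFINEMENTS:
        raise ValueError("refinements must be 0, 1, or 2")
    return type1_tokens * REFINE_MULTIPLIER ** refinements


def gifts_needed(type1_tokens: int, refinements: int) -> int:
    """Gift actions required, since each gift moves exactly one token."""
    return sum(type1_tokens * REFINE_MULTIPLIER ** stage
               for stage in range(refinements))


for k in range(MAX_REFINEMENTS + 1):
    print(k, tokens_after_refinement(1, k), gifts_needed(1, k))
```

Fully refining a single type-1 token therefore takes six successful gifts (one at the first stage, five at the second), each subject to the beam cooldown, which is why the chain only pays off when partners cooperate over many steps.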
Bad equilibria include everyone consuming type-1 tokens immediately, refusing to gift because the reward lands with someone else, and reciprocal gifting loops that move value without eventually consuming it. Beam misses and cooldown mismanagement can also make refinement policies underperform simple foraging.
Playable notebook¶
Launch the playable notebook with:
Controls:
- WASD or arrow keys move the active player.
- Q/E turn the active player.
- SPACE fires the refine-and-gift beam.
- C consumes the active player's inventory.
- TAB switches the controlled player.
- ESC closes the pygame window.
The notebook exposes sliders for token regrowth and initial tokens to make interactive play easier. The environment defaults are unchanged and remain oriented toward parity with Melting Pot.