Gift Refinements

gift_refinements/gift_refinements is a native PettingZoo port of Melting Pot's Gift Refinements substrate. Six players collect tokens, which regrow after a dormant period, hold them in a three-type inventory, and then either consume them for reward or refine and gift them to another player with a forward beam.

Screenshot

[Image: Gift Refinements global state]

API

See the generated Gift Refinements API reference for signatures and public objects.

Use the family dispatcher:

from mp.gift_refinements import env, parallel_env

parallel = parallel_env("gift_refinements")
aec = env("gift_refinements")

Or import the variant directly:

from mp.gift_refinements.gift_refinements import GiftRefinementsConfig, parallel_env

config = GiftRefinementsConfig()
env = parallel_env(config=config, render_mode="rgb_array")

Agents are named player_0 through player_5. Each infos[agent] includes meltingpot_player_index, preserving Melting Pot's 1-based player ID.
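
A random-policy rollout shows how these pieces fit together. This is a minimal sketch that assumes the environment follows the standard PettingZoo parallel API (reset returns observations and infos, step returns five per-agent dicts) and that the infos returned by reset already carry meltingpot_player_index. random_rollout is a hypothetical helper name, not part of the package.

```python
def random_rollout(env, max_steps=100, seed=0):
    """Step a PettingZoo-style parallel env with random actions.

    Returns (returns, player_index): the summed reward per agent and the
    meltingpot_player_index read from each agent's info dict.
    """
    observations, infos = env.reset(seed=seed)
    returns = {agent: 0.0 for agent in env.agents}
    # Assumption: reset() infos already include meltingpot_player_index.
    player_index = {
        agent: infos[agent].get("meltingpot_player_index") for agent in env.agents
    }

    for _ in range(max_steps):
        if not env.agents:  # parallel envs empty this list when the episode ends
            break
        actions = {agent: env.action_space(agent).sample() for agent in env.agents}
        observations, rewards, terminations, truncations, infos = env.step(actions)
        for agent, reward in rewards.items():
            returns[agent] = returns.get(agent, 0.0) + float(reward)
    return returns, player_index
```

With the dispatcher import shown above, this would be called as random_rollout(parallel_env("gift_refinements")).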

Actions and observations

The action space is Discrete(9):

Action  Meaning
0       no-op
1       forward
2       backward
3       step left
4       step right
5       turn left
6       turn right
7       refine and gift
8       consume tokens
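
Named constants make scripted policies easier to read than bare integers. The enum below is only a sketch: the member names are hypothetical and chosen for this example, while the integer values come from the table above.

```python
from enum import IntEnum


class Action(IntEnum):
    """Hypothetical names for the nine discrete actions listed above."""

    NOOP = 0
    FORWARD = 1
    BACKWARD = 2
    STEP_LEFT = 3
    STEP_RIGHT = 4
    TURN_LEFT = 5
    TURN_RIGHT = 6
    REFINE_AND_GIFT = 7
    CONSUME_TOKENS = 8


# Build one joint action: everyone idles except player_0, who fires the gift beam.
actions = {f"player_{i}": int(Action.NOOP) for i in range(6)}
actions["player_0"] = int(Action.REFINE_AND_GIFT)
```

The resulting dict has the shape a parallel environment's step() expects: one integer per agent name.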

Default per-agent observations are:

Key             Shape        Type
RGB             (88, 88, 3)  uint8
READY_TO_SHOOT  ()           float64
INVENTORY       (3,)         float64
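
When wiring the environment into a training pipeline, it can help to check observations against the table above before building networks around them. The sketch below assumes each observation value is a NumPy-like array exposing .shape and .dtype. check_observation and EXPECTED_OBS are hypothetical names introduced here.

```python
# Expected (shape, dtype name) per key, copied from the observation table.
EXPECTED_OBS = {
    "RGB": ((88, 88, 3), "uint8"),
    "READY_TO_SHOOT": ((), "float64"),
    "INVENTORY": ((3,), "float64"),
}


def check_observation(obs):
    """Return a list of mismatch messages; an empty list means obs matches."""
    problems = []
    for key, (shape, dtype) in EXPECTED_OBS.items():
        if key not in obs:
            problems.append(f"missing key {key}")
            continue
        value = obs[key]
        # Assumption: values are array-like with .shape and .dtype attributes.
        if tuple(value.shape) != shape:
            problems.append(f"{key}: shape {tuple(value.shape)} != {shape}")
        if str(value.dtype) != dtype:
            problems.append(f"{key}: dtype {value.dtype} != {dtype}")
    return problems
```

Calling check_observation(observations["player_0"]) after reset() would then return an empty list if the defaults are in effect.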

state() and render() (with render_mode="rgb_array") both return the global world RGB frame with shape (216, 216, 3).

Mechanics

Live tokens regrow on unoccupied token sites. Collecting one adds a type-1 token to the collector's inventory without immediate reward. The consume action converts all held tokens into reward at one point per token.

The gift beam has cooldown 3 and length 5. A successful gift removes one token of the highest available type from the gifter. Type-1 and type-2 tokens become five tokens of the next type for the recipient. Type-3 tokens transfer as one type-3 token.
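
The inventory rules above can be modelled in a few lines. This is an illustrative sketch only, not the environment's implementation: it ignores beam geometry and cooldown, and the names Inventory, gift, and REFINE_MULTIPLIER are hypothetical.

```python
from dataclasses import dataclass, field

REFINE_MULTIPLIER = 5  # a type-1 or type-2 gift arrives as five tokens of the next type


@dataclass
class Inventory:
    """Token counts for types 1, 2, 3 stored at indices 0, 1, 2."""

    counts: list = field(default_factory=lambda: [0, 0, 0])

    def collect(self):
        self.counts[0] += 1  # collected tokens are type 1, with no immediate reward

    def consume(self):
        reward = sum(self.counts)  # one point per held token
        self.counts = [0, 0, 0]
        return reward


def gift(gifter, recipient):
    """Apply one successful gift; return False if the gifter holds nothing."""
    for idx in (2, 1, 0):  # the highest available type is gifted first
        if gifter.counts[idx] > 0:
            gifter.counts[idx] -= 1
            if idx == 2:
                recipient.counts[2] += 1  # type 3 transfers unchanged
            else:
                recipient.counts[idx + 1] += REFINE_MULTIPLIER
            return True
    return False
```

For example, if player A collects one token and gifts it to B, B holds five type-2 tokens. If B then gifts one back, A holds five type-3 tokens and B keeps four type-2 tokens.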

By default, an episode can terminate stochastically once it passes step 1000, with the termination check run every 100 steps, and it is truncated at step 5000.
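
To get a feel for how this schedule shapes episode lengths, the sketch below samples one length under stated assumptions. The per-check termination probability p_end is hypothetical, since this page does not give the real value, and placing the first check exactly at step 1000 is also an assumption.

```python
import random


def sample_episode_length(p_end, rng, min_steps=1000, interval=100, max_steps=5000):
    """Sample (length, reason) for one episode under the default schedule.

    p_end is a hypothetical per-check termination probability. The first
    check is assumed to happen at min_steps, then every `interval` steps.
    """
    step = min_steps
    while step < max_steps:
        if rng.random() < p_end:
            return step, "terminated"
        step += interval
    return max_steps, "truncated"


rng = random.Random(0)
# Illustrative only: 0.1 is a made-up probability, not the environment's setting.
lengths = [sample_episode_length(0.1, rng)[0] for _ in range(1000)]
mean_length = sum(lengths) / len(lengths)
```

Under these assumptions every sampled length is a multiple of 100 between 1000 and 5000, which is worth keeping in mind when sizing replay buffers or rollout horizons.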

Reward strategy and failure modes

Agents receive immediate reward by consuming inventory, but the high-growth path is to refine low-type tokens into larger bundles for partners. Productive groups build gift chains: collect type-1 tokens, gift them into type-2 bundles, gift again into type-3 value, then consume when refinement opportunities are temporarily scarce.
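
The arithmetic behind the gift chain can be made concrete. The sketch below follows one type-1 token through zero, one, or two refinement levels, assuming every beam hits, every token at a level is gifted before any is consumed, and consumption pays one point per token as stated in the Mechanics section. chain_outcome is a hypothetical helper.

```python
MULTIPLIER = 5  # tokens produced per refined gift, from the Mechanics section


def chain_outcome(refinements):
    """Return (group_reward, gifts_needed) for one collected type-1 token.

    refinements is the number of refinement levels applied (0, 1, or 2)
    before the resulting tokens are consumed.
    """
    if refinements not in (0, 1, 2):
        raise ValueError("only type-1 and type-2 tokens can be refined")
    tokens = 1
    gifts = 0
    for _ in range(refinements):
        gifts += tokens       # every token at this level is gifted once
        tokens *= MULTIPLIER  # each gift arrives as five next-type tokens
    return tokens, gifts      # one point per token when consumed


for level in (0, 1, 2):
    print(level, chain_outcome(level))
# 0 (1, 0)
# 1 (5, 1)
# 2 (25, 6)
```

So a single token consumed immediately is worth 1 point to the group, while a fully refined chain is worth 25 points at the cost of 6 successful gifts, each subject to the beam cooldown.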

Bad equilibria include everyone consuming type-1 tokens immediately, players refusing to gift because the reward lands with someone else, and reciprocal gifting loops that move value around without anyone ever consuming it. Beam misses and cooldown mismanagement can also make refinement policies underperform simple foraging.

Playable notebook

Launch the playable notebook with:

uv run marimo run notebooks/gift_refinements/gift_refinements_mo.py

Controls:

  • WASD or arrow keys move the active player.
  • Q/E turn the active player.
  • SPACE fires the refine-and-gift beam.
  • C consumes the active player's inventory.
  • TAB switches the controlled player.
  • ESC closes the pygame window.

The notebook exposes sliders for token regrowth and initial tokens to make manual play easier. The environment's own defaults are not affected and remain oriented toward parity with Melting Pot.