Gift Refinements¶

gift_refinements/gift_refinements is a native PettingZoo port of Melting Pot's Gift Refinements substrate. Six players collect tokens that regrow after a dormant period and hold them in a three-type inventory. They can then either consume them for reward or refine and gift them to another player with a forward beam.

Screenshot¶

Gift Refinements global state

API¶

See the generated Gift Refinements API reference for signatures and public objects.

Use the family dispatcher:

```python
from mp.gift_refinements import env, parallel_env

parallel = parallel_env("gift_refinements")  # parallel API
aec = env("gift_refinements")  # AEC (turn-based) API
```

Or import the variant directly:

```python
from mp.gift_refinements.gift_refinements import GiftRefinementsConfig, parallel_env

config = GiftRefinementsConfig()  # default configuration
env = parallel_env(config=config, render_mode="rgb_array")
```

Agents are named player_0 through player_5. Each infos[agent] includes meltingpot_player_index, preserving Melting Pot's 1-based player ID.
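A minimal rollout sketch, assuming the environment follows the standard PettingZoo parallel API (`reset` returning observations and infos, `step` returning five per-agent dicts, and `action_space(agent)`). The helper name `random_rollout` is hypothetical and not part of the package. Whether `meltingpot_player_index` is already present in the infos returned by `reset` is also an assumption, so the helper tolerates its absence.

```python
def random_rollout(env, max_steps=100, seed=0):
    """Step a PettingZoo-style parallel env with random actions.

    Returns (total reward per agent, last reported player index per agent).
    """
    observations, infos = env.reset(seed=seed)
    returns = {agent: 0.0 for agent in env.agents}
    player_index = {}

    def record_indices(current_infos):
        # Keep the most recent meltingpot_player_index seen for each agent.
        for agent, info in current_infos.items():
            if "meltingpot_player_index" in info:
                player_index[agent] = info["meltingpot_player_index"]

    record_indices(infos)
    for _ in range(max_steps):
        if not env.agents:  # every agent has terminated or been truncated
            break
        actions = {agent: env.action_space(agent).sample() for agent in env.agents}
        observations, rewards, terminations, truncations, infos = env.step(actions)
        for agent, reward in rewards.items():
            returns[agent] = returns.get(agent, 0.0) + reward
        record_indices(infos)

    return returns, player_index
```

With the family dispatcher, this could be called as `random_rollout(parallel_env("gift_refinements"))`.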

Actions and observations¶

The action space is Discrete(9):

| Action | Meaning |
|---|---|
| `0` | no-op |
| `1` | forward |
| `2` | backward |
| `3` | step left |
| `4` | step right |
| `5` | turn left |
| `6` | turn right |
| `7` | refine and gift |
| `8` | consume tokens |
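For readability in scripted policies, the action indices can be wrapped in an enum. This is only a sketch: the class and member names below are hypothetical labels for the documented meanings, not identifiers exported by the package.

```python
from enum import IntEnum


class Action(IntEnum):
    """Hypothetical names for the Discrete(9) action indices."""

    NOOP = 0
    FORWARD = 1
    BACKWARD = 2
    STEP_LEFT = 3
    STEP_RIGHT = 4
    TURN_LEFT = 5
    TURN_RIGHT = 6
    REFINE_AND_GIFT = 7
    CONSUME_TOKENS = 8


# Convert to plain ints before handing the dict to step().
actions = {
    "player_0": int(Action.CONSUME_TOKENS),
    "player_1": int(Action.REFINE_AND_GIFT),
}
```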

Default per-agent observations are:

| Key | Shape | Type |
|---|---|---|
| `RGB` | `(88, 88, 3)` | `uint8` |
| `READY_TO_SHOOT` | `()` | `float64` |
| `INVENTORY` | `(3,)` | `float64` |

state() and render() with render_mode="rgb_array" return the global world RGB frame with shape (216, 216, 3).
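A small sanity check for the default observation layout might look like the following. It is a sketch: `OBSERVATION_SPEC` and `check_observation` are hypothetical names, and the values are only assumed to expose NumPy-style `.shape` and `.dtype` attributes.

```python
# Expected per-agent observation layout, copied from the table of defaults.
OBSERVATION_SPEC = {
    "RGB": ((88, 88, 3), "uint8"),
    "READY_TO_SHOOT": ((), "float64"),
    "INVENTORY": ((3,), "float64"),
}


def check_observation(obs):
    """Return a list of mismatches against OBSERVATION_SPEC (empty if none)."""
    problems = []
    for key, (shape, dtype) in OBSERVATION_SPEC.items():
        if key not in obs:
            problems.append(f"missing key {key!r}")
            continue
        value = obs[key]
        if tuple(value.shape) != shape:
            problems.append(f"{key}: shape {tuple(value.shape)} != {shape}")
        if str(value.dtype) != dtype:
            problems.append(f"{key}: dtype {value.dtype} != {dtype}")
    return problems
```

Running it on one agent's observation after `reset` is a quick way to notice a non-default configuration.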

Mechanics¶

Live tokens regrow on unoccupied token sites. Collecting one adds a type-1 token to the collector's inventory without immediate reward. The consume action converts all held tokens into reward at one point per token.

The gift beam has cooldown 3 and length 5. A successful gift removes one token of the highest available type from the gifter. Type-1 and type-2 tokens become five tokens of the next type for the recipient. Type-3 tokens transfer as one type-3 token.
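The inventory rules above can be sketched in plain Python. This is an illustration under stated assumptions, not the environment's implementation: it ignores beam aiming, cooldown, and any inventory cap, and the function names are hypothetical. Inventories are lists of three counts, indexed by type minus one.

```python
GIFT_MULTIPLIER = 5  # type-1 and type-2 tokens become five of the next type
NUM_TYPES = 3


def collect(inventory):
    """Picking up a token adds one type-1 token and gives no reward."""
    inventory[0] += 1


def consume(inventory):
    """Convert every held token into reward, one point per token."""
    reward = sum(inventory)
    inventory[:] = [0] * NUM_TYPES
    return reward


def gift(gifter, recipient):
    """Apply one successful gift; return False if the gifter holds nothing."""
    for type_index in reversed(range(NUM_TYPES)):  # highest available type first
        if gifter[type_index] > 0:
            gifter[type_index] -= 1
            if type_index < NUM_TYPES - 1:
                recipient[type_index + 1] += GIFT_MULTIPLIER  # refine upward
            else:
                recipient[type_index] += 1  # type-3 transfers unchanged
            return True
    return False


# One collected token passed through a full two-step gift chain.
a, b = [0, 0, 0], [0, 0, 0]
collect(a)           # a holds one type-1 token
gift(a, b)           # b receives five type-2 tokens
for _ in range(5):
    gift(b, a)       # each type-2 gift gives a five type-3 tokens
chain_reward = consume(a)
```

Under these rules the single token is worth 25 points after the chain, against 1 point if it is consumed immediately.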

By default, episodes can terminate stochastically after step 1000, with the termination check run every 100 steps, and are truncated at step 5000.
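The episode-length schedule can be sketched as follows. Several details are assumptions: the per-check termination probability `end_probability` is a hypothetical parameter (no value is given here), checks are assumed to fall on steps 1000, 1100, 1200 and so on, and truncation is assumed to take precedence at step 5000.

```python
import random

MIN_STEPS = 1000      # earliest step at which stochastic termination is checked
CHECK_INTERVAL = 100  # steps between termination checks
MAX_STEPS = 5000      # default truncation step


def episode_status(step, rng, end_probability):
    """Return "truncated", "terminated", or None for the given step count.

    end_probability is a hypothetical per-check chance of ending the episode.
    """
    if step >= MAX_STEPS:
        return "truncated"
    if step >= MIN_STEPS and (step - MIN_STEPS) % CHECK_INTERVAL == 0:
        if rng.random() < end_probability:
            return "terminated"
    return None


def sample_episode_length(rng, end_probability):
    """Simulate one episode's length under the schedule above."""
    step = 0
    while True:
        step += 1
        status = episode_status(step, rng, end_probability)
        if status is not None:
            return step, status


length, status = sample_episode_length(random.Random(0), end_probability=0.1)
```

Any sampled length is either a multiple of 100 between 1000 and 4900 (terminated) or exactly 5000 (truncated), which is useful to keep in mind when comparing returns across episodes.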

Reward strategy and failure modes¶

Agents receive immediate reward by consuming inventory, but the high-growth path is to refine low-type tokens into larger bundles for partners. Productive groups build gift chains: collect type-1 tokens, gift them into type-2 bundles, gift again into type-3 value, then consume when refinement opportunities are temporarily scarce.

Bad equilibria include everyone consuming type-1 tokens immediately, refusing to gift because the reward lands with someone else, and reciprocal gifting loops that move value without eventually consuming it. Beam misses and cooldown mismanagement can also make refinement policies underperform simple foraging.

Playable notebook¶

Launch the playable notebook with:

```shell
uv run marimo run notebooks/gift_refinements/gift_refinements_mo.py
```

Controls:

WASD or arrow keys move the active player.
Q/E turn the active player.
SPACE fires the refine-and-gift beam.
C consumes the active player's inventory.
TAB switches the controlled player.
ESC closes the pygame window.

The notebook exposes sliders for token regrowth and initial tokens to make manual play easier. The environment defaults are unaffected and remain oriented toward parity with Melting Pot.