
Commons Harvest

Commons Harvest is available as a native three-variant PettingZoo family:

| Variant | Upstream alias | Default regrowth probabilities |
| --- | --- | --- |
| open | commons_harvest__open | (0.0, 0.0025, 0.005, 0.025) |
| closed | commons_harvest__closed | (0.0, 0.001, 0.005, 0.025) |
| partnership | commons_harvest__partnership | (0.0, 0.001, 0.005, 0.025) |

All variants use seven agents, the same action order, apple-density regrowth, zapping, respawn, stochastic termination, native RGB rendering, and optional global observations. The partnership variant renders its I corridor tiles as normal passable floor under the default roles.

Screenshots

| Variant | Screenshot |
| --- | --- |
| Open | [Image: Commons Harvest open global state] |
| Closed | [Image: Commons Harvest closed global state] |
| Partnership | [Image: Commons Harvest partnership global state] |

API

See the generated Commons Harvest API reference for signatures and public objects.

Use the family dispatcher:

from mp.commons_harvest import env, parallel_env

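# As the two calls show, the dispatcher takes either the short variant
# name ("closed") or the upstream alias ("commons_harvest__partnership").
parallel = parallel_env("closed", render_mode="rgb_array")
aec = env("commons_harvest__partnership")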

Or import a variant directly:

from mp.commons_harvest.partnership import (
    CommonsHarvestPartnershipConfig,
    parallel_env,
)

config = CommonsHarvestPartnershipConfig(observation_mode="global")
env = parallel_env(config=config)
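
Once an environment exists, a random rollout is a quick smoke test. The sketch below is a hypothetical helper, not part of the package: `run_random_episode` and `max_steps` are names introduced here, and it assumes only the standard PettingZoo parallel API (`reset(seed=...)` returning observations and infos, `step(actions)` returning a five-tuple, `agents`, and `action_space(agent).sample()`).

```python
from typing import Any, Dict


def run_random_episode(env: Any, seed: int = 0, max_steps: int = 1000) -> Dict[str, float]:
    """Step a PettingZoo parallel env with random actions; return per-agent returns."""
    observations, infos = env.reset(seed=seed)
    returns = {agent: 0.0 for agent in env.agents}

    for _ in range(max_steps):
        # PettingZoo parallel envs empty `agents` once every agent is done.
        if not env.agents:
            break
        actions = {agent: env.action_space(agent).sample() for agent in env.agents}
        observations, rewards, terminations, truncations, infos = env.step(actions)
        for agent, reward in rewards.items():
            returns[agent] = returns.get(agent, 0.0) + float(reward)

    return returns
```

Passing the object returned by `parallel_env(...)` should work if it follows the parallel API as assumed. The `max_steps` cap guards against long episodes, since termination is stochastic.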

Agents are named player_0 through player_6. Each infos[agent] includes meltingpot_player_index, preserving Melting Pot's 1-based player ID.
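
The naming convention can be sketched as a small lookup. This assumes that `player_i` corresponds to Melting Pot player ID `i + 1`, which the text implies but does not state outright. `expected_meltingpot_index` is a hypothetical helper; in real code, read `infos[agent]["meltingpot_player_index"]` instead of deriving it.

```python
import re

_AGENT_PATTERN = re.compile(r"^player_(\d+)$")


def expected_meltingpot_index(agent: str) -> int:
    """Return the 1-based ID assumed for a 0-based agent name like 'player_3'."""
    match = _AGENT_PATTERN.match(agent)
    if match is None:
        raise ValueError(f"unexpected agent name: {agent!r}")
    # Assumption: the 1-based Melting Pot ID is the 0-based suffix plus one.
    return int(match.group(1)) + 1


agents = [f"player_{i}" for i in range(7)]  # seven agents, player_0..player_6
index_by_agent = {agent: expected_meltingpot_index(agent) for agent in agents}
```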

Actions and observations

The action space is Discrete(8):

| Action | Meaning |
| --- | --- |
| 0 | no-op |
| 1 | forward |
| 2 | backward |
| 3 | step left |
| 4 | step right |
| 5 | turn left |
| 6 | turn right |
| 7 | fire zap |
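
For readable policy code, the action table can be mirrored as an enum. The `Action` class and its member names are illustrative and not exported by the package; only the integer values come from the table.

```python
from enum import IntEnum


class Action(IntEnum):
    """Hypothetical names for the Discrete(8) action indices."""
    NOOP = 0
    FORWARD = 1
    BACKWARD = 2
    STEP_LEFT = 3
    STEP_RIGHT = 4
    TURN_LEFT = 5
    TURN_RIGHT = 6
    FIRE_ZAP = 7


def describe(action: int) -> str:
    """Turn an action index into a lowercase label."""
    return Action(action).name.lower().replace("_", " ")


# Build a joint action: everyone idles except player_0, who fires.
actions = {f"player_{i}": int(Action.NOOP) for i in range(7)}
actions["player_0"] = int(Action.FIRE_ZAP)
```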

Default per-agent observations are:

| Key | Shape | Type |
| --- | --- | --- |
| RGB | (88, 88, 3) | uint8 |
| READY_TO_SHOOT | () | float64 |

state() and render_mode="rgb_array" return the global world RGB frame with shape (144, 192, 3). Pass observation_mode="global" to return that frame as each agent's RGB observation.

Mechanics

A live apple gives the collecting player +1 reward and then becomes dormant (enters a wait state). Dormant apples regrow according to local density: the number of live apples within L2 (Euclidean) radius 2 selects an entry from the variant's regrowth probability table.
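
The density rule can be sketched as a table lookup. This is a minimal illustration, not the environment's implementation: it assumes the table is indexed by the live-neighbor count, that counts beyond the last index are clamped to the last entry, and that the dormant cell itself is excluded. `neighborhood_offsets` and `regrowth_probability` are hypothetical names.

```python
from typing import List, Sequence, Set, Tuple

Cell = Tuple[int, int]

# Default table for the open variant, taken from the variant table above.
OPEN_REGROWTH = (0.0, 0.0025, 0.005, 0.025)


def neighborhood_offsets(radius: int = 2) -> List[Cell]:
    """Offsets within Euclidean distance `radius`, excluding the center cell."""
    return [
        (dx, dy)
        for dx in range(-radius, radius + 1)
        for dy in range(-radius, radius + 1)
        if (dx, dy) != (0, 0) and dx * dx + dy * dy <= radius * radius
    ]


def regrowth_probability(
    cell: Cell,
    live_apples: Set[Cell],
    table: Sequence[float] = OPEN_REGROWTH,
    radius: int = 2,
) -> float:
    """Per-step regrowth probability for a dormant apple at `cell`."""
    live_neighbors = sum(
        (cell[0] + dx, cell[1] + dy) in live_apples
        for dx, dy in neighborhood_offsets(radius)
    )
    # Assumption: neighbor counts past the end of the table use the last entry.
    return table[min(live_neighbors, len(table) - 1)]
```

Under these assumptions a dormant apple with no live neighbor in range never regrows, because the first table entry is 0.0. This is the mechanism behind the collapse described in the failure modes.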

The zap action has cooldown 2, beam length 3, and beam radius 1. Walls block beams. Hit avatars wait for 4 frames, then respawn at normal spawn points. Zap reward and penalty are both 0.
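
The beam geometry can be pictured with a short sketch. It reflects one plausible reading of "length 3, radius 1" and is not confirmed by this page: the beam travels up to 3 cells forward in a center lane plus one lane on each side, and each lane stops at the first wall. `beam_cells` and its grid convention are hypothetical.

```python
from typing import List, Set, Tuple

Cell = Tuple[int, int]


def beam_cells(
    origin: Cell,
    facing: Cell,
    walls: Set[Cell],
    length: int = 3,
    radius: int = 1,
) -> List[Cell]:
    """Cells covered by a zap fired from `origin` toward unit vector `facing`."""
    fx, fy = facing
    px, py = -fy, fx  # unit vector perpendicular to the facing direction
    cells: List[Cell] = []
    for lane in range(-radius, radius + 1):
        start = (origin[0] + lane * px, origin[1] + lane * py)
        # Assumption: a side lane that starts inside a wall emits nothing.
        if lane != 0 and start in walls:
            continue
        for step in range(1, length + 1):
            cell = (start[0] + step * fx, start[1] + step * fy)
            if cell in walls:
                break  # walls block the rest of this lane
            cells.append(cell)
    return cells
```

With no walls this covers 9 cells. A wall directly in the center lane shortens only that lane in this sketch.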

Reward strategy and failure modes

Players maximize long-run reward by harvesting apples without driving local density below the regrowth threshold. Good policies spread out, leave seed clusters alive, and move to richer patches rather than stripping the nearest tree line. Zaps are mainly useful for space control because they do not directly pay reward.

Bad equilibria are over-harvesting and territorial exclusion. If every player takes the nearest apple, low-density regions stop regrowing and the commons collapses. If players use zaps to monopolize orchards, they can reduce competition locally while lowering total harvesting time and making recovery from depleted patches slower.

Playable notebooks

Launch a playable notebook with:

uv run marimo run notebooks/commons_harvest/open_mo.py
uv run marimo run notebooks/commons_harvest/closed_mo.py
uv run marimo run notebooks/commons_harvest/partnership_mo.py

Controls:

  • WASD or arrow keys move the active player.
  • Q/E turn the active player.
  • SPACE fires the zap beam.
  • TAB switches the controlled player.
  • ESC closes the pygame window.

The notebooks expose a higher apple regrowth scale for playability. The environment defaults are unchanged and remain oriented toward parity with upstream Melting Pot.