Territory

`mp.territory` provides native PettingZoo ports of Melting Pot's Territory variants. Players claim resource walls with a paintbrush or a short paint beam, wait for the paint to dry, and then receive stochastic reward from their active territory.

Screenshots

[Screenshots of the Open, Rooms, and Inside Out variants]

API

See the generated Territory API reference for signatures and public objects.

Use the family dispatcher:

from mp.territory import env, parallel_env

parallel = parallel_env("open")  # PettingZoo Parallel API environment
aec = env("rooms")  # PettingZoo AEC (agent-environment-cycle) environment

Short names and upstream Melting Pot config names are accepted:

Short name	Upstream alias	Players	Topology	Global RGB
`open`	`territory__open`	9	bounded	`(184, 312, 3)`
`rooms`	`territory__rooms`	9	torus	`(168, 168, 3)`
`inside_out`	`territory__inside_out`	5	bounded	`(184, 184, 3)`
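
The accepted names can be summarized in a small lookup. The sketch below is illustrative only: `VariantInfo`, `VARIANTS`, and `resolve_variant` are hypothetical names that are not part of `mp.territory`, the data is copied from the table above, and the prefix-stripping rule is an assumption about how aliases relate to short names, not a description of the dispatcher's internals.

```python
from typing import NamedTuple


class VariantInfo(NamedTuple):
    players: int
    topology: str
    global_rgb: tuple[int, int, int]


# Values copied from the table above. This dict is a hypothetical helper,
# not an object exported by mp.territory.
VARIANTS = {
    "open": VariantInfo(9, "bounded", (184, 312, 3)),
    "rooms": VariantInfo(9, "torus", (168, 168, 3)),
    "inside_out": VariantInfo(5, "bounded", (184, 184, 3)),
}

UPSTREAM_PREFIX = "territory__"


def resolve_variant(name: str) -> str:
    """Map a short name or an upstream alias to the short name."""
    short = name.removeprefix(UPSTREAM_PREFIX)  # requires Python 3.9+
    if short not in VARIANTS:
        raise ValueError(f"unknown Territory variant: {name!r}")
    return short
```

A lookup like this is handy for sizing policies or replay buffers before any environment is constructed, for example `VARIANTS[resolve_variant("territory__rooms")].players`.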

Direct imports are also available:

from mp.territory.open import TerritoryOpenConfig, parallel_env

config = TerritoryOpenConfig()
env = parallel_env(config=config, render_mode="rgb_array")
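
Once an environment is constructed, it can be driven like any other PettingZoo parallel environment. The following is a minimal sketch of a random rollout, assuming a recent PettingZoo release in which `reset` returns `(observations, infos)` and `step` returns five dictionaries. `random_rollout` is a hypothetical helper, not part of `mp.territory`.

```python
def random_rollout(parallel, max_steps=100, seed=0):
    """Step a PettingZoo parallel env with random actions.

    Returns a dict mapping each agent name to its summed reward.
    """
    observations, infos = parallel.reset(seed=seed)
    returns = {agent: 0.0 for agent in parallel.agents}
    for _ in range(max_steps):
        if not parallel.agents:  # every agent has terminated or been truncated
            break
        # Sample one action per live agent from its own action space.
        actions = {
            agent: parallel.action_space(agent).sample()
            for agent in parallel.agents
        }
        observations, rewards, terminations, truncations, infos = parallel.step(actions)
        for agent, reward in rewards.items():
            returns[agent] = returns.get(agent, 0.0) + reward
    return returns
```

With the import shown above, this would be called as `random_rollout(parallel_env(config=config))`. Because reward arrives stochastically and only after paint dries, short random rollouts may well return zero for every agent.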

Agents are named `player_0`, `player_1`, and so on. Each `infos[agent]` includes `meltingpot_player_index`, preserving Melting Pot's 1-based player ID.
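
When logging or comparing against upstream Melting Pot results, it can help to build the name-to-index mapping once from the `infos` dictionary. This is a small sketch that relies only on the `meltingpot_player_index` key described above. `player_index_map` and `agent_for_index` are hypothetical helper names.

```python
def player_index_map(infos):
    """Return {agent_name: Melting Pot player index} read from an infos dict."""
    return {
        agent: info["meltingpot_player_index"]
        for agent, info in infos.items()
    }


def agent_for_index(infos, player_index):
    """Return the agent name that carries the given Melting Pot player index."""
    for agent, index in player_index_map(infos).items():
        if index == player_index:
            return agent
    raise KeyError(f"no agent with meltingpot_player_index={player_index}")
```

Reading the index from `infos` avoids hard-coding any offset between the agent name suffix and the upstream ID.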

Actions and observations

The action space is `Discrete(9)`:

Action	Meaning
`0`	no-op
`1`	forward
`2`	backward
`3`	step left
`4`	step right
`5`	turn left
`6`	turn right
`7`	fire zap
`8`	fire claim paint
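
For scripted policies, naming the action IDs keeps code readable. The enum below mirrors the table above; `TerritoryAction` and its member names are hypothetical and are not exported by `mp.territory`.

```python
from enum import IntEnum


class TerritoryAction(IntEnum):
    """Hypothetical names for the nine discrete action IDs listed above."""

    NOOP = 0
    FORWARD = 1
    BACKWARD = 2
    STEP_LEFT = 3
    STEP_RIGHT = 4
    TURN_LEFT = 5
    TURN_RIGHT = 6
    FIRE_ZAP = 7
    FIRE_CLAIM = 8


# Build an action dict for a parallel env step: player_0 paints, player_1 waits.
actions = {
    "player_0": int(TerritoryAction.FIRE_CLAIM),
    "player_1": int(TerritoryAction.NOOP),
}
```

Converting with `int(...)` keeps the values plain integers, which is what a `Discrete` space expects.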

Default observations are:

Key	Shape	Type
`RGB`	`(88, 88, 3)`	`uint8`
`READY_TO_SHOOT`	`()`	`float64`

Set `observation_mode="global"` to return the world RGB frame for every agent.

Mechanics

Resource walls start unclaimed. A player claims a resource either by facing it with the paintbrush or by firing the claim beam. Claimed paint dries after 25 steps by default; once dry, the resource is active and delivers +1 reward stochastically at a rate of 0.01 per step.
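
To get a feel for the reward scale, the defaults above can be turned into a back-of-the-envelope expectation. This sketch assumes that every resource is claimed at step 0, starts paying as soon as the 25-step drying period ends, pays +1 with probability 0.01 on each later step, and is never zapped or repainted. `expected_reward` is a hypothetical helper, not part of the library.

```python
def expected_reward(num_resources, horizon_steps, dry_steps=25, reward_rate=0.01):
    """Expected total reward from resources claimed at step 0 and held undisturbed.

    Simplifying assumptions (not taken from the library): no reward while the
    paint is drying, then an independent +1 with probability `reward_rate`
    on every remaining step.
    """
    active_steps = max(0, horizon_steps - dry_steps)
    return num_resources * active_steps * reward_rate
```

Under these assumptions, ten resources held for 1000 steps yield about 97.5 reward in expectation, while any number of resources held for fewer than 25 steps yields nothing. That gap is why defending paint until it dries matters more than raw claim count.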

Zap beams damage resources and sanction players. A resource is permanently destroyed after two zap hits and becomes passable floor. A player is frozen by the first zap sanction and permanently removed by the second. Removed players' claimed resources return to the unclaimed state.
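
The zap rules above amount to two small counters. The following is a simplified model for reasoning about them, not the environment's implementation: `Resource` and `Player` are hypothetical classes, and the sketch ignores details the text does not specify, such as how long a freeze lasts or whether hit counts ever reset.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    zap_hits: int = 0

    @property
    def destroyed(self) -> bool:
        # Two zap hits permanently turn the resource into passable floor.
        return self.zap_hits >= 2

    def zap(self) -> None:
        if not self.destroyed:
            self.zap_hits += 1


@dataclass
class Player:
    sanctions: int = 0
    removed: bool = False

    def sanction(self) -> str:
        """Apply one zap sanction and return its outcome."""
        if self.removed:
            return "removed"
        self.sanctions += 1
        # First sanction freezes the player; the second removes them for good.
        self.removed = self.sanctions >= 2
        return "removed" if self.removed else "frozen"
```

In this model a removed player's claimed resources would also need to be reset to unclaimed, which is left out here to keep the sketch short.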

Reward strategy and failure modes

Agents maximize reward by claiming resources, defending them long enough for paint to dry, and expanding only when the expected active-territory reward beats the cost of travel and conflict. Good policies repair contested borders, avoid destroying valuable resources, and sanction selectively when it protects more future reward than it removes.

Bad equilibria include mutually destructive zap wars that turn resources into rubble, over-expansion that leaves claimed walls undefended, and local truces where everyone paints small safe corners while high-value contested resources remain inactive. Because a removed player's resources revert to unclaimed, repeated sanctions can also lower total resource yield for everyone.

Playable notebooks

Launch a variant notebook with:

uv run marimo run notebooks/territory/open_mo.py

Controls are shared across the family: WASD/arrows move, Q/E turn, SPACE fires the zap beam, F or left Shift fires claim paint, TAB switches the active player, and ESC closes the pygame window.