Territory

mp.territory provides native PettingZoo ports of Melting Pot's Territory variants. Players claim resource walls with a paintbrush or short paint beam, wait for paint to dry, and receive stochastic reward from active territory.

Screenshots

Variant      Screenshot
Open         Territory open global state
Rooms        Territory rooms global state
Inside Out   Territory inside-out global state

API

See the generated Territory API reference for signatures and public objects.

Use the family dispatcher:

from mp.territory import env, parallel_env

parallel = parallel_env("open")
aec = env("rooms")

Short names and upstream Melting Pot config names are accepted:

Short name   Upstream alias          Players   Topology   Global RGB shape
open         territory__open         9         bounded    (184, 312, 3)
rooms        territory__rooms        9         torus      (168, 168, 3)
inside_out   territory__inside_out   5         bounded    (184, 184, 3)
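
The table can be read as a lookup from either name to a variant. The sketch below is illustrative only: `VariantInfo`, `VARIANTS`, and `resolve_variant` are hypothetical names that are not part of mp.territory, and the dispatcher may resolve names differently. The values are copied from the table.

```python
from typing import NamedTuple, Tuple


class VariantInfo(NamedTuple):
    upstream_alias: str
    players: int
    topology: str
    global_rgb_shape: Tuple[int, int, int]


# Hypothetical lookup table; values copied from the variant table in the text.
VARIANTS = {
    "open": VariantInfo("territory__open", 9, "bounded", (184, 312, 3)),
    "rooms": VariantInfo("territory__rooms", 9, "torus", (168, 168, 3)),
    "inside_out": VariantInfo("territory__inside_out", 5, "bounded", (184, 184, 3)),
}


def resolve_variant(name: str) -> str:
    """Return the short name for either a short name or an upstream alias."""
    if name in VARIANTS:
        return name
    for short_name, info in VARIANTS.items():
        if info.upstream_alias == name:
            return short_name
    raise ValueError(f"unknown Territory variant: {name!r}")
```

Under this sketch, `resolve_variant("territory__rooms")` and `resolve_variant("rooms")` both give `"rooms"`, which mirrors the statement that the dispatcher accepts either form.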

Direct imports are also available:

from mp.territory.open import TerritoryOpenConfig, parallel_env

config = TerritoryOpenConfig()
env = parallel_env(config=config, render_mode="rgb_array")

Agents are named player_0, player_1, and so on. Each infos[agent] includes meltingpot_player_index, preserving Melting Pot's 1-based player ID.
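
A random rollout shows how the agent names and the info field fit together. This is a minimal sketch, assuming the environments follow the standard PettingZoo Parallel API (`reset` returning observations and infos, `step` returning five dicts) and that `meltingpot_player_index` is already present in the infos returned by `reset`. The helper name `random_rollout` is hypothetical.

```python
def random_rollout(env, max_steps=100, seed=0):
    """Step a PettingZoo-style parallel env with random actions.

    Returns per-agent reward totals and the Melting Pot player index
    read from each agent's info dict (None if the key is absent).
    """
    observations, infos = env.reset(seed=seed)
    returns = {agent: 0.0 for agent in env.agents}
    indices = {
        agent: infos[agent].get("meltingpot_player_index") for agent in env.agents
    }
    for _ in range(max_steps):
        if not env.agents:  # every agent has terminated or been truncated
            break
        # Sample one action per live agent from its own action space.
        actions = {agent: env.action_space(agent).sample() for agent in env.agents}
        observations, rewards, terminations, truncations, infos = env.step(actions)
        for agent, reward in rewards.items():
            returns[agent] = returns.get(agent, 0.0) + float(reward)
    return returns, indices
```

With the package installed, something like `random_rollout(parallel_env("open"))` would be the expected call, though the exact constructor arguments are covered by the API reference rather than this sketch.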

Actions and observations

The action space is Discrete(9):

Action Meaning
0 no-op
1 forward
2 backward
3 step left
4 step right
5 turn left
6 turn right
7 fire zap
8 fire claim paint
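
For readable policy code, the action table can be mirrored as an enum. The `TerritoryAction` class and `describe` helper below are hypothetical conveniences, not objects exported by mp.territory; only the integer values come from the table.

```python
from enum import IntEnum


class TerritoryAction(IntEnum):
    """Hypothetical names for the Discrete(9) action indices listed in the text."""

    NOOP = 0
    FORWARD = 1
    BACKWARD = 2
    STEP_LEFT = 3
    STEP_RIGHT = 4
    TURN_LEFT = 5
    TURN_RIGHT = 6
    FIRE_ZAP = 7
    FIRE_CLAIM = 8


def describe(action: int) -> str:
    """Turn an action index into a lowercase, human-readable label."""
    return TerritoryAction(action).name.lower().replace("_", " ")
```

Because `IntEnum` members compare equal to plain integers, a value such as `TerritoryAction.FIRE_CLAIM` can be placed directly in an actions dict.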

Default observations are:

Key Shape Type
RGB (88, 88, 3) uint8
READY_TO_SHOOT () float64

Set observation_mode="global" to return the world RGB frame for every agent.

Mechanics

Resource walls start unclaimed. A player claims a resource either by facing it with the paintbrush or by firing the claim beam. Claimed paint dries after 25 steps by default. Once dry, the resource is active and delivers +1 reward to its owner stochastically, at a default rate of 0.01 per step.
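
The default numbers give a rough sense of how slowly territory pays off. The sketch below assumes each active resource pays out as an independent per-step Bernoulli draw with probability 0.01, and that a resource claimed at step t first becomes eligible at step t + 25; both are modeling assumptions rather than confirmed implementation details, and the function names are hypothetical.

```python
import random

DRY_STEPS = 25      # default drying time stated in the text
REWARD_RATE = 0.01  # default per-step reward rate stated in the text


def expected_reward(num_resources, claim_step, horizon,
                    dry_steps=DRY_STEPS, rate=REWARD_RATE):
    """Expected total reward if the claims are held, undisturbed, until `horizon`."""
    active_steps = max(0, horizon - claim_step - dry_steps)
    return num_resources * active_steps * rate


def simulate_reward(num_resources, claim_step, horizon, rng: random.Random,
                    dry_steps=DRY_STEPS, rate=REWARD_RATE):
    """Sample one total reward under the same Bernoulli assumption."""
    total = 0
    for _step in range(claim_step + dry_steps, horizon):
        for _resource in range(num_resources):
            if rng.random() < rate:
                total += 1
    return total
```

Under these assumptions, ten resources claimed at step 0 and held for 1,025 steps are active for 1,000 steps each, for an expected total of about 100 reward. A claim that is contested or lost before the paint dries contributes nothing.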

Zap beams damage resources and sanction players. A resource is permanently destroyed after two zap hits and becomes passable floor. A player is frozen by the first zap sanction and permanently removed by the second. Removed players' claimed resources return to the unclaimed state.
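
The zap rules amount to two small counters. The sketch below is a toy model under stated assumptions, not the environment's implementation: `Resource`, `Player`, and `sanction` are hypothetical names, freeze duration is not modeled, and clearing the owner of a destroyed resource is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    owner: Optional[str] = None
    zap_hits: int = 0
    destroyed: bool = False

    def zap(self) -> None:
        """Register one zap hit; the second hit destroys the resource."""
        if self.destroyed:
            return
        self.zap_hits += 1
        if self.zap_hits >= 2:
            self.destroyed = True
            self.owner = None  # assumption: rubble belongs to no one


@dataclass
class Player:
    name: str
    sanctions: int = 0

    @property
    def status(self) -> str:
        return ("active", "frozen", "removed")[min(self.sanctions, 2)]


def sanction(player: Player, resources: List[Resource]) -> None:
    """Apply one zap sanction; removal returns the player's claims to unclaimed."""
    if player.status == "removed":
        return
    player.sanctions += 1
    if player.status == "removed":
        for resource in resources:
            if resource.owner == player.name and not resource.destroyed:
                resource.owner = None
```

In this model a player who owns a wall keeps it through the first sanction and loses it on the second, which is the effect the text describes for removed players.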

Reward strategy and failure modes

Agents maximize reward by claiming resources, defending them long enough for paint to dry, and expanding only when the expected active-territory reward beats the cost of travel and conflict. Good policies repair contested borders, avoid destroying valuable resources, and sanction selectively when it protects more future reward than it removes.

Bad equilibria include mutually destructive zap wars that turn resources into rubble, over-expansion that leaves claimed walls undefended, and local truces where everyone paints small safe corners while high-value contested resources remain inactive. A removed player's territory also resets to unclaimed, so repeated sanctions can lower total resource yield for everyone.

Playable notebooks

Launch a variant notebook with:

uv run marimo run notebooks/territory/open_mo.py

Controls are shared across the family: WASD/arrows move, Q/E turn, SPACE fires the zap beam, F or left Shift fires claim paint, TAB switches the active player, and ESC closes pygame.