SmartCity-GASP-MARL

This repository is a compact research demo for Governed Agent Societies in a smart-city setting. It implements a symbolic multi-agent environment, typed action traces, runtime governance guards, deterministic baselines, and small IPPO/MAPPO-style MARL trainers.

The demo is intentionally toy-sized. It is not a realistic city simulator. It is designed to make the following paper concepts executable and inspectable:

multi-role execution under partial observations;
scenario-based role activation;
typed action records rather than free-form control alone;
governance-aware transition dynamics;
JSONL execution traces;
trace-centric metrics such as TSC, EscCE, GAU, activation precision/recall, and a simple RDC proxy;
deterministic and learned multi-agent policy modes.
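
As a rough illustration of one of the trace-centric metrics, activation precision and recall can be computed by comparing the roles that actually acted with the scenario's required role set. This is a minimal sketch, not the repository's metric code: the function name, the role names, and the convention for empty sets are assumptions.

```python
def activation_precision_recall(activated, required):
    """Compare the roles that acted against the roles the scenario required."""
    activated, required = set(activated), set(required)
    true_pos = len(activated & required)
    # Assumed convention: an empty denominator yields 1.0 (nothing to get wrong).
    precision = true_pos / len(activated) if activated else 1.0
    recall = true_pos / len(required) if required else 1.0
    return precision, recall


# Hypothetical role names, for illustration only.
p, r = activation_precision_recall(
    activated={"TrafficAgent", "EnergyAgent", "AlertAgent"},
    required={"TrafficAgent", "AlertAgent", "HealthAgent"},
)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here two of the three activated roles were required, and two of the three required roles were activated, so both values are 2/3.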

Project structure

smartcity_gasp_marl/
  smartcity_gasp/
    core/          # state, scenarios, typed actions, governance, metrics, traces
    envs/          # PettingZoo-like parallel MARL environment
    policies/      # deterministic and random policies
    marl/          # compact PPO, IPPO, and MAPPO-style training utilities
    experiments/   # runnable scripts
  configs/         # example experiment configuration files
  docs/            # algorithm alignment and paper integration notes
  outputs/         # generated sample outputs
  tests/           # lightweight smoke tests

Installation

The deterministic environment requires only Python and NumPy.

python -m pip install -r requirements.txt

For MARL training, install PyTorch as appropriate for your machine. The code will still run deterministic baselines if PyTorch is not available.
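
One common way to keep the deterministic path usable without PyTorch is to probe for the package before importing any training code. The sketch below shows that pattern under stated assumptions: `TORCH_AVAILABLE` and `available_algorithms` are hypothetical names, and the repository may handle the optional dependency differently.

```python
import importlib.util

# True only if a top-level "torch" package can be found; nothing is imported here.
TORCH_AVAILABLE = importlib.util.find_spec("torch") is not None


def available_algorithms():
    """List the run modes usable on this machine (hypothetical helper)."""
    algorithms = ["deterministic"]
    if TORCH_AVAILABLE:
        algorithms += ["ippo", "mappo"]
    return algorithms


print(available_algorithms())
```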

Quick start

Run the deterministic baselines:

python -m smartcity_gasp.experiments.run_deterministic_baselines --episodes 20

Run a short MAPPO smoke training session:

python -m smartcity_gasp.experiments.train_marl --algorithm mappo --episodes 20 --eval-episodes 10

Run the governed MAPPO variant:

python -m smartcity_gasp.experiments.train_marl --algorithm mappo --governed --episodes 20 --eval-episodes 10

Run the compact end-to-end smoke suite:

python -m smartcity_gasp.experiments.evaluate_all --quick

Execution modes

The deterministic experiment compares four modes:

Mode	Meaning
B0 direct controller	A single direct controller acts without role activation or guard.
B1 all agents, no guard	All service agents act in every scenario; no governance guard.
B2 activated, no guard	Only scenario-relevant agents act; no governance guard.
B3 activated, governed	Scenario-relevant agents act; high-impact actions pass through the guard.
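
The four modes differ along three switches: whether a single controller acts, whether role activation is applied, and whether the guard is on. A minimal sketch of that decomposition follows; the `ExecutionMode` class, the `acting_roles` helper, and the role names are assumptions made for illustration, not the repository's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionMode:
    label: str
    single_controller: bool  # B0 only: one controller replaces the role agents
    activation: bool         # restrict acting roles to the scenario's required set
    guard: bool              # route high-impact actions through the guard


MODES = {
    "B0": ExecutionMode("direct controller", True, False, False),
    "B1": ExecutionMode("all agents, no guard", False, False, False),
    "B2": ExecutionMode("activated, no guard", False, True, False),
    "B3": ExecutionMode("activated, governed", False, True, True),
}


def acting_roles(mode, all_roles, required_roles):
    """Return the roles allowed to propose actions under a given mode."""
    if mode.single_controller:
        return ["DirectController"]  # hypothetical name for the B0 controller
    return sorted(required_roles) if mode.activation else sorted(all_roles)


all_roles = {"TrafficAgent", "EnergyAgent", "AlertAgent"}  # hypothetical roles
required = {"TrafficAgent", "AlertAgent"}
for key, mode in MODES.items():
    print(key, acting_roles(mode, all_roles, required), "guard" if mode.guard else "no guard")
```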

The MARL experiment supports:

Mode	Meaning
IPPO unguarded	Independent PPO-style actor/critic over local observations.
MAPPO unguarded	Local actor with centralized critic over global state.
MAPPO governed	Same centralized-critic setup, but actions pass through the guard.
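
The practical difference between the IPPO and MAPPO rows is what the critic sees. The sketch below illustrates only that input routing, with plain lists standing in for observation vectors; `critic_inputs` is a hypothetical helper, and the trainers in marl/ may organize this differently.

```python
def critic_inputs(local_obs, global_state, algorithm):
    """Return the value-function input for each agent.

    IPPO: each critic sees only its own agent's local observation.
    MAPPO: every critic input is the shared global state.
    """
    if algorithm == "ippo":
        return {agent: list(obs) for agent, obs in local_obs.items()}
    if algorithm == "mappo":
        return {agent: list(global_state) for agent in local_obs}
    raise ValueError(f"unknown algorithm: {algorithm}")


# Toy values; agent names and vector sizes are illustrative assumptions.
local_obs = {"TrafficAgent": [0.2, 0.9], "EnergyAgent": [0.5, 0.1]}
global_state = [0.2, 0.9, 0.5, 0.1, 1.0]

ippo_in = critic_inputs(local_obs, global_state, "ippo")
mappo_in = critic_inputs(local_obs, global_state, "mappo")
print(len(ippo_in["TrafficAgent"]), len(mappo_in["TrafficAgent"]))
```

In both cases the actors still act from local observations; only the critic's input changes.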

Smart-city scenarios

Episodes sample one of four symbolic incident families:

traffic accident near a hospital;
power outage in a critical district;
flooded underpass;
pollution spike near a school.

Each scenario has severity, evidence quality, congestion, hospital-access risk, pollution-zone status, power/water status, overseer availability, and a required role set.
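
The scenario attributes listed above map naturally onto a small record type. The following is a sketch under stated assumptions: the field names, family identifiers, role names, value ranges, and sampling rules are all hypothetical and are not taken from the repository's core/ module.

```python
import random
from dataclasses import dataclass

# Family identifiers and role names are hypothetical labels for illustration.
REQUIRED_ROLES = {
    "traffic_accident_near_hospital": frozenset({"TrafficAgent", "HealthAgent"}),
    "power_outage_critical_district": frozenset({"EnergyAgent", "AlertAgent"}),
    "flooded_underpass": frozenset({"TrafficAgent", "WaterAgent"}),
    "pollution_spike_near_school": frozenset({"EnvironmentAgent", "AlertAgent"}),
}


@dataclass(frozen=True)
class Scenario:
    family: str
    severity: float              # assumed to lie in [0, 1]
    evidence_quality: float      # assumed to lie in [0, 1]
    congestion: float            # assumed to lie in [0, 1]
    hospital_access_risk: float  # assumed to lie in [0, 1]
    pollution_zone_active: bool
    power_ok: bool
    water_ok: bool
    overseer_available: bool
    required_roles: frozenset


def sample_scenario(rng: random.Random) -> Scenario:
    """Draw one incident family and fill in its symbolic attributes."""
    family = rng.choice(sorted(REQUIRED_ROLES))
    return Scenario(
        family=family,
        severity=rng.random(),
        evidence_quality=rng.random(),
        congestion=rng.random(),
        hospital_access_risk=rng.random(),
        pollution_zone_active=family == "pollution_spike_near_school",
        power_ok=family != "power_outage_critical_district",
        water_ok=family != "flooded_underpass",
        overseer_available=rng.random() < 0.8,  # assumed availability rate
        required_roles=REQUIRED_ROLES[family],
    )


scenario = sample_scenario(random.Random(0))
print(scenario.family, sorted(scenario.required_roles))
```

Passing an explicit `random.Random` instance keeps sampling reproducible for a given seed.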

Typed actions

RL policies output integer actions. The environment converts each integer into a typed action record before guard evaluation. A trace entry therefore contains fields such as:

{
  "role": "TrafficAgent",
  "action_type": "open_bus_lane",
  "target": "hospital_route",
  "evidence_refs": ["verified_incident_report"],
  "risk_level": "high",
  "escalation_requested": false,
  "p_escalation": 0.65
}
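
The integer-to-record conversion can be pictured as a per-role lookup table. The sketch below reproduces the field names from the trace entry above; the `ACTION_TABLE` contents, the `to_typed_action` helper, and the mapping of specific integers to actions are assumptions, not the environment's actual encoding.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TypedAction:
    role: str
    action_type: str
    target: str
    evidence_refs: list
    risk_level: str
    escalation_requested: bool
    p_escalation: float


# Hypothetical lookup: action index -> (action_type, target, risk_level).
ACTION_TABLE = {
    "TrafficAgent": [
        ("noop", "none", "low"),
        ("open_bus_lane", "hospital_route", "high"),
        ("close_road", "hospital_route", "high"),
    ],
}


def to_typed_action(role, action_id, evidence_refs, escalation_requested, p_escalation):
    """Turn a policy's integer output into a typed record the guard can inspect."""
    action_type, target, risk_level = ACTION_TABLE[role][action_id]
    return TypedAction(
        role=role,
        action_type=action_type,
        target=target,
        evidence_refs=list(evidence_refs),
        risk_level=risk_level,
        escalation_requested=escalation_requested,
        p_escalation=p_escalation,
    )


action = to_typed_action("TrafficAgent", 1, ["verified_incident_report"], False, 0.65)
print(json.dumps(asdict(action), indent=2))
```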

Governance rules

The current guard implements simple, inspectable rules:

public alerts require verified evidence;
bus-lane opening for longer than five minutes requires approval;
road closure affecting hospital access requires escalation;
memory writes require a source and an expiration condition;
pollution-zone rerouting is sanitized unless emergency priority is active;
inactive service roles are denied in governed execution.
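
The rules above can be sketched as a single function that inspects a typed action and returns an outcome with a reason. This is an illustrative reading of the list, not the repository's guard: the outcome labels, the action-type strings, the extra fields (`duration_min`, `approved`, `affects_hospital_access`, `source`, `expires`), and the "verified" prefix check are all assumptions.

```python
def guard(action, active_roles, emergency_priority=False):
    """Return (outcome, reason) for one typed action in governed execution."""
    if action["role"] not in active_roles:
        return "deny", "inactive_role"

    kind = action["action_type"]
    refs = action.get("evidence_refs", [])

    # Assumption: verified evidence is marked by a "verified" prefix on the reference.
    if kind == "public_alert" and not any(ref.startswith("verified") for ref in refs):
        return "deny", "unverified_evidence"
    if kind == "open_bus_lane" and action.get("duration_min", 0) > 5 and not action.get("approved", False):
        return "require_approval", "bus_lane_over_five_minutes"
    if kind == "close_road" and action.get("affects_hospital_access", False):
        return "escalate", "hospital_access_affected"
    if kind == "memory_write" and not (action.get("source") and action.get("expires")):
        return "deny", "missing_source_or_expiration"
    if kind == "reroute_pollution_zone" and not emergency_priority:
        return "sanitize", "no_emergency_priority"
    return "allow", "ok"


active = {"TrafficAgent", "AlertAgent"}
print(guard({"role": "TrafficAgent", "action_type": "open_bus_lane", "duration_min": 10}, active))
print(guard({"role": "AlertAgent", "action_type": "public_alert", "evidence_refs": []}, active))
print(guard({"role": "EnergyAgent", "action_type": "noop"}, active))
```

Checking role activity first means an inactive role is denied regardless of what it proposes, which matches the last rule in the list.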

Outputs

Generated outputs are written under outputs/results/:

deterministic_results.csv
deterministic_summary.md
deterministic_table.tex
example_trace.json
traces/*.jsonl
*_learning_curve.csv
*_results.csv
*_summary.md
*_table.tex

The JSONL traces are the most useful artifact for the paper: they show proposed actions, evidence links, guard outcomes, violations, support annotations, and reward signals.
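
Because each trace line is a standalone JSON object, a trace can be summarized with a few lines of standard-library Python. The sketch below tallies guard outcomes from an in-memory sample; the `guard_outcome` field name and the sample records are assumptions, so check a real file under traces/ for the actual keys.

```python
import io
import json
from collections import Counter


def tally_guard_outcomes(lines):
    """Count guard outcomes across JSONL trace lines (one JSON object per line)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        counts[record.get("guard_outcome", "unknown")] += 1
    return counts


# In-memory stand-in for a trace file; real traces would be opened with open(path).
sample = io.StringIO(
    '{"role": "TrafficAgent", "action_type": "open_bus_lane", "guard_outcome": "allow"}\n'
    '{"role": "AlertAgent", "action_type": "public_alert", "guard_outcome": "deny"}\n'
    "\n"
    '{"role": "TrafficAgent", "action_type": "close_road", "guard_outcome": "allow"}\n'
)
counts = tally_guard_outcomes(sample)
print(dict(counts))
```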

Notes for paper use

The code implements the governed simulation and trace-evaluation parts of the paper algorithm directly. The learning layer provides IPPO and MAPPO-style policy training: actors act from local observations, while the centralized critic uses global state. In this version the verifier and governor are rule-based runtime guards, which keeps governance inspectable and avoids learning a black-box safety policy before the trace metrics are validated.

See docs/algorithm_alignment.md and docs/paper_integration_notes.md for the precise mapping between the code and the paper sections.
