Auth friction and sub-agents convo #416
    """Whether this agent may spawn child agents.

    Default ``False``. ``True`` grants project-admin authority and is
    not safe for production until isolation is implemented.
is there a sensible path for that implementation?
and who gets to do what now with this? and how is that restricted when "isolation is implemented"?
Yes -- for each spawn-capable agent you create a sub-project. First we enable agents to be project admins for the user's main Globus project (dangerous, but a stepping stone). If creds leak, an attacker basically has full access. I assume that users trust themselves, and we show an explicit warning about that risk when spawn_capable=True.
Then we can work on isolation. Each spawn-capable agent owns a child Globus Project minted when they are provisioned. Manager.register_persistent_agent(...., spawn_capable=True) calls auth_client.create_project(...), which creates the parent client within a sub-project as admin. Children are minted in the sub-project. PersistentAgentCredentials gets a project_id so parents know which project to create children in.
So that isolation restricts auth actions, but agents can still communicate. Sub-projects limit what an agent can create or destroy in Globus. Comms is separate -- each agent the user launches holds the same exchange scope, so they all reach the same exchange, and fleet group membership decides whose mailbox they can read.
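A minimal sketch of that split, under stated assumptions: `AgentIdentity`, `may_manage_client`, and `may_read_mailbox` are hypothetical names for illustration and are not part of this PR, and "shares a fleet group" is only one possible reading of the mailbox rule. The point is that the project check and the comms check are independent.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentIdentity:
    """Hypothetical view of an agent for authorization purposes."""

    client_id: uuid.UUID
    # Globus project that this agent's own client lives in.
    home_project: uuid.UUID
    # Sub-project this agent administers; None if not spawn-capable.
    admin_project: uuid.UUID | None
    fleet_group_ids: frozenset[uuid.UUID]


def may_manage_client(actor: AgentIdentity, target: AgentIdentity) -> bool:
    """Auth actions: create/destroy only inside the administered sub-project."""
    return (
        actor.admin_project is not None
        and target.home_project == actor.admin_project
    )


def may_read_mailbox(reader: AgentIdentity, owner: AgentIdentity) -> bool:
    """Comms: decided by fleet group membership, not by project."""
    if reader.client_id == owner.client_id:
        return True
    # Assumption: sharing at least one fleet group grants read access.
    return bool(reader.fleet_group_ids & owner.fleet_group_ids)
```

With this shape, a leaked child credential cannot create or delete anything (it administers no project), yet the child can still exchange messages with its fleet.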
    return auth_code.strip()


    def get_academy_home() -> pathlib.Path:
put this somewhere more general and pull it out of this PR? at least the json pool logger also wants this sort of thing:
$ git grep share.acad
academy/logging/configs/jsonpool.py: ~/local/share/academy/logs/
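For discussion, a shared helper could look roughly like the sketch below. Everything here is an assumption rather than the PR's implementation: the `ACADEMY_HOME` override variable is hypothetical, and falling back to `XDG_DATA_HOME` is just one common convention.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def get_academy_home(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the base directory for Academy data (logs, credentials, ...)."""
    env = os.environ if env is None else env
    # Hypothetical explicit override, useful for tests and shared machines.
    override = env.get('ACADEMY_HOME')
    if override:
        return Path(override).expanduser()
    # Otherwise follow the XDG data directory convention.
    data_home = env.get('XDG_DATA_HOME')
    base = Path(data_home) if data_home else Path.home() / '.local' / 'share'
    return base / 'academy'
```

Both the credential store and the JSON pool logger could then derive their subdirectories from the one function.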
    directory = get_academy_home() / _AGENTS_SUBDIR
    self._dir = pathlib.Path(directory)
    self._dir.mkdir(parents=True, exist_ok=True)
    self._dir.chmod(_AGENTS_DIR_MODE)
there's probably some security-relevant race condition here between the mkdir and chmod.
e.g. some of the hits on https://www.google.com/search?q=mkdir+chmod+toctou
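One way to narrow that window, as a sketch only: pass the mode at creation time so the directory never exists with looser permissions, then verify through a file descriptor so the check and the fix apply to the same object. `ensure_private_dir` and `PRIVATE_DIR_MODE` are hypothetical names, `0o700` is an assumed value standing in for `_AGENTS_DIR_MODE`, and the flags used are POSIX-only.

```python
import os
import stat
import tempfile
from pathlib import Path

PRIVATE_DIR_MODE = 0o700  # assumed value; stands in for _AGENTS_DIR_MODE


def ensure_private_dir(directory: Path, mode: int = PRIVATE_DIR_MODE) -> Path:
    """Create ``directory`` without a window of looser permissions."""
    # Parents keep default permissions; only the leaf must be private.
    directory.parent.mkdir(parents=True, exist_ok=True)
    try:
        # The mode is applied when the directory is created. The umask can
        # only clear bits, so access is never wider than ``mode``.
        os.mkdir(directory, mode)
    except FileExistsError:
        pass
    # Open without following a symlink at the leaf, then work on the
    # descriptor so the object checked is the object that gets chmod-ed.
    fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW)
    try:
        info = os.fstat(fd)
        if info.st_uid != os.getuid():
            raise PermissionError(f'{directory} is owned by another user')
        if stat.S_IMODE(info.st_mode) != mode:
            os.fchmod(fd, mode)
    finally:
        os.close(fd)
    return directory


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        created = ensure_private_dir(Path(tmp) / 'share' / 'agents')
        print(oct(stat.S_IMODE(created.stat().st_mode)))
```

A pre-existing directory owned by someone else is rejected rather than silently reused, which is the case the mkdir-then-chmod sequence cannot detect.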
    """MRO of the agent type, matching ``Agent._agent_mro()``."""

    fleet_group_ids: tuple[uuid.UUID, ...]
    """Fleet group memberships at provision time (length 1 in v1)."""
what is this for? it only seems to be referenced in tests?
Right -- that's an artefact from part of the future implementation. It's not best practice to include it here without its functionality, but it helps explain the direction. The idea is to have group memberships handle auth checks on agent mailboxes (replacing per-agent scopes). I updated the PR comment with more info on this direction. If the proposal looks good after some discussion, I can pull this to make the PRs more self-contained.
note from a private conversation:
While brainstorming and developing out auth for the Globus exchange, a small path helper was included in academy-agents#416. This change pulls that helper out as a more general method for use across different modules, including logging and token storage.
Summary
This PR is a place to discuss and decide how we want to 1. reduce auth friction for users of hosted exchanges (Globus), and 2. enable agents to delegate auth to sub-agents. This initial PR is part of a proposal I have, but does not lock us into any particular solution.
Launching agents on the Globus Exchange today requires users to authenticate twice, and agents can't spawn sub-agents. My proposed solution is described below to get us started:
Agents already have Globus Auth identities and per-agent scope registration, but those identities are not used at runtime. The agent uses a user-delegated token to access the exchange, so every action runs through the user's auth chain. We redesign this so those existing identities are used by agents to authenticate as themselves via client_credentials, so the token presented to the exchange is native to the agent, not the user. After a one-time provisioning, agents boot non-interactively, persist across sessions, and can spawn other agents on demand.
You could imagine this -- initial provisioning of some agent client:
and reuse of that client avoiding re-auth:
and then an agent with sub-agent launch capabilities:
This functionality requires:
Agents authenticate via client_credentials, instead of as the user's delegate. This removes one auth request.
Agent credentials (client_id, secret) are persisted on disk so they can boot non-interactively. This is a pre-req for sub-agent spawning + long-horizon agents.
Spawn-capable agents hold manage_projects --> hierarchical relationships between agents are represented in the exchange.
All four are validated end-to-end against the live Globus dev exchange by some spike scripts, so in theory this should work.
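To make the client_credentials requirement concrete, here is a sketch of the token request an agent would send as itself, following the OAuth 2.0 client credentials grant. It only builds the request and does not send it. The function name is hypothetical, the endpoint shown is the production Globus Auth token URL as far as I know (the dev exchange may use a different environment), and in practice an SDK would normally do this for you.

```python
import base64
import urllib.parse
import urllib.request

# Assumed endpoint: production Globus Auth token URL.
TOKEN_URL = 'https://auth.globus.org/v2/oauth2/token'


def build_client_credentials_request(
    client_id: str,
    client_secret: str,
    scope: str,
) -> urllib.request.Request:
    """Build (but do not send) an OAuth2 client_credentials token request."""
    body = urllib.parse.urlencode(
        {'grant_type': 'client_credentials', 'scope': scope},
    ).encode('ascii')
    # The client authenticates with its own id and secret (HTTP Basic),
    # so no user session or consent prompt is involved.
    basic = base64.b64encode(
        f'{client_id}:{client_secret}'.encode(),
    ).decode('ascii')
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={
            'Authorization': f'Basic {basic}',
            'Content-Type': 'application/x-www-form-urlencoded',
        },
        method='POST',
    )
```

The resulting access token belongs to the agent's own identity, which is what lets the exchange attribute actions to the agent rather than to the user.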
This specific PR relates to point 3. @AK2000 already pushed a PR (#397) which persists credentials on disk for long-lived agents via the python -m academy.run path. This PR adds similar infrastructure: a storage convention for agent credentials, plus lookup and boot primitives. The flow agent_id -> credentials_map -> re-launched agent works today, but there is no user-facing flow that exercises it yet (future PR).
Identity mapping between agents and agent credentials is done via UUID. PersistentAgentCredentials.agent_id carries the agent's UUID. The store writes agents/{agent_id.uid}.json. The file path is the identity mapping.
Design decisions / thoughts
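One design point worth making concrete is the storage convention from the summary, where the file name is the identity mapping. The sketch below is illustrative only: `StoredCredentials` is a hypothetical stand-in for PersistentAgentCredentials (its real fields are not shown in this thread), and `CredentialStore` is not the PR's class.

```python
from __future__ import annotations

import json
import os
import uuid
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass(frozen=True)
class StoredCredentials:
    """Hypothetical stand-in for PersistentAgentCredentials."""

    agent_uid: str
    client_id: str
    secret: str


class CredentialStore:
    """Stores one JSON file per agent under ``<root>/agents/``."""

    def __init__(self, root: Path) -> None:
        self._dir = root / 'agents'

    def _path(self, agent_uid: uuid.UUID) -> Path:
        # The file name is the identity mapping: one file per agent UUID.
        return self._dir / f'{agent_uid}.json'

    def save(self, creds: StoredCredentials) -> Path:
        self._dir.mkdir(parents=True, exist_ok=True)
        path = self._path(uuid.UUID(creds.agent_uid))
        tmp = path.with_suffix('.tmp')
        # Request owner-only permissions when the file is created, then
        # rename into place so readers never see a partial file.
        fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        with os.fdopen(fd, 'w') as f:
            json.dump(asdict(creds), f)
        os.replace(tmp, path)
        return path

    def load(self, agent_uid: uuid.UUID) -> StoredCredentials | None:
        try:
            with open(self._path(agent_uid)) as f:
                return StoredCredentials(**json.load(f))
        except FileNotFoundError:
            return None
```

Looking up credentials for a re-launched agent is then a single path computation from its UUID, with no separate index to keep in sync.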
Globus Groups as auth
Right now an agent is a thin layer on top of the user's identity. The user logs in, delegates a token, and the agent uses that token to access the exchange on the user's behalf.
The idea is to make each agent its own Globus Confidential Client, authenticating against the exchange via the client_credentials OAuth grant. Once an agent is running under its own auth, the user's session is not required. The agent could run indefinitely, and its credentials can be written to disk and reloaded across restarts.
We can also make an agent spawn-capable at its registration, which would (in this proposal) grant it the manage_projects capability, enabling it to mint creds for child agents. In this model a user is required to provision a parent agent, and the fleet of agents under that parent can grow without additional auth prompts.
This all assumes that we move the auth model from per-agent scopes to fleet groups (Globus Groups). Currently the exchange does this check: "does this token bear scope agent_scope", which requires a consent prompt per agent. With groups this becomes "is this agent a member of fleet group X?", so per-agent scopes are not required.
The logic for this already exists, and the authz check runs in the backend (look for group_memberships in the backend). It's used for explicit mailbox sharing (e.g. a user chooses to share a mailbox with a group), so this proposal is to make this the default rather than opt-in, and to develop the wiring needed to manage fleet memberships.
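The change in the exchange-side check can be sketched as follows. This is not the backend's code: `Mailbox`, `Caller`, `create_mailbox`, and both check functions are hypothetical, and only the idea of testing group memberships comes from the description above.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field


@dataclass
class Mailbox:
    owner: uuid.UUID
    # Groups whose members may access this mailbox.
    shared_groups: set[uuid.UUID] = field(default_factory=set)


@dataclass(frozen=True)
class Caller:
    client_id: uuid.UUID
    scopes: frozenset[str] = frozenset()
    group_memberships: frozenset[uuid.UUID] = frozenset()


def allowed_by_scope(caller: Caller, agent_scope: str) -> bool:
    """Today: one scope per agent, so one consent prompt per agent."""
    return agent_scope in caller.scopes


def allowed_by_group(caller: Caller, mailbox: Mailbox) -> bool:
    """Proposed: the owner, or any member of a group on the mailbox."""
    if caller.client_id == mailbox.owner:
        return True
    return bool(caller.group_memberships & mailbox.shared_groups)


def create_mailbox(
    owner: uuid.UUID,
    fleet_group: uuid.UUID | None = None,
) -> Mailbox:
    # Proposal: attach the fleet group by default instead of opt-in sharing.
    groups = {fleet_group} if fleet_group is not None else set()
    return Mailbox(owner, groups)
```

Adding an agent to the fleet then means adding one group membership, rather than registering and consenting to a new scope.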
Secondary wins
Relationship disambiguation on the exchange. As each agent has its own client credentials and authenticates as itself, we can resolve an agent's client_id at the exchange. So the exchange can now know which agent sent a message, and we can create attestation / provenance chains. You could enforce things like only a parent shutting down its child, only a fleet member invoking a fleet action, or rejecting a message whose claimed src doesn't match the authenticated sender.
Related Issues
Changes
Testing
New unit tests included in PR.
Pull Request Checklist
Please confirm the PR meets the following requirements.
pre-commit(e.g., ruff, mypy, etc.).