Auth friction and sub-agents convo #416

Open
GusEllerm wants to merge 2 commits into academy-agents:main from GusEllerm:agent-auth/creds

Conversation

@GusEllerm GusEllerm commented May 22, 2026

Collaborator

Summary

This PR is a place to discuss and decide how we want to (1) reduce auth friction for users of hosted exchanges (Globus), and (2) enable agents to delegate auth to sub-agents. This initial PR is part of a proposal I have, but it does not lock us into any particular solution.

Launching agents on the Globus Exchange today requires users to authenticate twice, and agents can't spawn sub-agents. My proposed solution is described below to get us started:

Agents already have Globus Auth identities and per-agent scope registration, but those identities are not used at runtime. The agent uses a user-delegated token to access the exchange, so every action runs through the user's auth chain. The proposal is to redesign this so that agents use those existing identities to authenticate as themselves via client_credentials, which makes the token presented to the exchange native to the agent, not the user. After a one-time provisioning step, agents boot non-interactively, persist across sessions, and can spawn other agents on demand.

You could imagine this -- initial provisioning of some agent client:

async with await Manager.from_exchange_factory(factory=factory) as m:
    creds = await m.register_persistent_agent(MyAgent)
    # creds.agent_id is durable; credentials persisted at
    # $ACADEMY_HOME/agents/<uid>.json with 0o600 perms.

and reuse of that client avoiding re-auth:

# Booting: any future session, any process, zero prompts.
async with await Manager.from_exchange_factory(factory=factory) as m:
    # Look up the persisted credentials by agent_id (no auth prompt).
    creds = store.load(agent_id)
    handle = await m.launch(
        MyAgent,
        registration=PersistentGlobusAgentRegistration(creds),
    )

and then an agent with sub-agent launch capabilities:

class Coordinator(Agent):
    @action
    async def deploy_worker(self) -> AgentId[Worker]:
        # Parent's own client_credentials grant mints child credentials.
        # No user prompt; no user session needed.
        return await self.spawn_child(Worker, name='worker-1')

This functionality requires:

  1. Agents authenticate as themselves via client_credentials, instead of as the user's delegate. This removes one auth request.
  2. Agent mailboxes are shared with a fleet Globus Group at provisioning time, so that group membership can be used for auth.
  3. Each agent's credentials (client_id, secret) are persisted on disk so they can boot non-interactively. This is a pre-req for sub-agent spawning + long-horizon agents.
  4. Parent agents provision children using their own manage_projects capability, so hierarchical relationships between agents are represented in the exchange.

All four are validated end-to-end against the live Globus dev exchange by some spike scripts, so in theory this should work.

This specific PR relates to point 3. @AK2000 already pushed a PR (#397) which persists credentials on disk for long-lived agents via the python -m academy.run path. This PR adds similar infrastructure: a storage convention for agent credentials, plus lookup and boot primitives. The flow agent_id -> credentials_map -> re-launched agent works today, but there is no user-facing flow that exercises it yet (future PR).

Identity mapping between agents and agent credentials is done via UUID. PersistentAgentCredentials.agent_id carries the agent's UUID, and the store writes agents/{agent_id.uid}.json, so the file path is the identity mapping.
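For illustration, here is a minimal sketch of the "file path is the identity mapping" convention, using only the standard library. It is not the store in this PR: the function names and the JSON keys (client_id, client_secret) are hypothetical, and only the agents/<uid>.json layout and the 0o600 file mode come from the description above.

```python
import json
import os
import pathlib
import tempfile
import uuid


def credentials_path(home: pathlib.Path, agent_uid: uuid.UUID) -> pathlib.Path:
    """Map an agent UUID to agents/<uid>.json under the given home directory."""
    return home / 'agents' / f'{agent_uid}.json'


def save_credentials(
    home: pathlib.Path,
    agent_uid: uuid.UUID,
    creds: dict,
) -> pathlib.Path:
    """Write credentials for one agent; the file is created with 0o600."""
    path = credentials_path(home, agent_uid)
    path.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
    # O_EXCL refuses to overwrite an existing file, and the mode is applied
    # when the file is created rather than by a later chmod.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, 'w') as f:
        json.dump(creds, f)
    return path


def load_credentials(home: pathlib.Path, agent_uid: uuid.UUID) -> dict:
    """Look up credentials purely from the agent UUID."""
    with open(credentials_path(home, agent_uid)) as f:
        return json.load(f)


# Usage: a throwaway directory stands in for $ACADEMY_HOME.
with tempfile.TemporaryDirectory() as tmp:
    demo_home = pathlib.Path(tmp)
    demo_uid = uuid.uuid4()
    # Hypothetical key names and placeholder values.
    save_credentials(
        demo_home,
        demo_uid,
        {'client_id': 'example-id', 'client_secret': 'example-secret'},
    )
    demo_loaded = load_credentials(demo_home, demo_uid)
```

Because the UUID alone determines the path, no separate index is needed to get from an agent_id back to its credentials.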


Design decisions / thoughts

Globus Groups as auth

Right now an agent is a thin layer on top of the user's identity. The user logs in, delegates a token, and the agent uses that token to access the exchange on the user's behalf.

The idea is to make each agent its own Globus Confidential Client, authenticating against the exchange via the client_credentials OAuth grant. Once an agent is running under its own auth, the user's session is not required. The agent could run indefinitely, and its credentials can be written to disk and reloaded across restarts.
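To make the client_credentials grant concrete, the sketch below builds (but does not send) the token request described in RFC 6749 section 4.4, using only the standard library. The token URL is an assumption about the Globus Auth endpoint, the helper name is hypothetical, and a real implementation would presumably go through the Globus SDK rather than raw HTTP.

```python
import base64
import urllib.parse
import urllib.request

# Assumption: the Globus Auth OAuth2 token endpoint.
TOKEN_URL = 'https://auth.globus.org/v2/oauth2/token'


def build_token_request(
    client_id: str,
    client_secret: str,
    scope: str,
) -> urllib.request.Request:
    """Build an OAuth2 client_credentials token request without sending it."""
    body = urllib.parse.urlencode(
        {'grant_type': 'client_credentials', 'scope': scope},
    ).encode()
    # The confidential client authenticates with HTTP Basic auth; no user
    # session or browser redirect is involved in this grant type.
    basic = base64.b64encode(f'{client_id}:{client_secret}'.encode()).decode()
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={
            'Authorization': f'Basic {basic}',
            'Content-Type': 'application/x-www-form-urlencoded',
        },
        method='POST',
    )


# Usage with placeholder values; nothing is sent over the network.
demo_request = build_token_request('agent-client-id', 'agent-secret', 'exchange-scope')
```

The request carries only the agent's own client_id and secret, which is why the resulting token is native to the agent rather than delegated from a user.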

We can also make an agent spawn-capable at its registration, which would (in this proposal) grant it the manage_projects capability, enabling it to mint creds for child agents. In this model a user is only required to provision a parent agent; the fleet of agents under that parent can then grow without additional auth prompts.

This all assumes that we move the auth model from per-agent scopes to fleet groups (Globus Groups). Currently the exchange does this check: "does this token bear scope agent_scope", which requires a consent prompt per agent. With groups this becomes "is this agent a member of fleet group X?". So per-agent scopes are not required.

The logic for this already exists, and the authz check runs in the backend (look for group_memberships in the backend). It's used for explicit mailbox sharing (e.g. a user chooses to share a mailbox with a group), so this proposal is to make this the default rather than opt-in, and to develop the wiring needed to manage fleet memberships.
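As a rough sketch of the group-based check ("is this agent a member of fleet group X?"), the example below compares a caller's group memberships with the groups a mailbox is shared with. The Mailbox type and may_access function are hypothetical and do not mirror the backend's actual code; only the idea of checking group_memberships comes from the description above.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Mailbox:
    """Hypothetical mailbox record: an owner plus the groups it is shared with."""

    owner: str
    shared_groups: frozenset = frozenset()


def may_access(mailbox: Mailbox, client_id: str, group_memberships: set) -> bool:
    """Allow the owner, or any caller in at least one shared group."""
    if client_id == mailbox.owner:
        return True
    # No per-agent scope is consulted: membership in a shared group is enough.
    return not mailbox.shared_groups.isdisjoint(group_memberships)


# Usage: one fleet group shared by a parent and its worker.
fleet_group = uuid.uuid4()
worker_mailbox = Mailbox(owner='worker-1', shared_groups=frozenset({fleet_group}))
coordinator_allowed = may_access(worker_mailbox, 'coordinator', {fleet_group})
outsider_allowed = may_access(worker_mailbox, 'outsider', {uuid.uuid4()})
```

Adding an agent to the fleet then amounts to a group membership change, with no new consent prompt per agent.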

Secondary wins

Relationship disambiguation on the exchange. Because each agent has its own client credentials and authenticates as itself, we can resolve an agent's client_id at the exchange. The exchange can then know which agent sent a message, and we can create attestation / provenance chains. You could enforce things like only a parent shutting down its child, only a fleet member invoking a fleet action, or rejecting a message whose claimed src doesn't match the authenticated sender.
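A minimal sketch of two of those checks (sender matches the authenticated identity, and only a parent may shut down its child) could look like the following. The Message shape, the parent_of mapping, and the authorize function are all hypothetical; they only illustrate what becomes possible once the exchange knows the sender's client_id.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical message envelope."""

    src: str   # agent the message claims to come from
    dest: str  # agent the message is addressed to
    kind: str  # e.g. 'action' or 'shutdown'


def authorize(message: Message, authenticated_client_id: str, parent_of: dict) -> None:
    """Raise PermissionError if the message violates either rule."""
    # Rule 1: the claimed sender must be the identity that authenticated.
    if message.src != authenticated_client_id:
        raise PermissionError('claimed src does not match authenticated sender')
    # Rule 2: only the recorded parent of an agent may shut it down.
    if message.kind == 'shutdown' and parent_of.get(message.dest) != message.src:
        raise PermissionError('only a parent may shut down its child')


# Usage: the coordinator is the recorded parent of worker-1.
hierarchy = {'worker-1': 'coordinator'}
authorize(Message('coordinator', 'worker-1', 'shutdown'), 'coordinator', hierarchy)
```

Both rules rely on the authenticated identity being the agent's own, which is not available while every agent presents a user-delegated token.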

Related Issues

Changes

  • Breaking (backwards incompatible changes to public interfaces)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (non-breaking change or feature addition)
  • Refactor (internal code or design clean up)
  • Documentation (no changes to the code)
  • Test (changes or additions to testing)
  • Build (change to CI workflows or build processes)
  • Package (changes to package metadata or dependency versions)

Testing

New unit tests included in PR.

Pull Request Checklist

Please confirm the PR meets the following requirements.

  • Relevant tags are added based on the types of changes.
  • Code changes pass pre-commit (e.g., ruff, mypy, etc.).
  • Tests have been added to show the fix is effective or that the new feature works.
  • New and existing unit tests pass locally with the changes.
  • Docs have been updated and reviewed if relevant.

@GusEllerm GusEllerm changed the title from "Auth friction and sub-agents" to "Auth friction and sub-agents convo" May 22, 2026
@AK2000 AK2000 marked this pull request as draft May 22, 2026 19:26
@AK2000 AK2000 marked this pull request as ready for review May 22, 2026 19:26
"""Whether this agent may spawn child agents.

Default ``False``. ``True`` grants project-admin authority and is
not safe for production until isolation is implemented.

Collaborator

is there a sensible path for that implementation?

Collaborator

and who gets to do what now with this? and how is that restricted when "isolation is implemented"?

Collaborator Author

Yes -- for each spawn-capable agent you create a sub-project. First we enable agents to be project admins for the user's main Globus project (dangerous, but a stepping stone). If creds leak, an attacker basically has full access. I assume that users trust themselves, and we expose an explicit warning about that risk if spawn_capable=True.

Then we can work on isolation. Each spawn-capable agent owns a child Globus Project minted when it is provisioned. Manager.register_persistent_agent(..., spawn_capable=True) calls auth_client.create_project(...), which creates the parent client within a sub-project as admin. Children are minted in the sub-project. PersistentAgentCredentials gets a project_id so parents know which project to create children in.

So that isolation restricts auth actions, but agents can still communicate. Sub-projects bound what an agent can create or destroy in Globus. Comms are separate -- each agent the user launches holds the same exchange scope, so they all reach the same exchange, and fleet group membership decides whose mailbox they can read.
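As a sketch of the isolation step, the example below shows credentials carrying a project_id and a helper that decides where children may be minted. The CredsSketch type and project_for_children function are hypothetical stand-ins, not PersistentAgentCredentials itself; only the spawn_capable flag and the project_id field are taken from the description above.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CredsSketch:
    """Hypothetical stand-in for an agent's persisted credentials."""

    agent_uid: uuid.UUID
    client_id: str
    spawn_capable: bool = False
    # Sub-project owned by this agent; children are created inside it.
    project_id: Optional[uuid.UUID] = None


def project_for_children(creds: CredsSketch) -> uuid.UUID:
    """Return the project a parent may create children in, or raise."""
    if not creds.spawn_capable:
        raise PermissionError('agent is not spawn-capable')
    if creds.project_id is None:
        raise ValueError('spawn-capable agent has no sub-project')
    return creds.project_id


# Usage: a spawn-capable parent with its own sub-project.
parent_creds = CredsSketch(
    agent_uid=uuid.uuid4(),
    client_id='parent-client-id',
    spawn_capable=True,
    project_id=uuid.uuid4(),
)
child_project = project_for_children(parent_creds)
```

Under this model a leaked parent secret would be limited to the parent's own sub-project rather than the user's main project.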

return auth_code.strip()


def get_academy_home() -> pathlib.Path:

Collaborator

put this somewhere more general and pull it out of this PR? at least the json pool logger also wants this sort of thing:

$ git grep share.acad
academy/logging/configs/jsonpool.py: ~/local/share/academy/logs/

directory = get_academy_home() / _AGENTS_SUBDIR
self._dir = pathlib.Path(directory)
self._dir.mkdir(parents=True, exist_ok=True)
self._dir.chmod(_AGENTS_DIR_MODE)

Collaborator

there's probably some security-relevant race condition here between the mkdir and chmod.

eg some of the hits on https://www.google.com/search?q=mkdir+chmod+toctou
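One possible way to narrow that window, sketched here for POSIX systems only, is to pass the restrictive mode to mkdir so the directory is never created with looser permissions, and then to verify ownership and mode rather than chmod an existing path. The ensure_private_dir helper is hypothetical and is not the fix adopted in this PR.

```python
import os
import pathlib
import stat
import tempfile


def ensure_private_dir(path: pathlib.Path, mode: int = 0o700) -> None:
    """Create path with a restrictive mode, or verify an existing directory."""
    try:
        # The mode is applied at creation time. The umask can only clear
        # bits, so the directory is never more permissive than `mode`.
        os.mkdir(path, mode)
    except FileExistsError:
        pass
    st = os.lstat(path)  # lstat: do not follow a symlink planted at path
    if not stat.S_ISDIR(st.st_mode):
        raise NotADirectoryError(str(path))
    if st.st_uid != os.getuid():
        raise PermissionError(f'{path} is owned by another user')
    if stat.S_IMODE(st.st_mode) & ~mode:
        raise PermissionError(f'{path} is more permissive than {oct(mode)}')


# Usage: a throwaway directory stands in for the academy home directory.
with tempfile.TemporaryDirectory() as tmp:
    agents_dir = pathlib.Path(tmp) / 'agents'
    ensure_private_dir(agents_dir)
    ensure_private_dir(agents_dir)  # second call only verifies
```

Refusing to use a pre-existing directory with loose permissions is a stricter policy than silently tightening it, which may or may not be what the project wants.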

"""MRO of the agent type, matching ``Agent._agent_mro()``."""

fleet_group_ids: tuple[uuid.UUID, ...]
"""Fleet group memberships at provision time (length 1 in v1)."""

Collaborator

what is this for? it only seems to be referenced in tests?

Collaborator Author

Right -- that's an artefact from part of the future implementation. Not best practice to include it here without its functionality, but it helps explain the direction. The idea is to have group memberships handle auth checks on agent mailboxes (replacing per-agent scopes). I updated the PR comment with more info re this direction. If the proposal looks good after some convo I can pull this to make the PRs more self-contained.

@benclifford

Collaborator

note from a private conversation:

Another question on your proposal is about what this unlocks in terms of other auth. For example, do we now get a trust chain where an agent can believe the remote agent id enough to use it for auth, e.g. when restricting who can call actions?

GusEllerm added a commit to GusEllerm/academy that referenced this pull request Jun 15, 2026
While brainstorming and developing out auth for the Globus exchange, a small path helper was included in academy-agents#416. This change pulls that helper out as a more general method for use across different modules, including logging and token storage.
GusEllerm added a commit that referenced this pull request Jun 16, 2026
While brainstorming and developing out auth for the Globus exchange, a small path helper was included in #416. This change pulls that helper out as a more general method for use across different modules, including logging and token storage.