fix: bind session_id to caller ownership#25
Conversation
📝 WalkthroughSummary by CodeRabbitRelease Notes
WalkthroughThis PR implements session ownership binding to prevent cross-session trajectory poisoning in multi-tenant deployments. A ChangesSession Ownership Binding for Multi-Tenant Safety
Sequence DiagramsequenceDiagram
participant Client as Client
participant Middleware as HTTP Middleware
participant Store as Session Store
participant Escalation as Escalation Router
Client->>Middleware: POST /chat (session_id, message)
Middleware->>Middleware: Derive owner_token from client IP + secret
Middleware->>Store: assert_session_owner(session_id, owner_token)
alt First write
Store->>Store: set_session_owner(session_id, owner_token)
else Ownership mismatch
Store-->>Middleware: SessionOwnershipError
Middleware-->>Client: HTTP 403
end
Middleware->>Escalation: check_rate_limit(session_id, owner_token)
Escalation->>Store: assert_session_owner(session_id, owner_token)
Escalation->>Store: check_rate_limit(session_id)
Middleware->>Middleware: classify_sync(message)
alt High risk detected
Middleware->>Escalation: escalate(..., owner_token)
Escalation->>Store: assert_session_owner(session_id, owner_token)
Escalation->>Store: log_escalation(session_id, ..., owner_token)
end
Middleware-->>Client: classification result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| except SessionOwnershipError as exc: | ||
| return JSONResponse( | ||
| status_code=403, | ||
| content={"status": "error", "message": str(exc), "session_id": session_id}, |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
humane_proxy/mcp_server.py (1)
82-96:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winDocument the
owner_tokenparameter.The docstring describes
messageandsession_idbut omits the newowner_tokenparameter. MCP tool parameters are exposed to clients, so documenting this optional ownership token would improve discoverability.📝 Proposed docstring addition
Parameters ---------- message: The user message to classify. session_id: Optional session identifier for trajectory tracking. + owner_token: + Optional token to validate session ownership. When provided, + the session must be bound to this token or the call will fail. Returns🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@humane_proxy/mcp_server.py` around lines 82 - 96, Add documentation for the new owner_token parameter to the method that classifies messages (the "Classify a message for self-harm or criminal intent." docstring—likely the classify_message function/method). Under the Parameters section, add an entry for owner_token (e.g., owner_token: Optional[str]) describing that it is an optional ownership/auth token used for tracking or access control, that it defaults to None, and any privacy/validation notes needed for clients; keep the rest of the docstring format consistent with existing entries so the MCP tool exposes this parameter to clients.
🧹 Nitpick comments (1)
tests/test_session_ownership.py (1)
32-34: ⚡ Quick winAdd a post-rebind mismatch assertion to harden the regression signal.
After Line 33, assert that
owner-ais rejected forsid-2again. This verifies “delete then rebind” still restores strict mismatch enforcement, not just successful rebinding.Suggested test addition
def test_delete_session_clears_owner(tmp_path): @@ # After delete, the session can be re-bound to a new owner. store.assert_session_owner("sid-2", "owner-b") + with pytest.raises(SessionOwnershipError): + store.assert_session_owner("sid-2", "owner-a")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_session_ownership.py` around lines 32 - 34, Add a post-rebind mismatch assertion to ensure "owner-a" is rejected for "sid-2" after the delete+rebind sequence: after the existing store.assert_session_owner("sid-2", "owner-b") call, add a call to the test helper that asserts a mismatch for the previous owner (e.g. store.assert_session_mismatch("sid-2", "owner-a") or the equivalent helper in your test suite) so the test verifies strict mismatch enforcement is restored; locate this near the existing store.assert_session_owner assertion in tests/test_session_ownership.py.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@humane_proxy/middleware/interceptor.py`:
- Around line 113-116: The JSONResponse in interceptor.py currently returns the
raw exception string (str(exc)) which can leak sensitive internals; change the
response to use a generic error message (e.g., "Internal server error" or
"Access denied") instead of str(exc) and ensure the real exception is recorded
only in server logs by calling the module logger (e.g., logger.exception or
logger.error with exc_info) including session_id for correlation; update the
code path that constructs the JSONResponse (the return that references
JSONResponse, exc and session_id) to remove exposure of exc to clients and log
the full exception internally.
In `@humane_proxy/storage/base.py`:
- Around line 27-37: Update the storage interface to enforce a strict,
non-overwrite ownership contract: ensure set_session_owner(session_id,
owner_token) is specified to persist the owner only if the session has no owner
yet (first-write wins, must be idempotent) and that
assert_session_owner(session_id, owner_token) raises a single, well-defined
exception type (SessionOwnershipError) on any mismatch; document that backends
must persist the first owner and never replace it, and must throw
SessionOwnershipError for mismatches so all implementations behave identically.
In `@SECURITY.md`:
- Around line 15-20: Update the SECURITY.md section that describes the
IP-derived owner token (used for the built-in HTTP proxy POST /chat and hardened
with HUMANE_PROXY_SESSION_SECRET) to explicitly document its limitations: note
that deriving owner tokens from client IPs can collapse distinct users behind
shared IPs/NAT and weaken isolation, recommend using explicit/app-level owner
tokens or random session_id values for stronger identity, and add guidance to
only trust vetted proxy headers (and document how to configure trusted proxies)
when deriving client IPs so operators don’t rely on untrusted headers.
---
Outside diff comments:
In `@humane_proxy/mcp_server.py`:
- Around line 82-96: Add documentation for the new owner_token parameter to the
method that classifies messages (the "Classify a message for self-harm or
criminal intent." docstring—likely the classify_message function/method). Under
the Parameters section, add an entry for owner_token (e.g., owner_token:
Optional[str]) describing that it is an optional ownership/auth token used for
tracking or access control, that it defaults to None, and any privacy/validation
notes needed for clients; keep the rest of the docstring format consistent with
existing entries so the MCP tool exposes this parameter to clients.
---
Nitpick comments:
In `@tests/test_session_ownership.py`:
- Around line 32-34: Add a post-rebind mismatch assertion to ensure "owner-a" is
rejected for "sid-2" after the delete+rebind sequence: after the existing
store.assert_session_owner("sid-2", "owner-b") call, add a call to the test
helper that asserts a mismatch for the previous owner (e.g.
store.assert_session_mismatch("sid-2", "owner-a") or the equivalent helper in
your test suite) so the test verifies strict mismatch enforcement is restored;
locate this near the existing store.assert_session_owner assertion in
tests/test_session_ownership.py.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6bcd3d91-c1bc-485f-a363-38086e8426be
📒 Files selected for processing (12)
SECURITY.mdhumane_proxy/__init__.pyhumane_proxy/errors.pyhumane_proxy/escalation/local_db.pyhumane_proxy/escalation/router.pyhumane_proxy/mcp_server.pyhumane_proxy/middleware/interceptor.pyhumane_proxy/storage/base.pyhumane_proxy/storage/postgres.pyhumane_proxy/storage/redis.pyhumane_proxy/storage/sqlite.pytests/test_session_ownership.py
| return JSONResponse( | ||
| status_code=403, | ||
| content={"status": "error", "message": str(exc), "session_id": session_id}, | ||
| ) |
There was a problem hiding this comment.
Avoid exposing exception details to external users.
The CodeQL warning is valid. Returning str(exc) directly can leak internal error details (e.g., owner token values, session internals) to external callers. Use a generic message instead.
🛡️ Proposed fix
except SessionOwnershipError as exc:
+ logger.warning("Session ownership mismatch for %s: %s", session_id, exc)
return JSONResponse(
status_code=403,
- content={"status": "error", "message": str(exc), "session_id": session_id},
+ content={"status": "error", "message": "Session ownership mismatch.", "session_id": session_id},
)🧰 Tools
🪛 GitHub Check: CodeQL
[warning] 115-115: Information exposure through an exception
Stack trace information flows to this location and may be exposed to an external user.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@humane_proxy/middleware/interceptor.py` around lines 113 - 116, The
JSONResponse in interceptor.py currently returns the raw exception string
(str(exc)) which can leak sensitive internals; change the response to use a
generic error message (e.g., "Internal server error" or "Access denied") instead
of str(exc) and ensure the real exception is recorded only in server logs by
calling the module logger (e.g., logger.exception or logger.error with exc_info)
including session_id for correlation; update the code path that constructs the
JSONResponse (the return that references JSONResponse, exc and session_id) to
remove exposure of exc to clients and log the full exception internally.
| def set_session_owner(self, session_id: str, owner_token: str) -> None: | ||
| """Bind a session to an owner token (first write wins).""" | ||
| ... | ||
|
|
||
| @abstractmethod | ||
| def assert_session_owner(self, session_id: str, owner_token: str) -> None: | ||
| """Ensure ``session_id`` is owned by ``owner_token``. | ||
|
|
||
| Implementations should persist the first seen owner token for a new | ||
| session and reject subsequent mismatches. | ||
| """ |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Tighten the ownership contract to prevent backend divergence.
Line 27 and Line 32 define intent, but the interface should explicitly require non-overwrite semantics and a single mismatch exception type (e.g., SessionOwnershipError) so backends cannot drift in behavior.
Proposed contract-doc clarification
`@abstractmethod`
def set_session_owner(self, session_id: str, owner_token: str) -> None:
- """Bind a session to an owner token (first write wins)."""
+ """Bind a session to an owner token (first write wins).
+
+ Must not overwrite an existing different owner token.
+ """
...
`@abstractmethod`
def assert_session_owner(self, session_id: str, owner_token: str) -> None:
"""Ensure ``session_id`` is owned by ``owner_token``.
- Implementations should persist the first seen owner token for a new
- session and reject subsequent mismatches.
+ Implementations must atomically check-and-bind ownership:
+ - if no owner exists, persist ``owner_token``
+ - if owner matches, no-op
+ - if owner differs, raise ``SessionOwnershipError``
"""
...🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@humane_proxy/storage/base.py` around lines 27 - 37, Update the storage
interface to enforce a strict, non-overwrite ownership contract: ensure
set_session_owner(session_id, owner_token) is specified to persist the owner
only if the session has no owner yet (first-write wins, must be idempotent) and
that assert_session_owner(session_id, owner_token) raises a single, well-defined
exception type (SessionOwnershipError) on any mismatch; document that backends
must persist the first owner and never replace it, and must throw
SessionOwnershipError for mismatches so all implementations behave identically.
| For the built-in HTTP proxy (`POST /chat`), the owner token is derived from the client IP address and **hardened** with `HUMANE_PROXY_SESSION_SECRET` when set. | ||
|
|
||
| #### Recommended configuration | ||
|
|
||
| - Set `HUMANE_PROXY_SESSION_SECRET` to a long random value (and keep it stable across deploys). | ||
| - Avoid predictable `session_id` values (usernames, emails, sequential IDs). Prefer random IDs. |
There was a problem hiding this comment.
Document IP-derived owner token limitations and proxy trust assumptions.
Line 15 currently reads as if IP-based binding is broadly strong. In shared-IP/NAT environments this can collapse multiple callers into one owner token and weaken isolation. Add explicit guidance to use explicit/app-level owner tokens for stronger identity and to only trust vetted proxy headers for client IP derivation.
Suggested doc patch
-For the built-in HTTP proxy (`POST /chat`), the owner token is derived from the client IP address and **hardened** with `HUMANE_PROXY_SESSION_SECRET` when set.
+For the built-in HTTP proxy (`POST /chat`), the owner token is derived from the client IP address and **hardened** with `HUMANE_PROXY_SESSION_SECRET` when set.
+IP-based ownership is a coarse identity signal: users behind the same NAT/egress IP can share an owner token.
+For stronger per-user isolation, prefer supplying an explicit application-level `owner_token` (for example, stable API key/user identity).
+If running behind a reverse proxy/load balancer, ensure client IP extraction uses only trusted headers/sources.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| For the built-in HTTP proxy (`POST /chat`), the owner token is derived from the client IP address and **hardened** with `HUMANE_PROXY_SESSION_SECRET` when set. | |
| #### Recommended configuration | |
| - Set `HUMANE_PROXY_SESSION_SECRET` to a long random value (and keep it stable across deploys). | |
| - Avoid predictable `session_id` values (usernames, emails, sequential IDs). Prefer random IDs. | |
| For the built-in HTTP proxy (`POST /chat`), the owner token is derived from the client IP address and **hardened** with `HUMANE_PROXY_SESSION_SECRET` when set. | |
| IP-based ownership is a coarse identity signal: users behind the same NAT/egress IP can share an owner token. | |
| For stronger per-user isolation, prefer supplying an explicit application-level `owner_token` (for example, stable API key/user identity). | |
| If running behind a reverse proxy/load balancer, ensure client IP extraction uses only trusted headers/sources. | |
| #### Recommended configuration | |
| - Set `HUMANE_PROXY_SESSION_SECRET` to a long random value (and keep it stable across deploys). | |
| - Avoid predictable `session_id` values (usernames, emails, sequential IDs). Prefer random IDs. |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@SECURITY.md` around lines 15 - 20, Update the SECURITY.md section that
describes the IP-derived owner token (used for the built-in HTTP proxy POST
/chat and hardened with HUMANE_PROXY_SESSION_SECRET) to explicitly document its
limitations: note that deriving owner tokens from client IPs can collapse
distinct users behind shared IPs/NAT and weaken isolation, recommend using
explicit/app-level owner tokens or random session_id values for stronger
identity, and add guidance to only trust vetted proxy headers (and document how
to configure trusted proxies) when deriving client IPs so operators don’t rely
on untrusted headers.
|
@ionfwsrijan I have some comments regarding this specific issue. First of all, I'll be labelling both this PR and issue #24 as
These points make me come to the understanding that you don't understand the codebase and were not making legitimate contributions. I've concluded that the contribution was made using AI agents without any human supervision. For this reason, the PR, issue and your username have been forwarded to GSSoC. You may take up the matter with their team if you wish to do so. |
Description
Binds each
session_idto a stable per-caller owner token on first use and rejects subsequent mismatches to prevent cross-session risk trajectory poisoning (and false-positive/false-negative escalations).This adds session ownership tracking across SQLite/Redis/Postgres backends, enforces ownership in the
/chatinterceptor (owner token derived from client IP, HMAC-hardened withHUMANE_PROXY_SESSION_SECRETwhen set), adds programmatic support via optionalowner_tokenparameters, and includes tests plus security documentation.Fixes #24.
Type of Change
Checklist
pytest tests/ -v)