-
-
Notifications
You must be signed in to change notification settings - Fork 27
fix: bind session_id to caller ownership #25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| # Security Policy | ||
|
|
||
| ## Reporting a vulnerability | ||
|
|
||
| Please open a private report via GitHub Security Advisories (preferred), or file an issue with minimal details if private reporting is not available. | ||
|
|
||
| ## Notes for operators | ||
|
|
||
| ### `session_id` ownership binding | ||
|
|
||
| HumaneProxy supports caller-provided `session_id` values for trajectory tracking and escalation auditing. In multi-tenant deployments, a user who can guess another user’s `session_id` could otherwise poison their risk trajectory or generate false escalations. | ||
|
|
||
| To mitigate this, HumaneProxy binds each `session_id` to a per-caller **owner token** on first use. Subsequent writes to the same `session_id` must match the original owner token. | ||
|
|
||
| For the built-in HTTP proxy (`POST /chat`), the owner token is derived from the client IP address and **hardened** with `HUMANE_PROXY_SESSION_SECRET` when set. | ||
|
|
||
| #### Recommended configuration | ||
|
|
||
| - Set `HUMANE_PROXY_SESSION_SECRET` to a long random value (and keep it stable across deploys). | ||
| - Avoid predictable `session_id` values (usernames, emails, sequential IDs). Prefer random IDs. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,12 @@ | ||
| """Project-wide exception types.""" | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
|
|
||
| class HumaneProxyError(Exception): | ||
| """Base exception for HumaneProxy.""" | ||
|
|
||
|
|
||
| class SessionOwnershipError(HumaneProxyError): | ||
| """Raised when a session_id is used by a different caller/owner.""" | ||
|
|
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -4,6 +4,8 @@ | |
|
|
||
| import logging | ||
| import os | ||
| import hashlib | ||
| import hmac | ||
| from collections.abc import AsyncGenerator | ||
| from contextlib import asynccontextmanager | ||
| from typing import Any | ||
|
|
@@ -12,6 +14,7 @@ | |
| from fastapi import FastAPI, Request | ||
| from fastapi.responses import JSONResponse | ||
|
|
||
| from humane_proxy.errors import SessionOwnershipError | ||
| from humane_proxy.escalation.local_db import init_db | ||
| from humane_proxy.escalation.router import escalate, get_self_harm_response | ||
|
|
||
|
|
@@ -62,6 +65,22 @@ | |
| request.client.host if request.client else "unknown" | ||
| ) | ||
|
|
||
| def _owner_token_for_request(request: Request) -> str: | ||
| """Derive a stable per-caller owner token for session binding. | ||
|
|
||
| Uses ``HUMANE_PROXY_SESSION_SECRET`` when set for HMAC hardening. | ||
| Falls back to a deterministic hash of the client IP to avoid storing | ||
| raw IPs in the session ownership table. | ||
| """ | ||
| ip = request.client.host if request.client else "unknown" | ||
| secret = os.environ.get("HUMANE_PROXY_SESSION_SECRET", "").encode("utf-8") | ||
| ip_bytes = ip.encode("utf-8") | ||
| if secret: | ||
| digest = hmac.new(secret, ip_bytes, hashlib.sha256).hexdigest() | ||
| return f"hmac:{digest}" | ||
| digest = hashlib.sha256(ip_bytes).hexdigest() | ||
| return f"ip:{digest}" | ||
|
|
||
|
|
||
| def _extract_last_user_message(payload: dict[str, Any]) -> str: | ||
| messages: list[dict[str, str]] = payload.get("messages", []) | ||
|
|
@@ -77,6 +96,7 @@ | |
| payload: dict[str, Any] = await request.json() | ||
|
|
||
| session_id = _resolve_session_id(payload, request) | ||
| owner_token = _owner_token_for_request(request) | ||
| user_message = _extract_last_user_message(payload) | ||
|
|
||
| if not user_message: | ||
|
|
@@ -85,6 +105,16 @@ | |
| content={"status": "error", "message": "No user message found in payload."}, | ||
| ) | ||
|
|
||
| from humane_proxy.storage.factory import get_store | ||
|
|
||
| try: | ||
| get_store().assert_session_owner(session_id, owner_token) | ||
| except SessionOwnershipError as exc: | ||
| return JSONResponse( | ||
| status_code=403, | ||
| content={"status": "error", "message": str(exc), "session_id": session_id}, | ||
Check warningCode scanning / CodeQL Information exposure through an exception Medium Stack trace information Error loading related location Loading |
||
|
|
||
| ) | ||
|
Comment on lines
+113
to
+116
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Avoid exposing exception details to external users. The CodeQL warning is valid. Returning 🛡️ Proposed fix except SessionOwnershipError as exc:
+ logger.warning("Session ownership mismatch for %s: %s", session_id, exc)
return JSONResponse(
status_code=403,
- content={"status": "error", "message": str(exc), "session_id": session_id},
+ content={"status": "error", "message": "Session ownership mismatch.", "session_id": session_id},
)🧰 Tools🪛 GitHub Check: CodeQL[warning] 115-115: Information exposure through an exception 🤖 Prompt for AI Agents |
||
|
|
||
| pipeline = _get_pipeline() | ||
| result = await pipeline.classify(user_message, session_id) | ||
|
|
||
|
|
@@ -99,6 +129,7 @@ | |
| message_hash=result.message_hash, | ||
| stage_reached=cls.stage, | ||
| reasoning=cls.reasoning, | ||
| owner_token=owner_token, | ||
| ) | ||
|
|
||
| # Self-harm: return care response instead of generic flagged message. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -18,6 +18,25 @@ def init(self) -> None: | |
| """Initialise the storage backend (create tables, ensure indexes, etc.).""" | ||
| ... | ||
|
|
||
| @abstractmethod | ||
| def get_session_owner(self, session_id: str) -> str | None: | ||
| """Return the owner token for a session, or ``None`` if unknown.""" | ||
| ... | ||
|
|
||
| @abstractmethod | ||
| def set_session_owner(self, session_id: str, owner_token: str) -> None: | ||
| """Bind a session to an owner token (first write wins).""" | ||
| ... | ||
|
|
||
| @abstractmethod | ||
| def assert_session_owner(self, session_id: str, owner_token: str) -> None: | ||
| """Ensure ``session_id`` is owned by ``owner_token``. | ||
|
|
||
| Implementations should persist the first seen owner token for a new | ||
| session and reject subsequent mismatches. | ||
| """ | ||
|
Comment on lines
+27
to
+37
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win Tighten the ownership contract to prevent backend divergence. Line 27 and Line 32 define intent, but the interface should explicitly require non-overwrite semantics and a single mismatch exception type (e.g., Proposed contract-doc clarification `@abstractmethod`
def set_session_owner(self, session_id: str, owner_token: str) -> None:
- """Bind a session to an owner token (first write wins)."""
+ """Bind a session to an owner token (first write wins).
+
+ Must not overwrite an existing different owner token.
+ """
...
`@abstractmethod`
def assert_session_owner(self, session_id: str, owner_token: str) -> None:
"""Ensure ``session_id`` is owned by ``owner_token``.
- Implementations should persist the first seen owner token for a new
- session and reject subsequent mismatches.
+ Implementations must atomically check-and-bind ownership:
+ - if no owner exists, persist ``owner_token``
+ - if owner matches, no-op
+ - if owner differs, raise ``SessionOwnershipError``
"""
...🤖 Prompt for AI Agents |
||
| ... | ||
|
|
||
| @abstractmethod | ||
| def log( | ||
| self, | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document IP-derived owner token limitations and proxy trust assumptions.
Line 15 currently reads as if IP-based binding is broadly strong. In shared-IP/NAT environments this can collapse multiple callers into one owner token and weaken isolation. Add explicit guidance to use explicit/app-level owner tokens for stronger identity and to only trust vetted proxy headers for client IP derivation.
Suggested doc patch
📝 Committable suggestion
🤖 Prompt for AI Agents