21 changes: 21 additions & 0 deletions SECURITY.md
@@ -0,0 +1,21 @@
# Security Policy

## Reporting a vulnerability

Please open a private report via GitHub Security Advisories (preferred), or file an issue with minimal details if private reporting is not available.

## Notes for operators

### `session_id` ownership binding

HumaneProxy supports caller-provided `session_id` values for trajectory tracking and escalation auditing. In multi-tenant deployments, a user who can guess another user’s `session_id` could otherwise poison their risk trajectory or generate false escalations.

To mitigate this, HumaneProxy binds each `session_id` to a per-caller **owner token** on first use. Subsequent writes to the same `session_id` must match the original owner token.

For the built-in HTTP proxy (`POST /chat`), the owner token is derived from the client IP address and **hardened** with `HUMANE_PROXY_SESSION_SECRET` when set.

#### Recommended configuration

- Set `HUMANE_PROXY_SESSION_SECRET` to a long random value (and keep it stable across deploys).
- Avoid predictable `session_id` values (usernames, emails, sequential IDs). Prefer random IDs.
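
As a minimal sketch of the two recommendations above, the Python standard library can generate both values. The variable names are illustrative, and how the secret reaches the `HUMANE_PROXY_SESSION_SECRET` environment variable is assumed to be handled by your deployment tooling.

```python
import secrets
import uuid

# Generate once and keep in a secret manager so it stays stable across deploys.
session_secret = secrets.token_urlsafe(32)  # 32 random bytes as URL-safe text

# Random, non-guessable session identifier (hex-encoded UUID4, 32 characters).
session_id = uuid.uuid4().hex

print(f"HUMANE_PROXY_SESSION_SECRET={session_secret}")
print(f"session_id={session_id}")
```

Both values come from the operating system's random source, so neither can be inferred from usernames, emails, or counters.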
Comment on lines +15 to +20

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Document IP-derived owner token limitations and proxy trust assumptions.

Line 15 currently reads as if IP-based binding provides strong isolation. In shared-IP/NAT environments, multiple callers collapse into one owner token, which weakens isolation. Add guidance to use application-level owner tokens for stronger identity and to derive the client IP only from vetted proxy headers.

Suggested doc patch
 For the built-in HTTP proxy (`POST /chat`), the owner token is derived from the client IP address and **hardened** with `HUMANE_PROXY_SESSION_SECRET` when set.
+IP-based ownership is a coarse identity signal: users behind the same NAT/egress IP can share an owner token.
+For stronger per-user isolation, prefer supplying an explicit application-level `owner_token` (for example, stable API key/user identity).
+If running behind a reverse proxy/load balancer, ensure client IP extraction uses only trusted headers/sources.
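
The NAT limitation described in this comment can be illustrated with a short sketch. `ip_owner_token` mirrors the HMAC-over-IP derivation in `_owner_token_for_request`; `explicit_owner_token`, the `user:` prefix, and the sample identities are hypothetical and only show what an application-level token might look like.

```python
import hashlib
import hmac


def ip_owner_token(client_ip: str, secret: bytes) -> str:
    """HMAC-SHA256 over the client IP, as the built-in proxy does when a secret is set."""
    digest = hmac.new(secret, client_ip.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"hmac:{digest}"


def explicit_owner_token(user_id: str, secret: bytes) -> str:
    """Hypothetical app-level token: HMAC-SHA256 over a stable user identity."""
    digest = hmac.new(secret, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"user:{digest}"


SECRET = b"example-secret-not-for-production"
NAT_IP = "203.0.113.7"  # documentation-range address shared by both callers

# Two different users behind the same egress IP get the same IP-derived token.
alice_ip_token = ip_owner_token(NAT_IP, SECRET)
bob_ip_token = ip_owner_token(NAT_IP, SECRET)

# Tokens derived from a per-user identity stay distinct.
alice_app_token = explicit_owner_token("alice", SECRET)
bob_app_token = explicit_owner_token("bob", SECRET)

print(alice_ip_token == bob_ip_token)    # True
print(alice_app_token == bob_app_token)  # False
```

With the IP-derived token, either user could write to the other's `session_id` without triggering a mismatch, which is the gap the suggested wording warns operators about.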
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@SECURITY.md` around lines 15 - 20, Update the SECURITY.md section that
describes the IP-derived owner token (used for the built-in HTTP proxy POST
/chat and hardened with HUMANE_PROXY_SESSION_SECRET) to explicitly document its
limitations: note that deriving owner tokens from client IPs can collapse
distinct users behind shared IPs/NAT and weaken isolation, recommend using
explicit/app-level owner tokens or random session_id values for stronger
identity, and add guidance to only trust vetted proxy headers (and document how
to configure trusted proxies) when deriving client IPs so operators don’t rely
on untrusted headers.
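
One common way to "only trust vetted proxy headers" is to honor `X-Forwarded-For` only when the direct peer is a known proxy, and then to walk the header from right to left. The sketch below assumes that approach; `TRUSTED_PROXIES` and `resolve_client_ip` are hypothetical names, and nothing here reflects how HumaneProxy itself resolves the client IP.

```python
from __future__ import annotations

import ipaddress

# Hypothetical allow-list: networks where your own proxies / load balancers live.
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]


def _is_trusted(addr: str) -> bool:
    """Return True if addr parses as an IP inside a trusted proxy network."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return False
    return any(ip in net for net in TRUSTED_PROXIES)


def resolve_client_ip(peer_ip: str, forwarded_for: str | None) -> str:
    """Pick the client IP, ignoring X-Forwarded-For unless the peer is trusted."""
    if not forwarded_for or not _is_trusted(peer_ip):
        return peer_ip
    # Walk right to left: the rightmost entries were appended by our own proxies.
    for hop in reversed([h.strip() for h in forwarded_for.split(",")]):
        if _is_trusted(hop):
            continue
        try:
            ipaddress.ip_address(hop)
        except ValueError:
            return peer_ip  # malformed entry: fall back to the socket peer
        return hop
    return peer_ip


# Direct, untrusted caller: the spoofable header is ignored.
print(resolve_client_ip("198.51.100.9", "1.2.3.4"))               # 198.51.100.9
# Request via a trusted proxy: the first untrusted hop from the right wins.
print(resolve_client_ip("10.0.0.5", "6.6.6.6, 203.0.113.7"))      # 203.0.113.7
```

Reading from the right means a client-supplied leftmost entry (here `6.6.6.6`) cannot override the address the trusted proxy actually observed.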


22 changes: 20 additions & 2 deletions humane_proxy/__init__.py
@@ -89,7 +89,13 @@ def pipeline(self):
         """Return the underlying SafetyPipeline instance."""
         return self._pipeline

-    def check(self, text: str, session_id: str = "programmatic") -> dict:
+    def check(
+        self,
+        text: str,
+        session_id: str = "programmatic",
+        *,
+        owner_token: str | None = None,
+    ) -> dict:
         """Run the synchronous safety pipeline on *text* (Stages 1+2).

         Returns
@@ -98,10 +104,19 @@ def check(self, text: str, session_id: str = "programmatic") -> dict:
             ``{"safe": bool, "category": str, "score": float, "triggers": list,
             "stage_reached": int, ...}``
         """
+        if owner_token is not None:
+            from humane_proxy.storage.factory import get_store
+            get_store().assert_session_owner(session_id, owner_token)
         result = self._pipeline.classify_sync(text, session_id)
         return result.to_dict()

-    async def check_async(self, text: str, session_id: str = "programmatic") -> dict:
+    async def check_async(
+        self,
+        text: str,
+        session_id: str = "programmatic",
+        *,
+        owner_token: str | None = None,
+    ) -> dict:
         """Run the full async safety pipeline on *text* (all 3 stages).

         Returns
@@ -110,6 +125,9 @@ async def check_async(self, text: str, session_id: str = "programmatic") -> dict
             Same as :meth:`check`, but potentially enriched with Stage-3
             reasoning and higher accuracy.
         """
+        if owner_token is not None:
+            from humane_proxy.storage.factory import get_store
+            get_store().assert_session_owner(session_id, owner_token)
         result = await self._pipeline.classify(text, session_id)
         return result.to_dict()

12 changes: 12 additions & 0 deletions humane_proxy/errors.py
@@ -0,0 +1,12 @@
+"""Project-wide exception types."""
+
+from __future__ import annotations
+
+
+class HumaneProxyError(Exception):
+    """Base exception for HumaneProxy."""
+
+
+class SessionOwnershipError(HumaneProxyError):
+    """Raised when a session_id is used by a different caller/owner."""

13 changes: 10 additions & 3 deletions humane_proxy/escalation/local_db.py
@@ -40,10 +40,13 @@ def init_db() -> None:
     logger.info("Escalation storage initialised (backend: %s)", type(store).__name__)


-def check_rate_limit(session_id: str) -> bool:
+def check_rate_limit(session_id: str, owner_token: str | None = None) -> bool:
     """Return ``True`` if the session is **within** its allowed quota."""
     from humane_proxy.storage.factory import get_store
-    return get_store().check_rate_limit(session_id)
+    store = get_store()
+    if owner_token is not None:
+        store.assert_session_owner(session_id, owner_token)
+    return store.check_rate_limit(session_id)


 def log_escalation(
@@ -54,10 +57,14 @@ def log_escalation(
     message_hash: str | None = None,
     stage_reached: int = 1,
     reasoning: str | None = None,
+    owner_token: str | None = None,
 ) -> None:
     """Persist an escalation event to the configured backend."""
     from humane_proxy.storage.factory import get_store
-    get_store().log(
+    store = get_store()
+    if owner_token is not None:
+        store.assert_session_owner(session_id, owner_token)
+    store.log(
         session_id=session_id,
         category=category,
         risk_score=risk_score,
4 changes: 3 additions & 1 deletion humane_proxy/escalation/router.py
@@ -133,6 +133,7 @@ def escalate(
     message_hash: str | None = None,
     stage_reached: int = 1,
     reasoning: str | None = None,
+    owner_token: str | None = None,
 ) -> dict:
     """Handle a flagged interaction.

@@ -167,7 +168,7 @@ def escalate(
     triggers = triggers or []

     # --- Rate-limit gate ---
-    if not check_rate_limit(session_id):
+    if not check_rate_limit(session_id, owner_token=owner_token):
         logger.warning(
             "[RATE-LIMITED] session=%s category=%s risk_score=%.2f — suppressed (quota exhausted)",
             session_id, category, risk_score,
@@ -186,6 +187,7 @@ def escalate(
             message_hash=message_hash,
             stage_reached=stage_reached,
             reasoning=reasoning,
+            owner_token=owner_token,
         )
     except Exception:
         logger.exception("Failed to write escalation to DB for session=%s", session_id)
4 changes: 4 additions & 0 deletions humane_proxy/mcp_server.py
@@ -77,6 +77,7 @@ def _get_mcp_auth_provider():
 async def check_message_safety(
     message: str,
     session_id: str = "mcp-default",
+    owner_token: str | None = None,
 ) -> dict:
     """Classify a message for self-harm or criminal intent.

@@ -98,6 +99,9 @@ async def check_message_safety(

     config = get_config()
     pipeline = SafetyPipeline(config)
+    if owner_token is not None:
+        from humane_proxy.storage.factory import get_store
+        get_store().assert_session_owner(session_id, owner_token)
     result = await pipeline.classify(message, session_id)
     return result.to_dict()

31 changes: 31 additions & 0 deletions humane_proxy/middleware/interceptor.py
@@ -4,6 +4,8 @@

 import logging
 import os
+import hashlib
+import hmac
 from collections.abc import AsyncGenerator
 from contextlib import asynccontextmanager
 from typing import Any
@@ -12,6 +14,7 @@
 from fastapi import FastAPI, Request
 from fastapi.responses import JSONResponse

+from humane_proxy.errors import SessionOwnershipError
 from humane_proxy.escalation.local_db import init_db
 from humane_proxy.escalation.router import escalate, get_self_harm_response

@@ -62,6 +65,22 @@
         request.client.host if request.client else "unknown"
     )

+def _owner_token_for_request(request: Request) -> str:
+    """Derive a stable per-caller owner token for session binding.
+
+    Uses ``HUMANE_PROXY_SESSION_SECRET`` when set for HMAC hardening.
+    Falls back to a deterministic hash of the client IP to avoid storing
+    raw IPs in the session ownership table.
+    """
+    ip = request.client.host if request.client else "unknown"
+    secret = os.environ.get("HUMANE_PROXY_SESSION_SECRET", "").encode("utf-8")
+    ip_bytes = ip.encode("utf-8")
+    if secret:
+        digest = hmac.new(secret, ip_bytes, hashlib.sha256).hexdigest()
+        return f"hmac:{digest}"
+    digest = hashlib.sha256(ip_bytes).hexdigest()
+    return f"ip:{digest}"
+

 def _extract_last_user_message(payload: dict[str, Any]) -> str:
     messages: list[dict[str, str]] = payload.get("messages", [])
@@ -77,6 +96,7 @@
     payload: dict[str, Any] = await request.json()

     session_id = _resolve_session_id(payload, request)
+    owner_token = _owner_token_for_request(request)
     user_message = _extract_last_user_message(payload)

     if not user_message:
@@ -85,6 +105,16 @@
             content={"status": "error", "message": "No user message found in payload."},
         )

+    from humane_proxy.storage.factory import get_store
+
+    try:
+        get_store().assert_session_owner(session_id, owner_token)
+    except SessionOwnershipError as exc:
+        return JSONResponse(
+            status_code=403,
+            content={"status": "error", "message": str(exc), "session_id": session_id},
+        )

Check warning: Code scanning / CodeQL
Information exposure through an exception (Medium): stack trace information flows to the `content=` line above and may be exposed to an external user.
Comment on lines +113 to +116

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid exposing exception details to external users.

The CodeQL warning is valid. Returning str(exc) directly can leak internal error details (e.g., owner token values, session internals) to external callers. Use a generic message instead.

🛡️ Proposed fix
     except SessionOwnershipError as exc:
+        logger.warning("Session ownership mismatch for %s: %s", session_id, exc)
         return JSONResponse(
             status_code=403,
-            content={"status": "error", "message": str(exc), "session_id": session_id},
+            content={"status": "error", "message": "Session ownership mismatch.", "session_id": session_id},
         )
🧰 Tools
🪛 GitHub Check: CodeQL

[warning] 115-115: Information exposure through an exception
Stack trace information flows to this location and may be exposed to an external user.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@humane_proxy/middleware/interceptor.py` around lines 113 - 116, The
JSONResponse in interceptor.py currently returns the raw exception string
(str(exc)), which can leak sensitive internals; change the response to use a
generic error message (e.g., "Access denied" or "Session ownership mismatch.") instead
of str(exc) and ensure the real exception is recorded only in server logs by
calling the module logger (e.g., logger.exception or logger.error with exc_info)
including session_id for correlation; update the code path that constructs the
JSONResponse (the return that references JSONResponse, exc and session_id) to
remove exposure of exc to clients and log the full exception internally.
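
The fix proposed in this comment can be sketched without FastAPI. The helper `ownership_error_response` and the sample exception text are hypothetical, and the local `SessionOwnershipError` is a stand-in for `humane_proxy.errors.SessionOwnershipError`; the point is only that the detail goes to the server log while the client sees a fixed message.

```python
from __future__ import annotations

import logging

logger = logging.getLogger("ownership_example")


class SessionOwnershipError(Exception):
    """Stand-in for humane_proxy.errors.SessionOwnershipError."""


def ownership_error_response(session_id: str, exc: Exception) -> tuple[int, dict]:
    """Log the real error server-side and build a generic 403 body for the client."""
    logger.warning("Session ownership mismatch for %s: %s", session_id, exc)
    body = {
        "status": "error",
        "message": "Session ownership mismatch.",  # fixed text, never str(exc)
        "session_id": session_id,
    }
    return 403, body


try:
    # Hypothetical exception text that happens to contain an internal detail.
    raise SessionOwnershipError("session 'abc' is bound to owner hmac:deadbeef")
except SessionOwnershipError as exc:
    status, body = ownership_error_response("abc", exc)

print(status)           # 403
print(body["message"])  # Session ownership mismatch.
```

In the real handler the returned pair would be passed to `JSONResponse(status_code=..., content=...)`.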


pipeline = _get_pipeline()
result = await pipeline.classify(user_message, session_id)

Expand All @@ -99,6 +129,7 @@
message_hash=result.message_hash,
stage_reached=cls.stage,
reasoning=cls.reasoning,
owner_token=owner_token,
)

# Self-harm: return care response instead of generic flagged message.
Expand Down
19 changes: 19 additions & 0 deletions humane_proxy/storage/base.py
@@ -18,6 +18,25 @@ def init(self) -> None:
         """Initialise the storage backend (create tables, ensure indexes, etc.)."""
         ...

+    @abstractmethod
+    def get_session_owner(self, session_id: str) -> str | None:
+        """Return the owner token for a session, or ``None`` if unknown."""
+        ...
+
+    @abstractmethod
+    def set_session_owner(self, session_id: str, owner_token: str) -> None:
+        """Bind a session to an owner token (first write wins)."""
+        ...
+
+    @abstractmethod
+    def assert_session_owner(self, session_id: str, owner_token: str) -> None:
+        """Ensure ``session_id`` is owned by ``owner_token``.
+
+        Implementations should persist the first seen owner token for a new
+        session and reject subsequent mismatches.
+        """
Comment on lines +27 to +37

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Tighten the ownership contract to prevent backend divergence.

Line 27 and Line 32 define intent, but the interface should explicitly require non-overwrite semantics and a single mismatch exception type (e.g., SessionOwnershipError) so backends cannot drift in behavior.

Proposed contract-doc clarification
     @abstractmethod
     def set_session_owner(self, session_id: str, owner_token: str) -> None:
-        """Bind a session to an owner token (first write wins)."""
+        """Bind a session to an owner token (first write wins).
+
+        Must not overwrite an existing different owner token.
+        """
         ...

     @abstractmethod
     def assert_session_owner(self, session_id: str, owner_token: str) -> None:
         """Ensure ``session_id`` is owned by ``owner_token``.

-        Implementations should persist the first seen owner token for a new
-        session and reject subsequent mismatches.
+        Implementations must atomically check-and-bind ownership:
+        - if no owner exists, persist ``owner_token``
+        - if owner matches, no-op
+        - if owner differs, raise ``SessionOwnershipError``
         """
         ...
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@humane_proxy/storage/base.py` around lines 27 - 37, Update the storage
interface to enforce a strict, non-overwrite ownership contract: ensure
set_session_owner(session_id, owner_token) is specified to persist the owner
only if the session has no owner yet (first-write wins, must be idempotent) and
that assert_session_owner(session_id, owner_token) raises a single, well-defined
exception type (SessionOwnershipError) on any mismatch; document that backends
must persist the first owner and never replace it, and must throw
SessionOwnershipError for mismatches so all implementations behave identically.
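
The contract requested here (bind if unowned, no-op on match, raise on mismatch, all in one atomic step) can be sketched with an in-memory store. `InMemoryOwnerStore` is a hypothetical class for illustration and is not one of the project's backends; the local `SessionOwnershipError` stands in for `humane_proxy.errors.SessionOwnershipError`.

```python
from __future__ import annotations

import threading


class SessionOwnershipError(Exception):
    """Stand-in for humane_proxy.errors.SessionOwnershipError."""


class InMemoryOwnerStore:
    """Hypothetical store showing atomic check-and-bind semantics."""

    def __init__(self) -> None:
        self._owners: dict[str, str] = {}
        self._lock = threading.Lock()

    def get_session_owner(self, session_id: str) -> str | None:
        with self._lock:
            return self._owners.get(session_id)

    def assert_session_owner(self, session_id: str, owner_token: str) -> None:
        # The lock makes "read existing owner, else bind" a single atomic step.
        with self._lock:
            existing = self._owners.setdefault(session_id, owner_token)
        if existing != owner_token:
            raise SessionOwnershipError(
                f"session_id '{session_id}' belongs to a different caller"
            )


store = InMemoryOwnerStore()
store.assert_session_owner("s1", "owner-a")  # no owner yet: binds owner-a
store.assert_session_owner("s1", "owner-a")  # owner matches: no-op
try:
    store.assert_session_owner("s1", "owner-b")  # owner differs: rejected
except SessionOwnershipError as exc:
    print(exc)  # session_id 's1' belongs to a different caller
print(store.get_session_owner("s1"))  # owner-a
```

The rejected call leaves the original binding untouched, which is the non-overwrite behavior the comment asks every backend to guarantee.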

+        ...
+
     @abstractmethod
     def log(
         self,
56 changes: 56 additions & 0 deletions humane_proxy/storage/postgres.py
@@ -10,6 +10,7 @@
 from datetime import datetime, timedelta, timezone
 from typing import Any

+from humane_proxy.errors import SessionOwnershipError
 from humane_proxy.storage.base import EscalationStore

 logger = logging.getLogger("humane_proxy.storage.postgres")
@@ -59,6 +60,15 @@ def _conn(self):

     def init(self) -> None:
         with self._conn() as conn:
+            conn.execute(
+                """
+                CREATE TABLE IF NOT EXISTS sessions (
+                    session_id TEXT PRIMARY KEY,
+                    owner_token TEXT NOT NULL,
+                    created_at DOUBLE PRECISION NOT NULL
+                )
+                """
+            )
             conn.execute(
                 """
                 CREATE TABLE IF NOT EXISTS escalations (
@@ -83,6 +93,51 @@ def init(self) -> None:
             conn.commit()
         logger.info("PostgreSQL store initialised: %s", self._dsn.split("@")[-1] if "@" in self._dsn else "(local)")

+    def get_session_owner(self, session_id: str) -> str | None:
+        with self._conn() as conn:
+            row = conn.execute(
+                "SELECT owner_token FROM sessions WHERE session_id = %s",
+                (session_id,),
+            ).fetchone()
+        return row["owner_token"] if row else None
+
+    def set_session_owner(self, session_id: str, owner_token: str) -> None:
+        ts = datetime.now(timezone.utc).timestamp()
+        with self._conn() as conn:
+            conn.execute(
+                """
+                INSERT INTO sessions (session_id, owner_token, created_at)
+                VALUES (%s, %s, %s)
+                ON CONFLICT (session_id) DO NOTHING
+                """,
+                (session_id, owner_token, ts),
+            )
+            conn.commit()
+
+    def assert_session_owner(self, session_id: str, owner_token: str) -> None:
+        ts = datetime.now(timezone.utc).timestamp()
+        with self._conn() as conn:
+            conn.execute(
+                """
+                INSERT INTO sessions (session_id, owner_token, created_at)
+                VALUES (%s, %s, %s)
+                ON CONFLICT (session_id) DO NOTHING
+                """,
+                (session_id, owner_token, ts),
+            )
+            row = conn.execute(
+                "SELECT owner_token FROM sessions WHERE session_id = %s",
+                (session_id,),
+            ).fetchone()
+            conn.commit()
+        existing = row["owner_token"] if row else None
+        if existing is None:
+            return
+        if existing != owner_token:
+            raise SessionOwnershipError(
+                f"session_id '{session_id}' belongs to a different caller"
+            )
+
     def log(
         self,
         session_id: str,
@@ -158,6 +213,7 @@ def delete_session(self, session_id: str) -> int:
             cur = conn.execute(
                 "DELETE FROM escalations WHERE session_id = %s", (session_id,)
             )
+            conn.execute("DELETE FROM sessions WHERE session_id = %s", (session_id,))
             conn.commit()
         return cur.rowcount
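
The insert-then-read pattern used by `assert_session_owner` in this file can be tried locally with the standard-library `sqlite3` module. This is a sketch, not the PostgreSQL backend: SQLite's `INSERT OR IGNORE` is swapped in for `ON CONFLICT (session_id) DO NOTHING`, `REAL` replaces `DOUBLE PRECISION`, and the local `SessionOwnershipError` stands in for `humane_proxy.errors.SessionOwnershipError`.

```python
import sqlite3
import time


class SessionOwnershipError(Exception):
    """Stand-in for humane_proxy.errors.SessionOwnershipError."""


def assert_session_owner(conn: sqlite3.Connection, session_id: str, owner_token: str) -> None:
    # Step 1: try to claim the session; this is a no-op if a row already exists.
    conn.execute(
        "INSERT OR IGNORE INTO sessions (session_id, owner_token, created_at) "
        "VALUES (?, ?, ?)",
        (session_id, owner_token, time.time()),
    )
    # Step 2: read back whichever owner token actually holds the session.
    row = conn.execute(
        "SELECT owner_token FROM sessions WHERE session_id = ?",
        (session_id,),
    ).fetchone()
    conn.commit()
    if row is not None and row[0] != owner_token:
        raise SessionOwnershipError(
            f"session_id '{session_id}' belongs to a different caller"
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS sessions ("
    "session_id TEXT PRIMARY KEY, owner_token TEXT NOT NULL, created_at REAL NOT NULL)"
)

assert_session_owner(conn, "s1", "owner-a")  # first use binds the session
assert_session_owner(conn, "s1", "owner-a")  # same owner: accepted
try:
    assert_session_owner(conn, "s1", "owner-b")
except SessionOwnershipError as exc:
    print(exc)  # session_id 's1' belongs to a different caller
```

Because the primary key rejects the second insert, the first writer's token is the one read back, so a later caller with a different token always hits the mismatch branch.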

21 changes: 21 additions & 0 deletions humane_proxy/storage/redis.py
@@ -11,6 +11,7 @@
 from datetime import datetime, timedelta, timezone
 from typing import Any

+from humane_proxy.errors import SessionOwnershipError
 from humane_proxy.storage.base import EscalationStore

 logger = logging.getLogger("humane_proxy.storage.redis")
@@ -69,6 +70,24 @@ def init(self) -> None:
         self._client.ping()
         logger.info("Redis store connected: %s", self._client.connection_pool.connection_kwargs.get("host", ""))

+    def get_session_owner(self, session_id: str) -> str | None:
+        value = self._client.get(self._key("owner", session_id))
+        return value or None
+
+    def set_session_owner(self, session_id: str, owner_token: str) -> None:
+        # First writer wins.
+        self._client.set(self._key("owner", session_id), owner_token, nx=True)
+
+    def assert_session_owner(self, session_id: str, owner_token: str) -> None:
+        owner_key = self._key("owner", session_id)
+        # First writer wins (SET NX). If it already exists, we verify.
+        self._client.set(owner_key, owner_token, nx=True)
+        existing = self._client.get(owner_key)
+        if existing and existing != owner_token:
+            raise SessionOwnershipError(
+                f"session_id '{session_id}' belongs to a different caller"
+            )
+
     def log(
         self,
         session_id: str,
@@ -144,12 +163,14 @@ def get_by_id(self, escalation_id: int) -> dict[str, Any] | None:
     def delete_session(self, session_id: str) -> int:
         ids = self._client.zrange(self._key("session", session_id), 0, -1)
         if not ids:
+            self._client.delete(self._key("owner", session_id))
             return 0
         pipe = self._client.pipeline()
         for esc_id in ids:
             pipe.delete(self._key("esc", esc_id))
             pipe.zrem(self._key("esc_timeline"), esc_id)
         pipe.delete(self._key("session", session_id))
+        pipe.delete(self._key("owner", session_id))
         pipe.execute()
         return len(ids)
