fix(tools): re-validate redirects and pin peer IP to close SSRF bypass#6038

Draft
theCyberTech wants to merge 7 commits into main from worktree-ssrf-redirect-fix

Conversation

@theCyberTech theCyberTech commented Jun 4, 2026

Member

Summary

validate_url (crewai_tools/security/safe_path.py) inspects the URL string it is handed, but ScrapeWebsiteTool and ScrapeElementFromWebsiteTool then fetched with requests.get(..., allow_redirects=True). Two gaps followed from that decoupling:

  1. Redirect bypass: a normal-looking public host that returns 302 Location: <internal-address> was followed to the internal target without re-validation, turning the tools into an SSRF proxy for the worker's network (including RFC 1918 ranges and the 169.254.169.254 cloud metadata endpoint, both of which the validator's own blocklist explicitly covers).
  2. DNS-rebinding TOCTOU: validate_url resolved the host via getaddrinfo, then discarded the IPs and returned the URL string; requests re-resolved at connect time, so the IP that was checked was not necessarily the IP that was used.
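
The redirect gap in item 1 can be illustrated without any network access. The following is a minimal sketch, not the PR's code: check_url is a simplified stand-in for validate_url that only understands IP-literal hosts, and fetch, RESPONSES, and BlockedURL are hypothetical names invented for the illustration. It simulates a public address that answers 302 with a metadata URL, and compares checking only the first URL with checking every hop.

```python
import ipaddress
from urllib.parse import urljoin, urlsplit


class BlockedURL(Exception):
    """Raised when a URL points at a non-public address (hypothetical name)."""


def check_url(url: str) -> None:
    # Simplified stand-in for validate_url: handles IP-literal hosts only.
    host = urlsplit(url).hostname or ""
    if not ipaddress.ip_address(host).is_global:
        raise BlockedURL(url)


def fetch(url, responses, validate_each_hop, max_hops=5):
    """Follow redirects in a fake table: url -> (status, location_or_body)."""
    check_url(url)  # the initial, string-level check
    for _ in range(max_hops):
        status, value = responses[url]
        if status not in (301, 302, 303, 307, 308):
            return url, value
        url = urljoin(url, value)  # resolve the Location header
        if validate_each_hop:
            check_url(url)  # re-validate before following the hop
    raise RuntimeError("too many redirects")


# A public-looking address that redirects to the cloud metadata endpoint.
RESPONSES = {
    "http://93.184.216.34/": (302, "http://169.254.169.254/latest/meta-data/"),
    "http://169.254.169.254/latest/meta-data/": (200, "secret"),
}
```

With validate_each_hop=False the simulated fetch reaches the metadata URL; with validate_each_hop=True the second hop raises BlockedURL before it is followed.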

This is a bypass of a CrewAI-shipped SSRF control (the validator was introduced as a security fix, CVE-2026-2286).

Fix

New crewai_tools/security/safe_requests.py validates at the connection layer, where the actual connection is made:

  • SSRFProtectedAdapter.send() re-runs validate_url on every request. requests.Session.send invokes the adapter once per redirect hop, so each Location is validated before it is followed → closes the redirect arm.
  • _SafeHTTP[S]Connection validate the actual connected peer IP (getpeername()) immediately after connect(). The IP that was authorised is the IP the socket uses → closes the DNS-rebinding gap.
  • safe_get() is a drop-in replacement for requests.get. The CREWAI_TOOLS_ALLOW_UNSAFE_PATHS escape hatch is honored consistently.
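
The connection-level check in the second bullet can be sketched with the standard library alone. This is an illustration of the idea, not the contents of safe_requests.py: assert_safe_peer, connect_checked, and UnsafePeer are hypothetical names, and "not globally routable" is an assumed stand-in for the validator's actual blocklist.

```python
import ipaddress
import socket


class UnsafePeer(Exception):
    """Raised when a connected socket's peer is not a public address."""


def assert_safe_peer(sock: socket.socket) -> None:
    # getpeername() reports the address the socket is actually connected to,
    # regardless of what DNS returned during an earlier validation step.
    peer_ip = sock.getpeername()[0]
    if not ipaddress.ip_address(peer_ip).is_global:
        raise UnsafePeer(peer_ip)


def connect_checked(host: str, port: int, timeout: float = 5.0) -> socket.socket:
    """Open a TCP connection and verify the peer before handing it back."""
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        assert_safe_peer(sock)
    except Exception:
        sock.close()  # never return a socket that failed the check
        raise
    return sock
```

Because the check runs on the connected socket, a hostname that resolved to a public address during validation but to 127.0.0.1 at connect time would still be rejected.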

ScrapeWebsiteTool and ScrapeElementFromWebsiteTool now fetch through safe_get.

Scope

  • Limited to the two confirmed worker-side direct-fetch tools. The vendor-forwarding tools (Jina/Serper/Firecrawl/etc.) hand the URL to a third-party API, so the redirect lands on the vendor's network — a separate concern, not changed here.
  • WebsiteSearchTool/RagTool fetch through the RAG/embedchain loader layer (not a direct requests.get); that path is not covered by this requests-based utility and is flagged for a follow-up audit.

Testing

  • tests/utilities/test_safe_requests.py (new): per-hop redirect re-validation, connection peer guard (private/metadata blocked, public allowed, escape hatch, simulated rebind), adapter mounting.
  • pytest tests/utilities/test_safe_requests.py tests/utilities/test_safe_path.py → 35 passed.
  • ruff check clean.
  • End-to-end smoke: ScrapeWebsiteTool._run(website_url="http://169.254.169.254/latest/meta-data/") is now blocked.

Summary by CodeRabbit

  • New Features
    • Added SSRF-protected HTTP fetching with redirect-aware URL validation, connection-time private/reserved IP blocking (opt-out via an escape hatch), and environment proxy bypassing.
    • Introduced safe_get and create_safe_session helpers for safer, drop-in web requests.
  • Bug Fixes
    • Website scraping tools now consistently use the safer fetching flow when retrieving and parsing pages.
  • Tests
    • Added end-to-end coverage for URL/redirect validation, peer/IP guarding, proxy behavior, and safety overrides.

validate_url checked the URL string but ScrapeWebsiteTool and
ScrapeElementFromWebsiteTool then fetched with requests' default
allow_redirects=True, so a public host that 302-redirected to an
internal address reached it without re-validation. The resolved IPs
were also discarded, leaving a DNS time-of-check/time-of-use gap.

Add crewai_tools.security.safe_requests:
- SSRFProtectedAdapter re-runs validate_url on every send, including
  each redirect hop (Session.send calls the adapter per hop).
- Connections validate the actual connected peer IP at connect time,
  so the IP that was authorised is the IP that is used (closes the
  DNS-rebinding gap).

Route the two direct-fetch scrape tools through safe_get and add tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 4, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 545c7736-daae-4e51-8974-295aa258ffa5

📥 Commits

Reviewing files that changed from the base of the PR and between ce53a34 and 74a7160.

📒 Files selected for processing (1)
  • lib/crewai-tools/src/crewai_tools/security/safe_requests.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • lib/crewai-tools/src/crewai_tools/security/safe_requests.py

📝 Walkthrough

A new safe_requests.py module adds SSRF-protected HTTP fetching with socket-level peer validation and redirect revalidation. Both scraping tools now call safe_get instead of validate_url plus requests.get. Tests cover redirects, proxy handling, and peer-IP blocking.

Changes

SSRF-safe HTTP fetching

| Layer / File(s) | Summary |
| --- | --- |
| SSRF-safe HTTP core implementation (lib/crewai-tools/src/crewai_tools/security/safe_requests.py) | Defines _assert_safe_peer, safe urllib3 connection and pool classes, SSRFProtectedAdapter, create_safe_session(), and safe_get() with initial URL validation, redirect-hop revalidation, and socket peer checks. |
| Scraping tools switched to safe_get (lib/crewai-tools/src/crewai_tools/tools/scrape_website_tool/scrape_website_tool.py, lib/crewai-tools/src/crewai_tools/tools/scrape_element_from_website/scrape_element_from_website.py) | Both tools remove direct requests/validate_url usage and replace the _run fetch path with safe_get(...) while keeping the same request options. |
| Tests for redirect revalidation and peer guarding (lib/crewai-tools/tests/utilities/test_safe_requests.py) | Adds localhost test scaffolding plus cases for adapter revalidation, session mounting, proxy disabling, private/public IP peer checks, the escape hatch, and a simulated rebinding peer swap. |

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 37.50% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main SSRF fix: redirect re-validation and peer-IP pinning. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@github-actions github-actions Bot added the size/L label Jun 4, 2026
@theCyberTech theCyberTech marked this pull request as ready for review June 24, 2026 06:35
Copilot AI review requested due to automatic review settings June 24, 2026 06:35

Copilot AI left a comment

Contributor

Pull request overview

This PR hardens crewai-tools’ direct-fetch scraping tools against SSRF bypasses by enforcing URL validation across redirects and validating the actual connected peer IP to close DNS-rebinding TOCTOU gaps.

Changes:

  • Added crewai_tools.security.safe_requests with a requests adapter + connection classes that re-validate each redirect hop and verify the socket peer IP after connect.
  • Switched ScrapeWebsiteTool and ScrapeElementFromWebsiteTool to fetch via safe_get() instead of requests.get(...).
  • Added focused unit tests for redirect re-validation, peer-IP guarding, and proxy/environment behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| lib/crewai-tools/src/crewai_tools/security/safe_requests.py | New SSRF-hardened requests session/adapter and peer-IP validation logic. |
| lib/crewai-tools/src/crewai_tools/tools/scrape_website_tool/scrape_website_tool.py | Routes website scraping fetches through safe_get. |
| lib/crewai-tools/src/crewai_tools/tools/scrape_element_from_website/scrape_element_from_website.py | Routes element scraping fetches through safe_get. |
| lib/crewai-tools/tests/utilities/test_safe_requests.py | Adds unit tests covering redirect-hop validation and peer-IP enforcement. |

Comment thread lib/crewai-tools/src/crewai_tools/security/safe_requests.py Outdated
Comment thread lib/crewai-tools/src/crewai_tools/security/safe_requests.py

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/crewai-tools/src/crewai_tools/security/safe_requests.py`:
- Around line 49-52: The SSRF guard in safe_requests.py currently fails open
when sock.getpeername() raises OSError, letting an unverified connection
continue. Update the peer inspection logic in the function containing the
getpeername() check to fail closed instead of returning, so any inability to
inspect the peer address blocks the request and prevents the connection from
proceeding.
- Around line 68-71: The _SafeHTTPSConnection.connect() flow is checking the
peer too late because super().connect() already performs the TLS setup, so move
the safety check earlier by overriding _new_conn() in _SafeHTTPSConnection
(matching the HTTP path) and calling _assert_safe_peer() on the raw socket
before any handshake or protocol traffic occurs.
- Around line 94-121: The SSRFProtectedAdapter still allows traffic to go
through caller-supplied proxies, which bypasses the safe pool and IP pinning.
Update SSRFProtectedAdapter to reject any proxy use by overriding the
proxy-selection path (for example, proxy_manager_for or equivalent) so requests
with proxies or session.proxies fail closed, and keep validate_url in send as
the direct-request guard.
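
The first finding above (fail closed when the peer cannot be inspected) can be sketched as follows. This is an assumed illustration, not the PR's code: peer_ip_or_block and UnsafePeer are hypothetical names.

```python
import socket


class UnsafePeer(Exception):
    """Raised when the peer address is unsafe or cannot be determined."""


def peer_ip_or_block(sock: socket.socket) -> str:
    try:
        return sock.getpeername()[0]
    except OSError as exc:
        # Fail closed: a peer that cannot be inspected is treated as unsafe,
        # rather than silently returning and letting the request proceed.
        raise UnsafePeer("could not determine peer address") from exc
```

An unconnected socket is a convenient way to exercise the error path, since getpeername() raises OSError on it.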
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 1a379643-6064-4174-b2f1-ba7d860c65b8

📥 Commits

Reviewing files that changed from the base of the PR and between a046e6a and 7a39a30.

📒 Files selected for processing (4)
  • lib/crewai-tools/src/crewai_tools/security/safe_requests.py
  • lib/crewai-tools/src/crewai_tools/tools/scrape_element_from_website/scrape_element_from_website.py
  • lib/crewai-tools/src/crewai_tools/tools/scrape_website_tool/scrape_website_tool.py
  • lib/crewai-tools/tests/utilities/test_safe_requests.py

Comment thread lib/crewai-tools/src/crewai_tools/security/safe_requests.py Outdated
Comment on lines +68 to +71
class _SafeHTTPSConnection(HTTPSConnection):
    def connect(self) -> None:
        super().connect()
        _assert_safe_peer(self.sock)

🔒 Security & Privacy | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
python - <<'PY'
import inspect
from urllib3.connection import HTTPSConnection

print(inspect.getsource(HTTPSConnection.connect))
PY

Repository: crewAIInc/crewAI

Length of output: 6196


🏁 Script executed:

#!/bin/bash
set -euo pipefail

python3 - <<'PY'
import inspect
import urllib3.connection as c
from pathlib import Path

print("HTTPSConnection._new_conn exists:", hasattr(c.HTTPSConnection, "_new_conn"))
print("HTTPConnection._new_conn exists:", hasattr(c.HTTPConnection, "_new_conn"))
print("HTTPSConnection.connect source:\n")
print(inspect.getsource(c.HTTPSConnection.connect))
print("\nHTTPConnection._new_conn source:\n")
print(inspect.getsource(c.HTTPConnection._new_conn))
print("\nHTTPSConnection._new_conn source:\n")
print(inspect.getsource(c.HTTPSConnection._new_conn))
PY

printf '\n--- safe_requests.py excerpt ---\n'
nl -ba lib/crewai-tools/src/crewai_tools/security/safe_requests.py | sed -n '1,180p'

Repository: crewAIInc/crewAI

Length of output: 8412


🏁 Script executed:

#!/bin/bash
set -euo pipefail

python3 - <<'PY'
from pathlib import Path

path = Path("lib/crewai-tools/src/crewai_tools/security/safe_requests.py")
text = path.read_text()
for i, line in enumerate(text.splitlines(), 1):
    if 1 <= i <= 220:
        print(f"{i:4d}: {line}")
PY

Repository: crewAIInc/crewAI

Length of output: 6524


🏁 Script executed:

#!/bin/bash
set -euo pipefail

python3 - <<'PY'
import urllib3.connection as c
import inspect

print("HTTPConnection.connect exists:", hasattr(c.HTTPConnection, "connect"))
print("HTTPConnection.connect source:")
print(inspect.getsource(c.HTTPConnection.connect))
PY

Repository: crewAIInc/crewAI

Length of output: 862


Move the HTTPS peer check before the TLS handshake. _SafeHTTPSConnection.connect() runs the _assert_safe_peer() check after super().connect(), and urllib3’s HTTPS connect path performs the TLS wrap/handshake inside connect(). That leaves a rebound/private peer able to receive the ClientHello. Override _new_conn() here, like the HTTP path, so the raw socket is checked before any protocol traffic.
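
The ordering this comment asks for can be sketched with the standard library rather than urllib3. This is an assumption-laden illustration, not the suggested patch: open_tls, reject_non_global, and the injectable wrap parameter are hypothetical, and the "globally routable" test stands in for _assert_safe_peer's real policy. The point is only that the peer check runs on the raw TCP socket, before any TLS bytes are written.

```python
import ipaddress
import socket
import ssl


def reject_non_global(sock: socket.socket) -> None:
    # Stand-in peer policy: refuse anything that is not a public address.
    if not ipaddress.ip_address(sock.getpeername()[0]).is_global:
        raise PermissionError("blocked peer")


def open_tls(host, port, check_peer, wrap=None, timeout=5.0):
    """Connect, check the raw socket, and only then start the TLS handshake."""
    raw = socket.create_connection((host, port), timeout=timeout)
    try:
        check_peer(raw)  # plain TCP socket; no ClientHello has been sent yet
        if wrap is None:
            wrap = ssl.create_default_context().wrap_socket
        # The handshake happens inside wrap(), after the check has passed.
        return wrap(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

Passing a recording function as wrap makes it easy to confirm that no handshake is attempted when the peer is rejected.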

Comment thread lib/crewai-tools/src/crewai_tools/security/safe_requests.py
theCyberTech and others added 2 commits June 24, 2026 18:43
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@theCyberTech theCyberTech marked this pull request as draft June 25, 2026 04:05