fix(tools): re-validate redirects and pin peer IP to close SSRF bypass#6038
fix(tools): re-validate redirects and pin peer IP to close SSRF bypass#6038theCyberTech wants to merge 7 commits into
Conversation
validate_url checked the URL string but ScrapeWebsiteTool and ScrapeElementFromWebsiteTool then fetched with requests' default allow_redirects=True, so a public host that 302-redirected to an internal address reached it without re-validation. The resolved IPs were also discarded, leaving a DNS time-of-check/time-of-use gap. Add crewai_tools.security.safe_requests: - SSRFProtectedAdapter re-runs validate_url on every send, including each redirect hop (Session.send calls the adapter per hop). - Connections validate the actual connected peer IP at connect time, so the IP that was authorised is the IP that is used (closes the DNS-rebinding gap). Route the two direct-fetch scrape tools through safe_get and add tests. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughA new ChangesSSRF-safe HTTP fetching
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
This PR hardens crewai-tools’ direct-fetch scraping tools against SSRF bypasses by enforcing URL validation across redirects and validating the actual connected peer IP to close DNS-rebinding TOCTOU gaps.
Changes:
- Added
crewai_tools.security.safe_requestswith arequestsadapter + connection classes that re-validate each redirect hop and verify the socket peer IP after connect. - Switched
ScrapeWebsiteToolandScrapeElementFromWebsiteToolto fetch viasafe_get()instead ofrequests.get(...). - Added focused unit tests for redirect re-validation, peer-IP guarding, and proxy/environment behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| lib/crewai-tools/src/crewai_tools/security/safe_requests.py | New SSRF-hardened requests session/adapter and peer-IP validation logic. |
| lib/crewai-tools/src/crewai_tools/tools/scrape_website_tool/scrape_website_tool.py | Routes website scraping fetches through safe_get. |
| lib/crewai-tools/src/crewai_tools/tools/scrape_element_from_website/scrape_element_from_website.py | Routes element scraping fetches through safe_get. |
| lib/crewai-tools/tests/utilities/test_safe_requests.py | Adds unit tests covering redirect-hop validation and peer-IP enforcement. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/crewai-tools/src/crewai_tools/security/safe_requests.py`:
- Around line 49-52: The SSRF guard in safe_requests.py currently fails open
when sock.getpeername() raises OSError, letting an unverified connection
continue. Update the peer inspection logic in the function containing the
getpeername() check to fail closed instead of returning, so any inability to
inspect the peer address blocks the request and prevents the connection from
proceeding.
- Around line 68-71: The _SafeHTTPSConnection.connect() flow is checking the
peer too late because super().connect() already performs the TLS setup, so move
the safety check earlier by overriding _new_conn() in _SafeHTTPSConnection
(matching the HTTP path) and calling _assert_safe_peer() on the raw socket
before any handshake or protocol traffic occurs.
- Around line 94-121: The SSRFProtectedAdapter still allows traffic to go
through caller-supplied proxies, which bypasses the safe pool and IP pinning.
Update SSRFProtectedAdapter to reject any proxy use by overriding the
proxy-selection path (for example, proxy_manager_for or equivalent) so requests
with proxies or session.proxies fail closed, and keep validate_url in send as
the direct-request guard.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 1a379643-6064-4174-b2f1-ba7d860c65b8
📒 Files selected for processing (4)
lib/crewai-tools/src/crewai_tools/security/safe_requests.pylib/crewai-tools/src/crewai_tools/tools/scrape_element_from_website/scrape_element_from_website.pylib/crewai-tools/src/crewai_tools/tools/scrape_website_tool/scrape_website_tool.pylib/crewai-tools/tests/utilities/test_safe_requests.py
| class _SafeHTTPSConnection(HTTPSConnection): | ||
| def connect(self) -> None: | ||
| super().connect() | ||
| _assert_safe_peer(self.sock) |
There was a problem hiding this comment.
🔒 Security & Privacy | 🟠 Major
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
python - <<'PY'
import inspect
from urllib3.connection import HTTPSConnection
print(inspect.getsource(HTTPSConnection.connect))
PYRepository: crewAIInc/crewAI
Length of output: 6196
🏁 Script executed:
#!/bin/bash
set -euo pipefail
python3 - <<'PY'
import inspect
import urllib3.connection as c
from pathlib import Path
print("HTTPSConnection._new_conn exists:", hasattr(c.HTTPSConnection, "_new_conn"))
print("HTTPConnection._new_conn exists:", hasattr(c.HTTPConnection, "_new_conn"))
print("HTTPSConnection.connect source:\n")
print(inspect.getsource(c.HTTPSConnection.connect))
print("\nHTTPConnection._new_conn source:\n")
print(inspect.getsource(c.HTTPConnection._new_conn))
print("\nHTTPSConnection._new_conn source:\n")
print(inspect.getsource(c.HTTPSConnection._new_conn))
PY
printf '\n--- safe_requests.py excerpt ---\n'
nl -ba lib/crewai-tools/src/crewai_tools/security/safe_requests.py | sed -n '1,180p'Repository: crewAIInc/crewAI
Length of output: 8412
🏁 Script executed:
#!/bin/bash
set -euo pipefail
python3 - <<'PY'
from pathlib import Path
path = Path("lib/crewai-tools/src/crewai_tools/security/safe_requests.py")
text = path.read_text()
for i, line in enumerate(text.splitlines(), 1):
if 1 <= i <= 220:
print(f"{i:4d}: {line}")
PYRepository: crewAIInc/crewAI
Length of output: 6524
🏁 Script executed:
#!/bin/bash
set -euo pipefail
python3 - <<'PY'
import urllib3.connection as c
import inspect
print("HTTPConnection.connect exists:", hasattr(c.HTTPConnection, "connect"))
print("HTTPConnection.connect source:")
print(inspect.getsource(c.HTTPConnection.connect))
PYRepository: crewAIInc/crewAI
Length of output: 862
Move the HTTPS peer check before the TLS handshake. _SafeHTTPSConnection.connect() runs the _assert_safe_peer() check after super().connect(), and urllib3’s HTTPS connect path performs the TLS wrap/handshake inside connect(). That leaves a rebound/private peer able to receive the ClientHello. Override _new_conn() here, like the HTTP path, so the raw socket is checked before any protocol traffic.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lib/crewai-tools/src/crewai_tools/security/safe_requests.py` around lines 68
- 71, The _SafeHTTPSConnection.connect() flow is checking the peer too late
because super().connect() already performs the TLS setup, so move the safety
check earlier by overriding _new_conn() in _SafeHTTPSConnection (matching the
HTTP path) and calling _assert_safe_peer() on the raw socket before any
handshake or protocol traffic occurs.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Summary
validate_url(crewai_tools/security/safe_path.py) inspects the URL string it is handed, butScrapeWebsiteToolandScrapeElementFromWebsiteToolthen fetched withrequests.get(..., allow_redirects=True). Two gaps followed from that decoupling:302 Location: <internal-address>was followed to the internal target without re-validation, turning the tools into an SSRF proxy for the worker's network (incl. RFC1918 ranges and169.254.169.254cloud metadata, which the validator's own blocklist explicitly covers).validate_urlresolved the host viagetaddrinfo, then discarded the IPs and returned the URL string;requestsre-resolved at connect time, so the IP that was checked was not necessarily the IP that was used.This is a bypass of a CrewAI-shipped SSRF control (the validator was introduced as a security fix, CVE-2026-2286).
Fix
New
crewai_tools/security/safe_requests.pyvalidates at the connection layer, where the actual connection is made:SSRFProtectedAdapter.send()re-runsvalidate_urlon every request.requests.Session.sendinvokes the adapter once per redirect hop, so eachLocationis validated before it is followed → closes the redirect arm._SafeHTTP[S]Connectionvalidate the actual connected peer IP (getpeername()) immediately afterconnect(). The IP that was authorised is the IP the socket uses → closes the DNS-rebinding gap.safe_get()is a drop-in replacement forrequests.get. TheCREWAI_TOOLS_ALLOW_UNSAFE_PATHSescape hatch is honored consistently.ScrapeWebsiteToolandScrapeElementFromWebsiteToolnow fetch throughsafe_get.Scope
WebsiteSearchTool/RagToolfetch through the RAG/embedchain loader layer (not a directrequests.get); that path is not covered by thisrequests-based utility and is flagged for a follow-up audit.Testing
tests/utilities/test_safe_requests.py(new): per-hop redirect re-validation, connection peer guard (private/metadata blocked, public allowed, escape hatch, simulated rebind), adapter mounting.pytest tests/utilities/test_safe_requests.py tests/utilities/test_safe_path.py→ 35 passed.ruff checkclean.ScrapeWebsiteTool._run(website_url="http://169.254.169.254/latest/meta-data/")is now blocked.Summary by CodeRabbit
safe_getandcreate_safe_sessionhelpers for safer, drop-in web requests.