fix: stop FileCompressorTool from archiving out-of-tree symlink targets#6249
fix: stop FileCompressorTool from archiving out-of-tree symlink targets#6249theCyberTech wants to merge 6 commits into
Conversation
input_path/output_path were already allow-list validated, but when compressing a directory the per-file members were not. zipfile.write() dereferences symlinks, so a symlink inside an allowed directory pointing at an out-of-tree secret (e.g. ~/.ssh/id_rsa, /etc/passwd) had its contents copied into the archive — a confinement bypass / info-disclosure. - _compress_zip: validate each walked file against the allow-list; skip and record any whose resolved path escapes it. - _compress_tar: filter drops symlink/hardlink members (tar stores links rather than dereferencing, so this prevents shipping an out-of-tree link that resolves at extraction time). - _run: surface a "N item(s) skipped for safety" note rather than swallowing the exclusions. - tests: regression tests asserting the secret content never lands in the zip or tar archive. Also fix a pre-existing broken import that made the whole compressor test module error at collection (empty package __init__; aligned to the module-path import used by sibling tools). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
📝 WalkthroughWalkthroughThe PR updates file compression to track skipped entries, exclude unsafe paths and links from ZIP/TAR archives, and surface skipped-item counts in the tool result. Tests cover symlink exclusion and input/output path confinement for both archive formats. ChangesSecure file compression
Sequence Diagram(s)sequenceDiagram
participant Caller
participant FileCompressorTool._run
participant FileCompressorTool._compress_zip
participant FileCompressorTool._compress_tar
participant validate_file_path
participant zipfile
participant tarfile
Caller->>FileCompressorTool._run: request compression
alt ZIP output
FileCompressorTool._run->>FileCompressorTool._compress_zip: input_path, output_path
FileCompressorTool._compress_zip->>validate_file_path: validate input_path and each file
validate_file_path-->>FileCompressorTool._compress_zip: accepted or rejected
FileCompressorTool._compress_zip->>zipfile: write accepted files
FileCompressorTool._compress_zip-->>FileCompressorTool._run: skipped list
else TAR.GZ output
FileCompressorTool._run->>FileCompressorTool._compress_tar: input_path, output_path, format
FileCompressorTool._compress_tar->>validate_file_path: validate input_path and output_path
FileCompressorTool._compress_tar->>tarfile: add directory with _drop_links filter
FileCompressorTool._compress_tar-->>FileCompressorTool._run: skipped list
end
FileCompressorTool._run-->>Caller: success message with skipped count
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Summary: This PR hardens FileCompressorTool by validating walked ZIP members and excluding tar symlink/hardlink entries to prevent out-of-tree file inclusion. No exploitable security vulnerabilities were identified in the added code.
Risk: Low risk. The changes reduce an existing path-confinement exposure without adding new authentication, authorization, external integration, or public API surfaces.
…pth) Corridor scan flagged output_path reaching the archive writer. _run does validate output_path before dispatch, but _compress_zip/_compress_tar are staticmethods reachable independently of _run — called directly they would write unvalidated. Validate at the sink so every call site is confined to the allow-list, not just the _run entrypoint. Also decouple the symlink tests from the configurable-allow-list feature (PR #6248): chdir into the working dir so the allowed root is cwd, instead of setting CREWAI_TOOLS_ALLOWED_DIRS (which this branch, off main, does not yet support). Adds direct-call tests asserting the sink rejects an out-of-root output path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
There was a problem hiding this comment.
Security Issues
-
Time-of-check to time-of-use (TOCTOU) on zip member writes
During directory compression, each file is validated via validate_file_path(full_path) but the returned resolved path is discarded and zipf.write(full_path, arcname) uses the original path. An attacker can swap a symlink target between validation and write, causing out-of-tree content to be archived despite the check. -
Missing allow-list validation for single-file inputs (zip and tar helpers)
When input_path is a single file and the helper methods are called directly (bypassing _run), input_path is never validated against the allowed directories. This allows archiving arbitrary files outside the allowed directory, violating the path-confinement policy.
Recommendations
- Always use the resolved path returned by validate_file_path(...) for subsequent I/O to eliminate TOCTOU windows.
- Validate input_path with validate_file_path(...) in the single-file branches of both _compress_zip and _compress_tar, and use the resolved value in write/add calls.
There was a problem hiding this comment.
Pull request overview
Hardens FileCompressorTool against a path-confinement bypass where symlinks inside an allowed directory could cause out-of-tree file contents to be archived (notably for zip archives where zipfile.write() dereferences symlinks). Also fixes the compressor test module import so the tests execute in CI.
Changes:
- ZIP compression now validates each walked file’s resolved path against the allowed root and skips unsafe entries while writing the canonical (validated) path to the archive.
- TAR compression now filters out symlink/hardlink members to avoid shipping link entries that could resolve out-of-tree at extraction time.
- Adds security regression tests for symlink handling and validates
input_path/output_pathinside the compression helpers (sink-level defense-in-depth).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
lib/crewai-tools/src/crewai_tools/tools/files_compressor_tool/files_compressor_tool.py |
Validates walked zip members (prevents symlink deref exfiltration), filters tar link entries, and surfaces a “skipped for safety” note. |
lib/crewai-tools/tests/tools/files_compressor_tool_test.py |
Fixes the import path so tests run, and adds regression coverage for symlink-to-outside handling + sink validation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@lib/crewai-tools/src/crewai_tools/tools/files_compressor_tool/files_compressor_tool.py`:
- Around line 87-91: The success message in FilesCompressorTool is using an
overly specific skipped-item reason that can be incorrect for TAR archives.
Update the message logic in FilesCompressorTool around the skipped handling so
it uses a generic “skipped for safety” style description instead of claiming
items resolved outside the allowed directory, and keep the condition based on
the skipped list while avoiding TAR-specific misreporting.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 16f34458-f1e9-4781-b5e6-18a3194ee7cd
📒 Files selected for processing (2)
lib/crewai-tools/src/crewai_tools/tools/files_compressor_tool/files_compressor_tool.pylib/crewai-tools/tests/tools/files_compressor_tool_test.py
What this fixes
FileCompressorToolalready checked the top-levelinput_pathandoutput_path, but it did not re-check each file while walking a directory.That matters for ZIP files.
zipfile.write()follows symlinks. So if an allowed project folder contained a symlink like this:compressing
project/could copy the contents ofoutside.txtinto the ZIP archive. The path looked allowed because the symlink lived inside the project, but the file content came from outside the allowed directory.For TAR files the behavior is different: tar archives store symlinks as links instead of copying the target contents. This PR still drops symlink and hardlink entries so the archive does not ship a link that could point outside the allowed tree when someone extracts it later.
What changed
_compress_zipand_compress_tarvalidate bothinput_pathandoutput_pathat the archive-writing sink, so direct helper calls get the same path checks as_run.Tests
uv run pytest lib/crewai-tools/tests/tools/files_compressor_tool_test.py-> 21 passeduv run ruff check lib/crewai-tools/src/crewai_tools/tools/files_compressor_tool/files_compressor_tool.py lib/crewai-tools/tests/tools/files_compressor_tool_test.py-> passedruff,ruff-format,mypy-> passedNotes
This is related to #6248, but it is independent. This branch was cut from
mainand focuses only on compressor archive safety.