Skip to content

fix: stop FileCompressorTool from archiving out-of-tree symlink targets#6249

Open
theCyberTech wants to merge 6 commits into
mainfrom
fix/compressor-symlink-leak
Open

fix: stop FileCompressorTool from archiving out-of-tree symlink targets#6249
theCyberTech wants to merge 6 commits into
mainfrom
fix/compressor-symlink-leak

Conversation

@theCyberTech

@theCyberTech theCyberTech commented Jun 19, 2026

Copy link
Copy Markdown
Member

What this fixes

FileCompressorTool already checked the top-level input_path and output_path, but it did not re-check each file while walking a directory.

That matters for ZIP files. zipfile.write() follows symlinks. So if an allowed project folder contained a symlink like this:

project/leak.txt -> /some/path/outside/project/outside.txt

compressing project/ could copy the contents of outside.txt into the ZIP archive. The path looked allowed because the symlink lived inside the project, but the file content came from outside the allowed directory.

For TAR files the behavior is different: tar archives store symlinks as links instead of copying the target contents. This PR still drops symlink and hardlink entries so the archive does not ship a link that could point outside the allowed tree when someone extracts it later.

What changed

  • ZIP compression now validates every walked file before writing it to the archive.
  • If a walked file resolves outside the allowed directory, the compressor skips it and reports that something was skipped for safety.
  • TAR compression now drops symlink and hardlink members.
  • _compress_zip and _compress_tar validate both input_path and output_path at the archive-writing sink, so direct helper calls get the same path checks as _run.
  • The compressor tests now import the tool through the same module path as the sibling tool tests, which fixes a pre-existing collection issue.

Tests

  • uv run pytest lib/crewai-tools/tests/tools/files_compressor_tool_test.py -> 21 passed
  • uv run ruff check lib/crewai-tools/src/crewai_tools/tools/files_compressor_tool/files_compressor_tool.py lib/crewai-tools/tests/tools/files_compressor_tool_test.py -> passed
  • Commit hook: ruff, ruff-format, mypy -> passed

Notes

This is related to #6248, but it is independent. This branch was cut from main and focuses only on compressor archive safety.

input_path/output_path were already allow-list validated, but when
compressing a directory the per-file members were not. zipfile.write()
dereferences symlinks, so a symlink inside an allowed directory pointing
at an out-of-tree secret (e.g. ~/.ssh/id_rsa, /etc/passwd) had its
contents copied into the archive — a confinement bypass / info-disclosure.

- _compress_zip: validate each walked file against the allow-list; skip
  and record any whose resolved path escapes it.
- _compress_tar: filter drops symlink/hardlink members (tar stores links
  rather than dereferencing, so this prevents shipping an out-of-tree
  link that resolves at extraction time).
- _run: surface a "N item(s) skipped for safety" note rather than
  swallowing the exclusions.
- tests: regression tests asserting the secret content never lands in the
  zip or tar archive. Also fix a pre-existing broken import that made the
  whole compressor test module error at collection (empty package
  __init__; aligned to the module-path import used by sibling tools).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

The PR updates file compression to track skipped entries, exclude unsafe paths and links from ZIP/TAR archives, and surface skipped-item counts in the tool result. Tests cover symlink exclusion and input/output path confinement for both archive formats.

Changes

Secure file compression

Layer / File(s) Summary
Result message and compressor routing
lib/crewai-tools/src/crewai_tools/tools/files_compressor_tool/files_compressor_tool.py
_run now reads skipped entries from the selected compressor and includes a skipped-item count in the success message.
ZIP path confinement and skipped files
lib/crewai-tools/src/crewai_tools/tools/files_compressor_tool/files_compressor_tool.py
_compress_zip now returns skipped paths, validates sink paths and per-file paths, and leaves rejected files out of the archive.
TAR link filtering
lib/crewai-tools/src/crewai_tools/tools/files_compressor_tool/files_compressor_tool.py
_compress_tar now returns skipped member names, validates sink paths, and drops symlink and hardlink entries with a tar filter before writing.
Symlink and path-confinement tests
lib/crewai-tools/tests/tools/files_compressor_tool_test.py
Tests add a symlink fixture, verify ZIP/TAR omit external links, and assert both compressors reject input and output paths outside the allowed root.

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant FileCompressorTool._run
  participant FileCompressorTool._compress_zip
  participant FileCompressorTool._compress_tar
  participant validate_file_path
  participant zipfile
  participant tarfile

  Caller->>FileCompressorTool._run: request compression
  alt ZIP output
    FileCompressorTool._run->>FileCompressorTool._compress_zip: input_path, output_path
    FileCompressorTool._compress_zip->>validate_file_path: validate input_path and each file
    validate_file_path-->>FileCompressorTool._compress_zip: accepted or rejected
    FileCompressorTool._compress_zip->>zipfile: write accepted files
    FileCompressorTool._compress_zip-->>FileCompressorTool._run: skipped list
  else TAR.GZ output
    FileCompressorTool._run->>FileCompressorTool._compress_tar: input_path, output_path, format
    FileCompressorTool._compress_tar->>validate_file_path: validate input_path and output_path
    FileCompressorTool._compress_tar->>tarfile: add directory with _drop_links filter
    FileCompressorTool._compress_tar-->>FileCompressorTool._run: skipped list
  end
  FileCompressorTool._run-->>Caller: success message with skipped count
Loading

Suggested reviewers

  • lorenzejay
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 30.77% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly matches the main change: hardening FileCompressorTool against archiving out-of-tree symlink targets.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/compressor-symlink-leak

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@corridor-security corridor-security Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary: This PR hardens FileCompressorTool by validating walked ZIP members and excluding tar symlink/hardlink entries to prevent out-of-tree file inclusion. No exploitable security vulnerabilities were identified in the added code.

Risk: Low risk. The changes reduce an existing path-confinement exposure without adding new authentication, authorization, external integration, or public API surfaces.

…pth)

Corridor scan flagged output_path reaching the archive writer. _run does
validate output_path before dispatch, but _compress_zip/_compress_tar are
staticmethods reachable independently of _run — called directly they would
write unvalidated. Validate at the sink so every call site is confined to
the allow-list, not just the _run entrypoint.

Also decouple the symlink tests from the configurable-allow-list feature
(PR #6248): chdir into the working dir so the allowed root is cwd, instead
of setting CREWAI_TOOLS_ALLOWED_DIRS (which this branch, off main, does not
yet support). Adds direct-call tests asserting the sink rejects an
out-of-root output path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@corridor-security corridor-security Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Issues

  • Time-of-check to time-of-use (TOCTOU) on zip member writes
    During directory compression, each file is validated via validate_file_path(full_path) but the returned resolved path is discarded and zipf.write(full_path, arcname) uses the original path. An attacker can swap a symlink target between validation and write, causing out-of-tree content to be archived despite the check.

  • Missing allow-list validation for single-file inputs (zip and tar helpers)
    When input_path is a single file and the helper methods are called directly (bypassing _run), input_path is never validated against the allowed directories. This allows archiving arbitrary files outside the allowed directory, violating the path-confinement policy.

Recommendations

  • Always use the resolved path returned by validate_file_path(...) for subsequent I/O to eliminate TOCTOU windows.
  • Validate input_path with validate_file_path(...) in the single-file branches of both _compress_zip and _compress_tar, and use the resolved value in write/add calls.

@theCyberTech theCyberTech marked this pull request as ready for review June 26, 2026 02:23
Copilot AI review requested due to automatic review settings June 26, 2026 02:23

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Hardens FileCompressorTool against a path-confinement bypass where symlinks inside an allowed directory could cause out-of-tree file contents to be archived (notably for zip archives where zipfile.write() dereferences symlinks). Also fixes the compressor test module import so the tests execute in CI.

Changes:

  • ZIP compression now validates each walked file’s resolved path against the allowed root and skips unsafe entries while writing the canonical (validated) path to the archive.
  • TAR compression now filters out symlink/hardlink members to avoid shipping link entries that could resolve out-of-tree at extraction time.
  • Adds security regression tests for symlink handling and validates input_path/output_path inside the compression helpers (sink-level defense-in-depth).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
lib/crewai-tools/src/crewai_tools/tools/files_compressor_tool/files_compressor_tool.py Validates walked zip members (prevents symlink deref exfiltration), filters tar link entries, and surfaces a “skipped for safety” note.
lib/crewai-tools/tests/tools/files_compressor_tool_test.py Fixes the import path so tests run, and adds regression coverage for symlink-to-outside handling + sink validation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@lib/crewai-tools/src/crewai_tools/tools/files_compressor_tool/files_compressor_tool.py`:
- Around line 87-91: The success message in FilesCompressorTool is using an
overly specific skipped-item reason that can be incorrect for TAR archives.
Update the message logic in FilesCompressorTool around the skipped handling so
it uses a generic “skipped for safety” style description instead of claiming
items resolved outside the allowed directory, and keep the condition based on
the skipped list while avoiding TAR-specific misreporting.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 16f34458-f1e9-4781-b5e6-18a3194ee7cd

📥 Commits

Reviewing files that changed from the base of the PR and between 5d4851e and 5dd1b68.

📒 Files selected for processing (2)
  • lib/crewai-tools/src/crewai_tools/tools/files_compressor_tool/files_compressor_tool.py
  • lib/crewai-tools/tests/tools/files_compressor_tool_test.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants