Fix symlink path traversal in skill archive extraction #6235

Merged
lorenzejay merged 4 commits into main from fix-skill-archive-symlink-traversal
Jun 24, 2026

theCyberTech (Member) commented Jun 19, 2026

Summary

_safe_extractall — the Python < 3.12 fallback used by crewai skills when unpacking a downloaded skill archive (_unpack_archive) — validated each tar member's name against the destination directory but never validated symlink / hardlink targets.

Because the project supports Python >=3.10, this fallback runs on real installs (3.10 / 3.11). The 3.12+ path uses tarfile.extractall(..., filter="data"), which already blocks malicious links; the fallback did not provide the equivalent guard.
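
As a rough illustration of what the 3.12+ path gets for free, the sketch below feeds an archive containing an absolute symlink to `extractall(..., filter="data")`. The helper names `build_symlink_archive` and `data_filter_rejects` are hypothetical and not part of the PR; the check is skipped when the running interpreter has no `tarfile.data_filter`.

```python
import io
import tarfile
import tempfile
from typing import Optional


def build_symlink_archive() -> bytes:
    """Return an in-memory tar holding one symlink with an absolute target."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        link = tarfile.TarInfo("link")
        link.type = tarfile.SYMTYPE
        link.linkname = "/tmp"  # absolute target, refused by the data filter
        tf.addfile(link)
    return buf.getvalue()


def data_filter_rejects(archive: bytes) -> Optional[bool]:
    """True if filter="data" refuses the archive, None if the filter is missing."""
    if not hasattr(tarfile, "data_filter"):
        return None  # interpreter without extraction filters: fallback territory
    with tempfile.TemporaryDirectory() as tmp:
        with tarfile.open(fileobj=io.BytesIO(archive)) as tf:
            try:
                tf.extractall(tmp, filter="data")
            except tarfile.FilterError:
                return True
    return False
```

A `None` result marks exactly the situation the fallback has to cover on its own.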

Impact

A malicious skill tarball can:

  1. plant a symlink whose target escapes dest (e.g. link -> /home/user/.ssh), then
  2. include a regular member written through that link (link/authorized_keys).

Every member name resolves inside dest, so the old check passed — but extraction creates the symlink and then writes through it, landing the payload outside dest. Classic symlink-extraction path traversal / arbitrary file write from an externally-sourced archive.
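
A minimal sketch of the two-member archive described above, together with a name-only check in the spirit of the pre-patch validation. `build_malicious_archive` and `names_stay_inside` are hypothetical names for illustration; the real pre-patch code may differ in detail. Nothing is extracted here, the sketch only inspects member names.

```python
import io
import os
import tarfile


def build_malicious_archive() -> bytes:
    """Build the symlink-plus-write-through archive entirely in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        # Member 1: a symlink whose target lies outside any destination.
        link = tarfile.TarInfo("link")
        link.type = tarfile.SYMTYPE
        link.linkname = "/home/user/.ssh"
        tf.addfile(link)

        # Member 2: a regular file whose name passes *through* the symlink.
        payload = b"attacker-controlled content\n"
        info = tarfile.TarInfo("link/authorized_keys")
        info.size = len(payload)
        tf.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def names_stay_inside(archive: bytes, dest: str) -> bool:
    """Name-only check: True when every member name resolves under dest."""
    root = os.path.realpath(dest)
    with tarfile.open(fileobj=io.BytesIO(archive)) as tf:
        for member in tf.getmembers():
            target = os.path.realpath(os.path.join(root, member.name))
            if os.path.commonpath([root, target]) != root:
                return False
    return True
```

Before extraction the symlink does not exist on disk, so both names resolve under `dest` and the name-only check accepts the archive.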

Fix

_safe_extractall now, for symlink/hardlink members, rejects:

  • absolute link targets, and
  • any link target that resolves outside the destination directory

(anchored correctly: hardlink names are relative to the archive root, symlink targets relative to the member's own directory). This mirrors the filter="data" protection on Python 3.12+.
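
The two rejection rules and the anchoring can be sketched as below. This is an illustration under stated assumptions, not the code merged in `main.py`: the function name `check_link_target` and the exact error messages are hypothetical.

```python
import os
import tarfile


def check_link_target(member: tarfile.TarInfo, dest: str) -> None:
    """Raise ValueError if a symlink/hardlink member points outside dest."""
    if not (member.issym() or member.islnk()):
        return  # regular files and directories are covered by the name check

    if os.path.isabs(member.linkname):
        raise ValueError(f"absolute link target: {member.name} -> {member.linkname}")

    root = os.path.realpath(dest)
    if member.islnk():
        # Hardlink names are relative to the archive root.
        anchor = root
    else:
        # Symlink targets are relative to the directory holding the link.
        anchor = os.path.dirname(os.path.join(root, member.name))

    resolved = os.path.realpath(os.path.join(anchor, member.linkname))
    if os.path.commonpath([root, resolved]) != root:
        raise ValueError(f"link escaping destination: {member.name} -> {member.linkname}")
```

Note how the anchor matters: a symlink at `a/b/link` with target `../x` stays inside `dest`, while a hardlink with the same `linkname` resolves one level above the archive root and is rejected.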

The zip fallback (_safe_extract_zip) is unaffected — zipfile.extractall does not recreate stored symlinks as links.
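
The zip claim can be checked with a small experiment, sketched here with the hypothetical helper `extract_zip_with_symlink_entry`. It stores an entry whose Unix mode bits mark it as a symlink and then looks at what CPython's `zipfile.extractall` actually writes.

```python
import io
import os
import stat
import tempfile
import zipfile


def extract_zip_with_symlink_entry() -> bool:
    """Extract a zip entry flagged as a symlink; report whether a link appears."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        info = zipfile.ZipInfo("link")
        info.create_system = 3  # mark the entry as created on Unix
        info.external_attr = (stat.S_IFLNK | 0o777) << 16  # symlink mode bits
        zf.writestr(info, "/home/user/.ssh")  # link target stored as entry data

    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
            zf.extractall(tmp)
        return os.path.islink(os.path.join(tmp, "link"))
```

The entry comes out as a regular file whose content is the target string, so there is no link to write through.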

Tests

lib/cli/tests/skills/test_safe_extract.py:

  • absolute escaping symlink + write-through file → rejected, nothing written outside dest
  • relative (../../) escaping symlink → rejected
  • benign in-tree symlink → allowed
  • ordinary archive → extracts correctly

All 4 pass; the two malicious-link tests fail against the pre-patch code, confirming they are genuine regression guards.
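
For readers who want to reproduce these cases, an in-memory archive helper might look like the following. It is a guess at the shape of the test module's `_tar_from_members` helper based on how the review comments use it; the name `tar_from_members` and the body are assumptions, not the code in `test_safe_extract.py`.

```python
import io
import tarfile
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def tar_from_members(
    build: Callable[[tarfile.TarFile], None],
) -> Iterator[tarfile.TarFile]:
    """Write members via build(), then yield the same archive opened for reading."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as writer:
        build(writer)  # caller adds whatever members the test needs
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as reader:
        yield reader


def add_benign_symlink(tf: tarfile.TarFile) -> None:
    """Example builder: a symlink alias.txt -> real.txt inside the archive."""
    link = tarfile.TarInfo("alias.txt")
    link.type = tarfile.SYMTYPE
    link.linkname = "real.txt"
    tf.addfile(link)
```

Keeping the archive in memory means no fixture files need to be checked in, and each test states its malicious or benign members inline.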

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced archive extraction security to prevent malicious archives from writing files outside the intended destination directory via symlink and hardlink path traversal attacks.
  • Tests

    • Added regression tests to verify archive extraction security protections work correctly.

`_safe_extractall` (the Python < 3.12 fallback used by `crewai skills`
archive unpacking) validated each member's *name* against the destination
but never validated symlink/hardlink *targets*. A malicious skill tarball
could plant a symlink escaping the destination (e.g. `link -> /home/user/.ssh`)
followed by a regular member written through it (`link/authorized_keys`),
escaping `dest` even though every member name resolves inside it — the
classic symlink-extraction traversal.

The 3.12+ path (`extractall(..., filter="data")`) already blocks this; the
fallback now mirrors it by rejecting absolute link targets and any link
target that resolves outside the destination directory.

Adds regression tests covering absolute and relative escaping symlinks plus
benign in-tree symlinks and ordinary archives.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 19, 2026 04:58
coderabbitai (Bot) commented Jun 19, 2026

Walkthrough

_safe_extractall in main.py is hardened to also validate symlink and hardlink targets, rejecting both absolute link targets and relative targets whose resolved path escapes the destination directory. A new test module test_safe_extract.py adds four regression tests with in-memory tar archives covering both blocked and allowed extraction scenarios.

Changes

Symlink/hardlink path traversal hardening in _safe_extractall

| Layer / File(s) | Summary |
|---|---|
| `_safe_extractall` link-target validation (`lib/cli/src/crewai_cli/experimental/skills/main.py`) | Adds checks to reject symlink/hardlink members with absolute link targets or relative targets that resolve outside `dest`; expands docstring to describe the symlink-escape scenario and its mitigation. |
| Regression tests for path-traversal blocking (`lib/cli/tests/skills/test_safe_extract.py`) | Introduces `_tar_from_members` helper and four pytest functions: absolute-symlink escape blocked, relative-symlink escape (`../../`) blocked, benign relative symlink allowed, and benign regular-file archive allowed. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Fix symlink path traversal in skill archive extraction' directly and accurately summarizes the main security fix: preventing symlink-based path traversal attacks in the archive extraction function. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

Copilot AI (Contributor) left a comment

Pull request overview

This PR hardens the skill archive extraction path used on Python < 3.12 by extending _safe_extractall to validate symlink/hardlink targets (not just member names), preventing link-based path traversal that could otherwise enable writes outside the intended destination directory.

Changes:

  • Add link-target validation for symlink and hardlink tar members in the Python < 3.12 extraction fallback.
  • Add regression tests covering escaping and benign symlink scenarios, plus baseline extraction behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| `lib/cli/src/crewai_cli/experimental/skills/main.py` | Strengthens `_safe_extractall` with link-target validation to prevent symlink/hardlink traversal on Python < 3.12. |
| `lib/cli/tests/skills/test_safe_extract.py` | Adds regression tests ensuring malicious escaping symlinks are rejected and benign archives still extract correctly. |

Comment thread lib/cli/src/crewai_cli/experimental/skills/main.py
Comment thread lib/cli/tests/skills/test_safe_extract.py

coderabbitai (Bot) left a comment

🧹 Nitpick comments (2)
lib/cli/tests/skills/test_safe_extract.py (2)

70-88: 💤 Low value

Consider verifying the symlink was actually created.

The test confirms real.txt extracts correctly but doesn't assert that alias.txt exists or is a symlink. Adding an assertion would strengthen the test's coverage of the "allow" path for symlinks.

```diff
     assert (dest / "real.txt").read_bytes() == b"hi"
+    assert (dest / "alias.txt").is_symlink()
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/cli/tests/skills/test_safe_extract.py` around lines 70 - 88, The
test_allows_benign_relative_symlink test only verifies that the real.txt file
was extracted with correct content but does not verify that the alias.txt
symlink was actually created during extraction. Add assertions after the
_safe_extractall call to verify that the alias.txt symlink exists at the
destination path and confirm it is actually a symbolic link (using is_symlink()
method) and that it points to the correct target (real.txt).

1-107: ⚡ Quick win

Consider adding hardlink test coverage.

The implementation validates both issym() and islnk() (hardlinks) with different anchor semantics, but only symlinks are tested. A hardlink escape test would confirm the member.islnk() branch (anchored to archive root) behaves correctly.

```python
def test_blocks_hardlink_escaping_destination(tmp_path: Path) -> None:
    """A hardlink whose target escapes dest is rejected."""
    dest = tmp_path / "dest"
    dest.mkdir()

    def build(tf: tarfile.TarFile) -> None:
        link = tarfile.TarInfo("escape")
        link.type = tarfile.LNKTYPE
        link.linkname = "../outside.txt"  # escapes archive root
        tf.addfile(link)

    with _tar_from_members(build) as tf:
        with pytest.raises(ValueError, match="escaping destination"):
            _safe_extractall(tf, dest)
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/cli/tests/skills/test_safe_extract.py` around lines 1 - 107, Add a new
test function named test_blocks_hardlink_escaping_destination after the existing
symlink tests to validate that hardlinks with escaping targets are properly
rejected. The test should create a tar archive with a hardlink member (using
tarfile.LNKTYPE) whose linkname escapes the archive root, call _safe_extractall
with the archive and destination path, and assert that a ValueError is raised
with a message matching "escaping destination". This ensures the hardlink
validation branch (member.islnk()) in the _safe_extractall function works
correctly with its archive-root-based anchor semantics, matching the coverage
already provided for symlinks.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 46717317-8e7d-4b5f-9816-ac6bf661dc83

📥 Commits

Reviewing files that changed from the base of the PR and between 854c67d and 32d2da1.

📒 Files selected for processing (2)
  • lib/cli/src/crewai_cli/experimental/skills/main.py
  • lib/cli/tests/skills/test_safe_extract.py

github-actions (Bot) added size/L and removed size/M labels Jun 24, 2026
lorenzejay merged commit fac3e35 into main Jun 24, 2026
55 checks passed
lorenzejay deleted the fix-skill-archive-symlink-traversal branch June 24, 2026 15:50