Fix symlink path traversal in skill archive extraction by theCyberTech · Pull Request #6235 · crewAIInc/crewAI

theCyberTech · 2026-06-19T04:58:01Z

Summary

_safe_extractall — the Python < 3.12 fallback used by crewai skills when unpacking a downloaded skill archive (_unpack_archive) — validated each tar member's name against the destination directory but never validated symlink / hardlink targets.

Because the project supports Python >=3.10, this fallback runs on real installs (3.10 / 3.11). The 3.12+ path uses tarfile.extractall(..., filter="data"), which already blocks malicious links; the fallback did not provide the equivalent guard.
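A minimal sketch of the version split described above. It assumes feature detection via `tarfile.data_filter` rather than whatever check the project actually uses; `extract_with_best_guard` and the `fallback` callable are hypothetical names. Because the `data` filter was also backported to some 3.10 / 3.11 security releases, feature detection and a plain version comparison can take different branches on those interpreters.

```python
import tarfile
from typing import Callable


def extract_with_best_guard(
    tf: tarfile.TarFile,
    dest: str,
    fallback: Callable[[tarfile.TarFile, str], None],
) -> str:
    """Use the stdlib "data" filter when present, else a hand-written guard.

    Returns a label naming the branch taken (illustration only).
    """
    if hasattr(tarfile, "data_filter"):
        # The stdlib filter rejects absolute and escaping link targets itself.
        tf.extractall(dest, filter="data")
        return "data-filter"
    # Older interpreters: the caller-supplied fallback must validate
    # both member names and link targets before extracting.
    fallback(tf, dest)
    return "fallback"
```

The point of the split is that the fallback branch has to reproduce every check the filter performs; any check it omits is a gap only on older interpreters.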

Impact

A malicious skill tarball can:

- plant a symlink whose target escapes dest (e.g. link -> /home/user/.ssh), then
- include a regular member written through that link (link/authorized_keys).

Every member name resolves inside dest, so the old check passed — but extraction creates the symlink and then writes through it, landing the payload outside dest. Classic symlink-extraction path traversal / arbitrary file write from an externally-sourced archive.
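To make the gap concrete, the sketch below builds such an archive in memory and runs a name-only containment check over it. `build_malicious_archive` and `names_stay_inside` are hypothetical helpers written for illustration, not the project's code; nothing is extracted to disk.

```python
import io
import os
import tarfile


def build_malicious_archive() -> bytes:
    """Build an in-memory tar: an escaping symlink plus a write-through member."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        link = tarfile.TarInfo("link")
        link.type = tarfile.SYMTYPE
        link.linkname = "/home/user/.ssh"  # target lies outside any dest
        tf.addfile(link)

        payload = b"attacker-key\n"
        member = tarfile.TarInfo("link/authorized_keys")
        member.size = len(payload)
        tf.addfile(member, io.BytesIO(payload))
    return buf.getvalue()


def names_stay_inside(archive: bytes, dest: str) -> bool:
    """Name-only check: True if every member *name* resolves inside dest."""
    root = os.path.abspath(dest)
    with tarfile.open(fileobj=io.BytesIO(archive)) as tf:
        for m in tf.getmembers():
            target = os.path.abspath(os.path.join(root, m.name))
            if os.path.commonpath([root, target]) != root:
                return False
    return True
```

Under these assumptions `names_stay_inside(build_malicious_archive(), "/tmp/dest")` reports the archive as safe, because both `link` and `link/authorized_keys` are lexically inside the destination; only the link target reveals the escape.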

Fix

_safe_extractall now, for symlink/hardlink members, rejects:

- absolute link targets, and
- any link target that resolves outside the destination directory.

Link targets are anchored the way tar interprets them: hardlink names are relative to the archive root, and symlink targets are relative to the member's own directory. This mirrors the filter="data" protection on Python 3.12+.
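The two rejection rules and the anchoring difference can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the merged implementation: `check_link_target` is a hypothetical name, the error wording is modelled on the "escaping destination" message matched in the review's suggested test, and the check is purely lexical (it does not resolve symlinks that already exist on disk).

```python
import os
import tarfile


def check_link_target(member: tarfile.TarInfo, dest: str) -> None:
    """Raise ValueError if a link member's target is absolute or escapes dest."""
    if not (member.issym() or member.islnk()):
        return  # regular files and directories are checked by name elsewhere

    linkname = member.linkname
    if os.path.isabs(linkname):
        raise ValueError(f"absolute link target: {linkname!r}")

    root = os.path.abspath(dest)
    if member.issym():
        # Symlink targets are relative to the directory holding the link.
        anchor = os.path.join(root, os.path.dirname(member.name))
    else:
        # Hardlink targets name another member, relative to the archive root.
        anchor = root

    resolved = os.path.abspath(os.path.join(anchor, linkname))
    if os.path.commonpath([root, resolved]) != root:
        raise ValueError(f"link target escaping destination: {linkname!r}")
```

With this anchoring, a symlink `sub/alias -> ../real.txt` stays inside the destination and is allowed, while a hardlink `escape -> ../outside.txt` is rejected because hardlink targets are measured from the archive root.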

The zip fallback (_safe_extract_zip) is unaffected — zipfile.extractall does not recreate stored symlinks as links.
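The zip behaviour can be checked with a small experiment. The sketch below stores an entry whose Unix mode bits mark it as a symlink, extracts it with `zipfile.extractall`, and inspects what lands on disk; `extract_zip_symlink_entry` is a hypothetical helper, and the observation assumes a POSIX system with the standard-library `zipfile` module.

```python
import io
import os
import stat
import tempfile
import zipfile


def extract_zip_symlink_entry() -> tuple[bool, bytes]:
    """Extract a zip entry flagged as a symlink; report (is_link, file_bytes)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        info = zipfile.ZipInfo("link")
        info.create_system = 3  # mark the entry as created on Unix
        # Upper 16 bits of external_attr carry the Unix mode; flag a symlink.
        info.external_attr = (stat.S_IFLNK | 0o777) << 16
        zf.writestr(info, "/home/user/.ssh")  # the would-be link target

    with tempfile.TemporaryDirectory() as dest:
        with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
            zf.extractall(dest)
        path = os.path.join(dest, "link")
        with open(path, "rb") as fh:
            return os.path.islink(path), fh.read()
```

The entry comes out as a regular file whose contents are the stored target string, so a later `link/...` member has no link to write through.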

Tests

lib/cli/tests/skills/test_safe_extract.py:

- absolute escaping symlink + write-through file → rejected, nothing written outside dest
- relative (../../) escaping symlink → rejected
- benign in-tree symlink → allowed
- ordinary archive → extracts correctly

All 4 pass; the two malicious-link tests fail against the pre-patch code, confirming they are genuine regression guards.

Summary by CodeRabbit

Bug Fixes
- Enhanced archive extraction security to prevent malicious archives from writing files outside the intended destination directory via symlink and hardlink path traversal attacks.
Tests
- Added regression tests to verify archive extraction security protections work correctly.

`_safe_extractall` (the Python < 3.12 fallback used by `crewai skills` archive unpacking) validated each member's *name* against the destination but never validated symlink/hardlink *targets*. A malicious skill tarball could plant a symlink escaping the destination (e.g. `link -> /home/user/.ssh`) followed by a regular member written through it (`link/authorized_keys`), escaping `dest` even though every member name resolves inside it: the classic symlink-extraction traversal. The 3.12+ path (`extractall(..., filter="data")`) already blocks this; the fallback now mirrors it by rejecting absolute link targets and any link target that resolves outside the destination directory. Adds regression tests covering absolute and relative escaping symlinks plus benign in-tree symlinks and ordinary archives.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

coderabbitai · 2026-06-19T04:58:22Z

Walkthrough

_safe_extractall in main.py is hardened to also validate symlink and hardlink targets, rejecting both absolute link targets and relative targets whose resolved path escapes the destination directory. A new test module test_safe_extract.py adds four regression tests with in-memory tar archives covering both blocked and allowed extraction scenarios.

Changes

Symlink/hardlink path traversal hardening in _safe_extractall

Layer / File(s)	Summary
`_safe_extractall` link-target validation `lib/cli/src/crewai_cli/experimental/skills/main.py`	Adds checks to reject symlink/hardlink members with absolute link targets or relative targets that resolve outside `dest`; expands docstring to describe the symlink-escape scenario and its mitigation.
Regression tests for path-traversal blocking `lib/cli/tests/skills/test_safe_extract.py`	Introduces `_tar_from_members` helper and four pytest functions: absolute-symlink escape blocked, relative-symlink escape (`../../`) blocked, benign relative symlink allowed, and benign regular-file archive allowed.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Fix symlink path traversal in skill archive extraction' directly and accurately summarizes the main security fix: preventing symlink-based path traversal attacks in the archive extraction function.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

Copilot

Pull request overview

This PR hardens the skill archive extraction path used on Python < 3.12 by extending _safe_extractall to validate symlink/hardlink targets (not just member names), preventing link-based path traversal that could otherwise enable writes outside the intended destination directory.

Changes:

- Add link-target validation for symlink and hardlink tar members in the Python < 3.12 extraction fallback.
- Add regression tests covering escaping and benign symlink scenarios, plus baseline extraction behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
`lib/cli/src/crewai_cli/experimental/skills/main.py`	Strengthens `_safe_extractall` with link-target validation to prevent symlink/hardlink traversal on Python < 3.12.
`lib/cli/tests/skills/test_safe_extract.py`	Adds regression tests ensuring malicious escaping symlinks are rejected and benign archives still extract correctly.

coderabbitai

🧹 Nitpick comments (2)

lib/cli/tests/skills/test_safe_extract.py (2)

70-88: 💤 Low value

Consider verifying the symlink was actually created.

The test confirms real.txt extracts correctly but doesn't assert that alias.txt exists or is a symlink. Adding an assertion would strengthen the test's coverage of the "allow" path for symlinks.

     assert (dest / "real.txt").read_bytes() == b"hi"
+    assert (dest / "alias.txt").is_symlink()

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/cli/tests/skills/test_safe_extract.py` around lines 70 - 88, The
test_allows_benign_relative_symlink test only verifies that the real.txt file
was extracted with correct content but does not verify that the alias.txt
symlink was actually created during extraction. Add assertions after the
_safe_extractall call to verify that the alias.txt symlink exists at the
destination path and confirm it is actually a symbolic link (using is_symlink()
method) and that it points to the correct target (real.txt).

1-107: ⚡ Quick win

Consider adding hardlink test coverage.

The implementation validates both issym() and islnk() (hardlinks) with different anchor semantics, but only symlinks are tested. A hardlink escape test would confirm the member.islnk() branch (anchored to archive root) behaves correctly.

def test_blocks_hardlink_escaping_destination(tmp_path: Path) -> None:
    """A hardlink whose target escapes dest is rejected."""
    dest = tmp_path / "dest"
    dest.mkdir()

    def build(tf: tarfile.TarFile) -> None:
        link = tarfile.TarInfo("escape")
        link.type = tarfile.LNKTYPE
        link.linkname = "../outside.txt"  # escapes archive root
        tf.addfile(link)

    with _tar_from_members(build) as tf:
        with pytest.raises(ValueError, match="escaping destination"):
            _safe_extractall(tf, dest)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/cli/tests/skills/test_safe_extract.py` around lines 1 - 107, Add a new
test function named test_blocks_hardlink_escaping_destination after the existing
symlink tests to validate that hardlinks with escaping targets are properly
rejected. The test should create a tar archive with a hardlink member (using
tarfile.LNKTYPE) whose linkname escapes the archive root, call _safe_extractall
with the archive and destination path, and assert that a ValueError is raised
with a message matching "escaping destination". This ensures the hardlink
validation branch (member.islnk()) in the _safe_extractall function works
correctly with its archive-root-based anchor semantics, matching the coverage
already provided for symlinks.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 854c67d and 32d2da1.

📒 Files selected for processing (2)

lib/cli/src/crewai_cli/experimental/skills/main.py
lib/cli/tests/skills/test_safe_extract.py

Copilot AI review requested due to automatic review settings June 19, 2026 04:58

github-actions Bot added the size/M label Jun 19, 2026

Copilot started reviewing on behalf of theCyberTech June 19, 2026 04:58 View session

Copilot AI reviewed Jun 19, 2026

Comment thread lib/cli/src/crewai_cli/experimental/skills/main.py

Comment thread lib/cli/tests/skills/test_safe_extract.py

coderabbitai Bot reviewed Jun 19, 2026

theCyberTech added 2 commits June 24, 2026 13:37

Merge branch 'main' into fix-skill-archive-symlink-traversal

6c2e441

Harden skill cache archive extraction

580878d

github-actions Bot added size/L and removed size/M labels Jun 24, 2026

Reject special skill archive members

f2e5c31

lorenzejay approved these changes Jun 24, 2026

lorenzejay merged commit fac3e35 into main Jun 24, 2026
55 checks passed

lorenzejay deleted the fix-skill-archive-symlink-traversal branch June 24, 2026 15:50

lorenzejay merged 4 commits into main from fix-skill-archive-symlink-traversal
