feat(chat): support zip uploads as virtual folders in the copilot VFS #5252

Open
waleedlatif1 wants to merge 7 commits into staging from zip-upload-support

Conversation

@waleedlatif1

Collaborator

Summary

  • Accept .zip chat attachments (previously rejected with "Unsupported file type: zip")
  • Present each uploaded archive as a virtual folder in the copilot VFS — the agent lists entries with glob("uploads/x.zip/*"), reads them with read("uploads/x.zip/<path>"), and greps inside them
  • Store the archive once; extract entries lazily on read, reusing the existing file-parsers and the zip-bomb / zip-slip / symlink guards (factored out of the file-manage decompress route into lib/uploads/archive.ts)
  • The upload message shows a capped inline file tree so the agent sees contents without a glob round-trip
  • No changes to the Go copilot service — it only ever sees normal VFS reads
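The capped inline file tree mentioned above could be rendered roughly like this. This is a minimal sketch, not the PR's code: `renderCappedTree` and its `maxLines` parameter are hypothetical names, and the input is assumed to be a flat list of file paths (no bare directory entries).

```typescript
// Hypothetical sketch: render flat archive entry paths as an indented tree,
// stopping after `maxLines` lines so a huge archive cannot flood the message.
function renderCappedTree(entries: string[], maxLines: number): string {
  const lines: string[] = [];
  const seenDirs = new Set<string>();
  let omitted = 0;

  for (const entry of [...entries].sort()) {
    const parts = entry.split("/").filter((p) => p.length > 0);
    parts.forEach((name, depth) => {
      const isFile = depth === parts.length - 1;
      if (!isFile) {
        // Emit each parent directory only once.
        const dirKey = parts.slice(0, depth + 1).join("/");
        if (seenDirs.has(dirKey)) return;
        seenDirs.add(dirKey);
      }
      if (lines.length >= maxLines) {
        if (isFile) omitted++; // count files that did not fit under the cap
        return;
      }
      lines.push("  ".repeat(depth) + name + (isFile ? "" : "/"));
    });
  }
  if (omitted > 0) lines.push(`... (${omitted} more files)`);
  return lines.join("\n");
}
```

Capping by line count rather than by entry count keeps the message size predictable even when entries are deeply nested.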

Type of Change

  • New feature

Testing

  • 72 unit tests across archive, validation, file-reader (renderFileBuffer + binary no-fetch), and the VFS read/glob/grep routing (incl. an NFD-unicode entry round-trip)
  • tsc, biome, and check:api-validation all clean
  • manage/route.ts decompress refactor is behavior-identical (shared primitives match the removed local copies exactly)

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel

vercel Bot commented Jun 28, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Skipped | Skipped | Jun 28, 2026 9:56pm |

@cursor

cursor Bot commented Jun 28, 2026

PR Summary

Medium Risk
New user-facing archive parsing and VFS path semantics add complexity, but caps and shared zip-bomb guards limit exposure; file-manage decompress is a refactor to the same primitives.

Overview
Zip chat attachments are now allowed and treated as virtual folders under uploads/<name>/…. The agent can glob inside an archive, read or grep individual entries, or get a manifest on a bare archive read; upload context can include a capped inline file tree when a message is sent.

Archive handling is centralized in lib/uploads/archive.ts (zip-slip, symlink skip, streaming inflate caps). The file-manage decompress route imports those helpers instead of local duplicates. fetchWorkspaceFileBuffer accepts optional maxBytes so oversized archives are rejected before full download.
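A zip-slip guard of the kind described here usually amounts to normalizing each entry path and refusing anything that would escape the extraction root. The sketch below illustrates that idea only; `sanitizeEntryPath` is a hypothetical name and the actual rules in lib/uploads/archive.ts may differ.

```typescript
// Hypothetical sketch of a zip-slip guard: normalize an archive entry path
// and return null when it would escape the extraction root.
function sanitizeEntryPath(rawPath: string): string | null {
  // Zip entries should use "/", but treat "\" as a separator too.
  const unified = rawPath.replace(/\\/g, "/");
  // Absolute paths and Windows drive prefixes are rejected outright.
  if (unified.startsWith("/") || /^[A-Za-z]:/.test(unified)) return null;

  const safe: string[] = [];
  for (const segment of unified.split("/")) {
    if (segment === "" || segment === ".") continue; // "./a//b" becomes "a/b"
    if (segment === "..") return null; // any parent traversal is refused
    safe.push(segment);
  }
  return safe.length > 0 ? safe.join("/") : null;
}
```

Under this sketch, `./a/b` and `a/b` sanitize to the same path, which is why a later de-duplication step is needed.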

Upload I/O moves from flat readChatUpload / grepChatUpload to readChatUploadPath / grepChatUploadPath (first segment + optional entry path). VFS glob expands archive entries when the pattern targets a specific zip; renderFileBuffer reuses file-reader logic for extracted bytes. readFileRecord skips downloading clearly binary or over-limit files when metadata allows.
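The "first segment + optional entry path" split can be pictured as follows. This is an illustrative sketch under stated assumptions: `splitUploadPath` and the `UploadPath` shape are hypothetical, and the real dispatchers also handle name encoding, which is omitted here.

```typescript
// Hypothetical sketch: split a VFS upload path into the upload name (first
// segment after "uploads/") and an optional entry path inside the archive.
interface UploadPath {
  uploadName: string;
  entryPath?: string;
}

function splitUploadPath(vfsPath: string): UploadPath | null {
  const prefix = "uploads/";
  if (!vfsPath.startsWith(prefix)) return null;
  const rest = vfsPath.slice(prefix.length);
  if (rest.length === 0) return null;

  const slash = rest.indexOf("/");
  if (slash === -1) return { uploadName: rest }; // bare upload or bare archive

  const uploadName = rest.slice(0, slash);
  const entryPath = rest.slice(slash + 1);
  if (uploadName.length === 0) return null;
  return entryPath.length > 0 ? { uploadName, entryPath } : { uploadName };
}
```

A result without `entryPath` corresponds to a bare archive read, which the overview says returns a manifest.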

Reviewed by Cursor Bugbot for commit 99f272e. Configure here.

Comment thread apps/sim/lib/copilot/tools/handlers/upload-file-reader.ts Outdated
Comment thread apps/sim/lib/copilot/tools/handlers/upload-file-reader.ts Outdated
@greptile-apps

greptile-apps Bot commented Jun 28, 2026

Contributor

Greptile Summary

This PR adds zip uploads as virtual folders in the copilot VFS. The main changes are:

  • Accept .zip chat attachments.
  • List uploaded archive entries through glob("uploads/<zip>/**").
  • Read and grep archive entries lazily through the existing VFS paths.
  • Share archive safety helpers between file decompression and copilot reads.
  • Add tests for archive routing, size limits, encoding, and duplicate entry handling.

Confidence Score: 5/5

This looks safe to merge.

  • No blocking issues found in the changed code.

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/sim/lib/copilot/tools/handlers/upload-file-reader.ts | Adds archive upload lookup, listing, read, grep, size checks, and VFS-key de-duplication. |
| apps/sim/lib/copilot/tools/handlers/vfs.ts | Routes upload paths with archive entry segments and expands specific archive globs. |
| apps/sim/lib/uploads/archive.ts | Adds shared zip listing and extraction helpers with archive safety limits. |
| apps/sim/lib/copilot/vfs/file-reader.ts | Splits buffer rendering from stored-file reads so archive entries use the same rendering logic. |
| apps/sim/lib/uploads/contexts/workspace/workspace-file-manager.ts | Adds an optional byte cap when downloading workspace file buffers. |

Reviews (6): Last reviewed commit: "harden(chat): stream-cap archive downloa..."

Comment thread apps/sim/lib/copilot/tools/handlers/upload-file-reader.ts Outdated
Comment thread apps/sim/lib/copilot/tools/handlers/vfs.ts
Comment thread apps/sim/lib/copilot/tools/handlers/upload-file-reader.ts Outdated
Accept .zip chat attachments and present each archive as a virtual folder the
agent lists and reads entry-by-entry. The archive is stored once; entries are
extracted lazily on read, reusing the existing file-parsers and zip-bomb /
zip-slip guards. No changes to the Go copilot service.

- allow zip in the attachment allowlist + chat accept attribute
- shared lib/uploads/archive.ts (factored from the file-manage decompress route)
- split readFileRecord into a pure renderFileBuffer reused for in-zip entries
- single-resolve readChatUploadPath/grepChatUploadPath dispatchers + VFS routing
- inline file tree in the upload context message
…ntries

Address review findings on the zip-upload feature:
- guard archive list/read/grep on record.size > MAX_ARCHIVE_BYTES before
  downloading, so an oversized zip is never buffered into memory
- a not-found archive entry now returns the file-tree manifest with a note
  (handles a stray /content habit suffix and typos) instead of failing
- de-duplicate archive entries that sanitize to the same path (./a/b vs a/b)

A name like test%2A.zip is exposed double-encoded by glob/upload-context
(test%252A.zip) but canonicalUploadKey decodes the input first, so a literal
%2A is indistinguishable from an encoded * and the lookup misses. Add an
encoded-form fallback (encode the stored name, compare to the raw input) which
recovers the row without affecting the U+202F normalization path.
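
The miss and the fallback can be reproduced in a few lines. This is a sketch under loud assumptions: `canonicalKey` is a stand-in for canonicalUploadKey whose real behavior is not shown here, the primary lookup is assumed to canonicalize both sides, and `encodeURIComponent` stands in for the PR's own name encoder.

```typescript
// Hypothetical stand-in for canonicalUploadKey: decode percent-escapes, then
// NFC-normalize. The real function is not shown in this PR description.
function canonicalKey(name: string): string {
  try {
    return decodeURIComponent(name).normalize("NFC");
  } catch {
    return name.normalize("NFC"); // malformed escapes: use the raw text
  }
}

function findUpload(storedNames: string[], rawInput: string): string | undefined {
  const wanted = canonicalKey(rawInput);
  // Primary lookup: compare canonical keys (assumed to apply to both sides).
  const direct = storedNames.find((n) => canonicalKey(n) === wanted);
  if (direct !== undefined) return direct;
  // Fallback: encode the stored name and compare it to the raw input.
  return storedNames.find((n) => encodeURIComponent(n) === rawInput);
}

// The stored name contains a literal "%2A"; the VFS exposes it as "%252A".
const hit = findUpload(["test%2A.zip"], "test%252A.zip");
```

In this sketch the primary comparison fails because decoding turns the stored `%2A` into `*`, and the fallback recovers the row by comparing encoded forms instead.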
@waleedlatif1

Collaborator Author

@greptile

@waleedlatif1

Collaborator Author

@cursor review

Comment thread apps/sim/lib/copilot/tools/handlers/vfs.ts
Comment thread apps/sim/lib/copilot/tools/handlers/upload-file-reader.ts
…encoding

Address round-2 review:
- filter expanded archive entries through the same micromatch matcher as the
  VFS map (new matchesVfsGlob), so uploads/<zip>/data/* and /** are scoped
  correctly instead of returning every entry
- build archive vfsPaths/manifest/grep labels with the per-segment encoder
  (encodeUploadName) so the zip segment matches the broad glob's spelling for
  literal-% names; canonicalUploadKey stays for resolution only
- upload-context now hints glob("uploads/<zip>/**") to list all entries
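
The scoping fix in the first bullet comes down to running every expanded entry through the same glob matcher as the rest of the VFS. The sketch below uses a deliberately simplified hand-rolled matcher as a stand-in for micromatch (it supports only `*` and `**`); `globToRegExp` and `filterArchiveEntries` are hypothetical names, not the PR's matchesVfsGlob.

```typescript
// Simplified stand-in for a glob matcher. Supports only "*" (within one path
// segment) and "**" (across segments); real micromatch semantics are richer.
function globToRegExp(pattern: string): RegExp {
  let source = "";
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i);
    if (ch === "*" && pattern.charAt(i + 1) === "*") {
      source += ".*"; // "**" may cross "/" boundaries
      i++;
    } else if (ch === "*") {
      source += "[^/]*"; // "*" stays inside one path segment
    } else {
      source += ch.replace(/[.+?^${}()|[\]\\]/g, "\\$&"); // escape regex metacharacters
    }
  }
  return new RegExp(`^${source}$`);
}

// Keep only the expanded archive entries that match the requested pattern.
function filterArchiveEntries(vfsPaths: string[], pattern: string): string[] {
  const matcher = globToRegExp(pattern);
  return vfsPaths.filter((p) => matcher.test(p));
}
```

Without this filter, a pattern such as uploads/<zip>/data/* would return every entry of the archive rather than only the direct children of data/.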
@waleedlatif1

Collaborator Author

@greptile

@waleedlatif1

Collaborator Author

@cursor review

Comment thread apps/sim/lib/copilot/tools/handlers/upload-file-reader.ts
Comment thread apps/sim/lib/uploads/archive.ts
…t-found

A nested archive read/grep ran findArchiveEntryRawPath outside the ArchiveError
catch, so an invalid/too-many-entries archive escaped to the generic handler and
showed as "Upload not found" (read) or a generic grep failure, while a bare
archive read already surfaced the real reason. Widen the catch so both paths
report the actual ArchiveError message.
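
The shape of the fix can be sketched as moving the entry lookup inside the same try block as the extraction. Everything here except the ArchiveError name is illustrative: `readEntry`, its callback parameters, and the returned message strings are hypothetical.

```typescript
// Local stand-in for the PR's ArchiveError class.
class ArchiveError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ArchiveError";
    // Keeps instanceof working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, ArchiveError.prototype);
  }
}

// Hypothetical sketch: lookup and extraction share one try block, so an
// ArchiveError from either step surfaces with its real message.
function readEntry(
  findRawPath: (entry: string) => string | undefined,
  extract: (rawPath: string) => string,
  entry: string,
): string {
  try {
    const rawPath = findRawPath(entry); // previously ran outside the catch
    if (rawPath === undefined) return `Entry not found: ${entry}`;
    return extract(rawPath);
  } catch (error) {
    if (error instanceof ArchiveError) return `Archive error: ${error.message}`;
    throw error; // unrelated failures still reach the generic handler
  }
}
```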
@waleedlatif1

Collaborator Author

@greptile

@waleedlatif1

Collaborator Author

@cursor review

@cursor cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 8eb5503. Configure here.

Comment thread apps/sim/lib/uploads/archive.ts Outdated
Greptile: dedup keyed on the raw sanitized path, but archive paths are exposed
through VFS per-segment encoding that NFC-normalizes, so visually-identical
NFC/NFD entries (e.g. café.txt) kept both while emitting one shared vfsPath —
shadowing the second on read. Move dedup to listChatUploadArchiveEntries /
buildArchiveManifest, keyed on the same encodeEntryPath the resolver matches on;
listArchiveEntries now returns raw paths (dedup is a VFS-presentation concern).
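
Keying the de-duplication on the exposed path rather than the raw path can be sketched as below. `encodeEntryPathSketch` is a stand-in for the PR's encodeEntryPath, assumed here to NFC-normalize and percent-encode each segment; `dedupeByVfsPath` is a hypothetical name.

```typescript
// Stand-in for encodeEntryPath: NFC-normalize, then percent-encode each
// path segment (an assumption about the real encoder).
function encodeEntryPathSketch(rawPath: string): string {
  return rawPath
    .split("/")
    .map((segment) => encodeURIComponent(segment.normalize("NFC")))
    .join("/");
}

// De-duplicate raw archive paths by the key the resolver would match on.
function dedupeByVfsPath(rawPaths: string[]): { rawPath: string; vfsPath: string }[] {
  const seen = new Set<string>();
  const result: { rawPath: string; vfsPath: string }[] = [];
  for (const rawPath of rawPaths) {
    const vfsPath = encodeEntryPathSketch(rawPath);
    if (seen.has(vfsPath)) continue; // first entry wins; a later one would be shadowed on read
    seen.add(vfsPath);
    result.push({ rawPath, vfsPath });
  }
  return result;
}

const nfc = "caf\u00e9.txt"; // precomposed é
const nfd = "cafe\u0301.txt"; // e + combining acute accent
const unique = dedupeByVfsPath([nfc, nfd, "b.txt"]);
```

Here the two spellings of café.txt collapse to one listed entry because they share a vfsPath, while the raw path of the surviving entry is kept for extraction.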
@waleedlatif1

Collaborator Author

@greptile

@waleedlatif1

Collaborator Author

@cursor review

@cursor cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Reviewed by Cursor Bugbot for commit 74feedf. Configure here.

Final-audit hardening for the zip-upload feature:
- fetchWorkspaceFileBuffer gains an optional maxBytes that flows to downloadFile,
  enforced on the actual byte stream. The three archive fetches pass
  MAX_ARCHIVE_BYTES, so a stored object larger than its client-declared
  record.size can no longer be buffered fully into memory (record.size stays as
  a cheap early-out). Fixes a comment that overclaimed download-cap parity.
- grepping a bare archive (no entry) now throws a guiding WorkspaceFileGrepError
  pointing at an entry/manifest, instead of grepping the binary placeholder.
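
Enforcing a cap on the actual byte stream, rather than trusting a declared size, might look like the following. This is a generic sketch and not the PR's downloadFile: `readWithCap` is a hypothetical name and the stream is modeled as an async iterable of chunks.

```typescript
// Hypothetical sketch: buffer a byte stream but abort as soon as the running
// total exceeds maxBytes, regardless of any size declared in metadata.
async function readWithCap(
  chunks: AsyncIterable<Uint8Array>,
  maxBytes: number,
): Promise<Uint8Array> {
  const parts: Uint8Array[] = [];
  let total = 0;
  for await (const chunk of chunks) {
    total += chunk.byteLength;
    if (total > maxBytes) {
      throw new Error(`Stream exceeds ${maxBytes} bytes`);
    }
    parts.push(chunk);
  }
  // Concatenate the accepted chunks into one buffer.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    out.set(part, offset);
    offset += part.byteLength;
  }
  return out;
}
```

A metadata check such as record.size > MAX_ARCHIVE_BYTES remains useful as a cheap early-out, but only a running count like this bounds memory when the stored object is larger than its declared size.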
@waleedlatif1

Collaborator Author

@greptile

@waleedlatif1

Collaborator Author

@cursor review

@cursor cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Reviewed by Cursor Bugbot for commit 99f272e. Configure here.


1 participant