Add oversize submission detection and backward-compatible DB flag #73
Merged
Detect oversize submissions at file-upload time and persist the flag to the classic arXiv_submissions.is_oversize column. The flag is a soft gate: the submitter is warned but can still proceed. The auto-hold effect at finalize is a later phase.

Adds policy module:
- submit_ce/domain/size_limits.py: SizeLimits (three per-archive limits, default 50 MB) and a pure check_sizes() decision function. Enforces the total and per-file uncompressed limits, matching legacy check_sizes; compressed limit defined but not enforced. OVERRIDE/MAXSIZE env escapes.
- ui/config.py: MAX_UNCOMPRESSED_TOTAL_KB / _PER_FILE_KB / _COMPRESSED_KB.

Adds detection in the file `Event`:
- Submission.is_oversize: flag set when files change.
- UploadArchive/UploadFiles/RemoveFiles evaluate the authoritative workspace against the limits in execute(), persist the result, and apply it in project() (deterministic on replay); RemoveAllFiles clears it.
- SubmitApi.get_size_limits() (defaults) + Flask override reading config, so the domain event reaches limits through the api boundary.
- upload controller flashes an oversize warning.

Adds submit 1.5 backward-compatible DB projection:
- update_from_submission writes arXiv_submissions.is_oversize; to_submission reads it back so rows round-trip.

Tests: size_limits unit tests, event detection tests, and UI/DB persistence tests (column write + domain round-trip).

Co-Authored-By: Claude Opus 4.8 (1M context)
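For readers unfamiliar with the execute()/project() split mentioned in the description, here is a minimal sketch of why recording the decision on the event keeps replay deterministic. The `Submission` and `UploadFiles` classes below are hypothetical stand-ins, not the real submit_ce classes, and the `oversize` field name, the method signatures, and the strict `>` comparison are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Submission:
    """Hypothetical stand-in for the domain Submission; only the flag matters here."""
    is_oversize: bool = False


@dataclass
class UploadFiles:
    """Hypothetical stand-in for a file event, not the real submit_ce class."""
    # The decision is stored on the event so replay never re-measures files.
    oversize: Optional[bool] = None

    def execute(self, total_bytes: int, limit_bytes: int) -> None:
        # Runs once, when the event is created: measure against the limit
        # in force right now and record the outcome on the event.
        self.oversize = total_bytes > limit_bytes

    def project(self, submission: Submission) -> Submission:
        # Runs on every replay: apply the recorded outcome. No I/O, and no
        # dependence on whatever the limits happen to be at replay time.
        return replace(submission, is_oversize=bool(self.oversize))


MB = 1024 * 1024
event = UploadFiles()
event.execute(total_bytes=60 * MB, limit_bytes=50 * MB)

# Projecting the same event twice yields the same flag.
first = event.project(Submission())
second = event.project(Submission())
```

Because project() only reads what execute() recorded, changing the configured limits later would not retroactively change the flag on already-recorded events in this sketch.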
bmaltzan
reviewed
Jun 10, 2026
if workspace is None:
    return False
per_file = {file.path: file.bytes for file in workspace.files}
total = workspace.size or 0
Contributor
So per_file is the sum of the file space used in src/?
Is per_file the same as workspace.size, or is the comment wrong?
Or is workspace.size out of date?
class Workspace(BaseModel):
    size: Optional[int] = None
    """Size in bytes of the uncompressed upload workspace."""
Contributor
Author
Here per_file is a dict mapping file.path -> file.bytes (one entry per file), not a sum.
Contributor
Author
workspace.size should still be the uncompressed size of the whole workspace.
per_file is there to detect single files that exceed the per-file size limit.
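To make the distinction concrete, here is a minimal sketch of the two checks side by side. Only the three lines that build `per_file` and `total` come from the diff under review; `FileEntry`, the simplified `Workspace`, the `is_oversize` wrapper, and this `check_sizes` signature are hypothetical, and the strict `>` comparison is an assumption. The 20 MB per-file limit is chosen purely to show the per-file check firing on its own; the PR's defaults are 50 MB.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

MB = 1024 * 1024


@dataclass
class FileEntry:
    """Hypothetical stand-in for a workspace file record."""
    path: str
    bytes: int


@dataclass
class Workspace:
    """Simplified stand-in; the real model is a pydantic BaseModel."""
    files: List[FileEntry] = field(default_factory=list)
    size: Optional[int] = None  # uncompressed size of the whole workspace


def check_sizes(total: int, per_file: Dict[str, int],
                max_total: int, max_per_file: int) -> bool:
    """Return True if the total or any single file exceeds its limit."""
    if total > max_total:
        return True
    return any(size > max_per_file for size in per_file.values())


def is_oversize(workspace: Optional[Workspace],
                max_total: int, max_per_file: int) -> bool:
    if workspace is None:
        return False
    # One entry per file: used only for the per-file limit.
    per_file = {file.path: file.bytes for file in workspace.files}
    # Whole-workspace figure: used only for the total limit.
    total = workspace.size or 0
    return check_sizes(total, per_file, max_total, max_per_file)


ws = Workspace(
    files=[FileEntry("main.tex", 1 * MB), FileEntry("fig.pdf", 30 * MB)],
    size=31 * MB,
)
total_only = is_oversize(ws, max_total=50 * MB, max_per_file=50 * MB)
with_per_file = is_oversize(ws, max_total=50 * MB, max_per_file=20 * MB)
```

In this sketch the 31 MB workspace passes the total limit in both calls, and only the second call flags it, because fig.pdf alone exceeds the illustrative 20 MB per-file limit.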
bmaltzan
approved these changes
Jun 10, 2026
bmaltzan
left a comment
Contributor
The file size variable names were a little confusing, but otherwise looks fine.
STILL TODO: need to do an auto-hold on finalize