feat(cli): continue queue on item failure with per-item timeout#147
feat(cli): continue queue on item failure with per-item timeout#147nitrobass24 wants to merge 4 commits into
Conversation
Bulk `--queue` runs previously aborted the entire queue on the first input path that failed (e.g. a release rejected by all trackers), and the whole run shared a single 30-minute deadline so large queues could be killed mid-way. Process each input path under its own timeout (cliItemTimeout) via a new processCLIPaths helper. In queue mode a per-item failure is logged and the run continues, with an end-of-run summary and non-zero exit if any item failed. Non-queue (explicit path) runs keep aborting on first error. --upload-only budgets its timeout by item count since it uploads in one core call.
|
Warning Review limit reached
More reviews will be available in 4 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits. 🚦 How do rate limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthrough
ChangesCLI timeout phases and playlist selection
Estimated code review effort🎯 4 (Complex) | ⏱️ ~55 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cmd/upbrr/main_test.go`:
- Around line 68-73: The test in non-queue mode currently only validates that an
error is returned with the check `if err == nil`, but does not verify that the
specific error type and exit code match the expected contract of
`cliExitError(code=1)`. Replace the generic nil check with a type assertion that
confirms the returned error is specifically a `cliExitError` with exit code 1,
ensuring the test properly validates the contract rather than just any error
occurrence.
- Around line 94-95: The deadline threshold of 10 milliseconds in the
time.Until(deadline) condition is too strict and causes timing flakes due to
scheduler and CI jitter. Relax the threshold by reducing the millisecond
comparison value from 10 to a smaller amount (such as 5 milliseconds) in the if
statement where fastSawFreshDeadline is set to true. This makes the deadline
check less stringent and more resilient to natural timing variations in the test
environment.
In `@cmd/upbrr/main.go`:
- Around line 196-199: The context setup at the line creating ctx and cancel is
currently using context.WithCancel without any timeout deadline, which removes
timeout protection from the setup operations that follow including
core.NewWithContext, filesystem.GatherQueuePaths, handleBDMVPlaylistSelection,
DeleteAllHistoryReleases, and deleteCLIStoredReleases. Replace
context.WithCancel(context.Background()) with context.WithTimeout to restore a
30-minute timeout deadline for these setup operations, ensuring the CLI will
fail after the deadline rather than hanging indefinitely if any operation blocks
on database locks or I/O issues.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 19d608fb-793a-4762-b738-193fec6e9663
📒 Files selected for processing (2)
cmd/upbrr/main.gocmd/upbrr/main_test.go
Address review on the queue continue-on-error change: - Restore a deadline for pre-upload setup (core init, queue gathering, disc playlist selection, cleanup, delete-tmp) via a dedicated setupCtx so the CLI fails fast on a stuck database/IO call. The per-item upload loop stays uncapped so long queues are not killed by a shared deadline. - Assert cliExitError with code 1 in the non-queue abort test instead of only checking for a non-nil error. - Relax the per-item deadline assertion from 10ms to 5ms to avoid CI timing flakes.
|
@CodeRabbit review |
✅ Action performedReview finished.
|
Resolve all 9 maintainer review comments, plus follow-up findings from an adversarial review pass, entirely within cmd/upbrr (internal/core untouched). Upload-only queue (#1, #6): queue mode routes each path through processCLIPaths for per-item timeout + continue-on-error; non-queue stays batched with a bounded timeout (min(len,50)*cliItemTimeout). Fold partial UploadedCount before propagating an item error; report the sum as accepted tracker uploads. Context/timeout scoping (#3, #4, #5): split the shared setup context into per-phase contexts (init/cleanup, queue gather, per-disc BDMV discovery) derived from the root cancelable ctx; keep the interactive prompt on the untimed-but- cancelable context; document cooperative cancellation. BDMV cancellation (#2): isCtxErr guards at all detect/load/discover/save sites so cancellation is surfaced instead of swallowed by continue. Queue UX (#7, #8, #9): summary error wraps the first cause and quotes failed paths; whitespace-only --queue rejected at parse time. processCLIPaths now honors parent-context cancellation instead of logging spurious per-item failures. Tests: per-disc fresh-deadline, context threading into site-check/interactive paths, prompt-loop save ctx-guards, parent-cancel abort, plus the fixes above.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cmd/upbrr/main.go`:
- Around line 416-440: The queue loop in main should short-circuit on parent ctx
cancellation before treating the current item as a normal failure. In the
runCLIPathWithTimeout call path, check ctx.Err() immediately after a non-nil
error and, if the context was canceled or expired, return the
cancellation/timeout error directly instead of appending to failed or
aggregating in firstErr. Keep the existing queueMode failure handling for real
item errors, but make sure the final queue summary in the main loop is only
reached for non-cancellation failures.
- Around line 1287-1293: The BDMV prompt currently blocks in fmt.Scanln inside
the prompt loop, so promptCtx is only checked after input returns and
cancellation cannot stop the wait. Update the stdin handling in the prompt logic
around fmt.Scanln to be cancellation-aware, using either a cancel-aware reader
or a goroutine with a select on promptCtx, so the prompt exits immediately when
the context is canceled.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ee1dfd39-5924-4216-90c9-67c7adeb1ba2
📒 Files selected for processing (5)
cmd/upbrr/cli_options.gocmd/upbrr/cli_options_test.gocmd/upbrr/interactive_test.gocmd/upbrr/main.gocmd/upbrr/main_test.go
The queue loop only checked the parent context at the top of each iteration, so a failure caused by parent cancellation during the current item — notably the final item, where no next iteration runs — was recorded as a normal queue failure and surfaced as a "queue completed with N failed" summary instead of an abort. Add a symmetric parent-cancel check immediately after a non-nil item error. Per-item timeouts live on the derived itemCtx, so they leave the parent ctx.Err() nil and still continue the queue as ordinary failures.
Summary
Makes
--queueruns resilient for bulk processing. Two changes:Continue-on-error in queue mode. Previously the first input path that
failed (e.g. a release rejected by all its trackers) aborted the entire
queue. Now a per-item failure in queue mode is logged and the run continues,
with an end-of-run summary and a non-zero exit if anything failed.
Per-item timeout. The whole run previously shared a single 30-minute
deadline (
context.WithTimeout(ctx, 30*time.Minute)), so a large queue couldbe killed mid-way. Each input path now runs under its own
cliItemTimeout(30m); the base context is uncapped.
Behavior outside queue mode is unchanged: explicit single/multi-path runs still
abort on the first error.
--upload-onlyuploads every path in one core call,so it budgets its timeout by item count to preserve a per-item allowance.
Motivation
When uploading a large library with
--queue, many files legitimately fail(e.g. encodes missing MediaInfo encoding settings, which UNIT3D trackers
reject). The old behavior stopped the whole run on the first such file, and the
shared 30-minute deadline meant long queues of large UHD files could be killed
before finishing.
Changes
cmd/upbrr/main.gocliItemTimeout(30m) and aprocessCLIPathshelper that runs each pathunder
runCLIPathWithTimeout, logging+skipping failures in queue mode andsummarizing at the end.
WithCancel.--upload-onlywrapped in a count-scaled timeout.cmd/upbrr/main_test.goTestProcessCLIPathsQueueModeContinuesOnError— all items attempted, summary error.TestProcessCLIPathsQueueModeSucceedsWhenAllPassTestProcessCLIPathsNonQueueAbortsOnFirstError— first-error abort preserved.TestProcessCLIPathsAppliesPerItemTimeout— slow item cancelled at its owndeadline; next item gets a fresh one.
Behavior
A movie that fails all trackers is logged and skipped; the queue continues. At
the end:
queue: N of M item(s) failed: …and a non-zero exit. Each item gets afull 30-minute budget instead of sharing one run-wide clock.
Testing
make lint— 0 issues (golangci-lint + pathpolicy)make test-go— full race suite passesgofmt/golangci-lint fmt/logpolicy— cleanNo frontend changes.
Summary by CodeRabbit
--queuewhitespace is trimmed and rejected when it becomes empty.