Skip to content

feat(cli): continue queue on item failure with per-item timeout#147

Open
nitrobass24 wants to merge 4 commits into
mainfrom
feat/queue-continue-on-error
Open

feat(cli): continue queue on item failure with per-item timeout#147
nitrobass24 wants to merge 4 commits into
mainfrom
feat/queue-continue-on-error

Conversation

@nitrobass24

@nitrobass24 nitrobass24 commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator

Summary

Makes --queue runs resilient for bulk processing. Two changes:

  1. Continue-on-error in queue mode. Previously the first input path that
    failed (e.g. a release rejected by all its trackers) aborted the entire
    queue. Now a per-item failure in queue mode is logged and the run continues,
    with an end-of-run summary and a non-zero exit if anything failed.

  2. Per-item timeout. The whole run previously shared a single 30-minute
    deadline (context.WithTimeout(ctx, 30*time.Minute)), so a large queue could
    be killed mid-way. Each input path now runs under its own cliItemTimeout
    (30m); the base context is uncapped.

Behavior outside queue mode is unchanged: explicit single/multi-path runs still
abort on the first error. --upload-only uploads every path in one core call,
so it budgets its timeout by item count to preserve a per-item allowance.

Motivation

When uploading a large library with --queue, many files legitimately fail
(e.g. encodes missing MediaInfo encoding settings, which UNIT3D trackers
reject). The old behavior stopped the whole run on the first such file, and the
shared 30-minute deadline meant long queues of large UHD files could be killed
before finishing.

Changes

  • cmd/upbrr/main.go
    • Add cliItemTimeout (30m) and a processCLIPaths helper that runs each path
      under runCLIPathWithTimeout, logging+skipping failures in queue mode and
      summarizing at the end.
    • Base context switched from a 30m run-wide timeout to WithCancel.
    • --upload-only wrapped in a count-scaled timeout.
  • cmd/upbrr/main_test.go
    • TestProcessCLIPathsQueueModeContinuesOnError — all items attempted, summary error.
    • TestProcessCLIPathsQueueModeSucceedsWhenAllPass
    • TestProcessCLIPathsNonQueueAbortsOnFirstError — first-error abort preserved.
    • TestProcessCLIPathsAppliesPerItemTimeout — slow item cancelled at its own
      deadline; next item gets a fresh one.

Behavior

upbrr --queue movies /data/movies -ua -sda

A movie that fails all trackers is logged and skipped; the queue continues. At
the end: queue: N of M item(s) failed: … and a non-zero exit. Each item gets a
full 30-minute budget instead of sharing one run-wide clock.

Testing

  • make lint — 0 issues (golangci-lint + pathpolicy)
  • make test-go — full race suite passes
  • gofmt / golangci-lint fmt / logpolicy — clean
  • New tests: 4/4 pass

No frontend changes.

Summary by CodeRabbit

  • New Features
    • Queue processing now continues after individual item failures and returns one aggregated error at the end; non-queue runs stop on the first error.
    • Upload-only mode uses bounded, phase-specific timeouts, including per-item timeouts for queue uploads and capped deadlines for batch uploads.
    • BDMV playlist discovery/selection now respects per-disc deadlines and properly propagates cancellation during prompting/saving.
  • Validation
    • --queue whitespace is trimmed and rejected when it becomes empty.
  • Tests
    • Expanded coverage for queue continuation, per-item timeout enforcement, cancellation behavior, upload-only timeout bounding, and BDMV playlist deadline/cancel propagation (including interactive mode).

Bulk `--queue` runs previously aborted the entire queue on the first
input path that failed (e.g. a release rejected by all trackers), and
the whole run shared a single 30-minute deadline so large queues could
be killed mid-way.

Process each input path under its own timeout (cliItemTimeout) via a new
processCLIPaths helper. In queue mode a per-item failure is logged and
the run continues, with an end-of-run summary and non-zero exit if any
item failed. Non-queue (explicit path) runs keep aborting on first error.
--upload-only budgets its timeout by item count since it uploads in one
core call.
@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@nitrobass24, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 4 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cd53850b-93f3-4fa1-81b3-b4ba4cc346d8

📥 Commits

Reviewing files that changed from the base of the PR and between 76ae66d and 099cf5c.

📒 Files selected for processing (2)
  • cmd/upbrr/main.go
  • cmd/upbrr/main_test.go
📝 Walkthrough

Walkthrough

cmd/upbrr now uses separate timeout contexts for setup, queue gathering, upload-only batches, per-item processing, and BDMV disc selection. --queue values are trimmed and validated, and tests were added for timeout propagation, queue continuation, upload-only execution, and cancellation handling.

Changes

CLI timeout phases and playlist selection

Layer / File(s) Summary
Queue option trimming
cmd/upbrr/cli_options.go, cmd/upbrr/cli_options_test.go
parseCLIOptions trims visited --queue values, rejects whitespace-only queue names, and stores the trimmed result. Tests cover rejection and trimming behavior.
Setup and gather contexts
cmd/upbrr/main.go
run() creates a cancelable base context, derives setupCtx for pre-upload work, moves stored-release cleanup and --delete-tmp operations onto setupCtx, and runs queue path discovery under a separate gatherCtx with its own timeout.
Upload-only and per-item processing
cmd/upbrr/main.go, cmd/upbrr/main_test.go
runCLIUploadOnlyBatch bounds batch uploads with an upload timeout cap, runCLIUploadOnlyQueue processes each path independently and sums uploaded counts, and processCLIPaths plus runCLIPathWithTimeout apply per-item deadlines with queue-mode aggregation and parent-cancel short-circuiting. Tests cover queue continuation, abort behavior, timeout isolation, and upload-only timeout bounding.
BDMV playlist selection timeouts
cmd/upbrr/main.go, cmd/upbrr/main_test.go, cmd/upbrr/interactive_test.go
handleBDMVPlaylistSelection applies a per-disc discovery timeout, delegates disc handling to handleBDMVDiscSelection, and returns context cancellation or deadline errors from discovery, loading, prompting, and saving while continuing past non-context errors. The interactive test mock records discovery deadlines, upload calls, and context-aware preview behavior.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Poem

🐇 Hop, hop, the clocks are split,
One for setup, one per bit.
Queue names trimmed with a tidy grin,
Fresh deadlines let each trail begin.
BDMV discs now take their turn —
little timer lights go purr!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 28.57% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main CLI behavior change: queue-mode continues on item failure with per-item timeouts.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/queue-continue-on-error

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/upbrr/main_test.go`:
- Around line 68-73: The test in non-queue mode currently only validates that an
error is returned with the check `if err == nil`, but does not verify that the
specific error type and exit code match the expected contract of
`cliExitError(code=1)`. Replace the generic nil check with a type assertion that
confirms the returned error is specifically a `cliExitError` with exit code 1,
ensuring the test properly validates the contract rather than just any error
occurrence.
- Around line 94-95: The deadline threshold of 10 milliseconds in the
time.Until(deadline) condition is too strict and causes timing flakes due to
scheduler and CI jitter. Relax the threshold by reducing the millisecond
comparison value from 10 to a smaller amount (such as 5 milliseconds) in the if
statement where fastSawFreshDeadline is set to true. This makes the deadline
check less stringent and more resilient to natural timing variations in the test
environment.

In `@cmd/upbrr/main.go`:
- Around line 196-199: The context setup at the line creating ctx and cancel is
currently using context.WithCancel without any timeout deadline, which removes
timeout protection from the setup operations that follow including
core.NewWithContext, filesystem.GatherQueuePaths, handleBDMVPlaylistSelection,
DeleteAllHistoryReleases, and deleteCLIStoredReleases. Replace
context.WithCancel(context.Background()) with context.WithTimeout to restore a
30-minute timeout deadline for these setup operations, ensuring the CLI will
fail after the deadline rather than hanging indefinitely if any operation blocks
on database locks or I/O issues.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 19d608fb-793a-4762-b738-193fec6e9663

📥 Commits

Reviewing files that changed from the base of the PR and between 9e6bc56 and 2d92132.

📒 Files selected for processing (2)
  • cmd/upbrr/main.go
  • cmd/upbrr/main_test.go

Comment thread cmd/upbrr/main_test.go Outdated
Comment thread cmd/upbrr/main_test.go Outdated
Comment thread cmd/upbrr/main.go
Address review on the queue continue-on-error change:
- Restore a deadline for pre-upload setup (core init, queue gathering,
  disc playlist selection, cleanup, delete-tmp) via a dedicated setupCtx
  so the CLI fails fast on a stuck database/IO call. The per-item upload
  loop stays uncapped so long queues are not killed by a shared deadline.
- Assert cliExitError with code 1 in the non-queue abort test instead of
  only checking for a non-nil error.
- Relax the per-item deadline assertion from 10ms to 5ms to avoid CI
  timing flakes.
@nitrobass24

Copy link
Copy Markdown
Collaborator Author

@CodeRabbit review

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@nitrobass24 nitrobass24 requested a review from Audionut June 23, 2026 21:48
Resolve all 9 maintainer review comments, plus follow-up findings from an
adversarial review pass, entirely within cmd/upbrr (internal/core untouched).

Upload-only queue (#1, #6): queue mode routes each path through processCLIPaths
for per-item timeout + continue-on-error; non-queue stays batched with a bounded
timeout (min(len,50)*cliItemTimeout). Fold partial UploadedCount before
propagating an item error; report the sum as accepted tracker uploads.

Context/timeout scoping (#3, #4, #5): split the shared setup context into
per-phase contexts (init/cleanup, queue gather, per-disc BDMV discovery) derived
from the root cancelable ctx; keep the interactive prompt on the untimed-but-
cancelable context; document cooperative cancellation.

BDMV cancellation (#2): isCtxErr guards at all detect/load/discover/save sites so
cancellation is surfaced instead of swallowed by continue.

Queue UX (#7, #8, #9): summary error wraps the first cause and quotes failed
paths; whitespace-only --queue rejected at parse time. processCLIPaths now
honors parent-context cancellation instead of logging spurious per-item failures.

Tests: per-disc fresh-deadline, context threading into site-check/interactive
paths, prompt-loop save ctx-guards, parent-cancel abort, plus the fixes above.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/upbrr/main.go`:
- Around line 416-440: The queue loop in main should short-circuit on parent ctx
cancellation before treating the current item as a normal failure. In the
runCLIPathWithTimeout call path, check ctx.Err() immediately after a non-nil
error and, if the context was canceled or expired, return the
cancellation/timeout error directly instead of appending to failed or
aggregating in firstErr. Keep the existing queueMode failure handling for real
item errors, but make sure the final queue summary in the main loop is only
reached for non-cancellation failures.
- Around line 1287-1293: The BDMV prompt currently blocks in fmt.Scanln inside
the prompt loop, so promptCtx is only checked after input returns and
cancellation cannot stop the wait. Update the stdin handling in the prompt logic
around fmt.Scanln to be cancellation-aware, using either a cancel-aware reader
or a goroutine with a select on promptCtx, so the prompt exits immediately when
the context is canceled.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ee1dfd39-5924-4216-90c9-67c7adeb1ba2

📥 Commits

Reviewing files that changed from the base of the PR and between 851c323 and 76ae66d.

📒 Files selected for processing (5)
  • cmd/upbrr/cli_options.go
  • cmd/upbrr/cli_options_test.go
  • cmd/upbrr/interactive_test.go
  • cmd/upbrr/main.go
  • cmd/upbrr/main_test.go

Comment thread cmd/upbrr/main.go
Comment thread cmd/upbrr/main.go
The queue loop only checked the parent context at the top of each
iteration, so a failure caused by parent cancellation during the current
item — notably the final item, where no next iteration runs — was
recorded as a normal queue failure and surfaced as a "queue completed
with N failed" summary instead of an abort.

Add a symmetric parent-cancel check immediately after a non-nil item
error. Per-item timeouts live on the derived itemCtx, so they leave the
parent ctx.Err() nil and still continue the queue as ordinary failures.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant