feat(webapp): billing limits — pause, reject, recovery, and settings UI#3996

Open
kathiekiwi wants to merge 8 commits into main from feature/billing-limits

Conversation

@kathiekiwi

@kathiekiwi kathiekiwi commented Jun 19, 2026

Collaborator

Summary

Adds Billing Limits to the webapp.

Customers can set a monthly spend cap. When usage crosses the limit, billable environments enter a grace period. If the limit is not resolved before grace expires, new triggers are rejected until the organization increases or removes the limit.

The webapp consumes billing-limit state from the billing platform and enforces it across environments, queues, and trigger creation.

Depends on the matching cloud billing PR.

User-facing changes

Billing Limits settings

  • New /settings/billing-limits page replaces the standalone billing-alerts page.

  • Configure:

    • plan limit
    • custom limit
    • no limit
  • Configure billing alerts and notification emails.

  • Resolve active billing limits by increasing or removing the limit.

Org-wide banners

Adds banners for:

  • grace period
  • rejected state
  • billing limits not configured
  • upgrade prompts

Usage page

Shows the configured billing limit on the spend chart.

Enforcement

  • Billable environments are paused when an org enters grace.

  • New triggers are rejected once grace expires.

  • Billing-limit pauses cannot be manually resumed.

  • New environments created during grace/rejected inherit the correct paused state.

  • Recovery supports:

    • resuming queued runs
    • cancelling queued runs and starting fresh
    • optional cancellation of in-progress runs when a limit is reached
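
The manual-resume and new-environment rules above can be sketched as two small guards. This is a minimal illustration, not the PR's implementation: only the `BILLING_LIMIT` pause source and the `"ok"` state value appear in this PR, while `MANUAL`, the `"grace"`/`"rejected"` string values, `EnvironmentState`, `canManuallyResume`, and `initialStateForNewEnvironment` are hypothetical names, and the environment is assumed to be billable.

```typescript
// Hypothetical model; only BILLING_LIMIT and "ok" are taken from this PR.
type PauseSource = "MANUAL" | "BILLING_LIMIT";
type BillingLimitState = "ok" | "grace" | "rejected";

interface EnvironmentState {
  paused: boolean;
  pauseSource: PauseSource | null;
}

// A billing-limit pause is lifted only by resolving the limit,
// so manual resume is allowed only for manually paused environments.
function canManuallyResume(env: EnvironmentState): boolean {
  return env.paused && env.pauseSource === "MANUAL";
}

// A billable environment created while the org is in grace or rejected
// starts paused with the billing-limit pause source.
function initialStateForNewEnvironment(orgState: BillingLimitState): EnvironmentState {
  return orgState === "ok"
    ? { paused: false, pauseSource: null }
    : { paused: true, pauseSource: "BILLING_LIMIT" };
}
```

Tracking the pause source separately from the paused flag is what lets the resume check tell the two kinds of pause apart.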

Infrastructure

  • Adds billing-limit workers and reconciliation.
  • Adds admin endpoints used by the billing platform.
  • Adds BILLING_LIMIT as an environment pause source.

Test plan

  • Configure limits, alerts, and emails.
  • Verify grace and rejected flows.
  • Verify trigger rejection after grace expiry.
  • Verify recovery flows (queue and new_only).
  • Verify new environments created during grace start paused.
  • Verify billing-limit pauses cannot be manually resumed.
  • Verify billing limit marker on the usage chart.

Notes

  • isConfigured: false means no billing limit has been configured yet.
  • mode: "none" means the customer explicitly opted out.
  • Grace pauses execution but still accepts triggers.
  • Rejected blocks new triggers.
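
The distinctions in these notes can be sketched as two pure functions. This is an illustrative model under stated assumptions: `isConfigured` and `mode: "none"` come from the notes, while the `"plan"` and `"custom"` mode values, the state names, and the `describeConfig` and `decideTrigger` helpers are hypothetical.

```typescript
// Hypothetical shapes; only isConfigured and mode: "none" come from the notes.
type LimitMode = "plan" | "custom" | "none";
type EnforcementState = "ok" | "grace" | "rejected";

interface BillingLimitConfig {
  isConfigured: boolean;
  mode?: LimitMode;
}

type ConfigStatus = "not_configured" | "opted_out" | "limited";

// "Never configured" and "explicitly opted out" are different states.
function describeConfig(config: BillingLimitConfig): ConfigStatus {
  if (!config.isConfigured) return "not_configured";
  return config.mode === "none" ? "opted_out" : "limited";
}

interface TriggerDecision {
  accepted: boolean; // is the trigger request accepted at all?
  executes: boolean; // may the resulting run start executing now?
}

// Grace pauses execution but still accepts triggers; rejected blocks them.
function decideTrigger(state: EnforcementState): TriggerDecision {
  switch (state) {
    case "ok":
      return { accepted: true, executes: true };
    case "grace":
      return { accepted: true, executes: false };
    case "rejected":
      return { accepted: false, executes: false };
  }
}
```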

@changeset-bot

changeset-bot Bot commented Jun 19, 2026


⚠️ No Changeset found

Latest commit: 6608555

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

This PR implements a billing limits feature that lets organizations configure a monthly compute spend cap. When the cap is reached, a billing platform webhook triggers a grace period during which billable environments are paused; new task triggers are rejected once the grace period ends. A recovery flow lets users increase or remove the limit and choose to resume queued runs or accept cancellation. A Redis-backed worker periodically reconciles environment pause state against billing platform data. The feature adds a new /settings/billing-limits settings page that consolidates limit configuration, alert thresholds, and the recovery panel, replacing the old /settings/billing-alerts route (which now redirects). The unified OrgBanner component replaces the prior UpgradePrompt and EnvironmentBanner components with a selector-driven switch over all banner states. Environment pause-source tracking now distinguishes billing-limit-enforced pauses from manual pauses, gating pause/resume operations accordingly.
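
The reconciliation step described above can be sketched as a pure planning function that compares the desired pause state with each environment's actual state. This is a simplified sketch, not the worker in this PR: `EnvSnapshot`, `planReconciliation`, the `MANUAL` pause source, and the `"grace"`/`"rejected"` string values are hypothetical, and it assumes that manually paused environments are left untouched.

```typescript
// Hypothetical types; the real worker reads this state from the database
// and the billing platform.
type BillingLimitState = "ok" | "grace" | "rejected";

interface EnvSnapshot {
  id: string;
  billable: boolean;
  paused: boolean;
  pauseSource: "MANUAL" | "BILLING_LIMIT" | null;
}

interface PlannedAction {
  envId: string;
  action: "pause" | "unpause";
}

// Computes the changes needed to converge environments to the billing state.
function planReconciliation(state: BillingLimitState, envs: EnvSnapshot[]): PlannedAction[] {
  const shouldPause = state !== "ok";
  const actions: PlannedAction[] = [];
  for (const env of envs) {
    if (!env.billable) continue; // only billable environments are enforced
    if (shouldPause && !env.paused) {
      actions.push({ envId: env.id, action: "pause" });
    } else if (!shouldPause && env.paused && env.pauseSource === "BILLING_LIMIT") {
      // Only lift pauses that billing-limit enforcement created.
      actions.push({ envId: env.id, action: "unpause" });
    }
  }
  return actions;
}
```

Because the function returns an empty plan when everything already matches, running it periodically is safe even when the webhook has already done the work.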

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 21.53% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title clearly and specifically summarizes the main changes: implementing billing limits with pause, reject, recovery, and settings UI for the webapp. |
| Description check | ✅ Passed | The PR description is comprehensive and well-structured, covering summary, user-facing changes, enforcement, infrastructure, test plan, and implementation notes. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


coderabbitai[bot]

This comment was marked as resolved.

@kathiekiwi kathiekiwi force-pushed the feature/billing-limits branch from 5c1a4bf to ac87dcd Compare June 22, 2026 09:05
@pkg-pr-new

pkg-pr-new Bot commented Jun 22, 2026


@trigger.dev/build

npm i https://pkg.pr.new/@trigger.dev/build@cb4df39

trigger.dev

npm i https://pkg.pr.new/trigger.dev@cb4df39

@trigger.dev/core

npm i https://pkg.pr.new/@trigger.dev/core@cb4df39

@trigger.dev/python

npm i https://pkg.pr.new/@trigger.dev/python@cb4df39

@trigger.dev/react-hooks

npm i https://pkg.pr.new/@trigger.dev/react-hooks@cb4df39

@trigger.dev/redis-worker

npm i https://pkg.pr.new/@trigger.dev/redis-worker@cb4df39

@trigger.dev/rsc

npm i https://pkg.pr.new/@trigger.dev/rsc@cb4df39

@trigger.dev/schema-to-json

npm i https://pkg.pr.new/@trigger.dev/schema-to-json@cb4df39

@trigger.dev/sdk

npm i https://pkg.pr.new/@trigger.dev/sdk@cb4df39

commit: cb4df39

coderabbitai[bot]

This comment was marked as resolved.

@kathiekiwi kathiekiwi force-pushed the feature/billing-limits branch from ac87dcd to 80b0a02 Compare June 22, 2026 09:52
@kathiekiwi kathiekiwi changed the title from "Billing limits" to "feat(webapp): billing limits — pause, reject, recovery, and settings UI" Jun 22, 2026
devin-ai-integration[bot]

This comment was marked as resolved.

@kathiekiwi kathiekiwi force-pushed the feature/billing-limits branch from 80b0a02 to 18bce2f Compare June 22, 2026 16:12
devin-ai-integration[bot]

This comment was marked as resolved.

@kathiekiwi kathiekiwi force-pushed the feature/billing-limits branch from 18bce2f to c338591 Compare June 22, 2026 19:17
devin-ai-integration[bot]

This comment was marked as resolved.

@kathiekiwi kathiekiwi force-pushed the feature/billing-limits branch from c338591 to 31b6df9 Compare June 22, 2026 20:13
devin-ai-integration[bot]

This comment was marked as resolved.

@kathiekiwi kathiekiwi force-pushed the feature/billing-limits branch from 31b6df9 to 3200c8a Compare June 22, 2026 22:06
devin-ai-integration[bot]

This comment was marked as resolved.

@kathiekiwi kathiekiwi force-pushed the feature/billing-limits branch from 3200c8a to 7763a7a Compare June 22, 2026 22:56
coderabbitai[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

@kathiekiwi kathiekiwi force-pushed the feature/billing-limits branch from 7763a7a to 6d9aa23 Compare June 23, 2026 08:27
devin-ai-integration[bot]

This comment was marked as resolved.

@kathiekiwi kathiekiwi force-pushed the feature/billing-limits branch from 6d9aa23 to cc28bfd Compare June 23, 2026 16:28
@kathiekiwi kathiekiwi marked this pull request as ready for review June 23, 2026 16:28
devin-ai-integration[bot]

This comment was marked as resolved.

@kathiekiwi kathiekiwi force-pushed the feature/billing-limits branch from cc28bfd to 2a89262 Compare June 23, 2026 17:50
devin-ai-integration[bot]

This comment was marked as resolved.

@kathiekiwi kathiekiwi force-pushed the feature/billing-limits branch from 2a89262 to b478344 Compare June 23, 2026 18:18
coderabbitai[bot]

This comment was marked as resolved.

@kathiekiwi kathiekiwi force-pushed the feature/billing-limits branch from b478344 to ff448c9 Compare June 23, 2026 18:35
@kathiekiwi kathiekiwi force-pushed the feature/billing-limits branch 2 times, most recently from cb4df39 to 19f5d47 Compare June 25, 2026 17:18
Add the EnvironmentPauseSource enum and migration, plus the billing-limit platform client wrappers and schemas.
Configure a spend limit, manage billing alerts, and surface org-wide banners.
Converge billable environments to paused via webhook and a reconciliation worker; block manual resume.
Reject triggers with a 422 once entitlement reports no access, and bust the entitlement cache on state changes.
Recovery UI and durable resolve: cancel queued runs before unpausing, with reconciliation as a safety net.
Optionally cancel in-progress runs on limit hit via a deduplicated bulk-cancel job.
@kathiekiwi kathiekiwi force-pushed the feature/billing-limits branch 2 times, most recently from 5eb8d46 to ecf6630 Compare June 25, 2026 21:01

@devin-ai-integration devin-ai-integration Bot left a comment


Devin Review found 1 new potential issue.

View 1 additional finding in Devin Review.


Comment on lines +114 to +134
const existing = await prismaClient.bulkActionGroup.findFirst({
  where: {
    environmentId: environment.id,
    type: BulkActionType.CANCEL,
    AND: [
      {
        params: {
          path: ["source"],
          equals: options.source,
        },
      },
      {
        params: {
          path: ["dedupeKey"],
          equals: options.dedupeKey,
        },
      },
    ],
  },
  select: { id: true, friendlyId: true },
});


🚩 Bulk cancel dedupe query scans JSONB params without an index

The BillingLimitBulkCancelService at apps/webapp/app/v3/services/billingLimit/BillingLimitBulkCancelService.server.ts:114-134 deduplicates cancel actions by querying bulkActionGroup.params JSONB path filters (path: ["source"] and path: ["dedupeKey"]). Without a GIN index on the params column of BulkActionGroup, this requires a sequential scan of all cancel bulk actions for the environment. For orgs with many historical bulk actions, this could be slow during billing limit events. The query is scoped to a single environmentId and type: CANCEL, which limits the scan somewhat.
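
For reference, the dedupe semantics of that query can be modelled in memory as a find-or-create keyed on `(environmentId, source, dedupeKey)`. This is only an illustrative sketch of the behaviour, not the service's code and not a proposed fix: `InMemoryBulkCancelStore`, `BulkCancelGroup`, and the `bulk_` ID format are hypothetical, and how the real lookup performs depends on the Postgres indexes available.

```typescript
// Hypothetical in-memory stand-in for the BulkActionGroup dedupe lookup.
interface BulkCancelGroup {
  id: string;
  environmentId: string;
  source: string;
  dedupeKey: string;
}

class InMemoryBulkCancelStore {
  private readonly groups = new Map<string, BulkCancelGroup>();

  // Returns the existing group for this key, or creates one if none exists.
  findOrCreate(
    environmentId: string,
    source: string,
    dedupeKey: string
  ): { group: BulkCancelGroup; created: boolean } {
    // JSON-encode the tuple so distinct parts cannot collide in the key.
    const key = JSON.stringify([environmentId, source, dedupeKey]);
    const existing = this.groups.get(key);
    if (existing) return { group: existing, created: false };

    const group: BulkCancelGroup = {
      id: `bulk_${this.groups.size + 1}`,
      environmentId,
      source,
      dedupeKey,
    };
    this.groups.set(key, group);
    return { group, created: true };
  }
}
```

In this model the lookup is a single keyed read, which is the behaviour an index covering the dedupe fields would approximate.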


@kathiekiwi kathiekiwi force-pushed the feature/billing-limits branch from ecf6630 to 1662563 Compare June 25, 2026 22:18
… tests

Add the usage-bar marker, documentation, and test coverage.
CI unit-test workers have no global Postgres/Redis on localhost (testcontainers
use random ports). Two latent fragilities surface once new test files shift the
shard layout:

- Modules build a Redis-backed singleton at import (auto-increment counter via
  triggerTask.server) and throw during collection when REDIS_HOST is unset.
- Shared background singletons (OrganizationDataStoresRegistry) poll the global
  database at startup and reject async, which vitest flags as unhandled.

Set harmless REDIS_HOST/PORT defaults, swallow only the Prisma P1001
"can't reach database" unhandled rejection (other rejections stay fatal), and
inject a runs-repository stub in the dedupe unit test so it does not reach the
production clickhouse factory. Temporary infra workaround; owner: platform.
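
A narrow rejection filter of the kind described here could look roughly like the sketch below. It is an assumption-laden illustration rather than the code in this commit: `isDatabaseUnreachable` and `handleUnhandledRejection` are hypothetical names, and the check inspects both `code` and `errorCode` because Prisma error classes do not all expose the code under the same property.

```typescript
// P1001 is Prisma's "can't reach database server" error code.
const UNREACHABLE_DATABASE_CODE = "P1001";

// True only for errors that carry the P1001 code.
function isDatabaseUnreachable(reason: unknown): boolean {
  if (typeof reason !== "object" || reason === null) return false;
  const candidate = reason as { code?: unknown; errorCode?: unknown };
  return (
    candidate.code === UNREACHABLE_DATABASE_CODE ||
    candidate.errorCode === UNREACHABLE_DATABASE_CODE
  );
}

// Swallows the unreachable-database rejection; everything else stays fatal.
function handleUnhandledRejection(reason: unknown): void {
  if (isDatabaseUnreachable(reason)) return;
  throw reason;
}
```

In a test setup file, a handler like this would typically be registered with Node's `process.on("unhandledRejection", ...)`.
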
@kathiekiwi kathiekiwi force-pushed the feature/billing-limits branch from 1662563 to 6608555 Compare June 25, 2026 22:48

@devin-ai-integration devin-ai-integration Bot left a comment


Devin Review found 1 new potential issue.


Comment on lines +19 to +25
if (resumeMode === "new_only") {
  await BillingLimitBulkCancelService.cancelQueuedRuns(organizationId, {
    dedupeKey: buildBillingLimitResolveDedupeKey(organizationId, resolvedAt),
  });
}

await convergeBillingLimitEnvironmentsForOrg(organizationId, "ok");


🔴 Queued runs can start executing before cancellation completes when user chooses 'Cancel queued runs' during billing limit resolve

Environments are unpaused (convergeBillingLimitEnvironmentsForOrg at billingLimitConvergeResolve.server.ts:25) immediately after the cancel job is merely enqueued (cancelQueuedRuns at billingLimitConvergeResolve.server.ts:20), so queued runs can be dequeued and start executing before the bulk-cancel worker processes them.

Impact: When a user explicitly chooses "Cancel queued runs" during billing limit resolve, some of those queued runs may execute anyway, potentially incurring charges the user was trying to avoid.

Async bulk cancel is enqueued but environments are unpaused inline before it runs

The convergeBillingLimitResolve function handles the new_only resume mode (user chose to cancel queued runs):

  1. BillingLimitBulkCancelService.cancelQueuedRuns at billingLimitConvergeResolve.server.ts:20-22 creates BulkActionGroup records and enqueues processBulkAction jobs via the common worker (BillingLimitBulkCancelService.server.ts:170). The await resolves when the job is enqueued, not when cancellation is complete.

  2. Immediately after, convergeBillingLimitEnvironmentsForOrg(organizationId, "ok") at billingLimitConvergeResolve.server.ts:25 unpauses all billing-limit-paused environments, restoring their concurrency limits via updateEnvConcurrencyLimits (billingLimitConvergeEnvironments.server.ts:193).

  3. The run queue can now dequeue PENDING runs from those environments.

  4. The bulk cancel worker job hasn't executed yet — it searches for runs with QUEUED_STATUSES (BillingLimitBulkCancelService.server.ts:59), but runs that were already dequeued in step 3 have transitioned to DEQUEUED/EXECUTING and escape cancellation.

The invariant that should hold for resumeMode === "new_only" is: no queued runs from the billing-limit pause window should be dequeued. The current ordering violates this because concurrency is restored before the cancel job runs.

Prompt for agents
The problem is in convergeBillingLimitResolve (billingLimitConvergeResolve.server.ts:19-25). When resumeMode is 'new_only', the function enqueues bulk-cancel jobs for queued runs, then immediately unpauses environments. But the bulk-cancel is async (processed by the common worker later), so the run queue can dequeue runs before the cancel job executes.

The fix should ensure that queued runs are cancelled (or at least prevented from being dequeued) BEFORE environments are unpaused. Several approaches:

1. Process the cancel inline (synchronously) instead of via the async worker, then unpause environments. This is the most reliable but may be slow for large backlogs.

2. Use a two-phase approach: first cancel queued runs inline or wait for the bulk cancel to complete, then enqueue a separate job to unpause environments.

3. If inline cancel is too expensive, consider keeping environments paused until the bulk cancel job completes, and have the bulk cancel job trigger the unpause as its final step.

The key constraint is that environment concurrency must not be restored until the cancel has processed all queued runs, so the ordering must be: cancel first, then unpause.
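
The third approach (keep environments paused and let the cancel job unpause as its final step) can be sketched with a small synchronous simulation. This is a toy model under stated assumptions, not the webapp's run queue or worker: `SimulatedOrg`, `resolveLimit`, and the run status names are hypothetical stand-ins.

```typescript
// Toy model of the ordering constraint: cancel first, then unpause.
type RunStatus = "QUEUED" | "EXECUTING" | "CANCELED";

interface Run {
  id: string;
  status: RunStatus;
}

type Job = () => void;

class SimulatedOrg {
  paused = true;
  readonly jobs: Job[] = [];

  constructor(readonly runs: Run[]) {}

  // The run queue only dequeues while the environment is unpaused.
  dequeue(): void {
    if (this.paused) return;
    for (const run of this.runs) {
      if (run.status === "QUEUED") run.status = "EXECUTING";
    }
  }

  // Stand-in for the background worker draining its job queue.
  runWorker(): void {
    while (this.jobs.length > 0) {
      const job = this.jobs.shift();
      if (job) job();
    }
  }
}

function resolveLimit(org: SimulatedOrg, resumeMode: "queue" | "new_only"): void {
  if (resumeMode === "queue") {
    org.paused = false; // resume queued runs as they are
    return;
  }
  // new_only: enqueue the cancel and leave the environment paused.
  org.jobs.push(() => {
    for (const run of org.runs) {
      if (run.status === "QUEUED") run.status = "CANCELED";
    }
    org.paused = false; // unpausing is the job's final step
  });
}
```

In this model a dequeue attempt between `resolveLimit` and `runWorker` is a no-op, so no queued run can escape cancellation. The trade-off is that environments stay paused until the worker gets to the job.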


2 participants