Summary
Implement a per-user daily AI spend budget for Pro correction refinement. The initial limit is $0.02 per user per day, stored as 20,000 micro-USD.
Usage tracking should live in the propositions service Postgres database, not Redis. Postgres is the durable source of truth for budget checks, reservations, actual token reconciliation, and later US-015 telemetry.
Implementation Changes
- Add AI usage budget options:
  - TextComparison:AiRefinement:DailyBudgetUsd = 0.02
  - TextComparison:AiRefinement:BudgetTimezone = UTC
  - TextComparison:AiRefinement:EstimatedOutputTokens
  - TextComparison:AiRefinement:InputUsdPerMillionTokens
  - TextComparison:AiRefinement:OutputUsdPerMillionTokens
  - TextComparison:AiRefinement:Model
- Store money as integer micro-USD:
  - $1.00 = 1,000,000
  - $0.02 = 20,000
- Add Postgres tables through the propositions AppDbContext:
  - AiDailyUsage: one row per userId + dateUtc.
  - AiUsageLedger: one row per AI refinement reservation/attempt.
- Add an AiUsageBudgetService used by the correction orchestrator after US-011 input validation and before the US-010 AI call.
- Use a reserve/call/reconcile flow:
  - Build the final AI prompt payload.
  - Estimate input tokens with ceil(promptCharacterCount / 4).
  - Estimate output tokens from configured max output tokens.
  - Estimate cost using configured model rates.
  - Atomically reserve budget in Postgres.
  - If reservation fails, skip AI.
  - If reservation succeeds, call AI.
  - Reconcile actual token usage from ChatResponse.Usage.InputTokenCount and OutputTokenCount when available.
  - Move reserved cost into actual cost, or release unused reservation on timeout/failure when no usage is available.
- Do not retry AI when budget reservation fails.
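The estimation steps above can be sketched as follows. This is a minimal illustration in Python rather than the service's own code. The function names and the sample rates ($0.15 and $0.60 per million tokens) are hypothetical, and rounding the estimate up to the next whole micro-USD is an assumption the plan does not state. The sketch relies on one convenient identity: a rate in USD per million tokens is numerically equal to micro-USD per token.

```python
import math
from decimal import Decimal, ROUND_CEILING

MICROS_PER_USD = 1_000_000


def usd_to_micros(usd: str) -> int:
    """Convert a decimal USD string such as "0.02" to integer micro-USD."""
    return int(Decimal(usd) * MICROS_PER_USD)


def estimate_input_tokens(prompt: str) -> int:
    """Estimate input tokens as ceil(promptCharacterCount / 4)."""
    return math.ceil(len(prompt) / 4)


def estimate_cost_micros(
    input_tokens: int,
    output_tokens: int,
    input_usd_per_million: str,
    output_usd_per_million: str,
) -> int:
    """Estimate cost in micro-USD from token counts and per-million rates.

    USD per million tokens equals micro-USD per token, so tokens * rate
    is already in micro-USD. Rounding up is an assumption (hypothetical
    policy) so that the reservation never undercounts.
    """
    micros = (
        Decimal(input_tokens) * Decimal(input_usd_per_million)
        + Decimal(output_tokens) * Decimal(output_usd_per_million)
    )
    return int(micros.to_integral_value(rounding=ROUND_CEILING))


# Hypothetical example: a 1,001-character prompt and 500 estimated output tokens.
tokens_in = estimate_input_tokens("a" * 1001)
estimate = estimate_cost_micros(tokens_in, 500, "0.15", "0.60")
```

Parsing the configured values as decimals rather than floats avoids binary rounding surprises before the amounts become integers.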
Data And Concurrency
AiDailyUsage should include:
- UserId
- DateUtc
- BudgetMicros
- ReservedCostMicros
- ActualCostMicros
- UpdatedAtUtc
- unique index on UserId + DateUtc
AiUsageLedger should include:
- UserId
- DateUtc
- PropositionId
- OperationId
- Model
- Status: reserved, completed, released, failed, skipped
- estimated/actual input and output tokens
- estimated/actual cost micros
- FailureReason
- CreatedAtUtc
- CompletedAtUtc
Budget reservation must be atomic:
- use a transaction with row lock, or
- use an atomic INSERT ... ON CONFLICT ... DO UPDATE ... WHERE actual + reserved + estimate <= budget.
Include pending reservations in the daily spend check so concurrent tabs cannot overspend.
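The single-statement option can be sketched as below. To keep the example runnable without a database server, it uses Python's built-in sqlite3 module as a stand-in, since SQLite accepts a similar upsert syntax. The equivalent Postgres statement is not tested here and may need adjustments such as parameter type casts. The snake_case table and column names and the try_reserve helper are hypothetical. The SELECT ... WHERE guard on the insert is an addition beyond the plan's wording: without it, the first reservation of the day would bypass the budget check, because the DO UPDATE ... WHERE clause only runs when a row already exists.

```python
import sqlite3

SCHEMA = """
CREATE TABLE ai_daily_usage (
    user_id TEXT NOT NULL,
    date_utc TEXT NOT NULL,
    budget_micros INTEGER NOT NULL,
    reserved_cost_micros INTEGER NOT NULL DEFAULT 0,
    actual_cost_micros INTEGER NOT NULL DEFAULT 0,
    UNIQUE (user_id, date_utc)
)
"""

# One statement either creates the daily row, adds to its reservation,
# or changes nothing when the estimate does not fit.
RESERVE_SQL = """
INSERT INTO ai_daily_usage (user_id, date_utc, budget_micros, reserved_cost_micros)
SELECT :user_id, :date_utc, :budget, :estimate
WHERE :estimate <= :budget
ON CONFLICT (user_id, date_utc) DO UPDATE
SET reserved_cost_micros =
    ai_daily_usage.reserved_cost_micros + excluded.reserved_cost_micros
WHERE ai_daily_usage.actual_cost_micros
    + ai_daily_usage.reserved_cost_micros
    + excluded.reserved_cost_micros <= ai_daily_usage.budget_micros
"""


def try_reserve(conn, user_id, date_utc, estimate_micros, budget_micros=20_000):
    """Return True if the reservation was accepted, False if it would overspend."""
    cur = conn.execute(
        RESERVE_SQL,
        {
            "user_id": user_id,
            "date_utc": date_utc,
            "budget": budget_micros,
            "estimate": estimate_micros,
        },
    )
    conn.commit()
    # A rejected reservation inserts and updates nothing, so no row is affected.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
first = try_reserve(conn, "user-1", "2025-01-01", 8_000)   # fits: 8,000
second = try_reserve(conn, "user-1", "2025-01-01", 8_000)  # fits: 16,000
third = try_reserve(conn, "user-1", "2025-01-01", 8_000)   # would reach 24,000
```

Because the check and the increment happen in one statement, two tabs submitting at the same time cannot both pass a stale read of the remaining budget.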
Response Behavior
If budget is available and AI runs successfully:
- correctionMode = ai_refined
- aiAttempted = true
- aiLimitReached = false
If daily budget would be exceeded:
- skip AI,
- return the pre-AI result from static + US-009 cleanup,
- preserve correctionMode = static or normalized,
- set aiAttempted = false,
- set aiLimitReached = true,
- set aiLimitReason = daily_budget_reached.
If AI was attempted and fails/times out:
- keep US-012 fallback behavior,
- aiLimitReached = false unless the failure is budget-related.
UI should show a non-blocking warning when aiLimitReached = true, such as: “Daily AI refinement limit reached. Your result used standard correction.”
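The three response cases above can be summarized in a small decision function. This is a sketch only: the AiMetadata type and build_metadata helper are hypothetical names, and the failure branch assumes that the US-012 fallback keeps the pre-AI correction mode and reports aiAttempted = true, which this plan implies but does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AiMetadata:
    correction_mode: str
    ai_attempted: bool
    ai_limit_reached: bool
    ai_limit_reason: Optional[str] = None


def build_metadata(pre_ai_mode: str, reserved: bool, ai_succeeded: bool) -> AiMetadata:
    """Map the reservation and AI outcome to response metadata.

    pre_ai_mode is "static" or "normalized", the mode of the pre-AI result.
    """
    if not reserved:
        # Budget would be exceeded: AI is skipped and the pre-AI result is returned.
        return AiMetadata(pre_ai_mode, False, True, "daily_budget_reached")
    if ai_succeeded:
        return AiMetadata("ai_refined", True, False)
    # Assumption: AI was attempted but failed or timed out, so the US-012
    # fallback returns the pre-AI result without flagging a budget limit.
    return AiMetadata(pre_ai_mode, True, False)


skipped = build_metadata("normalized", reserved=False, ai_succeeded=False)
show_warning = skipped.ai_limit_reached
```

The UI only needs ai_limit_reached to decide whether to show the non-blocking warning; the reason field leaves room for other limit types later.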
Acceptance Criteria
- Pro users have a configurable daily AI spend budget, initially $0.02/day.
- Limit is tracked server-side in Postgres.
- Money is stored as integer micro-USD.
- AI cost is estimated before the AI call.
- Budget is reserved atomically before the AI call.
- Actual usage reconciles the reservation when token usage is available.
- When the budget limit is reached, AI is skipped.
- When AI is skipped due to budget, the user receives the pre-AI correction result.
- Response includes metadata for a non-blocking UI warning.
- Concurrent submissions cannot reserve more than the remaining daily budget; only reservations that fit are accepted.
Test Plan
Budget service tests:
- allows reservation when estimated cost fits within the 20,000 micro-USD daily budget,
- rejects reservation when estimated cost would exceed the budget,
- includes existing reserved cost in the check,
- includes existing actual cost in the check,
- creates one daily row per user/date,
- resets naturally on the next UTC date.
Reconciliation tests:
- successful AI call moves reserved cost to actual cost,
- actual token usage replaces estimated usage when available,
- timeout/failure releases reservation when actual usage is unavailable,
- a failed call that still reports token usage records the actual cost.
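The reconciliation cases listed above can be pinned down with a pure function that is easy to unit test. The sketch below is illustrative: DailyUsage and reconcile are hypothetical names, the status mapping (completed, failed, released) is one plausible reading of the ledger statuses, and charging the estimate when a successful call reports no usage is an assumption the plan does not cover.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DailyUsage:
    reserved_micros: int
    actual_micros: int


def reconcile(
    day: DailyUsage,
    estimate_micros: int,
    succeeded: bool,
    actual_cost_micros: Optional[int] = None,
) -> Tuple[DailyUsage, str]:
    """Settle one reservation and return the updated totals and ledger status."""
    if actual_cost_micros is not None:
        # Token usage was reported, so the actual cost replaces the estimate.
        charged = actual_cost_micros
    elif succeeded:
        # Assumption: a success without usage data is charged at the estimate.
        charged = estimate_micros
    else:
        # Timeout or failure without usage: the reservation is released.
        charged = 0

    if succeeded:
        status = "completed"
    elif actual_cost_micros is not None:
        status = "failed"
    else:
        status = "released"

    updated = DailyUsage(
        reserved_micros=day.reserved_micros - estimate_micros,
        actual_micros=day.actual_micros + charged,
    )
    return updated, status


after_success, success_status = reconcile(DailyUsage(338, 0), 338, True, 290)
```

In every branch the reservation leaves the reserved total, so a crashed or timed-out call cannot hold budget indefinitely once reconciliation runs.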
Orchestrator tests:
- Pro user under budget calls AI,
- Pro user over budget skips AI,
- budget skip returns pre-AI result with aiAttempted = false,
- budget skip includes aiLimitReached = true,
- Free/anonymous users never hit the AI budget service.
Concurrency tests:
- simultaneous reservations for one user cannot exceed the daily budget,
- only reservations that fit are accepted.
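The shape of such a concurrency test can be sketched with threads released at the same moment. The example below uses an in-memory budget guarded by a lock as a stand-in for the Postgres row lock or atomic upsert; a real test would run against the database. InMemoryBudget and run_simultaneous_reservations are hypothetical names.

```python
import threading


class InMemoryBudget:
    """Stand-in for the daily usage row; the lock plays the role of the row lock."""

    def __init__(self, budget_micros: int) -> None:
        self.budget_micros = budget_micros
        self.reserved_micros = 0
        self._lock = threading.Lock()

    def try_reserve(self, estimate_micros: int) -> bool:
        with self._lock:
            if self.reserved_micros + estimate_micros > self.budget_micros:
                return False
            self.reserved_micros += estimate_micros
            return True


def run_simultaneous_reservations(budget, estimate_micros, attempts):
    """Start all attempts together and collect which ones were accepted."""
    start = threading.Barrier(attempts)
    results = []

    def attempt():
        start.wait()  # hold every thread until all are ready
        results.append(budget.try_reserve(estimate_micros))

    threads = [threading.Thread(target=attempt) for _ in range(attempts)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


budget = InMemoryBudget(20_000)
results = run_simultaneous_reservations(budget, 8_000, attempts=10)
accepted = results.count(True)
```

With a 20,000 micro-USD budget and 8,000 micro-USD estimates, exactly two of the ten attempts can fit regardless of thread scheduling, which makes the assertion deterministic.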
API/frontend tests:
- response metadata supports a non-blocking warning,
- existing result rendering remains compatible.
Assumptions
- US-010, US-011, and US-012 are implemented first.
- Budget applies only to Pro AI refinement calls, not static comparison or US-009 deterministic cleanup.
- Daily budget is based on UTC date.
- Postgres is the source of truth; Redis is not used for MVP budget tracking.
- Token pricing is configurable because provider/model prices can change.