Feat/analytics api hardening #2
Greptile Summary
This PR significantly hardens the analytics API by removing the deduplication table, status fields, and server-side site config.
Confidence Score: 3/5
Two P1 correctness issues need resolution before merge: failed-aggregation events have no sentinel state (causing unbounded retry churn), and the subtract helpers can produce negative counters in partial-hour edge queries.
The PR's architectural direction is sound: removing the deduplication table, status fields, and server-side site config all simplify the system meaningfully, and the new edge-hour query strategy is a genuine improvement. The high-load and integrity test coverage is strong. However, two concrete correctness gaps remain:
(1) A failed event in aggregateEventsByIds is left with aggregatedAt: null indefinitely, so every subsequent ingest batch for the site will re-attempt it with no backoff and no way to distinguish it from legitimately pending events.
(2) subtractTopRows/subtractOverviewTotals can emit negative field values when unaggregated events exist in the excluded time slice, producing wrong dashboard numbers.
Files needing attention: src/component/ingest.ts (failed-event sentinel), src/component/analytics.ts (subtractTopRows and subtractOverviewTotals negative-value guards)
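The two gaps flagged in the review could be addressed along the following lines. This is a minimal sketch, not the PR's code: the `OverviewTotals` shape, the `failedAt` and `attempts` fields, `MAX_ATTEMPTS`, and the helper names `subtractClamped`, `markFailure`, and `isPending` are all hypothetical, since the review does not show the real types.

```typescript
// Hypothetical totals shape; the real fields in analytics.ts are not shown.
type OverviewTotals = { pageviews: number; visitors: number; sessions: number };

// Subtract field by field, clamping at zero so that an excluded slice
// larger than the total can never surface as a negative counter.
function subtractClamped(total: OverviewTotals, excluded: OverviewTotals): OverviewTotals {
  return {
    pageviews: Math.max(0, total.pageviews - excluded.pageviews),
    visitors: Math.max(0, total.visitors - excluded.visitors),
    sessions: Math.max(0, total.sessions - excluded.sessions),
  };
}

// Hypothetical event shape with a failure sentinel (failedAt) and a retry counter.
type PendingEvent = {
  id: string;
  aggregatedAt: number | null;
  failedAt: number | null;
  attempts: number;
};

// Assumed retry budget; the review does not prescribe a number.
const MAX_ATTEMPTS = 3;

// Record one failed aggregation attempt. Once the budget is exhausted,
// failedAt is set so the event stops matching the pending filter.
function markFailure(event: PendingEvent, now: number): PendingEvent {
  const attempts = event.attempts + 1;
  return { ...event, attempts, failedAt: attempts >= MAX_ATTEMPTS ? now : null };
}

// An event is pending only if it is neither aggregated nor permanently failed.
function isPending(event: PendingEvent): boolean {
  return event.aggregatedAt === null && event.failedAt === null;
}
```

Clamping only hides the symptom on the dashboard; if the excluded slice can legitimately exceed the total, the underlying mismatch between aggregated and unaggregated events would still need a fix at the query level.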
Sequence Diagram
sequenceDiagram
participant Browser
participant HTTP as HTTP Ingest Route
participant IB as ingestBatch (mutation)
participant RPSE as reducePendingSiteEvents (internalMutation)
participant AEB as aggregateEventsByIds
participant DB as Convex DB
Browser->>HTTP: POST /analytics/ingest (writeKey + events)
HTTP->>HTTP: hashWriteKey(writeKey)
HTTP->>IB: ctx.runMutation(ingestBatch, {writeKeyHash, events})
IB->>DB: query sites by writeKeyHash
DB-->>IB: site (active)
IB->>DB: insert events (aggregatedAt: null)
IB->>RPSE: scheduler.runAfter(0, reducePendingSiteEvents)
IB-->>HTTP: {accepted, rejected}
HTTP-->>Browser: 200 OK
RPSE->>DB: query events where aggregatedAt=null (take 101)
DB-->>RPSE: pendingEvents
RPSE->>AEB: aggregateEventsByIds(eventIds)
AEB->>DB: upsertVisitor, upsertSession
AEB->>DB: accumulateRollupShards → flushRollupShards
AEB->>DB: patch event.aggregatedAt = now
AEB-->>RPSE: {aggregated, skipped, failed}
alt hasMore
RPSE->>RPSE: scheduler.runAfter(0, reducePendingSiteEvents)
end
note over DB: getOverview query uses buildEdgeHourPlan for partial-hour boundaries, buildExactRangePlan for full-hour/day rollups
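The reducer steps in the diagram (fetch 101 pending events, aggregate, reschedule while hasMore) can be modelled in isolation. The sketch below is an in-memory stand-in, not the Convex mutation itself: the batch size of 100 is an assumption inferred from "take 101", and `takePendingBatch` and `drain` are hypothetical names.

```typescript
// Assumption: a batch size of 100, inferred from the "take 101" in the diagram.
const BATCH_SIZE = 100;

type QueuedEvent = { id: number; aggregatedAt: number | null };

// Fetch up to batchSize + 1 pending events. The extra row only signals
// that more work remains; it is not processed in this pass.
function takePendingBatch(
  events: QueuedEvent[],
  batchSize: number = BATCH_SIZE,
): { batch: QueuedEvent[]; hasMore: boolean } {
  const fetched = events.filter((e) => e.aggregatedAt === null).slice(0, batchSize + 1);
  return { batch: fetched.slice(0, batchSize), hasMore: fetched.length > batchSize };
}

// Stand-in for the self-rescheduling mutation: each loop iteration models
// one scheduled run. Returns the number of runs needed to drain the queue.
function drain(events: QueuedEvent[], now: number, batchSize: number = BATCH_SIZE): number {
  let runs = 0;
  let hasMore = true;
  while (hasMore) {
    const next = takePendingBatch(events, batchSize);
    // In this sketch every event aggregates successfully.
    for (const event of next.batch) event.aggregatedAt = now;
    hasMore = next.hasMore;
    runs += 1;
  }
  return runs;
}
```

Every event succeeds in this model. An event that fails and keeps aggregatedAt at null would be returned by the pending filter on every later run, which is the retry churn the review flags.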
No description provided.