
Instrument upload-pipeline latency before deciding on async variants #294

Description

@miratcan

Why

The story upload endpoint (_try_minio_upload in answer/api/router.py) does real work inside the request: it decodes the original, generates 3 responsive WebPs plus blur and dominant colour, and pushes 4 MinIO objects. The work is threaded, but it still sits on the request path.

For a 3-5 MB iPhone original this is plausibly 2-5s of user-visible latency. The roadmap (#274) proposed moving this to Celery; that's premature at current scale, but we can't tell when it stops being premature without measuring.

Scope

  1. Log per-stage timings (decode, each variant encode+put, blur, dominant colour, total) in a single structured log line, e.g. upload.timing with fields user_id, bytes_in, bytes_out_lg/md/sm, per-stage durations, total
  2. Surface a simple aggregate query target. Easiest: write an UploadLatencyEvent row (or piggyback on an existing audit table) so we can look at p50/p95/p99 without a logs pipeline
  3. After 2-4 weeks of data, decide: stay sync, push to background thread, or commit to a queue
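
A minimal sketch of what step 1 could look like, assuming plain stdlib logging and JSON-encoded log lines. `StageTimer`, the `_ms` field suffix, and the stage names are hypothetical; nothing here reflects the actual code in answer/api/router.py.

```python
import json
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("upload.timing")


class StageTimer:
    """Collects wall-clock durations (in ms) for the named stages of one upload."""

    def __init__(self):
        self.durations_ms = {}
        self._start = time.perf_counter()

    @contextmanager
    def stage(self, name):
        t0 = time.perf_counter()
        try:
            yield
        finally:
            # Record the duration even if the stage raises, so failed
            # uploads still show where the time went.
            elapsed = (time.perf_counter() - t0) * 1000
            self.durations_ms[f"{name}_ms"] = round(elapsed, 2)

    def finish(self, **fields):
        """Emit one structured log line and return the record."""
        total = round((time.perf_counter() - self._start) * 1000, 2)
        record = {
            "event": "upload.timing",
            **fields,
            **self.durations_ms,
            "total_ms": total,
        }
        logger.info(json.dumps(record))
        return record
```

Inside the upload handler, each stage would be wrapped as `with timer.stage("decode"): ...`, followed by one `timer.finish(user_id=..., bytes_in=...)` call at the end. The returned dict could also feed the UploadLatencyEvent row from step 2, so the log line and the table stay in sync.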

Threshold to revisit (rough)

Move to async when any of:

  • p99 upload duration crosses ~3s consistently
  • Gunicorn workers visibly back up during upload bursts (worker pool saturation)
  • A 4th+ variant gets added (AVIF, larger lg, etc.)
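
The p99 criterion can be checked with a few lines once total durations are available, for example read back from the UploadLatencyEvent rows. This is a sketch under assumptions: `latency_percentiles` and `p99_over_threshold` are hypothetical helpers, the 3000 ms default mirrors the rough ~3s figure above, and "consistently" would still need to be judged across several windows (say, per day) rather than from a single call.

```python
from statistics import quantiles


def latency_percentiles(durations_ms):
    """Return p50/p95/p99 for a list of total upload durations in ms.

    Needs at least two samples; statistics.quantiles raises otherwise.
    """
    cuts = quantiles(durations_ms, n=100)  # 99 cut points: cuts[k-1] is the k-th percentile
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def p99_over_threshold(durations_ms, threshold_ms=3000):
    """True if the p99 of this window exceeds the threshold."""
    return latency_percentiles(durations_ms)["p99"] > threshold_ms
```

Running this once per day over that day's uploads, and revisiting only after several consecutive days come back True, is one way to read "consistently".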

Split from #274
