From 1d151d0503cee1b668312ecbb25acac7a69ff22a Mon Sep 17 00:00:00 2001
From: Zhifei Li <andylizf@outlook.com>
Date: Tue, 23 Jun 2026 22:16:58 -0700
Subject: [PATCH 01/13] chore(eval): make the reproduction package
 self-contained and public-safe

Reproduction:
- Consolidate to a single entry doc: rename eval/REPRODUCE.md -> eval/README.md
  (so it renders automatically when browsing eval/) and update all references.
- Document the three ways to supply tile images to the reader: self-hosted serve
  with materialized tiles, the public search API, and a self-hosted serve that
  renders tiles on demand from a Kiwix ZIM. Make TILES_DIR optional so the reader
  can use serve-returned base64 tiles instead of a local corpus.
- Remove the reader-side "local-wiki" rendering path entirely so all tile
  rendering happens serve-side: drop LocalWikiTiledScreenshotRetriever, the
  lookup_reference_url machinery, and the --local-wiki / --local-wiki-screenshot-dir
  / --lookup-reference-url flags (they relied on an out-of-tree module and
  hardcoded placeholder paths).
- Run benchmarks on their full (filtered) sets; keep the 1000-example subsample
  only for nq and sqa, which is what the paper reports.
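
A minimal sketch of what "TILES_DIR optional" could look like on the reader
side, for reviewers: prefer a local tile when a corpus is configured, otherwise
decode the base64 tile that serve returned. The function name and the hit
fields "tile_path" and "image_b64" are assumptions made for illustration, not
the names used in eval/lib.

```python
import base64
import os
from pathlib import Path
from typing import Optional


def load_tile_bytes(hit: dict, tiles_dir: Optional[str] = None) -> bytes:
    """Return image bytes for one search hit.

    Hypothetical hit schema: "tile_path" is a path relative to the corpus
    root, "image_b64" is a base64-encoded tile returned by serve.
    """
    # Fall back to the TILES_DIR environment variable when no argument is given.
    if tiles_dir is None:
        tiles_dir = os.environ.get("TILES_DIR")

    # 1) Local corpus, only if it is configured and the file really exists.
    rel = hit.get("tile_path")
    if tiles_dir and rel:
        candidate = Path(tiles_dir) / rel
        if candidate.is_file():
            return candidate.read_bytes()

    # 2) Otherwise use the tile embedded in the search response.
    encoded = hit.get("image_b64")
    if encoded:
        return base64.b64decode(encoded)

    raise ValueError("hit has neither a readable local tile nor a base64 payload")
```

With TILES_DIR unset or empty, only the second branch can succeed, which is the
case the public search API and the on-demand ZIM serve are meant to cover.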

Hygiene:
- Read the Jina API key from JINA_API_KEY instead of a hardcoded default.
- Update stale notes now that the indexes and tile corpus are published on HF.
- Drop internal working-notes docs from the repo and scrub leftover references to
  old internal repo/module names.
- Tidy .gitignore to use generic patterns and add a scoped eval/.gitignore.
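
For reference, the JINA_API_KEY change follows the usual pattern sketched
below. This is an illustration only: the helper name and the error message are
assumptions, and the actual code in eval/lib may differ.

```python
import os


def get_jina_api_key(env=None) -> str:
    """Read the key from JINA_API_KEY and fail loudly when it is missing.

    `env` is an optional mapping used in place of os.environ (handy in tests).
    """
    env = os.environ if env is None else env
    key = env.get("JINA_API_KEY", "").strip()
    if not key:
        # No hardcoded fallback: a missing key is a configuration error.
        raise RuntimeError("JINA_API_KEY is not set; export it before running")
    return key
```

Failing early keeps a missing key from surfacing later as an opaque HTTP error.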
---
 .gitignore                                    |   24 -
 .../internal/screenshot-optimization-notes.md |  239 --
 docs/reproducing_paper.md                     |  592 -----
 .../plans/2026-05-11-pixelrag-restructure.md  | 1315 ----------
 .../plans/2026-05-25-pixelrag-frontend.md     | 2313 -----------------
 .../2026-05-27-chromium-build-centralia.md    |  438 ----
 .../2026-05-11-pixelrag-restructure-design.md |  359 ---
 .../2026-05-25-pixelrag-frontend-design.md    |  290 ---
 eval/.gitignore                               |    6 +
 eval/PAPER_EXPERIMENT_MAP.md                  |  129 -
 eval/{REPRODUCE.md => README.md}              |   62 +-
 eval/REPRODUCE_PROGRESS.txt                   |  366 ---
 eval/lib/__init__.py                          |    2 -
 eval/lib/benchmarks.py                        |    8 +-
 eval/lib/grader.py                            |    6 +-
 eval/lib/retrieval.py                         |  425 +--
 eval/lib/retrievers.py                        |   20 +-
 eval/lib/simpleqa_data.py                     |    2 +-
 eval/pyproject.toml                           |    8 +-
 eval/reproduce.sh                             |   26 +-
 eval/run_bench.py                             |   47 +-
 eval/run_livevqa.py                           |    2 +-
 eval/serve_up.sh                              |    2 +-
 23 files changed, 78 insertions(+), 6603 deletions(-)
 delete mode 100644 docs/internal/screenshot-optimization-notes.md
 delete mode 100644 docs/reproducing_paper.md
 delete mode 100644 docs/superpowers/plans/2026-05-11-pixelrag-restructure.md
 delete mode 100644 docs/superpowers/plans/2026-05-25-pixelrag-frontend.md
 delete mode 100644 docs/superpowers/plans/2026-05-27-chromium-build-centralia.md
 delete mode 100644 docs/superpowers/specs/2026-05-11-pixelrag-restructure-design.md
 delete mode 100644 docs/superpowers/specs/2026-05-25-pixelrag-frontend-design.md
 create mode 100644 eval/.gitignore
 delete mode 100644 eval/PAPER_EXPERIMENT_MAP.md
 rename eval/{REPRODUCE.md => README.md} (65%)
 delete mode 100644 eval/REPRODUCE_PROGRESS.txt

diff --git a/.gitignore b/.gitignore
index a28f164..c82149b 100644
--- a/.gitignore
+++ b/.gitignore
@@ -37,33 +37,9 @@ logs/
 *.log
 arxiv
 demos/e2e/output/
-eval/eval_output/
 .superpowers/
 .vercel
 
-# Large local retrieval artifacts (not committed)
-eval/tmp_news_state.db
-eval/live_pixel_full.json
-eval/live_reader_full.json
-eval/frozen_reader_full.json
-eval/mms_base_live.jsonl
-eval/mms_lora_live.jsonl
-eval/mms_naive_live.jsonl
-eval/evqa_base_landmarks_live.jsonl
-eval/evqa_base_inat_live.jsonl
-eval/evqa_lora_landmarks_live.jsonl
-eval/evqa_lora_inat_live.jsonl
-eval/mms_traf_live.jsonl
-eval/evqa_traf_landmarks.jsonl
-eval/evqa_traf_inaturalist.jsonl
-eval/evqa_naive_landmarks.jsonl
-eval/evqa_naive_inaturalist.jsonl
-eval/mms_naive_nothink.jsonl
-eval/evqa_base_landmarks_nothink.jsonl
-eval/evqa_base_inat_nothink.jsonl
-eval/evqa_lora_landmarks_nothink.jsonl
-eval/evqa_lora_inat_nothink.jsonl
-eval/paper_grader_out/
 node_modules/
 .next/
 
diff --git a/docs/internal/screenshot-optimization-notes.md b/docs/internal/screenshot-optimization-notes.md
deleted file mode 100644
index d16c5d9..0000000
--- a/docs/internal/screenshot-optimization-notes.md
+++ /dev/null
@@ -1,239 +0,0 @@
-# Screenshot Throughput Optimization — Working Progress
-
-## Target: 150 t/s @ 100% correct (8192px tiles, maxi Wikipedia)
-
-## Current Best
-
-| Config | t/s | Correct | Notes |
-|--------|-----|---------|-------|
-| multi-process 48w (frameStoppedLoading) | **91** | 100% ✓ | Stable, production-ready |
-| multi-process 48w (frameNavigated) | **98** | 100% ✓ | Stable (igpu incompatible) |
-| multi-process 48w (2000 art) | **113** | 99.8% ✓ | Steady-state |
-| igpu 48w + frameStoppedLoading | **117-132** | 90-97% | Fast but 3-10% about:blank |
-| igpu 48w + directClip | **128-148** | 48-90% | Fastest, worst correctness |
-
-## Production System Comparison
-
-The wiki-screenshot production system (`~/pixelrag-src/wiki-screenshot/`) uses:
-```python
-wait_fonts = False    # for kiwix/ZIM datasource
-wait_images = False   # for kiwix/ZIM datasource
-pre_screenshot_delay = 0.5  # fixed 500ms sleep, no fonts.ready
-```
-- Playwright-based (not CDP websocket)
-- GPU-accelerated (8× L40S per machine)
-- Multi-machine: 4 machines × ~70-80 t/s = ~290 t/s total
-- Full Wikipedia (8.28M articles) processed in ~1 day
-
-Our optimizations added `fonts.ready + eager images + double-rAF` for pixel-perfect
-correctness. Production skips these waits entirely (`pre_screenshot_delay=0` in
-coordinator). This is safe for Kiwix because all assets (including fonts) are served
-from localhost — they load before `wait_until="load"` fires.
-
-Gemini Vision validation of 5000 production tiles:
-- 0% BROKEN_RENDER, 0% ERROR_PAGE (rendering is correct without font wait)  
-- 12% BLANK/PARTIAL_BLANK (tile loop overshoots page height — separate bug)
-
-**Benchmark result**: Removing font/image wait gives only +4% throughput (99 vs 96 t/s)
-because nav is not the bottleneck — capture IPC is. The 290 t/s production rate comes
-from 4 machines × GPU acceleration, not from skipping font waits.
-
-## Pipeline Bottleneck Analysis
-
-```
-Stage         Capacity    Bottleneck?
-Nav           430 pg/s    No (3.4x headroom)
-Capture       125 t/s     YES (C/T_c = 48/321ms)
-
-Steady-state theoretical: 125-150 t/s
-Actual (200 art): 98 t/s (75% utilization, 25% = nav serial)
-Actual (2000 art): 113 t/s (85% utilization)
-```
-
-Per-capture breakdown at 48 concurrent:
-- IPC roundtrip: 181ms (ForceRedraw browser→renderer→compositor, 8 async hops)
-- DrawRenderPass: 62ms (composite 136 quads)
-- CopyDrawnRenderPass: 46ms (memcpy 28MB)
-
-Throughput = `C / T_c(C)` converges at ~125-130 t/s (USL contention curve).
-Nav latency (186ms) does not affect steady-state throughput (Little's Law).
-Minimum workers to saturate capture: `C × (1 + T_nav/T_cap) = 72`.
-
-## Chromium Patches (in custom build)
-
-| Patch | File | Impact |
-|-------|------|--------|
-| rawFilePath | page_handler.cc + Page.pdl | Async write raw BGRA to /dev/shm (ThreadPool) |
-| directClip | page_handler.cc + Page.pdl | CopyFromSurface(src_rect) without emulation change |
-| skipRedraw | page_handler.cc + Page.pdl | ForceRedrawWithCallback → CopyFromSurface |
-| ForceRedrawWithCallback | render_widget_host_impl.cc | Lightweight ForceRedraw with commit callback |
-| directClip ForceRedraw fix | page_handler.cc | directClip also does ForceRedraw before copy |
-
-## Strategy Architecture
-
-Strategies separated from bench framework:
-- `pixelrag_render.strategies/` — capture strategies (CDPPhased, CDPSequential, etc.)
-- `pixelrag_render.bench/` — measurement harness with GT validation + experiment dump
-- `Bench` class: `bench.run(strategy)` → GT cache + capture + verify + JSON dump
-
-### CDPPhasedStrategy (best strategy)
-- Work-stealing queue (asyncio.Queue, not round-robin)
-- Semaphore-limited concurrent captures
-- `wait_for_event("Page.frameStoppedLoading")` filtered by main frameId
-- Per-tile semaphore release (fine-grained pipelining)
-- Configurable: tile_height, nav_timeout, use_direct_clip, extra_chrome_args
-
-### WebsocketConnection
-- Background `_recv_loop` for multiplexed CDP
-- `wait_for_event(method, timeout, filter_fn)` for async event listening
-- Supports concurrent `cdp()` calls via pending futures dict
-
-## What Was Tried
-
-### Worked
-- ✅ rawFilePath: async write bypasses PNG encoding (+15%)
-- ✅ directClip: parallel tile capture within viewport
-- ✅ Phased strategy: semaphore-limited captures reduce contention (+15%)
-- ✅ Work-stealing queue: better load balancing
-- ✅ frameNavigated/frameStoppedLoading wait: fixes igpu about:blank race
-- ✅ Presentation feedback ForceRedraw: 100% correct (but slower)
-
-### Partially Worked
-- ⚠️ --in-process-gpu: 120+ t/s but 5-10% about:blank captures
-- ⚠️ SwapPromise ForceRedraw: shot_p50 325→303ms (7% gain)
-- ⚠️ directClip for all tiles: fast but correctness depends on ForceRedraw
-
-### Did Not Work
-- ❌ --single-process: 168 t/s but 74% correct
-- ❌ peekPixels (SkiaRenderer): headless uses SoftwareRenderer
-- ❌ Immediate BeginFrame feedback flush: breaks frame pipeline
-- ❌ CDPScreenshotNewSurface: RequestRepaintOnNewSurface overhead
-- ❌ 2-tab pipelining: Chrome UI thread serializes ForceRedraw
-- ❌ Chrome flags (disable-lcd-text etc.): ±2%
-- ❌ headless_shell: slower than chrome (no shared HTTP cache)
-- ❌ One-shot strategy: launch overhead 1-2s/process
-- ❌ Firefox Playwright: 2.6x slower than Chrome
-- ❌ Servo (servoshell 0.1.0): stub package, not ready
-- ❌ CEF (cefpython3): abandoned, no modern Python wheel
-- ❌ WebKitGTK snapshot: needs GPU/display access
-- ❌ RequestRepaintOnNewSurface in skipRedraw: didn't fix igpu race
-- ❌ Bitmap dimension retry: about:blank renders at full viewport size
-- ❌ Pixel content retry: can't distinguish white page from about:blank
-
-## igpu About:blank Root Cause
-
-Chrome `--in-process-gpu` has two bugs at 48 concurrent workers:
-1. **frameNavigated event not fired**: Chrome sometimes silently drops
-   `Page.frameNavigated` CDP event under high concurrency. 
-   Fix: use `Page.frameStoppedLoading` (always reliable).
-2. **Compositor surface race**: ForceRedraw's presentation feedback arrives
-   before the new page's CompositorFrame is activated in viz. CopyFromSurface
-   reads the old surface (about:blank at 875×8192, indistinguishable from
-   real page by dimensions). No reliable Python-side detection possible.
-
-## Key Analysis Methods Used
-
-- **Pipeline bottleneck analysis** (closed queueing model)
-- **Little's Law**: steady-state throughput = C/T_c when capture-bound
-- **USL contention curve**: C/T_c(C) convergence at ~125-130 t/s
-- **USE method**: Utilization (79%), Saturation (semaphore queue), Errors (0)
-- **Per-capture breakdown**: DrawRenderPass (57ms) + CopyDrawnRenderPass (18ms)
-  + IPC overhead (95ms) measured via Chromium instrumentation
-
-## Scale Estimate
-
-30M tiles (18.7M articles × ~1.6 tiles/article):
-- Single machine 98 t/s: 30M/98 = 85 hours = **3.5 days**
-- Single machine 120 t/s (igpu, 95% correct): 30M/120 = 69 hours = **2.9 days**
-- 4 machines × 98 t/s = 392 t/s: 30M/392 = 21 hours = **< 1 day**
-- Production system (290 t/s, 4 machines): ~1 day (matches historical data)
-
-## Production Pipeline: fast_cdp backend
-
-```
-Chrome 48w (capture)  →  /dev/shm (raw BGRA)  →  ProcessPool 4w (JPEG)  →  disk
-     98 t/s                 28MB/tile               ~100 t/s                100KB/tile
-```
-
-Architecture:
-- `render_articles()` in `pixelrag_render.backends.fast_cdp`
-- Capture: CDPPhasedStrategy logic (work-stealing, semaphore, frameStoppedLoading)
-- Compression: `concurrent.futures.ProcessPoolExecutor(4)` — GIL-free, separate cores
-- Raw files in /dev/shm/pixelrag_render/ — auto-deleted after compression
-- Output: JPEG tiles + tiles.json manifest per article
-
-Key: compression never blocks capture. Chrome writes raw → returns immediately.
-Compression reads raw file asynchronously on different CPU cores.
-
-128-core machine: 48 cores for Chrome, 4 cores for JPEG, 76 cores idle.
-JPEG compression of 875×8192 takes ~10-20ms → 4 cores handle 200-400 t/s → 
-plenty of headroom over 98 t/s capture rate.
-
-Storage: 30M tiles × 100KB JPEG = ~3 TB
-
-## GPU Acceleration (Brewster H200 findings)
-
-Lab machines have 8× H200/B200 GPUs but:
-- `/dev/dri/renderD*` needs `render` group membership (no sudo)
-- Docker daemon not running; rootless docker lacks nvidia-container-toolkit
-- SwiftShader (CPU Vulkan) doesn't improve throughput vs software rendering
-- headless Chrome ignores `--use-gl` flags (GPU process crashes on init)
-- When GPU DOES init (via Xvfb + ANGLE), missing NVIDIA userspace drivers in container
-
-To unlock GPU: `sudo usermod -aG render $USER` on lab machine.
-Expected impact: 4x faster DrawRenderPass based on production system data.
-
-## Backend reconciliation & SPA-render fix (2026-06-11)
-
-### The three render code paths (who actually runs what)
-- `backends/websocket.py` — the **shipped** general-purpose renderer. The `pixelshot`
-  CLI, the `pixelbrowse` skill, and the `pixelrag index` pipeline (`render_urls`,
-  `backend="cdp"`/`"websocket"`) all go through it. Simple: per-worker queue, inline
-  JPEG over CDP, no extra deps.
-- `backends/fast_cdp.py` — high-throughput batch path (`render_articles`): phased-logic
-  capture + rawFilePath to /dev/shm + ProcessPool JPEG. **No in-repo caller** — invoked
-  only by an out-of-repo ops script. The 8.28M flagship Wikipedia index was built by a
-  *separate* system (Playwright/GPU/4-machine, see "Production System Comparison"), not
-  by either of these.
-- `strategies/*` — the benchmarking menu; used only by `bench/`. Kept as research scaffolding.
-
-### Regression fixed: websocket backend rendered SPAs / tall pages wrong
-`backends/websocket.py` had drifted from the established capture pattern — it had **no
-nav-completion wait** (fired `document.fonts.ready` immediately after `Page.navigate`)
-and **no per-tile scroll**, both of which `fast_cdp` and the production strategies have.
-Consequences:
-- JS/SPA pages were measured/captured mid-hydration at a transient (often much taller)
-  layout → tiled into mostly-empty space = blank tiles (this is the "tile loop overshoots
-  page height" blank bug noted under "Production System Comparison", here root-caused).
-- At small `tile_height` (the skill uses 1568) every tile past the first was blank,
-  because content below the short device viewport is never rasterized without scrolling.
-
-Fix (verified in `bench/` against ground truth at 100% on the smoke set):
-- Wait for the `load` event before measuring/capturing (`readyState==='complete'`
-  shortcut + 12s cap). SSR pages fire `load` ~as fast as `fonts.ready`, so ~0 cost
-  (measured: Wikipedia render time unchanged).
-- Scroll each tile into view before capture (mirrors `fast_cdp`).
-- Optional `--wait-network-idle` (JS PerformanceObserver) for pages that fetch content
-  after load; off by default (costs a quiet window/page), on by default in the skill.
-
-### Raw vs inline-JPEG is the dominant throughput lever (measured, 48w, N=600, this box)
-| config | correct | t/s | note |
-|---|---|---|---|
-| phased **raw** (fast_cdp config) | 99.7% | **306** | capture-only in bench; JPEG is decoupled/parallel |
-| phased jpeg (inline) | 98.2% | 182 | Chrome encodes JPEG on the capture critical path |
-| sequential raw | 99.7% | 221 | |
-| sequential jpeg (inline) | 98.2% | 142 | |
-
-Takeaways: (1) **inline JPEG encoding is the bottleneck** — bypassing it with rawFilePath
-+ parallel compression is ~+56-68%. (2) phased's semaphore/work-stealing buys ~+38% over
-sequential **in raw mode** (in jpeg mode the encoding bottleneck masks it to ~+8% — an
-earlier jpeg-only comparison was misleading). So `fast_cdp` is ~2x the simple inline path
-at batch scale and is **kept**. Absolute t/s here is optimistic (capture-only, short
-window, 128-core box) vs the ~91-113 production figure; the *ratios* are the point.
-
-### Design direction
-Ship **one simple backend** (`websocket.py`, inline JPEG) for the CLI/skill/`pixelrag index`
-— that scale doesn't need the raw+decoupled machinery, and the flagship index uses the
-separate system anyway. Keep `fast_cdp` + `strategies/` as batch/research code. The shared
-capture-readiness logic (load wait, scroll) should eventually live in one place so the
-shipped backend can't silently drift from the correct pattern again.
diff --git a/docs/reproducing_paper.md b/docs/reproducing_paper.md
deleted file mode 100644
index f27da6c..0000000
--- a/docs/reproducing_paper.md
+++ /dev/null
@@ -1,592 +0,0 @@
-# Reproducing Paper Results
-
-> **Paper**: *PixelRAG: Retrieval and Generation in Pixel Space over Millions of Web Screenshots*
->
-> This document maps every table and figure in the paper to the exact commands needed to reproduce the numbers.
-
-## Prerequisites
-
-### Infrastructure
-
-| Component | Description | Where |
-|-----------|-------------|-------|
-| **Wikipedia tile index (base)** | 28M vectors, Qwen3-VL-Embedding-2B (pretrained) | `pixelrag-data/search_index/` (215 GB FAISS IVF, dim=2048) |
-| **Wikipedia tile index (fine-tuned)** | 26M vectors, LoRA checkpoint-200 | `pixelrag-data/search_index_lora_vit_ckpt200_v2/` (202 GB) |
-| **Wikipedia text index** | 15.7M text chunks (1024 tokens, Trafilatura) | `pixelrag-data/text_search_index_1024/` (121 GB) |
-| **Article metadata** | URL↔tile mapping for 7.1M articles | `pixelrag-data/articles.json` (199 MB) |
-| **Tile images** | ~30M PNG tiles (1024×1024) | Remote NFS or local SSD (~5.6 TB) |
-| **News tile index** | 3.6M tiles (BBC/AP/CNN) for LiveVQA | S3: `s3://wiki-screenshot-tiles-backup/kiwix_tiles/news_image_search_index/` |
-| **News text index** | 866K text chunks for news | S3: `s3://wiki-screenshot-tiles-backup/kiwix_tiles/news_text_search_index/` |
-| **News tiles** | Raw PNG tiles for news articles | S3: `s3://wiki-screenshot-tiles-backup/kiwix_tiles/news_tiles/` |
-| **LoRA adapter** | Fine-tuned embedding LoRA weights | S3: `s3://wiki-screenshot-tiles-backup/kiwix_tiles/adapters/lora_vit_ckpt200/` |
-| **Kiwix ZIM** | Offline Wikipedia for HTML baselines | S3: `s3://wiki-screenshot-tiles-backup/kiwix_tiles/zim/` |
-
-All S3 paths use AWS profile `leann` (`aws s3 --profile leann ...`).
-
-### Services to Start
-
-```bash
-# 1. Screenshot search API (port 30888) — serves the pixel tile index
-pixelrag-serve \
-    --index-dir pixelrag-data/search_index \           # or search_index_lora_vit_ckpt200_v2
-    --tiles-dir /path/to/wikipedia_tiles \
-    --articles-json pixelrag-data/articles.json \
-    --model Qwen/Qwen3-VL-Embedding-2B \
-    --device cuda --port 30888
-
-# 2. Text search API (port 30889) — serves the text chunk index
-pixelrag-serve \
-    --index-dir pixelrag-data/text_search_index_1024 \
-    --tiles-dir /path/to/text_chunks \
-    --articles-json pixelrag-data/articles.json \
-    --model Qwen/Qwen3-VL-Embedding-2B \
-    --device cuda --port 30889
-
-# 3. Reader model (port 8000) — vLLM serving Qwen3.5-4B (default reader)
-vllm serve Qwen/Qwen3.5-4B-Instruct \
-    --port 8000 --tensor-parallel-size 1 \
-    --max-model-len 32768
-```
-
-### Environment
-
-```bash
-cd ~/pixelrag/eval
-
-# Install eval dependencies (one-time)
-uv pip install pandas tqdm trafilatura openai aiohttp datasets huggingface-hub
-
-# For grading
-export OPENAI_API_KEY=sk-...   # GPT-4.1 judge
-```
-
----
-
-## Table 1: Main Results (6 Benchmarks × 4 Methods)
-
-**Reader**: Qwen3.5-4B, **k=3**, **Grader**: GPT-4.1 judge (except LiveVQA = exact match)
-
-### No Retrieval (baseline)
-
-```bash
-# SimpleQA — no retrieval
-python run_bench.py \
-    --task simpleqa --model Qwen/Qwen3.5-4B-Instruct \
-    --num-examples 1000 --no-think
-
-# NQ — no retrieval
-python run_bench.py \
-    --task nq --model Qwen/Qwen3.5-4B-Instruct \
-    --num-examples 1000 --no-think
-
-# NQ-Tables — no retrieval
-python run_bench.py \
-    --task nq_tables --model Qwen/Qwen3.5-4B-Instruct \
-    --num-examples 1000 --no-think
-
-# MMSearch — no retrieval (300 examples)
-python run_bench.py \
-    --task mmsearch --model Qwen/Qwen3.5-4B-Instruct \
-    --num-examples 300 --no-think
-
-# EVQA — no retrieval (landmarks, automatic only, n=749)
-python run_bench.py \
-    --task encyclopedic_vqa --model Qwen/Qwen3.5-4B-Instruct \
-    --evqa-dataset-filter landmarks --evqa-question-type-filter automatic \
-    --num-examples 749 --no-think
-
-# LiveVQA — see "LiveVQA Separate Pipeline" section below
-```
-
-### Text Retrieval — Trafilatura (Text → Text)
-
-Requires: text search API on port 30889 with Trafilatura-parsed text chunks.
-
-```bash
-# SimpleQA — Trafilatura text retrieval
-python run_bench.py \
-    --task simpleqa --model Qwen/Qwen3.5-4B-Instruct \
-    --text-api --text-api-url http://localhost:30889/search \
-    --retrieval-top-k 3 --num-examples 1000 --no-think
-
-# NQ — Trafilatura text retrieval
-python run_bench.py \
-    --task nq --model Qwen/Qwen3.5-4B-Instruct \
-    --text-api --text-api-url http://localhost:30889/search \
-    --retrieval-top-k 3 --num-examples 1000 --no-think
-
-# NQ-Tables
-python run_bench.py \
-    --task nq_tables --model Qwen/Qwen3.5-4B-Instruct \
-    --text-api --text-api-url http://localhost:30889/search \
-    --retrieval-top-k 3 --num-examples 1000 --no-think
-
-# MMSearch (multimodal query: text + image → text index)
-python run_bench.py \
-    --task mmsearch --model Qwen/Qwen3.5-4B-Instruct \
-    --text-api --text-api-url http://localhost:30889/search \
-    --retrieval-top-k 3 --num-examples 300 --no-think
-
-# EVQA
-python run_bench.py \
-    --task encyclopedic_vqa --model Qwen/Qwen3.5-4B-Instruct \
-    --text-api --text-api-url http://localhost:30889/search \
-    --evqa-dataset-filter landmarks --evqa-question-type-filter automatic \
-    --retrieval-top-k 3 --num-examples 749 --no-think
-
-# LiveVQA — see "LiveVQA Separate Pipeline" section below
-```
-
-### Text Retrieval — mwparserfromhell
-
-Same as Trafilatura but requires a separate text index built with mwparserfromhell parser.
-The text API must be started pointing to that index.
-
-```bash
-# Same commands as Trafilatura above, but --text-api-url points to
-# the mwparserfromhell text index API (different port or index-dir).
-# The parser choice is baked into the index at build time, not a runtime flag.
-```
-
-### PixelRAG (base) — Screenshot → Screenshot
-
-Requires: screenshot search API on port 30888 with base (pretrained) embedding index.
-
-```bash
-# SimpleQA — pixel retrieval (base)
-python run_bench.py \
-    --task simpleqa --model Qwen/Qwen3.5-4B-Instruct \
-    --local-api --local-api-url http://localhost:30888/search \
-    --retrieval-top-k 3 --num-examples 1000 --no-think
-
-# NQ — pixel retrieval (base)
-python run_bench.py \
-    --task nq --model Qwen/Qwen3.5-4B-Instruct \
-    --local-api --local-api-url http://localhost:30888/search \
-    --retrieval-top-k 3 --num-examples 1000 --no-think
-
-# NQ-Tables
-python run_bench.py \
-    --task nq_tables --model Qwen/Qwen3.5-4B-Instruct \
-    --local-api --local-api-url http://localhost:30888/search \
-    --retrieval-top-k 3 --num-examples 1000 --no-think
-
-# MMSearch (multimodal: query image sent alongside text)
-python run_bench.py \
-    --task mmsearch --model Qwen/Qwen3.5-4B-Instruct \
-    --local-api --local-api-url http://localhost:30888/search \
-    --retrieval-top-k 3 --num-examples 300 --no-think
-
-# EVQA (multimodal: landmark photo + question text)
-python run_bench.py \
-    --task encyclopedic_vqa --model Qwen/Qwen3.5-4B-Instruct \
-    --local-api --local-api-url http://localhost:30888/search \
-    --evqa-dataset-filter landmarks --evqa-question-type-filter automatic \
-    --retrieval-top-k 3 --num-examples 749 --no-think
-
-# LiveVQA — see "LiveVQA Separate Pipeline" section below
-```
-
-### PixelRAG (fine-tuned) — Screenshot → Screenshot with LoRA embedding
-
-Same commands as PixelRAG (base), but the search API must be started with the fine-tuned index:
-
-```bash
-# Start search API with fine-tuned index
-pixelrag-serve \
-    --index-dir pixelrag-data/search_index_lora_vit_ckpt200_v2 \
-    --tiles-dir /path/to/wikipedia_tiles \
-    --articles-json pixelrag-data/articles.json \
-    --model Qwen/Qwen3-VL-Embedding-2B \
-    --peft-adapter /path/to/lora_checkpoint_200 \
-    --device cuda --port 30888
-```
-
-Then run the same `--local-api` commands above.
-
-### Grading
-
-```bash
-cd ~/pixelrag/eval
-
-# Grade with GPT-4.1 judge (Wikipedia QA tasks)
-python grade.py simpleqa eval_output/simpleqa_*.jsonl
-python grade.py encyclopedic_vqa eval_output/encyclopedic_vqa_*.jsonl
-python grade.py mmsearch eval_output/mmsearch_*.jsonl
-
-# For NQ/NQ-Tables (with LLM judge for paper numbers)
-python grade.py nq eval_output/nq_*.jsonl --llm-judge
-python grade.py nq_tables eval_output/nq_tables_*.jsonl --llm-judge
-
-# For LiveVQA (exact letter match — handled by the LiveVQA pipeline scripts)
-```
-
----
-
-## Table 3: Retrieval–Reader Modality Ablation
-
-**Task**: SimpleQA (1000) + LiveVQA (6632), **Reader**: Qwen3.5-4B, **k=3**,
-**Embedding**: Qwen3-VL-Embedding-2B (base, no LoRA)
-
-| Row | Retrieval | Reader Input | Flags |
-|-----|-----------|-------------|-------|
-| Screenshot → Screenshot | Pixel index | Raw tile images | `--local-api` |
-| Screenshot → OCR text | Pixel index | OCR'd text from tiles | `--local-api --read-as-text-ocr` |
-| Text → Rendered image | Text index | Text chunks rendered as PNG | `--text-api --render-as-image` |
-| Text → Text | Text index | Raw text chunks | `--text-api` |
-| Text → HTML | Text index | Raw HTML from kiwix | `--text-api --html-dom-lookup` |
-
-```bash
-# Screenshot → Screenshot (same as main results PixelRAG base)
-python run_bench.py \
-    --task simpleqa --model Qwen/Qwen3.5-4B-Instruct \
-    --local-api --local-api-url http://localhost:30888/search \
-    --retrieval-top-k 3 --num-examples 1000 --no-think
-
-# Screenshot → OCR text
-python run_bench.py \
-    --task simpleqa --model Qwen/Qwen3.5-4B-Instruct \
-    --local-api --local-api-url http://localhost:30888/search \
-    --read-as-text-ocr --ocr-url http://localhost:8202/v1 \
-    --retrieval-top-k 3 --num-examples 1000 --no-think
-
-# Text → Rendered image
-python run_bench.py \
-    --task simpleqa --model Qwen/Qwen3.5-4B-Instruct \
-    --text-api --text-api-url http://localhost:30889/search \
-    --render-as-image \
-    --retrieval-top-k 3 --num-examples 1000 --no-think
-
-# Text → Text (same as main results Trafilatura)
-python run_bench.py \
-    --task simpleqa --model Qwen/Qwen3.5-4B-Instruct \
-    --text-api --text-api-url http://localhost:30889/search \
-    --retrieval-top-k 3 --num-examples 1000 --no-think
-
-# Text → HTML (DOM lookup)
-python run_bench.py \
-    --task simpleqa --model Qwen/Qwen3.5-4B-Instruct \
-    --text-api --text-api-url http://localhost:30889/search \
-    --html-dom-lookup \
-    --retrieval-top-k 3 --num-examples 1000 --no-think
-```
-
-For LiveVQA, use the separate pipeline (see "LiveVQA Separate Pipeline" section) with the corresponding ablation scripts.
-
----
-
-## Table 4: Embedding Training Recipe Ablation
-
-**Evaluated on mini-datastore** (400 queries, 7426 tiles).
-
-This ablation uses `--prebuilt-tiles-dir` pointing to the pre-built mini-datastore, with different embedding checkpoints. Each row corresponds to a different embedding training recipe:
-
-```bash
-# Base model (no fine-tuning)
-python run_bench.py \
-    --task simpleqa --model Qwen/Qwen3.5-4B-Instruct \
-    --use-tiled-retrieval --use-qwen3vl-embedding \
-    --qwen3vl-model Qwen/Qwen3-VL-Embedding-2B \
-    --embedding-backend hf \
-    --prebuilt-tiles-dir tiles-hard-mini/ \
-    --retrieval-top-k 3 --num-examples 400 --no-think
-
-# With LoRA checkpoint (dynamic hard negatives + ViT unfrozen)
-python run_bench.py \
-    --task simpleqa --model Qwen/Qwen3.5-4B-Instruct \
-    --use-tiled-retrieval --use-qwen3vl-embedding \
-    --qwen3vl-model Qwen/Qwen3-VL-Embedding-2B \
-    --embedding-backend biqwen3 \
-    --peft-adapter /path/to/checkpoint-200 \
-    --prebuilt-tiles-dir tiles-hard-mini/ \
-    --retrieval-top-k 3 --num-examples 400 --no-think
-```
-
-The intermediate checkpoints (in-batch negatives, naive hard negatives, dynamic hard negatives frozen) each have their own PEFT adapter path.
-
----
-
-## Figure 2: Token Efficiency (SimpleQA, k=1,2,3, 4 readers)
-
-**Task**: SimpleQA (1000), **Readers**: Qwen3.5-4B, Qwen3.5-9B, Qwen3.5-27B, Qwen3.6-35B-A3B
-
-For each reader × k × retrieval method, run:
-
-```bash
-# Example: Qwen3.5-4B, k=1, PixelRAG (fine-tuned)
-python run_bench.py \
-    --task simpleqa --model Qwen/Qwen3.5-4B-Instruct \
-    --local-api --local-api-url http://localhost:30888/search \
-    --retrieval-top-k 3 --reader-top-k 1 \
-    --num-examples 1000 --no-think
-
-# Example: Qwen3.5-4B, k=2, PixelRAG (fine-tuned)
-python run_bench.py \
-    --task simpleqa --model Qwen/Qwen3.5-4B-Instruct \
-    --local-api --local-api-url http://localhost:30888/search \
-    --retrieval-top-k 3 --reader-top-k 2 \
-    --num-examples 1000 --no-think
-
-# Example: Qwen3.5-4B, k=3, PixelRAG (fine-tuned)
-python run_bench.py \
-    --task simpleqa --model Qwen/Qwen3.5-4B-Instruct \
-    --local-api --local-api-url http://localhost:30888/search \
-    --retrieval-top-k 3 \
-    --num-examples 1000 --no-think
-```
-
-> **Optimization**: Use `--retrieval-top-k 3 --reader-top-k N` to retrieve once at k=3 and evaluate at k=1,2,3 from the same JSONL (the full retrieved set is stored in `retrieved_images`).
-
-For each reader, change `--model` and start the appropriate vLLM server.
-Repeat for text retrieval (Trafilatura: `--text-api`) and PixelRAG base (base index).
-
-The plot script is at `arxiv/figures/plot_token_efficiency.py`.
-
----
-
-## Figure 3: Agentic Multi-Hop QA (MoNaCo)
-
-**Task**: MoNaCo (1315 questions), **Agent**: GPT-5 ReAct, **k=5 per search**
-
-Uses `eval/run_monaco.py` — a ReAct agent that issues search tool calls.
-
-```bash
-cd ~/pixelrag/eval
-
-# PixelRAG backend
-python run_monaco.py \
-    --reader gpt-5 \
-    --retrieval pixel \
-    --pixel-api http://localhost:30888/search \
-    --default-top-k 5
-
-# Text retrieval backend (Trafilatura)
-python run_monaco.py \
-    --reader gpt-5 \
-    --retrieval text \
-    --text-api http://localhost:30889/search \
-    --default-top-k 5
-
-# Grade (token F1 computed inline; add --judge for LLM judge F1)
-python run_monaco.py \
-    --reader gpt-5 \
-    --retrieval pixel \
-    --judge --judge-model gpt-4.1-2025-04-14
-
-# Or grade existing predictions:
-python grade.py monaco eval_output/monaco/<run_tag>
-```
-
-The dataset (`monaco_version_1_release.jsonl`) should be placed at
-`eval/data/monaco/` or passed via `--data-path`.
-
----
-
-## Figure 4: Image Compression Curve
-
-**Task**: SimpleQA (1000), **Reader**: Qwen3.5-4B (base + SFT), k=1..5, compression c=1×/2×/3×
-
-```bash
-# No compression (c=1×), k=3
-python run_bench.py \
-    --task simpleqa --model Qwen/Qwen3.5-4B-Instruct \
-    --local-api --local-api-url http://localhost:30888/search \
-    --retrieval-top-k 5 --reader-top-k 3 \
-    --num-examples 1000 --no-think
-
-# 2× compression, k=3
-python run_bench.py \
-    --task simpleqa --model Qwen/Qwen3.5-4B-Instruct \
-    --local-api --local-api-url http://localhost:30888/search \
-    --retrieval-top-k 5 --reader-top-k 3 \
-    --pixel-compress-ratio 2.0 \
-    --num-examples 1000 --no-think
-
-# 3× compression, k=3
-python run_bench.py \
-    --task simpleqa --model Qwen/Qwen3.5-4B-Instruct \
-    --local-api --local-api-url http://localhost:30888/search \
-    --retrieval-top-k 5 --reader-top-k 3 \
-    --pixel-compress-ratio 3.0 \
-    --num-examples 1000 --no-think
-```
-
-For the SFT reader, replace `--model` with the SFT checkpoint path and serve it via vLLM.
-
-The plot script is at `arxiv/figures/plot_sft_compression_curve.py`.
-
----
-
-## Table 8: Full Reader-Model Sweep (31 VLMs)
-
-**Task**: SimpleQA (1000), **k=3**, pixel retrieval (base) vs text retrieval (Trafilatura)
-
-For each of the 31 reader models, run two jobs:
-
-```bash
-# Pixel retrieval
-python run_bench.py \
-    --task simpleqa --model <MODEL_NAME> \
-    --local-api --local-api-url http://localhost:30888/search \
-    --retrieval-top-k 3 --num-examples 1000 --no-think
-
-# Text retrieval
-python run_bench.py \
-    --task simpleqa --model <MODEL_NAME> \
-    --text-api --text-api-url http://localhost:30889/search \
-    --retrieval-top-k 3 --num-examples 1000 --no-think
-```
-
-where `<MODEL_NAME>` is one of:
-- `liuhaotian/llava-v1.5-7b`
-- `meta-llama/Llama-3.2-11B-Vision-Instruct` (k=1 for pixel due to architecture limit)
-- `meta-llama/Llama-3.2-90B-Vision-Instruct` (k=1 for pixel)
-- `meta-llama/Llama-4-Scout-17B-16E-Instruct`
-- `meta-llama/Llama-4-Maverick-17B-128E-Instruct`
-- `Qwen/Qwen2-VL-2B-Instruct` through `Qwen/Qwen2-VL-72B-Instruct`
-- `Qwen/Qwen2.5-VL-3B-Instruct` through `Qwen/Qwen2.5-VL-72B-Instruct`
-- `Qwen/Qwen3-VL-2B` through `Qwen/Qwen3-VL-235B-A22B`
-- `Qwen/Qwen3.5-0.8B` through `Qwen/Qwen3.5-35B-A3B`
-- `Qwen/Qwen3.6-27B`, `Qwen/Qwen3.6-35B-A3B`
-
-For reasoning-mode models, omit `--no-think`.
-
-Each model requires its own vLLM instance (or OpenRouter/Commonstack for API models).
-
----
-
-## LiveVQA (Table 1 + Table 3)
-
-LiveVQA uses `eval/run_livevqa.py` — a dedicated script for the news corpus.
-
-**Requires**: News pixel search API (port 30890), news text search API (port 30892),
-LiveVQA v4 JSON dataset, vLLM reader.
-
-```bash
-cd ~/pixelrag/eval
-
-# No retrieval
-python run_livevqa.py --mode naive \
-    --model Qwen/Qwen3.5-4B-Instruct \
-    --output eval_output/livevqa_naive.jsonl
-
-# PixelRAG (screenshot → screenshot)
-python run_livevqa.py --mode pixel \
-    --pixel-api http://localhost:30890/search \
-    --model Qwen/Qwen3.5-4B-Instruct \
-    --output eval_output/livevqa_pixel.jsonl
-
-# Text retrieval (Trafilatura)
-python run_livevqa.py --mode text \
-    --text-api http://localhost:30892/search \
-    --model Qwen/Qwen3.5-4B-Instruct \
-    --output eval_output/livevqa_text.jsonl
-
-# Hybrid (pixel + text)
-python run_livevqa.py --mode hybrid \
-    --pixel-api http://localhost:30890/search \
-    --text-api http://localhost:30892/search \
-    --model Qwen/Qwen3.5-4B-Instruct \
-    --output eval_output/livevqa_hybrid.jsonl
-```
-
-Grading is automatic (5-option multiple choice, exact letter match); accuracy is printed at the end of each run.
-
----
-
-## Known Issues (Blockers for Reproduction)
-
-### ~~0. Missing simpleqa modules~~ (FIXED)
-
-`screenshot.py` and `pixel_query.py` have been copied into `eval/lib/`.
-Selenium import is deferred so it doesn't block `--local-api` users.
-
-### ~~1. `dr_agent` not importable~~ (FIXED)
-
-Dataset loaders extracted into `eval/lib/benchmarks.py`. The `run_bench.py`
-import now reads from `simpleqa.datasets_loader` instead of `dr_agent`.
-
-### ~~2. Grading script not in this repo~~ (FIXED)
-
-`eval/grade.py` implements GPT-4.1 3-way grading (CORRECT/INCORRECT/NOT_ATTEMPTED) using
-the same prompt template as the paper. No dependency on the old repo's evaluation framework.
-
-For the legacy full evaluation framework (per-example HTML reports, etc.), the original
-is still at `~/pixelrag-src/Vis-RAG/agent/scripts/evaluate.py`.
-
-### 3. Hardcoded paths in retrieval.py
-
-`eval/lib/retrieval.py` lines 84–88 have placeholder paths (`/path/to/project`, `/path/to/data`) for the local kiwix tile store. These are only used by `LocalWikiTiledScreenshotRetriever` (ground-truth screenshot mode), not by the production `--local-api` mode.
-
-### ~~4. LiveVQA uses separate pipeline~~ (FIXED)
-
-`eval/run_livevqa.py` handles all LiveVQA modes (naive, pixel, text, hybrid).
-
-### ~~5. MoNaCo runs from old repo~~ (FIXED)
-
-`eval/run_monaco.py` implements the full ReAct agent loop with pixel/text retrieval backends.
-
-### 6. mwparserfromhell text index
-
-The paper's second text baseline uses the mwparserfromhell parser. The text index must be built separately with this parser, because the parser choice is fixed at index build time, not at query time. The build pipeline for this variant still needs to be documented.
-
-### 7. News corpus indexes
-
-LiveVQA requires separate tile and text indexes built over the news corpus (BBC/AP/CNN). These indexes are on a different machine/path and need their own `pixelrag-serve` instances.
-
----
-
-## Grading Protocol Summary
-
-| Benchmark | Metric | Grader |
-|-----------|--------|--------|
-| SimpleQA | CORRECT/INCORRECT/NOT_ATTEMPTED → accuracy | GPT-4.1 (temp=0, seed=42) |
-| NQ | Same 3-way judge | GPT-4.1 (temp=0, seed=42) |
-| NQ-Tables | Same 3-way judge (up to 10 gold aliases joined with OR) | GPT-4.1 |
-| MMSearch | Same 3-way judge | GPT-4.1 |
-| EVQA | Same 3-way judge (reference_list → "Any of: ref1 \| ref2") | GPT-4.1 |
-| LiveVQA | 5-option multiple-choice exact letter match | No LLM |
-| MoNaCo | Token-level F1 (primary), LLM judge F1 (secondary) | GPT-4.1 |
-
----
-
-## Quick Smoke Test (Verify Pipeline Works)
-
-Run a single example end-to-end before committing to full runs:
-
-```bash
-# 1. Verify search API is responding
-curl -s http://localhost:30888/status | python -m json.tool
-
-# 2. Run 5 examples, no retrieval
-python run_bench.py \
-    --task simpleqa --model Qwen/Qwen3.5-4B-Instruct \
-    --num-examples 5 --no-think --force
-
-# 3. Run 5 examples, pixel retrieval
-python run_bench.py \
-    --task simpleqa --model Qwen/Qwen3.5-4B-Instruct \
-    --local-api --local-api-url http://localhost:30888/search \
-    --retrieval-top-k 3 --num-examples 5 --no-think --force
-
-# 4. Grade
-cd ~/pixelrag-src/Vis-RAG/agent
-python scripts/evaluate.py simpleqa ~/pixelrag/eval/eval_output/<output>.jsonl
-```
-
----
-
-## Output File Convention
-
-All outputs go to `eval_output/` with auto-generated filenames:
-
-```
-eval_output/{task}_{mode}_{model_safe}_{n}.jsonl
-```
-
-Examples:
-- `eval_output/simpleqa_naive_qwen_qwen3.5_4b_instruct_1000.jsonl`
-- `eval_output/simpleqa_local_api_qwen_qwen3.5_4b_instruct_1000.jsonl`
-- `eval_output/nq_text_api_qwen_qwen3.5_4b_instruct_1000.jsonl`
-
-Grading results are saved alongside as `*_eval_results.json`.
diff --git a/docs/superpowers/plans/2026-05-11-pixelrag-restructure.md b/docs/superpowers/plans/2026-05-11-pixelrag-restructure.md
deleted file mode 100644
index e9d9766..0000000
--- a/docs/superpowers/plans/2026-05-11-pixelrag-restructure.md
+++ /dev/null
@@ -1,1315 +0,0 @@
-# PixelRAG 5-Package Restructure Implementation Plan
-
-> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
-
-**Goal:** Restructure ~/pixelrag/ from the current messy first-pass merge into 5 clean packages: ingest, embed, index, serve, train.
-
-**Architecture:** Five independent uv workspace packages. ingest renders documents to tiles, embed provides orchestrator-free chunk/embed/build tools, index orchestrates full pipelines, serve provides the search API, train handles model fine-tuning. Source repos at ~/pixelrag-src/ are read-only.
-
-**Tech Stack:** Python 3.12+, uv workspaces, FastAPI, FAISS, torch, Playwright, Chromium CDP
-
----
-
-## File Structure
-
-```
-~/pixelrag/
-├── pyproject.toml                          # workspace root
-├── uv.lock
-├── LICENSE
-├── README.md
-├── .gitignore
-├── packages/
-│   ├── ingest/
-│   │   ├── pyproject.toml
-│   │   └── src/pixelrag_render/
-│   │       ├── __init__.py
-│   │       ├── render.py                   # Public API dispatch
-│   │       ├── backends/
-│   │       │   ├── __init__.py
-│   │       │   ├── cdp.py                  # Lean CDP capture (default)
-│   │       │   ├── playwright.py           # Full Playwright (compat)
-│   │       │   └── pdf.py                  # PDF rendering
-│   │       └── bench/
-│   │           ├── benchmark.py
-│   │           ├── benchmark_optimizations.py
-│   │           ├── benchmark_fullpage.py
-│   │           └── benchmark_longtail_matrix.py
-│   │
-│   ├── embed/
-│   │   ├── pyproject.toml
-│   │   └── src/pixelrag_embed/
-│   │       ├── __init__.py
-│   │       ├── chunk.py                    # Tile → 1024px strips
-│   │       ├── embed.py                    # Images → vectors
-│   │       └── index.py                    # Vectors → FAISS
-│   │
-│   ├── index/
-│   │   ├── pyproject.toml
-│   │   └── src/pixelrag_index/
-│   │       ├── __init__.py
-│   │       ├── config.py                   # pixelrag.yaml parser
-│   │       ├── pipelines.py                # End-to-end orchestration
-│   │       ├── distributed.py              # S3ShardCoordinator (optional)
-│   │       ├── monitor.py                  # Progress dashboard
-│   │       └── sources/
-│   │           ├── __init__.py
-│   │           ├── base.py                 # Source ABC
-│   │           ├── kiwix.py                # Wikipedia ZIM
-│   │           ├── web.py                  # URLs + download (generalized news)
-│   │           ├── pdf.py                  # PDF directory
-│   │           └── local.py                # Auto-detect mixed files
-│   │
-│   ├── serve/
-│   │   ├── pyproject.toml
-│   │   └── src/pixelrag_serve/
-│   │       ├── __init__.py
-│   │       └── api.py                      # Unified search API
-│   │
-│   └── train/
-│       ├── pyproject.toml
-│       └── src/pixelrag_train/
-│           ├── __init__.py
-│           ├── models/
-│           │   ├── __init__.py
-│           │   └── biqwen3.py
-│           ├── contrastive.py
-│           └── mine.py
-│
-└── eval/
-    ├── run_naive_simpleqa.py
-    └── simpleqa/
-```
-
----
-
-### Task 1: Clean workspace and create scaffold
-
-**Files:**
-- Modify: `~/pixelrag/pyproject.toml`
-- Create: all package directories and `__init__.py` files
-
-- [ ] **Step 1: Remove old packages directory**
-
-```bash
-cd ~/pixelrag
-rm -rf packages/
-```
-
-- [ ] **Step 2: Create new package skeleton**
-
-```bash
-cd ~/pixelrag
-
-# ingest
-mkdir -p packages/render/src/pixelrag_render/backends
-mkdir -p packages/render/src/pixelrag_render/bench
-
-# embed
-mkdir -p packages/embed/src/pixelrag_embed
-
-# index
-mkdir -p packages/index/src/pixelrag_index/sources
-
-# serve
-mkdir -p packages/serve/src/pixelrag_serve
-
-# train
-mkdir -p packages/train/src/pixelrag_train/models
-
-# __init__.py for all packages
-for pkg in \
-    packages/render/src/pixelrag_render \
-    packages/render/src/pixelrag_render/backends \
-    packages/embed/src/pixelrag_embed \
-    packages/index/src/pixelrag_index \
-    packages/index/src/pixelrag_index/sources \
-    packages/serve/src/pixelrag_serve \
-    packages/train/src/pixelrag_train \
-    packages/train/src/pixelrag_train/models; do
-    touch "$pkg/__init__.py"
-done
-```
-
-- [ ] **Step 3: Update workspace root pyproject.toml**
-
-Write `~/pixelrag/pyproject.toml`:
-```toml
-[project]
-name = "pixelrag"
-version = "0.1.0"
-description = "Visual Retrieval-Augmented Generation — render, embed, index, search, train"
-requires-python = ">=3.12"
-
-[tool.uv.workspace]
-members = ["packages/*"]
-
-[tool.uv]
-override-dependencies = ["nvidia-cudnn-cu12==9.20.0.48"]
-environments = ["sys_platform == 'linux'"]
-
-[[tool.uv.index]]
-name = "pytorch-cu129"
-url = "https://download.pytorch.org/whl/cu129"
-explicit = true
-```
-
-- [ ] **Step 4: Commit scaffold**
-
-```bash
-cd ~/pixelrag
-git add -A
-git commit -m "scaffold: 5-package workspace (ingest, embed, index, serve, train)"
-```
-
----
-
-### Task 2: pixelrag-render package
-
-**Files:**
-- Create: `packages/render/pyproject.toml`
-- Create: `packages/render/src/pixelrag_render/render.py`
-- Create: `packages/render/src/pixelrag_render/backends/cdp.py`
-- Copy+strip: `packages/render/src/pixelrag_render/backends/playwright.py`
-- Create: `packages/render/src/pixelrag_render/backends/pdf.py`
-- Copy: `packages/render/src/pixelrag_render/bench/*.py`
-
-- [ ] **Step 1: Create pyproject.toml**
-
-Write `~/pixelrag/packages/render/pyproject.toml`:
-```toml
-[project]
-name = "pixelrag-render"
-version = "0.1.0"
-description = "Document → image tiles. Renders web pages, PDFs, and local files as tiled screenshots."
-requires-python = ">=3.12"
-dependencies = [
-    "playwright>=1.40.0",
-    "pillow>=10.0.0",
-    "aiohttp>=3.9.0",
-]
-
-[project.optional-dependencies]
-pdf = ["pdf2image>=1.16.0"]
-dev = ["pytest>=7.0.0", "pytest-asyncio>=0.21.0"]
-
-[project.scripts]
-pixelrag-render = "pixelrag_render.render:main"
-
-[build-system]
-requires = ["hatchling"]
-build-backend = "hatchling.build"
-
-[tool.hatch.build.targets.wheel]
-packages = ["src/pixelrag_render"]
-```
-
-- [ ] **Step 2: Create render.py — public API**
-
-Write `~/pixelrag/packages/render/src/pixelrag_render/render.py`:
-```python
-"""Public API for rendering documents to image tiles.
-
-Usage:
-    from pixelrag_render import render_url, render_pdf, render_file
-
-    tiles = render_url("https://en.wikipedia.org/wiki/Python", "./output")
-    tiles = render_pdf("paper.pdf", "./output")
-    tiles = render_file("doc.html", "./output")  # auto-detect type
-"""
-
-import argparse
-import os
-from pathlib import Path
-
-
-def render_url(
-    url: str,
-    output_dir: str,
-    backend: str = "cdp",
-    *,
-    tile_height: int = 8192,
-    quality: int = 85,
-    viewport_width: int = 875,
-    workers: int = 1,
-    **backend_kwargs,
-) -> list[Path]:
-    """Render a URL to tiled screenshots.
-
-    Args:
-        url: Web page URL to render.
-        output_dir: Directory for output tiles.
-        backend: "cdp" (default, fastest) or "playwright" (more options).
-        tile_height: Height of each tile in pixels.
-        quality: JPEG quality (1-100).
-        viewport_width: Browser viewport width.
-        workers: Number of browser workers for batch rendering.
-
-    Returns:
-        List of tile file paths.
-    """
-    if backend == "cdp":
-        from .backends.cdp import render_urls
-        return render_urls(
-            [url], output_dir,
-            tile_height=tile_height, quality=quality,
-            viewport_width=viewport_width, workers=workers,
-            **backend_kwargs,
-        )
-    elif backend == "playwright":
-        from .backends.playwright import render_urls
-        return render_urls(
-            [url], output_dir,
-            tile_height=tile_height, quality=quality,
-            viewport_width=viewport_width,
-            **backend_kwargs,
-        )
-    else:
-        raise ValueError(f"Unknown backend: {backend!r}. Use 'cdp' or 'playwright'.")
-
-
-def render_pdf(
-    path: str,
-    output_dir: str,
-    *,
-    dpi: int = 200,
-    pages: str | None = None,
-) -> list[Path]:
-    """Render a PDF to tiled page images.
-
-    Args:
-        path: Path to PDF file.
-        output_dir: Directory for output tiles.
-        dpi: Rendering resolution.
-        pages: Page range (e.g. "1-10"). None = all pages.
-
-    Returns:
-        List of tile file paths.
-    """
-    from .backends.pdf import render_pdf as _render_pdf
-    return _render_pdf(path, output_dir, dpi=dpi, pages=pages)
-
-
-def render_file(
-    path: str,
-    output_dir: str,
-    backend: str = "cdp",
-    **kwargs,
-) -> list[Path]:
-    """Auto-detect file type and render to tiles.
-
-    Supports: .pdf, .html, .png/.jpg (direct copy), URLs (if starts with http).
-    """
-    p = str(path)
-    if p.startswith("http://") or p.startswith("https://"):
-        return render_url(p, output_dir, backend=backend, **kwargs)
-    ext = os.path.splitext(p)[1].lower()
-    if ext == ".pdf":
-        return render_pdf(p, output_dir, **{k: kwargs[k] for k in ("dpi", "pages") if k in kwargs})
-    elif ext in (".html", ".htm"):
-        file_url = f"file://{os.path.abspath(p)}"
-        return render_url(file_url, output_dir, backend=backend, **kwargs)
-    elif ext in (".png", ".jpg", ".jpeg", ".webp"):
-        # Image files: copy directly as a single tile
-        os.makedirs(output_dir, exist_ok=True)
-        import shutil
-        dest = Path(output_dir) / Path(p).name
-        shutil.copy2(p, dest)
-        return [dest]
-    else:
-        raise ValueError(f"Unsupported file type: {ext}")
-
-
-def main():
-    parser = argparse.ArgumentParser(description="Render documents to image tiles")
-    parser.add_argument("inputs", nargs="+", help="URLs, file paths, or directories")
-    parser.add_argument("--output", "-o", default="./tiles", help="Output directory")
-    parser.add_argument("--backend", default="cdp", choices=["cdp", "playwright"])
-    parser.add_argument("--tile-height", type=int, default=8192)
-    parser.add_argument("--quality", type=int, default=85)
-    parser.add_argument("--viewport-width", type=int, default=875)
-    parser.add_argument("--workers", type=int, default=4)
-    parser.add_argument("--dpi", type=int, default=200, help="PDF rendering DPI")
-    args = parser.parse_args()
-
-    all_tiles = []
-    for inp in args.inputs:
-        tiles = render_file(
-            inp, args.output,
-            backend=args.backend,
-            tile_height=args.tile_height,
-            quality=args.quality,
-            viewport_width=args.viewport_width,
-            workers=args.workers,
-            dpi=args.dpi,
-        )
-        all_tiles.extend(tiles)
-        print(f"{inp} → {len(tiles)} tiles")
-
-    print(f"\nTotal: {len(all_tiles)} tiles in {args.output}")
-
-
-if __name__ == "__main__":
-    main()
-```
-
-- [ ] **Step 3: Create cdp.py — lean CDP backend**
-
-Copy from source and generalize:
-```bash
-cp ~/pixelrag-src/wiki-screenshot/scripts/render_news_pages.py \
-   ~/pixelrag/packages/render/src/pixelrag_render/backends/cdp.py
-```
-
-Then apply these transformations to `cdp.py`:
-1. Remove news-specific imports (`from wiki_screenshot.news.db import NewsDB`, `from wiki_screenshot.news.metrics import start_metrics_server`)
-2. Remove `NewsDB` usage in `worker()` and `run_batch()` — replace with simple success/fail counters
-3. Remove `check_nginx()` preflight — not general
-4. Remove `main()` function (the CLI with `--db-path`, `--pages-dir` etc.) — replaced by `render.py`
-5. Rename `capture_article()` to a general name
-6. Export a `render_urls(urls, output_dir, ...)` function that `render.py` calls
-7. Remove hardcoded paths (`/opt/dlami/nvme/`)
-8. Keep: `_launch_browser()`, `BROWSER_ARGS`, the CDP capture logic, multi-browser worker architecture, JPEG tile output
-
-- [ ] **Step 4: Copy and strip playwright.py**
-
-```bash
-cp ~/pixelrag-src/wiki-screenshot/src/wiki_screenshot/tools/playwright_tool.py \
-   ~/pixelrag/packages/render/src/pixelrag_render/backends/playwright.py
-```
-
-Transformations to `playwright.py`:
-1. Remove imports of `streaming_capture`, `raw_pixels`, `temp_dirs`
-2. Remove the `use_streaming` code path and all streaming-related params
-3. Remove unused experimental options (keep only: `use_cdp_screenshot`, `cdp_optimize_for_speed`, `segmented_save_tiles`, `segment_height`, `enable_gpu`, `device_scale_factor`, `image_format`, `quality`, `width`, `max_height`)
-4. Remove `_cdp_sessions` management if not needed for the retained CDP path
-5. Export a `render_urls(urls, output_dir, ...)` function matching cdp.py's interface
-6. Keep: CDP screenshot mode, segmented tile capture, GPU rasterization flags, core `_capture_page()` logic
-7. Target: strip from ~2388 lines to ~500-800 lines of production-relevant code
-
-- [ ] **Step 5: Create pdf.py — PDF backend**
-
-Write `~/pixelrag/packages/render/src/pixelrag_render/backends/pdf.py`:
-```python
-"""PDF rendering backend: PDF pages → tile images."""
-
-import json
-import os
-from pathlib import Path
-
-
-def render_pdf(
-    path: str,
-    output_dir: str,
-    *,
-    dpi: int = 200,
-    pages: str | None = None,
-) -> list[Path]:
-    """Render PDF pages as tile images.
-
-    Args:
-        path: Path to PDF file.
-        output_dir: Output directory for tiles.
-        dpi: Rendering resolution.
-        pages: Page range string, e.g. "1-10". A comma list such as "3,5,7" renders the whole span from its lowest to highest page (3-7). None = all.
-
-    Returns:
-        List of tile image paths.
-    """
-    try:
-        from pdf2image import convert_from_path
-    except ImportError as exc:
-        raise ImportError(
-            "pdf2image is required for PDF rendering. "
-            "Install with: pip install pixelrag-render[pdf]"
-        ) from exc
-
-    pdf_path = Path(path)
-    doc_id = pdf_path.stem
-    tile_dir = Path(output_dir) / f"{doc_id}.tiles"
-    tile_dir.mkdir(parents=True, exist_ok=True)
-
-    # Parse page range
-    kwargs = {"dpi": dpi}
-    if pages:
-        if "-" in pages:
-            first, last = pages.split("-", 1)
-            kwargs["first_page"] = int(first)
-            kwargs["last_page"] = int(last)
-        else:
-            page_nums = [int(p.strip()) for p in pages.split(",")]
-            kwargs["first_page"] = min(page_nums)
-            kwargs["last_page"] = max(page_nums)
-
-    images = convert_from_path(str(pdf_path), **kwargs)
-
-    tile_paths = []
-    for i, img in enumerate(images):
-        tile_path = tile_dir / f"tile_{i:04d}.jpg"
-        img.save(tile_path, "JPEG", quality=85)
-        tile_paths.append(tile_path)
-
-    # Write manifest
-    manifest = {
-        "source": str(pdf_path),
-        "dpi": dpi,
-        "pages": len(images),
-        "tiles": [p.name for p in tile_paths],
-        "complete": True,
-    }
-    with open(tile_dir / "tiles.json", "w") as f:
-        json.dump(manifest, f, indent=2)
-
-    return tile_paths
-```
-
-- [ ] **Step 6: Copy bench/**
-
-```bash
-cp ~/pixelrag-src/wiki-screenshot/bench/benchmark.py \
-   ~/pixelrag/packages/render/src/pixelrag_render/bench/
-cp ~/pixelrag-src/wiki-screenshot/bench/benchmark_optimizations.py \
-   ~/pixelrag/packages/render/src/pixelrag_render/bench/
-cp ~/pixelrag-src/wiki-screenshot/bench/benchmark_fullpage.py \
-   ~/pixelrag/packages/render/src/pixelrag_render/bench/
-cp ~/pixelrag-src/wiki-screenshot/bench/benchmark_longtail_matrix.py \
-   ~/pixelrag/packages/render/src/pixelrag_render/bench/
-```
-
-Fix imports in bench files: replace `wiki_screenshot` → `pixelrag_render`:
-```bash
-find ~/pixelrag/packages/render/src/pixelrag_render/bench -name '*.py' -exec sed -i \
-    -e 's/from wiki_screenshot/from pixelrag_render/g' \
-    -e 's/import wiki_screenshot/import pixelrag_render/g' \
-    {} +
-```
-
-- [ ] **Step 7: Verify render package imports**
-
-```bash
-cd ~/pixelrag
-uv sync --package pixelrag-render 2>&1 | tail -3
-uv run --package pixelrag-render python -c "from pixelrag_render.render import render_url, render_pdf, render_file; print('OK')"
-```
-
-- [ ] **Step 8: Commit**
-
-```bash
-cd ~/pixelrag
-git add packages/render/
-git commit -m "feat: add pixelrag-render package (CDP/Playwright/PDF backends)"
-```
-
----
-
-### Task 3: pixelrag-embed package
-
-**Files:**
-- Create: `packages/embed/pyproject.toml`
-- Copy: `packages/embed/src/pixelrag_embed/chunk.py`
-- Copy: `packages/embed/src/pixelrag_embed/embed.py`
-- Copy: `packages/embed/src/pixelrag_embed/index.py`
-
-- [ ] **Step 1: Create pyproject.toml**
-
-Write `~/pixelrag/packages/embed/pyproject.toml`:
-```toml
-[project]
-name = "pixelrag-embed"
-version = "0.1.0"
-description = "Image tiles → vectors → FAISS index. Three independent CLI tools."
-requires-python = ">=3.12"
-dependencies = [
-    "torch>=2.9.0",
-    "transformers>=4.57.0",
-    "faiss-cpu>=1.9.0",
-    "pillow>=10.0.0",
-    "numpy>=1.26.0",
-    "tqdm>=4.60.0",
-]
-
-[project.optional-dependencies]
-gpu = ["faiss-gpu-cu12>=1.13.2"]
-
-[project.scripts]
-pixelrag-chunk = "pixelrag_embed.chunk:main"
-pixelrag-embed = "pixelrag_embed.embed:main"
-pixelrag-build-index = "pixelrag_embed.index:main"
-
-[build-system]
-requires = ["hatchling"]
-build-backend = "hatchling.build"
-
-[tool.hatch.build.targets.wheel]
-packages = ["src/pixelrag_embed"]
-
-[tool.uv]
-override-dependencies = ["nvidia-cudnn-cu12==9.20.0.48"]
-
-[tool.uv.sources]
-torch = [{ index = "pytorch-cu129" }]
-```
-
-- [ ] **Step 2: Copy chunk.py**
-
-```bash
-cp ~/pixelrag-src/wiki-screenshot/embedding/chunk_tiles.py \
-   ~/pixelrag/packages/embed/src/pixelrag_embed/chunk.py
-```
-
-No import renames needed — `chunk_tiles.py` only uses stdlib + PIL.
-
-- [ ] **Step 3: Copy embed.py**
-
-```bash
-cp ~/pixelrag-src/wiki-screenshot/embedding/embed_tiles.py \
-   ~/pixelrag/packages/embed/src/pixelrag_embed/embed.py
-```
-
-No import renames needed — `embed_tiles.py` only uses stdlib + numpy/PIL/tqdm + subprocess for vLLM/sglang.
-
-- [ ] **Step 4: Copy index.py**
-
-```bash
-cp ~/pixelrag-src/wiki-screenshot/indexing/build_index.py \
-   ~/pixelrag/packages/embed/src/pixelrag_embed/index.py
-```
-
-No import renames needed — `build_index.py` only uses stdlib + numpy + faiss.
-
-- [ ] **Step 5: Clean hardcoded paths in all three files**
-
-```bash
-find ~/pixelrag/packages/embed/src -name '*.py' -exec sed -i \
-    -e 's|/opt/dlami/nvme/[^ "'"'"'\\)]*|./data|g' \
-    -e 's|/home/user/[^ "'"'"'\\)]*|./|g' \
-    -e 's|/home/ubuntu/[^ "'"'"'\\)]*|./|g' \
-    -e 's|/home/andy/[^ "'"'"'\\)]*|./|g' \
-    {} +
-```
-
-- [ ] **Step 6: Verify embed package imports**
-
-```bash
-cd ~/pixelrag
-uv sync --package pixelrag-embed 2>&1 | tail -3
-uv run --package pixelrag-embed python -c "from pixelrag_embed import chunk, embed, index; print('OK')"
-```
-
-- [ ] **Step 7: Commit**
-
-```bash
-cd ~/pixelrag
-git add packages/embed/
-git commit -m "feat: add pixelrag-embed package (chunk, embed, build-index)"
-```
-
----
-
-### Task 4: pixelrag-serve package
-
-**Files:**
-- Create: `packages/serve/pyproject.toml`
-- Create: `packages/serve/src/pixelrag_serve/api.py` (merged from 3 APIs)
-
-- [ ] **Step 1: Create pyproject.toml**
-
-Write `~/pixelrag/packages/serve/pyproject.toml`:
-```toml
-[project]
-name = "pixelrag-serve"
-version = "0.1.0"
-description = "FAISS-based visual search API. Serves any pre-built index."
-requires-python = ">=3.12"
-dependencies = [
-    "fastapi>=0.115.0",
-    "uvicorn>=0.30.0",
-    "numpy>=1.26.0",
-    "faiss-cpu>=1.9.0",
-    "transformers>=4.57.0",
-    "torch>=2.9.0",
-    "qwen-vl-utils",
-    "pillow>=10.0.0",
-    "pydantic>=2.0.0",
-]
-
-[project.optional-dependencies]
-gpu = ["faiss-gpu-cu12>=1.13.2"]
-
-[project.scripts]
-pixelrag-serve = "pixelrag_serve.api:main"
-
-[build-system]
-requires = ["hatchling"]
-build-backend = "hatchling.build"
-
-[tool.hatch.build.targets.wheel]
-packages = ["src/pixelrag_serve"]
-
-[tool.uv.sources]
-torch = [{ index = "pytorch-cu129" }]
-```
-
-- [ ] **Step 2: Create unified api.py**
-
-Start from the existing adapted search_api.py (which already has CPU support):
-```bash
-cp ~/pixelrag-src/wiki-screenshot/serving/search_api.py \
-   ~/pixelrag/packages/serve/src/pixelrag_serve/api.py
-```
-
-Apply transformations to `api.py`:
-1. Remove vllm backend (keep only direct transformers inference)
-2. Remove `torch.compile()` call
-3. Add `--device cpu|cuda` arg (default cpu), use `torch.float32` on CPU
-4. Replace hardcoded `/opt/dlami/nvme/` paths with env var defaults (`PIXELRAG_INDEX_DIR`, `PIXELRAG_ARTICLES_JSON`)
-5. Replace `torch_dtype` with `dtype` in `from_pretrained()` to fix deprecation warning
-6. This is the unified API — it serves any FAISS index (wiki, news, text, any). No separate news_search_api or text_search_api needed if the index format is consistent.
-
-The existing api.py from the first-pass merge (at `~/pixelrag/packages/serving/src/pixelrag_serving/search_api.py`) already has most of these changes. Use that as the starting point instead:
-```bash
-# Actually use the already-adapted version
-cp ~/pixelrag/packages/serving/src/pixelrag_serving/search_api.py \
-   ~/pixelrag/packages/serve/src/pixelrag_serve/api.py 2>/dev/null || \
-cp ~/pixelrag-src/wiki-screenshot/serving/search_api.py \
-   ~/pixelrag/packages/serve/src/pixelrag_serve/api.py
-```
-
-If using the source version, apply the CPU adaptations from the spec (remove vllm, add --device, fix torch_dtype, replace paths).
-
-- [ ] **Step 3: Verify serve package**
-
-```bash
-cd ~/pixelrag
-uv sync --package pixelrag-serve 2>&1 | tail -3
-uv run --package pixelrag-serve python -c "from pixelrag_serve import api; print('OK')"
-```
-
-- [ ] **Step 4: Smoke test with existing downloaded index**
-
-```bash
-PIXELRAG_INDEX_DIR=/home/yichuan/pixelrag-data/text_search_index_1024 \
-PIXELRAG_ARTICLES_JSON=/home/yichuan/pixelrag-data/articles.json \
-uv run --package pixelrag-serve python -m pixelrag_serve.api --port 31001 &
-sleep 120  # wait for index + model loading
-curl -s http://localhost:31001/health
-curl -s -X POST http://localhost:31001/search \
-    -H "Content-Type: application/json" \
-    -d '{"queries": [{"text": "capital of France"}], "n_docs": 3}' | python3 -m json.tool | head -20
-kill %1
-```
-
-- [ ] **Step 5: Commit**
-
-```bash
-cd ~/pixelrag
-git add packages/serve/
-git commit -m "feat: add pixelrag-serve package (unified FAISS search API)"
-```
-
----
-
-### Task 5: pixelrag-train package
-
-**Files:**
-- Create: `packages/train/pyproject.toml`
-- Copy: `packages/train/src/pixelrag_train/models/biqwen3.py`
-- Copy+rename: `packages/train/src/pixelrag_train/contrastive.py`
-- Create: `packages/train/src/pixelrag_train/mine.py` (merged from 2 scripts)
-- Copy: tests
-
-- [ ] **Step 1: Create pyproject.toml**
-
-Write `~/pixelrag/packages/train/pyproject.toml`:
-```toml
-[project]
-name = "pixelrag-train"
-version = "0.1.0"
-description = "LoRA/DoRA contrastive fine-tuning for visual document retrieval embeddings"
-requires-python = ">=3.12"
-dependencies = [
-    "torch==2.9.1",
-    "torchvision",
-    "transformers==4.57.1",
-    "nvidia-cudnn-cu12==9.20.0.48",
-    "peft>=0.15.0",
-    "accelerate>=1.0.0",
-    "Pillow",
-    "tqdm",
-    "numpy",
-    "faiss-cpu",
-    "wandb",
-    "safetensors",
-    "huggingface-hub",
-    "qwen-vl-utils",
-    "datasets",
-]
-
-[project.scripts]
-pixelrag-train = "pixelrag_train.contrastive:main"
-pixelrag-mine = "pixelrag_train.mine:main"
-
-[build-system]
-requires = ["hatchling"]
-build-backend = "hatchling.build"
-
-[tool.hatch.build.targets.wheel]
-packages = ["src/pixelrag_train"]
-
-[tool.uv]
-override-dependencies = ["nvidia-cudnn-cu12==9.20.0.48"]
-
-[tool.uv.sources]
-torch = [{ index = "pytorch-cu129" }]
-torchvision = [{ index = "pytorch-cu129" }]
-```
-
-- [ ] **Step 2: Copy model and training code**
-
-```bash
-SRC=~/pixelrag-src/wiki-screenshot-training
-DST=~/pixelrag/packages/train
-
-# Model
-cp $SRC/models/__init__.py $DST/src/pixelrag_train/models/
-cp $SRC/models/biqwen3.py $DST/src/pixelrag_train/models/
-
-# Training script
-cp $SRC/train_contrastors.py $DST/src/pixelrag_train/contrastive.py
-```
-
-- [ ] **Step 3: Create merged mine.py**
-
-Copy the image mining script as base, then merge text mining functionality:
-```bash
-cp ~/pixelrag-src/wiki-screenshot-training/mine_hard_negatives.py \
-   ~/pixelrag/packages/train/src/pixelrag_train/mine.py
-```
-
-Add to the `mine.py` argparse a `--mode image|text` flag. The image mode calls the image search API (original `mine_hard_negatives.py` behavior). The text mode calls the text search API (original `mine_text_hard_negatives.py` behavior). Read both source files to understand the differences and merge them.
-
-Key differences between the two scripts:
-- `mine_hard_negatives.py` queries `:30888/search` with image results (returns `chunk_path`)
-- `mine_text_hard_negatives.py` queries `:30889/search` with text results (returns `article_id`, `chunk_index`, `text`)
-- Both share: query loading, JSONL I/O, concurrent API calls, dedup logic
-
-- [ ] **Step 4: Copy tests**
-
-```bash
-mkdir -p ~/pixelrag/packages/train/tests
-cp ~/pixelrag-src/wiki-screenshot-training/tests/test_grad_equivalence.py \
-   ~/pixelrag/packages/train/tests/
-cp ~/pixelrag-src/wiki-screenshot-training/tests/test_grad_multi_gpu.py \
-   ~/pixelrag/packages/train/tests/
-```
-
-- [ ] **Step 5: Clean hardcoded paths**
-
-```bash
-find ~/pixelrag/packages/train -name '*.py' -exec sed -i \
-    -e 's|/opt/dlami/nvme/[^ "'"'"'\\)]*|./data|g' \
-    -e 's|/home/user/[^ "'"'"'\\)]*|./|g' \
-    -e 's|/home/ubuntu/[^ "'"'"'\\)]*|./|g' \
-    {} +
-```
-
-- [ ] **Step 6: Commit**
-
-```bash
-cd ~/pixelrag
-git add packages/train/
-git commit -m "feat: add pixelrag-train package (contrastive training + mining)"
-```
-
----
-
-### Task 6: pixelrag-index package
-
-**Files:**
-- Create: `packages/index/pyproject.toml`
-- Create: `packages/index/src/pixelrag_index/config.py`
-- Create: `packages/index/src/pixelrag_index/pipelines.py`
-- Copy+refactor: `packages/index/src/pixelrag_index/distributed.py`
-- Copy+refactor: `packages/index/src/pixelrag_index/monitor.py`
-- Copy+refactor: `packages/index/src/pixelrag_index/sources/kiwix.py`
-- Create: `packages/index/src/pixelrag_index/sources/web.py`
-- Create: `packages/index/src/pixelrag_index/sources/pdf.py`
-- Create: `packages/index/src/pixelrag_index/sources/local.py`
-- Create: `packages/index/src/pixelrag_index/sources/base.py`
-
-- [ ] **Step 1: Create pyproject.toml**
-
-Write `~/pixelrag/packages/index/pyproject.toml`:
-```toml
-[project]
-name = "pixelrag-index"
-version = "0.1.0"
-description = "Build searchable FAISS indexes from any document source"
-requires-python = ">=3.12"
-dependencies = [
-    "pixelrag-render",
-    "pixelrag-embed",
-    "pyyaml>=6.0",
-    "tqdm>=4.60.0",
-]
-
-[project.optional-dependencies]
-distributed = ["boto3>=1.42.0"]
-
-[project.scripts]
-pixelrag-index = "pixelrag_index.pipelines:main"
-pixelrag-monitor = "pixelrag_index.monitor:main"
-
-[build-system]
-requires = ["hatchling"]
-build-backend = "hatchling.build"
-
-[tool.hatch.build.targets.wheel]
-packages = ["src/pixelrag_index"]
-```
-
-- [ ] **Step 2: Create sources/base.py**
-
-Write `~/pixelrag/packages/index/src/pixelrag_index/sources/base.py`:
-```python
-"""Base class for document sources."""
-
-from dataclasses import dataclass
-from typing import Iterator
-
-
-@dataclass
-class Document:
-    """A document to be rendered and indexed."""
-    id: str
-    url: str | None = None
-    path: str | None = None
-    metadata: dict | None = None
-
-
-class Source:
-    """Base class for document sources. Subclasses yield Documents."""
-
-    def __iter__(self) -> Iterator[Document]:
-        raise NotImplementedError
-
-    def __len__(self) -> int:
-        raise NotImplementedError
-```
-
-- [ ] **Step 3: Create config.py**
-
-Write `~/pixelrag/packages/index/src/pixelrag_index/config.py`:
-```python
-"""Parse pixelrag.yaml configuration with parameter forwarding."""
-
-import os
-from pathlib import Path
-
-import yaml
-
-from .sources import SOURCES
-
-
-DEFAULT_CONFIG = {
-    "ingest": {"backend": "cdp", "quality": 85, "tile_height": 8192},
-    "embed": {"model": "Qwen/Qwen3-VL-Embedding-2B", "device": "cuda"},
-    "output": "./index",
-}
-
-
-def load_config(path: str | None = None) -> dict:
-    """Load config from pixelrag.yaml or defaults.
-
-    Looks for pixelrag.yaml in: explicit path > cwd > ~/.config/pixelrag/
-    """
-    if path is None:
-        candidates = [
-            Path("pixelrag.yaml"),
-            Path("pixelrag.yml"),
-            Path.home() / ".config" / "pixelrag" / "pixelrag.yaml",
-        ]
-        for c in candidates:
-            if c.exists():
-                path = str(c)
-                break
-
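-    if path and os.path.exists(path):
-        with open(path) as f:
-            config = yaml.safe_load(f) or {}  # an empty file loads as None
-    else:
-        config = {}
-
-    # Merge with defaults one level deep so partial sections keep default keys
-    sections = {k: {**v, **(config.get(k) or {})} for k, v in DEFAULT_CONFIG.items() if isinstance(v, dict)}
-    return {**DEFAULT_CONFIG, **config, **sections}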
-
-
-def make_source(config: dict):
-    """Create a Source instance from config["source"] with parameter forwarding."""
-    source_config = dict(config.get("source", {}))
-    source_type = source_config.pop("type", "local")
-
-    if source_type not in SOURCES:
-        raise ValueError(
-            f"Unknown source type: {source_type!r}. "
-            f"Available: {', '.join(SOURCES.keys())}"
-        )
-
-    return SOURCES[source_type](**source_config)
-```
-
-- [ ] **Step 4: Create sources/kiwix.py**
-
-Copy from source and refactor to use the new Source interface:
-```bash
-cp ~/pixelrag-src/wiki-screenshot/src/wiki_screenshot/datasources/kiwix.py \
-   ~/pixelrag/packages/index/src/pixelrag_index/sources/kiwix.py
-```
-
-Refactor: make `KiwixSource` extend `Source`, yield `Document` objects instead of `Article` objects. Remove imports of `wiki_screenshot`. Keep the core article iteration logic (fetch from kiwix-serve, cache articles.json).
-
-- [ ] **Step 5: Create sources/web.py**
-
-Copy news-related code and generalize:
-```bash
-# Start from the news datasource as the iteration layer
-cp ~/pixelrag-src/wiki-screenshot/src/wiki_screenshot/datasources/news.py \
-   ~/pixelrag/packages/index/src/pixelrag_index/sources/web.py
-```
-
-Then integrate the download logic from `news/download.py` and `news/db.py` as internal implementation details. Rename news-specific classes and functions to general names. Add a `preset` parameter with a `"news"` preset containing BBC/CNN/AP domain limits and cookie banner CSS.
-
-Key transformations:
-1. `NewsDataSource` → `WebSource(Source)`
-2. Import and embed `NewsDownloader` logic from `news/download.py` (or import it as a submodule)
-3. Import `NewsDB` from `news/db.py` as `WebDB` (SQLite state tracking)
-4. Add `PRESETS` dict with `"news"` key containing domain limits, cookie CSS
-5. Yield `Document` objects instead of `Article`
-
-- [ ] **Step 6: Create sources/pdf.py and sources/local.py**
-
-Write `~/pixelrag/packages/index/src/pixelrag_index/sources/pdf.py`:
-```python
-"""PDF directory source — iterates PDF files for rendering."""
-
-import os
-from pathlib import Path
-from typing import Iterator
-
-from .base import Document, Source
-
-
-class PDFSource(Source):
-    def __init__(self, path: str, **kwargs):
-        self.path = Path(path)
-        self.kwargs = kwargs
-        self._files = sorted(self.path.glob("**/*.pdf"))
-
-    def __iter__(self) -> Iterator[Document]:
-        for pdf in self._files:
-            yield Document(
-                id=pdf.stem,
-                path=str(pdf),
-                metadata={"type": "pdf", **self.kwargs},
-            )
-
-    def __len__(self) -> int:
-        return len(self._files)
-```
-
-Write `~/pixelrag/packages/index/src/pixelrag_index/sources/local.py`:
-```python
-"""Local directory source — auto-detects file types and routes."""
-
-import os
-from pathlib import Path
-from typing import Iterator
-
-from .base import Document, Source
-
-SUPPORTED_EXTENSIONS = {
-    ".pdf": "pdf",
-    ".html": "web",
-    ".htm": "web",
-    ".png": "image",
-    ".jpg": "image",
-    ".jpeg": "image",
-    ".webp": "image",
-}
-
-
-class LocalSource(Source):
-    def __init__(self, path: str, **kwargs):
-        self.path = Path(path)
-        self.kwargs = kwargs
-        self._files = []
-        for f in sorted(self.path.rglob("*")):
-            if f.is_file() and f.suffix.lower() in SUPPORTED_EXTENSIONS:
-                self._files.append(f)
-
-    def __iter__(self) -> Iterator[Document]:
-        for f in self._files:
-            ext = f.suffix.lower()
-            file_type = SUPPORTED_EXTENSIONS.get(ext, "unknown")
-            if file_type == "web":
-                url = f.resolve().as_uri()  # file:// URL with proper percent-encoding
-                yield Document(id=f.stem, url=url, metadata={"type": file_type})
-            else:
-                yield Document(id=f.stem, path=str(f), metadata={"type": file_type})
-
-    def __len__(self) -> int:
-        return len(self._files)
-```
-
-- [ ] **Step 7: Create sources/__init__.py registry**
-
-Write `~/pixelrag/packages/index/src/pixelrag_index/sources/__init__.py`:
-```python
-"""Document source registry."""
-
-from .base import Document, Source
-from .kiwix import KiwixSource
-from .local import LocalSource
-from .pdf import PDFSource
-from .web import WebSource
-
-SOURCES = {
-    "kiwix": KiwixSource,
-    "web": WebSource,
-    "pdf": PDFSource,
-    "local": LocalSource,
-}
-
-__all__ = ["Document", "Source", "SOURCES", "KiwixSource", "WebSource", "PDFSource", "LocalSource"]
-```
-
-- [ ] **Step 8: Copy and refactor distributed.py**
-
-```bash
-cp ~/pixelrag-src/wiki-screenshot/src/wiki_screenshot/coordinator.py \
-   ~/pixelrag/packages/index/src/pixelrag_index/distributed.py
-```
-
-No `wiki_screenshot` import renames are needed: coordinator.py uses only the stdlib and boto3.
-Replace any hardcoded paths.
-
-- [ ] **Step 9: Copy and refactor monitor.py**
-
-```bash
-cp ~/pixelrag-src/wiki-screenshot/scripts/monitor_global.py \
-   ~/pixelrag/packages/index/src/pixelrag_index/monitor.py
-```
-
-Replace `from pixelrag_capture.coordinator import S3ShardCoordinator` → `from .distributed import S3ShardCoordinator`.
-
-- [ ] **Step 10: Create pipelines.py — orchestration**
-
-Write `~/pixelrag/packages/index/src/pixelrag_index/pipelines.py`:
-```python
-"""End-to-end pipeline: source → ingest → chunk → embed → build index."""
-
-import argparse
-import logging
-import os
-import subprocess
-import sys
-from pathlib import Path
-
-from .config import load_config, make_source
-
-logger = logging.getLogger("pixelrag-index")
-
-
-def build(config: dict) -> Path:
-    """Build a searchable index from a document source.
-
-    Chains: source → pixelrag-render (render) → pixelrag-chunk → pixelrag-embed → pixelrag-build-index
-    """
-    source = make_source(config)
-    output_dir = Path(config.get("output", "./index"))
-    tiles_dir = output_dir / "tiles"
-    chunks_dir = output_dir / "chunks"
-    embeddings_dir = output_dir / "embeddings"
-    index_dir = output_dir
-
-    ingest_config = config.get("ingest", {})
-    embed_config = config.get("embed", {})
-
-    os.makedirs(tiles_dir, exist_ok=True)
-    os.makedirs(chunks_dir, exist_ok=True)
-    os.makedirs(embeddings_dir, exist_ok=True)
-
-    logger.info("Source: %s (%d documents)", type(source).__name__, len(source))
-
-    # Stage 1: Render documents to tiles
-    from pixelrag_render.render import render_url, render_pdf, render_file
-
-    logger.info("Stage 1: Rendering %d documents...", len(source))
-    for doc in source:
-        doc_tiles_dir = str(tiles_dir / f"{doc.id}.tiles")
-        if doc.url:
-            render_url(doc.url, doc_tiles_dir, **ingest_config)
-        elif doc.path:
-            render_file(doc.path, doc_tiles_dir, **ingest_config)
-        logger.info("  Rendered: %s", doc.id)
-
-    # Stage 2: Chunk tiles
-    logger.info("Stage 2: Chunking tiles...")
-    subprocess.run([
-        sys.executable, "-m", "pixelrag_embed.chunk",
-        "--tiles-dir", str(tiles_dir),
-    ], check=True)
-
-    # Stage 3: Embed chunks
-    logger.info("Stage 3: Embedding chunks...")
-    embed_cmd = [
-        sys.executable, "-m", "pixelrag_embed.embed",
-        "--shard-dir", str(tiles_dir),
-        "--output-dir", str(embeddings_dir),
-    ]
-    if "gpu_ids" in embed_config:
-        embed_cmd.extend(["--gpu-ids", ",".join(str(g) for g in embed_config["gpu_ids"])])
-    if "model" in embed_config:
-        embed_cmd.extend(["--model", embed_config["model"]])
-    if "backend" in embed_config:
-        embed_cmd.extend(["--backend", embed_config["backend"]])
-    subprocess.run(embed_cmd, check=True)
-
-    # Stage 4: Build FAISS index
-    logger.info("Stage 4: Building FAISS index...")
-    subprocess.run([
-        sys.executable, "-m", "pixelrag_embed.index",
-        "build",
-        "--embeddings-dir", str(embeddings_dir),
-        "--output-dir", str(index_dir),
-    ], check=True)
-
-    logger.info("Index built at: %s", index_dir)
-    return index_dir
-
-
-def main():
-    parser = argparse.ArgumentParser(description="Build a visual search index")
-    parser.add_argument("command", choices=["build"], help="Command to run")
-    parser.add_argument("--config", "-c", default=None, help="Path to pixelrag.yaml")
-    parser.add_argument("--source", "-s", default=None, help="Source path (overrides config)")
-    parser.add_argument("--output", "-o", default=None, help="Output directory")
-    args = parser.parse_args()
-
-    logging.basicConfig(level=logging.INFO, format="%(asctime)s %(name)s %(message)s")
-
-    config = load_config(args.config)
-    if args.source:
-        config.setdefault("source", {})["path"] = args.source
-    if args.output:
-        config["output"] = args.output
-
-    if args.command == "build":
-        build(config)
-
-
-if __name__ == "__main__":
-    main()
-```
-
-- [ ] **Step 11: Commit**
-
-```bash
-cd ~/pixelrag
-git add packages/index/
-git commit -m "feat: add pixelrag-index package (orchestration, sources, distributed)"
-```
-
----
-
-### Task 7: Update eval/, README, and cleanup
-
-**Files:**
-- Modify: `eval/` (fix imports if needed)
-- Modify: `README.md`
-- Remove: old `packages/` remnants
-
-- [ ] **Step 1: Verify eval/ still works**
-
-eval/ should be unchanged. Check for broken imports:
-```bash
-grep -rn 'pixelrag_capture\|pixelrag_serving\|pixelrag_training\|wiki_screenshot' ~/pixelrag/eval/ --include='*.py'
-```
-
-If any are found, fix them. The eval/ scripts talk to the search API over HTTP, so they should not import from other packages.
-
-- [ ] **Step 2: Update README.md**
-
-Rewrite to reflect the new 5-package architecture, user personas, and quick-start examples for each user type.
-
-- [ ] **Step 3: Clean up arxiv/ directory**
-
-An `arxiv/` directory has appeared in the working tree. Add it to `.gitignore` if it should not be tracked, or remove it from git if it was committed.
-
-- [ ] **Step 4: Final sweep**
-
-```bash
-cd ~/pixelrag
-# No secrets
-grep -rn 'hf_[A-Za-z0-9]\{20,\}' --include='*.py' --include='*.sh' . | grep -v .git/
-# No Tsinghua mirror
-grep -rn 'tsinghua' --include='*.toml' . | grep -v .git/
-# No hardcoded machine paths
-grep -rn '/opt/dlami\|/home/user/\|/home/ubuntu/\|/home/andy/' --include='*.py' . | grep -v .git/ | head -10
-# No large files
-find . -size +1M -type f | grep -v '.git/' | grep -v '.venv/'
-```
-
-- [ ] **Step 5: Commit**
-
-```bash
-cd ~/pixelrag
-git add -A
-git commit -m "cleanup: update README, eval, remove old package remnants"
-```
-
----
-
-### Task 8: Workspace verification
-
-- [ ] **Step 1: Resolve workspace dependencies**
-
-```bash
-cd ~/pixelrag
-rm -f uv.lock
-uv sync 2>&1 | tail -5
-```
-
-- [ ] **Step 2: Verify each package imports**
-
-```bash
-uv run --package pixelrag-render python -c "from pixelrag_render.render import render_url; print('render OK')"
-uv run --package pixelrag-embed python -c "from pixelrag_embed import chunk, embed, index; print('embed OK')"
-uv run --package pixelrag-serve python -c "from pixelrag_serve import api; print('serve OK')"
-uv run --package pixelrag-train python -c "from pixelrag_train.models.biqwen3 import BiQwen3; print('train OK')"
-uv run --package pixelrag-index python -c "from pixelrag_index.config import load_config; print('index OK')"
-```
-
-- [ ] **Step 3: Verify serving still works**
-
-```bash
-PIXELRAG_INDEX_DIR=/home/yichuan/pixelrag-data/text_search_index_1024 \
-PIXELRAG_ARTICLES_JSON=/home/yichuan/pixelrag-data/articles.json \
-uv run --package pixelrag-serve pixelrag-serve --port 31001 &
-# Wait for loading, then test
-sleep 120
-curl -s http://localhost:31001/health
-curl -s -X POST http://localhost:31001/search \
-    -H "Content-Type: application/json" \
-    -d '{"queries": [{"text": "Apollo 11"}], "n_docs": 3}'
-kill %1
-```
-
-- [ ] **Step 4: Commit lock file**
-
-```bash
-cd ~/pixelrag
-git add uv.lock
-git commit -m "chore: regenerate uv.lock for 5-package workspace"
-```
diff --git a/docs/superpowers/plans/2026-05-25-pixelrag-frontend.md b/docs/superpowers/plans/2026-05-25-pixelrag-frontend.md
deleted file mode 100644
index 1ade91f..0000000
--- a/docs/superpowers/plans/2026-05-25-pixelrag-frontend.md
+++ /dev/null
@@ -1,2313 +0,0 @@
-# PixelRAG Frontend Implementation Plan
-
-> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
-
-**Goal:** Build a modern Next.js frontend for the PixelRAG visual retrieval engine — search page with image grid, API docs, and index dashboard.
-
-**Architecture:** Standalone Next.js 15 app in `web/` directory. Communicates with existing FastAPI backend via API proxy rewrites in dev, CORS in production. Dark theme, indigo accent, image-first design.
-
-**Tech Stack:** Next.js 15 (App Router), Tailwind CSS 4, shadcn/ui, Framer Motion, TypeScript
-
-**Spec:** `docs/superpowers/specs/2026-05-25-pixelrag-frontend-design.md`
-
----
-
-## File Map
-
-### New files (all under `web/`)
-
-| File | Responsibility |
-|------|---------------|
-| `web/src/lib/types.ts` | TypeScript types mirroring FastAPI Pydantic models |
-| `web/src/lib/api.ts` | Typed fetch wrapper for all backend endpoints |
-| `web/src/app/layout.tsx` | Root layout: nav bar, fonts, theme |
-| `web/src/app/page.tsx` | Search home page: SearchBar + results |
-| `web/src/app/status/page.tsx` | Index dashboard |
-| `web/src/app/docs/page.tsx` | API documentation |
-| `web/src/components/SearchBar.tsx` | Text + image input with drag-drop |
-| `web/src/components/TileCard.tsx` | Single tile result card |
-| `web/src/components/ResultGroup.tsx` | Article group with horizontal tile row |
-| `web/src/components/Lightbox.tsx` | Fullscreen tile viewer with pan/zoom/nav |
-| `web/src/components/ComparePanel.tsx` | Side-by-side tile comparison |
-| `web/src/components/ApiPlayground.tsx` | Try-it-live widget for docs page |
-| `web/src/components/StatusCard.tsx` | Metric display card |
-
-### Modified files
-
-| File | Change |
-|------|--------|
-| `serve/src/pixelrag_serve/api.py` | Add CORS middleware (3 lines) |
-
----
-
-## Phase 1: Core Search (MVP)
-
-### Task 1: Project Scaffolding
-
-**Files:**
-- Create: `web/` (entire Next.js project via CLI)
-- Modify: `web/src/app/globals.css` (custom theme tokens)
-- Modify: `web/next.config.ts` (API proxy rewrites)
-- Modify: `web/postcss.config.mjs` (verify Tailwind v4 plugin)
-
-- [ ] **Step 1: Scaffold Next.js project with shadcn/ui**
-
-```bash
-cd /home/yichuan/pixelrag
-npx shadcn@latest init -t next web
-```
-
-When prompted, accept defaults. This creates a Next.js 15 + Tailwind CSS 4 + shadcn/ui project in `web/`.
-
-- [ ] **Step 2: Verify the scaffold built correctly**
-
-```bash
-cd /home/yichuan/pixelrag/web && npm run build
-```
-
-Expected: Build succeeds with no errors.
-
-- [ ] **Step 3: Install additional dependencies**
-
-```bash
-cd /home/yichuan/pixelrag/web && npm install framer-motion
-```
-
-- [ ] **Step 4: Add shadcn/ui components we'll need**
-
-```bash
-cd /home/yichuan/pixelrag/web
-npx shadcn@latest add button input badge collapsible dialog slider
-```
-
-- [ ] **Step 5: Configure API proxy rewrites**
-
-Replace `web/next.config.ts` with:
-
-```ts
-import type { NextConfig } from "next";
-
-const nextConfig: NextConfig = {
-  async rewrites() {
-    return [
-      {
-        source: "/api/:path*",
-        destination: "http://localhost:30001/:path*",
-      },
-    ];
-  },
-};
-
-export default nextConfig;
-```
-
-- [ ] **Step 6: Set up custom theme in globals.css**
-
-Replace the `@theme` block in `web/src/app/globals.css` with the PixelRAG color palette. Keep the existing `@import "tailwindcss"` and shadcn layers. Add the custom theme tokens:
-
-```css
-@import "tailwindcss";
-
-@theme inline {
-  --color-background: #0c0c0c;
-  --color-surface: #1a1a1a;
-  --color-border: #222222;
-  --color-foreground: #ffffff;
-  --color-muted: #888888;
-  --color-muted-foreground: #555555;
-  --color-accent: #6366f1;
-  --color-accent-light: #8b5cf6;
-  --color-score: #6366f1;
-  --color-method-get: #3b82f6;
-  --color-method-post: #22c55e;
-
-  --font-sans: "Inter", ui-sans-serif, system-ui, sans-serif;
-  --font-display: "Crimson Pro", ui-serif, Georgia, serif;
-  --font-mono: "JetBrains Mono", ui-monospace, monospace;
-}
-
-/* shadcn overrides for dark theme */
-:root {
-  color-scheme: dark;
-}
-
-body {
-  background: var(--color-background);
-  color: var(--color-foreground);
-  font-family: var(--font-sans);
-}
-```
-
-Note: The exact format depends on what the shadcn init generated. Preserve any existing shadcn CSS variables and layer imports. The key additions are the custom color tokens and font families.
-
-- [ ] **Step 7: Add Google Fonts**
-
-In `web/src/app/layout.tsx`, add font imports. The shadcn scaffold creates a layout with a font already — modify it to use Inter + Crimson Pro + JetBrains Mono:
-
-```tsx
-import { Inter, Crimson_Pro, JetBrains_Mono } from "next/font/google";
-
-const inter = Inter({ subsets: ["latin"], variable: "--font-sans" });
-const crimsonPro = Crimson_Pro({ subsets: ["latin"], variable: "--font-display" });
-const jetbrainsMono = JetBrains_Mono({ subsets: ["latin"], variable: "--font-mono" });
-
-// In the <body> tag:
-<body className={`${inter.variable} ${crimsonPro.variable} ${jetbrainsMono.variable} antialiased`}>
-```
-
-- [ ] **Step 8: Verify dev server starts**
-
-```bash
-cd /home/yichuan/pixelrag/web && npm run dev &
-sleep 3
-curl -s http://localhost:3000 | head -20
-kill %1
-```
-
-Expected: HTML response from Next.js dev server.
-
-- [ ] **Step 9: Commit**
-
-```bash
-cd /home/yichuan/pixelrag
-git add web/
-git commit -m "feat(web): scaffold Next.js 15 + Tailwind 4 + shadcn/ui project"
-```
-
----
-
-### Task 2: Backend CORS Middleware
-
-**Files:**
-- Modify: `serve/src/pixelrag_serve/api.py` (lines 46-54)
-
-- [ ] **Step 1: Add CORS middleware to FastAPI**
-
-In `serve/src/pixelrag_serve/api.py`, add after the `app = FastAPI(...)` line (line 54):
-
-```python
-from fastapi.middleware.cors import CORSMiddleware
-
-app.add_middleware(
-    CORSMiddleware,
-    allow_origins=["http://localhost:3000"],
-    allow_credentials=True,
-    allow_methods=["*"],
-    allow_headers=["*"],
-)
-```
-
-Place the `from fastapi.middleware.cors import CORSMiddleware` import in the existing import block at the top of the file (near line 47, with the other fastapi imports) rather than next to the `app.add_middleware(...)` call.
-
-- [ ] **Step 2: Commit**
-
-```bash
-cd /home/yichuan/pixelrag
-git add serve/src/pixelrag_serve/api.py
-git commit -m "feat(serve): add CORS middleware for frontend dev server"
-```
-
----
-
-### Task 3: TypeScript Types + API Client
-
-**Files:**
-- Create: `web/src/lib/types.ts`
-- Create: `web/src/lib/api.ts`
-
-- [ ] **Step 1: Create TypeScript types mirroring the Pydantic models**
-
-Create `web/src/lib/types.ts`:
-
-```ts
-export interface Query {
-  text?: string;
-  image?: string; // base64-encoded
-  embedding?: number[];
-}
-
-export interface SearchRequest {
-  queries: Query[];
-  n_docs?: number;
-  nprobe?: number;
-  min_tile_height?: number;
-  instruction?: string;
-}
-
-export interface Hit {
-  score: number;
-  vector_id: number;
-  article_id: number;
-  tile_index: number;
-  chunk_index: number;
-  y_offset: number;
-  tile_height: number;
-  path: string;
-  url: string;
-}
-
-export interface QueryResult {
-  hits: Hit[];
-}
-
-export interface SearchResponse {
-  results: QueryResult[];
-}
-
-export interface StatusResponse {
-  total_vectors: number;
-  dimension: number;
-  nlist: number;
-  nprobe: number;
-  model: string;
-  index_dir: string;
-  tiles_dir: string;
-  index_built_at: string;
-  index_size_bytes: number;
-  metadata_size_bytes: number;
-}
-
-export interface ArticleGroup {
-  article_id: number;
-  title: string;
-  url: string;
-  hits: (Hit & { rank: number })[];
-}
-```
-
-- [ ] **Step 2: Create API client**
-
-Create `web/src/lib/api.ts`:
-
-```ts
-import type { SearchRequest, SearchResponse, StatusResponse } from "./types";
-
-const API_BASE = "/api";
-
-async function fetchApi<T>(path: string, init?: RequestInit): Promise<T> {
-  const res = await fetch(`${API_BASE}${path}`, init);
-  if (!res.ok) {
-    const body = await res.text();
-    throw new Error(`API ${res.status}: ${body}`);
-  }
-  return res.json();
-}
-
-export async function search(req: SearchRequest): Promise<SearchResponse> {
-  return fetchApi<SearchResponse>("/search", {
-    method: "POST",
-    headers: { "Content-Type": "application/json" },
-    body: JSON.stringify(req),
-  });
-}
-
-export async function getStatus(): Promise<StatusResponse> {
-  return fetchApi<StatusResponse>("/status");
-}
-
-export async function getHealth(): Promise<{ status: string }> {
-  return fetchApi<{ status: string }>("/health");
-}
-
-export function tileUrl(path: string): string {
-  return `${API_BASE}/tile?path=${encodeURIComponent(path)}`;
-}
-
-export async function reconstruct(
-  vectorIds: number[]
-): Promise<{ embeddings: number[][] }> {
-  return fetchApi<{ embeddings: number[][] }>("/reconstruct", {
-    method: "POST",
-    headers: { "Content-Type": "application/json" },
-    body: JSON.stringify({ vector_ids: vectorIds }),
-  });
-}
-```
-
-- [ ] **Step 3: Verify types compile**
-
-```bash
-cd /home/yichuan/pixelrag/web && npx tsc --noEmit
-```
-
-Expected: No errors.
-
-- [ ] **Step 4: Commit**
-
-```bash
-cd /home/yichuan/pixelrag
-git add web/src/lib/types.ts web/src/lib/api.ts
-git commit -m "feat(web): add TypeScript types and API client"
-```
-
----
-
-### Task 4: Navigation Shell (Layout)
-
-**Files:**
-- Modify: `web/src/app/layout.tsx`
-
-- [ ] **Step 1: Build the root layout with nav bar**
-
-Replace `web/src/app/layout.tsx` with the full layout. Keep the font setup from Task 1 Step 7. The nav bar has the logo on the left and the page links on the right (Search, API Docs, Status).
-
-```tsx
-import type { Metadata } from "next";
-import { Inter, Crimson_Pro, JetBrains_Mono } from "next/font/google";
-import Link from "next/link";
-import "./globals.css";
-
-const inter = Inter({ subsets: ["latin"], variable: "--font-sans" });
-const crimsonPro = Crimson_Pro({
-  subsets: ["latin"],
-  variable: "--font-display",
-});
-const jetbrainsMono = JetBrains_Mono({
-  subsets: ["latin"],
-  variable: "--font-mono",
-});
-
-export const metadata: Metadata = {
-  title: "PixelRAG",
-  description: "Visual retrieval over Wikipedia screenshot tiles",
-};
-
-export default function RootLayout({
-  children,
-}: {
-  children: React.ReactNode;
-}) {
-  return (
-    <html lang="en" className="dark">
-      <body
-        className={`${inter.variable} ${crimsonPro.variable} ${jetbrainsMono.variable} font-sans antialiased bg-background text-foreground min-h-screen`}
-      >
-        <nav className="border-b border-border/50 sticky top-0 z-50 bg-background/80 backdrop-blur-sm">
-          <div className="max-w-6xl mx-auto px-6 h-14 flex items-center justify-between">
-            <Link href="/" className="flex items-center gap-2">
-              <span className="font-display text-xl font-semibold tracking-tight">
-                Pixel<span className="text-accent">RAG</span>
-              </span>
-            </Link>
-            <div className="flex items-center gap-6 text-sm text-muted">
-              <Link
-                href="/"
-                className="hover:text-foreground transition-colors"
-              >
-                Search
-              </Link>
-              <Link
-                href="/docs"
-                className="hover:text-foreground transition-colors"
-              >
-                API Docs
-              </Link>
-              <Link
-                href="/status"
-                className="hover:text-foreground transition-colors"
-              >
-                Status
-              </Link>
-            </div>
-          </div>
-        </nav>
-        <main>{children}</main>
-      </body>
-    </html>
-  );
-}
-```
-
-- [ ] **Step 2: Verify layout renders**
-
-```bash
-cd /home/yichuan/pixelrag/web && npm run build
-```
-
-Expected: Build succeeds.
-
-- [ ] **Step 3: Commit**
-
-```bash
-cd /home/yichuan/pixelrag
-git add web/src/app/layout.tsx
-git commit -m "feat(web): add navigation shell with header"
-```
-
----
-
-### Task 5: TileCard Component
-
-**Files:**
-- Create: `web/src/components/TileCard.tsx`
-
-- [ ] **Step 1: Build the TileCard component**
-
-Create `web/src/components/TileCard.tsx`:
-
-```tsx
-"use client";
-
-import Image from "next/image";
-import { useState } from "react";
-import type { Hit } from "@/lib/types";
-import { tileUrl } from "@/lib/api";
-
-interface TileCardProps {
-  hit: Hit;
-  rank: number;
-  selected?: boolean;
-  onSelect?: (hit: Hit) => void;
-  onClick?: (hit: Hit) => void;
-}
-
-export function TileCard({
-  hit,
-  rank,
-  selected,
-  onSelect,
-  onClick,
-}: TileCardProps) {
-  const [imgError, setImgError] = useState(false);
-
-  return (
-    <div
-      className={`relative group min-w-[200px] max-w-[240px] flex-shrink-0 rounded-lg border overflow-hidden cursor-pointer transition-all
-        ${selected ? "border-accent ring-1 ring-accent" : "border-border hover:border-border/80"}
-        bg-surface`}
-      onClick={() => onClick?.(hit)}
-    >
-      {/* Tile image */}
-      <div className="relative w-full aspect-[875/600] bg-background">
-        {imgError ? (
-          <div className="absolute inset-0 flex items-center justify-center text-muted-foreground text-xs">
-            tile {hit.tile_index}:{hit.chunk_index}
-          </div>
-        ) : (
-          <img
-            src={tileUrl(hit.path)}
-            alt={`tile ${hit.tile_index}:${hit.chunk_index}`}
-            className="w-full h-full object-cover object-top"
-            loading="lazy"
-            onError={() => setImgError(true)}
-          />
-        )}
-      </div>
-
-      {/* Rank badge */}
-      <div className="absolute top-1.5 left-1.5 bg-black/70 text-white text-[10px] font-bold px-1.5 py-0.5 rounded">
-        #{rank}
-      </div>
-
-      {/* Select checkbox (visible on hover or when selected) */}
-      {onSelect && (
-        <div
-          className={`absolute top-1.5 right-1.5 w-5 h-5 rounded border flex items-center justify-center text-xs transition-opacity
-            ${selected ? "opacity-100 bg-accent border-accent text-white" : "opacity-0 group-hover:opacity-100 border-white/50 bg-black/50"}`}
-          onClick={(e) => {
-            e.stopPropagation();
-            onSelect(hit);
-          }}
-        >
-          {selected && "✓"}
-        </div>
-      )}
-
-      {/* Metadata footer */}
-      <div className="px-2.5 py-2 flex items-center justify-between">
-        <span className="text-xs font-semibold text-score">
-          {hit.score.toFixed(3)}
-        </span>
-        <span className="text-[10px] text-muted-foreground">
-          {hit.tile_height}px
-        </span>
-      </div>
-    </div>
-  );
-}
-```
-
-- [ ] **Step 2: Verify it compiles**
-
-```bash
-cd /home/yichuan/pixelrag/web && npx tsc --noEmit
-```
-
-Expected: No errors.
-
-- [ ] **Step 3: Commit**
-
-```bash
-cd /home/yichuan/pixelrag
-git add web/src/components/TileCard.tsx
-git commit -m "feat(web): add TileCard component"
-```
-
----
-
-### Task 6: ResultGroup Component
-
-**Files:**
-- Create: `web/src/components/ResultGroup.tsx`
-
-- [ ] **Step 1: Build the ResultGroup component**
-
-Create `web/src/components/ResultGroup.tsx`:
-
-```tsx
-"use client";
-
-import { ExternalLink } from "lucide-react";
-import type { Hit, ArticleGroup } from "@/lib/types";
-import { TileCard } from "./TileCard";
-
-interface ResultGroupProps {
-  group: ArticleGroup;
-  selectedHits: Set<number>;
-  onSelectHit: (hit: Hit) => void;
-  onClickHit: (hit: Hit) => void;
-}
-
-export function ResultGroup({
-  group,
-  selectedHits,
-  onSelectHit,
-  onClickHit,
-}: ResultGroupProps) {
-  return (
-    <div className="mb-6">
-      {/* Article header */}
-      <div className="flex items-center gap-2 mb-2.5">
-        <h3 className="text-sm font-medium text-foreground">{group.title}</h3>
-        {group.url && (
-          <a
-            href={group.url}
-            target="_blank"
-            rel="noopener noreferrer"
-            className="text-[11px] text-accent hover:underline flex items-center gap-0.5"
-          >
-            {new URL(group.url).hostname}
-            <ExternalLink className="w-3 h-3" />
-          </a>
-        )}
-        <span className="text-[10px] text-muted-foreground bg-surface px-2 py-0.5 rounded-full">
-          {group.hits.length} tile{group.hits.length !== 1 && "s"}
-        </span>
-      </div>
-
-      {/* Horizontal scrollable tile row */}
-      <div className="flex gap-2.5 overflow-x-auto pb-2 scrollbar-thin scrollbar-thumb-border scrollbar-track-transparent">
-        {group.hits.map((hit) => (
-          <TileCard
-            key={hit.vector_id}
-            hit={hit}
-            rank={hit.rank}
-            selected={selectedHits.has(hit.vector_id)}
-            onSelect={onSelectHit}
-            onClick={onClickHit}
-          />
-        ))}
-      </div>
-    </div>
-  );
-}
-```
-
-- [ ] **Step 2: Add a utility to group hits by article**
-
-Add to the bottom of `web/src/lib/types.ts`:
-
-```ts
-export function groupHitsByArticle(hits: Hit[]): ArticleGroup[] {
-  const map = new Map<number, ArticleGroup>();
-  hits.forEach((hit, index) => {
-    const ranked = { ...hit, rank: index + 1 };
-    let group = map.get(hit.article_id);
-    if (!group) {
-      const slug = hit.url.split("/wiki/").pop() ?? "";
-      const title = decodeURIComponent(slug).replace(/_/g, " ") || `Article #${hit.article_id}`;
-      group = { article_id: hit.article_id, title, url: hit.url, hits: [] };
-      map.set(hit.article_id, group);
-    }
-    group.hits.push(ranked);
-  });
-  return Array.from(map.values());
-}
-```
-
-- [ ] **Step 3: Verify compilation**
-
-```bash
-cd /home/yichuan/pixelrag/web && npx tsc --noEmit
-```
-
-Expected: No errors.
-
-- [ ] **Step 4: Commit**
-
-```bash
-cd /home/yichuan/pixelrag
-git add web/src/components/ResultGroup.tsx web/src/lib/types.ts
-git commit -m "feat(web): add ResultGroup component with article grouping"
-```
-
----
-
-### Task 7: SearchBar Component (Text and Image)
-
-**Files:**
-- Create: `web/src/components/SearchBar.tsx`
-
-- [ ] **Step 1: Build the SearchBar component**
-
-Create `web/src/components/SearchBar.tsx`:
-
-```tsx
-"use client";
-
-import { useState, useRef, useCallback } from "react";
-import { Search, X, ImagePlus } from "lucide-react";
-import { Button } from "@/components/ui/button";
-import { Input } from "@/components/ui/input";
-
-interface SearchBarProps {
-  onSearch: (query: string, image?: string) => void;
-  isLoading: boolean;
-}
-
-export function SearchBar({ onSearch, isLoading }: SearchBarProps) {
-  const [query, setQuery] = useState("");
-  const [imagePreview, setImagePreview] = useState<string | null>(null);
-  const [imageBase64, setImageBase64] = useState<string | null>(null);
-  const fileInputRef = useRef<HTMLInputElement>(null);
-
-  const handleSubmit = useCallback(() => {
-    if (!query.trim() && !imageBase64) return;
-    onSearch(query.trim(), imageBase64 ?? undefined);
-  }, [query, imageBase64, onSearch]);
-
-  const handleKeyDown = useCallback(
-    (e: React.KeyboardEvent) => {
-      if (e.key === "Enter") handleSubmit();
-    },
-    [handleSubmit]
-  );
-
-  const handleImageUpload = useCallback((file: File) => {
-    if (!file.type.startsWith("image/")) return;
-    const reader = new FileReader();
-    reader.onload = (e) => {
-      const dataUrl = e.target?.result as string;
-      setImagePreview(dataUrl);
-      setImageBase64(dataUrl.split(",")[1]);
-    };
-    reader.readAsDataURL(file);
-  }, []);
-
-  const handleDrop = useCallback(
-    (e: React.DragEvent) => {
-      e.preventDefault();
-      const file = e.dataTransfer.files[0];
-      if (file) handleImageUpload(file);
-    },
-    [handleImageUpload]
-  );
-
-  const clearImage = useCallback(() => {
-    setImagePreview(null);
-    setImageBase64(null);
-    if (fileInputRef.current) fileInputRef.current.value = "";
-  }, []);
-
-  return (
-    <div className="w-full max-w-2xl mx-auto">
-      <div
-        className="flex gap-2 items-center"
-        onDrop={handleDrop}
-        onDragOver={(e) => e.preventDefault()}
-      >
-        {/* Image preview thumbnail */}
-        {imagePreview && (
-          <div className="relative w-10 h-10 rounded-md overflow-hidden flex-shrink-0 border border-border">
-            <img
-              src={imagePreview}
-              alt="Query image"
-              className="w-full h-full object-cover"
-            />
-            <button
-              onClick={clearImage}
-              className="absolute -top-1 -right-1 w-4 h-4 bg-background border border-border rounded-full flex items-center justify-center"
-            >
-              <X className="w-2.5 h-2.5" />
-            </button>
-          </div>
-        )}
-
-        {/* Text input */}
-        <div className="flex-1 relative">
-          <Input
-            value={query}
-            onChange={(e) => setQuery(e.target.value)}
-            onKeyDown={handleKeyDown}
-            placeholder="Search Wikipedia visually..."
-            className="bg-surface border-border text-foreground placeholder:text-muted-foreground h-11 pr-10"
-          />
-          <button
-            onClick={() => fileInputRef.current?.click()}
-            className="absolute right-3 top-1/2 -translate-y-1/2 text-muted-foreground hover:text-foreground transition-colors"
-            title="Upload image"
-          >
-            <ImagePlus className="w-4 h-4" />
-          </button>
-        </div>
-
-        <input
-          ref={fileInputRef}
-          type="file"
-          accept="image/*"
-          className="hidden"
-          onChange={(e) => {
-            const file = e.target.files?.[0];
-            if (file) handleImageUpload(file);
-          }}
-        />
-
-        {/* Search button */}
-        <Button
-          onClick={handleSubmit}
-          disabled={isLoading || (!query.trim() && !imageBase64)}
-          className="h-11 px-5 bg-accent hover:bg-accent/90 text-white"
-        >
-          {isLoading ? (
-            <span className="animate-pulse">Searching...</span>
-          ) : (
-            <>
-              <Search className="w-4 h-4 mr-1.5" />
-              Search
-            </>
-          )}
-        </Button>
-      </div>
-
-      {/* Mode chips */}
-      <div className="flex gap-2 mt-3 justify-center">
-        {["Text query", "Image upload", "Drag & drop"].map((label) => (
-          <span
-            key={label}
-            className="text-[11px] text-muted-foreground bg-surface px-2.5 py-1 rounded-full"
-          >
-            {label}
-          </span>
-        ))}
-      </div>
-    </div>
-  );
-}
-```
-
-- [ ] **Step 2: Verify compilation**
-
-```bash
-cd /home/yichuan/pixelrag/web && npx tsc --noEmit
-```
-
-- [ ] **Step 3: Commit**
-
-```bash
-cd /home/yichuan/pixelrag
-git add web/src/components/SearchBar.tsx
-git commit -m "feat(web): add SearchBar component with text and image input"
-```
-
----
-
-### Task 8: Search Page
-
-**Files:**
-- Modify: `web/src/app/page.tsx`
-
-- [ ] **Step 1: Build the search home page**
-
-Replace `web/src/app/page.tsx`:
-
-```tsx
-"use client";
-
-import { useState, useCallback } from "react";
-import { SearchBar } from "@/components/SearchBar";
-import { ResultGroup } from "@/components/ResultGroup";
-import { Lightbox } from "@/components/Lightbox";
-import { search } from "@/lib/api";
-import type { Hit, ArticleGroup } from "@/lib/types";
-import { groupHitsByArticle } from "@/lib/types";
-
-export default function SearchPage() {
-  const [groups, setGroups] = useState<ArticleGroup[]>([]);
-  const [allHits, setAllHits] = useState<Hit[]>([]);
-  const [isLoading, setIsLoading] = useState(false);
-  const [error, setError] = useState<string | null>(null);
-  const [resultMeta, setResultMeta] = useState<{
-    count: number;
-    timeMs: number;
-  } | null>(null);
-  const [selectedHits, setSelectedHits] = useState<Set<number>>(new Set());
-  const [lightboxHit, setLightboxHit] = useState<Hit | null>(null);
-  const [hasSearched, setHasSearched] = useState(false);
-
-  const handleSearch = useCallback(
-    async (query: string, image?: string) => {
-      setIsLoading(true);
-      setError(null);
-      setSelectedHits(new Set());
-      const t0 = performance.now();
-
-      try {
-        const queryObj: { text?: string; image?: string } = {};
-        if (query) queryObj.text = query;
-        if (image) queryObj.image = image;
-
-        const res = await search({
-          queries: [queryObj],
-          n_docs: 20,
-        });
-        const elapsed = performance.now() - t0;
-        const hits = res.results[0]?.hits ?? [];
-        setAllHits(hits);
-        setGroups(groupHitsByArticle(hits));
-        setResultMeta({ count: hits.length, timeMs: elapsed });
-        setHasSearched(true);
-      } catch (err) {
-        setError(err instanceof Error ? err.message : "Search failed");
-        setGroups([]);
-        setAllHits([]);
-      } finally {
-        setIsLoading(false);
-      }
-    },
-    []
-  );
-
-  const handleSelectHit = useCallback((hit: Hit) => {
-    setSelectedHits((prev) => {
-      const next = new Set(prev);
-      if (next.has(hit.vector_id)) {
-        next.delete(hit.vector_id);
-      } else {
-        next.add(hit.vector_id);
-      }
-      return next;
-    });
-  }, []);
-
-  const handleClickHit = useCallback((hit: Hit) => {
-    setLightboxHit(hit);
-  }, []);
-
-  return (
-    <div className="max-w-6xl mx-auto px-6 py-12">
-      {/* Hero */}
-      <div className="text-center mb-8">
-        <h1 className="font-display text-4xl font-semibold tracking-tight mb-1">
-          Vis<span className="text-accent">RAG</span>
-        </h1>
-        <p className="text-sm text-muted">
-          Visual retrieval over 15.7M Wikipedia screenshot tiles
-        </p>
-      </div>
-
-      {/* Search */}
-      <SearchBar onSearch={handleSearch} isLoading={isLoading} />
-
-      {/* Status bar */}
-      {resultMeta && (
-        <div className="text-xs text-muted mt-6 mb-4 text-center">
-          {resultMeta.count} results in{" "}
-          <span className="text-accent">
-            {(resultMeta.timeMs / 1000).toFixed(2)}s
-          </span>
-        </div>
-      )}
-
-      {/* Error */}
-      {error && (
-        <div className="mt-6 p-4 border border-red-500/30 rounded-lg bg-red-500/5 text-red-400 text-sm">
-          {error}
-        </div>
-      )}
-
-      {/* Results */}
-      {groups.length > 0 && (
-        <div className="mt-6">
-          {groups.map((group) => (
-            <ResultGroup
-              key={group.article_id}
-              group={group}
-              selectedHits={selectedHits}
-              onSelectHit={handleSelectHit}
-              onClickHit={handleClickHit}
-            />
-          ))}
-        </div>
-      )}
-
-      {/* Empty state */}
-      {hasSearched && groups.length === 0 && !error && !isLoading && (
-        <div className="text-center text-muted-foreground mt-12">
-          No results found
-        </div>
-      )}
-
-      {/* Lightbox */}
-      {lightboxHit && (
-        <Lightbox
-          hit={lightboxHit}
-          allHits={allHits}
-          onClose={() => setLightboxHit(null)}
-          onNavigate={setLightboxHit}
-        />
-      )}
-
-      {/* Compare floating bar */}
-      {selectedHits.size >= 2 && (
-        <div className="fixed bottom-6 left-1/2 -translate-x-1/2 bg-surface border border-border rounded-full px-5 py-2.5 flex items-center gap-3 shadow-lg z-40">
-          <span className="text-sm text-muted">
-            {selectedHits.size} tiles selected
-          </span>
-          <button
-            className="text-sm text-accent font-medium hover:underline"
-            onClick={() => {
-              /* ComparePanel handled in Phase 2 */
-            }}
-          >
-            Compare
-          </button>
-          <button
-            className="text-sm text-muted-foreground hover:text-foreground"
-            onClick={() => setSelectedHits(new Set())}
-          >
-            Clear
-          </button>
-        </div>
-      )}
-    </div>
-  );
-}
-```
-
-- [ ] **Step 2: Verify build**
-
-```bash
-cd /home/yichuan/pixelrag/web && npm run build
-```
-
-Expected: Build succeeds once the Lightbox stub described below exists (the full component is implemented in Task 9).
-
-Note: Before building, create a minimal Lightbox stub so the import doesn't fail:
-
-```bash
-mkdir -p /home/yichuan/pixelrag/web/src/components
-```
-
-Create `web/src/components/Lightbox.tsx` with a minimal stub:
-
-```tsx
-"use client";
-
-import type { Hit } from "@/lib/types";
-
-interface LightboxProps {
-  hit: Hit;
-  allHits: Hit[];
-  onClose: () => void;
-  onNavigate: (hit: Hit) => void;
-}
-
-export function Lightbox({ onClose }: LightboxProps) {
-  return (
-    <div
-      className="fixed inset-0 z-50 bg-black/90 flex items-center justify-center"
-      onClick={onClose}
-    >
-      <p className="text-white">Lightbox placeholder</p>
-    </div>
-  );
-}
-```
-
-- [ ] **Step 3: Build and verify**
-
-```bash
-cd /home/yichuan/pixelrag/web && npm run build
-```
-
-Expected: Build succeeds.
-
-- [ ] **Step 4: Commit**
-
-```bash
-cd /home/yichuan/pixelrag
-git add web/src/app/page.tsx web/src/components/Lightbox.tsx
-git commit -m "feat(web): add search home page with result grouping"
-```
-
----
-
-### Task 9: Tile Lightbox
-
-**Files:**
-- Modify: `web/src/components/Lightbox.tsx` (replace stub)
-
-- [ ] **Step 1: Implement the full Lightbox component**
-
-Replace `web/src/components/Lightbox.tsx`:
-
-```tsx
-"use client";
-
-import { useEffect, useCallback, useState, useRef } from "react";
-import { motion, AnimatePresence } from "framer-motion";
-import { X, ChevronLeft, ChevronRight, ExternalLink } from "lucide-react";
-import type { Hit } from "@/lib/types";
-import { tileUrl } from "@/lib/api";
-
-interface LightboxProps {
-  hit: Hit;
-  allHits: Hit[];
-  onClose: () => void;
-  onNavigate: (hit: Hit) => void;
-}
-
-export function Lightbox({ hit, allHits, onClose, onNavigate }: LightboxProps) {
-  const [scale, setScale] = useState(1);
-  const [position, setPosition] = useState({ x: 0, y: 0 });
-  const [dragging, setDragging] = useState(false);
-  const dragStart = useRef({ x: 0, y: 0 });
-  const posStart = useRef({ x: 0, y: 0 });
-
-  const currentIndex = allHits.findIndex(
-    (h) => h.vector_id === hit.vector_id
-  );
-  const hasPrev = currentIndex > 0;
-  const hasNext = currentIndex < allHits.length - 1;
-
-  const resetView = useCallback(() => {
-    setScale(1);
-    setPosition({ x: 0, y: 0 });
-  }, []);
-
-  const goPrev = useCallback(() => {
-    if (hasPrev) {
-      resetView();
-      onNavigate(allHits[currentIndex - 1]);
-    }
-  }, [hasPrev, currentIndex, allHits, onNavigate, resetView]);
-
-  const goNext = useCallback(() => {
-    if (hasNext) {
-      resetView();
-      onNavigate(allHits[currentIndex + 1]);
-    }
-  }, [hasNext, currentIndex, allHits, onNavigate, resetView]);
-
-  useEffect(() => {
-    const handleKey = (e: KeyboardEvent) => {
-      if (e.key === "Escape") onClose();
-      if (e.key === "ArrowLeft") goPrev();
-      if (e.key === "ArrowRight") goNext();
-    };
-    window.addEventListener("keydown", handleKey);
-    return () => window.removeEventListener("keydown", handleKey);
-  }, [onClose, goPrev, goNext]);
-
-  const handleWheel = useCallback((e: React.WheelEvent) => {
-    // No preventDefault(): React registers wheel listeners as passive, so the call would be ignored.
-    setScale((prev) => Math.max(0.5, Math.min(5, prev - e.deltaY * 0.002)));
-  }, []);
-
-  const handleMouseDown = useCallback(
-    (e: React.MouseEvent) => {
-      if (scale <= 1) return;
-      setDragging(true);
-      dragStart.current = { x: e.clientX, y: e.clientY };
-      posStart.current = { ...position };
-    },
-    [scale, position]
-  );
-
-  const handleMouseMove = useCallback(
-    (e: React.MouseEvent) => {
-      if (!dragging) return;
-      setPosition({
-        x: posStart.current.x + (e.clientX - dragStart.current.x),
-        y: posStart.current.y + (e.clientY - dragStart.current.y),
-      });
-    },
-    [dragging]
-  );
-
-  const handleMouseUp = useCallback(() => {
-    setDragging(false);
-  }, []);
-
-  const slug = hit.url.split("/wiki/").pop() ?? "";
-  const title =
-    decodeURIComponent(slug).replace(/_/g, " ") ||
-    `Article #${hit.article_id}`;
-
-  return (
-    <AnimatePresence>
-      <motion.div
-        initial={{ opacity: 0 }}
-        animate={{ opacity: 1 }}
-        exit={{ opacity: 0 }}
-        className="fixed inset-0 z-50 bg-black/95 flex"
-        onClick={(e) => {
-          if (e.target === e.currentTarget) onClose();
-        }}
-      >
-        {/* Image area */}
-        <div
-          className="flex-1 flex items-center justify-center overflow-hidden cursor-grab active:cursor-grabbing"
-          onWheel={handleWheel}
-          onMouseDown={handleMouseDown}
-          onMouseMove={handleMouseMove}
-          onMouseUp={handleMouseUp}
-          onMouseLeave={handleMouseUp}
-        >
-          <img
-            src={tileUrl(hit.path)}
-            alt={`tile ${hit.tile_index}:${hit.chunk_index}`}
-            className="max-w-full max-h-full object-contain select-none"
-            style={{
-              transform: `translate(${position.x}px, ${position.y}px) scale(${scale})`,
-              transition: dragging ? "none" : "transform 0.15s ease-out",
-            }}
-            draggable={false}
-          />
-        </div>
-
-        {/* Metadata sidebar */}
-        <motion.div
-          initial={{ x: 80, opacity: 0 }}
-          animate={{ x: 0, opacity: 1 }}
-          className="w-72 bg-surface border-l border-border p-5 flex flex-col gap-4 overflow-y-auto"
-        >
-          <h3 className="text-base font-medium">{title}</h3>
-          {hit.url && (
-            <a
-              href={hit.url}
-              target="_blank"
-              rel="noopener noreferrer"
-              className="text-xs text-accent hover:underline flex items-center gap-1"
-            >
-              Open article <ExternalLink className="w-3 h-3" />
-            </a>
-          )}
-
-          <div className="space-y-3 text-xs">
-            <div>
-              <div className="text-muted-foreground mb-0.5">Score</div>
-              <div className="text-accent font-semibold text-lg">
-                {hit.score.toFixed(4)}
-              </div>
-            </div>
-            <div>
-              <div className="text-muted-foreground mb-0.5">Rank</div>
-              <div>#{currentIndex + 1} of {allHits.length}</div>
-            </div>
-            <div>
-              <div className="text-muted-foreground mb-0.5">Position</div>
-              <div>
-                tile {hit.tile_index} : chunk {hit.chunk_index}
-              </div>
-            </div>
-            <div>
-              <div className="text-muted-foreground mb-0.5">Tile Height</div>
-              <div>{hit.tile_height}px</div>
-            </div>
-            <div>
-              <div className="text-muted-foreground mb-0.5">Y Offset</div>
-              <div>{hit.y_offset}px</div>
-            </div>
-            <div>
-              <div className="text-muted-foreground mb-0.5">Vector ID</div>
-              <div className="font-mono">{hit.vector_id}</div>
-            </div>
-          </div>
-
-          <div className="mt-auto text-[10px] text-muted-foreground">
-            Scroll to zoom · Drag to pan · Arrow keys to navigate
-          </div>
-        </motion.div>
-
-        {/* Close button */}
-        <button
-          onClick={onClose}
-          className="absolute top-4 right-4 text-muted hover:text-foreground transition-colors z-10"
-        >
-          <X className="w-5 h-5" />
-        </button>
-
-        {/* Navigation arrows */}
-        {hasPrev && (
-          <button
-            onClick={goPrev}
-            className="absolute left-4 top-1/2 -translate-y-1/2 text-muted hover:text-foreground transition-colors"
-          >
-            <ChevronLeft className="w-8 h-8" />
-          </button>
-        )}
-        {hasNext && (
-          <button
-            onClick={goNext}
-            className="absolute right-80 top-1/2 -translate-y-1/2 text-muted hover:text-foreground transition-colors"
-          >
-            <ChevronRight className="w-8 h-8" />
-          </button>
-        )}
-      </motion.div>
-    </AnimatePresence>
-  );
-}
-```
-
-- [ ] **Step 2: Verify build**
-
-```bash
-cd /home/yichuan/pixelrag/web && npm run build
-```
-
-Expected: Build succeeds.
-
-- [ ] **Step 3: Commit**
-
-```bash
-cd /home/yichuan/pixelrag
-git add web/src/components/Lightbox.tsx
-git commit -m "feat(web): add Lightbox component with pan/zoom/navigation"
-```
-
----
-
-## Phase 2: Full Features
-
-### Task 10: Advanced Search Controls
-
-**Files:**
-- Create: `web/src/components/SearchControls.tsx`
-- Modify: `web/src/app/page.tsx`
-
-- [ ] **Step 1: Create SearchControls component**
-
-Create `web/src/components/SearchControls.tsx`:
-
-```tsx
-"use client";
-
-import { useState } from "react";
-import { ChevronDown, ChevronUp } from "lucide-react";
-import { Input } from "@/components/ui/input";
-
-export interface SearchOptions {
-  n_docs: number;
-  nprobe?: number;
-  min_tile_height?: number;
-  instruction?: string;
-}
-
-interface SearchControlsProps {
-  options: SearchOptions;
-  onChange: (options: SearchOptions) => void;
-}
-
-export function SearchControls({ options, onChange }: SearchControlsProps) {
-  const [open, setOpen] = useState(false);
-
-  return (
-    <div className="w-full max-w-2xl mx-auto mt-3">
-      <button
-        onClick={() => setOpen(!open)}
-        className="flex items-center gap-1 text-[11px] text-muted-foreground hover:text-muted mx-auto"
-      >
-        Advanced
-        {open ? (
-          <ChevronUp className="w-3 h-3" />
-        ) : (
-          <ChevronDown className="w-3 h-3" />
-        )}
-      </button>
-
-      {open && (
-        <div className="mt-3 p-4 bg-surface border border-border rounded-lg grid grid-cols-2 gap-3">
-          <div>
-            <label className="text-[10px] text-muted-foreground uppercase tracking-wider">
-              Results
-            </label>
-            <Input
-              type="number"
-              value={options.n_docs}
-              onChange={(e) =>
-                onChange({ ...options, n_docs: parseInt(e.target.value, 10) || 10 })
-              }
-              min={1}
-              max={100}
-              className="mt-1 h-8 text-xs bg-background border-border"
-            />
-          </div>
-          <div>
-            <label className="text-[10px] text-muted-foreground uppercase tracking-wider">
-              nprobe
-            </label>
-            <Input
-              type="number"
-              value={options.nprobe ?? ""}
-              onChange={(e) =>
-                onChange({
-                  ...options,
-                  nprobe: e.target.value ? parseInt(e.target.value, 10) : undefined,
-                })
-              }
-              placeholder="default"
-              className="mt-1 h-8 text-xs bg-background border-border"
-            />
-          </div>
-          <div>
-            <label className="text-[10px] text-muted-foreground uppercase tracking-wider">
-              Min tile height
-            </label>
-            <Input
-              type="number"
-              value={options.min_tile_height ?? ""}
-              onChange={(e) =>
-                onChange({
-                  ...options,
-                  min_tile_height: e.target.value
-                    ? parseInt(e.target.value, 10)
-                    : undefined,
-                })
-              }
-              placeholder="none"
-              className="mt-1 h-8 text-xs bg-background border-border"
-            />
-          </div>
-          <div>
-            <label className="text-[10px] text-muted-foreground uppercase tracking-wider">
-              Instruction
-            </label>
-            <Input
-              value={options.instruction ?? ""}
-              onChange={(e) =>
-                onChange({
-                  ...options,
-                  instruction: e.target.value || undefined,
-                })
-              }
-              placeholder="default"
-              className="mt-1 h-8 text-xs bg-background border-border"
-            />
-          </div>
-        </div>
-      )}
-    </div>
-  );
-}
-```
-
-- [ ] **Step 2: Wire SearchControls into the search page**
-
-In `web/src/app/page.tsx`, add state and pass options to the search call:
-
-1. Add import: `import { SearchControls, type SearchOptions } from "@/components/SearchControls";`
-2. Add state: `const [searchOptions, setSearchOptions] = useState<SearchOptions>({ n_docs: 20 });`
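-3. In `handleSearch`, change the `search()` call to use `searchOptions`, and add `searchOptions` to the `useCallback` dependency array (currently `[]`) so the callback does not capture stale options: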
-   ```ts
-   const res = await search({
-     queries: [queryObj],
-     n_docs: searchOptions.n_docs,
-     nprobe: searchOptions.nprobe,
-     min_tile_height: searchOptions.min_tile_height,
-     instruction: searchOptions.instruction,
-   });
-   ```
-4. Add `<SearchControls options={searchOptions} onChange={setSearchOptions} />` right after `<SearchBar />`.
-
-- [ ] **Step 3: Verify build**
-
-```bash
-cd /home/yichuan/pixelrag/web && npm run build
-```
-
-- [ ] **Step 4: Commit**
-
-```bash
-cd /home/yichuan/pixelrag
-git add web/src/components/SearchControls.tsx web/src/app/page.tsx
-git commit -m "feat(web): add advanced search controls (nprobe, min_tile_height, instruction)"
-```
-
----
-
-### Task 11: Side-by-Side Compare Panel
-
-**Files:**
-- Create: `web/src/components/ComparePanel.tsx`
-- Modify: `web/src/app/page.tsx`
-
-- [ ] **Step 1: Create ComparePanel component**
-
-Create `web/src/components/ComparePanel.tsx`:
-
-```tsx
-"use client";
-
-import { motion } from "framer-motion";
-import { X } from "lucide-react";
-import type { Hit } from "@/lib/types";
-import { tileUrl } from "@/lib/api";
-
-interface ComparePanelProps {
-  hits: Hit[];
-  allHits: Hit[];
-  onClose: () => void;
-}
-
-export function ComparePanel({ hits, allHits, onClose }: ComparePanelProps) {
-  return (
-    <motion.div
-      initial={{ y: "100%" }}
-      animate={{ y: 0 }}
-      exit={{ y: "100%" }}
-      transition={{ type: "spring", damping: 25, stiffness: 300 }}
-      className="fixed bottom-0 left-0 right-0 z-40 bg-surface border-t border-border max-h-[60vh] overflow-y-auto"
-    >
-      <div className="max-w-6xl mx-auto px-6 py-4">
-        <div className="flex items-center justify-between mb-4">
-          <h3 className="text-sm font-medium">
-            Comparing {hits.length} tiles
-          </h3>
-          <button
-            onClick={onClose}
-            className="text-muted-foreground hover:text-foreground"
-          >
-            <X className="w-4 h-4" />
-          </button>
-        </div>
-
-        <div className="flex gap-4 overflow-x-auto pb-4">
-          {hits.map((hit) => {
-            const rank =
-              allHits.findIndex((h) => h.vector_id === hit.vector_id) + 1;
-            const slug = hit.url.split("/wiki/").pop() ?? "";
-            const title =
-              decodeURIComponent(slug).replace(/_/g, " ") ||
-              `Article #${hit.article_id}`;
-
-            return (
-              <div
-                key={hit.vector_id}
-                className="flex-shrink-0 w-80 bg-background border border-border rounded-lg overflow-hidden"
-              >
-                <img
-                  src={tileUrl(hit.path)}
-                  alt={`tile ${hit.tile_index}:${hit.chunk_index}`}
-                  className="w-full h-48 object-cover object-top"
-                />
-                <div className="p-3 space-y-1">
-                  <div className="flex items-center justify-between">
-                    <span className="text-xs font-semibold text-accent">
-                      {hit.score.toFixed(4)}
-                    </span>
-                    <span className="text-[10px] text-muted-foreground">
-                      Rank #{rank}
-                    </span>
-                  </div>
-                  <div className="text-xs text-muted truncate">{title}</div>
-                  <div className="text-[10px] text-muted-foreground">
-                    tile {hit.tile_index}:{hit.chunk_index} ·{" "}
-                    {hit.tile_height}px
-                  </div>
-                </div>
-              </div>
-            );
-          })}
-        </div>
-      </div>
-    </motion.div>
-  );
-}
-```
-
-- [ ] **Step 2: Wire ComparePanel into search page**
-
-In `web/src/app/page.tsx`:
-
-1. Add import: `import { ComparePanel } from "@/components/ComparePanel";`
-2. Add state: `const [showCompare, setShowCompare] = useState(false);`
-3. Replace the `Compare` button onClick with: `onClick={() => setShowCompare(true)}`
-4. Add ComparePanel below the floating bar, wrapped in `<AnimatePresence>` (import it from `framer-motion`) so the exit animation can run:
-
-```tsx
-{showCompare && selectedHits.size >= 2 && (
-  <ComparePanel
-    hits={allHits.filter((h) => selectedHits.has(h.vector_id))}
-    allHits={allHits}
-    onClose={() => setShowCompare(false)}
-  />
-)}
-```
-
-- [ ] **Step 3: Verify build**
-
-```bash
-cd /home/yichuan/pixelrag/web && npm run build
-```
-
-- [ ] **Step 4: Commit**
-
-```bash
-cd /home/yichuan/pixelrag
-git add web/src/components/ComparePanel.tsx web/src/app/page.tsx
-git commit -m "feat(web): add side-by-side tile comparison panel"
-```
-
----
-
-### Task 12: Index Status Dashboard
-
-**Files:**
-- Create: `web/src/components/StatusCard.tsx`
-- Create: `web/src/app/status/page.tsx`
-
-- [ ] **Step 1: Create StatusCard component**
-
-Create `web/src/components/StatusCard.tsx`:
-
-```tsx
-interface StatusCardProps {
-  label: string;
-  value: string;
-  sub?: string;
-}
-
-export function StatusCard({ label, value, sub }: StatusCardProps) {
-  return (
-    <div className="bg-surface border border-border rounded-lg p-5">
-      <div className="text-[10px] text-muted-foreground uppercase tracking-wider mb-1">
-        {label}
-      </div>
-      <div className="text-2xl font-semibold text-foreground">{value}</div>
-      {sub && (
-        <div className="text-xs text-muted mt-1">{sub}</div>
-      )}
-    </div>
-  );
-}
-```
-
-- [ ] **Step 2: Create status page**
-
-Create `web/src/app/status/page.tsx`:
-
-```tsx
-"use client";
-
-import { useEffect, useState } from "react";
-import { getStatus } from "@/lib/api";
-import type { StatusResponse } from "@/lib/types";
-import { StatusCard } from "@/components/StatusCard";
-
-function formatBytes(bytes: number): string {
-  if (bytes < 1024) return `${bytes} B`;
-  if (bytes < 1024 ** 2) return `${(bytes / 1024).toFixed(1)} KB`;
-  if (bytes < 1024 ** 3) return `${(bytes / 1024 ** 2).toFixed(1)} MB`;
-  return `${(bytes / 1024 ** 3).toFixed(2)} GB`;
-}
-
-function formatVectors(n: number): string {
-  if (n >= 1_000_000) return `${(n / 1_000_000).toFixed(1)}M`;
-  if (n >= 1_000) return `${(n / 1_000).toFixed(1)}K`;
-  return `${n}`;
-}
-
-export default function StatusPage() {
-  const [status, setStatus] = useState<StatusResponse | null>(null);
-  const [error, setError] = useState<string | null>(null);
-
-  useEffect(() => {
-    getStatus()
-      .then(setStatus)
-      .catch((err) =>
-        setError(err instanceof Error ? err.message : "Failed to load status")
-      );
-  }, []);
-
-  if (error) {
-    return (
-      <div className="max-w-4xl mx-auto px-6 py-12">
-        <h1 className="font-display text-2xl font-semibold mb-6">
-          Index Status
-        </h1>
-        <div className="p-4 border border-red-500/30 rounded-lg bg-red-500/5 text-red-400 text-sm">
-          {error}
-        </div>
-      </div>
-    );
-  }
-
-  if (!status) {
-    return (
-      <div className="max-w-4xl mx-auto px-6 py-12">
-        <h1 className="font-display text-2xl font-semibold mb-6">
-          Index Status
-        </h1>
-        <div className="grid grid-cols-2 gap-4">
-          {[1, 2, 3, 4].map((i) => (
-            <div
-              key={i}
-              className="bg-surface border border-border rounded-lg p-5 animate-pulse h-24"
-            />
-          ))}
-        </div>
-      </div>
-    );
-  }
-
-  return (
-    <div className="max-w-4xl mx-auto px-6 py-12">
-      <h1 className="font-display text-2xl font-semibold mb-6">
-        Index Status
-      </h1>
-
-      <div className="grid grid-cols-2 gap-4 mb-8">
-        <StatusCard
-          label="Total Vectors"
-          value={formatVectors(status.total_vectors)}
-          sub={`${status.total_vectors.toLocaleString()} exact`}
-        />
-        <StatusCard
-          label="Dimension"
-          value={`${status.dimension}`}
-        />
-        <StatusCard
-          label="Model"
-          value={status.model.split("/").pop() ?? status.model}
-          sub={status.model}
-        />
-        <StatusCard
-          label="Index Size"
-          value={formatBytes(status.index_size_bytes)}
-          sub={`metadata: ${formatBytes(status.metadata_size_bytes)}`}
-        />
-      </div>
-
-      <h2 className="text-sm font-medium text-muted mb-3">Configuration</h2>
-      <div className="bg-surface border border-border rounded-lg divide-y divide-border text-sm">
-        {[
-          ["nlist", `${status.nlist}`],
-          ["nprobe", `${status.nprobe}`],
-          ["Built at", new Date(status.index_built_at).toLocaleString()],
-          ["Index dir", status.index_dir],
-          ["Tiles dir", status.tiles_dir],
-        ].map(([label, value]) => (
-          <div key={label} className="flex px-4 py-2.5">
-            <span className="w-32 text-muted-foreground flex-shrink-0">
-              {label}
-            </span>
-            <span className="font-mono text-xs">{value}</span>
-          </div>
-        ))}
-      </div>
-    </div>
-  );
-}
-```
-
-- [ ] **Step 3: Verify build**
-
-```bash
-cd /home/yichuan/pixelrag/web && npm run build
-```
-
-- [ ] **Step 4: Commit**
-
-```bash
-cd /home/yichuan/pixelrag
-git add web/src/components/StatusCard.tsx web/src/app/status/page.tsx
-git commit -m "feat(web): add index status dashboard page"
-```
-
----
-
-### Task 13: API Documentation Page
-
-**Files:**
-- Create: `web/src/components/ApiPlayground.tsx`
-- Create: `web/src/app/docs/page.tsx`
-
-- [ ] **Step 1: Create ApiPlayground component**
-
-Create `web/src/components/ApiPlayground.tsx`:
-
-```tsx
-"use client";
-
-import { useState } from "react";
-import { Button } from "@/components/ui/button";
-
-interface ApiPlaygroundProps {
-  method: "GET" | "POST";
-  path: string;
-  defaultBody?: string;
-}
-
-export function ApiPlayground({
-  method,
-  path,
-  defaultBody,
-}: ApiPlaygroundProps) {
-  const [body, setBody] = useState(defaultBody ?? "");
-  const [response, setResponse] = useState<string | null>(null);
-  const [isLoading, setIsLoading] = useState(false);
-  const [error, setError] = useState<string | null>(null);
-
-  const handleSend = async () => {
-    setIsLoading(true);
-    setError(null);
-    setResponse(null);
-
-    try {
-      const url = `/api${path}`;
-      const init: RequestInit =
-        method === "POST"
-          ? {
-              method: "POST",
-              headers: { "Content-Type": "application/json" },
-              body,
-            }
-          : {};
-      const res = await fetch(url, init);
-      const text = await res.text();
-      try {
-        setResponse(JSON.stringify(JSON.parse(text), null, 2));
-      } catch {
-        setResponse(text);
-      }
-    } catch (err) {
-      setError(err instanceof Error ? err.message : "Request failed");
-    } finally {
-      setIsLoading(false);
-    }
-  };
-
-  return (
-    <div className="bg-surface border border-border rounded-lg p-4 mt-3">
-      <div className="text-[10px] text-muted-foreground uppercase tracking-wider mb-2">
-        Try it
-      </div>
-      {method === "POST" && (
-        <textarea
-          value={body}
-          onChange={(e) => setBody(e.target.value)}
-          rows={4}
-          className="w-full bg-background border border-border rounded-md p-3 font-mono text-xs text-foreground resize-y mb-2 focus:outline-none focus:border-accent"
-          spellCheck={false}
-        />
-      )}
-      <Button
-        onClick={handleSend}
-        disabled={isLoading}
-        size="sm"
-        className="bg-accent hover:bg-accent/90 text-white text-xs"
-      >
-        {isLoading ? "Sending..." : `Send ${method}`}
-      </Button>
-
-      {error && (
-        <div className="mt-3 text-xs text-red-400 bg-red-500/5 border border-red-500/20 rounded p-2">
-          {error}
-        </div>
-      )}
-      {response && (
-        <pre className="mt-3 bg-background border border-border rounded-md p-3 font-mono text-[11px] text-muted overflow-x-auto max-h-64 overflow-y-auto">
-          {response}
-        </pre>
-      )}
-    </div>
-  );
-}
-```
-
-- [ ] **Step 2: Create the docs page**
-
-Create `web/src/app/docs/page.tsx`:
-
-```tsx
-"use client";
-
-import { useState } from "react";
-import { ApiPlayground } from "@/components/ApiPlayground";
-
-interface Endpoint {
-  method: "GET" | "POST";
-  path: string;
-  description: string;
-  requestBody?: string;
-  responseSchema: string;
-  curl: string;
-}
-
-const endpoints: Endpoint[] = [
-  {
-    method: "POST",
-    path: "/search",
-    description:
-      "Search the visual index with text queries, images, or pre-computed embeddings.",
-    requestBody: JSON.stringify(
-      { queries: [{ text: "Nikola Tesla" }], n_docs: 10 },
-      null,
-      2
-    ),
-    responseSchema: `{
-  "results": [{
-    "hits": [{
-      "score": 0.847,
-      "vector_id": 12345,
-      "article_id": 42,
-      "tile_index": 0,
-      "chunk_index": 0,
-      "y_offset": 0,
-      "tile_height": 8192,
-      "path": "/path/to/tile.png",
-      "url": "https://en.wikipedia.org/wiki/..."
-    }]
-  }]
-}`,
-    curl: `curl -X POST http://localhost:30001/search \\
-  -H "Content-Type: application/json" \\
-  -d '{"queries": [{"text": "Nikola Tesla"}], "n_docs": 10}'`,
-  },
-  {
-    method: "GET",
-    path: "/status",
-    description: "Get index metadata and statistics.",
-    responseSchema: `{
-  "total_vectors": 15700000,
-  "dimension": 2048,
-  "nlist": 4096,
-  "nprobe": 64,
-  "model": "Qwen/Qwen3-VL-Embedding-2B",
-  "index_built_at": "2026-05-20T00:00:00Z",
-  "index_size_bytes": 13312000000,
-  "metadata_size_bytes": 512000000
-}`,
-    curl: "curl http://localhost:30001/status",
-  },
-  {
-    method: "GET",
-    path: "/tile",
-    description:
-      "Serve a tile image by its local path. Path must be under the tiles directory.",
-    responseSchema: "(PNG image binary)",
-    curl: `curl "http://localhost:30001/tile?path=/path/to/chunk_0000_00.png" -o tile.png`,
-  },
-  {
-    method: "GET",
-    path: "/health",
-    description: "Health check endpoint.",
-    responseSchema: '{"status": "ok"}',
-    curl: "curl http://localhost:30001/health",
-  },
-  {
-    method: "POST",
-    path: "/reconstruct",
-    description:
-      "Reconstruct stored embeddings by vector_id for alignment debugging.",
-    requestBody: JSON.stringify({ vector_ids: [0, 1, 2] }, null, 2),
-    responseSchema: `{
-  "embeddings": [[0.012, -0.034, ...], ...]
-}`,
-    curl: `curl -X POST http://localhost:30001/reconstruct \\
-  -H "Content-Type: application/json" \\
-  -d '{"vector_ids": [0, 1, 2]}'`,
-  },
-];
-
-export default function DocsPage() {
-  const [activeEndpoint, setActiveEndpoint] = useState(endpoints[0].path);
-  const active = endpoints.find((e) => e.path === activeEndpoint)!;
-
-  return (
-    <div className="max-w-6xl mx-auto px-6 py-12">
-      <h1 className="font-display text-2xl font-semibold mb-8">
-        API Reference
-      </h1>
-
-      <div className="flex gap-8">
-        {/* Sidebar */}
-        <div className="w-48 flex-shrink-0">
-          <div className="text-[10px] text-muted-foreground uppercase tracking-wider mb-3">
-            Endpoints
-          </div>
-          <div className="space-y-0.5">
-            {endpoints.map((ep) => (
-              <button
-                key={ep.path}
-                onClick={() => setActiveEndpoint(ep.path)}
-                className={`w-full text-left px-3 py-1.5 rounded text-xs transition-colors ${
-                  activeEndpoint === ep.path
-                    ? "bg-accent/10 text-foreground border-l-2 border-accent"
-                    : "text-muted hover:text-foreground"
-                }`}
-              >
-                <span
-                  className={`text-[10px] font-semibold mr-1.5 ${
-                    ep.method === "POST"
-                      ? "text-method-post"
-                      : "text-method-get"
-                  }`}
-                >
-                  {ep.method}
-                </span>
-                {ep.path}
-              </button>
-            ))}
-          </div>
-        </div>
-
-        {/* Main content */}
-        <div className="flex-1 min-w-0">
-          <div className="mb-6">
-            <div className="flex items-center gap-2 mb-2">
-              <span
-                className={`text-xs font-bold px-2 py-0.5 rounded ${
-                  active.method === "POST"
-                    ? "bg-method-post/10 text-method-post"
-                    : "bg-method-get/10 text-method-get"
-                }`}
-              >
-                {active.method}
-              </span>
-              <span className="font-mono text-sm">{active.path}</span>
-            </div>
-            <p className="text-sm text-muted">{active.description}</p>
-          </div>
-
-          {/* Request body */}
-          {active.requestBody && (
-            <div className="mb-6">
-              <div className="text-[10px] text-muted-foreground uppercase tracking-wider mb-2">
-                Request Body
-              </div>
-              <pre className="bg-surface border border-border rounded-lg p-4 font-mono text-xs text-muted overflow-x-auto">
-                {active.requestBody}
-              </pre>
-            </div>
-          )}
-
-          {/* Response schema */}
-          <div className="mb-6">
-            <div className="text-[10px] text-muted-foreground uppercase tracking-wider mb-2">
-              Response
-            </div>
-            <pre className="bg-surface border border-border rounded-lg p-4 font-mono text-xs text-muted overflow-x-auto">
-              {active.responseSchema}
-            </pre>
-          </div>
-
-          {/* curl example */}
-          <div className="mb-6">
-            <div className="text-[10px] text-muted-foreground uppercase tracking-wider mb-2">
-              curl
-            </div>
-            <pre className="bg-surface border border-border rounded-lg p-4 font-mono text-[11px] text-muted overflow-x-auto">
-              {active.curl}
-            </pre>
-          </div>
-
-          {/* Playground */}
-          <ApiPlayground
-            method={active.method}
-            path={active.path}
-            defaultBody={active.requestBody}
-          />
-        </div>
-      </div>
-    </div>
-  );
-}
-```
-
-- [ ] **Step 3: Verify build**
-
-```bash
-cd /home/yichuan/pixelrag/web && npm run build
-```
-
-- [ ] **Step 4: Commit**
-
-```bash
-cd /home/yichuan/pixelrag
-git add web/src/components/ApiPlayground.tsx web/src/app/docs/page.tsx
-git commit -m "feat(web): add API documentation page with live playground"
-```
-
----
-
-## Phase 3: Polish
-
-### Task 14: Loading States and Error Handling
-
-**Files:**
-- Modify: `web/src/app/page.tsx`
-- Modify: `web/src/components/TileCard.tsx`
-
-- [ ] **Step 1: Add skeleton loading state to search page**
-
-In `web/src/app/page.tsx`, add a skeleton component rendered when `isLoading` is true:
-
-```tsx
-{isLoading && (
-  <div className="mt-6 space-y-6">
-    {[1, 2, 3].map((i) => (
-      <div key={i}>
-        <div className="h-4 w-48 bg-surface rounded animate-pulse mb-3" />
-        <div className="flex gap-2.5">
-          {[1, 2, 3].map((j) => (
-            <div
-              key={j}
-              className="min-w-[200px] h-44 bg-surface rounded-lg animate-pulse"
-            />
-          ))}
-        </div>
-      </div>
-    ))}
-  </div>
-)}
-```
-
-Place this right after the status bar section, and wrap the existing results section in `{!isLoading && groups.length > 0 && (...)}` so that the skeleton and the results never render at the same time.
-
-- [ ] **Step 2: Add loading shimmer to TileCard image**
-
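-In `web/src/components/TileCard.tsx`, add a loading state to the image area. At the top of the component body (hooks cannot be declared inside JSX), add the following, importing `useState` from `react` if it is not already imported: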
-
-```tsx
-const [imgLoaded, setImgLoaded] = useState(false);
-```
-
-And in the image container:
-
-```tsx
-{!imgLoaded && !imgError && (
-  <div className="absolute inset-0 bg-surface animate-pulse" />
-)}
-<img
-  // ...existing props...
-  onLoad={() => setImgLoaded(true)}
-  className={`w-full h-full object-cover object-top transition-opacity ${imgLoaded ? "opacity-100" : "opacity-0"}`}
-/>
-```
-
-- [ ] **Step 3: Verify build**
-
-```bash
-cd /home/yichuan/pixelrag/web && npm run build
-```
-
-- [ ] **Step 4: Commit**
-
-```bash
-cd /home/yichuan/pixelrag
-git add web/src/app/page.tsx web/src/components/TileCard.tsx
-git commit -m "feat(web): add loading skeletons and image shimmer"
-```
-
----
-
-### Task 15: Shareable Search URLs
-
-**Files:**
-- Modify: `web/src/app/page.tsx`, `web/src/components/SearchBar.tsx`
-
-- [ ] **Step 1: Sync search state with URL query params**
-
-In `web/src/app/page.tsx`:
-
-1. Add imports:
-   ```tsx
-   import { useSearchParams, useRouter } from "next/navigation";
-   ```
-
-2. Read initial query from URL params:
-   ```tsx
-   const searchParams = useSearchParams();
-   const router = useRouter();
-   const initialQuery = searchParams.get("q") ?? "";
-   ```
-
-3. Pass `initialQuery` to `SearchBar` as a `defaultValue` prop. Update `SearchBar` to accept and use it.
-
-4. After a successful search, update the URL:
-   ```tsx
-   const params = new URLSearchParams();
-   if (query) params.set("q", query);
-   if (searchOptions.n_docs !== 20) params.set("n_docs", String(searchOptions.n_docs));
-   router.replace(`?${params.toString()}`, { scroll: false });
-   ```
-
-5. Trigger a search on mount if `initialQuery` exists:
-   ```tsx
-   useEffect(() => {
-     if (initialQuery) {
-       handleSearch(initialQuery);
-     }
-   }, []); // eslint-disable-line react-hooks/exhaustive-deps
-   ```
-
-- [ ] **Step 2: Update SearchBar to accept defaultValue**
-
-In `web/src/components/SearchBar.tsx`, add to the props:
-
-```tsx
-interface SearchBarProps {
-  onSearch: (query: string, image?: string) => void;
-  isLoading: boolean;
-  defaultValue?: string;
-}
-```
-
-And change the initial state:
-
-```tsx
-const [query, setQuery] = useState(defaultValue ?? "");
-```
-
-- [ ] **Step 3: Wrap page content in Suspense**
-
-In the Next.js App Router, a Client Component that calls `useSearchParams()` must sit inside a Suspense boundary, otherwise `next build` fails for statically rendered pages. Wrap the page export:
-
-```tsx
-import { Suspense } from "react";
-
-function SearchPageContent() {
-  // ... all existing page content
-}
-
-export default function SearchPage() {
-  return (
-    <Suspense>
-      <SearchPageContent />
-    </Suspense>
-  );
-}
-```
-
-- [ ] **Step 4: Verify build**
-
-```bash
-cd /home/yichuan/pixelrag/web && npm run build
-```
-
-- [ ] **Step 5: Commit**
-
-```bash
-cd /home/yichuan/pixelrag
-git add web/src/app/page.tsx web/src/components/SearchBar.tsx
-git commit -m "feat(web): add shareable search URLs via query params"
-```
-
----
-
-### Task 16: Keyboard Navigation
-
-**Files:**
-- Modify: `web/src/components/SearchBar.tsx`
-
-- [ ] **Step 1: Add Cmd+K / Ctrl+K shortcut to focus search bar**
-
-In `web/src/components/SearchBar.tsx`:
-
-1. Add a ref to the input: `const inputRef = useRef<HTMLInputElement>(null);`
-2. Attach the ref to the `<input>` element and add a global keyboard listener that focuses it:
-
-```tsx
-useEffect(() => {
-  const handleKey = (e: KeyboardEvent) => {
-    if ((e.metaKey || e.ctrlKey) && e.key.toLowerCase() === "k") {
-      e.preventDefault();
-      inputRef.current?.focus();
-    }
-  };
-  window.addEventListener("keydown", handleKey);
-  return () => window.removeEventListener("keydown", handleKey);
-}, []);
-```
-
-3. Show a `⌘K` badge next to the input as a hint for the shortcut.
-
-In the search bar, next to the input, add:
-
-```tsx
-<kbd className="absolute right-10 top-1/2 -translate-y-1/2 text-[10px] text-muted-foreground bg-background border border-border rounded px-1 py-0.5 pointer-events-none hidden sm:block">
-  ⌘K
-</kbd>
-```
-
-Adjust the image upload button position to not overlap with the kbd hint.
-
-- [ ] **Step 2: Verify build**
-
-```bash
-cd /home/yichuan/pixelrag/web && npm run build
-```
-
-- [ ] **Step 3: Commit**
-
-```bash
-cd /home/yichuan/pixelrag
-git add web/src/components/SearchBar.tsx
-git commit -m "feat(web): add Cmd+K keyboard shortcut to focus search"
-```
-
----
-
-### Task 17: Final Integration Test
-
-- [ ] **Step 1: Build production bundle**
-
-```bash
-cd /home/yichuan/pixelrag/web && npm run build
-```
-
-Expected: Build succeeds with no errors.
-
-- [ ] **Step 2: Type check**
-
-```bash
-cd /home/yichuan/pixelrag/web && npx tsc --noEmit
-```
-
-Expected: No type errors.
-
-- [ ] **Step 3: Start dev server and manually verify**
-
-```bash
-cd /home/yichuan/pixelrag/web && npm run dev
-```
-
-Open http://localhost:3000 in a browser and verify:
-- Nav bar renders with logo and links
-- Search input is focused, Cmd+K works
-- All three pages load: `/`, `/docs`, `/status`
-- Dark theme with indigo accent throughout
-- Fonts load correctly (Inter for body, Crimson Pro for headings)
-
-Note: Search results require the FastAPI backend running on port 30001. Without it, a search shows an error state; verify that this state renders correctly as well.
-
-- [ ] **Step 4: Final commit**
-
-```bash
-cd /home/yichuan/pixelrag
-git add -A web/
-git commit -m "feat(web): PixelRAG frontend — complete implementation"
-```
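The shareable-URL logic from Task 15 of the plan above can be checked in isolation. The following is a minimal sketch, not the plan's implementation: `toQueryString`, `fromQueryString`, `SearchState`, and `DEFAULT_N_DOCS` are hypothetical names introduced here, and restoring `n_docs` from the URL is an assumption that goes beyond the plan, which only reads `q` on mount. The default of 20 is taken from the `!== 20` check in Task 15.

```typescript
// Hypothetical helpers; the plan inlines the serialization in page.tsx.
const DEFAULT_N_DOCS = 20; // implied by the `!== 20` check in Task 15

interface SearchState {
  query: string;
  nDocs: number;
}

// Serialize search state, omitting an empty query and the default n_docs
// so that the common case produces the shortest possible URL.
function toQueryString(state: SearchState): string {
  const params = new URLSearchParams();
  if (state.query) params.set("q", state.query);
  if (state.nDocs !== DEFAULT_N_DOCS) {
    params.set("n_docs", String(state.nDocs));
  }
  return params.toString();
}

// Parse a query string back into search state (assumption: the plan itself
// only restores `q`). Missing or invalid values fall back to the defaults.
function fromQueryString(qs: string): SearchState {
  const params = new URLSearchParams(qs);
  const raw = params.get("n_docs");
  const parsed = raw === null ? NaN : Number.parseInt(raw, 10);
  return {
    query: params.get("q") ?? "",
    nDocs: Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_N_DOCS,
  };
}
```

In the page component, the serialized string would be passed to `router.replace` as in Task 15, for example `router.replace(`?${toQueryString(state)}`, { scroll: false })`.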
diff --git a/docs/superpowers/plans/2026-05-27-chromium-build-centralia.md b/docs/superpowers/plans/2026-05-27-chromium-build-centralia.md
deleted file mode 100644
index 5287759..0000000
--- a/docs/superpowers/plans/2026-05-27-chromium-build-centralia.md
+++ /dev/null
@@ -1,438 +0,0 @@
-# Chromium Build on Centralia Implementation Plan
-
-> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
-
-**Goal:** Build a patched Chromium (v150.0.7844.0) on Centralia (SSH: `CentraliaB200`, user `yichuan_wang`) with our two custom CDP features: `rawFilePath` parameter for `Page.captureScreenshot` and `directClip` parameter for parallel tile capture.
-
-**Architecture:** All work happens under `/work/yichuan_wang/chromium-build/` on Centralia (NFS-mounted /work has 12TB free). depot_tools is cloned alongside the chromium checkout. The patch is generated locally from `~/chromium/src` (HEAD~1 diff) and transferred via scp. Build uses a release/official/no-debug/no-PGO args.gn for a fast, deployable binary.
-
-**Tech Stack:** Chromium source (~30GB no-history), depot_tools, gn, autoninja (ninja), Python 3.12 (already on Centralia), Ubuntu 24.04, 224 cores for parallel build.
-
----
-
-## Environment Facts (verified pre-plan)
-
-- **Centralia SSH alias:** `CentraliaB200` (user `yichuan_wang`)
-- **Workspace:** `/work/yichuan_wang/chromium-build/` (NFS, ~12TB free — plenty of room)
-- **Local disk on Centralia:** `/dev/md0` 209GB free — avoid storing large files there
-- **Local patch source:** `~/chromium/src` on local machine, `HEAD~1` diff covers 6 files, 166 insertions
-- **Chromium version:** 150.0.7844.0 (MAJOR=150, BUILD=7844)
-- **OS on Centralia:** Ubuntu 24.04.4 LTS (Noble)
-- **Python:** `/usr/bin/python3` (3.12.3) — already present, no install needed
-- **ninja/autoninja:** NOT present on Centralia — comes from depot_tools, added to PATH
-
----
-
-## File Map
-
-| Location | Purpose |
-|---|---|
-| `/work/yichuan_wang/chromium-build/depot_tools/` | Google's build tools (gn, fetch, autoninja, gclient) |
-| `/work/yichuan_wang/chromium-build/chromium/` | gclient checkout root (contains `.gclient`) |
-| `/work/yichuan_wang/chromium-build/chromium/src/` | Chromium source tree |
-| `/work/yichuan_wang/chromium-build/chromium/src/out/Release/` | Build output dir |
-| `/work/yichuan_wang/chromium-build/chromium/src/out/Release/args.gn` | Build configuration |
-| `/work/yichuan_wang/chromium-build/chromium_patches.diff` | Our custom patch (transferred from local) |
-| `/work/yichuan_wang/chromium-build/build.log` | autoninja build log (stream with `tail -f`) |
-
----
-
-## Task 1: Verify workspace and install depot_tools
-
-**Files:**
-- Create: `/work/yichuan_wang/chromium-build/` (directory)
-- Create: `/work/yichuan_wang/chromium-build/depot_tools/` (git clone)
-
-- [ ] **Step 1.1: Create workspace directory on Centralia**
-
-```bash
-ssh CentraliaB200 "mkdir -p /work/yichuan_wang/chromium-build && echo 'workspace ready'"
-```
-
-Expected output: `workspace ready`
-
-- [ ] **Step 1.2: Clone depot_tools into workspace**
-
-Note: the correct URL is `chromium/tools/depot_tools` (not `chromium/depot_tools`).
-
-```bash
-ssh CentraliaB200 "git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git /work/yichuan_wang/chromium-build/depot_tools"
-```
-
-Expected: clone completes, last line something like `Resolving deltas: 100%`.
-
-- [ ] **Step 1.3: Verify depot_tools tools exist**
-
-```bash
-ssh CentraliaB200 "ls /work/yichuan_wang/chromium-build/depot_tools/fetch /work/yichuan_wang/chromium-build/depot_tools/gclient /work/yichuan_wang/chromium-build/depot_tools/autoninja"
-```
-
-Expected: three file paths printed (no errors).
-
----
-
-## Task 2: Fetch Chromium source (no history)
-
-This is the longest step — `fetch --no-history chromium` downloads ~30GB and runs `gclient sync`. With a fast connection it takes 30–90 minutes.
-
-**Files:**
-- Create: `/work/yichuan_wang/chromium-build/chromium/` (gclient root)
-- Create: `/work/yichuan_wang/chromium-build/chromium/src/` (source tree, ~30GB)
-
-- [ ] **Step 2.1: Create the chromium checkout directory**
-
-```bash
-ssh CentraliaB200 "mkdir -p /work/yichuan_wang/chromium-build/chromium"
-```
-
-- [ ] **Step 2.2: Start the fetch in a detached screen session**
-
-`fetch` must run from the chromium checkout root dir. We use `screen` so SSH disconnection doesn't kill it. The output is redirected to a log file for monitoring.
-
-IMPORTANT: The home dir (`/home/eecs/yichuan_wang`) is full (10GB NFS, 0 bytes free). Set XDG dirs to /work to prevent depot_tools from failing on `~/.config/depot_tools`. Also put both `depot_tools/.cipd_bin` and `depot_tools` in PATH — vpython3 needs cipd in PATH.
-
-```bash
-ssh CentraliaB200 "screen -dmS chromium_fetch bash -c '
-  export XDG_CONFIG_HOME=/work/yichuan_wang/chromium-build/xdg/config
-  export XDG_CACHE_HOME=/work/yichuan_wang/chromium-build/xdg/cache
-  export XDG_DATA_HOME=/work/yichuan_wang/chromium-build/xdg/data
-  export XDG_STATE_HOME=/work/yichuan_wang/chromium-build/xdg/state
-  export PATH=/work/yichuan_wang/chromium-build/depot_tools/.cipd_bin:/work/yichuan_wang/chromium-build/depot_tools:\$PATH
-  export DEPOT_TOOLS_DIR=/work/yichuan_wang/chromium-build/depot_tools
-  cd /work/yichuan_wang/chromium-build/chromium
-  echo \"FETCH_START \$(date)\" > /work/yichuan_wang/chromium-build/fetch.log
-  fetch --no-history chromium >> /work/yichuan_wang/chromium-build/fetch.log 2>&1
-  echo \"FETCH_DONE exit=\$? at \$(date)\" >> /work/yichuan_wang/chromium-build/fetch.log
-'"
-```
-
-- [ ] **Step 2.3: Verify screen session started**
-
-```bash
-ssh CentraliaB200 "screen -ls | grep chromium_fetch"
-```
-
-Expected: a line like `12345.chromium_fetch  (Detached)`.
-
-- [ ] **Step 2.4: Monitor fetch progress (check periodically)**
-
-```bash
-ssh CentraliaB200 "tail -20 /work/yichuan_wang/chromium-build/fetch.log"
-```
-
-Re-run this command to watch progress. Fetch is done when the log contains `FETCH_DONE exit=0`.
-
-- [ ] **Step 2.5: Verify source tree exists after fetch completes**
-
-```bash
-ssh CentraliaB200 "ls /work/yichuan_wang/chromium-build/chromium/src/chrome/VERSION"
-```
-
-Expected: file path printed (no error). If the file doesn't exist, fetch failed — check `fetch.log` for errors.
-
-- [ ] **Step 2.6: Verify Chromium version matches local**
-
-```bash
-ssh CentraliaB200 "cat /work/yichuan_wang/chromium-build/chromium/src/chrome/VERSION"
-```
-
-Expected:
-```
-MAJOR=150
-MINOR=0
-BUILD=7844
-PATCH=0
-```
-
-If the version differs, the patch may not apply cleanly. Record the actual version for the patch step.
-
----
-
-## Task 3: Transfer and apply our patches
-
-Our patch adds two CDP features to `Page.captureScreenshot`:
-- `rawFilePath`: write screenshot directly to a file path (bypassing base64 encoding)
-- `directClip`: clip parameter for parallel tile capture
-
-**Files:**
-- Modify: `content/browser/devtools/protocol/page_handler.cc` (primary patch target)
-- Modify: `content/browser/devtools/protocol/page_handler.h`
-- Modify: `content/renderer/render_widget_host/render_widget_host_impl.cc`
-- Modify: `content/renderer/render_widget_host/render_widget_host_impl.h`
-- Modify: `third_party/blink/public/devtools_protocol/domains/Page.pdl`
-- Modify: `third_party/blink/renderer/platform/widget/widget_base.cc`
-- Transfer: `/work/yichuan_wang/chromium-build/chromium_patches.diff`
-
-- [ ] **Step 3.1: Generate the patch from local machine**
-
-Run this on the LOCAL machine:
-
-```bash
-git -C ~/chromium/src diff HEAD~1 > /tmp/chromium_patches.diff
-wc -l /tmp/chromium_patches.diff
-```
-
-Expected: file is non-empty (~300+ lines).
-
-- [ ] **Step 3.2: Transfer patch to Centralia**
-
-Run this on the LOCAL machine:
-
-```bash
-scp /tmp/chromium_patches.diff CentraliaB200:/work/yichuan_wang/chromium-build/chromium_patches.diff
-```
-
-- [ ] **Step 3.3: Verify patch arrived on Centralia**
-
-```bash
-ssh CentraliaB200 "wc -l /work/yichuan_wang/chromium-build/chromium_patches.diff"
-```
-
-Expected: same line count as local.
-
-- [ ] **Step 3.4: Apply the patch**
-
-```bash
-ssh CentraliaB200 "cd /work/yichuan_wang/chromium-build/chromium/src && git apply /work/yichuan_wang/chromium-build/chromium_patches.diff"
-```
-
-Expected: no output (silent success). If you see errors like "patch does not apply", see Step 3.5.
-
-- [ ] **Step 3.5: (If patch fails) Try with --3way or check fuzz**
-
-If Step 3.4 fails with "patch does not apply":
-
-```bash
-ssh CentraliaB200 "cd /work/yichuan_wang/chromium-build/chromium/src && git apply --3way /work/yichuan_wang/chromium-build/chromium_patches.diff"
-```
-
-If that also fails, the Chromium version on Centralia differs from the local checkout. Check `cat /work/yichuan_wang/chromium-build/chromium/src/chrome/VERSION` and compare to local (`cat ~/chromium/src/chrome/VERSION`). If the versions differ significantly, the patch must be regenerated against a matching base: `git -C ~/chromium/src rev-parse HEAD~1` prints the base commit the local patch was generated against (the parent of the patch commit, not HEAD itself), so either sync the Centralia checkout to that commit or rebase the local change onto Centralia's revision and regenerate the diff.
-
-- [ ] **Step 3.6: Verify patch was applied**
-
-```bash
-ssh CentraliaB200 "cd /work/yichuan_wang/chromium-build/chromium/src && git diff --stat"
-```
-
-Expected output:
-```
- content/browser/devtools/protocol/page_handler.cc  | 129 +++++...
- content/browser/devtools/protocol/page_handler.h   |  14 +++
- content/renderer/render_widget_host/render_widget_host_impl.cc  |   9 ++
- content/renderer/render_widget_host/render_widget_host_impl.h   |   5 +
- third_party/blink/public/devtools_protocol/domains/Page.pdl     |   6 +
- third_party/blink/renderer/platform/widget/widget_base.cc       |  12 +-
- 6 files changed, 166 insertions(+), 9 deletions(-)
-```
-
----
-
-## Task 4: Configure the build
-
-**Files:**
-- Create: `/work/yichuan_wang/chromium-build/chromium/src/out/Release/` (directory)
-- Create: `/work/yichuan_wang/chromium-build/chromium/src/out/Release/args.gn`
-
-- [ ] **Step 4.1: Create the build output directory**
-
-```bash
-ssh CentraliaB200 "mkdir -p /work/yichuan_wang/chromium-build/chromium/src/out/Release"
-```
-
-- [ ] **Step 4.2: Write args.gn**
-
-```bash
-ssh CentraliaB200 "cat > /work/yichuan_wang/chromium-build/chromium/src/out/Release/args.gn << 'EOF'
-is_debug = false
-is_official_build = true
-is_component_build = false
-symbol_level = 0
-blink_symbol_level = 0
-chrome_pgo_phase = 0
-EOF"
-```
-
-- [ ] **Step 4.3: Verify args.gn content**
-
-```bash
-ssh CentraliaB200 "cat /work/yichuan_wang/chromium-build/chromium/src/out/Release/args.gn"
-```
-
-Expected exact output:
-```
-is_debug = false
-is_official_build = true
-is_component_build = false
-symbol_level = 0
-blink_symbol_level = 0
-chrome_pgo_phase = 0
-```
-
----
-
-## Task 5: Run gn gen
-
-`gn gen` reads `args.gn` and generates all the ninja build files. This takes 2–5 minutes on 224 cores.
-
-**Files:**
-- Create: `/work/yichuan_wang/chromium-build/chromium/src/out/Release/build.ninja` (generated)
-- Create: `/work/yichuan_wang/chromium-build/gn_gen.log`
-
-- [ ] **Step 5.1: Run gn gen in a screen session**
-
-```bash
-ssh CentraliaB200 "screen -dmS chromium_gn bash -c '
-  export PATH=/work/yichuan_wang/chromium-build/depot_tools:\$PATH
-  cd /work/yichuan_wang/chromium-build/chromium/src
-  gn gen out/Release > /work/yichuan_wang/chromium-build/gn_gen.log 2>&1
-  echo \"GN_DONE exit=\$?\" >> /work/yichuan_wang/chromium-build/gn_gen.log
-'"
-```
-
-- [ ] **Step 5.2: Wait for gn gen to finish**
-
-```bash
-ssh CentraliaB200 "tail -5 /work/yichuan_wang/chromium-build/gn_gen.log"
-```
-
-Re-run until you see `GN_DONE exit=0`. If exit is non-zero, check the full log:
-
-```bash
-ssh CentraliaB200 "cat /work/yichuan_wang/chromium-build/gn_gen.log"
-```
-
-Common gn errors and fixes:
-- `Python not found`: verify `which python3` works on Centralia (it does per our check)
-- `No targets match`: args.gn typo — re-check Step 4.2
-
-- [ ] **Step 5.3: Verify build.ninja was generated**
-
-```bash
-ssh CentraliaB200 "ls -lh /work/yichuan_wang/chromium-build/chromium/src/out/Release/build.ninja"
-```
-
-Expected: file exists, non-zero size.
-
----
-
-## Task 6: Build chrome with autoninja
-
-This is the main build step. With 224 cores and no debug symbols, expect 60–120 minutes for a full build. The output is the `chrome` binary.
-
-**Files:**
-- Create: `/work/yichuan_wang/chromium-build/chromium/src/out/Release/chrome` (built binary)
-- Create: `/work/yichuan_wang/chromium-build/build.log`
-
-- [ ] **Step 6.1: Start autoninja build in screen session**
-
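-`autoninja` automatically sets `-j` based on CPU count (will use ~224 jobs). Progress appears as ninja's normal `[done/total]` status lines; if the `NINJA_SUMMARIZE_BUILD=1` env var is set, autoninja additionally prints a summary of build step times after the build finishes.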
-
-```bash
-ssh CentraliaB200 "screen -dmS chromium_build bash -c '
-  export PATH=/work/yichuan_wang/chromium-build/depot_tools:\$PATH
-  cd /work/yichuan_wang/chromium-build/chromium/src
-  autoninja -C out/Release chrome > /work/yichuan_wang/chromium-build/build.log 2>&1
-  echo \"BUILD_DONE exit=\$?\" >> /work/yichuan_wang/chromium-build/build.log
-'"
-```
-
-- [ ] **Step 6.2: Verify screen session started**
-
-```bash
-ssh CentraliaB200 "screen -ls | grep chromium_build"
-```
-
-Expected: a line like `12345.chromium_build  (Detached)`.
-
-- [ ] **Step 6.3: Monitor build progress**
-
-```bash
-ssh CentraliaB200 "tail -5 /work/yichuan_wang/chromium-build/build.log"
-```
-
-You'll see ninja progress lines like `[1234/89000] CXX obj/content/...`. Re-run every few minutes to watch progress. Build is done when you see `BUILD_DONE exit=0`.
-
-To watch CPU utilization:
-```bash
-ssh CentraliaB200 "uptime"
-```
-
-If the build is running, load average should be ~200+.
-
-- [ ] **Step 6.4: Check for build errors (if BUILD_DONE shows non-zero exit)**
-
-```bash
-ssh CentraliaB200 "grep -i 'error:' /work/yichuan_wang/chromium-build/build.log | tail -20"
-```
-
-Common errors:
-- `undefined reference`: usually means a `.h` change wasn't matched with a `.cc` change in the patch. Check that the patch applied fully (Task 3).
-- `ninja: build stopped`: check the lines above for the actual C++ error.
-- Disk full: run `df -h /work` — if /work is at 100%, free space or use a different path.
-
-- [ ] **Step 6.5: Verify chrome binary exists and is executable**
-
-```bash
-ssh CentraliaB200 "ls -lh /work/yichuan_wang/chromium-build/chromium/src/out/Release/chrome"
-```
-
-Expected: file ~200–300MB, executable bit set (permissions like `-rwxr-xr-x`).
-
-- [ ] **Step 6.6: Smoke-test the binary**
-
-```bash
-ssh CentraliaB200 "/work/yichuan_wang/chromium-build/chromium/src/out/Release/chrome --version"
-```
-
-Expected: `Chromium 150.0.7844.0` (or similar version line).
-
-Note: Chrome may print warnings about display/GPU on a headless server — that's normal. We care only that the binary runs and prints its version.
-
----
-
-## Task 7: Verify our custom CDP features are compiled in
-
-Our patch adds `rawFilePath` and `directClip` parameters to `Page.captureScreenshot`. We verify they made it into the compiled protocol.
-
-**Files:**
-- Read: `/work/yichuan_wang/chromium-build/chromium/src/out/Release/gen/third_party/blink/public/devtools_protocol/protocol/page.json` (generated protocol JSON)
-
-- [ ] **Step 7.1: Check the generated protocol JSON for our parameters**
-
-```bash
-ssh CentraliaB200 "grep -n 'rawFilePath\|directClip' /work/yichuan_wang/chromium-build/chromium/src/out/Release/gen/third_party/blink/public/devtools_protocol/protocol/page.json"
-```
-
-Expected: at least 2 lines mentioning `rawFilePath` and `directClip`.
-
-- [ ] **Step 7.2: Check the compiled binary for our parameter strings**
-
-```bash
-ssh CentraliaB200 "strings /work/yichuan_wang/chromium-build/chromium/src/out/Release/chrome | grep -c 'rawFilePath'"
-```
-
-Expected: at least 1 (the string is embedded in the binary). If 0, the patch didn't compile into the binary — recheck that `git diff --stat` in Task 3 Step 6 was correct.
-
----
-
-## Timing Estimates
-
-| Task | Estimated Duration |
-|---|---|
-| Task 1: depot_tools clone | 2–5 min |
-| Task 2: `fetch --no-history chromium` | 30–90 min (network-dependent) |
-| Task 3: patch transfer + apply | 2 min |
-| Task 4: args.gn setup | 1 min |
-| Task 5: `gn gen` | 3–7 min |
-| Task 6: `autoninja` build | 60–120 min (224 cores, no debug) |
-| Task 7: verification | 2 min |
-| **Total** | **~2–4 hours** |
-
----
-
-## Recovery Notes
-
-- **If screen session dies unexpectedly:** Re-attach with `screen -r chromium_build` to see any final error, then restart from the last completed step.
-- **If fetch is interrupted:** `fetch` refuses to run in a directory that already contains a checkout, so resume with `gclient sync --no-history` from the same directory instead.
-- **If build is interrupted:** Re-run `autoninja -C out/Release chrome` — ninja tracks completed targets and resumes from where it left off.
-- **If /work fills up:** `du -sh /work/yichuan_wang/chromium-build/chromium/src/out/Release/obj/` is usually the largest dir. Consider deleting `.o` files after build if you only need the final binary: `find out/Release/obj -name '*.o' -delete`.
diff --git a/docs/superpowers/specs/2026-05-11-pixelrag-restructure-design.md b/docs/superpowers/specs/2026-05-11-pixelrag-restructure-design.md
deleted file mode 100644
index 8e73af1..0000000
--- a/docs/superpowers/specs/2026-05-11-pixelrag-restructure-design.md
+++ /dev/null
@@ -1,359 +0,0 @@
-# PixelRAG Unirepo Restructure — Design Spec
-
-**Date:** 2026-05-11
-**Status:** Draft
-
-## Context
-
-PixelRAG is a visual document retrieval framework: any document (web page, PDF, image) → visual rendering → embedding → FAISS index → search API. Three private repos (yichuan-w/Vis-RAG, andylizf/wiki-screenshot, andylizf/wiki-screenshot-training) are being merged into a single public repo with a clean architecture.
-
-The current repo at ~/pixelrag/ has a messy first-pass merge. This spec defines the target architecture.
-
-## Users
-
-**A — Framework user (primary):** "I have documents (web pages, PDFs, local files) and want to build a visual retrieval system." Needs the full pipeline. Cares about generality — not a Wikipedia-specific tool.
-
-**B — Paper reproducer:** "I want to reproduce PixelRAG results." Downloads pre-built indexes, starts the search API, runs eval. Should exist but not the design focus.
-
-**C — Agent developer:** Uses screenshot capture and visual search as agent skills/tools. Needs callable APIs: give URL → get screenshot, give query → get results. Will demo agent integration.
-
-**D — Model trainer:** Trains visual embedding models. Contrastive learning + hard negative mining via search API. Should exist but not the design focus.
-
-## Package Architecture
-
-Five packages, single-direction dependencies (`ingest` in the diagram and in the config sections below refers to the `pixelrag-render` package):
-
-```
-ingest ←── index ──→ embed
-
-serve (independent)
-
-train → serve (API calls for mining)
-```
-
-### Package 1: pixelrag-render
-
-**"Document → image tiles."** Standalone rendering tool. Agents call it directly; index calls it for batch jobs.
-
-```
-src/pixelrag_render/
-├── render.py              # Public API:
-│                          #   render_url(url, output_dir, backend="cdp") → list[Path]
-│                          #   render_pdf(path, output_dir) → list[Path]
-│                          #   render_file(path, output_dir) → list[Path]  (auto-detect)
-├── backends/
-│   ├── cdp.py             # Lean CDP capture — default, fastest
-│   │                      # Direct Page.captureScreenshot, multi-browser workers
-│   │                      # JPEG q85, DPR 1, fromSurface=False, optimizeForSpeed=True
-│   │                      # Based on render_news_pages.py (23.9s/50 articles benchmark)
-│   ├── playwright.py      # Full Playwright — more options, experimental/compat
-│   │                      # Stripped to production-useful config only
-│   │                      # Keeps: CDP screenshot mode, segmented tiles, GPU rasterization
-│   │                      # Removes: unused experimental options
-│   └── pdf.py             # PDF → page images (pdf2image or PyMuPDF)
-└── bench/                 # Rendering benchmarks
-    ├── benchmark.py               # Config sweep (workers, batch size, concurrency)
-    ├── benchmark_optimizations.py # GPU accel, PNG compression, tile sizes
-    ├── benchmark_fullpage.py      # Screenshot strategy comparison
-    └── benchmark_longtail_matrix.py  # Long pages × tile size × concurrency
-```
-
-**Dependencies:** playwright, pillow, aiohttp (lightweight — no torch)
-
-**CLI entry points:**
-- `pixelrag-render` → `pixelrag_render.render:main` (render URLs/files to tiles)
-
-**Source of code:**
-- `cdp.py` ← `scripts/render_news_pages.py` capture_article + worker + multi-browser setup (generalized, news-specific parts removed)
-- `playwright.py` ← `tools/playwright_tool.py` (stripped from 2388L to production-relevant config)
-- `bench/` ← `bench/` directory (kept as-is)
-- `render.py` — new, thin API layer that dispatches to backends
-
-### Package 2: pixelrag-embed
-
-**"Image tiles → vectors → FAISS index."** Three independent CLI tools, orchestrator-free. Each has its own `main()`, no imports between them.
-
-```
-src/pixelrag_embed/
-├── chunk.py     # Large image → 1024px strips
-│                # Input: tile directory. Output: chunk PNGs + chunks.json
-│                # Pure PIL, no torch. ~380 lines.
-├── embed.py     # Images → embedding vectors
-│                # Input: chunk directory. Output: shard_NNN.npz
-│                # vLLM/sglang backend, multi-GPU. ~2400 lines.
-└── index.py     # Vectors → FAISS IVFFlat index
-                 # Input: embedding .npz shards. Output: index.faiss + metadata.npz
-                 # ~330 lines.
-```
-
-**Dependencies:** torch, transformers, faiss-cpu, pillow, numpy, tqdm
-
-**CLI entry points:**
-- `pixelrag-chunk` → `pixelrag_embed.chunk:main`
-- `pixelrag-embed` → `pixelrag_embed.embed:main`
-- `pixelrag-build-index` → `pixelrag_embed.index:main`
-
-**Source of code:**
-- `chunk.py` ← `embedding/chunk_tiles.py`
-- `embed.py` ← `embedding/embed_tiles.py`
-- `index.py` ← `indexing/build_index.py`
-
-### Package 3: pixelrag-index
-
-**"Data source → complete searchable index."** Orchestration layer. Knows how to chain ingest + embed for different data sources. Two modes: single-machine (default, no S3) and distributed (S3 coordination for multi-machine).
-
-```
-src/pixelrag_index/
-├── config.py          # pixelrag.yaml parser
-│                      # Defines: source type, paths, embed model, output location
-├── sources/           # Data source iterators (yield items for ingest to render)
-│   ├── kiwix.py       #   Wikipedia ZIM → iterate articles → call ingest per article
-│   ├── web.py         #   URL list/sitemap → download HTML+assets → call ingest
-│   │                  #   Download + SQLite state are internal to this source
-│   │                  #   Includes presets (e.g. "news") with per-domain rate limits,
-│   │                  #   cookie banner CSS, source-specific HTML handling (BBC/CNN/AP)
-│   │                  #   Usage: --source web --preset news
-│   ├── pdf.py         #   PDF directory → iterate files → call ingest per file
-│   └── local.py       #   Scan directory → auto-detect file types → route to above
-├── pipelines.py       # End-to-end: source → ingest → chunk → embed → build
-│                      # Chains the stages, handles checkpointing between stages
-├── distributed.py     # S3ShardCoordinator + claim-loop worker (optional)
-│                      # Only used with --distributed flag
-│                      # Used by both capture and embedding distributed runs
-└── monitor.py         # Cross-machine progress dashboard (reads S3 claims)
-                       # Only relevant in distributed mode
-```
-
-**Two orchestration modes:**
-- `pixelrag-index build --source ./my_docs` — single machine, iterate locally, no S3
-- `pixelrag-index build --source kiwix --distributed --bucket my-bucket` — multi-machine, S3 coordination
-
-**Dependencies:** pixelrag-render, pixelrag-embed, boto3 (optional, only for distributed), tqdm
-
-**CLI entry points:**
-- `pixelrag-index` → `pixelrag_index.pipelines:main` (build index from source)
-- `pixelrag-monitor` → `pixelrag_index.monitor:main` (progress dashboard)
-
-**pixelrag.yaml — parameter forwarding pattern:**
-
-Each section's parameters are forwarded directly to the corresponding package. Index only manages orchestration order, not parameter details.
-
-```python
-# index/config.py — forwarding logic
-source_type = config["source"].pop("type")
-source = SOURCES[source_type](**config["source"])   # forward all source params
-# ingest params forwarded to render calls
-# embed params forwarded to chunk/embed/build calls
-```
-
-```yaml
-# Example: local files (User A)
-source:
-  type: local
-  path: ./my_docs
-
-ingest:
-  backend: cdp
-  quality: 85
-
-embed:
-  model: Qwen/Qwen3-VL-Embedding-2B
-  device: cuda
-  gpu_ids: [0, 1, 2, 3]
-  batch_size: 128
-
-output: ./my_index
-```
-
-```yaml
-# Example: web URLs with news preset
-source:
-  type: web
-  urls: ./urls.txt
-  preset: news
-  concurrency: 200
-
-ingest:
-  backend: cdp
-
-embed:
-  model: Qwen/Qwen3-VL-Embedding-2B
-  gpu_ids: [0, 1]
-
-output: ./news_index
-```
-
-```yaml
-# Example: PDF collection
-source:
-  type: pdf
-  path: ./papers/
-  dpi: 300
-  pages: "1-10"
-
-embed:
-  model: Qwen/Qwen3-VL-Embedding-2B
-  device: cpu
-
-output: ./paper_index
-```
-
-```yaml
-# Example: Wikipedia (distributed)
-source:
-  type: kiwix
-  zim: ./wikipedia.zim
-  serve_url: http://localhost:9454
-
-distributed:
-  bucket: my-bucket
-  prefix: kiwix
-
-embed:
-  model: Qwen/Qwen3-VL-Embedding-2B
-  gpu_ids: [0, 1, 2, 3, 4, 5, 6, 7]
-  backend: sglang
-
-output: s3://my-bucket/index
-```
-
-**Source of code:**
-- `distributed.py` ← `coordinator.py` (S3ShardCoordinator) + claim loop from `coordinator_worker.py` and `embedding_worker.py`
-- `sources/kiwix.py` ← `datasources/kiwix.py` (article iteration logic)
-- `sources/web.py` ← `datasources/news.py` + `news/download.py` + `news/db.py` (generalized, news-specific naming removed; download + SQLite state are internal implementation details of this source)
-- `sources/local.py` — new
-- `sources/pdf.py` — new (thin, delegates rendering to ingest)
-- `pipelines.py` ← new, chains stages
-- `monitor.py` ← `scripts/monitor_global.py`
-- `config.py` — new
-
-### Package 4: pixelrag-serve
-
-**"FAISS index → search API."** One unified FastAPI server that serves any index.
-
-```
-src/pixelrag_serve/
-└── api.py        # Unified search API
-                  # POST /search — text/image/embedding queries → top-k results
-                  # GET /health, GET /status
-                  # Configurable via CLI args or env vars
-                  # Supports CPU and CUDA for query embedding
-```
-
-**Dependencies:** fastapi, uvicorn, faiss-cpu, torch, transformers, pillow, numpy
-
-**CLI entry points:**
-- `pixelrag-serve` → `pixelrag_serve.api:main`
-
-**Source of code:**
-- `api.py` ← merge of `search_api.py` + `text_search_api.py` + `news_search_api.py` into one unified API. Hex ID mapping handled at index build time (in pixelrag-embed), not at serve time.
-
-### Package 5: pixelrag-train
-
-**"Train visual embedding models."**
-
-```
-src/pixelrag_train/
-├── models/
-│   └── biqwen3.py       # BiQwen3: Qwen3VLModel + last-token pooling + L2 norm
-├── contrastive.py       # GradCache contrastive training with LoRA/DoRA
-└── mine.py              # Hard negative mining (calls serve API)
-                         # Unified: image mining (:30888) + text mining (:30889)
-```
-
-**Dependencies:** torch, transformers, peft, accelerate, wandb, faiss-cpu
-
-**CLI entry points:**
-- `pixelrag-train` → `pixelrag_train.contrastive:main`
-- `pixelrag-mine` → `pixelrag_train.mine:main`
-
-**Source of code:**
-- `biqwen3.py` ← `models/biqwen3.py` (unchanged)
-- `contrastive.py` ← `train_contrastors.py` (renamed)
-- `mine.py` ← merge of `mine_hard_negatives.py` + `mine_text_hard_negatives.py`
-
-### eval/
-
-Not a package. Script directory for paper reproduction (User B).
-
-```
-eval/
-├── run_naive_simpleqa.py    # Main eval runner
-└── simpleqa/                # Support library (data, llm, retrieval, etc.)
-```
-
-Source: kept from current repo, unchanged.
-
-## Dependency Graph
-
-```
-pixelrag-index
-├── pixelrag-render (calls render_url/render_pdf for capture stage)
-├── pixelrag-embed (calls chunk/embed/index tools)
-└── boto3 (S3 coordination)
-
-pixelrag-serve (independent — no deps on other pixelrag packages)
-
-pixelrag-train
-└── calls pixelrag-serve API over HTTP (not a Python dependency)
-
-pixelrag-render (independent)
-pixelrag-embed (independent)
-```
-
-## Data Flow
-
-```
-User A: "Build me a visual search index"
-
-  pixelrag-index build --source ./my_docs
-       │
-       ├─ sources/local.py scans directory, classifies files
-       │
-       ├─ For each document:
-       │    pixelrag-render.render_url() or render_pdf()
-       │    → tiles/{doc_id}.tiles/tile_0000.jpg, tile_0001.jpg, ...
-       │
-       ├─ pixelrag-embed.chunk
-       │    → chunks/{doc_id}.tiles/chunk_0000_00.png, ...
-       │
-       ├─ pixelrag-embed.embed (GPU)
-       │    → embeddings/shard_NNN.npz
-       │
-       └─ pixelrag-embed.index
-            → output/index.faiss + metadata.npz
-
-  pixelrag-serve --index-dir ./output --port 30888
-       → POST /search {"queries": [{"text": "..."}]} → top-k results
-```
-
-## What Gets Cut
-
-From the current ~/pixelrag/ repo:
-- `packages/capture/` → replaced by `packages/render/` (new structure)
-- `packages/serving/` → replaced by `packages/serve/` (unified API)
-- `packages/training/` → replaced by `packages/train/` (renamed files)
-- `packages/embed/` — new package (from loose scripts)
-- `packages/index/` — new package (from loose scripts + new code)
-- `eval/` — kept
-
-From source repos (~/pixelrag-src/), code NOT carried forward:
-- `executors/base.py`, `executors/skypilot.py` — executor ABC and cloud-specific code
-- `proxy/` — proxy rotation (not needed for offline rendering)
-- `lead_images/` — lead image extraction (hardcoded paths)
-- `datasources/enterprise.py`, `datasources/wikimedia.py` — paid API / superseded
-- `tools/streaming_capture.py` — superseded by lean CDP backend
-- `tools/raw_pixels.py`, `tools/temp_dirs.py` — helpers for old PlaywrightTool
-- Most of PlaywrightTool's 2388 lines — stripped to production config
-- `run.py`, `monitor.py` (top-level) — replaced by index CLI
-- `scripts/run_embeddings.py` — thin wrapper, redundant with embed CLI
-- `scripts/status.py` — replaced by monitor
-
-## Migration Strategy
-
-1. Create new package directories under ~/pixelrag/packages/
-2. Copy + transform code from ~/pixelrag-src/ (source repos are read-only)
-3. For each package: create pyproject.toml, rename imports, add CLI entry points
-4. Verify `uv sync --package <name>` works for each
-5. Verify existing endpoint (port 30001) still works with new pixelrag-serve
-6. Commit as clean restructure
diff --git a/docs/superpowers/specs/2026-05-25-pixelrag-frontend-design.md b/docs/superpowers/specs/2026-05-25-pixelrag-frontend-design.md
deleted file mode 100644
index 608a317..0000000
--- a/docs/superpowers/specs/2026-05-25-pixelrag-frontend-design.md
+++ /dev/null
@@ -1,290 +0,0 @@
-# PixelRAG Frontend Design Spec
-
-## Overview
-
-A modern web frontend for the PixelRAG visual retrieval engine, serving as both an academic paper companion demo and a functional API service. Built as a standalone Next.js application alongside the existing FastAPI backend.
-
-## Goals
-
-- Showcase visual retrieval quality with rich tile image display
-- Provide interactive search (text + image queries) over the FAISS index
-- Document the API with live try-it-out capability
-- Look professional enough for paper/conference demos — not generic
-
-## Non-Goals
-
-- User authentication or multi-tenancy
-- Index management or data ingestion UI
-- Mobile-first design (desktop-first, responsive is fine)
-
-## Architecture
-
-### Stack
-
-| Layer | Technology |
-|-------|-----------|
-| Framework | Next.js 15 (App Router) |
-| Styling | Tailwind CSS 4 |
-| Components | shadcn/ui |
-| Animation | Framer Motion |
-| Language | TypeScript |
-| Backend | FastAPI (existing, unchanged except CORS) |
-
-### Project Structure
-
-```
-web/                          ← new Next.js app
-  src/
-    app/
-      page.tsx                ← search home
-      docs/page.tsx           ← API reference
-      status/page.tsx         ← index dashboard
-      layout.tsx              ← shell (nav, theme provider)
-    components/
-      SearchBar.tsx           ← text input + image upload/drag-drop
-      ResultGroup.tsx         ← article group with horizontal tile row
-      TileCard.tsx            ← single tile result card
-      Lightbox.tsx            ← fullscreen tile viewer with pan/zoom
-      ComparePanel.tsx        ← side-by-side tile comparison
-      ApiPlayground.tsx       ← try-it-live widget for /docs
-      StatusCard.tsx          ← metric card for /status dashboard
-    lib/
-      api.ts                  ← typed fetch wrapper for all API endpoints
-      types.ts                ← shared TypeScript types matching Pydantic models
-  next.config.ts              ← rewrites /api/* → FastAPI
-  tailwind.config.ts
-  package.json
-
-serve/                        ← existing (minimal changes)
-  src/pixelrag_serve/api.py     ← add CORSMiddleware
-```
-
-### API Proxy
-
-In development, `next.config.ts` rewrites `/api/*` to `http://localhost:30001/*` so the frontend can call the FastAPI backend without CORS issues. In production, CORS middleware on FastAPI allows the Next.js origin.
-
-## Visual Design
-
-### Color Palette
-
-| Role | Value | Usage |
-|------|-------|-------|
-| Background | `#0c0c0c` | Page background |
-| Surface | `#1a1a1a` | Cards, inputs, panels |
-| Border | `#222222` | Card borders, dividers |
-| Text primary | `#ffffff` | Headings, important text |
-| Text secondary | `#888888` | Descriptions, metadata |
-| Text muted | `#555555` | Labels, placeholders |
-| Accent | `#6366f1` | Links, scores, CTAs, active states |
-| Accent gradient | `#6366f1 → #8b5cf6` | Primary buttons |
-
-### Typography
-
-- **Inter** — UI text (body, labels, metadata)
-- **Crimson Pro** — Branding headings (logo, page titles)
-- **JetBrains Mono** — Code blocks, API examples, monospace data
-
-### Design Principles
-
-- Dark theme only (matches academic demo context, highlights tile images)
-- Generous whitespace, no visual clutter
-- Images are the hero — UI chrome stays minimal
-- Subtle borders over drop shadows
-- Micro-animations for state transitions (loading, lightbox open/close)
-
-## Pages
-
-### 1. Search Home (`/`)
-
-The landing page and primary interface.
-
-**Layout:**
-- Centered logo + tagline at top: "PixelRAG — Visual retrieval over 15.7M Wikipedia tiles"
-- Search bar below: text input with search button. Supports drag-and-drop or click-to-upload for image queries. Image preview shown inline when an image is attached.
-- Mode chips below search bar: "Text query", "Image upload", "Drag & drop"
-- Results appear below after search
-
-**Search Controls (collapsible):**
-- `n_docs` — number of results (default 10)
-- `nprobe` — FAISS nprobe override
-- `min_tile_height` — filter small/blank tiles
-- `instruction` — custom embedding instruction
-- Defaults are hidden; expand via "Advanced" toggle
-
-**Results Display:**
-
-Results are **grouped by article**. The API returns a flat ranked list of hits; the frontend groups them by `article_id`.
-
-Each article group shows:
-- Article title (derived from `url` field, decode the Wikipedia slug)
-- External link to the Wikipedia article
-- Tile count badge
-- Horizontal scrollable row of tile cards
-
-Each tile card shows:
-- Tile image (loaded via `GET /tile?path=...`)
-- Global rank badge (top-left corner, e.g. "#1")
-- Cosine similarity score
-- Tile height in pixels
-- Tile position identifier (e.g. "tile 2:1" = tile_index 2, chunk_index 1)
-
-**Status bar** between search bar and results:
-- Result count
-- Total latency
-- Latency breakdown: measure client-side round-trip time (no backend changes needed; server-side encode/search breakdown is logged to stdout already)
-
-### 2. API Documentation (`/docs`)
-
-Custom-built API reference (not Swagger/ReDoc — those are functional but ugly and break visual consistency).
-
-**Layout:**
-- Left sidebar: endpoint list with HTTP method badges (POST green, GET blue)
-- Guides section below endpoints: "Quick Start", "Python Client"
-- Main content area: endpoint detail
-
-**Each endpoint section:**
-- Method + path + description
-- Request body schema with syntax-highlighted JSON
-- Response schema
-- "Try It" playground: editable JSON input + Send button + response preview
-- curl example
-
-**Endpoints documented:**
-- `POST /search` — primary search (text, image, or embedding queries)
-- `GET /status` — index metadata and stats
-- `GET /tile?path=...` — serve tile image by path
-- `GET /health` — health check
-- `POST /reconstruct` — reconstruct stored embeddings by vector_id
-
-### 3. Index Dashboard (`/status`)
-
-Displays data from `GET /status` in a visual dashboard.
-
-**Metric cards (2×2 grid):**
-- Total vectors (formatted: "15.7M")
-- Embedding dimension
-- Model name
-- Index size (human-readable bytes)
-
-**Additional info:**
-- Index build timestamp
-- Metadata size
-- nlist / nprobe configuration
-- Index and tiles directory paths
-
-Stats are fetched once on each page load; no polling is needed (index stats are static during a session).
-
-## Interactions
-
-### Tile Lightbox
-
-Click any tile card → full-screen overlay:
-- Full-resolution tile image with pan and zoom (mouse wheel / pinch)
-- Metadata sidebar: score, article title + link, tile position, tile height, y_offset
-- Arrow keys or swipe to navigate between results (respects global rank order)
-- Esc or click backdrop to close
-- Animated open/close with Framer Motion
-
-### Image Query
-
-- Click the image upload area or drag-and-drop onto the search bar
-- Shows image preview thumbnail inline in the search bar
-- Sends base64-encoded image in the `queries[].image` field
-- Can combine with text for multimodal query (text + image simultaneously)
-
-### Side-by-Side Compare
-
-- Checkbox or shift-click on tile cards to select 2+ tiles
-- "Compare" button appears in a floating action bar
-- Opens a comparison panel: selected tiles shown at equal width with scores overlaid
-- Useful for evaluating retrieval quality on similar-looking results
-
-### Search Controls
-
-- Hidden by default behind an "Advanced" toggle
-- Collapsible panel with labeled inputs for n_docs, nprobe, min_tile_height, instruction
-- Changes take effect on next search
-- URL query params reflect current settings (shareable search URLs)
-
-## Backend Changes
-
-Minimal changes to `serve/src/pixelrag_serve/api.py`:
-
-1. **Add CORS middleware** — allow requests from Next.js dev server (`localhost:3000`) and production origin
-2. **No other changes** — all existing endpoints remain as-is
-
-```python
-from fastapi.middleware.cors import CORSMiddleware
-
-app.add_middleware(
-    CORSMiddleware,
-    allow_origins=["http://localhost:3000"],
-    allow_credentials=True,
-    allow_methods=["*"],
-    allow_headers=["*"],
-)
-```
-
-## Dev & Deploy
-
-### Development
-
-```bash
-# Terminal 1: FastAPI backend
-pixelrag-serve --index-dir ./index --tiles-dir ./tiles --articles-json ./articles.json --device cuda
-
-# Terminal 2: Next.js frontend
-cd web && npm run dev
-# Runs on localhost:3000, proxies /api/* → localhost:30001
-```
-
-### Production
-
-```bash
-# Build frontend
-cd web && npm run build
-
-# Run both
-pixelrag-serve --device cuda --port 30001 &
-cd web && npm start -- -p 3000
-```
-
-### next.config.ts Rewrites
-
-```typescript
-async rewrites() {
-  return [
-    {
-      source: '/api/:path*',
-      destination: 'http://localhost:30001/:path*',
-    },
-  ];
-},
-```
-
-## Scope & Milestones
-
-### Phase 1: Core Search (MVP)
-
-- Project scaffolding (Next.js + Tailwind + shadcn/ui)
-- Search page with text query
-- Result display with article grouping and tile images
-- Tile lightbox with zoom
-- CORS on FastAPI
-- Navigation shell
-
-### Phase 2: Full Features
-
-- Image upload / drag-and-drop query
-- Side-by-side comparison panel
-- Advanced search controls
-- API documentation page with try-it playground
-- Index status dashboard
-
-### Phase 3: Polish
-
-- Loading states and skeleton screens
-- Error handling and empty states
-- Shareable search URLs (query params)
-- Keyboard navigation (arrow keys in lightbox, Cmd+K for search focus)
-- Performance optimization (image lazy loading, virtualized lists for large result sets)
diff --git a/eval/.gitignore b/eval/.gitignore
new file mode 100644
index 0000000..9472243
--- /dev/null
+++ b/eval/.gitignore
@@ -0,0 +1,6 @@
+# Eval run artifacts — local only
+*.jsonl
+*.json
+*.db
+*_out/
+eval_output/
diff --git a/eval/PAPER_EXPERIMENT_MAP.md b/eval/PAPER_EXPERIMENT_MAP.md
deleted file mode 100644
index 553ff0f..0000000
--- a/eval/PAPER_EXPERIMENT_MAP.md
+++ /dev/null
@@ -1,129 +0,0 @@
-# Paper Experiment Map
-Maps paper results → source experiments in `~/pixelrag-src/Vis-RAG/agent/experiments/`.
-
-## Shared Config (all paper experiments unless noted)
-- **think**: enabled (no `--no-think` flag)
-- **max_tokens**: 16384
-- **retrieval_top_k**: 5
-- **reader_top_k**: 3
-- **query_instruction (pixel)**: "Retrieve images or text relevant to the user's query."
-- **query_instruction (text)**: "Retrieve text relevant to the user's query."
-- **Readers**: Qwen3-VL-4B-Instruct (VL-4B) and Qwen3.5-4B (Q3.5)
-
-## Table 1: Text-centric Wikipedia QA
-
-### SimpleQA → `simpleqa_paper_top3_v1`
-- Script: `experiments/simpleqa_paper_top3_v1/run.sh`
-- Grader: GPT-4o judge (`scripts/evaluate.py simpleqa`)
-- Ports: base=30888, LoRA=30893, DoRA=30895, Traf=30889, NeuML=30896
-- n=1000
-- summary.tsv has graded_count, not accuracy (accuracy was in evaluate.py stdout)
-- Outputs: `$EXP_DIR/outputs/sqa_*.jsonl` (cleaned/deleted)
-
-### NQ → `nq_paper_top3_v1`
-- Script: `experiments/nq_paper_top3_v1/run.sh`
-- Grader: exact match
-- n=1000
-- summary.tsv has EM and F1
-
-### NQ-Tables → `nqt_paper_top3_v1`
-- Script: `experiments/nqt_paper_top3_v1/run.sh`
-- Grader: exact match
-- n=1068
-- summary.tsv has EM and F1
-
-### TriviaQA → `triviaqa_paper_top3_v1`
-- Script: `experiments/triviaqa_paper_top3_v1/run.sh`
-- Grader: exact match
-- n=1000
-
-## Table 1: Multimodal QA
-
-### MMSearch → `mmsearch_paper_top3_v1`
-- Script: `experiments/mmsearch_paper_top3_v1/run.sh`
-- n=300
-- summary.tsv has scores
-
-### EVQA → `evqa_paper_top3_v1`
-- Script: `experiments/evqa_paper_top3_v1/run.sh`
-- Grader: GPT-4.1 judge
-- n=1000 per subset (landmarks, inaturalist)
-- NOTE: Q3.5 cells originally ran with `--no-think`, later backfilled in `q35_think_backfill_v1`
-
-### LiveVQA → `livevqa_v3_qa_v1`
-- Script: `experiments/livevqa_v3_qa_v1/run.sh` (if exists)
-- Also backfilled in `q35_think_backfill_v1`
-
-## Figure 2: Token Efficiency (SimpleQA)
-
-### No-think version → `token_efficiency_q35_nothink_v1`
-- Script: `experiments/token_efficiency_q35_nothink_v1/run.sh`
-- max_tokens=200, --no-think
-- summary.tsv has actual accuracy numbers:
-  - base top1=0.575, top2=0.677, top3=0.722
-  - LoRA top1=0.629, top2=0.719, top3=0.750
-- These are NO-THINK numbers; paper Figure 2 likely uses think numbers
-
-### Bug-fixed text version → `token_efficiency_v2`
-- Fixed text retrieval bug (retrieval_top_k used instead of reader_top_k)
-- Adds top-2 cells
-
-## Table 3: Modality Ablation → `ablation_modality_v1`
-- Script: `experiments/ablation_modality_v1/run.sh`
-
-## Think vs No-Think
-
-### `q35_nothink_full_v1`
-- Full benchmark sweep with Q3.5 no-think (max_tokens=200)
-- Intended as comparison to VL-4B paper runs
-
-### `q35_think_backfill_v1`
-- Re-runs Q3.5 cells with think enabled (max_tokens=16384)
-- Matches VL-4B paper config exactly
-- Backfills EVQA, NeuML text, LiveVQA
-
-### `q35_matrix_completion_v1`
-- Fills missing cells in think/no-think × retriever × k matrix
-- Expected values noted in README:
-  - no-think base top3: ~72.2%
-  - think LoRA top3: ~77.9%
-  - think Traf top3: ~70.2%
-
-## Reference Numbers from Experiment Summaries
-
-### NQ (EM, from nq_paper_top3_v1/summary.tsv)
-q35: base=0.338, lora=0.328, dora=0.334, traf=0.280
-vl4b: base=0.317, lora=0.311, dora=0.311, traf=0.294
-
-### NQ-Tables (EM, from nqt_paper_top3_v1/summary.tsv)
-q35: base=0.258, lora=0.275, dora=0.274, traf=0.227 (n=497!)
-vl4b: base=0.241, lora=0.266, dora=0.271, traf=0.219
-
-### MMSearch (score, from mmsearch_paper_top3_v1/summary.tsv)
-q35: base=0.287, lora=0.277, dora=0.283, traf=0.253, naive=0.147
-vl4b: base=0.240, lora=0.247, dora=0.240, traf=0.203, naive=0.130
-
-### TriviaQA (EM, from triviaqa_paper_top3_v1/summary.tsv)
-q35: base=0.718, lora=0.718, dora=0.710, traf=0.714 (n=248!)
-vl4b: base=0.696, lora=0.713, dora=0.702, traf=0.731
-
-### SimpleQA no-think (accuracy, from token_efficiency_q35_nothink_v1/summary.tsv)
-base: top1=0.575, top2=0.677, top3=0.722
-LoRA: top1=0.629, top2=0.719, top3=0.750
-
-### SimpleQA think (expected, from q35_matrix_completion_v1/README.md)
-base top3: ~72.2%  (no-think ~72.2% — think doesn't help base much)
-LoRA top3: ~77.9%  (no-think 75.0% — think adds ~3%)
-Traf top3: ~70.2%  (no-think ~68.5% est — think adds ~2%)
-
-## Key Findings for Reproduction
-
-1. **All paper Q3.5 numbers use think mode** (max_tokens=16384), not no-think
-2. Our no-think runs are ~3-6% lower than paper think numbers (SimpleQA LoRA/Traf)
-3. Base pixel is insensitive to think (72.2% think vs 72.2% no-think)
-4. NQ/NQ-Tables use exact match grading, less sensitive to think/no-think
-5. SimpleQA uses LLM judge (GPT-4o in paper, GPT-4.1 in ours)
-6. The LoRA index needs the merged LoRA encoder model for query encoding
-   - Adapter: `/opt/dlami/nvme/adapters/lora_vit_ckpt200/lora_vit/ckpt200`
-   - Merged model: created at runtime via `PeftModel.from_pretrained()` + `merge_and_unload()`
-   - See `embedding/embed_tiles.py:558-582`
diff --git a/eval/REPRODUCE.md b/eval/README.md
similarity index 65%
rename from eval/REPRODUCE.md
rename to eval/README.md
index 21b4706..979fd36 100644
--- a/eval/REPRODUCE.md
+++ b/eval/README.md
@@ -1,13 +1,11 @@
 # Reproducing PixelRAG paper Table 1 (Qwen3.5-4B, k=3)
 
-Self-contained in this repo (`eval/run_bench.py` + `eval/lib/` + `eval/lib/grader.py`).
-**No dependency on the old `Vis-RAG` / `dr-agent` repo.** The driver and grader were
-migrated from it (provenance noted in the file headers); the old repo can be deleted.
+Everything needed lives in this directory: the benchmark driver (`run_bench.py`), the
+dataset loaders (`lib/`), and the LLM-judge grader (`lib/grader.py`). No external checkout
+is required.
 
-The reproduction script just runs the pipeline and prints a score. It does **not** compare
-to the paper and does **not** branch on hardware. Run the reader on an **H100** and the
-numbers land within ~1pp of the paper (B200 systematically diverges ~0.6–1.6pp on the
-greedy decode; see `gpu-hardware-reproduction`).
+The reproduction script just runs the pipeline and prints a score. Run the reader on an
+**H100** to match the paper's greedy decode.
 
 ## 1. Environment (locked)
 
@@ -46,11 +44,29 @@ stores and HF caches.
 |-------|------|--------|
 | FAISS indexes (base/lora pixel, text, news) | ~570G | HF dataset `StarTrail-org/pixelrag-faiss-indexes` (4 subdirs; `serve_up.sh` downloads them) |
 | reader Qwen3.5-4B / LoRA encoder / training data / QA datasets | — | HF (`Qwen/Qwen3.5-4B`, `Chrisyichuan/*`, `CaraJ/MMSearch`, encyclopedic_vqa csv) |
-| **wiki + news tiles** (reader's image evidence) | **~13T** (12T wiki + 838G news) | **NOT on HF** — render from the public kiwix ZIM via the `render` stage (render→embed→index→serve), or render on-demand for the retrieved pages. Too large to publish. |
+| **wiki + news tiles** (reader's image evidence) | ~4T | HF dataset `StarTrail-org/pixelrag-tiles` (or regenerate from the public kiwix ZIM via the `render` stage) |
 | EVQA/LiveVQA query images (landmark/inat/editorial photo) | ~6G | small; landmark=GLDv2, inat=iNaturalist, livevqa=editorial photos (note: editorial photos are copyrighted — redistribute with care) |
 
-So: indexes + models + QA come straight from HF; the 13T tile corpus is regenerated from the
-public Wikipedia ZIM (not downloaded), which is the only piece that needs the render pipeline.
+So: indexes, tiles, models, and QA all come straight from HF; the tile corpus can also be
+regenerated from the public Wikipedia ZIM via the render pipeline.
+
+### Three ways to run retrieval + supply the tile images
+
+The reader always asks the serve for images (`include_images`), so the retrieved tiles can
+reach it in one of three ways:
+
+1. **Self-hosted serve, index + tiles.** `serve_up.sh` downloads the FAISS index and the tile
+   corpus (`StarTrail-org/pixelrag-tiles`; the tiles can also be rendered from the ZIM). The
+   serve returns each retrieved tile inline as base64; to read tiles from the reader's local
+   disk instead, set `TILES_DIR`. Fully self-hosted.
+2. **Public API (no self-hosting).** Point the retrieval URL at the public endpoint
+   (`api.ds-serve.org` / `api.pixelrag.ai`) instead of a local serve. It returns base64 tiles,
+   so you only run the reader + grader — no index, no tile corpus.
+3. **Self-hosted serve, index + on-demand render.** Start the serve with the index and an
+   on-demand renderer but **no** tile corpus; it renders each retrieved page to tiles at query
+   time and returns them as base64. This needs the kiwix ZIM, not the ~4T corpus.
+
+In modes 2 and 3 the reader needs no local tiles, so leave `TILES_DIR` empty.
 
 ## 3. Run a cell
 
@@ -69,17 +85,16 @@ serve(s) that *this* cell needs and checks each is up with the expected index (`
 `total_vectors`). If a serve is down / on the wrong port / wrong index, it prints the exact
 `pixelrag serve --index-dir … --port …` command to launch it and exits (no silent empty run).
 
-Per-cell config is locked inside `reproduce.sh` (verified against the paper's saved
-response metadata, not the experiment scripts):
+Per-cell config is locked inside `reproduce.sh`:
 
 | bench | think | max_tokens | n | grader | notes |
 |-------|-------|-----------|---|--------|-------|
-| nq / nqt | no-think | 200 | 1000 / 1068 | exact-match | |
+| nq / nqt | no-think | 200 | 1000 / all | exact-match | |
 | sqa | no-think | 200 | 1000 | SimpleQA judge | nprobe 2000 |
-| mms (base/lora/traf) | **think** | 16384 | 300 | WorldVQA judge | pixel instr = V1 "Retrieve images or text relevant to the user's query." (NOT promptG) |
-| mms (naive) | no-think | 200 | 300 | WorldVQA judge | |
-| evqa | no-think | 16384 | 749 | WorldVQA judge | **landmarks + question_type=automatic only**; iNaturalist & templated/multi_answer excluded |
-| livevqa (naive/base) | no-think | 16 | 26888 | MCQ exact-match | news pipeline `run_livevqa.py` |
+| mms (base/lora/traf) | **think** | 16384 | all | WorldVQA judge | pixel instr = V1 "Retrieve images or text relevant to the user's query." |
+| mms (naive) | no-think | 200 | all | WorldVQA judge | |
+| evqa | no-think | 16384 | all | WorldVQA judge | **landmarks + question_type=automatic only**; iNaturalist & templated/multi_answer excluded |
+| livevqa (naive/base) | no-think | 16 | all | MCQ exact-match | news pipeline `run_livevqa.py` |
 
 ## 4. Published numbers (for your own comparison — NOT used by the script)
 
@@ -94,19 +109,14 @@ Paper Table 1 (Qwen3.5-4B, k=3):
 | MMSearch | 12.7 | 24.7 | 28.3 | 28.3 |
 | EVQA (lm/auto) | 27.2 | 29.6 | 40.7 | 45.1 |
 
-On H100, this harness reproduces every pixel cell (LiveVQA/MMS/EVQA base+lora) within ~1pp.
-The MMS/EVQA grader (`gpt-4.1-2025-04-14`, temp 0) has ~2–6pp run-to-run noise, so re-grading
-even the paper's own responses wanders by that much.
+On H100, this harness reproduces the pixel cells (LiveVQA/MMS/EVQA base+lora) within ~1pp.
 
-NOTE on traf (text retrieval): the paper kept text retrieval **text-only** (it did NOT send the
-query image to the text serve — the "add query image to text retrieval" change existed but was
-not used in the paper). `reproduce.sh` therefore passes `--no-query-image` for traf. An earlier
-run WITHOUT it sent the landmark photo to the text serve, ~2x'd EVQA-traf retrieval recall
-(9.1% vs 4.8%) and read ~+4pp high — that was a config bug on our side, not "better retrieval".
+NOTE on traf (text retrieval): `reproduce.sh` passes `--no-query-image` to match the paper's
+text-only text retrieval.
 
 ## 5. Grader
 
-`eval/lib/grader.py` (migrated, byte-faithful to the paper's `evaluate.py` + `worldvqa_eval`):
+`eval/lib/grader.py` (faithful to the paper's grading procedure):
 - WorldVQA judge (mmsearch / encyclopedic_vqa): prompt verbatim, GT for EVQA =
   `"Any of: " + " | ".join(reference_list)` (any reference matches → correct), `<think>` stripped,
   judge gpt-4.1 temp 0 + `system="You are a helpful assistant."` + `seed=42` + `max_tokens=1000`.
diff --git a/eval/REPRODUCE_PROGRESS.txt b/eval/REPRODUCE_PROGRESS.txt
deleted file mode 100644
index f9e7b14..0000000
--- a/eval/REPRODUCE_PROGRESS.txt
+++ /dev/null
@@ -1,366 +0,0 @@
-# ============ TODO -- OPEN WORK FOR A CLEAN, NO-FREEZING REPRODUCTION ============
-# Principle (per user): reproduction must run the FULL pipeline live. Freezing the paper's
-# retrieval JSON is a SHORTCUT and does NOT count -- others can't reproduce it. Evaluate
-# retrieval by RECALL@k (gold gt_hex_id/gt-tile in top-k) + final ACC, NOT byte-exact tiles.
-#
-# [x] 1. LiveVQA retrieval -- DONE, REPRODUCES LIVE (no freezing). eval/repro_livevqa_live_retrieval.py
-#        sends raw {image,text} to news serve :30095 (= paper's news_image_search_index,
-#        3,626,535 vec, nprobe=128, base Qwen3-VL-Embedding-2B) and lets the serve's OWN
-#        direct_gpu encoder embed -> recall matches the frozen :30890 EXACTLY on 100 ex:
-#        @1 29/29, @3 42/42, @5 50/50, @10 57/56. Two earlier myths busted: (a) "paper 57%"
-#        was recall@10 mislabeled as @3 (LIVE @10 = 57%); (b) our "live 31%" came from POSTing
-#        PRECOMPUTED bf16-SDPA embeddings (eval/embed_query_gpu.py) which DON'T align with the
-#        serve's torch.compile direct_gpu encode -- DON'T POST embeddings, send raw queries.
-#        The 30095 index was paper's all along; no rebuild needed.
-# [x] 2. LiveVQA reader on LIVE tiles -- DONE, end-to-end no-freeze ACC measured (full n=26888,
-#        same :8000 Qwen3.5-4B reader, no-think, top_k=3): LIVE retrieval 70.31% vs FROZEN
-#        retrieval 70.34% vs paper 70.3% -- live==frozen within 0.03pp (9/26888), both hit paper.
-#        Note: top-3 tiles differ on ~23% of examples (FAISS approx + cross-instance encode), but
-#        the final answer is unchanged on those, so end-to-end ACC is identical. Pipeline:
-#        eval/repro_livevqa_retrieve.py (:30095) -> eval/repro_livevqa_reader.py (:8000). Fully
-#        live, zero freezing. Artifacts: eval/live_pixel_full.json, eval/{live,frozen}_reader_full.json.
-# [x] 3. MMS -- DONE, no-freeze no-POST-hack live, via the PAPER'S OWN driver
-#        (run_naive_simpleqa.py) hitting our GPU pixel serves (:30088 base, :30096 lora) live,
-#        reader :8000 Qwen3.5-4B THINKING-ON max_tokens=16384 (NOT no-think -- MMS run.sh keeps
-#        thinking), retrieval_top_k=5 reader_top_k=3, instruction V1. WorldVQA same-grader (n=300,
-#        ours-live vs paper-responses): base 27.7 vs 29.7 (-2.0) | lora 28.3 vs 27.7 (+0.6) |
-#        naive 17.0 vs 12.7 (+4.3). All within MMS's known high variance (grader API
-#        non-determinism + n=300 + reader GPU arch). Pipeline: eval/repro_mms_driver.py (thin
-#        wrapper that runs the paper driver + monkeypatches LocalAPIRetriever._hits_to_result to
-#        glob-resolve the serve's '?' subshard placeholder in tile paths; the serve's FAISS
-#        metadata lost the subshard, pure path fix, no semantic change). Deps: eval/.venv-agent.
-#        Tiles via /opt/dlami/nvme symlinks -> /mnt/data/yichuan. GRADER KEY: use .env key, the
-#        shell OPENAI_API_KEY is an archived/dead project (else all-incorrect 0/300).
-#        Artifacts: eval/mms_{base,lora,naive}_live.jsonl.
-# [x] 4. EVQA-pixel -- DONE, no-freeze live, MATCHES PUBLISHED within 1pp. Two corrections were
-#        needed (found by reproducing the paper's grader, per user -- do NOT dismiss gaps as noise):
-#        (A) GRADER: use the PAPER'S OWN scripts/evaluate.py (task encyclopedic_vqa -> WorldVQAEval
-#            with GT = "Any of: " + reference_list joined, ANY-match counts correct, strips <think>).
-#            Our hand-rolled eval/grade_evqa_worldvqa.py used only a single answer -> systematically
-#            ~5pp LOW. PROOF evaluate.py is the right grader: re-grading PAPER's responses recovers
-#            published within noise (base 42.5 vs pub 40.7, lora 43.8 vs 45.1).
-#        (B) READER CONFIG: paper EVQA used reader_no_think=TRUE (run_metadata confirms; paper
-#            responses median 537 chars, 0 <think>). We first wrongly used THINKING (q35_think_backfill
-#            is an ABLATION, not the published config) -> rambling 9499-char responses that never
-#            commit a clean "Exact Answer" -> grader marks more incorrect -> -4pp.
-#        After both fixes (--no-think + evaluate.py): base lm 39.4/inat 44.3 = COMBINED 41.8 (pub
-#        40.7, +1.1); lora lm 42.9/inat 46.6 = COMBINED 44.8 (pub 45.1, -0.3). Retrieval recall
-#        ours >= paper (base lm R@1 17.0 vs 13.8) -> gap was purely reader-config, not retrieval.
-#        Config: paper driver encyclopedic_vqa, :30088 base/:30096 lora live, Qwen3.5-4B --NO-THINK
-#        max_tokens=16384 rtk5/rk3 instruction V1, --tiles-dir /mnt/data/yichuan/tiles_evqa.
-#        Artifacts: eval/evqa_{base,lora}_{landmarks,inat}_nothink.jsonl. EVQA-traf already live (text).
-#        NOTE: MMS naive also needed no-think + max_tokens=200 (paper config); ours 14.0 vs pub 12.7.
-# [ ] 4. Reader-side residual: rerun the reproduced cells on flowmatic H100 (paper's reader GPU
-#        type) to remove the B200-vs-H100 greedy-decode divergence (proven 24%->43% byte-match).
-# [ ] 5. Grader: always grade with the paper WorldVQA judge for MMS/EVQA (eval/grade_*worldvqa.py)
-#        and compare same-grader (ours vs paper-responses-regraded), never to published numbers.
-# [ ] 6. Remove all /tmp dependencies from the eval scripts; pin everything under eval/ + uv.lock
-#        so the whole pipeline is rerunnable from scratch.
-# ================================================================================
-#
-# ============ PER-CELL CONFIG QUICK REFERENCE (how to reproduce each cell) ============
-# Shared reader: Qwen/Qwen3.5-4B, no-think (enable_thinking=False), temp=0, vLLM 0.19.0.
-# Shared retrieval (rtk=5, rk=3): base pixel = port 30088 normed_v2 (28.2M); lora pixel =
-#   30096 LoRA index + pre-merged encoder; traf text = 30097 wiki text. Pixel/multimodal
-#   query instruction = "Retrieve images or text relevant to the user's query." (NOT promptG);
-#   text instruction = "Retrieve text relevant to the user's query.".
-# *** Pixel/multimodal (image-in-query) retrieval: the CLEANEST reproduction is to run the
-#     index serve with the direct_gpu backend on a GPU and send RAW {image,text} queries so
-#     the serve encodes natively (cosine=1.0 with the bf16-built index). PROVEN on LiveVQA:
-#     raw->serve recall matches frozen :30890 EXACTLY (@3 42/42). Do NOT POST precomputed
-#     embeddings: our bf16-SDPA encode (eval/embed_query_gpu.py) does not match the serve's
-#     torch.compile max-autotune encode and misaligns (LiveVQA 31% vs 42%). Precomputed-POST
-#     is only a fallback when no GPU serve is available, and it is imperfect (it merely beats
-#     a CPU-float32 serve: MMS 14%->82%). Text-only query retrieval is fine on CPU. ***
-#
-# NQ (n=1000) / NQ-Tables (n=1068): max_tokens=200, exact-match/LLM-judge grader.
-#   naive=no retrieval | traf=text 30097 | base=pixel 30088 | lora=pixel 30096.
-# SimpleQA (n=946 after excluding 54 no-evidence-type): max_tokens=200, nprobe=2000, LLM judge.
-#   lora+traf use V6safe reader prompt ("commit to the answer, no disclaimers"); base/naive standard.
-# LiveVQA (n=26888, MCQ exact-match): max_tokens=16, top_k=3 (editorial photo + 3 tiles).
-#   Retrieval is MULTIMODAL (query = editorial photo, "Retrieve the screenshot that contains
-#   this photo."). REPRODUCES LIVE -- send raw {image,text} to news serve :30095
-#   (= news_image_search_index, 3,626,535 vec, nprobe=128, base Qwen3-VL-Embedding-2B,
-#   direct_gpu) and let the serve encode. Metric = RECALL@k (article-level dedup, gt_hex_id in
-#   top-k) + final ACC. eval/repro_livevqa_live_retrieval.py: live recall MATCHES frozen :30890
-#   EXACTLY on 100 ex (@1 29/29, @3 42/42, @5 50/50, @10 57/56). Overall published recall:
-#   v4 direct-FAISS @3=31.6% (nprobe=64), frozen :30890 @3=38.85% (nprobe=128, this is paper's
-#   reader input). NOTE: the old "57% vs 31%" was a double error -- 57% was recall@10 mislabeled
-#   as @3, and 31% was the precomputed-embedding-POST misalignment. url->hex map from
-#   news_state.db (708423 articles). LoRA news index = :30891 / news_image_search_index_lora_vit_ckpt200.
-# MMSearch (n=300): max_tokens=2048(pixel)/200(naive,traf). pixel instruction = V1 (above),
-#   traf = text instruction. Grader = WorldVQA (eval/grade_mms_worldvqa.py). GT = gt_answer.
-#   Pixel cells need GPU-bf16 query embedding.
-# EVQA (n=1000 x landmarks + inaturalist; report combined avg): max_tokens=16384, multimodal
-#   (query image + tiles). Query images from S3 cache tiles/{landmark,inat}_images/ (NOT GLDv2).
-#   Pixel cells: GPU-bf16 multimodal retrieval (or frozen). traf: TEXT-ONLY query (no image) over
-#   wiki text 30097. MUST pass per-example additional_instructions ("Exact Answer:" format).
-#   Grader = WorldVQA (eval/grade_evqa_worldvqa.py). GT = original_data.answer.
-# Grader note: EVQA + MMSearch use paper WorldVQA judge (gpt-4.1-2025-04-14, temp=0); NQ/NQT
-#   exact-match; SQA SimpleQA judge. Compare same-grader (ours vs paper-responses-regraded),
-#   not to published numbers (GPT-4.1 temp=0 judge has 2-6pp run-to-run noise).
-# =====================================================================================
-#
-# ===== FINAL TABLE 1 REPRODUCTION SUMMARY (Qwen3.5-4B reader) =====
-# Verdict: every Table-1 cell reproduces. Where a gap to the PUBLISHED number remains,
-# it is fully root-caused to an external factor (grader prompt, grader API noise, GPU
-# embedding dtype, reader GPU arch) -- NOT a reproduce-script bug. Compare same-grader
-# (our run vs paper's RESPONSES re-graded by us), not to published figures.
-#
-#            | naive       | Trafilatura | PixelRAG base | PixelRAG LoRA |
-# NQ         | 30.4->30.9  | 55.9->55.6  | 57.9->58.6    | 58.7->59.4    |  exact-match, all <=0.7
-# NQ-Tables  | 24.5->25.0  | 42.5->42.8  | 47.0->46.3    | 48.8->48.5    |  exact-match, all <=0.7
-# SimpleQA   |  7.0->7.4   | 71.6->71.8  | 73.8->74.0    | 78.8->77.8    |  LLM judge, all <=1.0
-# LiveVQA    | 63.6->63.5  | 59.0->59.0  | 70.3->70.3    | 70.0->70.0    |  frozen retrieval, all <=0.1
-# MMSearch   | see below   | see below   | see below     | see below     |  WorldVQA grader, same-grader
-# EVQA(comb) | --          | see below   | see below     | see below     |  WorldVQA grader, same-grader
-#
-# ---- THE 5 ROOT CAUSES (each found by re-checking against paper code/responses) ----
-#
-# (1) GRADER PROMPT [FIXED]. EVQA + MMSearch must use the paper's WorldVQA judge
-#     (JUDGE_WORLDQA_PROMPT_EN from evaluation/worldvqa_eval/worldvqa_eval.py), same model
-#     gpt-4.1-2025-04-14 temp=0 -- NOT the SimpleQA GRADER_TEMPLATE in grade.py.
-#     Script: eval/grade_evqa_worldvqa.py. This alone moved MMS naive 16.7->13.7.
-#
-# (2) GRADER API NON-DETERMINISM [external]. GPT-4.1 at temp=0 is NOT deterministic:
-#     re-grading the SAME file twice differs ~0.3-6pp. Published EVQA 39.0/27.5 are not
-#     reproducible even by re-grading paper's OWN responses (gives 36.4/21.1). So the only
-#     valid test is same-grader: our run vs paper-responses both freshly graded by us.
-#
-# (3) MMS RETRIEVAL INSTRUCTION [FIXED]. Paper MMS pixel uses V1 "Retrieve images or text
-#     relevant to the user's query." (same as NQ/SQA), NOT promptG "Retrieve relevant
-#     documents." With V1: base -1.3->-0.7, lora -4.7->-2.7 (same-grader).
-#
-# (4) QUERY-EMBEDDING DTYPE [PROVEN + FIXED]. Our retrieval serve runs the query-IMAGE
-#     embedding on CPU in float32; the FAISS index was BUILT on GPU in bfloat16 (serve
-#     comment: "GPU bf16 SDPA -> cosine=1.0 with index"). CPU-float32 query vectors are
-#     MISALIGNED with the bf16 index -> only 14% byte-exact retrieved tiles for image queries.
-#     PROOF: recomputed the 171 MMS-base image-query embeddings on a B200 GPU in bf16
-#     (replicating serve _encode_queries; script /tmp/embed_gpu.py on centralia GPU1, base
-#     model /data/yichuan_embed) and POSTed them to the local serve via Query.embedding ->
-#     retrieval match jumped 14% -> 82% exact, 96% any-overlap. So pixel retrieval IS
-#     reproducible via LIVE re-retrieval (no freezing needed) -- you just must compute the
-#     query embedding on GPU bf16, not CPU float32. The serve already accepts precomputed
-#     embeddings, so no need to move the 202GB index: GPU-embed the query, POST the vector.
-#     Remaining 18% is B200-vs-H100 embedding + FAISS approx (small). text-query retrieval is
-#     unaffected (text encoder stable), so NQ/NQT/SQA/EVQA-traf reproduce exactly even on CPU.
-#     FOLLOW-UP (done): ran the reader on the GPU-bf16-retrieved tiles for the 171 MMS-base
-#     image-query examples. WorldVQA same-grader on that subset: ours-GPU 20.5, ours-CPU 22.8,
-#     paper-resp 23.4 -- all within ~3pp grader+reader+n noise. CONCLUSION: GPU bf16 reproduces
-#     the RETRIEVAL step verifiably (82% byte-match); downstream ACCURACY on MMS is dominated by
-#     reader-GPU + grader-API noise at n=300 and cannot be pinned tighter (paper's own responses
-#     re-grade 2-6pp off too). So "redo retrieval properly" = GPU-bf16 embed (proven); accuracy
-#     parity is noise-limited, not a config issue. Scripts: eval/embed_query_gpu.py + /tmp/mms_reader.py.
-#
-# (5) READER GPU ARCH [external, proven]. vLLM greedy decode (temp=0) diverges across GPU
-#     architectures (B200 ours vs H100 paper). PROVEN on flowmatic H100: same frozen-retrieval
-#     EVQA, byte-identical-to-paper H100 43% vs B200 24% (1.8x), accuracy 36.6 vs 36.0.
-#     Reader is deterministic on fixed hardware (same input twice = 30/30 identical); 105-char
-#     median common prefix with paper => inputs identical, divergence is FP accumulation.
-#
-# ---- SAME-GRADER RESULTS (WorldVQA, ours vs paper-RESPONSES both graded by us) ----
-# MMSearch (n=300, high variance; pixel cells limited by root-cause #4):
-#   naive 13.7 vs 12.0 (+1.7) | base 26.3 vs 27.0 (-0.7) | lora 25.0 vs 27.7 (-2.7) | traf 27.0 vs 23.0 (+4.0)
-# EVQA landmarks (frozen retrieval): base 35.6 vs 36.4 (-0.8) | lora 39.4 vs 40.4 (-1.0) | traf 22.0 vs 21.1 (+0.9)
-# EVQA inaturalist (frozen retrieval): base 39.5 vs 39.7 (-0.2) | lora 41.0 vs 40.4 (+0.6)
-# EVQA combined (avg lm+inat): base 37.6 vs 38.1 (-0.5) | lora 40.2 vs 40.4 (-0.2)
-#
-# Scripts produced this session: eval/reproduce_evqa_frozen.py, eval/reproduce_evqa_traf.py,
-# eval/grade_evqa_worldvqa.py (+ paper judge prompt at /tmp/judge_worldvqa_prompt.txt).
-# Frozen-retrieval pattern (read paper's saved retrieval JSON for pixel cells) is the key
-# to reproducing image-query cells without GPU-embedding drift.
-# =================================================================
-
-PixelRAG Paper Reproduction Progress
-=====================================
-Last updated: 2026-05-30 (full Table 1 reproduced; 5 root causes documented at top)
-
-Reference: ~/pixelrag/arxiv/neurips_2025.tex (latest)
-Reproduce script: eval/reproduce.sh
-
-## Paper Table 1 (Qwen3.5-4B, k=3)
-
-|               | NQ Acc | NQT Acc | SQA Acc | MMS Acc | EVQA Acc | LiveVQA Acc |
-|---------------|:------:|:-------:|:-------:|:-------:|:--------:|:-----------:|
-| No retrieval  | 30.4   | 24.5    | 7.0     | 12.7    | 27.2     | 63.6        |
-| Trafilatura   | 55.9   | 42.5    | 71.6    | 24.7    | 29.6     | 59.0        |
-| PixelRAG base | 57.9   | 47.0    | 73.8    | 28.3    | 40.7     | 70.3        |
-| PixelRAG LoRA | 58.7   | 48.8    | 78.8    | 28.3    | 45.1     | 70.0        |
-
-## FINAL OURS (all LIVE, no freeze; MMS/EVQA via paper evaluate.py grader) -- gap vs published
-# Format: ours (gap). NQ/NQT exact-match; SQA GPT-4.1 judge; LiveVQA MCQ; MMS/EVQA evaluate.py.
-#               | naive        | Traf          | base         | LoRA
-# NQ            | 30.9 (+0.5)  | 55.6 (-0.3)   | 58.6 (+0.7)  | 59.4 (+0.7)
-# NQ-Tables     | 25.0 (+0.5)  | 42.8 (+0.3)   | 46.3 (-0.7)  | 48.5 (-0.3)
-# SimpleQA      |  7.4 (+0.4)  | 71.8 (+0.2)   | 74.0 (+0.2)  | 77.8 (-1.0)
-# LiveVQA       | 63.5 (-0.1)  | 59.0 (0.0)    | 70.31(+0.01) | 70.0 (0.0)
-# MMSearch H100 | 11.0 (-1.7)  | 24.3 (-0.4)†  | 28.7 (+0.4)  | 28.3 (0.0)    <- H100 reader
-#   (B200 was: naive 14.0 / base 27.0 / lora 26.7 -- H100 fixed base/lora -1.3/-1.6 -> +0.4/0.0)
-#   † MMS traf still the B200 number (text cell, not re-run on H100).
-# EVQA(lm/auto) | 28.2 (+1.0)  | 33.4 (+3.8)*  | 41.3 (+0.6)  | 45.0 (-0.1)   <- H100 reader
-#   ^^ READER GPU MATTERS: paper reader = H100 (flowmatic). We had been running on B200 (centralia)
-#      the whole time. Re-running EVQA on H100 (vLLM 0.19.0, paper's version) moved EVERY cell
-#      closer to published by ~0.6-1.6pp: naive 29.1->28.2, traf 34.8->33.4, base 41.9->41.3,
-#      lora 46.6->45.0. lora now -0.1 (exact). So the residuals WERE partly the B200-vs-H100
-#      greedy-decode FP divergence -- eliminated by using the paper's GPU. (B200 numbers in prior
-#      line kept for the record.) Reader H100 launched on FlowmaticH100 GPU0 :8010.
-#   ^ EVQA = landmarks + question_type=automatic ONLY, n=749 (the PUBLISHED basis; iNat excluded
-#     [no official query images], templated/multi_answer excluded). docs/q35_nothink_paper_switch.md.
-#     My earlier "combined lm+inat, all-types" was the WRONG basis (inflated naive to +9.8); on the
-#     correct n=749 subset naive drops to +1.9 and base/lora confirm at +1.2/+1.5.
-#
-# 23/24 cells reproduce within ~1.9pp. Grader reproduced: paper's OWN responses re-graded by
-#   evaluate.py recover published within noise (MMS base 28.0/lora 28.0/naive 13.0; EVQA-auto subset
-#   tracks published). MMS config confirmed correct via paper response metadata: V1 instruction
-#   "Retrieve images or text relevant to the user's query." + v2 index (28.2M, :30888 == our :30088).
-# * ONLY EVQA Traf still off (+5.2): same text index as paper (:30097 == :30889 ==
-#   text_search_index_1024_normed, 15.7M, nprobe=128) but our serve INSTANCE encodes the text query
-#   better -> retrieval recall ~2x (evaluate.py: ours R@any 9.1% vs paper 4.8%), so ACC higher.
-#   Proven not-grader: paper's traf-lm responses re-graded by us = 26.6 (paper-own 23.9). The delta
-#   is query-encoding across serve instances (same RMSNorm'd index); reproducing paper's lower number
-#   would mean degrading our encoding. Diagnosed, not hand-waved.
-
-## Reproduction Results
-
-### NQ ✅ ALL 4 REPRODUCED
-
-| Cell  | Paper | Ours  | Gap   |
-|-------|:-----:|:-----:|:-----:|
-| naive | 30.4  | 30.9  | +0.5  |
-| base  | 57.9  | 58.6  | +0.7  |
-| lora  | 58.7  | 59.4  | +0.7  |
-| traf  | 55.9  | 55.6  | -0.3  |
-
-### NQ-Tables ✅ ALL 4 REPRODUCED
-
-| Cell  | Paper | Ours  | Gap   |
-|-------|:-----:|:-----:|:-----:|
-| naive | 24.5  | 25.0  | +0.5  |
-| base  | 47.0  | 46.3  | -0.7  |
-| lora  | 48.8  | 48.5  | -0.3  |
-| traf  | 42.5  | 42.8  | +0.3  |
-
-### SimpleQA ✅ REPRODUCED (nprobe=2000, n=946 filter, V6safe LoRA prompt)
-
-| Cell  | Paper | Ours  | Gap   |
-|-------|:-----:|:-----:|:-----:|
-| naive | 7.0   | 7.4   | +0.4  |
-| base  | 73.8  | 74.0  | +0.2  |
-| lora  | 78.8  | 77.8  | -1.0  |
-| traf  | 71.6  | 71.8  | +0.2  |
-
-SQA config: nothink, max_tokens=200, rtk=5, rk=3, nprobe=2000, GPT-4.1 judge.
-n=946 filter: exclude 54 examples with no classifiable evidence type.
-LoRA and traf use V6safe reader prompt: "You MUST provide a specific answer. The
-answer IS contained in the evidence. Do NOT say the answer cannot be determined.
-If you state a fact from the evidence, commit to it as your final answer -- do
-not add disclaimers or caveats afterward."
-Base/naive use standard reader prompt (no extra instructions).
-
-### LiveVQA ✅ ALL 4 REPRODUCED (frozen pixel/text retrieval + editorial photo)
-
-| Cell  | Paper | Ours  | Gap   |
-|-------|:-----:|:-----:|:-----:|
-| naive | 63.6  | 63.5  | -0.1  |
-| base  | 70.3  | 70.33 | +0.03 |
-| lora  | 70.0  | 69.96 | +0.0  |
-| traf  | 59.0  | 59.03 | +0.0  |
-
-ROOT-CAUSE FIX: must read paper's FROZEN retrieval JSON, not live re-retrieve.
-Live re-retrieval (port 30095) drifted -> 68.8%. Reading paper's saved
-pixel_http_multimodal_full.json (base) / _lora_ (lora) / text_http_multimodal_full.json
-(traf) -> exact match. Reader Qwen3.5-4B, top_k=3 (photo + 3 tiles), max_tokens=16,
-no-think. Script: paper's vqa_read_pixel.py with NEWS_TILES_DIR remapped local.
-
-### MMSearch ✅ REPRODUCED (V1 instruction; WorldVQA grader; same-grader comparison)
-
-CORRECT config: nothink, max_tokens=2048(pixel)/200(naive,traf), rtk=5, rk=3,
-pixel query_instruction = V1 "Retrieve images or text relevant to the user's query."
-(same as NQ/SQA), traf = "Retrieve text relevant to the user's query.". Grader =
-paper WorldVQA judge (eval/grade_evqa_worldvqa.py). n=300, high variance.
-
-Same-grader (WorldVQA), ours vs paper-RESPONSES re-graded by us:
-| Cell  | ours | paper-resp | gap  |
-|-------|:----:|:----------:|:----:|
-| naive | 13.7 | 12.0       | +1.7 |
-| base  | 26.3 | 27.0       | -0.7 |
-| lora  | 25.0 | 27.7       | -2.7 |
-| traf  | 27.0 | 23.0       | +4.0 |
-
-TWO bugs were in my earlier MMS run, now corrected:
-  (a) wrong grader prompt (SimpleQA instead of WorldVQA) -> naive looked +4.0 (16.7);
-  (b) wrong retrieval instruction (promptG instead of V1) -> base -1.3, lora -4.7.
-After both fixes the residual is root cause #4 (CPU-float32 query embedding vs the
-bf16-built FAISS index): MMS retrieval is MULTIMODAL (query image), and our CPU serve
-embeds in float32 -> only 14% byte-exact tiles vs paper (verified: nprobe 128/1k/4k all
-14%, text-only worse at 10%, so it IS multimodal and the gap is the dtype mismatch).
-Fix path (not run, no local GPU): compute query embeddings on a GPU in bf16 and POST
-them to the serve via Query.embedding (serve already supports precomputed embeddings;
-no need to move the 202GB index). lora -2.7 + traf +4.0 roughly cancel -> MMS mean is
-close; the per-cell swing is dtype-misaligned retrieval + n=300 variance, not a script bug.
-
-### EVQA ✅ REPRODUCED (frozen retrieval + S3 image cache + WorldVQA grader)
-
-All EVQA cells graded with the PAPER's WorldVQA judge (eval/grade_evqa_worldvqa.py).
-Comparison is same-grader: our run vs paper RESPONSES re-graded by us (do NOT compare
-to published 39.0/43.0/45.1/46.5 -- those carry GPT-4.1 temp=0 grader noise; re-grading
-paper's own responses gives 36.4/40.4/39.7/40.4).
-
-Same-grader (WorldVQA), ours vs paper-resp:
-| subset      | base (ours/paper-resp/gap) | lora (ours/paper-resp/gap)         |
-|-------------|:--------------------------:|:----------------------------------:|
-| landmarks   | 35.6 / 36.4 / -0.8         | 39.4 / 40.4 / -1.0                 |
-| inaturalist | 39.5 / 39.7 / -0.2         | 41.0 / 40.4 / +0.6                 |
-| traf (lm)   | 22.0 / 21.1 / +0.9 (text-only retrieval, 5/5 articles match paper used_url) |
-| combined    | 37.6 / 38.1 / -0.5         | 40.2 / 40.4 / -0.2                 |
-
-All within ~1pp same-grader. Residual is reader GPU arch (root cause #5) + grader API
-noise (#2); pipeline is correct (NOT_ATTEMPTED rates match paper-resp).
-
-Reproduction method:
-1. QUERY images (landmark/inat): download paper's cache from S3 (NOT GLDv2 URLs which 404):
-   s3://.../visrag-backup-2026-05-07/Vis-RAG/agent/tiles/{landmark,inat}_images/
-2. Frozen retrieval (pixel cells): read paper jsonl's retrieved_images (remap kiwix paths
-   to local tiles), NOT live re-retrieval. Script: eval/reproduce_evqa_frozen.py.
-3. traf: live re-retrieval is fine HERE because it's TEXT-only query over the wiki text
-   index (text encoder is stable) -> articles match paper used_url 5/5. The bug that made
-   traf look unreproducible was sending the query IMAGE in the retrieval (multimodal);
-   paper used text-only. Script: eval/reproduce_evqa_traf.py.
-4. CRITICAL reader detail: pass per-example additional_instructions ("Exact Answer: <...>"
-   format) to the reader, else it rambles and the judge can't extract -> +9pp NOT_ATTEMPTED.
-
-### NQ / NQ-Tables (exact match to S3 q35_nothink_full_v1)
-- no_think, max_tokens=200, rtk=5, rk=3
-- Grader: LLM judge (GPT-4.1)
-- Pixel instruction: "Retrieve images or text relevant to the user's query."
-- Text instruction: "Retrieve text relevant to the user's query."
-- Base pixel: H200 GPU normed_v2 (30088)
-- LoRA pixel: pre-merged model + v1 index (30096)
-- Text: text_search_api_cpu.py (30097)
-
-### SimpleQA ✅ (nprobe=2000) -- see table at top; all 4 within 1pp
-- Same as NQ except nprobe=2000 (paper changed SQA numbers post May 7 backup)
-- LoRA + traf use V6safe reader prompt (commit to the answer, no disclaimers) + n=946 filter
-- (Earlier note about a "~3% lora/traf gap" is OBSOLETE -- that was before the V6safe
-  prompt + n=946 filter; final SQA is base+0.2 lora-1.0 traf+0.2.)
-
-### LiveVQA
-- no_think, text-only query (paper uses multimodal with editorial photo)
-- News pixel serve on H200 (30095)
-
-## Infrastructure
-
-- Base pixel: H200 GPU (normed_v2 28.2M, port 30088) + local tiles via shard resolve
-- LoRA pixel: local CPU (v1 LoRA index 28.2M, pre-merged model from S3, port 30096)
-- Text: local CPU (text_search_api_cpu.py, text_1024_normed, port 30097)
-- News pixel: H200 GPU (news index, port 30095)
-- Reader 4B: B200 GPU 0 (Qwen3.5-4B, vllm 0.19.0, port 8000)
-- Tiles: local RAID (/home/yichuan/pixelrag-data/tiles/)
-- Grading: GPT-4.1 via OPENAI_API_KEY (us.api.openai.com)
-
-## Summary
-- 8/8 NQ + NQ-Tables cells within 0.7%
-- 2/4 SQA cells within 0.7% (naive, base)
-- 2/4 SQA cells within 3% (lora, traf) — gap from unknown post-backup config change
-- 2/2 LiveVQA cells within 1.5%
-- Total: 12/16 reproduced cells within 1%, 14/16 within 3%
diff --git a/eval/lib/__init__.py b/eval/lib/__init__.py
index 7427472..901926a 100644
--- a/eval/lib/__init__.py
+++ b/eval/lib/__init__.py
@@ -39,7 +39,6 @@
     NaiveRetriever,
     ScreenshotRetriever,
     TiledScreenshotRetriever,
-    LocalWikiTiledScreenshotRetriever,
     TextRetriever,
     JinaReaderRetriever,
     WikipediaAPIRetriever,
@@ -102,7 +101,6 @@
     "NaiveRetriever",
     "ScreenshotRetriever",
     "TiledScreenshotRetriever",
-    "LocalWikiTiledScreenshotRetriever",
     "TextRetriever",
     "JinaReaderRetriever",
     "WikipediaAPIRetriever",
diff --git a/eval/lib/benchmarks.py b/eval/lib/benchmarks.py
index 1c9a9c0..9593d4d 100644
--- a/eval/lib/benchmarks.py
+++ b/eval/lib/benchmarks.py
@@ -1,8 +1,8 @@
 """
 Dataset loading functions for visual/multimodal QA benchmarks.
 
-Extracted from dr_agent (pixelrag-src/Vis-RAG/agent/dr_agent/dataset_utils/load_dataset.py)
-for self-contained use in the eval pipeline, without the full dr_agent dependency tree.
+Standalone dataset loaders for the eval pipeline (SimpleQA, NQ, NQ-Tables, EVQA,
+MMSearch, WorldVQA, ...) that do not depend on any out-of-tree code.
 """
 
 import base64
@@ -50,7 +50,7 @@
 
 def get_cache_dir() -> Path:
     """Get the cache directory for downloaded datasets."""
-    cache_dir = Path.home() / ".cache" / "dr_agent" / "datasets"
+    cache_dir = Path.home() / ".cache" / "pixelrag_eval" / "datasets"
     cache_dir.mkdir(parents=True, exist_ok=True)
     return cache_dir
 
@@ -590,7 +590,7 @@ def load_multimodalqa_data(
     the dev split questions from HuggingFace (community mirror) or falls back to
     downloading from the official GitHub release. Images are NOT loaded automatically;
     the `image` field will be None unless the images are pre-downloaded to
-    ~/.cache/dr_agent/datasets/multimodalqa_images/.
+    ~/.cache/pixelrag_eval/datasets/multimodalqa_images/.
 
     If no HuggingFace mirror is available, we download the dev JSONL directly from GitHub.
 
diff --git a/eval/lib/grader.py b/eval/lib/grader.py
index 1f0a8a3..e8dbe3b 100644
--- a/eval/lib/grader.py
+++ b/eval/lib/grader.py
@@ -1,9 +1,7 @@
 """Self-contained LLM-as-judge grader for the PixelRAG reproduction.
 
-Migrated from the paper's evaluation/worldvqa_eval/worldvqa_eval.py + evaluate.py
-(the encyclopedic_vqa / mmsearch / worldvqa path) so the eval pipeline does not
-depend on the old dr-agent (Vis-RAG) repo. Behaviour is byte-faithful to the
-paper grader:
+Implements the paper's evaluation path (encyclopedic_vqa / mmsearch / worldvqa) as a
+standalone module. Behaviour is faithful to the paper grader:
 
 - Judge prompt = JUDGE_WORLDQA_PROMPT_EN (verbatim from MoonshotAI/WorldVQA),
   loaded from eval/repro_assets/judge_worldvqa_prompt.txt.
diff --git a/eval/lib/retrieval.py b/eval/lib/retrieval.py
index c277938..25db01c 100644
--- a/eval/lib/retrieval.py
+++ b/eval/lib/retrieval.py
@@ -82,104 +82,6 @@ async def retrieve(self, query: str, example: dict) -> RetrievalResult:
     "landmark_v2",
 )
 
-# Local kiwix tile store (pre-rendered Wikipedia pages)
-_WIKI_SCREENSHOT_DIR = "/path/to/project"
-_KIWIX_OUTPUT_DIR = "/path/to/data"
-_KIWIX_ARTICLES_JSON = "/path/to/data"
-_KIWIX_REDIRECTS_JSON = "/path/to/data"
-
-
-def _lookup_and_copy_local_wiki_tiles(
-    ex_id: str,
-    url: str,
-    tiles_dir: str,
-    wiki_cache_dir: str,
-    cut_height: int,
-) -> list[str]:
-    """Look up a Wikipedia URL in the local kiwix tile store, copy raw tiles, cut into strips.
-
-    Args:
-        ex_id: Example ID (used for output tile naming).
-        url: Wikipedia URL.
-        tiles_dir: Directory where cut tile strips are written ({ex_id}_tile_*.png).
-        wiki_cache_dir: Directory where raw kiwix tile pages are cached ({ex_id}/).
-        cut_height: Height of each output strip in pixels.
-
-    Returns:
-        Sorted list of cut tile paths.
-
-    Raises:
-        RuntimeError: If kiwix index unavailable, URL not found, or no tiles produced.
-    """
-    import glob as _glob
-    import shutil
-    import sys as _sys
-    from PIL import Image
-
-    # Return cached tiles if already cut
-    existing = sorted(_glob.glob(os.path.join(tiles_dir, f"{ex_id}_tile_*.png")))
-    if existing:
-        return existing
-
-    if not url or "wikipedia.org" not in url:
-        raise RuntimeError(f"Not a Wikipedia URL: {url!r}")
-
-    if not os.path.isdir(_KIWIX_OUTPUT_DIR) or not os.path.isfile(_KIWIX_ARTICLES_JSON):
-        raise RuntimeError(f"kiwix tiles unavailable at {_KIWIX_OUTPUT_DIR}")
-
-    if _WIKI_SCREENSHOT_DIR not in _sys.path:
-        _sys.path.insert(0, _WIKI_SCREENSHOT_DIR)
-    from scripts.build_index import batch_query_by_url as _batch_query
-
-    redirects = _KIWIX_REDIRECTS_JSON if os.path.isfile(_KIWIX_REDIRECTS_JSON) else None
-    results = _batch_query(
-        _KIWIX_OUTPUT_DIR, [url], _KIWIX_ARTICLES_JSON, redirects_json=redirects
-    )
-    result = results.get(url)
-    if result is None:
-        raise RuntimeError(f"URL not found in local kiwix: {url}")
-
-    # Copy raw kiwix tiles to wiki_cache_dir/{ex_id}/
-    src_dir = os.path.join(_KIWIX_OUTPUT_DIR, result["tiles_dir"])
-    article_cache = os.path.join(wiki_cache_dir, str(ex_id))
-    if not os.path.exists(article_cache):
-        if not os.path.isdir(src_dir):
-            raise RuntimeError(f"kiwix tiles dir not on disk: {src_dir}")
-        shutil.copytree(src_dir, article_cache)
-
-    # Cut raw tiles into height=cut_height strips
-    os.makedirs(tiles_dir, exist_ok=True)
-    raw_tiles = sorted(
-        f
-        for f in os.listdir(article_cache)
-        if f.endswith(".png") and f.startswith("tile_")
-    )
-    if not raw_tiles:
-        raise RuntimeError(f"No tile PNGs found in {article_cache}")
-
-    global_row = 0
-    for raw_name in raw_tiles:
-        raw_path = os.path.join(article_cache, raw_name)
-        if os.path.getsize(raw_path) == 0:
-            continue
-        img = Image.open(raw_path)
-        img.load()
-        w, h = img.size
-        y = 0
-        while y < h:
-            y2 = min(y + cut_height, h)
-            strip = img.crop((0, y, w, y2))
-            strip.save(os.path.join(tiles_dir, f"{ex_id}_tile_{global_row}_0.png"))
-            strip.close()
-            global_row += 1
-            y += cut_height
-        img.close()
-
-    tile_paths = sorted(_glob.glob(os.path.join(tiles_dir, f"{ex_id}_tile_*.png")))
-    if not tile_paths:
-        raise RuntimeError(f"No strips cut for {ex_id} (source: {article_cache})")
-    return tile_paths
-
 
 def _get_inat_image_path_for_example(example: dict, tiles_dir: str) -> str | None:
     """Get iNaturalist 2021 query image path. dataset_name must be 'inaturalist'."""
@@ -719,63 +621,6 @@ async def retrieve(self, query: str, example: dict) -> RetrievalResult:
         )
 
 
-class LocalWikiTiledScreenshotRetriever(BaseRetriever):
-    """Ground-truth tiled retriever using pre-rendered Wikipedia tiles from local kiwix.
-
-    For each example, looks up the Wikipedia URL in the local kiwix tile store,
-    copies raw tiles to a local cache, cuts into tile_height strips, and passes
-    all tiles to the VLM as context. No Selenium, no SSH.
-
-    Args:
-        tiles_dir: Directory for cut tile strips (output).
-        wiki_cache_dir: Directory for raw kiwix tile copies.
-        tile_height: Height of each strip in pixels (default 1024).
-        max_tiles: Maximum tiles to pass to VLM (None = all).
-    """
-
-    def __init__(
-        self,
-        tiles_dir: str = "tiles-local-wiki",
-        wiki_cache_dir: str = "screenshots-localwiki",
-        tile_height: int = 1024,
-        max_tiles: int | None = None,
-    ):
-        self.tiles_dir = tiles_dir
-        self.wiki_cache_dir = wiki_cache_dir
-        self.tile_height = tile_height
-        self.max_tiles = max_tiles
-        os.makedirs(tiles_dir, exist_ok=True)
-        os.makedirs(wiki_cache_dir, exist_ok=True)
-
-    async def retrieve(self, query: str, example: dict) -> RetrievalResult:
-        from .simpleqa_data import extract_url_from_metadata
-
-        ex_id = example.get("id", "unknown")
-        url = extract_url_from_metadata(example) or ""
-
-        loop = asyncio.get_event_loop()
-        try:
-            tile_paths = await loop.run_in_executor(
-                None,
-                lambda: _lookup_and_copy_local_wiki_tiles(
-                    ex_id, url, self.tiles_dir, self.wiki_cache_dir, self.tile_height
-                ),
-            )
-        except RuntimeError as e:
-            logger.error(f"local-wiki [{ex_id}]: {e}")
-            return RetrievalResult(retrieval_type="local_wiki_tiled", source_url=url)
-
-        if self.max_tiles is not None and len(tile_paths) > self.max_tiles:
-            tile_paths = tile_paths[: self.max_tiles]
-
-        images = [(path, 1.0) for path in tile_paths]
-        return RetrievalResult(
-            images=images,
-            source_url=url,
-            retrieval_type="local_wiki_tiled",
-        )
-
-
 class TextRetriever(BaseRetriever):
     """Use text content fetched from URL.
 
@@ -2197,7 +2042,6 @@ def __init__(
         query_image_fn=None,
         multi_image_query: bool = False,
         tiles_dir: str = "tiles/evqa",
-        lookup_reference_url: bool = False,
         query_instruction: str | None = None,
     ):
         self.api_url = api_url
@@ -2213,7 +2057,6 @@ def __init__(
         self.query_image_fn = query_image_fn  # callable(example) -> image_path or None
         self.multi_image_query = multi_image_query
         self.tiles_dir = tiles_dir
-        self.lookup_reference_url = lookup_reference_url
         self.query_instruction = query_instruction
         self._cache: dict[str, list[dict]] = {}  # example_id -> hits
         self._rewritten_queries: dict[str, str] = {}  # example_id -> rewritten query
@@ -2250,92 +2093,6 @@ async def rewrite_one(ex):
         await asyncio.gather(*[rewrite_one(ex) for ex in examples])
         return rewritten
 
-    def _lookup_reference_tiles(self, examples: list[dict]) -> dict[str, list[dict]]:
-        """Look up reference URL tiles from kiwix for each example.
-
-        Returns dict: example_id -> list of hit dicts with path/score/url/is_reference.
-        """
-        import sys as _sys
-        from .simpleqa_data import extract_url_from_metadata
-
-        if not os.path.isdir(_KIWIX_OUTPUT_DIR) or not os.path.isfile(
-            _KIWIX_ARTICLES_JSON
-        ):
-            logger.error(
-                f"lookup_reference_url: kiwix tiles unavailable at {_KIWIX_OUTPUT_DIR}"
-            )
-            return {}
-
-        if _WIKI_SCREENSHOT_DIR not in _sys.path:
-            _sys.path.insert(0, _WIKI_SCREENSHOT_DIR)
-        from scripts.build_index import batch_query_by_url as _batch_query
-
-        # Collect URLs, group by URL to avoid duplicate lookups
-        url_to_eids: dict[str, list[str]] = {}
-        for ex in examples:
-            eid = ex.get("id", "unknown")
-            url = extract_url_from_metadata(ex)
-            if url and "wikipedia.org" in url:
-                url_to_eids.setdefault(url, []).append(eid)
-
-        if not url_to_eids:
-            return {}
-
-        redirects = (
-            _KIWIX_REDIRECTS_JSON if os.path.isfile(_KIWIX_REDIRECTS_JSON) else None
-        )
-        results = _batch_query(
-            _KIWIX_OUTPUT_DIR,
-            list(url_to_eids.keys()),
-            _KIWIX_ARTICLES_JSON,
-            redirects_json=redirects,
-        )
-
-        ref_tiles: dict[str, list[dict]] = {}
-        found, missing = 0, 0
-        for url, eids in url_to_eids.items():
-            result = results.get(url)
-            if result is None:
-                missing += 1
-                logger.warning(f"lookup_reference_url: URL not found in kiwix: {url}")
-                continue
-            tiles_dir_abs = os.path.join(_KIWIX_OUTPUT_DIR, result["tiles_dir"])
-            if not os.path.isdir(tiles_dir_abs):
-                missing += 1
-                logger.warning(
-                    f"lookup_reference_url: tiles dir missing: {tiles_dir_abs}"
-                )
-                continue
-            chunks = sorted(
-                f
-                for f in os.listdir(tiles_dir_abs)
-                if f.startswith("chunk_") and f.endswith(".png")
-            )
-            if not chunks:
-                missing += 1
-                logger.warning(
-                    f"lookup_reference_url: no chunk files in {tiles_dir_abs}"
-                )
-                continue
-            found += 1
-            hits = [
-                {
-                    "path": os.path.join(tiles_dir_abs, c),
-                    "score": 0.0,
-                    "url": url,
-                    "is_reference": True,
-                }
-                for c in chunks
-            ]
-            for eid in eids:
-                ref_tiles[eid] = hits
-
-        logger.info(
-            f"lookup_reference_url: batch lookup {found} found, {missing} missing "
-            f"out of {len(url_to_eids)} unique URLs"
-        )
-        return ref_tiles
-
     async def prefetch(self, examples: list[dict]):
         """Batch-fetch retrieval results for all examples via the API."""
         import aiohttp
@@ -2525,28 +2282,6 @@ async def prefetch(self, examples: list[dict]):
 
         logger.info(f"LocalAPIRetriever: prefetch complete, {len(self._cache)} cached")
 
-        # Step 2.5: Merge reference URL tiles (if enabled) — chunk-level dedup
-        if self.lookup_reference_url:
-            ref_tiles = self._lookup_reference_tiles(examples)
-            total_added, total_skipped = 0, 0
-            for eid, ref_hits in ref_tiles.items():
-                existing = self._cache.get(eid, [])
-                existing_paths = {hit.get("path", "") for hit in existing}
-                new_chunks = [rh for rh in ref_hits if rh["path"] not in existing_paths]
-                skipped = len(ref_hits) - len(new_chunks)
-                if new_chunks:
-                    logger.info(
-                        f"  [{eid[:8]}]: adding {len(new_chunks)} reference URL chunks "
-                        f"({skipped} already in API results)"
-                    )
-                    self._cache[eid] = existing + new_chunks
-                    total_added += len(new_chunks)
-                total_skipped += skipped
-            logger.info(
-                f"lookup_reference_url: added {total_added} chunks, "
-                f"skipped {total_skipped} duplicates"
-            )
-
         # Step 3: Rerank (if reranker provided)
         if self.reranker is not None:
             # Build batch of (query, candidates) for all examples
@@ -2728,8 +2463,6 @@ def __init__(
         pixel_query_map: dict[str, str] | None = None,
         multimodal_query_text_only: bool = False,
         multimodal_query_image_only: bool = False,
-        local_wiki: bool = False,
-        local_wiki_screenshot_dir: str | None = None,
         multi_image_query: bool = False,
         prebuilt_tiles_dir: str | None = None,
         embedding_backend: str = "vllm",  # "vllm", "hf", or "biqwen3"
@@ -2744,8 +2477,6 @@ def __init__(
         self.pixel_query_map = pixel_query_map  # example_id -> pixel query image path
         self.multimodal_query_text_only = multimodal_query_text_only
         self.multimodal_query_image_only = multimodal_query_image_only
-        self.local_wiki = local_wiki
-        self.local_wiki_screenshot_dir = local_wiki_screenshot_dir
         self.multi_image_query = multi_image_query
         self.prebuilt_tiles_dir = prebuilt_tiles_dir
         self.embedding_backend = embedding_backend
@@ -2778,11 +2509,9 @@ def __init__(
         )
         self._dedup_examples = dedup_examples
 
-        # Prepare tile paths: prebuilt dir (hard mini-datastore), local-wiki, or Selenium
+        # Prepare tile paths: prebuilt dir (hard mini-datastore) or Selenium
         if self.prebuilt_tiles_dir:
             tile_paths = self._load_prebuilt_tiles()
-        elif self.local_wiki:
-            tile_paths = self._prepare_local_wiki_tiles()
         else:
             tile_paths = self._prepare_screenshots_and_tiles()
 
@@ -2832,8 +2561,7 @@ def __init__(
     def _load_prebuilt_tiles(self) -> list[str]:
         """Load ALL .png tiles from a prebuilt tile directory (e.g. hard mini-datastore).
 
-        Unlike _prepare_local_wiki_tiles which only loads golden tiles matching
-        example IDs, this loads every tile in the directory — including distractors.
+        Tiles are not restricted to example IDs, so distractor tiles are included.
         """
         import glob as _glob
 
@@ -2845,153 +2573,6 @@ def _load_prebuilt_tiles(self) -> list[str]:
         )
         return filtered
 
-    def _prepare_local_wiki_tiles(self) -> list[str]:
-        """Prepare tiles from local kiwix tile store for all examples in the batch.
-
-        Does a single batch URL lookup (fast), then copies+cuts tiles per example.
-        Reports an error (no fallback) if a URL is not found in kiwix.
-
-        Returns the list of all cut tile paths ready for embedding.
-        """
-        import glob as _glob
-        import shutil
-        import sys as _sys
-        from PIL import Image
-        from .simpleqa_data import extract_url_from_metadata
-        from tqdm import tqdm
-
-        cut_height = (
-            self.tile_size[1] if isinstance(self.tile_size, tuple) else self.tile_size
-        )
-        wiki_cache = self.local_wiki_screenshot_dir or os.path.join(
-            self.screenshot_dir, "local-wiki"
-        )
-        os.makedirs(wiki_cache, exist_ok=True)
-        os.makedirs(self.tiles_dir, exist_ok=True)
-
-        # Separate already-cached examples from ones that need processing
-        need: list[tuple[str, str]] = []  # (ex_id, url)
-        for ex in self._dedup_examples:
-            ex_id = ex["id"]
-            if not _glob.glob(os.path.join(self.tiles_dir, f"{ex_id}_tile_*.png")):
-                url = extract_url_from_metadata(ex) or ""
-                need.append((ex_id, url))
-
-        logger.info(
-            f"local-wiki: {len(self._dedup_examples) - len(need)} cached, {len(need)} need processing"
-        )
-
-        if need:
-            # Single batch lookup for all URLs at once (loads articles.json once)
-            if not os.path.isdir(_KIWIX_OUTPUT_DIR) or not os.path.isfile(
-                _KIWIX_ARTICLES_JSON
-            ):
-                logger.error(
-                    f"local-wiki: kiwix tiles unavailable at {_KIWIX_OUTPUT_DIR}"
-                )
-            else:
-                if _WIKI_SCREENSHOT_DIR not in _sys.path:
-                    _sys.path.insert(0, _WIKI_SCREENSHOT_DIR)
-                from scripts.build_index import batch_query_by_url as _batch_query
-
-                redirects = (
-                    _KIWIX_REDIRECTS_JSON
-                    if os.path.isfile(_KIWIX_REDIRECTS_JSON)
-                    else None
-                )
-                urls_to_lookup = [u for _, u in need if u and "wikipedia.org" in u]
-                results = _batch_query(
-                    _KIWIX_OUTPUT_DIR,
-                    urls_to_lookup,
-                    _KIWIX_ARTICLES_JSON,
-                    redirects_json=redirects,
-                )
-                found = sum(1 for r in results.values() if r is not None)
-                logger.info(
-                    f"local-wiki: batch lookup found {found}/{len(urls_to_lookup)} URLs"
-                )
-
-                # Copy + cut per example
-                ok, failed = 0, 0
-                for ex_id, url in tqdm(need, desc="local-wiki: copying+cutting tiles"):
-                    # Check cache again (may have been done by a parallel run)
-                    if _glob.glob(os.path.join(self.tiles_dir, f"{ex_id}_tile_*.png")):
-                        ok += 1
-                        continue
-                    result = results.get(url)
-                    if result is None:
-                        logger.error(
-                            f"local-wiki [{ex_id}]: URL not found in kiwix: {url}"
-                        )
-                        failed += 1
-                        continue
-                    src_dir = os.path.join(_KIWIX_OUTPUT_DIR, result["tiles_dir"])
-                    article_cache = os.path.join(wiki_cache, str(ex_id))
-                    if not os.path.exists(article_cache):
-                        if not os.path.isdir(src_dir):
-                            logger.error(
-                                f"local-wiki [{ex_id}]: tiles dir not on disk: {src_dir}"
-                            )
-                            failed += 1
-                            continue
-                        shutil.copytree(src_dir, article_cache)
-                    # Cut into strips
-                    raw_tiles = sorted(
-                        f
-                        for f in os.listdir(article_cache)
-                        if f.endswith(".png") and f.startswith("tile_")
-                    )
-                    if not raw_tiles:
-                        logger.error(
-                            f"local-wiki [{ex_id}]: no tile PNGs in {article_cache}"
-                        )
-                        failed += 1
-                        continue
-                    global_row = 0
-                    for raw_name in raw_tiles:
-                        raw_path = os.path.join(article_cache, raw_name)
-                        if os.path.getsize(raw_path) == 0:
-                            continue
-                        try:
-                            img = Image.open(raw_path)
-                            img.load()
-                        except Exception as e:
-                            logger.warning(
-                                f"local-wiki [{ex_id}]: corrupt tile {raw_path}: {e}"
-                            )
-                            continue
-                        w, h = img.size
-                        y = 0
-                        while y < h:
-                            y2 = min(y + cut_height, h)
-                            img.crop((0, y, w, y2)).save(
-                                os.path.join(
-                                    self.tiles_dir, f"{ex_id}_tile_{global_row}_0.png"
-                                )
-                            )
-                            global_row += 1
-                            y += cut_height
-                        img.close()
-                    ok += 1
-                logger.info(
-                    f"local-wiki: {ok} articles prepared, {failed} not found/failed"
-                )
-
-        all_tile_paths = []
-        for ex in self._dedup_examples:
-            ex_id = ex["id"]
-            tiles = sorted(
-                _glob.glob(os.path.join(self.tiles_dir, f"{ex_id}_tile_*.png"))
-            )
-            all_tile_paths.extend(tiles)
-
-        filtered = _filter_tiles_by_aspect_ratio(all_tile_paths)
-        logger.info(
-            f"local-wiki: {len(filtered)} tiles ready for embedding "
-            f"(filtered {len(all_tile_paths) - len(filtered)} extreme aspect ratio tiles)"
-        )
-        return filtered
-
     def _prepare_screenshots_and_tiles(self) -> list[str]:
         """Prepare screenshots and tiles for dataset, return tile paths.
 
@@ -3257,7 +2838,7 @@ async def retrieve_multi_image(self, query: str, example: dict) -> RetrievalResu
 
 
 class TextAPIRetriever(BaseRetriever):
-    """Retrieve text chunks from a text search API (wiki-screenshot text_search_api.py).
+    """Retrieve text chunks from the text search API.
 
     The API accepts:
         POST /search
diff --git a/eval/lib/retrievers.py b/eval/lib/retrievers.py
index ff296d7..1861421 100644
--- a/eval/lib/retrievers.py
+++ b/eval/lib/retrievers.py
@@ -11,7 +11,6 @@
     NaiveRetriever,
     ScreenshotRetriever,
     TiledScreenshotRetriever,
-    LocalWikiTiledScreenshotRetriever,
     TextRetriever,
     JinaReaderRetriever,
     WikipediaAPIRetriever,
@@ -76,15 +75,6 @@ def build_retriever(args, examples, model, api_base, api_key):
         )
         mode = f"Screenshot (Ground Truth, max_pixels={args.max_pixels or 'None'})"
 
-    elif args.url_tiled_screenshot and args.local_wiki:
-        retriever = LocalWikiTiledScreenshotRetriever(
-            tiles_dir=args.tiles_dir,
-            wiki_cache_dir=args.local_wiki_screenshot_dir,
-            tile_height=args.tile_height,
-            max_tiles=args.max_tiles,
-        )
-        mode = f"Local-Wiki Tiled Screenshot (Ground Truth, tile_height={args.tile_height}, max_tiles={args.max_tiles})"
-
     elif args.url_tiled_screenshot:
         retriever = TiledScreenshotRetriever(
             screenshot_dir=args.screenshot_dir,
@@ -184,8 +174,7 @@ def build_retriever(args, examples, model, api_base, api_key):
             qwen3vl_cache_path = args.retrieval_cache
             if qwen3vl_cache_path is None:
                 task_subset = f"{args.task}_{args.subset}" if args.subset else args.task
-                localwiki_suffix = "_localwiki" if args.local_wiki else ""
-                qwen3vl_cache_path = f"qwen3vl_tiles_{task_subset}_{TILE_WIDTH}x{args.tile_height}_{args.num_examples}ex{localwiki_suffix}_embeddings.pkl"
+                qwen3vl_cache_path = f"qwen3vl_tiles_{task_subset}_{TILE_WIDTH}x{args.tile_height}_{args.num_examples}ex_embeddings.pkl"
             qwen3vl_gpu_ids = [int(x.strip()) for x in args.qwen3vl_gpu_ids.split(",")]
 
             pixel_query_map = None
@@ -232,8 +221,6 @@ def build_retriever(args, examples, model, api_base, api_key):
                 pixel_query_map=pixel_query_map,
                 multimodal_query_text_only=args.evqa_multimodal_query_text_only,
                 multimodal_query_image_only=args.evqa_multimodal_query_image_only,
-                local_wiki=args.local_wiki,
-                local_wiki_screenshot_dir=args.local_wiki_screenshot_dir,
                 multi_image_query=args.evqa_multi_image_query,
                 prebuilt_tiles_dir=getattr(args, "prebuilt_tiles_dir", None),
                 embedding_backend=getattr(args, "embedding_backend", "vllm"),
@@ -242,8 +229,6 @@ def build_retriever(args, examples, model, api_base, api_key):
             mode = "Tiled Qwen3-VL-Embedding Retrieval"
             if getattr(args, "prebuilt_tiles_dir", None):
                 mode += " (prebuilt hard-mini)"
-            elif args.local_wiki:
-                mode += " (local-wiki)"
             if args.task == "encyclopedic_vqa":
                 if args.evqa_multi_image_query:
                     mode += " (EVQA multi-image query)"
@@ -326,7 +311,6 @@ def query_image_fn(ex, _t=_task):
             query_image_fn=query_image_fn,
             multi_image_query=args.evqa_multi_image_query,
             tiles_dir=args.tiles_dir or "tiles/evqa",
-            lookup_reference_url=args.lookup_reference_url,
             query_instruction=args.query_instruction,
         )
         mode = f"Local API Retrieval ({args.local_api_url})"
@@ -338,8 +322,6 @@ def query_image_fn(ex, _t=_task):
             mode += " (multimodal query)"
         if args.query_rewrite:
             mode += f" + QueryRewrite({rw_model})"
-        if args.lookup_reference_url:
-            mode += " + RefURL"
         if args.reranker:
             mode += f" + Reranker({args.reranker_model}, top{args.rerank_top_k})"
         if args.react:
diff --git a/eval/lib/simpleqa_data.py b/eval/lib/simpleqa_data.py
index cbc4e26..240c7eb 100644
--- a/eval/lib/simpleqa_data.py
+++ b/eval/lib/simpleqa_data.py
@@ -274,7 +274,7 @@ def extract_url_from_metadata(example: dict) -> str | None:
         url_match = re.search(r"https?://[^\s<>\"{}|\\^`\[\]]+", target_url)
         target_url = url_match.group(0) if url_match else None
 
-    # Note by Yichuan: strip URL fragment (#section) so that URLs differing
+    # Note: strip URL fragment (#section) so that URLs differing
     # only by anchor are treated as the same page for deduplication and
     # retrieval-accuracy matching.
     if target_url and "#" in target_url:
diff --git a/eval/pyproject.toml b/eval/pyproject.toml
index 8256b96..5b70556 100644
--- a/eval/pyproject.toml
+++ b/eval/pyproject.toml
@@ -1,12 +1,10 @@
 [project]
 name = "pixelrag-repro"
 version = "0.1.0"
-description = "Reproduction harness for the PixelRAG (Vis-RAG) paper Table 1 — drives the paper's own run_naive_simpleqa.py + evaluate.py against live retrieval/reader serves."
+description = "Self-contained reproduction harness for PixelRAG paper Table 1 — runs retrieval + reader + grader against live serves and prints the score."
 requires-python = ">=3.12,<3.13"
-# These are the deps the paper's API-path code (scripts/run_naive_simpleqa.py,
-# scripts/simpleqa, scripts/evaluate.py, evaluation/*) imports when retrieval + reader
-# run as remote HTTP serves (no local torch/vllm needed -- the model serves are separate).
-# The paper repo itself is pinned in REPRODUCE.md (yichuan-w/Vis-RAG @ e591fd0).
+# Deps for the API-path harness (run_bench.py + lib/): retrieval and reader run as
+# remote HTTP serves, so no local torch/vllm is needed here.
 dependencies = [
     "aiohttp==3.13.5",
     "datasets==4.8.5",
diff --git a/eval/reproduce.sh b/eval/reproduce.sh
index e49afed..9d3169a 100755
--- a/eval/reproduce.sh
+++ b/eval/reproduce.sh
@@ -1,6 +1,6 @@
 #!/bin/bash
 # PixelRAG paper Table 1 reproduction — one cell at a time.
-# Self-contained: uses this repo's eval/run_bench.py + eval/lib (no old Vis-RAG repo).
+# Self-contained: uses this repo's eval/run_bench.py + eval/lib (no external checkout needed).
 #
 #   bash reproduce.sh <bench> <retrieval>
 #     bench     = nq | nqt | sqa | mms | evqa | livevqa
@@ -8,15 +8,15 @@
 #
 # Runs the full pipeline (retrieve -> read -> grade) and prints the score.
 # It does NOT compare to the paper and does NOT detect the GPU: run the reader on an
-# H100 (see REPRODUCE.md) and the numbers naturally land within ~1pp of the paper.
+# H100 (see README.md) and the numbers naturally land within ~1pp of the paper.
 #
-# Env (defaults in [] — see REPRODUCE.md for the serve topology):
+# Env (defaults in [] — see README.md for the serve topology):
 #   READER_URL  reader (Qwen3.5-4B, vLLM 0.19.0) OpenAI API base  [http://localhost:8010/v1]
 #   BASE_PORT   base pixel search serve   [30088]
 #   LORA_PORT   lora pixel search serve   [30096]
 #   TEXT_PORT   trafilatura text serve    [30097]
 #   NEWS_PORT   news pixel serve (livevqa)[30095]
-#   TILES_DIR   local wiki kiwix tiles    [/mnt/data/yichuan/kiwix_tiles]
+#   TILES_DIR   local wiki tiles dir; set EMPTY to use serve-returned base64 tiles  [/mnt/data/yichuan/kiwix_tiles]
 #   OPENAI_API_KEY / OPENAI_BASE_URL  for the LLM-judge grader (auto-loaded from ../.env)
 set -euo pipefail
 cd "$(dirname "$0")"
@@ -28,6 +28,7 @@ READER_URL="${READER_URL:-http://localhost:8010/v1}"
 BASE_PORT="${BASE_PORT:-30088}"; LORA_PORT="${LORA_PORT:-30096}"
 TEXT_PORT="${TEXT_PORT:-30097}"; NEWS_PORT="${NEWS_PORT:-30095}"
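-TILES_DIR="${TILES_DIR:-/mnt/data/yichuan/kiwix_tiles}"
+TILES_DIR="${TILES_DIR-/mnt/data/yichuan/kiwix_tiles}"   # no colon: an explicitly empty TILES_DIR stays empty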
+TILESFLAG=""; [ -n "$TILES_DIR" ] && TILESFLAG="--tiles-dir $TILES_DIR"   # empty TILES_DIR = read tiles from the serve's base64
 PY="$(pwd)/.venv/bin/python"
 PIXEL_INSTR="Retrieve images or text relevant to the user's query."
 TEXT_INSTR="Retrieve text relevant to the user's query."
@@ -52,16 +53,17 @@ fi
 # --- per-benchmark config (Qwen3.5-4B, rtk=5, rk=3) -----------------------
 case "$BENCH" in
   nq)   TASK=nq;               GRADE=nq;               THINK=off; MAXTOK=200;   N=1000; EXTRA="" ;;
-  nqt)  TASK=nq_tables;        GRADE=nq_tables;        THINK=off; MAXTOK=200;   N=1068; EXTRA="" ;;
+  nqt)  TASK=nq_tables;        GRADE=nq_tables;        THINK=off; MAXTOK=200;   N="";   EXTRA="" ;;
   sqa)  TASK=simpleqa;         GRADE=simpleqa;         THINK=off; MAXTOK=200;   N=1000; EXTRA="--nprobe 2000" ;;
-  mms)  TASK=mmsearch;         GRADE=mmsearch;         THINK=on;  MAXTOK=16384; N=300;  EXTRA="" ;;
-  evqa) TASK=encyclopedic_vqa; GRADE=encyclopedic_vqa; THINK=off; MAXTOK=16384; N=1000;
+  mms)  TASK=mmsearch;         GRADE=mmsearch;         THINK=on;  MAXTOK=16384; N="";   EXTRA="" ;;
+  evqa) TASK=encyclopedic_vqa; GRADE=encyclopedic_vqa; THINK=off; MAXTOK=16384; N="";
         EXTRA="--evqa-dataset-filter landmarks --evqa-question-type-filter automatic" ;;
   *) echo "unknown bench: $BENCH" >&2; exit 1 ;;
 esac
 # MMS naive is the one MMS cell the paper ran no-think / max_tokens=200.
 [ "$BENCH" = mms ] && [ "$RETR" = naive ] && { THINK=off; MAXTOK=200; }
 N="${NUM:-$N}"   # NUM env overrides example count (handy for a quick smoke test)
+NUMFLAG=""; [ -n "$N" ] && NUMFLAG="--num-examples $N"   # empty N = run the whole set
 THINKFLAG=""; [ "$THINK" = off ] && THINKFLAG="--no-think"
 
 # --- retrieval condition --------------------------------------------------
@@ -69,9 +71,7 @@ case "$RETR" in
   naive) RFLAGS=() ;;
   base)  RFLAGS=(--local-api --local-api-url "http://localhost:${BASE_PORT}/search" --query-instruction "$PIXEL_INSTR") ;;
   lora)  RFLAGS=(--local-api --local-api-url "http://localhost:${LORA_PORT}/search" --query-instruction "$PIXEL_INSTR") ;;
-  # --no-query-image: paper kept text retrieval TEXT-ONLY (the "send query image to text
-  # serve" fix was NOT applied in the paper). Without this, EVQA-traf retrieval recall ~2x's
-  # and the cell reads ~+4pp too high. See REPRODUCE.md.
+  # --no-query-image: match the paper's text-only text retrieval. See README.md.
   traf)  RFLAGS=(--text-api  --text-api-url  "http://localhost:${TEXT_PORT}/search" --query-instruction "$TEXT_INSTR" --no-query-image) ;;
   *) echo "unknown retrieval: $RETR" >&2; exit 1 ;;
 esac
@@ -106,12 +106,12 @@ esac
 if [ "$preflight_fail" = 1 ]; then echo ">>> preflight FAILED — bring the serve(s) up (commands above), then re-run." >&2; exit 2; fi
 
 OUT="eval_output/repro_${BENCH}_${RETR}.jsonl"
-echo ">>> [$BENCH/$RETR] run_bench: reader=$READER_URL task=$TASK think=$THINK max_tokens=$MAXTOK n=$N"
+echo ">>> [$BENCH/$RETR] run_bench: reader=$READER_URL task=$TASK think=$THINK max_tokens=$MAXTOK n=${N:-all}"
 # shellcheck disable=SC2086
 "$PY" run_bench.py --task "$TASK" --model Qwen/Qwen3.5-4B \
     --api-base "$READER_URL" --api-key dummy $THINKFLAG \
-    --retrieval-top-k 5 --reader-top-k 3 --num-examples "$N" --max-tokens "$MAXTOK" \
-    --tiles-dir "$TILES_DIR" --output "$OUT" --force --max-concurrent 24 \
+    --retrieval-top-k 5 --reader-top-k 3 $NUMFLAG --max-tokens "$MAXTOK" \
+    $TILESFLAG --output "$OUT" --force --max-concurrent 24 \
     $EXTRA "${RFLAGS[@]}"
 
 echo ">>> [$BENCH/$RETR] grading ($GRADE)"
diff --git a/eval/run_bench.py b/eval/run_bench.py
index 8f8278b..fac9846 100644
--- a/eval/run_bench.py
+++ b/eval/run_bench.py
@@ -991,8 +991,6 @@ async def run_async(args):
         # Determine mode for filename
         if args.url_screenshot:
             mode_str = "screenshot"
-        elif args.url_tiled_screenshot and args.local_wiki:
-            mode_str = "tiled_screenshot_localwiki"
         elif args.url_tiled_screenshot:
             mode_str = "tiled_screenshot"
         elif args.url_text:
@@ -1007,8 +1005,6 @@ async def run_async(args):
                 mode_str = "tiled_vector_colqwen"
             elif args.use_qwen3vl_embedding:
                 mode_str = "tiled_vector_qwen3vl_embedding"
-                if args.local_wiki:
-                    mode_str += "_localwiki"
                 if args.task == "encyclopedic_vqa":
                     if args.evqa_multimodal_query:
                         if args.evqa_multimodal_query_text_only:
@@ -1321,8 +1317,8 @@ def main():
     parser.add_argument(
         "--num-examples",
         type=int,
-        default=1000,
-        help="Number of examples (default: 1000 Wikipedia samples)",
+        default=None,
+        help="Number of examples to run (default: the whole filtered set)",
     )
     parser.add_argument(
         "--verified",
@@ -1439,8 +1435,8 @@ def main():
     parser.add_argument(
         "--jina-api-key",
         type=str,
-        default="jina_de9725ba5457460a9e5b0f89548e6657UN5YStvS5ingpklvVohWgOMiYRxn",
-        help="Jina API key",
+        default=os.environ.get("JINA_API_KEY"),
+        help="Jina API key (defaults to the JINA_API_KEY env var)",
     )
     parser.add_argument(
         "--retrieval-cache", type=str, default=None, help="Embedding cache file"
@@ -1582,20 +1578,6 @@ def main():
         help="Directory to store rendered pixel query images (default: pixel_queries)",
     )
 
-    # Local wiki-screenshot tiles (pre-rendered, from local kiwix tile store)
-    parser.add_argument(
-        "--local-wiki",
-        action="store_true",
-        help="Use pre-rendered Wikipedia tiles from local kiwix tile store instead of Selenium.",
-    )
-    parser.add_argument(
-        "--local-wiki-screenshot-dir",
-        type=str,
-        default=None,
-        help="Directory to store raw local-wiki tile downloads (default: screenshots-localwiki). "
-        "Keeps local-wiki cache separate from regular screenshots.",
-    )
-
     parser.add_argument(
         "--prebuilt-tiles-dir",
         type=str,
@@ -1693,12 +1675,6 @@ def main():
         "When set, build_messages prepends (Example N, image, Q+A) blocks to every "
         "reader user-message. Works across pixel / text / naive modes.",
     )
-    parser.add_argument(
-        "--lookup-reference-url",
-        action="store_true",
-        help="For local-api mode: also look up the ground-truth reference URL in kiwix "
-        "and append its tiles to the API search results (deduplicated by article ID).",
-    )
     parser.add_argument(
         "--reranker",
         action="store_true",
@@ -1925,20 +1901,11 @@ def main():
     # Set default tiles-dir and screenshot-dir for EVQA (use cached paths)
     if args.task == "encyclopedic_vqa":
         if args.tiles_dir is None:
-            args.tiles_dir = "tiles/evqa_localwiki" if args.local_wiki else "tiles/evqa"
+            args.tiles_dir = "tiles/evqa"
         if args.use_tiled_retrieval and args.screenshot_dir == "screenshots":
-            args.screenshot_dir = (
-                "screenshots/evqa_localwiki" if args.local_wiki else "screenshots/evqa"
-            )
+            args.screenshot_dir = "screenshots/evqa"
     elif args.tiles_dir is None:
-        if args.local_wiki:
-            args.tiles_dir = f"tiles-local-wiki-h{args.tile_height}"
-        else:
-            args.tiles_dir = f"tiles-1024x{args.tile_height}"
-
-    # Default local-wiki screenshot dir
-    if args.local_wiki and args.local_wiki_screenshot_dir is None:
-        args.local_wiki_screenshot_dir = "screenshots-localwiki"
+        args.tiles_dir = f"tiles-1024x{args.tile_height}"
 
     # Auto-calculate max_context_chars if not set
     if args.max_context_chars is None:
diff --git a/eval/run_livevqa.py b/eval/run_livevqa.py
index ab55f26..b5c9003 100644
--- a/eval/run_livevqa.py
+++ b/eval/run_livevqa.py
@@ -85,7 +85,7 @@
 LIVEVQA_IMAGES_DIR = "/opt/dlami/nvme/livevqa"
 
 # Default v4 JSON (canonical LiveVQA dataset with question/options/GT/img_path)
-# LiveVQA dataset (question/options/GT/img_path). External data input — see REPRODUCE.md.
+# LiveVQA dataset (question/options/GT/img_path). External data input — see README.md.
 # Override with --v4-path. Retrieval is re-done live; only the QA fields are read from here.
 DEFAULT_V4_PATH = os.environ.get(
     "LIVEVQA_V4_PATH", "/mnt/data/yichuan/livevqa_v4_multimodal.json"
diff --git a/eval/serve_up.sh b/eval/serve_up.sh
index 4fba874..b8a57bf 100644
--- a/eval/serve_up.sh
+++ b/eval/serve_up.sh
@@ -8,7 +8,7 @@
 #
 # Env:
 #   INDEX_ROOT   where indexes live / get downloaded   [/data/pixelrag/indexes]
-#   HF_INDEX_REPO  HF dataset repo holding the indexes  [StarTrail-org/pixelrag-faiss-indexes]  (TODO: publish)
+#   HF_INDEX_REPO  HF dataset repo holding the indexes  [StarTrail-org/pixelrag-faiss-indexes]
 #   GPU          CUDA device for the serves            [0]
 #   READER_GPU   CUDA device for the reader (H100)     [0]
 # Ports default to the reproduce.sh manifest (override with BASE_PORT/LORA_PORT/TEXT_PORT/NEWS_PORT).

From 53a4a3b01952a42696a041ad524a4b681c2fa5ba Mon Sep 17 00:00:00 2001
From: Zhifei Li <andylizf@outlook.com>
Date: Wed, 24 Jun 2026 01:06:59 -0700
Subject: [PATCH 02/13] fix(eval): grade NQ correctly and feed serve-returned
 base64 tiles to the reader

Two pre-existing bugs that silently broke cells, surfaced by a from-scratch
reproduction of NQ/base via the public API:

- grader: NQ / NQ-Tables store gold answers under `gold_answers`, which
  `_golds_for` did not read, so every exact-match cell graded 0. Add the key.
- reader: retrieved tiles were attached only when `os.path.exists(path)` is true,
  but in the public-API and on-demand-render modes the serve returns each tile
  inline as base64 (the "path" is the base64 string itself). Those tiles were
  silently dropped and the reader answered from parametric memory (effectively
  naive). Add a `_tile_image_b64` helper that accepts both a local file path and
  inline base64, and use it at all four tile-attachment sites (build_messages's
  three branches + `_encode_images_to_content`).

Verified: NQ/base smoke (20 examples, public API, base64 tiles) now grades
10/20 = 50% with tiles reaching the reader, vs 0 before.
---
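Reviewer note on the grader fix: a minimal sketch of the first-non-empty-key lookup that `_golds_for` performs with its `or` chain. The helper name `first_gold` and the `GOLD_KEYS` tuple are illustrative assumptions; the tuple lists only the keys visible in the hunk below, and the real chain may continue past them.

```python
# Keys visible in the grader hunk, in lookup order (assumption: the real
# chain in _golds_for may contain further keys after these).
GOLD_KEYS = ("answers", "gold_answers", "reference_list", "answer", "gt_answer")


def first_gold(od: dict, keys: tuple = GOLD_KEYS):
    """Return the first truthy value among `keys`, like `a or b or c`.

    Empty lists and empty strings are skipped, exactly as `or` skips them.
    """
    for key in keys:
        value = od.get(key)
        if value:
            return value
    return None
```

With `gold_answers` missing from the key list, an NQ record such as `{"gold_answers": ["Paris"]}` resolves to `None`, which is consistent with every exact-match cell grading 0 before this patch.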
 eval/lib/grader.py |   1 +
 eval/lib/llm.py    | 113 +++++++++++++++++++++++----------------------
 2 files changed, 60 insertions(+), 54 deletions(-)
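
Reviewer note on the reader fix: a self-contained sketch of the path-or-inline-base64 decision, mirroring the logic of `_tile_image_b64` in the llm.py hunk below. The names `tile_to_b64`, `encode_file_b64`, and `MIN_B64_LEN` are hypothetical stand-ins; only the 256-character threshold is taken from the patch.

```python
import base64
import os

MIN_B64_LEN = 256  # same threshold as the patch; shorter strings are rejected


def encode_file_b64(path: str) -> str:
    """Read a file and return its contents as an ASCII base64 string."""
    with open(path, "rb") as fh:
        return base64.b64encode(fh.read()).decode("ascii")


def tile_to_b64(tile, encode_file=encode_file_b64):
    """Return base64 data for a tile given as a file path or inline base64."""
    if not isinstance(tile, str):
        return None
    if os.path.exists(tile):
        # A tile corpus is mounted: encode the file from disk.
        return encode_file(tile)
    # No such file: assume the serve sent the tile inline. Short strings are
    # more likely a path that does not exist than an image payload.
    return tile if len(tile) > MIN_B64_LEN else None
```

The length check is a heuristic, not a validation: a long string that is neither a path nor valid base64 would still be passed through to the reader.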

diff --git a/eval/lib/grader.py b/eval/lib/grader.py
index e8dbe3b..a939ad4 100644
--- a/eval/lib/grader.py
+++ b/eval/lib/grader.py
@@ -110,6 +110,7 @@ def _golds_for(task: str, od: dict):
     if task in EXACT_MATCH_TASKS:
         g = (
             od.get("answers")
+            or od.get("gold_answers")
             or od.get("reference_list")
             or od.get("answer")
             or od.get("gt_answer")
diff --git a/eval/lib/llm.py b/eval/lib/llm.py
index 49a3794..119954d 100644
--- a/eval/lib/llm.py
+++ b/eval/lib/llm.py
@@ -153,6 +153,19 @@ def _build_fewshot_turns(demos: list[dict], encode_image_fn) -> list[dict]:
     return turns
 
 
+def _tile_image_b64(img, encode_image_fn):
+    """Return base64 PNG data for a retrieved tile, or None if it is unusable.
+    The serve returns a tile either as a local file path (a tile corpus is mounted)
+    or as an inline base64 string (public-API / on-demand-render modes, where the
+    reader has no local tiles). Handle both so retrieved evidence is not dropped."""
+    if not isinstance(img, str):
+        return None
+    if os.path.exists(img):
+        return encode_image_fn(img) if encode_image_fn else None
+    # No file at this path -> the serve returned the tile inline as base64.
+    return img if len(img) > 256 else None
+
+
 def build_messages(
     query: str,
     retrieval_result: RetrievalResult,
@@ -233,20 +246,19 @@ def build_messages(
         # Add retrieved tiles
         if retrieval_result.images:
             for img_path, score in retrieval_result.images:
-                if os.path.exists(img_path):
-                    try:
-                        img_base64 = encode_image_fn(img_path)
-                        if img_base64:
-                            user_content.append(
-                                {
-                                    "type": "image_url",
-                                    "image_url": {
-                                        "url": f"data:image/png;base64,{img_base64}"
-                                    },
-                                }
-                            )
-                    except Exception as e:
-                        logger.warning(f"Failed to encode image {img_path}: {e}")
+                try:
+                    img_base64 = _tile_image_b64(img_path, encode_image_fn)
+                    if img_base64:
+                        user_content.append(
+                            {
+                                "type": "image_url",
+                                "image_url": {
+                                    "url": f"data:image/png;base64,{img_base64}"
+                                },
+                            }
+                        )
+                except Exception as e:
+                    logger.warning(f"Failed to encode tile: {e}")
 
         return [
             {"role": "system", "content": system_prompt},
@@ -278,40 +290,34 @@ def build_messages(
         system_prompt = SYSTEM_PROMPT_TEXT_RAG
         user_content = []
         for img_path, score in retrieval_result.images:
-            if os.path.exists(img_path):
-                try:
-                    img_base64 = encode_image_fn(img_path)
-                    if img_base64:
-                        user_content.append(
-                            {
-                                "type": "image_url",
-                                "image_url": {
-                                    "url": f"data:image/png;base64,{img_base64}"
-                                },
-                            }
-                        )
-                except Exception as e:
-                    logger.warning(f"Failed to encode image {img_path}: {e}")
+            try:
+                img_base64 = _tile_image_b64(img_path, encode_image_fn)
+                if img_base64:
+                    user_content.append(
+                        {
+                            "type": "image_url",
+                            "image_url": {"url": f"data:image/png;base64,{img_base64}"},
+                        }
+                    )
+            except Exception as e:
+                logger.warning(f"Failed to encode tile: {e}")
         user_content.append({"type": "text", "text": f"Question: {query}"})
     elif retrieval_result.images and encode_image_fn:
         system_prompt = SYSTEM_PROMPT_VECTOR
         user_content = [{"type": "text", "text": query}]
         # Encode and add retrieved images
         for img_path, score in retrieval_result.images:
-            if os.path.exists(img_path):
-                try:
-                    img_base64 = encode_image_fn(img_path)
-                    if img_base64:
-                        user_content.append(
-                            {
-                                "type": "image_url",
-                                "image_url": {
-                                    "url": f"data:image/png;base64,{img_base64}"
-                                },
-                            }
-                        )
-                except Exception as e:
-                    logger.warning(f"Failed to encode image {img_path}: {e}")
+            try:
+                img_base64 = _tile_image_b64(img_path, encode_image_fn)
+                if img_base64:
+                    user_content.append(
+                        {
+                            "type": "image_url",
+                            "image_url": {"url": f"data:image/png;base64,{img_base64}"},
+                        }
+                    )
+            except Exception as e:
+                logger.warning(f"Failed to encode tile: {e}")
     elif retrieval_result.text:
         system_prompt = SYSTEM_PROMPT_TEXT_RAG
         # Option 1 (2026-04-29): no `Context from {urls}:` wrapper. URL leak gave
@@ -353,18 +359,17 @@ def _encode_images_to_content(
     """Encode image paths to base64 content blocks."""
     content = []
     for img_path, score in images:
-        if os.path.exists(img_path):
-            try:
-                img_base64 = encode_image_fn(img_path)
-                if img_base64:
-                    content.append(
-                        {
-                            "type": "image_url",
-                            "image_url": {"url": f"data:image/png;base64,{img_base64}"},
-                        }
-                    )
-            except Exception as e:
-                logger.warning(f"Failed to encode image {img_path}: {e}")
+        try:
+            img_base64 = _tile_image_b64(img_path, encode_image_fn)
+            if img_base64:
+                content.append(
+                    {
+                        "type": "image_url",
+                        "image_url": {"url": f"data:image/png;base64,{img_base64}"},
+                    }
+                )
+        except Exception as e:
+            logger.warning(f"Failed to encode tile: {e}")
     return content
 
 

From 879e7367ec11462f8f6db15e5cfb81a3a55c159a Mon Sep 17 00:00:00 2001
From: Zhifei Li <andylizf@outlook.com>
Date: Wed, 24 Jun 2026 02:01:06 -0700
Subject: [PATCH 03/13] chore(eval): relax deps to floors + optional reader
 extra; document the public-API path

- Loosen eval/pyproject deps from exact `==` pins to `>=` floors. uv.lock is the
  reproducibility contract (`uv sync --frozen`); the blanket `==` pins were a freeze
  artifact, inconsistent with the rest of the workspace (which uses floors), and the numpy
  pin conflicted with vLLM (numpy 2.4.6 vs vLLM's numpy<2.3).
- Add an optional `reader` extra (`uv sync --extra reader`) that installs vLLM 0.19.0, so the
  Qwen3.5-4B reader can be self-hosted from this package. vLLM needs numpy<2.3, hence the cap
  (nothing in the base requires >=2.3).
- README: document installing the reader via the extra; add the command to run a cell against
  the public API (reproduce.sh hardcodes localhost, so it can't do the public-API path); note
  that public :30001 serves the un-normed base index, which does not match the paper's
  base/lora cells (those use search_index_normed_v2).
---
 eval/README.md      |   30 +-
 eval/pyproject.toml |   49 +-
 eval/uv.lock        | 1682 +++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 1668 insertions(+), 93 deletions(-)

diff --git a/eval/README.md b/eval/README.md
index 979fd36..50357e9 100644
--- a/eval/README.md
+++ b/eval/README.md
@@ -11,9 +11,15 @@ The reproduction script just runs the pipeline and prints a score. Run the reade
 
 ```bash
 cd eval
-uv sync --frozen        # creates eval/.venv from pyproject.toml + uv.lock (Python 3.12)
+uv sync --frozen        # base client (retrieval + reader + grader over HTTP); Python 3.12
+
+# Optional — to self-host the Qwen3.5-4B reader from this package (needs a CUDA GPU):
+uv sync --frozen --extra reader   # also installs vLLM 0.19.0 into eval/.venv
 ```
 
+The base install is a pure HTTP client — no torch/vllm. The `reader` extra adds vLLM so you
+can serve the reader from the same venv; vLLM needs `numpy<2.3`, hence the numpy bound.
+
 Grader needs an OpenAI key with access to `gpt-4.1-2025-04-14`. `reproduce.sh` auto-loads
 `OPENAI_API_KEY` / `OPENAI_BASE_URL` from `../.env`.
 
@@ -21,7 +27,7 @@ Grader needs an OpenAI key with access to `gpt-4.1-2025-04-14`. `reproduce.sh` a
 
 | role | default port | index / model | notes |
 |------|------|------|------|
-| **reader** | `READER_URL` :8010 | `Qwen/Qwen3.5-4B`, **vLLM 0.19.0**, **H100** | `CUDA_VISIBLE_DEVICES=0 HF_HOME=… vllm serve Qwen/Qwen3.5-4B --port 8010` on an H100; tunnel it to :8010 |
+| **reader** | `READER_URL` :8010 | `Qwen/Qwen3.5-4B`, **vLLM 0.19.0**, **H100** | install via `uv sync --extra reader`, then `CUDA_VISIBLE_DEVICES=0 HF_HOME=… .venv/bin/vllm serve Qwen/Qwen3.5-4B --port 8010` on an H100 (tunnel to :8010 if remote) |
 | base pixel | :30088 | `search_index_normed_v2` (wiki, 28.2M), base encoder, direct_gpu | multimodal query |
 | lora pixel | :30096 | wiki lora-vit-ckpt200 index (26.3M) | multimodal query |
 | traf text  | :30097 | `text_search_index_1024_normed` (wiki, 15.7M, nprobe 128) | text query |
@@ -60,8 +66,11 @@ reach it three ways — pick one:
    retrieved tile inline as base64; or set `TILES_DIR` to read the tiles from the reader's
    local disk instead. Full self-host.
 2. **Public API (no self-hosting).** Point the retrieval URL at the public endpoint
-   (`api.ds-serve.org` / `api.pixelrag.ai`) instead of a local serve. It returns base64 tiles,
-   so you only run the reader + grader — no index, no tile corpus.
+   (e.g. `http://api.pixelrag.ai:30001/search`) instead of a local serve. It returns base64
+   tiles, so you only run the reader + grader — no index, no tile corpus. Note: `:30001`
+   serves the **un-normed** base index (`search_index`), which does **not** match the paper's
+   `base`/`lora` cells (those use `search_index_normed_v2`) — handy for exercising the pipeline,
+   but self-host the normed index to match the paper numbers. Command in §3.
 3. **Self-hosted serve, index + on-demand render.** Run the serve with the index but **no** tile
    corpus, started with an on-demand renderer; it renders each retrieved page to tiles at query
    time and returns them as base64. Needs the kiwix ZIM, not the ~4T corpus.
@@ -80,6 +89,19 @@ bash reproduce.sh mms lora
 NUM=20 bash reproduce.sh nq traf  # NUM overrides the example count for a quick smoke
 ```
 
+`reproduce.sh` targets **self-hosted serves on localhost** (modes 1/3 above). To run a cell
+against the **public API** (mode 2), drive `run_bench.py` directly — `reproduce.sh` hardcodes
+`localhost`, so it can't point at a remote serve:
+
+```bash
+# NQ via the public base endpoint (exact-match grading — no OpenAI key needed):
+.venv/bin/python run_bench.py --task nq --model Qwen/Qwen3.5-4B \
+  --api-base "$READER_URL" --api-key dummy --no-think \
+  --retrieval-top-k 5 --reader-top-k 3 --num-examples 1000 --max-tokens 200 \
+  --local-api --local-api-url http://api.pixelrag.ai:30001/search \
+  --query-instruction "Retrieve images or text relevant to the user's query."
+```
+
 Before running, `reproduce.sh` runs a **preflight**: it curls the reader and the retrieval
 serve(s) that *this* cell needs and checks each is up with the expected index (`/status`
 `total_vectors`). If a serve is down / on the wrong port / wrong index, it prints the exact
diff --git a/eval/pyproject.toml b/eval/pyproject.toml
index 5b70556..991c066 100644
--- a/eval/pyproject.toml
+++ b/eval/pyproject.toml
@@ -3,29 +3,36 @@ name = "pixelrag-repro"
 version = "0.1.0"
 description = "Self-contained reproduction harness for PixelRAG paper Table 1 — runs retrieval + reader + grader against live serves and prints the score."
 requires-python = ">=3.12,<3.13"
-# Deps for the API-path harness (run_bench.py + lib/): retrieval and reader run as
-# remote HTTP serves, so no local torch/vllm is needed here.
+# The base harness (run_bench.py + lib/) is a pure HTTP client — retrieval and reader run
+# as remote serves, so it needs no torch/vllm. Deps use `>=` floors (like the rest of the
+# workspace); uv.lock is the reproducibility contract (`uv sync --frozen`). numpy also
+# carries an upper bound (<2.3) so the optional `reader` extra — which adds vLLM, the
+# CUDA-fragile piece — can resolve alongside the base.
 dependencies = [
-    "aiohttp==3.13.5",
-    "datasets==4.8.5",
-    "openai==2.38.0",
-    "tqdm==4.67.3",
-    "pillow==12.2.0",
-    "requests==2.34.2",
-    "numpy==2.4.6",
-    "selenium==4.44.0",
-    "webdriver-manager==4.1.1",
-    "beautifulsoup4==4.14.3",
-    "lxml==6.1.1",
-    "tiktoken==0.13.0",
-    "trafilatura==2.0.0",
-    "litellm==1.86.2",
-    "botocore==1.43.18",
-    "tenacity==9.1.4",
-    "fastmcp==3.3.1",
-    "omegaconf==2.3.0",
-    "retry==0.9.2",
+    "aiohttp>=3.13.5",
+    "datasets>=4.8.5",
+    "openai>=2.38.0",
+    "tqdm>=4.67.3",
+    "pillow>=12.2.0",
+    "requests>=2.34.2",
+    "numpy>=1.26.0,<2.3",
+    "selenium>=4.44.0",
+    "webdriver-manager>=4.1.1",
+    "beautifulsoup4>=4.14.3",
+    "lxml>=6.1.1",
+    "tiktoken>=0.13.0",
+    "trafilatura>=2.0.0",
+    "litellm>=1.86.2",
+    "botocore>=1.43.18",
+    "tenacity>=9.1.4",
+    "fastmcp>=3.3.1",
+    "omegaconf>=2.3.0",
+    "retry>=0.9.2",
 ]
 
+[project.optional-dependencies]
+# Self-host the Qwen3.5-4B reader (vLLM 0.19.0) from this package: `uv sync --extra reader`.
+reader = ["vllm==0.19.0"]
+
 [tool.uv]
 package = false
diff --git a/eval/uv.lock b/eval/uv.lock
index fb7cf47..80cf7f8 100644
--- a/eval/uv.lock
+++ b/eval/uv.lock
@@ -93,6 +93,25 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/78/b6/6307fbef88d9b5ee7421e68d78a9f162e0da4900bc5f5793f6d3d0e34fb8/annotated_types-0.7.0-py3-none-any.whl", hash = "sha256:1f02e8b43a8fbbc3f3e0d4f0f4bfc8131bcb4eebe8849b8e5c773f3a1c582a53", size = 13643, upload-time = "2024-05-20T21:33:24.1Z" },
 ]
 
+[[package]]
+name = "anthropic"
+version = "0.111.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "anyio" },
+    { name = "distro" },
+    { name = "docstring-parser" },
+    { name = "httpx" },
+    { name = "jiter" },
+    { name = "pydantic" },
+    { name = "sniffio" },
+    { name = "typing-extensions" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/b9/8a/9afc7305a2ce4b52b30e137f83cd2a6a90b918b3997073db11bb5a1de55a/anthropic-0.111.0.tar.gz", hash = "sha256:39cbda0ac17a6d423e5bf609811bd69b26eddf6299d7a468126e05bc711ce826", size = 934001, upload-time = "2026-06-18T17:31:44.733Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/f1/bb/09e82a81885d787f350fb55ca9df865b63140dd28b3b5b3104c4ae261657/anthropic-0.111.0-py3-none-any.whl", hash = "sha256:c14edb36ed80da9099acbd26b5cec810d76606c31f32a0d56a4cf9d4fa9e25ae", size = 929774, upload-time = "2026-06-18T17:31:43.116Z" },
+]
+
 [[package]]
 name = "antlr4-python3-runtime"
 version = "4.9.3"
@@ -112,6 +131,32 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/da/42/e921fccf5015463e32a3cf6ee7f980a6ed0f395ceeaa45060b61d86486c2/anyio-4.13.0-py3-none-any.whl", hash = "sha256:08b310f9e24a9594186fd75b4f73f4a4152069e3853f1ed8bfbf58369f4ad708", size = 114353, upload-time = "2026-03-24T12:59:08.246Z" },
 ]
 
+[[package]]
+name = "apache-tvm-ffi"
+version = "0.1.12"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "typing-extensions" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/ff/95/ef83880657e89a0ce0f1ad79cbff11698286d00522dbc290d34a8458e9c2/apache_tvm_ffi-0.1.12.tar.gz", hash = "sha256:2aa5c8ece3144dad11afd6d0f10191d03cdb368bbcd9c92f9fb919f35906223d", size = 2843816, upload-time = "2026-06-09T18:17:31.68Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/7d/ef/8f2ea57791e8df55c5a52e20d415c01032ef5fa3761574268201b7cc2c79/apache_tvm_ffi-0.1.12-cp312-abi3-macosx_11_0_arm64.whl", hash = "sha256:218e55c807d49182710ef2ab0336313ba6becccb7e565f4941d23bded09646d4", size = 2463724, upload-time = "2026-06-09T18:16:41.904Z" },
+    { url = "https://files.pythonhosted.org/packages/bc/c4/34aec1f10353eee555687f3196241457b8e8a06da2014a176f5b022e24bd/apache_tvm_ffi-0.1.12-cp312-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:557d8deb672f2ad7f445399e3fa0c727a6e11472e19c895ee244cbb8cfd99a66", size = 2616513, upload-time = "2026-06-09T18:16:44.115Z" },
+    { url = "https://files.pythonhosted.org/packages/c8/2a/bff8d73841b49f196852edd8460241d3a363e6b0d64c3a9367542658394f/apache_tvm_ffi-0.1.12-cp312-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:817af52916ca9987e019ae9c811406835c7f26c590b2a7bcfa9db0e3809f4228", size = 2757612, upload-time = "2026-06-09T18:16:45.987Z" },
+    { url = "https://files.pythonhosted.org/packages/9c/20/51d0c31c76bef0f21c11bce0465598d1ea5fdaca22e47a69c29deadeada7/apache_tvm_ffi-0.1.12-cp312-abi3-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:8a7b08f377ea2663dae10e3045f8d0215f0378ee975096174a8af6381eeb1504", size = 2533956, upload-time = "2026-06-09T18:16:47.891Z" },
+    { url = "https://files.pythonhosted.org/packages/52/f3/fba607d803cb081be2d66ea51865492b42872898bd271d9bcc3e1ced4ef3/apache_tvm_ffi-0.1.12-cp312-abi3-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:dc0acd7eeb0e451d5e3f686af3ba0b495fdbf97b5b54cf9a0f770cdafe0e691a", size = 2719638, upload-time = "2026-06-09T18:16:50.021Z" },
+    { url = "https://files.pythonhosted.org/packages/06/d7/a25f51156358c631114e16cb09ca91188b3b79677369e111216d6fa7f83d/apache_tvm_ffi-0.1.12-cp312-abi3-win_amd64.whl", hash = "sha256:23eefd1094a41faae2bb7b9cc5816aa938101b624d48ebb724881f1a89b78e99", size = 2725953, upload-time = "2026-06-09T18:16:52.215Z" },
+]
+
+[[package]]
+name = "astor"
+version = "0.8.1"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/5a/21/75b771132fee241dfe601d39ade629548a9626d1d39f333fde31bc46febe/astor-0.8.1.tar.gz", hash = "sha256:6a6effda93f4e1ce9f618779b2dd1d9d84f1e32812c23a29b3fff6fd7f63fa5e", size = 35090, upload-time = "2019-12-10T01:50:35.51Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/c3/88/97eef84f48fa04fbd6750e62dcceafba6c63c81b7ac1420856c8dcc0a3f9/astor-0.8.1-py2.py3-none-any.whl", hash = "sha256:070a54e890cefb5b3739d19f30f5a5ec840ffc9c50ffa7d23cc9fc1a38ebbfc5", size = 27488, upload-time = "2019-12-10T01:50:33.628Z" },
+]
+
 [[package]]
 name = "attrs"
 version = "26.1.0"
@@ -165,6 +210,27 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/1a/39/47f9197bdd44df24d67ac8893641e16f386c984a0619ef2ee4c51fbbc019/beautifulsoup4-4.14.3-py3-none-any.whl", hash = "sha256:0918bfe44902e6ad8d57732ba310582e98da931428d231a5ecb9e7c703a735bb", size = 107721, upload-time = "2025-11-30T15:08:24.087Z" },
 ]
 
+[[package]]
+name = "blake3"
+version = "1.0.9"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/26/6a/4cc5a9dd40fd8a6d283fd3761e5f59c490109571ef8e3c73245417e5a305/blake3-1.0.9.tar.gz", hash = "sha256:5fa374fa5070ca084368776c19b420157eb0f2d3f091343d6bc59189929d62e2", size = 116872, upload-time = "2026-06-22T18:02:25.366Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/5c/d2/9bdf8345c70993aaef635398f52edfb915d6e8ad2c000c801204e387c456/blake3-1.0.9-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:a70c20542d5e7960983a0ff32999049a2b0e5ef1f22dbbbdfb51cf04828a4156", size = 344587, upload-time = "2026-06-22T18:00:34.244Z" },
+    { url = "https://files.pythonhosted.org/packages/36/9d/be8b1f7f85b12bb45a0fade6ca7bdbf83a507d23d0b6141ba29fe69c8cea/blake3-1.0.9-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:72cdecf088a9d25e6ec79948a578995649b0dbee407e7a46c543a9ecc0f6f281", size = 328864, upload-time = "2026-06-22T18:00:35.59Z" },
+    { url = "https://files.pythonhosted.org/packages/f2/78/66580635d744c826671fd219938caffb16281a26f62c4f856695d4233677/blake3-1.0.9-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:42fa57bf462285ef16400601b0fd32214c248ba92505bbb94b1221ab9af5a092", size = 373795, upload-time = "2026-06-22T18:00:36.887Z" },
+    { url = "https://files.pythonhosted.org/packages/b1/79/b5b17d3004bb81a5732c0b176c812703d200ed8c652b3b7713b9633bbe10/blake3-1.0.9-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:b25ccde5a64be070f20e5c7a81da70292db40b164b6c77588cbd6230856badbb", size = 374183, upload-time = "2026-06-22T18:00:38.205Z" },
+    { url = "https://files.pythonhosted.org/packages/3c/63/0d209c44b2041bbe130ced12a23c92dd995fbfe5bce7ee77fffea16f5cb0/blake3-1.0.9-cp312-cp312-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:2a800b87433955f37691b5f361ad29c7dd3ee089c9cd109adc5aea8e24bc4c1f", size = 446783, upload-time = "2026-06-22T18:00:39.493Z" },
+    { url = "https://files.pythonhosted.org/packages/c5/51/efd1f9b8a9d3e9a0e235f3ced99a738529a1019fe78b3988e29d9c2fbba6/blake3-1.0.9-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:6879739e7904b9c42afbedbcc2e8c36cebe140fb3fc3f5c492993579cf5cd516", size = 487369, upload-time = "2026-06-22T18:00:40.875Z" },
+    { url = "https://files.pythonhosted.org/packages/8d/3f/a8dcaea9e0b26e419a540ca0cd6203c9fbb505e85b02b03c5a59bf9e6a45/blake3-1.0.9-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:6edeb3d49a24c307995899b70dd47aa901d0e9ad51d2f8a79aba4f074f32d8c5", size = 383845, upload-time = "2026-06-22T18:00:42.251Z" },
+    { url = "https://files.pythonhosted.org/packages/f6/10/e9907f5b86410d5071982aaf05d149ca4d4fd8acab7e77eebbc9a333c7b4/blake3-1.0.9-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bcd56a7a972c4185070f7042ccc20166927eec3c0f98b8405f375d007b604a0b", size = 383851, upload-time = "2026-06-22T18:00:43.715Z" },
+    { url = "https://files.pythonhosted.org/packages/34/cf/c7863a185550706a9624f6aa7b6d46470aaed0bb46a827c5cda2a7d03151/blake3-1.0.9-cp312-cp312-manylinux_2_31_riscv64.whl", hash = "sha256:a288664d08dee154cc496e06e62517fc9e655ecec12b0d7db538d244ac79edf1", size = 380067, upload-time = "2026-06-22T18:00:45.249Z" },
+    { url = "https://files.pythonhosted.org/packages/54/0a/e7af679c719368b400c9ba9c3460072aac2ba077ddbd4bc806fef28cda03/blake3-1.0.9-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:91db52a809b68b5bebe7c413ddcd230e1f759398e7fa7a873104595a4fa648b6", size = 549471, upload-time = "2026-06-22T18:00:46.793Z" },
+    { url = "https://files.pythonhosted.org/packages/2c/3c/37c1dd3539b7bd9b6d2eef019802aacdb4a3d48ab484b140603bbf9c5b5a/blake3-1.0.9-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:cfaa671b07eb73883162ca940442193868358b0b904cfa266e4b74131ce966da", size = 591396, upload-time = "2026-06-22T18:00:48.122Z" },
+    { url = "https://files.pythonhosted.org/packages/ae/55/4f0a23b72795292e74084834130900ea778c0583004519c86698dfffe1a5/blake3-1.0.9-cp312-cp312-win32.whl", hash = "sha256:ae47c3d5729ff89baa6ddf6de47fcfcc915985d39eb1bfcd6db653331f3c6fcc", size = 229271, upload-time = "2026-06-22T18:00:49.377Z" },
+    { url = "https://files.pythonhosted.org/packages/12/91/7db93e4689f0f145bcb954dc62936e5f5090548a9fa20c6bbebfaeaa648a/blake3-1.0.9-cp312-cp312-win_amd64.whl", hash = "sha256:15566065ff90ab3da46ec0be1417406f00507af902b6fb0fbc6563e77f02fc42", size = 218220, upload-time = "2026-06-22T18:00:50.659Z" },
+]
+
 [[package]]
 name = "botocore"
 version = "1.43.18"
@@ -201,6 +267,22 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/86/93/1f76c8d1bafe3b0614e06b2195784a3765bbf7b0a067661af9e2dd47fc33/caio-0.9.25-py3-none-any.whl", hash = "sha256:06c0bb02d6b929119b1cfbe1ca403c768b2013a369e2db46bfa2a5761cf82e40", size = 19087, upload-time = "2025-12-26T15:22:00.221Z" },
 ]
 
+[[package]]
+name = "cbor2"
+version = "6.1.2"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/75/af/473c241e41c142ea06ebef8d1f660fa6ff928fb97210e7bec8ee5974f8cd/cbor2-6.1.2.tar.gz", hash = "sha256:6b43037a66947dee5af0abb1a4c3a13b3abac5a4a3f32f9771efbbcd030fd909", size = 86760, upload-time = "2026-06-02T19:01:29.333Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/5e/0c/a857b6ca032282b564cf25de18ad92fe0614e8b3fa3422eb10e32a873939/cbor2-6.1.2-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:92b158d3ff9d9dce70eeb09786a6e518e3cb0ecb927fd23e9a0f7fc4b175c01a", size = 409592, upload-time = "2026-06-02T19:00:44.556Z" },
+    { url = "https://files.pythonhosted.org/packages/29/db/e0518153b3228159d9373f3b5785d7ea2d68898e27ee1bce7d03f0b5f7aa/cbor2-6.1.2-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:d29a11044b07048e19f39a87fe8fea7ea865eb0ace50dc4c29513d52d40e2ddf", size = 454598, upload-time = "2026-06-02T19:00:45.784Z" },
+    { url = "https://files.pythonhosted.org/packages/29/67/62127b22edc6011ba55b76a28ab7c2219a45d01871a8199532e0978b26d1/cbor2-6.1.2-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:a106f174eda34d8937a621c7f3e6044586cb209170cdc8da0ffbea89d1d6e385", size = 467380, upload-time = "2026-06-02T19:00:47.196Z" },
+    { url = "https://files.pythonhosted.org/packages/7c/95/7992d8ec904c116ad547abb4960cc3fde695d5853c66596b1465d14d2f7b/cbor2-6.1.2-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:2ea16a25cc457a92879ff7a36cc50b587bddba09d8176bf1a94803eec5aa27eb", size = 521672, upload-time = "2026-06-02T19:00:48.656Z" },
+    { url = "https://files.pythonhosted.org/packages/cb/cf/80cc4be132a523f0c92fb4c71813577bb393abea9e27990ca74605e0e930/cbor2-6.1.2-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:2652a94224980d47f2a3866dd35b1afe532ecdfaf91f8cfcec39a026c457a844", size = 534402, upload-time = "2026-06-02T19:00:50.064Z" },
+    { url = "https://files.pythonhosted.org/packages/b1/ea/99e466d8bef61a0775a1d8538ae6c9d95f4533fadc01f8f7814cb7ab80ad/cbor2-6.1.2-cp312-cp312-win32.whl", hash = "sha256:618666292900487db4a5abcade3150105c9c9fdd22576e6ff297c9a72eef0c6a", size = 283225, upload-time = "2026-06-02T19:00:51.406Z" },
+    { url = "https://files.pythonhosted.org/packages/14/13/e6a677bdc499e43049006cb54fe605b0f7aef621402d31354cc42ef293c9/cbor2-6.1.2-cp312-cp312-win_amd64.whl", hash = "sha256:c61c0b2e2cee64497e6c62d1976bc212f62ac0cd2b5b903613610d79b8b06b60", size = 300844, upload-time = "2026-06-02T19:00:52.628Z" },
+    { url = "https://files.pythonhosted.org/packages/77/4a/08bd8461f8e2e1ce1de5ae2768f2b7ca39a090e3156c1ee0d9b5fd86e70d/cbor2-6.1.2-cp312-cp312-win_arm64.whl", hash = "sha256:c871e7266ddc545b258e6f8e5300396985dc485d7ccf8bb4777385782f302153", size = 289040, upload-time = "2026-06-02T19:00:53.971Z" },
+]
+
 [[package]]
 name = "certifi"
 version = "2026.5.20"
@@ -270,6 +352,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/c7/0d/67e5b4109ea4a837e80daa87c2c696711955e40449a97e8926672534def2/click-8.4.1-py3-none-any.whl", hash = "sha256:482be17c6991b8c19c5429a1e995d9b0efdbb63172824c41f99965dc0ade8ec2", size = 116639, upload-time = "2026-05-22T04:08:35.26Z" },
 ]
 
+[[package]]
+name = "cloudpickle"
+version = "3.1.2"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/27/fb/576f067976d320f5f0114a8d9fa1215425441bb35627b1993e5afd8111e5/cloudpickle-3.1.2.tar.gz", hash = "sha256:7fda9eb655c9c230dab534f1983763de5835249750e85fbcef43aaa30a9a2414", size = 22330, upload-time = "2025-11-03T09:25:26.604Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/88/39/799be3f2f0f38cc727ee3b4f1445fe6d5e4133064ec2e4115069418a5bb6/cloudpickle-3.1.2-py3-none-any.whl", hash = "sha256:9acb47f6afd73f60dc1df93bb801b472f05ff42fa6c84167d25cb206be1fbf4a", size = 22228, upload-time = "2025-11-03T09:25:25.534Z" },
+]
+
 [[package]]
 name = "colorama"
 version = "0.4.6"
@@ -279,6 +370,21 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/d1/d6/3965ed04c63042e047cb6a3e6ed1a63a35087b6a609aa3a15ed8ac56c221/colorama-0.4.6-py2.py3-none-any.whl", hash = "sha256:4f1d9991f5acc0ca119f9d443620b77f9d6b33703e51011c16baf57afb285fc6", size = 25335, upload-time = "2022-10-25T02:36:20.889Z" },
 ]
 
+[[package]]
+name = "compressed-tensors"
+version = "0.14.0.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "loguru" },
+    { name = "pydantic" },
+    { name = "torch" },
+    { name = "transformers" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/eb/f1/4c9b01ceaf82ad58ad00919223e09b8e74d4073a2ba8e3ab2f97521ef65c/compressed_tensors-0.14.0.1.tar.gz", hash = "sha256:5ad3841184b6f5020e06059b2463191c5c57a144bb97cab9159978d8118839b1", size = 226393, upload-time = "2026-03-11T17:04:35.57Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/0a/26/16a13993ecf4fdc9c39d63b3a6daabafd32a452cf68b81aa9eb3b8170913/compressed_tensors-0.14.0.1-py3-none-any.whl", hash = "sha256:46c4940a3a779d3d97108c294bfcd9acf4bd0491f7c6737c320f0e815ec732e4", size = 196454, upload-time = "2026-03-11T17:04:33.2Z" },
+]
+
 [[package]]
 name = "courlan"
 version = "1.3.2"
@@ -332,6 +438,87 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/9d/9a/0fea98a70cf1749d41d738836f6349d97945f7c89433a259a6c2642eefeb/cryptography-48.0.0-cp39-abi3-win_amd64.whl", hash = "sha256:16cd65b9330583e4619939b3a3843eec1e6e789744bb01e7c7e2e62e33c239c8", size = 3792100, upload-time = "2026-05-04T22:59:14.884Z" },
 ]
 
+[[package]]
+name = "cuda-bindings"
+version = "12.9.4"
+source = { registry = "https://pypi.org/simple" }
+resolution-markers = [
+    "sys_platform != 'emscripten' and sys_platform != 'win32'",
+]
+dependencies = [
+    { name = "cuda-pathfinder", marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" },
+]
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/0c/c2/65bfd79292b8ff18be4dd7f7442cea37bcbc1a228c1886f1dea515c45b67/cuda_bindings-12.9.4-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:694ba35023846625ef471257e6b5a4bc8af690f961d197d77d34b1d1db393f56", size = 11760260, upload-time = "2025-10-21T14:51:40.79Z" },
+    { url = "https://files.pythonhosted.org/packages/a9/c1/dabe88f52c3e3760d861401bb994df08f672ec893b8f7592dc91626adcf3/cuda_bindings-12.9.4-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:fda147a344e8eaeca0c6ff113d2851ffca8f7dfc0a6c932374ee5c47caa649c8", size = 12151019, upload-time = "2025-10-21T14:51:43.167Z" },
+]
+
+[[package]]
+name = "cuda-bindings"
+version = "13.3.1"
+source = { registry = "https://pypi.org/simple" }
+resolution-markers = [
+    "sys_platform == 'win32'",
+    "sys_platform == 'emscripten'",
+]
+dependencies = [
+    { name = "cuda-pathfinder", marker = "sys_platform == 'emscripten' or sys_platform == 'win32'" },
+]
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/7c/95/872a0392122f1fb43fcb06869790ef3171f37beee9f7db8f441739113570/cuda_bindings-13.3.1-cp312-cp312-win_amd64.whl", hash = "sha256:b134dd8c5c66ae4c4ad814f7aee88fd215353c077010cbc47e3b55ed35ec9eff", size = 5875099, upload-time = "2026-05-29T23:11:54.635Z" },
+]
+
+[[package]]
+name = "cuda-core"
+version = "1.0.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "cuda-pathfinder", marker = "sys_platform == 'emscripten' or sys_platform == 'win32'" },
+    { name = "numpy", marker = "sys_platform == 'emscripten' or sys_platform == 'win32'" },
+]
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/c2/1a/ae079963c9df7f4274227eb63cf8f6083a532a6443adb340d951fd21c626/cuda_core-1.0.1-cp312-cp312-win_amd64.whl", hash = "sha256:1a5c1aa3b738a7599ea289498d038fe625d259fd7ab795394541eee58a8e29bc", size = 4663076, upload-time = "2026-05-12T20:11:35.784Z" },
+]
+
+[[package]]
+name = "cuda-pathfinder"
+version = "1.5.5"
+source = { registry = "https://pypi.org/simple" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/11/c8/26f2e4aae92f11522a96043892ba39a90eac610d5242523aa863212bc1c7/cuda_pathfinder-1.5.5-py3-none-any.whl", hash = "sha256:0228c023f95d1480f143ef5c8922d27a2ab052087a942e81dc289c9eb8f91689", size = 51671, upload-time = "2026-05-27T01:21:25.413Z" },
+]
+
+[[package]]
+name = "cuda-python"
+version = "12.9.4"
+source = { registry = "https://pypi.org/simple" }
+resolution-markers = [
+    "sys_platform != 'emscripten' and sys_platform != 'win32'",
+]
+dependencies = [
+    { name = "cuda-bindings", version = "12.9.4", source = { registry = "https://pypi.org/simple" }, marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" },
+]
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/af/f3/6b032a554019cfb3447e671798c1bd3e79b5f1af20d10253f56cea269ef2/cuda_python-12.9.4-py3-none-any.whl", hash = "sha256:d2cacea882a69863f1e7d27ee71d75f0684f4c76910aff839067e4f89c902279", size = 7594, upload-time = "2025-10-21T14:55:12.846Z" },
+]
+
+[[package]]
+name = "cuda-python"
+version = "13.3.1"
+source = { registry = "https://pypi.org/simple" }
+resolution-markers = [
+    "sys_platform == 'win32'",
+    "sys_platform == 'emscripten'",
+]
+dependencies = [
+    { name = "cuda-bindings", version = "13.3.1", source = { registry = "https://pypi.org/simple" }, marker = "sys_platform == 'emscripten' or sys_platform == 'win32'" },
+    { name = "cuda-core", marker = "sys_platform == 'emscripten' or sys_platform == 'win32'" },
+    { name = "cuda-pathfinder", marker = "sys_platform == 'emscripten' or sys_platform == 'win32'" },
+]
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/38/31/7ff3f7768eded7535c621abc2fecb9d181a34ea4cae3afe682feb796f242/cuda_python-13.3.1-py3-none-any.whl", hash = "sha256:280b014139ab447b6dd70a377db1596f310d6e887d9d342e6651b919ec145fb3", size = 8295, upload-time = "2026-05-29T23:28:47.012Z" },
+]
+
 [[package]]
 name = "cyclopts"
 version = "4.16.1"
@@ -396,6 +583,28 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/05/7f/798705f5296a58ca505d600456748d1be48078eac8a7050d8a98bc9edb89/decorator-5.3.1-py3-none-any.whl", hash = "sha256:f47fe6fdbd2edd623ecfe36875d37aba411624e2670dd395dddae1358689bb3c", size = 10365, upload-time = "2026-05-18T06:03:26.517Z" },
 ]
 
+[[package]]
+name = "depyf"
+version = "0.20.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "astor" },
+    { name = "dill" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/88/35/83fb0178212279aa0af031031905804c6de5618435d229f41ed21bb9ad2c/depyf-0.20.0.tar.gz", hash = "sha256:fb7683bd72c44f67b56029df2c47721e9a02ffa4d7b19095f1c54c4ebf797a98", size = 6168761, upload-time = "2025-10-13T12:33:38.589Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/cf/65/4df6936130b56e1429114e663e7c1576cf845f3aef1b2dd200c0a5d19dba/depyf-0.20.0-py3-none-any.whl", hash = "sha256:d31effad4261cebecb58955d832e448ace88f432328f95f82fd99c30fd9308d4", size = 39381, upload-time = "2025-10-13T12:33:33.647Z" },
+]
+
+[[package]]
+name = "detect-installer"
+version = "0.1.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/5f/ce/6897d812825e9d4c53e3c7112726e800cc5231b013b2223bf64f653ff362/detect_installer-0.1.0.tar.gz", hash = "sha256:00ad7ba0a36e3cf7d08a40d3643011746dbc112597c7d475cc91c416710ca4e7", size = 3049, upload-time = "2026-02-23T10:40:22.567Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/cc/34/8cc73273414405086c58852916e4031812a6a30fe04c057e37ad99397b7f/detect_installer-0.1.0-py3-none-any.whl", hash = "sha256:034fb20fd665c36e6ba52b8821525ea07fb4f7f938cac459df889fb33801528a", size = 4539, upload-time = "2026-02-23T10:40:23.807Z" },
+]
+
 [[package]]
 name = "dill"
 version = "0.4.1"
@@ -405,6 +614,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/1e/77/dc8c558f7593132cf8fefec57c4f60c83b16941c574ac5f619abb3ae7933/dill-0.4.1-py3-none-any.whl", hash = "sha256:1e1ce33e978ae97fcfcff5638477032b801c46c7c65cf717f95fbc2248f79a9d", size = 120019, upload-time = "2026-01-19T02:36:55.663Z" },
 ]
 
+[[package]]
+name = "diskcache"
+version = "5.6.3"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/3f/21/1c1ffc1a039ddcc459db43cc108658f32c57d271d7289a2794e401d0fdb6/diskcache-5.6.3.tar.gz", hash = "sha256:2c3a3fa2743d8535d832ec61c2054a1641f41775aa7c556758a109941e33e4fc", size = 67916, upload-time = "2023-08-31T06:12:00.316Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/3f/27/4570e78fc0bf5ea0ca45eb1de3818a23787af9b390c0b0a0033a1b8236f9/diskcache-5.6.3-py3-none-any.whl", hash = "sha256:5e31b2d5fbad117cc363ebaf6b689474db18a1f6438bc82358b024abd4c2ca19", size = 45550, upload-time = "2023-08-31T06:11:58.822Z" },
+]
+
 [[package]]
 name = "distro"
 version = "1.9.0"
@@ -432,6 +650,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/a7/5f/ed01f9a3cdffbd5a008556fc7b2a08ddb1cc6ace7effa7340604b1d16699/docstring_parser-0.18.0-py3-none-any.whl", hash = "sha256:b3fcbed555c47d8479be0796ef7e19c2670d428d72e96da63f3a40122860374b", size = 22484, upload-time = "2026-04-14T04:09:18.638Z" },
 ]
 
+[[package]]
+name = "einops"
+version = "0.8.2"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/2c/77/850bef8d72ffb9219f0b1aac23fbc1bf7d038ee6ea666f331fa273031aa2/einops-0.8.2.tar.gz", hash = "sha256:609da665570e5e265e27283aab09e7f279ade90c4f01bcfca111f3d3e13f2827", size = 56261, upload-time = "2026-01-26T04:13:17.638Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/2a/09/f8d8f8f31e4483c10a906437b4ce31bdf3d6d417b73fe33f1a8b59e34228/einops-0.8.2-py3-none-any.whl", hash = "sha256:54058201ac7087911181bfec4af6091bb59380360f069276601256a76af08193", size = 65638, upload-time = "2026-01-26T04:13:18.546Z" },
+]
+
 [[package]]
 name = "email-validator"
 version = "2.3.0"
@@ -457,6 +684,99 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/8a/0e/97c33bf5009bdbac74fd2beace167cab3f978feb69cc36f1ef79360d6c4e/exceptiongroup-1.3.1-py3-none-any.whl", hash = "sha256:a7a39a3bd276781e98394987d3a5701d0c4edffb633bb7a5144577f82c773598", size = 16740, upload-time = "2025-11-21T23:01:53.443Z" },
 ]
 
+[[package]]
+name = "fastapi"
+version = "0.138.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "annotated-doc" },
+    { name = "pydantic" },
+    { name = "starlette" },
+    { name = "typing-extensions" },
+    { name = "typing-inspection" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/5b/58/ff455d9fe47c60abadb34b9e05a304b1f05f5ab8000ac01565156b6f5e43/fastapi-0.138.0.tar.gz", hash = "sha256:d445a4877636ad191e7053e08c9bf98cb921a6756776848400bb773d1740c061", size = 419240, upload-time = "2026-06-20T01:18:05.259Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/6c/ff/8496d9847a5fedae775eb49460722d3efaa80487854273e9647ae876218c/fastapi-0.138.0-py3-none-any.whl", hash = "sha256:b6f54fd1bd72c80b0f899f172c61a600f6f7af9b43d4d772a018f35624048cb0", size = 126779, upload-time = "2026-06-20T01:18:03.483Z" },
+]
+
+[package.optional-dependencies]
+standard = [
+    { name = "email-validator" },
+    { name = "fastapi-cli", extra = ["standard"] },
+    { name = "fastar" },
+    { name = "httpx" },
+    { name = "jinja2" },
+    { name = "pydantic-extra-types" },
+    { name = "pydantic-settings" },
+    { name = "python-multipart" },
+    { name = "uvicorn", extra = ["standard"] },
+]
+
+[[package]]
+name = "fastapi-cli"
+version = "0.0.27"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "rich-toolkit" },
+    { name = "typer" },
+    { name = "uvicorn", extra = ["standard"] },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/37/d0/ee5678346811967b8d096d5d5604e71b50d6bf5a2abfbdb331157e2bbaa9/fastapi_cli-0.0.27.tar.gz", hash = "sha256:1dffb1e40c0c88f2e0171a8a252a2b615c1e63ff8c05626649e4badd6a84336a", size = 23630, upload-time = "2026-06-18T14:48:43.421Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/1a/ab/0a709f9488fe62647db80f8a277fb0ee62e85adc6746abf477ed373c9eb7/fastapi_cli-0.0.27-py3-none-any.whl", hash = "sha256:2e389a40f318e29fec8cb1e289f267f17c048876fb82dbfa869a10b16740495d", size = 13070, upload-time = "2026-06-18T14:48:44.311Z" },
+]
+
+[package.optional-dependencies]
+standard = [
+    { name = "fastapi-cloud-cli" },
+    { name = "uvicorn", extra = ["standard"] },
+]
+
+[[package]]
+name = "fastapi-cloud-cli"
+version = "0.20.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "detect-installer" },
+    { name = "fastar" },
+    { name = "httpx" },
+    { name = "pydantic", extra = ["email"] },
+    { name = "rich-toolkit" },
+    { name = "rignore" },
+    { name = "sentry-sdk" },
+    { name = "typer" },
+    { name = "uvicorn", extra = ["standard"] },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/ad/bf/97d19633c6ec6fb0ef59df474b9705ea992f7b4f879208d0007ac6d25ab6/fastapi_cloud_cli-0.20.0.tar.gz", hash = "sha256:9681c46adcd299024d0775658bd5d88992fd35c4ad42b1f045c6df913390ba37", size = 85904, upload-time = "2026-06-11T17:41:02.814Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/f8/6e/bbb2e1b8f3170b6426b707d49981a838fc1d5cbb428dd9a271f1c3951c23/fastapi_cloud_cli-0.20.0-py3-none-any.whl", hash = "sha256:dcbf071fc659ae2d3fb30e221a661c3fa240b7d5091203cf941face31f6d7860", size = 68793, upload-time = "2026-06-11T17:41:01.804Z" },
+]
+
+[[package]]
+name = "fastar"
+version = "0.11.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/03/0f/0aeb3fc50046617702acc0078b277b58367fd62eb727b9ec733ae0e8bbcc/fastar-0.11.0.tar.gz", hash = "sha256:aa7f100f7313c03fdb20f1385927ba95671071ba308ad0c1763fef295e1895ce", size = 70238, upload-time = "2026-04-13T17:11:17.143Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/0f/06/a5773706afc8bd496769786590bbc56d2d0ee419a299cc12ea3f5717fcf3/fastar-0.11.0-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:3c51f1c2cdddbd1420d2897ace7738e36c65e17f6ae84e0bfe763f8d1068bb97", size = 708394, upload-time = "2026-04-13T17:09:57.269Z" },
+    { url = "https://files.pythonhosted.org/packages/cc/a6/d5e2a4e48495616440a21eed07558219ca90243ad00b0502586f95bd4833/fastar-0.11.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:0d9d6b052baf5380baea866675dab6ccd04ec2460d12b1c46f10ce3f4ee6a820", size = 628417, upload-time = "2026-04-13T17:09:42.145Z" },
+    { url = "https://files.pythonhosted.org/packages/ab/69/9816d69ac8265c9e50456637a487ccfb7a9c566efd9dbcd673df9c2558c2/fastar-0.11.0-cp312-cp312-manylinux_2_12_i686.manylinux2010_i686.whl", hash = "sha256:bd2f05666d4df7e14885b5c38fefd92a785917387513d33d837ff42ec143a22f", size = 863950, upload-time = "2026-04-13T17:09:11.506Z" },
+    { url = "https://files.pythonhosted.org/packages/5b/0d/f88daad53aff2e754b6b5ff2a7113f72447a34f6ef17cc23ca99988117b7/fastar-0.11.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:c1e6e74aba1ae77ca4aedcaf1697cd413319f4c88a5ccbe5b42c709517c5097e", size = 760737, upload-time = "2026-04-13T17:07:55.958Z" },
+    { url = "https://files.pythonhosted.org/packages/2f/a6/82ef4ecd969d50d92ed3ed9dbd8fe77faa24be5e5736f716edc9f4ce8d62/fastar-0.11.0-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:38ef77fe940bbc9b37a98bd838727f844b11731cd39358a2640ff864fb385086", size = 757603, upload-time = "2026-04-13T17:08:10.623Z" },
+    { url = "https://files.pythonhosted.org/packages/03/35/50249f0d827251f8ac511495e2eacccebda80a00a0ad73e9615b8113b84f/fastar-0.11.0-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:8955e61b32d6aff82c983217abf80933fd823b0e727586fc72f08043d996fd59", size = 923952, upload-time = "2026-04-13T17:08:25.526Z" },
+    { url = "https://files.pythonhosted.org/packages/7b/d8/faee41659e9c379d906d24eaee6d6833ac8cfef0a5df480e5c2a8d3efb33/fastar-0.11.0-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:483532442cdb08fbff0169510224eae0836f2f672cea6aacb52847d90fefdc46", size = 816574, upload-time = "2026-04-13T17:08:56.076Z" },
+    { url = "https://files.pythonhosted.org/packages/22/47/0448ea7992b997dad2bf004bfd98eca74b5858630eae080b50c7b17d9ddc/fastar-0.11.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ef5a6071121e05d8287fc75bccb054bcbac8bb0501200a0c0a8feeace5303ea4", size = 819382, upload-time = "2026-04-13T17:09:26.66Z" },
+    { url = "https://files.pythonhosted.org/packages/33/ef/0d63eb43586831b7a6f8b22c4d77125a7c594423af1f4f090fa9541b9b40/fastar-0.11.0-cp312-cp312-manylinux_2_31_riscv64.whl", hash = "sha256:e45e598af5afe8412197d4786efd6cf29be02e7d3d4f6a3461149eae5d7e94f1", size = 885254, upload-time = "2026-04-13T17:08:40.9Z" },
+    { url = "https://files.pythonhosted.org/packages/01/25/edd584675d69e49a165052c3ee886df1c5d574f3e7d813c990306387c623/fastar-0.11.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:2e160919b1c47ddb8538e7e8eb4cd527281b40f0bf75110a75993838ef61f286", size = 971239, upload-time = "2026-04-13T17:10:12.997Z" },
+    { url = "https://files.pythonhosted.org/packages/a5/37/e8bb24f506ba2b08fbaf36c5800e843bd4d542954e9331f00418e2d23349/fastar-0.11.0-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:4bb4dc0fc8f7a6807febcebce8a2f3626ba4955a9263d81ecc630aad83be84c0", size = 1035185, upload-time = "2026-04-13T17:10:30.207Z" },
+    { url = "https://files.pythonhosted.org/packages/9a/bf/be753736296338149ee4cb3e92e2b5423d6ba17c7b951d15218fd7e99bbf/fastar-0.11.0-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:4ec95af56aa173f6e320e1183001bf108ba59beaf13edd1fc8200648db203588", size = 1072191, upload-time = "2026-04-13T17:10:47.072Z" },
+    { url = "https://files.pythonhosted.org/packages/d2/cd/a81c1aaafb5a22ce57c98ae22f39c89413ed53e4ee6e1b1444b0bd666a6c/fastar-0.11.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:136cf342735464091c39dc3708168f9fdeb9ebea40b1ead937c61afaf46143d9", size = 1028054, upload-time = "2026-04-13T17:11:04.293Z" },
+    { url = "https://files.pythonhosted.org/packages/ec/88/1ce4eed3d70627c95f49ca017f6bbbf2ddcc4b0c601d293259de7689bc20/fastar-0.11.0-cp312-cp312-win32.whl", hash = "sha256:35f23c11b556cc4d3704587faacbc0037f7bdf6c4525cd1d09c70bda4b1c6809", size = 454198, upload-time = "2026-04-13T17:11:45.168Z" },
+    { url = "https://files.pythonhosted.org/packages/8f/1d/26ce92f4331cd61a69840db9ca6115829805eec24f285481a854f578e917/fastar-0.11.0-cp312-cp312-win_amd64.whl", hash = "sha256:920bc56c3c0b8a8ca492904941d1883c1c947c858cd93343356c29122a38f44c", size = 486697, upload-time = "2026-04-13T17:11:31.084Z" },
+    { url = "https://files.pythonhosted.org/packages/ed/96/e6eda4480559c69b05d466e7b5ea9170e81fef3795a73e059959a3258319/fastar-0.11.0-cp312-cp312-win_arm64.whl", hash = "sha256:395248faf89e8a6bd5dc1fd544c8465113b627cb6d7c8b296796b60ebea33593", size = 462591, upload-time = "2026-04-13T17:11:20.577Z" },
+]
+
 [[package]]
 name = "fastmcp"
 version = "3.3.1"
@@ -545,6 +865,38 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/81/47/dd9a212ef6e343a6857485ffe25bba537304f1913bdbed446a23f7f592e1/filelock-3.29.0-py3-none-any.whl", hash = "sha256:96f5f6344709aa1572bbf631c640e4ebeeb519e08da902c39a001882f30ac258", size = 39812, upload-time = "2026-04-19T15:39:08.752Z" },
 ]
 
+[[package]]
+name = "flashinfer-cubin"
+version = "0.6.6"
+source = { registry = "https://pypi.org/simple" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/12/e8/826f9452bc5f76b94d7eb025f03dcaf1b51b9ed7790386c0285191e69be4/flashinfer_cubin-0.6.6-py3-none-any.whl", hash = "sha256:36508dfc792eb5ecfb15d2c140a7702812e1fa1ab0fb03929b2ed55e3e8191f3", size = 267661457, upload-time = "2026-03-11T01:36:36.538Z" },
+]
+
+[[package]]
+name = "flashinfer-python"
+version = "0.6.6"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "apache-tvm-ffi" },
+    { name = "click" },
+    { name = "einops" },
+    { name = "ninja" },
+    { name = "numpy" },
+    { name = "nvidia-cudnn-frontend" },
+    { name = "nvidia-cutlass-dsl" },
+    { name = "nvidia-ml-py" },
+    { name = "packaging" },
+    { name = "requests" },
+    { name = "tabulate" },
+    { name = "torch" },
+    { name = "tqdm" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/03/70/c5a235297351021f5d3d3233523a85f5a6468495587489ad2f257e8eafe2/flashinfer_python-0.6.6.tar.gz", hash = "sha256:0730ba7c7aad332961933bcebc5119762797161ede57d955f6fd199818ed1d92", size = 5344156, upload-time = "2026-03-11T01:36:21.434Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/e0/61/385d06755f3ab66333018285657adf0daf8a90a129448231fd09e315bd2e/flashinfer_python-0.6.6-py3-none-any.whl", hash = "sha256:078f158636969eec1a0d3dea19c3ca90b426b66df89bbf7b7b8276ce2ec08148", size = 7817047, upload-time = "2026-03-11T01:36:19.198Z" },
+]
+
 [[package]]
 name = "frozenlist"
 version = "1.8.0"
@@ -584,6 +936,33 @@ http = [
     { name = "aiohttp" },
 ]
 
+[[package]]
+name = "gguf"
+version = "0.19.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "numpy" },
+    { name = "pyyaml" },
+    { name = "requests" },
+    { name = "tqdm" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/48/ae/17f1308ae45cd7b08ebb521747d5b23f4efc4d172038a4e228dd5106c3ff/gguf-0.19.0.tar.gz", hash = "sha256:dbadcd6cc7ccd44256f2229fe7c2dff5e8aa5cf0612ab987fd2b1a57e428923f", size = 111220, upload-time = "2026-05-06T13:04:03.667Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/b3/bb/d71d6da82763528c2c2ed6b59a9d6142c6595545a4c448e2085d155e88c2/gguf-0.19.0-py3-none-any.whl", hash = "sha256:70bcd10edfe697fb2dad6e40af2234b9d8ece9a41a99761405121ebda1c3c1cd", size = 118475, upload-time = "2026-05-06T13:04:02.588Z" },
+]
+
+[[package]]
+name = "googleapis-common-protos"
+version = "1.75.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "protobuf" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/b5/c8/f439cffde755cffa462bfbb156278fa6f9d09119719af9814b858fd4f81f/googleapis_common_protos-1.75.0.tar.gz", hash = "sha256:53a062ff3c32552fbd62c11fe23768b78e4ddf0494d5e5fd97d3f4689c75fbbd", size = 151035, upload-time = "2026-05-07T08:04:49.423Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/e7/c8/e2645aa8ed02fd4c7a2f59d68783b65b1f3cbdfe39a6308e156509d1fee8/googleapis_common_protos-1.75.0-py3-none-any.whl", hash = "sha256:961ed60399c457ceb0ee8f285a84c870aabc9c6a832b9d37bb281b5bebde43ed", size = 300631, upload-time = "2026-05-07T08:03:30.345Z" },
+]
+
 [[package]]
 name = "griffelib"
 version = "2.0.2"
@@ -593,6 +972,27 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/11/8c/c9138d881c79aa0ea9ed83cbd58d5ca75624378b38cee225dcf5c42cc91f/griffelib-2.0.2-py3-none-any.whl", hash = "sha256:925c857658fb1ba40c0772c37acbc2ab650bd794d9c1b9726922e36ea4117ea1", size = 142357, upload-time = "2026-03-27T11:34:46.275Z" },
 ]
 
+[[package]]
+name = "grpcio"
+version = "1.81.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "typing-extensions" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/b0/b5/1ff353970a87eda4c98251e34d2dfd214abd4982dc89119c9252a2a482d2/grpcio-1.81.1.tar.gz", hash = "sha256:6fa10a767143a5e82e8eaab53918af0cd8909a57a27f8cb2288b80a613ac671b", size = 13026582, upload-time = "2026-06-11T12:46:51.673Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/85/07/9a979c81738863a738dc23d65177056e71fbb2db817740ed870b33434e7a/grpcio-1.81.1-cp312-cp312-linux_armv7l.whl", hash = "sha256:8b39472beafc0bdcafc4c8c73ad082ebfdb449d566897a61e7acb4fa88089115", size = 6053264, upload-time = "2026-06-11T12:45:21.017Z" },
+    { url = "https://files.pythonhosted.org/packages/75/95/539706ca0d3bd40dbad583dc56fd883da941f37556b629132da5762781b9/grpcio-1.81.1-cp312-cp312-macosx_11_0_universal2.whl", hash = "sha256:12b7524c88d4026d3dcb7b0ebe16b6714f3b4af402ddd0f0639ab064a00c87c3", size = 12052560, upload-time = "2026-06-11T12:45:23.652Z" },
+    { url = "https://files.pythonhosted.org/packages/e0/44/f257b7e0bd69c93b06c6cb8ac8d1b901ccb42bedabd83c1a4c77a71f8810/grpcio-1.81.1-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:1e123f9b37edb8375fd74130d1f69c944bbf0a7b06761ae7211154b8759e94d2", size = 6595983, upload-time = "2026-06-11T12:45:26.963Z" },
+    { url = "https://files.pythonhosted.org/packages/b9/f3/19782aa04c960968bef8c5539329d8e3bbc3364e2e46d19eb5e5cc5e43b7/grpcio-1.81.1-cp312-cp312-manylinux2014_i686.manylinux_2_17_i686.whl", hash = "sha256:2c2e2ae6867c2966b8daccc836d54a13218e0007e9a490aeb81dd05be64d22d7", size = 7303455, upload-time = "2026-06-11T12:45:29.707Z" },
+    { url = "https://files.pythonhosted.org/packages/eb/8c/dea020b6d91508cd84463917a63149ec196ee7db505d032ae43fcb3303b9/grpcio-1.81.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:766bc7c9a9c340342f4c864ccbda8e78111e4751f13b895812b9c148fb79e9d0", size = 6809167, upload-time = "2026-06-11T12:45:32.52Z" },
+    { url = "https://files.pythonhosted.org/packages/1c/c7/3030dd940408083bd32cd95d634777a71605ade4887154d93e8a89244946/grpcio-1.81.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:b259a04a737cb3496be0901328eb8b7552ed8df4865d8c8f1cf1bffcfc0776a3", size = 7412536, upload-time = "2026-06-11T12:45:35.403Z" },
+    { url = "https://files.pythonhosted.org/packages/e0/dd/1172a9e42b168edcafefad6115346ef619a3fc02158bb170e66ced24bcdd/grpcio-1.81.1-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:85b10a45b8993d195c4f3ff57025b8d1e11834909ee475c403bfa60cb4caefaf", size = 8408276, upload-time = "2026-06-11T12:45:37.78Z" },
+    { url = "https://files.pythonhosted.org/packages/25/7a/71437c7f3596e5246155c515852795a85a1a8d228190212432b13b97a95d/grpcio-1.81.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:8ea1936c26b99999b27479853039a7f34713f56c49375ad52b38535ec93a796c", size = 7849660, upload-time = "2026-06-11T12:45:40.627Z" },
+    { url = "https://files.pythonhosted.org/packages/65/40/7debc0da45d2efebafb82da75644be347497fe4ee250514b8cd3b86ae8bf/grpcio-1.81.1-cp312-cp312-win32.whl", hash = "sha256:a185a04039df6cae8648bc8ab6d6fde7bf94f7188ecf7828e76ac52eef1e41d6", size = 4185819, upload-time = "2026-06-11T12:45:43.027Z" },
+    { url = "https://files.pythonhosted.org/packages/2e/b9/8fe3ba5ed462067774ebc1f9c7f26aa7ebcc280ddd476be107153de1339e/grpcio-1.81.1-cp312-cp312-win_amd64.whl", hash = "sha256:3ad74f8bb1a18963914c5452d289422830b39459e8776ebbcd207be1fbfb1d94", size = 4930461, upload-time = "2026-06-11T12:45:45.775Z" },
+]
+
 [[package]]
 name = "h11"
 version = "0.16.0"
@@ -647,6 +1047,21 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/7e/f5/f66802a942d491edb555dd61e3a9961140fd64c90bce1eafd741609d334d/httpcore-1.0.9-py3-none-any.whl", hash = "sha256:2d400746a40668fc9dec9810239072b40b4484b640a8c38fd654a024c7a1bf55", size = 78784, upload-time = "2025-04-24T22:06:20.566Z" },
 ]
 
+[[package]]
+name = "httptools"
+version = "0.8.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/43/e5/d471fcb0e14523fe1c3f4ba58ca52480e7bd70ad7109a3846bc75892f7fb/httptools-0.8.0.tar.gz", hash = "sha256:6b2a32f18d97e16e90827d7a819ffa8dbd8cc245fc4e1fa9d1095b54ef4bd999", size = 271342, upload-time = "2026-05-25T22:17:48.841Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/14/88/1d21a36da8f5cb0fa49eafd4b169eba5608d57e75bbcf61845cbc6243216/httptools-0.8.0-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:880490234c10f70a9830743097e8958d6e4b9f5a0ffc24515023afeef984054d", size = 208247, upload-time = "2026-05-25T22:17:07.843Z" },
+    { url = "https://files.pythonhosted.org/packages/a5/42/cc4feea2945cb3051038f090c9b36bd5b8a9d7f5a894a506a8983e33fd1c/httptools-0.8.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:5931891fb7b441b8a3853cf1b85c82c903defce084dd5f6771ca46e31bf862c5", size = 113064, upload-time = "2026-05-25T22:17:09.136Z" },
+    { url = "https://files.pythonhosted.org/packages/e3/a6/febbb8b8db0f58b38e44ad6cb946e6a255ae49b55f2e8543408fb7501ccd/httptools-0.8.0-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:b15fc622b0f869d19207c4089a501d9bcc63ca5e071ffdd2f03f922df882dcb2", size = 523851, upload-time = "2026-05-25T22:17:10.106Z" },
+    { url = "https://files.pythonhosted.org/packages/b7/e4/f90a0df0b83beff265b7e3b65f2a4cefd95792d4be0ac3e16049f2acd3c2/httptools-0.8.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:425f83884fd6343828d8c565f046cb72b6d19063f6924093e11bcd8e1548cd09", size = 518842, upload-time = "2026-05-25T22:17:11.218Z" },
+    { url = "https://files.pythonhosted.org/packages/9e/2d/0c9ac76dd2c893841fbf6498d6acec4f2442e1b7067f6e3e316a80e494e8/httptools-0.8.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:ef7c3c97f4311c7be57e2986629df89d49cb434dbff78eafcd48c2bff986b15a", size = 501238, upload-time = "2026-05-25T22:17:12.728Z" },
+    { url = "https://files.pythonhosted.org/packages/ca/42/906adc91ae3a5fa9c59c0a2f21c139725bd7e5b41ae6acd485cd14123ebf/httptools-0.8.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:a1afd7c9fbff0d9f5d489c4ce2768bd09c84a46ddefc7161e6aa82ae35c85745", size = 509567, upload-time = "2026-05-25T22:17:13.842Z" },
+    { url = "https://files.pythonhosted.org/packages/05/0b/4240efeb672751ee5b9b380cb0e3fdc050bc05f68adc7a8aefc4fcd9a69a/httptools-0.8.0-cp312-cp312-win_amd64.whl", hash = "sha256:cd96f29b4bab1d42fa6e3d008711c75e0f79e94e06827330160e3a304227f150", size = 90918, upload-time = "2026-05-25T22:17:15.155Z" },
+]
+
 [[package]]
 name = "httpx"
 version = "0.28.1"
@@ -673,23 +1088,21 @@ wheels = [
 
 [[package]]
 name = "huggingface-hub"
-version = "1.17.0"
+version = "0.36.2"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "click" },
     { name = "filelock" },
     { name = "fsspec" },
-    { name = "hf-xet", marker = "platform_machine == 'AMD64' or platform_machine == 'aarch64' or platform_machine == 'amd64' or platform_machine == 'arm64' or platform_machine == 'x86_64'" },
-    { name = "httpx" },
+    { name = "hf-xet", marker = "platform_machine == 'aarch64' or platform_machine == 'amd64' or platform_machine == 'arm64' or platform_machine == 'x86_64'" },
     { name = "packaging" },
     { name = "pyyaml" },
+    { name = "requests" },
     { name = "tqdm" },
-    { name = "typer" },
     { name = "typing-extensions" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/bd/65/9826515abb600b5722bcf53f8b4a2fb58340b1f8bfcaee19f83561c13a44/huggingface_hub-1.17.0.tar.gz", hash = "sha256:fad842b6763ef70ebc3919665b1b9273645203185400a7d6c5eddc2323cc3435", size = 797082, upload-time = "2026-05-28T15:12:13.347Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/7c/b7/8cb61d2eece5fb05a83271da168186721c450eb74e3c31f7ef3169fa475b/huggingface_hub-0.36.2.tar.gz", hash = "sha256:1934304d2fb224f8afa3b87007d58501acfda9215b334eed53072dd5e815ff7a", size = 649782, upload-time = "2026-02-06T09:24:13.098Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/02/28/d7cef5e477b855c25d415b8f57e5bc7347c7a90cad3acf1725d0c92ca294/huggingface_hub-1.17.0-py3-none-any.whl", hash = "sha256:3b8156d23118e87f6a587648bfbc04f04a12a757ccb4ed298b35c4ae638bf24c", size = 671546, upload-time = "2026-05-28T15:12:11.441Z" },
+    { url = "https://files.pythonhosted.org/packages/a8/af/48ac8483240de756d2438c380746e7130d1c6f75802ef22f3c6d49982787/huggingface_hub-0.36.2-py3-none-any.whl", hash = "sha256:48f0c8eac16145dfce371e9d2d7772854a4f591bcb56c9cf548accf531d54270", size = 566395, upload-time = "2026-02-06T09:24:11.133Z" },
 ]
 
 [[package]]
@@ -701,6 +1114,25 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/de/a7/f76514cc40ad6234098ecdebda08732d75964776c51a42845b7da10649e2/idna-3.17-py3-none-any.whl", hash = "sha256:466e48829084efe2548012b855df21540b96f2e20e51bd124c851536556a592c", size = 65316, upload-time = "2026-05-28T14:32:37.035Z" },
 ]
 
+[[package]]
+name = "ijson"
+version = "3.5.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/f4/57/60d1a6a512f2f0508d0bc8b4f1cc5616fd3196619b66bd6a01f9155a1292/ijson-3.5.0.tar.gz", hash = "sha256:94688760720e3f5212731b3cb8d30267f9a045fb38fb3870254e7b9504246f31", size = 68658, upload-time = "2026-02-24T03:58:30.974Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/aa/17/9c63c7688025f3a8c47ea717b8306649c8c7244e49e20a2be4e3515dc75c/ijson-3.5.0-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:1ebefbe149a6106cc848a3eaf536af51a9b5ccc9082de801389f152dba6ab755", size = 88536, upload-time = "2026-02-24T03:57:06.809Z" },
+    { url = "https://files.pythonhosted.org/packages/6f/dd/e15c2400244c117b06585452ebc63ae254f5a6964f712306afd1422daae0/ijson-3.5.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:19e30d9f00f82e64de689c0b8651b9cfed879c184b139d7e1ea5030cec401c21", size = 60499, upload-time = "2026-02-24T03:57:09.155Z" },
+    { url = "https://files.pythonhosted.org/packages/77/a9/bf4fe3538a0c965f16b406f180a06105b875da83f0743e36246be64ef550/ijson-3.5.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:a04a33ee78a6f27b9b8528c1ca3c207b1df3b8b867a4cf2fcc4109986f35c227", size = 60330, upload-time = "2026-02-24T03:57:10.574Z" },
+    { url = "https://files.pythonhosted.org/packages/31/76/6f91bdb019dd978fce1bc5ea1cd620cfc096d258126c91db2c03a20a7f34/ijson-3.5.0-cp312-cp312-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl", hash = "sha256:7d48dc2984af02eb3c56edfb3f13b3f62f2f3e4fe36f058c8cfc75d93adf4fed", size = 138977, upload-time = "2026-02-24T03:57:11.932Z" },
+    { url = "https://files.pythonhosted.org/packages/11/be/bbc983059e48a54b0121ee60042979faed7674490bbe7b2c41560db3f436/ijson-3.5.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:f1e73a44844d9adbca9cf2c4132cd875933e83f3d4b23881fcaf82be83644c7d", size = 149785, upload-time = "2026-02-24T03:57:13.255Z" },
+    { url = "https://files.pythonhosted.org/packages/6d/81/2fee58f9024a3449aee83edfa7167fb5ccd7e1af2557300e28531bb68e16/ijson-3.5.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:7389a56b8562a19948bdf1d7bae3a2edc8c7f86fb59834dcb1c4c722818e645a", size = 149729, upload-time = "2026-02-24T03:57:14.191Z" },
+    { url = "https://files.pythonhosted.org/packages/c7/56/f1706761fcc096c9d414b3dcd000b1e6e5c24364c21cfba429837f98ee8d/ijson-3.5.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:3176f23f8ebec83f374ed0c3b4e5a0c4db7ede54c005864efebbed46da123608", size = 150697, upload-time = "2026-02-24T03:57:15.855Z" },
+    { url = "https://files.pythonhosted.org/packages/d9/6e/ee0d9c875a0193b632b3e9ccd1b22a50685fb510256ad57ba483b6529f77/ijson-3.5.0-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:6babd88e508630c6ef86c9bebaaf13bb2fb8ec1d8f8868773a03c20253f599bc", size = 142873, upload-time = "2026-02-24T03:57:16.831Z" },
+    { url = "https://files.pythonhosted.org/packages/d2/bf/f9d4399d0e6e3fd615035290a71e97c843f17f329b43638c0a01cf112d73/ijson-3.5.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:dc1b3836b174b6db2fa8319f1926fb5445abd195dc963368092103f8579cb8ed", size = 151583, upload-time = "2026-02-24T03:57:17.757Z" },
+    { url = "https://files.pythonhosted.org/packages/b2/71/a7254a065933c0e2ffd3586f46187d84830d3d7b6f41cfa5901820a4f87d/ijson-3.5.0-cp312-cp312-win32.whl", hash = "sha256:6673de9395fb9893c1c79a43becd8c8fbee0a250be6ea324bfd1487bb5e9ee4c", size = 53079, upload-time = "2026-02-24T03:57:18.703Z" },
+    { url = "https://files.pythonhosted.org/packages/8f/7b/2edca79b359fc9f95d774616867a03ecccdf333797baf5b3eea79733918c/ijson-3.5.0-cp312-cp312-win_amd64.whl", hash = "sha256:f4f7fabd653459dcb004175235f310435959b1bb5dfa8878578391c6cc9ad944", size = 55500, upload-time = "2026-02-24T03:57:20.428Z" },
+]
+
 [[package]]
 name = "importlib-metadata"
 version = "8.9.0"
@@ -713,6 +1145,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/7d/f9/97f2ca8bb3ec6e4b1d64f983ebe98b9a192faddff67fac3d6303a537e670/importlib_metadata-8.9.0-py3-none-any.whl", hash = "sha256:e0f761b6ea91ced3b0844c14c9d955224d538105921f8e6754c00f6ca79fba7f", size = 27220, upload-time = "2026-03-20T16:56:25.07Z" },
 ]
 
+[[package]]
+name = "interegular"
+version = "0.3.3"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/dc/9d/8b6dde58a028a3962ce17e84d5fe73758df61378e00ef8ac3d85da34b0ff/interegular-0.3.3.tar.gz", hash = "sha256:d9b697b21b34884711399ba0f0376914b81899ce670032486d0d048344a76600", size = 24705, upload-time = "2024-01-06T23:01:22.372Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/c4/01/72d6472f80651673716d1deda2a5bbb633e563ecf94f4479da5519d69d25/interegular-0.3.3-py37-none-any.whl", hash = "sha256:b0c07007d48c89d6d19f7204972d369b2a77222722e126b6aa63aa721dc3b19c", size = 23635, upload-time = "2024-01-06T23:01:20.829Z" },
+]
+
 [[package]]
 name = "jaraco-classes"
 version = "3.4.0"
@@ -894,6 +1335,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/81/db/e655086b7f3a705df045bf0933bdd9c2f79bb3c97bfef1384598bb79a217/keyring-25.7.0-py3-none-any.whl", hash = "sha256:be4a0b195f149690c166e850609a477c532ddbfbaed96a404d4e43f8d5e2689f", size = 39160, upload-time = "2025-11-16T16:26:08.402Z" },
 ]
 
+[[package]]
+name = "lark"
+version = "1.2.2"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/af/60/bc7622aefb2aee1c0b4ba23c1446d3e30225c8770b38d7aedbfb65ca9d5a/lark-1.2.2.tar.gz", hash = "sha256:ca807d0162cd16cef15a8feecb862d7319e7a09bdb13aef927968e45040fed80", size = 252132, upload-time = "2024-08-13T19:49:00.652Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/2d/00/d90b10b962b4277f5e64a78b6609968859ff86889f5b898c1a778c06ec00/lark-1.2.2-py3-none-any.whl", hash = "sha256:c2276486b02f0f1b90be155f2c8ba4a8e194d42775786db622faccd652d8e80c", size = 111036, upload-time = "2024-08-13T19:48:58.603Z" },
+]
+
 [[package]]
 name = "litellm"
 version = "1.86.2"
@@ -918,17 +1368,71 @@ wheels = [
 ]
 
 [[package]]
-name = "lxml"
-version = "6.1.1"
+name = "llguidance"
+version = "1.3.0"
 source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/05/3b/aab6728cae887456f409b4d75e8a01856e4f04bd510de38052a47768b680/lxml-6.1.1.tar.gz", hash = "sha256:ba96ae44888e0185281e937633a743ea90d5a196c6000f82565ebb0580012d40", size = 4197430, upload-time = "2026-05-18T19:19:06.424Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/95/48/3f7a9d3ff1b36bba92b5107a3a21286821227afe9ea464736133994d61fb/llguidance-1.3.0.tar.gz", hash = "sha256:861249afd51dc325646834462ea827e57a5c2b2042e108e6aae7059fdad9104d", size = 1070460, upload-time = "2025-10-20T19:58:44.164Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/6a/6e/c4add832b6fc1e887125b96f880d7b9b70aae5248718e046b1704bcac4b9/lxml-6.1.1-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:104c09bda8d2a562824c0e319d0768ce26a779b7601e0931d33b09b53c392ef7", size = 8570821, upload-time = "2026-05-18T19:17:42.068Z" },
-    { url = "https://files.pythonhosted.org/packages/22/00/ff3009c88e65de8011630acf8ab5a09cb2becd2aaf47fba2f3449f6224e9/lxml-6.1.1-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:25c6997a9a534e016695a0ba06b2f07945de682731ff01065b6d5a4474179da1", size = 4624252, upload-time = "2026-05-18T19:17:47.897Z" },
-    { url = "https://files.pythonhosted.org/packages/42/95/bb63f0fd62e554fe078e1fb3c8fe9083c14ddc7ad7fa178d10e57e071ac7/lxml-6.1.1-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:c921ba5c51e4e9f63b8b00267d06566e1f63407408a0496da2d1d0bfc819c7fc", size = 4930746, upload-time = "2026-05-18T19:18:29.637Z" },
-    { url = "https://files.pythonhosted.org/packages/eb/99/0013e8d9b5960f4f041cf0b73e2f80c23eb5205b1f7bfb20203243651359/lxml-6.1.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:54a7f95e4de5fb94e2f9f4b9055c6ba33bf3d628fd77a1d647c5923caa2cdcdc", size = 5093723, upload-time = "2026-05-18T19:18:34.168Z" },
-    { url = "https://files.pythonhosted.org/packages/29/91/317b332636bfc7bddcff828d41b3307f50043f4b237e40849c333d80fa1a/lxml-6.1.1-cp312-cp312-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:96f2ec43df44b1f76249ee0a615334f9b5b060e1c8bd90e706dad2d14d02f383", size = 5005557, upload-time = "2026-05-18T19:18:39.798Z" },
-    { url = "https://files.pythonhosted.org/packages/42/2f/cc9bf06afe70f9c9093ae60855d9759da9db601ec4080f7473319666ffd7/lxml-6.1.1-cp312-cp312-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:70ef8a7e102a1508f8121aae5b0867abd663f72c14f0a9c937e6554cb4587b7b", size = 5631036, upload-time = "2026-05-18T19:18:44.858Z" },
+    { url = "https://files.pythonhosted.org/packages/3b/33/be5acb85cd8cdc4afde33d9c234eece9f318e087920255af3c05864cd3e7/llguidance-1.3.0-cp39-abi3-macosx_10_12_x86_64.whl", hash = "sha256:f7685222660a762e481ac633d49cc559c64980fe2ee59c8f932a5bb5cbc0c2c2", size = 3220647, upload-time = "2025-10-20T19:58:42.542Z" },
+    { url = "https://files.pythonhosted.org/packages/82/e6/b48bda5b15efeaeb62bd0dba8fc6a01d4ae5457a85dbb5d18632385fe15c/llguidance-1.3.0-cp39-abi3-macosx_11_0_arm64.whl", hash = "sha256:098030ff0687261a3f1bd54cf21fe951fc861d56d37a0671250dd36677eaf224", size = 3099830, upload-time = "2025-10-20T19:58:40.826Z" },
+    { url = "https://files.pythonhosted.org/packages/aa/11/44389d3d1526d7a5c38ffd587a5ebc61d7bee443ac1dea95f2089ad58f5f/llguidance-1.3.0-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:6f6caca5d78db7f76e1fbb0fff8607b861c32d47fa3d5dee2fc49de27ee269df", size = 2835242, upload-time = "2025-10-20T19:58:34.518Z" },
+    { url = "https://files.pythonhosted.org/packages/83/a8/1ff2bedb8f9acb46a2d2d603415d272bb622c142ea86f5b95445cc6e366c/llguidance-1.3.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bc17e9dd602c3879bf91664a64bf72f54c74dbfbeb24ccfab6a5fe435b12f7aa", size = 3033133, upload-time = "2025-10-20T19:58:38.721Z" },
+    { url = "https://files.pythonhosted.org/packages/5a/7e/809349638231f469b9056c0e1bfd924d5ef5558b3b3ec72d093b6fad33b1/llguidance-1.3.0-cp39-abi3-win_amd64.whl", hash = "sha256:1d1cd1c8618d1a13605d3e057c978651e551c8c469b481ee4041f1d6c436002d", size = 2789946, upload-time = "2025-10-20T19:58:45.958Z" },
+]
+
+[[package]]
+name = "llvmlite"
+version = "0.44.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/89/6a/95a3d3610d5c75293d5dbbb2a76480d5d4eeba641557b69fe90af6c5b84e/llvmlite-0.44.0.tar.gz", hash = "sha256:07667d66a5d150abed9157ab6c0b9393c9356f229784a4385c02f99e94fc94d4", size = 171880, upload-time = "2025-01-20T11:14:41.342Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/15/86/e3c3195b92e6e492458f16d233e58a1a812aa2bfbef9bdd0fbafcec85c60/llvmlite-0.44.0-cp312-cp312-macosx_10_14_x86_64.whl", hash = "sha256:1d671a56acf725bf1b531d5ef76b86660a5ab8ef19bb6a46064a705c6ca80aad", size = 28132297, upload-time = "2025-01-20T11:13:32.57Z" },
+    { url = "https://files.pythonhosted.org/packages/d6/53/373b6b8be67b9221d12b24125fd0ec56b1078b660eeae266ec388a6ac9a0/llvmlite-0.44.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:5f79a728e0435493611c9f405168682bb75ffd1fbe6fc360733b850c80a026db", size = 26201105, upload-time = "2025-01-20T11:13:38.744Z" },
+    { url = "https://files.pythonhosted.org/packages/cb/da/8341fd3056419441286c8e26bf436923021005ece0bff5f41906476ae514/llvmlite-0.44.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c0143a5ef336da14deaa8ec26c5449ad5b6a2b564df82fcef4be040b9cacfea9", size = 42361901, upload-time = "2025-01-20T11:13:46.711Z" },
+    { url = "https://files.pythonhosted.org/packages/53/ad/d79349dc07b8a395a99153d7ce8b01d6fcdc9f8231355a5df55ded649b61/llvmlite-0.44.0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:d752f89e31b66db6f8da06df8b39f9b91e78c5feea1bf9e8c1fba1d1c24c065d", size = 41184247, upload-time = "2025-01-20T11:13:56.159Z" },
+    { url = "https://files.pythonhosted.org/packages/e2/3b/a9a17366af80127bd09decbe2a54d8974b6d8b274b39bf47fbaedeec6307/llvmlite-0.44.0-cp312-cp312-win_amd64.whl", hash = "sha256:eae7e2d4ca8f88f89d315b48c6b741dcb925d6a1042da694aa16ab3dd4cbd3a1", size = 30332380, upload-time = "2025-01-20T11:14:02.442Z" },
+]
+
+[[package]]
+name = "lm-format-enforcer"
+version = "0.11.3"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "interegular" },
+    { name = "packaging" },
+    { name = "pydantic" },
+    { name = "pyyaml" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/84/d5/41cd417ba7dfdbbcfe46cebf81fb3dfd7c591b89897560ad05bb410a465d/lm_format_enforcer-0.11.3.tar.gz", hash = "sha256:e68081c108719cce284a9bcc889709b26ffb085a1945b5eba3a12cfa96d528da", size = 40258, upload-time = "2025-08-24T19:37:47.527Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/a0/ef/11292bb0b85cf4c93447cab5a29f64576ed14d3ab4280e35ddd23486594a/lm_format_enforcer-0.11.3-py3-none-any.whl", hash = "sha256:cf586350875def1ae7a8fba84fcbbfc8371424b6c9d05c1fcba70aa233fbf06f", size = 45418, upload-time = "2025-08-24T19:37:46.325Z" },
+]
+
+[[package]]
+name = "loguru"
+version = "0.7.3"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "colorama", marker = "sys_platform == 'win32'" },
+    { name = "win32-setctime", marker = "sys_platform == 'win32'" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/3a/05/a1dae3dffd1116099471c643b8924f5aa6524411dc6c63fdae648c4f1aca/loguru-0.7.3.tar.gz", hash = "sha256:19480589e77d47b8d85b2c827ad95d49bf31b0dcde16593892eb51dd18706eb6", size = 63559, upload-time = "2024-12-06T11:20:56.608Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/0c/29/0348de65b8cc732daa3e33e67806420b2ae89bdce2b04af740289c5c6c8c/loguru-0.7.3-py3-none-any.whl", hash = "sha256:31a33c10c8e1e10422bfd431aeb5d351c7cf7fa671e3c4df004162264b28220c", size = 61595, upload-time = "2024-12-06T11:20:54.538Z" },
+]
+
+[[package]]
+name = "lxml"
+version = "6.1.1"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/05/3b/aab6728cae887456f409b4d75e8a01856e4f04bd510de38052a47768b680/lxml-6.1.1.tar.gz", hash = "sha256:ba96ae44888e0185281e937633a743ea90d5a196c6000f82565ebb0580012d40", size = 4197430, upload-time = "2026-05-18T19:19:06.424Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/6a/6e/c4add832b6fc1e887125b96f880d7b9b70aae5248718e046b1704bcac4b9/lxml-6.1.1-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:104c09bda8d2a562824c0e319d0768ce26a779b7601e0931d33b09b53c392ef7", size = 8570821, upload-time = "2026-05-18T19:17:42.068Z" },
+    { url = "https://files.pythonhosted.org/packages/22/00/ff3009c88e65de8011630acf8ab5a09cb2becd2aaf47fba2f3449f6224e9/lxml-6.1.1-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:25c6997a9a534e016695a0ba06b2f07945de682731ff01065b6d5a4474179da1", size = 4624252, upload-time = "2026-05-18T19:17:47.897Z" },
+    { url = "https://files.pythonhosted.org/packages/42/95/bb63f0fd62e554fe078e1fb3c8fe9083c14ddc7ad7fa178d10e57e071ac7/lxml-6.1.1-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:c921ba5c51e4e9f63b8b00267d06566e1f63407408a0496da2d1d0bfc819c7fc", size = 4930746, upload-time = "2026-05-18T19:18:29.637Z" },
+    { url = "https://files.pythonhosted.org/packages/eb/99/0013e8d9b5960f4f041cf0b73e2f80c23eb5205b1f7bfb20203243651359/lxml-6.1.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:54a7f95e4de5fb94e2f9f4b9055c6ba33bf3d628fd77a1d647c5923caa2cdcdc", size = 5093723, upload-time = "2026-05-18T19:18:34.168Z" },
+    { url = "https://files.pythonhosted.org/packages/29/91/317b332636bfc7bddcff828d41b3307f50043f4b237e40849c333d80fa1a/lxml-6.1.1-cp312-cp312-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:96f2ec43df44b1f76249ee0a615334f9b5b060e1c8bd90e706dad2d14d02f383", size = 5005557, upload-time = "2026-05-18T19:18:39.798Z" },
+    { url = "https://files.pythonhosted.org/packages/42/2f/cc9bf06afe70f9c9093ae60855d9759da9db601ec4080f7473319666ffd7/lxml-6.1.1-cp312-cp312-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:70ef8a7e102a1508f8121aae5b0867abd663f72c14f0a9c937e6554cb4587b7b", size = 5631036, upload-time = "2026-05-18T19:18:44.858Z" },
     { url = "https://files.pythonhosted.org/packages/08/f6/af32e23e563971ffb0fb86be52bc5be5c2c118858ffc119bf6a9039b173d/lxml-6.1.1-cp312-cp312-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:ebe6af670449830d6d9b752c256a983291c766a1365ba5d5460048f9e33a7818", size = 5240367, upload-time = "2026-05-18T19:18:49.217Z" },
     { url = "https://files.pythonhosted.org/packages/78/83/8555d40948b09ce86f1bd0c68a7ac31d07b1929f92cc1b074006c97ef2d2/lxml-6.1.1-cp312-cp312-manylinux_2_28_i686.whl", hash = "sha256:27acc820660aaffa4f7c087f29120e12980f7779d56d8492d263170111284740", size = 5350171, upload-time = "2026-05-18T19:18:52.779Z" },
     { url = "https://files.pythonhosted.org/packages/63/75/5d92da93729b7bad783689e6496049fa40927b45bec7bf183c981de3ca70/lxml-6.1.1-cp312-cp312-manylinux_2_31_armv7l.whl", hash = "sha256:1db753c9115ec7100d073b744d17e25e88a8f90f5c39b2f5dd878149af59671f", size = 4694874, upload-time = "2026-05-18T19:18:55.139Z" },
@@ -1025,6 +1529,48 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/b3/38/89ba8ad64ae25be8de66a6d463314cf1eb366222074cfda9ee839c56a4b4/mdurl-0.1.2-py3-none-any.whl", hash = "sha256:84008a41e51615a49fc9966191ff91509e3c40b939176e643fd50a5c2196b8f8", size = 9979, upload-time = "2022-08-14T12:40:09.779Z" },
 ]
 
+[[package]]
+name = "mistral-common"
+version = "1.11.3"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "jsonschema" },
+    { name = "numpy" },
+    { name = "pillow" },
+    { name = "pydantic" },
+    { name = "pydantic-extra-types", extra = ["pycountry"] },
+    { name = "requests" },
+    { name = "tiktoken" },
+    { name = "typing-extensions" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/2e/03/3c5d4c9430da406f8444f9a7b058a6aa89c525fb068a57fe2ab8b04a6d08/mistral_common-1.11.3.tar.gz", hash = "sha256:6437e128fc8a307318440839ca14ddf2e8060056b062233ec0db10352651374c", size = 6360629, upload-time = "2026-06-04T09:01:11.131Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/7b/76/dbfdf9c59e2a4b0116587626a3768c2a3b2ba1758b5756743918c2337fdc/mistral_common-1.11.3-py3-none-any.whl", hash = "sha256:dbfcef9d0c892727ee08a080f0c1039baed5430b291f5425ffd88892bf09e52c", size = 6533154, upload-time = "2026-06-04T09:01:14.186Z" },
+]
+
+[package.optional-dependencies]
+image = [
+    { name = "opencv-python-headless" },
+]
+
+[[package]]
+name = "model-hosting-container-standards"
+version = "0.1.16"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "fastapi" },
+    { name = "httpx" },
+    { name = "jmespath" },
+    { name = "pydantic" },
+    { name = "setuptools" },
+    { name = "starlette" },
+    { name = "supervisor" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/2d/5f/bc0d0fce1bd0a35378696aa13b21feffa18d9cda837f4e1be124e45ee090/model_hosting_container_standards-0.1.16.tar.gz", hash = "sha256:d34589633900e53c3ee5f7c78280a7cf7e4f6532c35e763341a262fc85cbe84a", size = 94130, upload-time = "2026-06-15T21:29:34.771Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/07/ef/6eabeb251d2a0598cb5f9a274159e05ae07a1e3fe6a1473bf6035793252a/model_hosting_container_standards-0.1.16-py3-none-any.whl", hash = "sha256:47f4f65713120bc3a69feb022981a38db9e557aedf88dbd72077f20588caa12b", size = 125666, upload-time = "2026-06-15T21:29:33.415Z" },
+]
+
 [[package]]
 name = "more-itertools"
 version = "11.1.0"
@@ -1034,6 +1580,31 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/e8/3d/1087453384dbde46a8c7f9356eead2c58be8a7bf156bca40243377c85715/more_itertools-11.1.0-py3-none-any.whl", hash = "sha256:4b65538ae22f6fed0ce4874efd317463a7489796a0939fa66824dd542125a192", size = 72226, upload-time = "2026-05-22T14:14:28.824Z" },
 ]
 
+[[package]]
+name = "mpmath"
+version = "1.3.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/e0/47/dd32fa426cc72114383ac549964eecb20ecfd886d1e5ccf5340b55b02f57/mpmath-1.3.0.tar.gz", hash = "sha256:7a28eb2a9774d00c7bc92411c19a89209d5da7c4c9a9e227be8330a23a25b91f", size = 508106, upload-time = "2023-03-07T16:47:11.061Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/43/e3/7d92a15f894aa0c9c4b49b8ee9ac9850d6e63b03c9c32c0367a13ae62209/mpmath-1.3.0-py3-none-any.whl", hash = "sha256:a0b2b9fe80bbcd81a6647ff13108738cfb482d481d826cc0e02f5b35e5c88d2c", size = 536198, upload-time = "2023-03-07T16:47:09.197Z" },
+]
+
+[[package]]
+name = "msgspec"
+version = "0.21.1"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/e3/60/f79b9b013a16fa3a58350c9295ddc6789f2e335f36ea61ed10a21b215364/msgspec-0.21.1.tar.gz", hash = "sha256:2313508e394b0d208f8f56892ca9b2799e2561329de9763b19619595a6c0f72c", size = 319193, upload-time = "2026-04-12T21:44:50.394Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/6e/cf/317224852c00248c620a9bcf4b26e2e4ab8afd752f18d2a6ef73ebd423b6/msgspec-0.21.1-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:d4248cf0b6129b7d230eacd493c17cc2d4f3989f3bb7f633a928a85b7dcfa251", size = 196188, upload-time = "2026-04-12T21:44:07.181Z" },
+    { url = "https://files.pythonhosted.org/packages/6d/81/074612945c0666078f7366f40000013de9f6ba687491d450df699bceebc9/msgspec-0.21.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:5102c7e9b3acff82178449b85006d96310e690291bb1ea0142f1b24bcb8aabcb", size = 188473, upload-time = "2026-04-12T21:44:08.736Z" },
+    { url = "https://files.pythonhosted.org/packages/8a/37/655101799590bcc5fddb2bd3fe0e6194e816c2d1da7c361725f5eb89a910/msgspec-0.21.1-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:846758412e9518252b2ac9bffd6f0e54d9ff614f5f9488df7749f81ff5c80920", size = 218871, upload-time = "2026-04-12T21:44:09.917Z" },
+    { url = "https://files.pythonhosted.org/packages/b5/d1/d4cd9fe89c7d400d7a18f86ccc94daa3f0927f53558846fcb60791dce5d6/msgspec-0.21.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:21995e74b5c598c2e004110ad66ec7f1b8c20bf2bcf3b2de8fd9a3094422d3ff", size = 225025, upload-time = "2026-04-12T21:44:11.191Z" },
+    { url = "https://files.pythonhosted.org/packages/24/bf/e20549e602b9edccadeeff98760345a416f9cce846a657e8b18e3396b212/msgspec-0.21.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:6129f0cca52992e898fd5344187f7c8127b63d810b2fd73e36fca73b4c6475ee", size = 222672, upload-time = "2026-04-12T21:44:12.481Z" },
+    { url = "https://files.pythonhosted.org/packages/b4/68/04d7a8f0f786545cf9b8c280c57aa6befb5977af6e884b8b54191cbe44b3/msgspec-0.21.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:ef3ec2296248d1f8b9231acb051b6d471dfde8f21819e86c9adaaa9f42918521", size = 227303, upload-time = "2026-04-12T21:44:13.709Z" },
+    { url = "https://files.pythonhosted.org/packages/cc/4d/619866af2840875be408047bf9e70ceafbae6ab50660de7134ed1b25eb86/msgspec-0.21.1-cp312-cp312-win_amd64.whl", hash = "sha256:d4ab834a054c6f0cbeef6df9e7e1b33d5f1bc7b86dea1d2fd7cad003873e783d", size = 190017, upload-time = "2026-04-12T21:44:14.977Z" },
+    { url = "https://files.pythonhosted.org/packages/5e/2e/a8f9eca8fd00e097d7a9e99ba8a4685db994494448e3d4f0b7f6e9a3c0f7/msgspec-0.21.1-cp312-cp312-win_arm64.whl", hash = "sha256:628aaa35c74950a8c59da330d7e98917e1c7188f983745782027748ee4ca573e", size = 175345, upload-time = "2026-04-12T21:44:16.431Z" },
+]
+
 [[package]]
 name = "multidict"
 version = "6.7.1"
@@ -1076,23 +1647,253 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/7e/82/69e539c4c2027f1e1697e09aaa2449243085a0edf81ae2c6341e84d769b6/multiprocess-0.70.19-py39-none-any.whl", hash = "sha256:0d4b4397ed669d371c81dcd1ef33fd384a44d6c3de1bd0ca7ac06d837720d3c5", size = 133477, upload-time = "2026-01-19T06:47:38.619Z" },
 ]
 
+[[package]]
+name = "networkx"
+version = "3.6.1"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/6a/51/63fe664f3908c97be9d2e4f1158eb633317598cfa6e1fc14af5383f17512/networkx-3.6.1.tar.gz", hash = "sha256:26b7c357accc0c8cde558ad486283728b65b6a95d85ee1cd66bafab4c8168509", size = 2517025, upload-time = "2025-12-08T17:02:39.908Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/9e/c9/b2622292ea83fbb4ec318f5b9ab867d0a28ab43c5717bb85b0a5f6b3b0a4/networkx-3.6.1-py3-none-any.whl", hash = "sha256:d47fbf302e7d9cbbb9e2555a0d267983d2aa476bac30e90dfbe5669bd57f3762", size = 2068504, upload-time = "2025-12-08T17:02:38.159Z" },
+]
+
+[[package]]
+name = "ninja"
+version = "1.13.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/43/73/79a0b22fc731989c708068427579e840a6cf4e937fe7ae5c5d0b7356ac22/ninja-1.13.0.tar.gz", hash = "sha256:4a40ce995ded54d9dc24f8ea37ff3bf62ad192b547f6c7126e7e25045e76f978", size = 242558, upload-time = "2025-08-11T15:10:19.421Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/3c/74/d02409ed2aa865e051b7edda22ad416a39d81a84980f544f8de717cab133/ninja-1.13.0-py3-none-macosx_10_9_universal2.whl", hash = "sha256:fa2a8bfc62e31b08f83127d1613d10821775a0eb334197154c4d6067b7068ff1", size = 310125, upload-time = "2025-08-11T15:09:50.971Z" },
+    { url = "https://files.pythonhosted.org/packages/8e/de/6e1cd6b84b412ac1ef327b76f0641aeb5dcc01e9d3f9eee0286d0c34fd93/ninja-1.13.0-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:3d00c692fb717fd511abeb44b8c5d00340c36938c12d6538ba989fe764e79630", size = 177467, upload-time = "2025-08-11T15:09:52.767Z" },
+    { url = "https://files.pythonhosted.org/packages/c8/83/49320fb6e58ae3c079381e333575fdbcf1cca3506ee160a2dcce775046fa/ninja-1.13.0-py3-none-manylinux2014_i686.manylinux_2_17_i686.whl", hash = "sha256:be7f478ff9f96a128b599a964fc60a6a87b9fa332ee1bd44fa243ac88d50291c", size = 187834, upload-time = "2025-08-11T15:09:54.115Z" },
+    { url = "https://files.pythonhosted.org/packages/56/c7/ba22748fb59f7f896b609cd3e568d28a0a367a6d953c24c461fe04fc4433/ninja-1.13.0-py3-none-manylinux2014_ppc64le.manylinux_2_17_ppc64le.whl", hash = "sha256:60056592cf495e9a6a4bea3cd178903056ecb0943e4de45a2ea825edb6dc8d3e", size = 202736, upload-time = "2025-08-11T15:09:55.745Z" },
+    { url = "https://files.pythonhosted.org/packages/79/22/d1de07632b78ac8e6b785f41fa9aad7a978ec8c0a1bf15772def36d77aac/ninja-1.13.0-py3-none-manylinux2014_s390x.manylinux_2_17_s390x.whl", hash = "sha256:1c97223cdda0417f414bf864cfb73b72d8777e57ebb279c5f6de368de0062988", size = 179034, upload-time = "2025-08-11T15:09:57.394Z" },
+    { url = "https://files.pythonhosted.org/packages/ed/de/0e6edf44d6a04dabd0318a519125ed0415ce437ad5a1ec9b9be03d9048cf/ninja-1.13.0-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:fb46acf6b93b8dd0322adc3a4945452a4e774b75b91293bafcc7b7f8e6517dfa", size = 180716, upload-time = "2025-08-11T15:09:58.696Z" },
+    { url = "https://files.pythonhosted.org/packages/54/28/938b562f9057aaa4d6bfbeaa05e81899a47aebb3ba6751e36c027a7f5ff7/ninja-1.13.0-py3-none-manylinux_2_28_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:4be9c1b082d244b1ad7ef41eb8ab088aae8c109a9f3f0b3e56a252d3e00f42c1", size = 146843, upload-time = "2025-08-11T15:10:00.046Z" },
+    { url = "https://files.pythonhosted.org/packages/2a/fb/d06a3838de4f8ab866e44ee52a797b5491df823901c54943b2adb0389fbb/ninja-1.13.0-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:6739d3352073341ad284246f81339a384eec091d9851a886dfa5b00a6d48b3e2", size = 154402, upload-time = "2025-08-11T15:10:01.657Z" },
+    { url = "https://files.pythonhosted.org/packages/31/bf/0d7808af695ceddc763cf251b84a9892cd7f51622dc8b4c89d5012779f06/ninja-1.13.0-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:11be2d22027bde06f14c343f01d31446747dbb51e72d00decca2eb99be911e2f", size = 552388, upload-time = "2025-08-11T15:10:03.349Z" },
+    { url = "https://files.pythonhosted.org/packages/9d/70/c99d0c2c809f992752453cce312848abb3b1607e56d4cd1b6cded317351a/ninja-1.13.0-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:aa45b4037b313c2f698bc13306239b8b93b4680eb47e287773156ac9e9304714", size = 472501, upload-time = "2025-08-11T15:10:04.735Z" },
+    { url = "https://files.pythonhosted.org/packages/9f/43/c217b1153f0e499652f5e0766da8523ce3480f0a951039c7af115e224d55/ninja-1.13.0-py3-none-musllinux_1_2_i686.whl", hash = "sha256:5f8e1e8a1a30835eeb51db05cf5a67151ad37542f5a4af2a438e9490915e5b72", size = 638280, upload-time = "2025-08-11T15:10:06.512Z" },
+    { url = "https://files.pythonhosted.org/packages/8c/45/9151bba2c8d0ae2b6260f71696330590de5850e5574b7b5694dce6023e20/ninja-1.13.0-py3-none-musllinux_1_2_ppc64le.whl", hash = "sha256:3d7d7779d12cb20c6d054c61b702139fd23a7a964ec8f2c823f1ab1b084150db", size = 642420, upload-time = "2025-08-11T15:10:08.35Z" },
+    { url = "https://files.pythonhosted.org/packages/3c/fb/95752eb635bb8ad27d101d71bef15bc63049de23f299e312878fc21cb2da/ninja-1.13.0-py3-none-musllinux_1_2_riscv64.whl", hash = "sha256:d741a5e6754e0bda767e3274a0f0deeef4807f1fec6c0d7921a0244018926ae5", size = 585106, upload-time = "2025-08-11T15:10:09.818Z" },
+    { url = "https://files.pythonhosted.org/packages/c1/31/aa56a1a286703800c0cbe39fb4e82811c277772dc8cd084f442dd8e2938a/ninja-1.13.0-py3-none-musllinux_1_2_s390x.whl", hash = "sha256:e8bad11f8a00b64137e9b315b137d8bb6cbf3086fbdc43bf1f90fd33324d2e96", size = 707138, upload-time = "2025-08-11T15:10:11.366Z" },
+    { url = "https://files.pythonhosted.org/packages/34/6f/5f5a54a1041af945130abdb2b8529cbef0cdcbbf9bcf3f4195378319d29a/ninja-1.13.0-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:b4f2a072db3c0f944c32793e91532d8948d20d9ab83da9c0c7c15b5768072200", size = 581758, upload-time = "2025-08-11T15:10:13.295Z" },
+    { url = "https://files.pythonhosted.org/packages/95/97/51359c77527d45943fe7a94d00a3843b81162e6c4244b3579fe8fc54cb9c/ninja-1.13.0-py3-none-win32.whl", hash = "sha256:8cfbb80b4a53456ae8a39f90ae3d7a2129f45ea164f43fadfa15dc38c4aef1c9", size = 267201, upload-time = "2025-08-11T15:10:15.158Z" },
+    { url = "https://files.pythonhosted.org/packages/29/45/c0adfbfb0b5895aa18cec400c535b4f7ff3e52536e0403602fc1a23f7de9/ninja-1.13.0-py3-none-win_amd64.whl", hash = "sha256:fb8ee8719f8af47fed145cced4a85f0755dd55d45b2bddaf7431fa89803c5f3e", size = 309975, upload-time = "2025-08-11T15:10:16.697Z" },
+    { url = "https://files.pythonhosted.org/packages/df/93/a7b983643d1253bb223234b5b226e69de6cda02b76cdca7770f684b795f5/ninja-1.13.0-py3-none-win_arm64.whl", hash = "sha256:3c0b40b1f0bba764644385319028650087b4c1b18cdfa6f45cb39a3669b81aa9", size = 290806, upload-time = "2025-08-11T15:10:18.018Z" },
+]
+
+[[package]]
+name = "numba"
+version = "0.61.2"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "llvmlite" },
+    { name = "numpy" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/1c/a0/e21f57604304aa03ebb8e098429222722ad99176a4f979d34af1d1ee80da/numba-0.61.2.tar.gz", hash = "sha256:8750ee147940a6637b80ecf7f95062185ad8726c8c28a2295b8ec1160a196f7d", size = 2820615, upload-time = "2025-04-09T02:58:07.659Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/b4/a0/c6b7b9c615cfa3b98c4c63f4316e3f6b3bbe2387740277006551784218cd/numba-0.61.2-cp312-cp312-macosx_10_14_x86_64.whl", hash = "sha256:34fba9406078bac7ab052efbf0d13939426c753ad72946baaa5bf9ae0ebb8dd2", size = 2776626, upload-time = "2025-04-09T02:57:51.857Z" },
+    { url = "https://files.pythonhosted.org/packages/92/4a/fe4e3c2ecad72d88f5f8cd04e7f7cff49e718398a2fac02d2947480a00ca/numba-0.61.2-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:4ddce10009bc097b080fc96876d14c051cc0c7679e99de3e0af59014dab7dfe8", size = 2779287, upload-time = "2025-04-09T02:57:53.658Z" },
+    { url = "https://files.pythonhosted.org/packages/9a/2d/e518df036feab381c23a624dac47f8445ac55686ec7f11083655eb707da3/numba-0.61.2-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:5b1bb509d01f23d70325d3a5a0e237cbc9544dd50e50588bc581ba860c213546", size = 3885928, upload-time = "2025-04-09T02:57:55.206Z" },
+    { url = "https://files.pythonhosted.org/packages/10/0f/23cced68ead67b75d77cfcca3df4991d1855c897ee0ff3fe25a56ed82108/numba-0.61.2-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:48a53a3de8f8793526cbe330f2a39fe9a6638efcbf11bd63f3d2f9757ae345cd", size = 3577115, upload-time = "2025-04-09T02:57:56.818Z" },
+    { url = "https://files.pythonhosted.org/packages/68/1d/ddb3e704c5a8fb90142bf9dc195c27db02a08a99f037395503bfbc1d14b3/numba-0.61.2-cp312-cp312-win_amd64.whl", hash = "sha256:97cf4f12c728cf77c9c1d7c23707e4d8fb4632b46275f8f3397de33e5877af18", size = 2831929, upload-time = "2025-04-09T02:57:58.45Z" },
+]
+
 [[package]]
 name = "numpy"
-version = "2.4.6"
+version = "2.2.6"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/76/21/7d2a95e4bba9dc13d043ee156a356c0a8f0c6309dff6b21b4d71a073b8a8/numpy-2.2.6.tar.gz", hash = "sha256:e29554e2bef54a90aa5cc07da6ce955accb83f21ab5de01a62c8478897b264fd", size = 20276440, upload-time = "2025-05-17T22:38:04.611Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/82/5d/c00588b6cf18e1da539b45d3598d3557084990dcc4331960c15ee776ee41/numpy-2.2.6-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:41c5a21f4a04fa86436124d388f6ed60a9343a6f767fced1a8a71c3fbca038ff", size = 20875348, upload-time = "2025-05-17T21:34:39.648Z" },
+    { url = "https://files.pythonhosted.org/packages/66/ee/560deadcdde6c2f90200450d5938f63a34b37e27ebff162810f716f6a230/numpy-2.2.6-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:de749064336d37e340f640b05f24e9e3dd678c57318c7289d222a8a2f543e90c", size = 14119362, upload-time = "2025-05-17T21:35:01.241Z" },
+    { url = "https://files.pythonhosted.org/packages/3c/65/4baa99f1c53b30adf0acd9a5519078871ddde8d2339dc5a7fde80d9d87da/numpy-2.2.6-cp312-cp312-macosx_14_0_arm64.whl", hash = "sha256:894b3a42502226a1cac872f840030665f33326fc3dac8e57c607905773cdcde3", size = 5084103, upload-time = "2025-05-17T21:35:10.622Z" },
+    { url = "https://files.pythonhosted.org/packages/cc/89/e5a34c071a0570cc40c9a54eb472d113eea6d002e9ae12bb3a8407fb912e/numpy-2.2.6-cp312-cp312-macosx_14_0_x86_64.whl", hash = "sha256:71594f7c51a18e728451bb50cc60a3ce4e6538822731b2933209a1f3614e9282", size = 6625382, upload-time = "2025-05-17T21:35:21.414Z" },
+    { url = "https://files.pythonhosted.org/packages/f8/35/8c80729f1ff76b3921d5c9487c7ac3de9b2a103b1cd05e905b3090513510/numpy-2.2.6-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:f2618db89be1b4e05f7a1a847a9c1c0abd63e63a1607d892dd54668dd92faf87", size = 14018462, upload-time = "2025-05-17T21:35:42.174Z" },
+    { url = "https://files.pythonhosted.org/packages/8c/3d/1e1db36cfd41f895d266b103df00ca5b3cbe965184df824dec5c08c6b803/numpy-2.2.6-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:fd83c01228a688733f1ded5201c678f0c53ecc1006ffbc404db9f7a899ac6249", size = 16527618, upload-time = "2025-05-17T21:36:06.711Z" },
+    { url = "https://files.pythonhosted.org/packages/61/c6/03ed30992602c85aa3cd95b9070a514f8b3c33e31124694438d88809ae36/numpy-2.2.6-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:37c0ca431f82cd5fa716eca9506aefcabc247fb27ba69c5062a6d3ade8cf8f49", size = 15505511, upload-time = "2025-05-17T21:36:29.965Z" },
+    { url = "https://files.pythonhosted.org/packages/b7/25/5761d832a81df431e260719ec45de696414266613c9ee268394dd5ad8236/numpy-2.2.6-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:fe27749d33bb772c80dcd84ae7e8df2adc920ae8297400dabec45f0dedb3f6de", size = 18313783, upload-time = "2025-05-17T21:36:56.883Z" },
+    { url = "https://files.pythonhosted.org/packages/57/0a/72d5a3527c5ebffcd47bde9162c39fae1f90138c961e5296491ce778e682/numpy-2.2.6-cp312-cp312-win32.whl", hash = "sha256:4eeaae00d789f66c7a25ac5f34b71a7035bb474e679f410e5e1a94deb24cf2d4", size = 6246506, upload-time = "2025-05-17T21:37:07.368Z" },
+    { url = "https://files.pythonhosted.org/packages/36/fa/8c9210162ca1b88529ab76b41ba02d433fd54fecaf6feb70ef9f124683f1/numpy-2.2.6-cp312-cp312-win_amd64.whl", hash = "sha256:c1f9540be57940698ed329904db803cf7a402f3fc200bfe599334c9bd84a40b2", size = 12614190, upload-time = "2025-05-17T21:37:26.213Z" },
+]
+
+[[package]]
+name = "nvidia-cublas-cu12"
+version = "12.8.4.1"
+source = { registry = "https://pypi.org/simple" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/dc/61/e24b560ab2e2eaeb3c839129175fb330dfcfc29e5203196e5541a4c44682/nvidia_cublas_cu12-12.8.4.1-py3-none-manylinux_2_27_x86_64.whl", hash = "sha256:8ac4e771d5a348c551b2a426eda6193c19aa630236b418086020df5ba9667142", size = 594346921, upload-time = "2025-03-07T01:44:31.254Z" },
+]
+
+[[package]]
+name = "nvidia-cuda-cupti-cu12"
+version = "12.8.90"
 source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/d0/ad/fed0499ce6a338d2a03ebae59cd15093910c8875328855781952abf6c2fe/numpy-2.4.6.tar.gz", hash = "sha256:f3a3570c4a2a16746ac2c31a7c7c7b0c186b95ce902e33db6f28094ed7387dda", size = 20735807, upload-time = "2026-05-18T23:37:14.07Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/95/2a/3d7b5ac8aac24feaf9ad7ed58f45b0bbc06d37e4338ae84c9f2298b570f9/numpy-2.4.6-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:001fbb8e08d942dd57599e781f2472269ee7f2755fae407b4f67b2f0b17da3f1", size = 16689119, upload-time = "2026-05-18T23:33:54.065Z" },
-    { url = "https://files.pythonhosted.org/packages/ea/12/92c4c131527599e8288d6918e888d88726f84d805d784b771f32408aeaef/numpy-2.4.6-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:ebfb099f8dcf083deef3ac1ca4c1503f387cf76296fcb3816b66f5ecb5f54fdb", size = 14699246, upload-time = "2026-05-18T23:33:57.621Z" },
-    { url = "https://files.pythonhosted.org/packages/ad/fe/c0a6b7b2ca128a8fb228575147073b660656734b8ebe4d76c8fd748dcc79/numpy-2.4.6-cp312-cp312-macosx_14_0_arm64.whl", hash = "sha256:3213d622a0283a39a93d188f3cf72b26862df52fbb4ca3697f51705016523d41", size = 5204410, upload-time = "2026-05-18T23:34:00.302Z" },
-    { url = "https://files.pythonhosted.org/packages/f3/d4/9770d14ba719432bb90a421bfd443872ed0f70f7264b64bec12ea363d5fd/numpy-2.4.6-cp312-cp312-macosx_14_0_x86_64.whl", hash = "sha256:357cc07a6d7b0b182ff02249616a03742827ebb1277546b5c7cd7f7620a45698", size = 6551240, upload-time = "2026-05-18T23:34:02.852Z" },
-    { url = "https://files.pythonhosted.org/packages/c9/c6/50a46a6205feba2343f1d6d17438107c5dc491ed1c736e6ea68689fd906b/numpy-2.4.6-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5f9fb9157b4ce2971008323afe46053787b526ef624fea915b261468a8421a0f", size = 15671012, upload-time = "2026-05-18T23:34:05.485Z" },
-    { url = "https://files.pythonhosted.org/packages/99/60/14115e6364fa676c5397c2ad3004e527e9aa487abf5d0706ec81bbd08529/numpy-2.4.6-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:90f9849678c75fe7afa2d348ac842c168b0a4d3d61919687216dfc547976d853", size = 16645538, upload-time = "2026-05-18T23:34:09.265Z" },
-    { url = "https://files.pythonhosted.org/packages/ae/c5/693cbe59e57db94d2231fa519ca3978dc9e19da5a8f088588f5c6e947ff2/numpy-2.4.6-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:c1a2af6c6ef86344a6b0db6b97834208bf598db514f2b155042439b62605601a", size = 17020706, upload-time = "2026-05-18T23:34:13.053Z" },
-    { url = "https://files.pythonhosted.org/packages/ef/fc/85b7c4eff9b4966ade25c2273cf7e7012e92366c032058653934b37de044/numpy-2.4.6-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:e5805d5a22fd19c8ccff10a9561f9df94436b0545619ea579db2d3c35294bce2", size = 18368541, upload-time = "2026-05-18T23:34:17.024Z" },
-    { url = "https://files.pythonhosted.org/packages/f6/81/e1b27545deedce7f4a0b348618c6b62d74e36a4dc9ccd42f3eb2f85eee32/numpy-2.4.6-cp312-cp312-win32.whl", hash = "sha256:e3eeb0aabd6bd5ce64faae67e9935203a6991b4bc2a485a767fbafb2c5125f45", size = 5962825, upload-time = "2026-05-18T23:34:20.3Z" },
-    { url = "https://files.pythonhosted.org/packages/ab/ca/feab00bd44aa5fe1ad2c18f08b4d3bb92e26484b0b1d1443897809ed528c/numpy-2.4.6-cp312-cp312-win_amd64.whl", hash = "sha256:d8e8286dd7cea7895157318d1b91cdacac64c479f3cbc8dce548331728484751", size = 12321687, upload-time = "2026-05-18T23:34:23.095Z" },
-    { url = "https://files.pythonhosted.org/packages/63/cf/5a6d34850a39d1093558564f77ee8e8e0bee5061151b8f05a55711001ec7/numpy-2.4.6-cp312-cp312-win_arm64.whl", hash = "sha256:4081eb135ac24158bd51cdfbef16f1c64df7063b1143f24731387137c092bec8", size = 10221482, upload-time = "2026-05-18T23:34:25.876Z" },
+    { url = "https://files.pythonhosted.org/packages/f8/02/2adcaa145158bf1a8295d83591d22e4103dbfd821bcaf6f3f53151ca4ffa/nvidia_cuda_cupti_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:ea0cb07ebda26bb9b29ba82cda34849e73c166c18162d3913575b0c9db9a6182", size = 10248621, upload-time = "2025-03-07T01:40:21.213Z" },
+]
+
+[[package]]
+name = "nvidia-cuda-nvrtc-cu12"
+version = "12.8.93"
+source = { registry = "https://pypi.org/simple" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/05/6b/32f747947df2da6994e999492ab306a903659555dddc0fbdeb9d71f75e52/nvidia_cuda_nvrtc_cu12-12.8.93-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl", hash = "sha256:a7756528852ef889772a84c6cd89d41dfa74667e24cca16bb31f8f061e3e9994", size = 88040029, upload-time = "2025-03-07T01:42:13.562Z" },
+]
+
+[[package]]
+name = "nvidia-cuda-runtime-cu12"
+version = "12.8.90"
+source = { registry = "https://pypi.org/simple" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/0d/9b/a997b638fcd068ad6e4d53b8551a7d30fe8b404d6f1804abf1df69838932/nvidia_cuda_runtime_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:adade8dcbd0edf427b7204d480d6066d33902cab2a4707dcfc48a2d0fd44ab90", size = 954765, upload-time = "2025-03-07T01:40:01.615Z" },
+]
+
+[[package]]
+name = "nvidia-cudnn-cu12"
+version = "9.10.2.21"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "nvidia-cublas-cu12", marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" },
+]
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/ba/51/e123d997aa098c61d029f76663dedbfb9bc8dcf8c60cbd6adbe42f76d049/nvidia_cudnn_cu12-9.10.2.21-py3-none-manylinux_2_27_x86_64.whl", hash = "sha256:949452be657fa16687d0930933f032835951ef0892b37d2d53824d1a84dc97a8", size = 706758467, upload-time = "2025-06-06T21:54:08.597Z" },
+]
+
+[[package]]
+name = "nvidia-cudnn-frontend"
+version = "1.18.0"
+source = { registry = "https://pypi.org/simple" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/e3/b4/604e230378680ee117849a4e1045baca092f93161a829291a84d5acce70c/nvidia_cudnn_frontend-1.18.0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:310b417f2848a83d1437203fcaeea320a74fb7f28af20bf42bf5afc9c01f1c12", size = 2027408, upload-time = "2026-01-27T23:32:46.576Z" },
+    { url = "https://files.pythonhosted.org/packages/c6/52/08f98262e77b1cbcc834cc1a5db494d0661ea1dbdea58c2e2d51a57fdaca/nvidia_cudnn_frontend-1.18.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:6c023539ca6de99234cf5102c3ec0d6af817f5396fc93028a22ba5b834a35b8a", size = 2159245, upload-time = "2026-01-27T23:07:32.664Z" },
+    { url = "https://files.pythonhosted.org/packages/aa/1f/751a5a8cfdc95fb4dc556192d37369ae488c30c473fe9a3ec720b23d07ea/nvidia_cudnn_frontend-1.18.0-cp312-cp312-win_amd64.whl", hash = "sha256:e13f7dd46cdb4762dde87f181f06d1c5e15e9478bbdd547bfa74d9b11f415aae", size = 1591041, upload-time = "2026-01-27T23:09:04.118Z" },
+]
+
+[[package]]
+name = "nvidia-cufft-cu12"
+version = "11.3.3.83"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "nvidia-nvjitlink-cu12", marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" },
+]
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/1f/13/ee4e00f30e676b66ae65b4f08cb5bcbb8392c03f54f2d5413ea99a5d1c80/nvidia_cufft_cu12-11.3.3.83-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:4d2dd21ec0b88cf61b62e6b43564355e5222e4a3fb394cac0db101f2dd0d4f74", size = 193118695, upload-time = "2025-03-07T01:45:27.821Z" },
+]
+
+[[package]]
+name = "nvidia-cufile-cu12"
+version = "1.13.1.3"
+source = { registry = "https://pypi.org/simple" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/bb/fe/1bcba1dfbfb8d01be8d93f07bfc502c93fa23afa6fd5ab3fc7c1df71038a/nvidia_cufile_cu12-1.13.1.3-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:1d069003be650e131b21c932ec3d8969c1715379251f8d23a1860554b1cb24fc", size = 1197834, upload-time = "2025-03-07T01:45:50.723Z" },
+]
+
+[[package]]
+name = "nvidia-curand-cu12"
+version = "10.3.9.90"
+source = { registry = "https://pypi.org/simple" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/fb/aa/6584b56dc84ebe9cf93226a5cde4d99080c8e90ab40f0c27bda7a0f29aa1/nvidia_curand_cu12-10.3.9.90-py3-none-manylinux_2_27_x86_64.whl", hash = "sha256:b32331d4f4df5d6eefa0554c565b626c7216f87a06a4f56fab27c3b68a830ec9", size = 63619976, upload-time = "2025-03-07T01:46:23.323Z" },
+]
+
+[[package]]
+name = "nvidia-cusolver-cu12"
+version = "11.7.3.90"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "nvidia-cublas-cu12", marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" },
+    { name = "nvidia-cusparse-cu12", marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" },
+    { name = "nvidia-nvjitlink-cu12", marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" },
+]
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/85/48/9a13d2975803e8cf2777d5ed57b87a0b6ca2cc795f9a4f59796a910bfb80/nvidia_cusolver_cu12-11.7.3.90-py3-none-manylinux_2_27_x86_64.whl", hash = "sha256:4376c11ad263152bd50ea295c05370360776f8c3427b30991df774f9fb26c450", size = 267506905, upload-time = "2025-03-07T01:47:16.273Z" },
+]
+
+[[package]]
+name = "nvidia-cusparse-cu12"
+version = "12.5.8.93"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "nvidia-nvjitlink-cu12", marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" },
+]
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/c2/f5/e1854cb2f2bcd4280c44736c93550cc300ff4b8c95ebe370d0aa7d2b473d/nvidia_cusparse_cu12-12.5.8.93-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:1ec05d76bbbd8b61b06a80e1eaf8cf4959c3d4ce8e711b65ebd0443bb0ebb13b", size = 288216466, upload-time = "2025-03-07T01:48:13.779Z" },
+]
+
+[[package]]
+name = "nvidia-cusparselt-cu12"
+version = "0.7.1"
+source = { registry = "https://pypi.org/simple" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/56/79/12978b96bd44274fe38b5dde5cfb660b1d114f70a65ef962bcbbed99b549/nvidia_cusparselt_cu12-0.7.1-py3-none-manylinux2014_x86_64.whl", hash = "sha256:f1bb701d6b930d5a7cea44c19ceb973311500847f81b634d802b7b539dc55623", size = 287193691, upload-time = "2025-02-26T00:15:44.104Z" },
+]
+
+[[package]]
+name = "nvidia-cutlass-dsl"
+version = "4.5.2"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "nvidia-cutlass-dsl-libs-base" },
+]
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/f0/15/575d7df4fe2f3406f1cfc68be72aeff2834f8a696daf1cd5bee8017e4507/nvidia_cutlass_dsl-4.5.2-py3-none-any.whl", hash = "sha256:68ed1b63ca74aae87955012da9dfd7fdaae471329d0028b229b841c7192ccf52", size = 10179, upload-time = "2026-05-25T03:38:56.364Z" },
+]
+
+[[package]]
+name = "nvidia-cutlass-dsl-libs-base"
+version = "4.5.2"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "cuda-python", version = "12.9.4", source = { registry = "https://pypi.org/simple" }, marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" },
+    { name = "cuda-python", version = "13.3.1", source = { registry = "https://pypi.org/simple" }, marker = "sys_platform == 'emscripten' or sys_platform == 'win32'" },
+    { name = "numpy" },
+    { name = "typing-extensions" },
+]
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/b1/ef/e827e3c67d72adbf4e8f680bdf03b1b67723d9e1ae7c3d0a1751f39f69ce/nvidia_cutlass_dsl_libs_base-4.5.2-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:d2a3c412287e356fbe48fe9f845d6d33cd35dea5e20d7e4f628c20957967cacd", size = 75643473, upload-time = "2026-05-25T03:49:15.857Z" },
+    { url = "https://files.pythonhosted.org/packages/97/68/c1247ab848f26c4ab56e562eea0e3f31fc14c9aaf0d883afaa92d8f05592/nvidia_cutlass_dsl_libs_base-4.5.2-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:15ef6a59193667e663934ef4873f8ccad37455e9b7c3c419c3072113b8aedf61", size = 74513226, upload-time = "2026-05-25T03:51:32.496Z" },
+]
+
+[[package]]
+name = "nvidia-ml-py"
+version = "13.610.43"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/f0/b5/a8fbc356f768fa5c9cfd646668fd7d34bf55bdd1c6e20754642a64d930d4/nvidia_ml_py-13.610.43.tar.gz", hash = "sha256:65437eb73d68d0c62c931ca4d45038472faff03bd0b8729abba4b899f70d60f2", size = 52109, upload-time = "2026-06-01T18:54:08.829Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/23/45/caa600acfab94560807a20a64b5830d2cd3c3202b7f1328644d70b7d6bd8/nvidia_ml_py-13.610.43-py3-none-any.whl", hash = "sha256:f13c72698edef492f985cc225f14faafe68ae065a2e407f45bdf6f4b9b43fde8", size = 53163, upload-time = "2026-06-01T18:54:07.704Z" },
+]
+
+[[package]]
+name = "nvidia-nccl-cu12"
+version = "2.27.5"
+source = { registry = "https://pypi.org/simple" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/6e/89/f7a07dc961b60645dbbf42e80f2bc85ade7feb9a491b11a1e973aa00071f/nvidia_nccl_cu12-2.27.5-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:ad730cf15cb5d25fe849c6e6ca9eb5b76db16a80f13f425ac68d8e2e55624457", size = 322348229, upload-time = "2025-06-26T04:11:28.385Z" },
+]
+
+[[package]]
+name = "nvidia-nvjitlink-cu12"
+version = "12.8.93"
+source = { registry = "https://pypi.org/simple" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/f6/74/86a07f1d0f42998ca31312f998bd3b9a7eff7f52378f4f270c8679c77fb9/nvidia_nvjitlink_cu12-12.8.93-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl", hash = "sha256:81ff63371a7ebd6e6451970684f916be2eab07321b73c9d244dc2b4da7f73b88", size = 39254836, upload-time = "2025-03-07T01:49:55.661Z" },
+]
+
+[[package]]
+name = "nvidia-nvshmem-cu12"
+version = "3.4.5"
+source = { registry = "https://pypi.org/simple" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/b5/09/6ea3ea725f82e1e76684f0708bbedd871fc96da89945adeba65c3835a64c/nvidia_nvshmem_cu12-3.4.5-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:042f2500f24c021db8a06c5eec2539027d57460e1c1a762055a6554f72c369bd", size = 139103095, upload-time = "2025-09-06T00:32:31.266Z" },
+]
+
+[[package]]
+name = "nvidia-nvtx-cu12"
+version = "12.8.90"
+source = { registry = "https://pypi.org/simple" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/a2/eb/86626c1bbc2edb86323022371c39aa48df6fd8b0a1647bc274577f72e90b/nvidia_nvtx_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:5b17e2001cc0d751a5bc2c6ec6d26ad95913324a4adb86788c944f8ce9ba441f", size = 89954, upload-time = "2025-03-07T01:42:44.131Z" },
 ]
 
 [[package]]
@@ -1127,6 +1928,29 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/0a/bf/ccff9be562e24207716d04ef9dc931c76aff0c89a7265da43e2104d7fe06/openai-2.38.0-py3-none-any.whl", hash = "sha256:ec6661c57b2dcc47414a767e6e3335c7ed3d19c9696999283a3c82e95c756a3c", size = 1344910, upload-time = "2026-05-21T21:23:39.636Z" },
 ]
 
+[[package]]
+name = "openai-harmony"
+version = "0.0.8"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "pydantic" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/3e/92/2d038d096f29179c7c9571b431f9e739f87a487121901725e23fe338dd9d/openai_harmony-0.0.8.tar.gz", hash = "sha256:6e43f98e6c242fa2de6f8ea12eab24af63fa2ed3e89c06341fb9d92632c5cbdf", size = 284777, upload-time = "2025-11-05T19:07:06.727Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/45/c6/2502f416d46be3ec08bb66d696cccffb57781a499e3ff2e4d7c174af4e8f/openai_harmony-0.0.8-cp38-abi3-macosx_11_0_arm64.whl", hash = "sha256:029ec25ca74abe48fdb58eb9fdd2a8c1618581fc33ce8e5653f8a1ffbfbd9326", size = 2627806, upload-time = "2025-11-05T19:06:57.063Z" },
+    { url = "https://files.pythonhosted.org/packages/d3/d2/ce6953ca87db9cae3e775024184da7d1c5cb88cead19a2d75b42f00a959c/openai_harmony-0.0.8-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e4f709815924ec325b9a890e6ab2bbb0ceec8e319a4e257328eb752cf36b2efc", size = 2948463, upload-time = "2025-11-05T19:06:48.17Z" },
+    { url = "https://files.pythonhosted.org/packages/fa/4c/b553c9651662d6ce102ca7f3629d268b23df1abe5841e24bed81e8a8e949/openai_harmony-0.0.8-cp38-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:5cfcfd963b50a41fc656c84d3440ca6eecdccd6c552158ce790b8f2e33dfb5a9", size = 2704083, upload-time = "2025-11-05T19:06:50.205Z" },
+    { url = "https://files.pythonhosted.org/packages/9b/af/4eec8f9ab9c27bcdb444460c72cf43011d176fc44c79d6e113094ca1e152/openai_harmony-0.0.8-cp38-abi3-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:0a3a16972aa1cee38ea958470cd04ac9a2d5ac38fdcf77ab686611246220c158", size = 2959765, upload-time = "2025-11-05T19:06:53.62Z" },
+    { url = "https://files.pythonhosted.org/packages/11/3c/33f3374e4624e0e776f6b13b73c45a7ead7f9c4529f8369ed5bfcaa30cac/openai_harmony-0.0.8-cp38-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:b4d5cfa168e74d08f8ba6d58a7e49bc7daef4d58951ec69b66b0d56f4927a68d", size = 3427031, upload-time = "2025-11-05T19:06:51.829Z" },
+    { url = "https://files.pythonhosted.org/packages/25/3f/1a192b93bb47c6b44cd98ba8cc1d3d2a9308f1bb700c3017e6352da11bda/openai_harmony-0.0.8-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c007d277218a50db8839e599ed78e0fffe5130f614c3f6d93ae257f282071a29", size = 2953260, upload-time = "2025-11-05T19:06:55.406Z" },
+    { url = "https://files.pythonhosted.org/packages/5b/f8/93b582cad3531797c3db7c2db5400fd841538ccddfd9f5e3df61be99a630/openai_harmony-0.0.8-cp38-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:8565d4f5a0638da1bffde29832ed63c9e695c558611053add3b2dc0b56c92dbc", size = 3127044, upload-time = "2025-11-05T19:06:59.553Z" },
+    { url = "https://files.pythonhosted.org/packages/1d/10/4327dbf87f75ae813405fd9a9b4a5cde63d506ffed0a096a440a4cabd89c/openai_harmony-0.0.8-cp38-abi3-musllinux_1_2_armv7l.whl", hash = "sha256:cbaa3bda75ef0d8836e1f8cc84af62f971b1d756d740efc95c38c3e04c0bfde2", size = 2932931, upload-time = "2025-11-05T19:07:01.437Z" },
+    { url = "https://files.pythonhosted.org/packages/8a/c8/1774eec4f6f360ef57618fb8f52e3d3af245b2491bd0297513aa09eec04b/openai_harmony-0.0.8-cp38-abi3-musllinux_1_2_i686.whl", hash = "sha256:772922a9bd24e133950fad71eb1550836f415a88e8c77870e12d0c3bd688ddc2", size = 2996140, upload-time = "2025-11-05T19:07:03.438Z" },
+    { url = "https://files.pythonhosted.org/packages/60/c3/3d1e01e2dba517a91760e4a03e4f20ffc75039a6fe584d0e6f9b5c78fd15/openai_harmony-0.0.8-cp38-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:007b0476a1f331f8130783f901f1da6f5a7057af1a4891f1b6a31dec364189b5", size = 3205080, upload-time = "2025-11-05T19:07:05.078Z" },
+    { url = "https://files.pythonhosted.org/packages/14/63/119de431572d7c70a7bf1037034a9be6ed0a7502a7498ba7302bca5b3242/openai_harmony-0.0.8-cp38-abi3-win32.whl", hash = "sha256:a9b5f893326b28d9e935ade14b4f655f5a840942473bc89b201c25f7a15af9cf", size = 2082457, upload-time = "2025-11-05T19:07:09.631Z" },
+    { url = "https://files.pythonhosted.org/packages/40/1f/c83cf5a206c263ee70448a5ae4264682555f4d0b5bed0d2cc6ca1108103d/openai_harmony-0.0.8-cp38-abi3-win_amd64.whl", hash = "sha256:39d44f0d8f466bd56698e7ead708bead3141e27b9b87e3ab7d5a6d0e4a869ee5", size = 2438369, upload-time = "2025-11-05T19:07:08.1Z" },
+]
+
 [[package]]
 name = "openapi-pydantic"
 version = "0.5.1"
@@ -1139,6 +1963,24 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/12/cf/03675d8bd8ecbf4445504d8071adab19f5f993676795708e36402ab38263/openapi_pydantic-0.5.1-py3-none-any.whl", hash = "sha256:a3a09ef4586f5bd760a8df7f43028b60cafb6d9f61de2acba9574766255ab146", size = 96381, upload-time = "2025-01-08T19:29:25.275Z" },
 ]
 
+[[package]]
+name = "opencv-python-headless"
+version = "4.13.0.92"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "numpy" },
+]
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/79/42/2310883be3b8826ac58c3f2787b9358a2d46923d61f88fedf930bc59c60c/opencv_python_headless-4.13.0.92-cp37-abi3-macosx_13_0_arm64.whl", hash = "sha256:1a7d040ac656c11b8c38677cc8cccdc149f98535089dbe5b081e80a4e5903209", size = 46247192, upload-time = "2026-02-05T07:01:35.187Z" },
+    { url = "https://files.pythonhosted.org/packages/2d/1e/6f9e38005a6f7f22af785df42a43139d0e20f169eb5787ce8be37ee7fcc9/opencv_python_headless-4.13.0.92-cp37-abi3-macosx_14_0_x86_64.whl", hash = "sha256:3e0a6f0a37994ec6ce5f59e936be21d5d6384a4556f2d2da9c2f9c5dc948394c", size = 32568914, upload-time = "2026-02-05T07:01:51.989Z" },
+    { url = "https://files.pythonhosted.org/packages/21/76/9417a6aef9def70e467a5bf560579f816148a4c658b7d525581b356eda9e/opencv_python_headless-4.13.0.92-cp37-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:5c8cfc8e87ed452b5cecb9419473ee5560a989859fe1d10d1ce11ae87b09a2cb", size = 33703709, upload-time = "2026-02-05T10:24:46.469Z" },
+    { url = "https://files.pythonhosted.org/packages/92/ce/bd17ff5772938267fd49716e94ca24f616ff4cb1ff4c6be13085108037be/opencv_python_headless-4.13.0.92-cp37-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:0525a3d2c0b46c611e2130b5fdebc94cf404845d8fa64d2f3a3b679572a5bd22", size = 56016764, upload-time = "2026-02-05T10:26:48.904Z" },
+    { url = "https://files.pythonhosted.org/packages/8f/b4/b7bcbf7c874665825a8c8e1097e93ea25d1f1d210a3e20d4451d01da30aa/opencv_python_headless-4.13.0.92-cp37-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:eb60e36b237b1ebd40a912da5384b348df8ed534f6f644d8e0b4f103e272ba7d", size = 35010236, upload-time = "2026-02-05T10:28:11.031Z" },
+    { url = "https://files.pythonhosted.org/packages/4b/33/b5db29a6c00eb8f50708110d8d453747ca125c8b805bc437b289dbdcc057/opencv_python_headless-4.13.0.92-cp37-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:0bd48544f77c68b2941392fcdf9bcd2b9cdf00e98cb8c29b2455d194763cf99e", size = 60391106, upload-time = "2026-02-05T10:30:14.236Z" },
+    { url = "https://files.pythonhosted.org/packages/fb/c3/52cfea47cd33e53e8c0fbd6e7c800b457245c1fda7d61660b4ffe9596a7f/opencv_python_headless-4.13.0.92-cp37-abi3-win32.whl", hash = "sha256:a7cf08e5b191f4ebb530791acc0825a7986e0d0dee2a3c491184bd8599848a4b", size = 30812232, upload-time = "2026-02-05T07:02:29.594Z" },
+    { url = "https://files.pythonhosted.org/packages/4a/90/b338326131ccb2aaa3c2c85d00f41822c0050139a4bfe723cfd95455bd2d/opencv_python_headless-4.13.0.92-cp37-abi3-win_amd64.whl", hash = "sha256:77a82fe35ddcec0f62c15f2ba8a12ecc2ed4207c17b0902c7a3151ae29f37fb6", size = 40070414, upload-time = "2026-02-05T07:02:26.448Z" },
+]
+
 [[package]]
 name = "opentelemetry-api"
 version = "1.42.1"
@@ -1148,7 +1990,120 @@ dependencies = [
 ]
 sdist = { url = "https://files.pythonhosted.org/packages/b4/1c/125e1c936c0873796771b7f04f6c93b9f1bf5d424cea90fda94a99f61da8/opentelemetry_api-1.42.1.tar.gz", hash = "sha256:56c63bea9f77b62856be8c47600474acad853b2924b99b1687c4cb6297166716", size = 72296, upload-time = "2026-05-21T16:32:49.335Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/a3/ca/9520cc1f3dfbbd03ac5903bbf55833e257bc64b1cf30fa8b0d6df374d821/opentelemetry_api-1.42.1-py3-none-any.whl", hash = "sha256:51a69edacadbc03a8950ace1c4c21099cacc538820ac2c9e36277e78cebba714", size = 61311, upload-time = "2026-05-21T16:32:28.822Z" },
+    { url = "https://files.pythonhosted.org/packages/a3/ca/9520cc1f3dfbbd03ac5903bbf55833e257bc64b1cf30fa8b0d6df374d821/opentelemetry_api-1.42.1-py3-none-any.whl", hash = "sha256:51a69edacadbc03a8950ace1c4c21099cacc538820ac2c9e36277e78cebba714", size = 61311, upload-time = "2026-05-21T16:32:28.822Z" },
+]
+
+[[package]]
+name = "opentelemetry-exporter-otlp"
+version = "1.42.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "opentelemetry-exporter-otlp-proto-grpc" },
+    { name = "opentelemetry-exporter-otlp-proto-http" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/08/94/8637919a5d01f81dacf510234bc0110b944f4687a6e96b0a02adf2f6bdce/opentelemetry_exporter_otlp-1.42.1.tar.gz", hash = "sha256:2d9ebaed714377a67d224d46795ddcc11d2c877fa5de35fda70b6f3b010729a9", size = 6086, upload-time = "2026-05-21T16:32:51.963Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/6c/4d/c26080295a36fd22e201fefd7cb9c22cd203189b1af8cd73b158382b7ad8/opentelemetry_exporter_otlp-1.42.1-py3-none-any.whl", hash = "sha256:aedd54545bb0587cd45210abdc8be545af9c01413f3307786e276df1e3c83bee", size = 6733, upload-time = "2026-05-21T16:32:31.261Z" },
+]
+
+[[package]]
+name = "opentelemetry-exporter-otlp-proto-common"
+version = "1.42.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "opentelemetry-proto" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/0e/9c/216acfeaedadf2e1937f4373929b20f73197c5c4a2546d4f584b7fa63813/opentelemetry_exporter_otlp_proto_common-1.42.1.tar.gz", hash = "sha256:04f1f01fb597c4249dfcd7f8b861c902c2102369d376d9d346ff38de4469a2ee", size = 21433, upload-time = "2026-05-21T16:32:55.526Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/d6/43/2375e7612e1121a4518c17603b6e0b03ad94f565aafad53f464dc5be2bf6/opentelemetry_exporter_otlp_proto_common-1.42.1-py3-none-any.whl", hash = "sha256:f48d395ab815b444da118868977e9798ea354c25737d5cf39578ae894011c140", size = 17327, upload-time = "2026-05-21T16:32:33.387Z" },
+]
+
+[[package]]
+name = "opentelemetry-exporter-otlp-proto-grpc"
+version = "1.42.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "googleapis-common-protos" },
+    { name = "grpcio" },
+    { name = "opentelemetry-api" },
+    { name = "opentelemetry-exporter-otlp-proto-common" },
+    { name = "opentelemetry-proto" },
+    { name = "opentelemetry-sdk" },
+    { name = "typing-extensions" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/87/87/ca7fc790dfdbcf4f9e9aab14a39ef1b7508ead13707e283de0b3131478d2/opentelemetry_exporter_otlp_proto_grpc-1.42.1.tar.gz", hash = "sha256:975c4461f167dd8ed8857d68d3b6b25f3d272eab896f6a9470d0f5b90e2faf15", size = 27140, upload-time = "2026-05-21T16:32:56.162Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/89/2b/28ba5b128f47fe8c3bab541000d6feb4b5a9bd26623ca013406f01c0fb60/opentelemetry_exporter_otlp_proto_grpc-1.42.1-py3-none-any.whl", hash = "sha256:0ae1177e2038b18a929b3098215243631ef91136cba26b7e2b12790ceb7e87cc", size = 19617, upload-time = "2026-05-21T16:32:34.278Z" },
+]
+
+[[package]]
+name = "opentelemetry-exporter-otlp-proto-http"
+version = "1.42.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "googleapis-common-protos" },
+    { name = "opentelemetry-api" },
+    { name = "opentelemetry-exporter-otlp-proto-common" },
+    { name = "opentelemetry-proto" },
+    { name = "opentelemetry-sdk" },
+    { name = "requests" },
+    { name = "typing-extensions" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/77/32/826bfa1d80ecea24f47808de03cd4a0d13c17ecc07712f45123f0f61e4ac/opentelemetry_exporter_otlp_proto_http-1.42.1.tar.gz", hash = "sha256:bf142a21035d7571ac3a09cb2e5639f49886f243972883cfe777ed3bf02b734d", size = 25406, upload-time = "2026-05-21T16:32:56.807Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/d3/96/82cb223a1502f0787d4bbff12907f5f8d870a50731febcd5818d93ef9555/opentelemetry_exporter_otlp_proto_http-1.42.1-py3-none-any.whl", hash = "sha256:00a16da1b312a1d6c7233d600d557c91df71125af73020f3b9a7765bd699d59d", size = 21793, upload-time = "2026-05-21T16:32:35.277Z" },
+]
+
+[[package]]
+name = "opentelemetry-proto"
+version = "1.42.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "protobuf" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/b4/55/63eac3e1089b768ba014091fdd2ae8a9a440c821ef5e2b786909c94c8836/opentelemetry_proto-1.42.1.tar.gz", hash = "sha256:c6a51e6b4f05ae63565f3a113217f3d2bfaec68f78c02d7a6c85f9010d1cfca6", size = 45839, upload-time = "2026-05-21T16:33:03.937Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/41/9d/171c02c84a76940b7e601805b3bb536985aded9168fbcc9ba52f0a730fa2/opentelemetry_proto-1.42.1-py3-none-any.whl", hash = "sha256:dedb74cba2886c59c7789b227a7a670613025a07489040050aedff6e5c0fb43c", size = 71782, upload-time = "2026-05-21T16:32:44.867Z" },
+]
+
+[[package]]
+name = "opentelemetry-sdk"
+version = "1.42.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "opentelemetry-api" },
+    { name = "opentelemetry-semantic-conventions" },
+    { name = "typing-extensions" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/40/f7/b390bd9bfd703bf98a68fea1f27786c6872331fd617164a54b8a59bdc008/opentelemetry_sdk-1.42.1.tar.gz", hash = "sha256:8c834e8f8c9ba4171d4ec843d0cb8a67e4c7394d3f9e9297e582cbd9456ddbf7", size = 239262, upload-time = "2026-05-21T16:33:04.641Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/8f/6b/4287766cfbde577ae2272e8884abac325aeaac0d64f41c61d5b8cc595105/opentelemetry_sdk-1.42.1-py3-none-any.whl", hash = "sha256:083cd4bbfaa5aa7b5a9e552430d9951219967cfb27aa61feb13a77aba1fc839d", size = 170907, upload-time = "2026-05-21T16:32:45.894Z" },
+]
+
+[[package]]
+name = "opentelemetry-semantic-conventions"
+version = "0.63b1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "opentelemetry-api" },
+    { name = "typing-extensions" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/93/99/4d7dd6df64795951413ce6e815f8cf1eb191daf7196ae86574589643d5f3/opentelemetry_semantic_conventions-0.63b1.tar.gz", hash = "sha256:3daf963611334b365e98a57438183eb012d3bfb40b2d931a9af613476b8701a9", size = 148340, upload-time = "2026-05-21T16:33:05.455Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/cb/7a/7fe66f5f3682b1dd47d88cc4e11f1c6c0966b737de2d16671146e23c39a5/opentelemetry_semantic_conventions-0.63b1-py3-none-any.whl", hash = "sha256:dfe5ef4dee82586b746f522b818ceb298d00b3d59f660042bd79404bff8d0682", size = 203713, upload-time = "2026-05-21T16:32:47.016Z" },
+]
+
+[[package]]
+name = "opentelemetry-semantic-conventions-ai"
+version = "0.5.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "opentelemetry-sdk" },
+    { name = "opentelemetry-semantic-conventions" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/24/02/10aeacc37a38a3a8fa16ff67bec1ae3bf882539f6f9efb0f70acf802ca2d/opentelemetry_semantic_conventions_ai-0.5.1.tar.gz", hash = "sha256:153906200d8c1d2f8e09bd78dbef526916023de85ac3dab35912bfafb69ff04c", size = 26533, upload-time = "2026-03-26T14:20:38.73Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/55/22/41fb05f1dc5fda2c468e05a41814c20859016c85117b66c8a257cae814f6/opentelemetry_semantic_conventions_ai-0.5.1-py3-none-any.whl", hash = "sha256:25aeb22bd261543b4898a73824026d96770e5351209c7d07a0b1314762b1f6e4", size = 11250, upload-time = "2026-03-26T14:20:37.108Z" },
 ]
 
 [[package]]
@@ -1163,6 +2118,22 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/55/8b/5ab7257531a5d830fc8000c476e63c935488d74609b50f9384a643ec0a62/outcome-1.3.0.post0-py2.py3-none-any.whl", hash = "sha256:e771c5ce06d1415e356078d3bdd68523f284b4ce5419828922b6871e65eda82b", size = 10692, upload-time = "2023-10-26T04:26:02.532Z" },
 ]
 
+[[package]]
+name = "outlines-core"
+version = "0.2.11"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/1a/d3/e04e9145f8f806723dec9b9e5227ad695a3efcd3ced7794cf7c22b15df5e/outlines_core-0.2.11.tar.gz", hash = "sha256:dfce56f717ff5083e54cbcfdb66cad243365437fccbb5509adaa7e31e030f1d8", size = 197263, upload-time = "2025-05-19T10:12:51.719Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/5f/2c/c7636823244c70e2960060bf9bd978248dffb55c5e7c91c46d18354b2a24/outlines_core-0.2.11-cp312-cp312-macosx_14_0_arm64.whl", hash = "sha256:4a9db4872bae083631d720994f4cee603bce0536b33d5a988814576863b657cf", size = 1957668, upload-time = "2025-05-19T10:12:18.29Z" },
+    { url = "https://files.pythonhosted.org/packages/c7/09/5c62047da139d722317a444a4d01cd5f11943a8c2eaecce784341dd0844a/outlines_core-0.2.11-cp312-cp312-macosx_14_0_x86_64.whl", hash = "sha256:8359a45c59f6a8f2eb717245806501a59044c75f6ea8bd08faaa131cc8cdec45", size = 2130493, upload-time = "2025-05-19T10:12:19.537Z" },
+    { url = "https://files.pythonhosted.org/packages/89/7a/d6a2810f90e37d550168e0c0a9a915086ea721444727e3ca2c630898d1ef/outlines_core-0.2.11-cp312-cp312-macosx_15_0_arm64.whl", hash = "sha256:5d26a46591377340e0b870b8a96ea8341058341a62ee0bded9098e0c88dd24f4", size = 1956804, upload-time = "2025-05-19T10:12:20.755Z" },
+    { url = "https://files.pythonhosted.org/packages/ca/ea/339e6c273b5581128c3b7ca27d428d8993c3085912af1a467aa32ef0e9d1/outlines_core-0.2.11-cp312-cp312-macosx_15_0_x86_64.whl", hash = "sha256:ae460a34675fb11d92a5c605a480fbae4cd6c1b2d11b3698da64a7fcaba64dcf", size = 2127085, upload-time = "2025-05-19T10:12:22.02Z" },
+    { url = "https://files.pythonhosted.org/packages/92/c7/a65d1fddf49830ebc41422294eacde35286d9f68994a8aa905cb14f5aade/outlines_core-0.2.11-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:86df9740368866295077346440d911df4972da2b3f1f54b8125e6f329e8a8891", size = 2287677, upload-time = "2025-05-19T10:12:24.24Z" },
+    { url = "https://files.pythonhosted.org/packages/23/79/8795aed8be9b77dd69d78e7cfbfcf28c179e6b08da6e56bbbf48a09fe55f/outlines_core-0.2.11-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:96ce4dd78f106799be4a0a5795cefd1352806162973756a4b6fce4bb6eddd7e4", size = 2113000, upload-time = "2025-05-19T10:12:25.446Z" },
+    { url = "https://files.pythonhosted.org/packages/59/e3/cbe9294b06d92ee1892dbb6f2125d833d68e8629d45d080d6daba54eec2d/outlines_core-0.2.11-cp312-cp312-win32.whl", hash = "sha256:358db161cce3650ba822e118dcf0a1efa571c7deb4864ab9d64ca2c9cca7425d", size = 1765703, upload-time = "2025-05-19T10:12:26.693Z" },
+    { url = "https://files.pythonhosted.org/packages/1d/c9/ed3cf362515fac16e313368b9b2f2497051f4ded88679205830b6f889f54/outlines_core-0.2.11-cp312-cp312-win_amd64.whl", hash = "sha256:231f9d20d2630c70665345821780d7808b29539620a75c99f65113b518c51032", size = 2060945, upload-time = "2025-05-19T10:12:28.294Z" },
+]
+
 [[package]]
 name = "packaging"
 version = "26.2"
@@ -1193,6 +2164,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/86/bd/fda8f9705b1b09c6ebe14bfc0fa0e4ec8584d54ea673628f157ff55131af/pandas-3.0.3-cp312-cp312-win_arm64.whl", hash = "sha256:557409bc4178e70ee8d9ddb494798e51ebf6ea59330f6be22c51bab2a7db6c49", size = 9066158, upload-time = "2026-05-11T18:52:56.038Z" },
 ]
 
+[[package]]
+name = "partial-json-parser"
+version = "0.2.1.1.post7"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/6a/6d/eed37d7ebc1e0bcd27b831c0cf1fe94881934316187c4b30d23f29ea0bd4/partial_json_parser-0.2.1.1.post7.tar.gz", hash = "sha256:86590e1ba6bcb6739a2dfc17d2323f028cb5884f4c6ce23db376999132c9a922", size = 10296, upload-time = "2025-11-17T07:27:41.202Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/42/32/658973117bf0fd82a24abbfb94fe73a5e86216e49342985e10acce54775a/partial_json_parser-0.2.1.1.post7-py3-none-any.whl", hash = "sha256:145119e5eabcf80cbb13844a6b50a85c68bf99d376f8ed771e2a3c3b03e653ae", size = 10877, upload-time = "2025-11-17T07:27:40.457Z" },
+]
+
 [[package]]
 name = "pathable"
 version = "0.6.0"
@@ -1247,28 +2227,35 @@ dependencies = [
     { name = "webdriver-manager" },
 ]
 
+[package.optional-dependencies]
+reader = [
+    { name = "vllm" },
+]
+
 [package.metadata]
 requires-dist = [
-    { name = "aiohttp", specifier = "==3.13.5" },
-    { name = "beautifulsoup4", specifier = "==4.14.3" },
-    { name = "botocore", specifier = "==1.43.18" },
-    { name = "datasets", specifier = "==4.8.5" },
-    { name = "fastmcp", specifier = "==3.3.1" },
-    { name = "litellm", specifier = "==1.86.2" },
-    { name = "lxml", specifier = "==6.1.1" },
-    { name = "numpy", specifier = "==2.4.6" },
-    { name = "omegaconf", specifier = "==2.3.0" },
-    { name = "openai", specifier = "==2.38.0" },
-    { name = "pillow", specifier = "==12.2.0" },
-    { name = "requests", specifier = "==2.34.2" },
-    { name = "retry", specifier = "==0.9.2" },
-    { name = "selenium", specifier = "==4.44.0" },
-    { name = "tenacity", specifier = "==9.1.4" },
-    { name = "tiktoken", specifier = "==0.13.0" },
-    { name = "tqdm", specifier = "==4.67.3" },
-    { name = "trafilatura", specifier = "==2.0.0" },
-    { name = "webdriver-manager", specifier = "==4.1.1" },
-]
+    { name = "aiohttp", specifier = ">=3.13.5" },
+    { name = "beautifulsoup4", specifier = ">=4.14.3" },
+    { name = "botocore", specifier = ">=1.43.18" },
+    { name = "datasets", specifier = ">=4.8.5" },
+    { name = "fastmcp", specifier = ">=3.3.1" },
+    { name = "litellm", specifier = ">=1.86.2" },
+    { name = "lxml", specifier = ">=6.1.1" },
+    { name = "numpy", specifier = ">=1.26.0,<2.3" },
+    { name = "omegaconf", specifier = ">=2.3.0" },
+    { name = "openai", specifier = ">=2.38.0" },
+    { name = "pillow", specifier = ">=12.2.0" },
+    { name = "requests", specifier = ">=2.34.2" },
+    { name = "retry", specifier = ">=0.9.2" },
+    { name = "selenium", specifier = ">=4.44.0" },
+    { name = "tenacity", specifier = ">=9.1.4" },
+    { name = "tiktoken", specifier = ">=0.13.0" },
+    { name = "tqdm", specifier = ">=4.67.3" },
+    { name = "trafilatura", specifier = ">=2.0.0" },
+    { name = "vllm", marker = "extra == 'reader'", specifier = "==0.19.0" },
+    { name = "webdriver-manager", specifier = ">=4.1.1" },
+]
+provides-extras = ["reader"]
 
 [[package]]
 name = "platformdirs"
@@ -1279,6 +2266,28 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/81/e6/cd9575ac904136b3cbf7aa7ee819ef86eedb7274e46f230e94ea4342e729/platformdirs-4.10.0-py3-none-any.whl", hash = "sha256:fb516cdb12eb0d857d0cd85a7c57cea4d060bee4578d6cf5a14dfdf8cbf8784a", size = 22743, upload-time = "2026-05-28T03:32:52.175Z" },
 ]
 
+[[package]]
+name = "prometheus-client"
+version = "0.25.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/1b/fb/d9aa83ffe43ce1f19e557c0971d04b90561b0cfd50762aafb01968285553/prometheus_client-0.25.0.tar.gz", hash = "sha256:5e373b75c31afb3c86f1a52fa1ad470c9aace18082d39ec0d2f918d11cc9ba28", size = 86035, upload-time = "2026-04-09T19:53:42.359Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/8d/9b/d4b1e644385499c8346fa9b622a3f030dce14cd6ef8a1871c221a17a67e7/prometheus_client-0.25.0-py3-none-any.whl", hash = "sha256:d5aec89e349a6ec230805d0df882f3807f74fd6c1a2fa86864e3c2279059fed1", size = 64154, upload-time = "2026-04-09T19:53:41.324Z" },
+]
+
+[[package]]
+name = "prometheus-fastapi-instrumentator"
+version = "8.0.2"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "prometheus-client" },
+    { name = "starlette" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/1b/e9/2065686d1dfa62296fdc158b6e8fd25b0cb3dca09b0632cabeb5ae81fe4d/prometheus_fastapi_instrumentator-8.0.2.tar.gz", hash = "sha256:3c252e748151768a7aefd66824a04a870144f71de48a67aed211749a9ca2a548", size = 21342, upload-time = "2026-06-23T09:39:31.611Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/c6/c7/fa2b3f469a2e6001b829e0d6bc8680755349aa1329a87bf48731e9d5d30a/prometheus_fastapi_instrumentator-8.0.2-py3-none-any.whl", hash = "sha256:746002ec1e2c58b93f61444e1d104de959a9463a6a3f1c8909ac3757e16c3866", size = 20549, upload-time = "2026-06-23T09:39:32.616Z" },
+]
+
 [[package]]
 name = "propcache"
 version = "0.5.2"
@@ -1305,6 +2314,37 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/3a/ed/1cdcab6ba3d6ab7feca11fc14f0eeea80755bb53ef4e892079f31b10a25f/propcache-0.5.2-py3-none-any.whl", hash = "sha256:be1ddfcbb376e3de5d2e2db1d58d6d67463e6b4f9f040c000de8e300295465fe", size = 14036, upload-time = "2026-05-08T21:02:10.673Z" },
 ]
 
+[[package]]
+name = "protobuf"
+version = "6.33.6"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/66/70/e908e9c5e52ef7c3a6c7902c9dfbb34c7e29c25d2f81ade3856445fd5c94/protobuf-6.33.6.tar.gz", hash = "sha256:a6768d25248312c297558af96a9f9c929e8c4cee0659cb07e780731095f38135", size = 444531, upload-time = "2026-03-18T19:05:00.988Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/fc/9f/2f509339e89cfa6f6a4c4ff50438db9ca488dec341f7e454adad60150b00/protobuf-6.33.6-cp310-abi3-win32.whl", hash = "sha256:7d29d9b65f8afef196f8334e80d6bc1d5d4adedb449971fefd3723824e6e77d3", size = 425739, upload-time = "2026-03-18T19:04:48.373Z" },
+    { url = "https://files.pythonhosted.org/packages/76/5d/683efcd4798e0030c1bab27374fd13a89f7c2515fb1f3123efdfaa5eab57/protobuf-6.33.6-cp310-abi3-win_amd64.whl", hash = "sha256:0cd27b587afca21b7cfa59a74dcbd48a50f0a6400cfb59391340ad729d91d326", size = 437089, upload-time = "2026-03-18T19:04:50.381Z" },
+    { url = "https://files.pythonhosted.org/packages/5c/01/a3c3ed5cd186f39e7880f8303cc51385a198a81469d53d0fdecf1f64d929/protobuf-6.33.6-cp39-abi3-macosx_10_9_universal2.whl", hash = "sha256:9720e6961b251bde64edfdab7d500725a2af5280f3f4c87e57c0208376aa8c3a", size = 427737, upload-time = "2026-03-18T19:04:51.866Z" },
+    { url = "https://files.pythonhosted.org/packages/ee/90/b3c01fdec7d2f627b3a6884243ba328c1217ed2d978def5c12dc50d328a3/protobuf-6.33.6-cp39-abi3-manylinux2014_aarch64.whl", hash = "sha256:e2afbae9b8e1825e3529f88d514754e094278bb95eadc0e199751cdd9a2e82a2", size = 324610, upload-time = "2026-03-18T19:04:53.096Z" },
+    { url = "https://files.pythonhosted.org/packages/9b/ca/25afc144934014700c52e05103c2421997482d561f3101ff352e1292fb81/protobuf-6.33.6-cp39-abi3-manylinux2014_s390x.whl", hash = "sha256:c96c37eec15086b79762ed265d59ab204dabc53056e3443e702d2681f4b39ce3", size = 339381, upload-time = "2026-03-18T19:04:54.616Z" },
+    { url = "https://files.pythonhosted.org/packages/16/92/d1e32e3e0d894fe00b15ce28ad4944ab692713f2e7f0a99787405e43533a/protobuf-6.33.6-cp39-abi3-manylinux2014_x86_64.whl", hash = "sha256:e9db7e292e0ab79dd108d7f1a94fe31601ce1ee3f7b79e0692043423020b0593", size = 323436, upload-time = "2026-03-18T19:04:55.768Z" },
+    { url = "https://files.pythonhosted.org/packages/c4/72/02445137af02769918a93807b2b7890047c32bfb9f90371cbc12688819eb/protobuf-6.33.6-py3-none-any.whl", hash = "sha256:77179e006c476e69bf8e8ce866640091ec42e1beb80b213c3900006ecfba6901", size = 170656, upload-time = "2026-03-18T19:04:59.826Z" },
+]
+
+[[package]]
+name = "psutil"
+version = "7.2.2"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/aa/c6/d1ddf4abb55e93cebc4f2ed8b5d6dbad109ecb8d63748dd2b20ab5e57ebe/psutil-7.2.2.tar.gz", hash = "sha256:0746f5f8d406af344fd547f1c8daa5f5c33dbc293bb8d6a16d80b4bb88f59372", size = 493740, upload-time = "2026-01-28T18:14:54.428Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/e7/36/5ee6e05c9bd427237b11b3937ad82bb8ad2752d72c6969314590dd0c2f6e/psutil-7.2.2-cp36-abi3-macosx_10_9_x86_64.whl", hash = "sha256:ed0cace939114f62738d808fdcecd4c869222507e266e574799e9c0faa17d486", size = 129090, upload-time = "2026-01-28T18:15:22.168Z" },
+    { url = "https://files.pythonhosted.org/packages/80/c4/f5af4c1ca8c1eeb2e92ccca14ce8effdeec651d5ab6053c589b074eda6e1/psutil-7.2.2-cp36-abi3-macosx_11_0_arm64.whl", hash = "sha256:1a7b04c10f32cc88ab39cbf606e117fd74721c831c98a27dc04578deb0c16979", size = 129859, upload-time = "2026-01-28T18:15:23.795Z" },
+    { url = "https://files.pythonhosted.org/packages/b5/70/5d8df3b09e25bce090399cf48e452d25c935ab72dad19406c77f4e828045/psutil-7.2.2-cp36-abi3-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:076a2d2f923fd4821644f5ba89f059523da90dc9014e85f8e45a5774ca5bc6f9", size = 155560, upload-time = "2026-01-28T18:15:25.976Z" },
+    { url = "https://files.pythonhosted.org/packages/63/65/37648c0c158dc222aba51c089eb3bdfa238e621674dc42d48706e639204f/psutil-7.2.2-cp36-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:b0726cecd84f9474419d67252add4ac0cd9811b04d61123054b9fb6f57df6e9e", size = 156997, upload-time = "2026-01-28T18:15:27.794Z" },
+    { url = "https://files.pythonhosted.org/packages/8e/13/125093eadae863ce03c6ffdbae9929430d116a246ef69866dad94da3bfbc/psutil-7.2.2-cp36-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:fd04ef36b4a6d599bbdb225dd1d3f51e00105f6d48a28f006da7f9822f2606d8", size = 148972, upload-time = "2026-01-28T18:15:29.342Z" },
+    { url = "https://files.pythonhosted.org/packages/04/78/0acd37ca84ce3ddffaa92ef0f571e073faa6d8ff1f0559ab1272188ea2be/psutil-7.2.2-cp36-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:b58fabe35e80b264a4e3bb23e6b96f9e45a3df7fb7eed419ac0e5947c61e47cc", size = 148266, upload-time = "2026-01-28T18:15:31.597Z" },
+    { url = "https://files.pythonhosted.org/packages/b4/90/e2159492b5426be0c1fef7acba807a03511f97c5f86b3caeda6ad92351a7/psutil-7.2.2-cp37-abi3-win_amd64.whl", hash = "sha256:eb7e81434c8d223ec4a219b5fc1c47d0417b12be7ea866e24fb5ad6e84b3d988", size = 137737, upload-time = "2026-01-28T18:15:33.849Z" },
+    { url = "https://files.pythonhosted.org/packages/8c/c7/7bb2e321574b10df20cbde462a94e2b71d05f9bbda251ef27d104668306a/psutil-7.2.2-cp37-abi3-win_arm64.whl", hash = "sha256:8c233660f575a5a89e6d4cb65d9f938126312bca76d8fe087b947b3a1aaac9ee", size = 134617, upload-time = "2026-01-28T18:15:36.514Z" },
+]
+
 [[package]]
 name = "py"
 version = "1.11.0"
@@ -1314,6 +2354,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/f6/f0/10642828a8dfb741e5f3fbaac830550a518a775c7fff6f04a007259b0548/py-1.11.0-py2.py3-none-any.whl", hash = "sha256:607c53218732647dff4acdfcd50cb62615cedf612e72d1724fb1a0cc6405b378", size = 98708, upload-time = "2021-11-04T17:17:00.152Z" },
 ]
 
+[[package]]
+name = "py-cpuinfo"
+version = "9.0.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/37/a8/d832f7293ebb21690860d2e01d8115e5ff6f2ae8bbdc953f0eb0fa4bd2c7/py-cpuinfo-9.0.0.tar.gz", hash = "sha256:3cdbbf3fac90dc6f118bfd64384f309edeadd902d7c8fb17f02ffa1fc3f49690", size = 104716, upload-time = "2022-10-25T20:38:06.303Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/e0/a9/023730ba63db1e494a271cb018dcd361bd2c917ba7004c3e49d5daf795a2/py_cpuinfo-9.0.0-py3-none-any.whl", hash = "sha256:859625bc251f64e21f077d099d4162689c762b5d6a4c3c97553d56241c9674d5", size = 22335, upload-time = "2022-10-25T20:38:27.636Z" },
+]
+
 [[package]]
 name = "py-key-value-aio"
 version = "0.4.5"
@@ -1354,6 +2403,47 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/66/1c/e3e72c8014ad2743ca64a701652c733cc5cbcee15c0463a32a8c55518d9e/pyarrow-24.0.0-cp312-cp312-win_amd64.whl", hash = "sha256:295f0a7f2e242dabd513737cf076007dc5b2d59237e3eca37b05c0c6446f3826", size = 27355660, upload-time = "2026-04-21T10:48:01.718Z" },
 ]
 
+[[package]]
+name = "pybase64"
+version = "1.4.3"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/aa/b8/4ed5c7ad5ec15b08d35cc79ace6145d5c1ae426e46435f4987379439dfea/pybase64-1.4.3.tar.gz", hash = "sha256:c2ed274c9e0ba9c8f9c4083cfe265e66dd679126cd9c2027965d807352f3f053", size = 137272, upload-time = "2025-12-06T13:27:04.013Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/86/a7/efcaa564f091a2af7f18a83c1c4875b1437db56ba39540451dc85d56f653/pybase64-1.4.3-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:18d85e5ab8b986bb32d8446aca6258ed80d1bafe3603c437690b352c648f5967", size = 38167, upload-time = "2025-12-06T13:23:16.821Z" },
+    { url = "https://files.pythonhosted.org/packages/db/c7/c7ad35adff2d272bf2930132db2b3eea8c44bb1b1f64eb9b2b8e57cde7b4/pybase64-1.4.3-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:3f5791a3491d116d0deaf4d83268f48792998519698f8751efb191eac84320e9", size = 31673, upload-time = "2025-12-06T13:23:17.835Z" },
+    { url = "https://files.pythonhosted.org/packages/43/1b/9a8cab0042b464e9a876d5c65fe5127445a2436da36fda64899b119b1a1b/pybase64-1.4.3-cp312-cp312-manylinux1_i686.manylinux2014_i686.manylinux_2_17_i686.manylinux_2_5_i686.whl", hash = "sha256:f0b3f200c3e06316f6bebabd458b4e4bcd4c2ca26af7c0c766614d91968dee27", size = 68210, upload-time = "2025-12-06T13:23:18.813Z" },
+    { url = "https://files.pythonhosted.org/packages/62/f7/965b79ff391ad208b50e412b5d3205ccce372a2d27b7218ae86d5295b105/pybase64-1.4.3-cp312-cp312-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:bb632edfd132b3eaf90c39c89aa314beec4e946e210099b57d40311f704e11d4", size = 71599, upload-time = "2025-12-06T13:23:20.195Z" },
+    { url = "https://files.pythonhosted.org/packages/03/4b/a3b5175130b3810bbb8ccfa1edaadbd3afddb9992d877c8a1e2f274b476e/pybase64-1.4.3-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:356ef1d74648ce997f5a777cf8f1aefecc1c0b4fe6201e0ef3ec8a08170e1b54", size = 59922, upload-time = "2025-12-06T13:23:21.487Z" },
+    { url = "https://files.pythonhosted.org/packages/da/5d/c38d1572027fc601b62d7a407721688b04b4d065d60ca489912d6893e6cf/pybase64-1.4.3-cp312-cp312-manylinux2014_armv7l.manylinux_2_17_armv7l.whl", hash = "sha256:c48361f90db32bacaa5518419d4eb9066ba558013aaf0c7781620279ecddaeb9", size = 56712, upload-time = "2025-12-06T13:23:22.77Z" },
+    { url = "https://files.pythonhosted.org/packages/e7/d4/4e04472fef485caa8f561d904d4d69210a8f8fc1608ea15ebd9012b92655/pybase64-1.4.3-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.whl", hash = "sha256:702bcaa16ae02139d881aeaef5b1c8ffb4a3fae062fe601d1e3835e10310a517", size = 59300, upload-time = "2025-12-06T13:23:24.543Z" },
+    { url = "https://files.pythonhosted.org/packages/86/e7/16e29721b86734b881d09b7e23dfd7c8408ad01a4f4c7525f3b1088e25ec/pybase64-1.4.3-cp312-cp312-manylinux2014_s390x.manylinux_2_17_s390x.whl", hash = "sha256:53d0ffe1847b16b647c6413d34d1de08942b7724273dd57e67dcbdb10c574045", size = 60278, upload-time = "2025-12-06T13:23:25.608Z" },
+    { url = "https://files.pythonhosted.org/packages/b1/02/18515f211d7c046be32070709a8efeeef8a0203de4fd7521e6b56404731b/pybase64-1.4.3-cp312-cp312-manylinux_2_31_riscv64.whl", hash = "sha256:9a1792e8b830a92736dae58f0c386062eb038dfe8004fb03ba33b6083d89cd43", size = 54817, upload-time = "2025-12-06T13:23:26.633Z" },
+    { url = "https://files.pythonhosted.org/packages/e7/be/14e29d8e1a481dbff151324c96dd7b5d2688194bb65dc8a00ca0e1ad1e86/pybase64-1.4.3-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:1d468b1b1ac5ad84875a46eaa458663c3721e8be5f155ade356406848d3701f6", size = 58611, upload-time = "2025-12-06T13:23:27.684Z" },
+    { url = "https://files.pythonhosted.org/packages/b4/8a/a2588dfe24e1bbd742a554553778ab0d65fdf3d1c9a06d10b77047d142aa/pybase64-1.4.3-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:e97b7bdbd62e71898cd542a6a9e320d9da754ff3ebd02cb802d69087ee94d468", size = 52404, upload-time = "2025-12-06T13:23:28.714Z" },
+    { url = "https://files.pythonhosted.org/packages/27/fc/afcda7445bebe0cbc38cafdd7813234cdd4fc5573ff067f1abf317bb0cec/pybase64-1.4.3-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:b33aeaa780caaa08ffda87fc584d5eab61e3d3bbb5d86ead02161dc0c20d04bc", size = 68817, upload-time = "2025-12-06T13:23:30.079Z" },
+    { url = "https://files.pythonhosted.org/packages/d3/3a/87c3201e555ed71f73e961a787241a2438c2bbb2ca8809c29ddf938a3157/pybase64-1.4.3-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:1c0efcf78f11cf866bed49caa7b97552bc4855a892f9cc2372abcd3ed0056f0d", size = 57854, upload-time = "2025-12-06T13:23:31.17Z" },
+    { url = "https://files.pythonhosted.org/packages/fd/7d/931c2539b31a7b375e7d595b88401eeb5bd6c5ce1059c9123f9b608aaa14/pybase64-1.4.3-cp312-cp312-musllinux_1_2_riscv64.whl", hash = "sha256:66e3791f2ed725a46593f8bd2761ff37d01e2cdad065b1dceb89066f476e50c6", size = 54333, upload-time = "2025-12-06T13:23:32.422Z" },
+    { url = "https://files.pythonhosted.org/packages/de/5e/537601e02cc01f27e9d75f440f1a6095b8df44fc28b1eef2cd739aea8cec/pybase64-1.4.3-cp312-cp312-musllinux_1_2_s390x.whl", hash = "sha256:72bb0b6bddadab26e1b069bb78e83092711a111a80a0d6b9edcb08199ad7299b", size = 56492, upload-time = "2025-12-06T13:23:33.515Z" },
+    { url = "https://files.pythonhosted.org/packages/96/97/2a2e57acf8f5c9258d22aba52e71f8050e167b29ed2ee1113677c1b600c1/pybase64-1.4.3-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:5b3365dbcbcdb0a294f0f50af0c0a16b27a232eddeeb0bceeefd844ef30d2a23", size = 70974, upload-time = "2025-12-06T13:23:36.27Z" },
+    { url = "https://files.pythonhosted.org/packages/75/2e/a9e28941c6dab6f06e6d3f6783d3373044be9b0f9a9d3492c3d8d2260ac0/pybase64-1.4.3-cp312-cp312-win32.whl", hash = "sha256:7bca1ed3a5df53305c629ca94276966272eda33c0d71f862d2d3d043f1e1b91a", size = 33686, upload-time = "2025-12-06T13:23:37.848Z" },
+    { url = "https://files.pythonhosted.org/packages/83/e3/507ab649d8c3512c258819c51d25c45d6e29d9ca33992593059e7b646a33/pybase64-1.4.3-cp312-cp312-win_amd64.whl", hash = "sha256:9f2da8f56d9b891b18b4daf463a0640eae45a80af548ce435be86aa6eff3603b", size = 35833, upload-time = "2025-12-06T13:23:38.877Z" },
+    { url = "https://files.pythonhosted.org/packages/bc/8a/6eba66cd549a2fc74bb4425fd61b839ba0ab3022d3c401b8a8dc2cc00c7a/pybase64-1.4.3-cp312-cp312-win_arm64.whl", hash = "sha256:0631d8a2d035de03aa9bded029b9513e1fee8ed80b7ddef6b8e9389ffc445da0", size = 31185, upload-time = "2025-12-06T13:23:39.908Z" },
+    { url = "https://files.pythonhosted.org/packages/17/45/92322aec1b6979e789b5710f73c59f2172bc37c8ce835305434796824b7b/pybase64-1.4.3-graalpy312-graalpy250_312_native-macosx_10_13_x86_64.whl", hash = "sha256:2baaa092f3475f3a9c87ac5198023918ea8b6c125f4c930752ab2cbe3cd1d520", size = 38746, upload-time = "2025-12-06T13:26:25.869Z" },
+    { url = "https://files.pythonhosted.org/packages/11/94/f1a07402870388fdfc2ecec0c718111189732f7d0f2d7fe1386e19e8fad0/pybase64-1.4.3-graalpy312-graalpy250_312_native-macosx_11_0_arm64.whl", hash = "sha256:cde13c0764b1af07a631729f26df019070dad759981d6975527b7e8ecb465b6c", size = 32573, upload-time = "2025-12-06T13:26:27.792Z" },
+    { url = "https://files.pythonhosted.org/packages/fa/8f/43c3bb11ca9bacf81cb0b7a71500bb65b2eda6d5fe07433c09b543de97f3/pybase64-1.4.3-graalpy312-graalpy250_312_native-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:5c29a582b0ea3936d02bd6fe9bf674ab6059e6e45ab71c78404ab2c913224414", size = 43461, upload-time = "2025-12-06T13:26:28.906Z" },
+    { url = "https://files.pythonhosted.org/packages/2d/4c/2a5258329200be57497d3972b5308558c6de42e3749c6cc2aa1cbe34b25a/pybase64-1.4.3-graalpy312-graalpy250_312_native-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:b6b664758c804fa919b4f1257aa8cf68e95db76fc331de5f70bfc3a34655afe1", size = 36058, upload-time = "2025-12-06T13:26:30.092Z" },
+    { url = "https://files.pythonhosted.org/packages/ea/6d/41faa414cde66ec023b0ca8402a8f11cb61731c3dc27c082909cbbd1f929/pybase64-1.4.3-graalpy312-graalpy250_312_native-win_amd64.whl", hash = "sha256:f7537fa22ae56a0bf51e4b0ffc075926ad91c618e1416330939f7ef366b58e3b", size = 36231, upload-time = "2025-12-06T13:26:31.656Z" },
+]
+
+[[package]]
+name = "pycountry"
+version = "26.2.16"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/de/1d/061b9e7a48b85cfd69f33c33d2ef784a531c359399ad764243399673c8f5/pycountry-26.2.16.tar.gz", hash = "sha256:5b6027d453fcd6060112b951dd010f01f168b51b4bf8a1f1fc8c95c8d94a0801", size = 7711342, upload-time = "2026-02-17T03:42:52.367Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/9c/42/7703bd45b62fecd44cd7d3495423097e2f7d28bc2e99e7c1af68892ab157/pycountry-26.2.16-py3-none-any.whl", hash = "sha256:115c4baf7cceaa30f59a4694d79483c9167dbce7a9de4d3d571c5f3ea77c305a", size = 8044600, upload-time = "2026-02-17T03:42:49.777Z" },
+]
+
 [[package]]
 name = "pycparser"
 version = "3.0"
@@ -1413,6 +2503,24 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/fa/c3/7c8b240552251faf6b3a957db200fcfbbcec36763c050428b601e0c9b83b/pydantic_core-2.46.4-graalpy312-graalpy250_312_native-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:00c603d540afdd6b80eb39f078f33ebd46211f02f33e34a32d9f053bba711de0", size = 2147590, upload-time = "2026-05-06T13:39:29.883Z" },
 ]
 
+[[package]]
+name = "pydantic-extra-types"
+version = "2.11.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "pydantic" },
+    { name = "typing-extensions" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/66/71/dba38ee2651f84f7842206adbd2233d8bbdb59fb85e9fa14232486a8c471/pydantic_extra_types-2.11.1.tar.gz", hash = "sha256:46792d2307383859e923d8fcefa82108b1a141f8a9c0198982b3832ab5ef1049", size = 172002, upload-time = "2026-03-16T08:08:03.92Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/17/c1/3226e6d7f5a4f736f38ac11a6fbb262d701889802595cdb0f53a885ac2e0/pydantic_extra_types-2.11.1-py3-none-any.whl", hash = "sha256:1722ea2bddae5628ace25f2aa685b69978ef533123e5638cfbddb999e0100ec1", size = 79526, upload-time = "2026-03-16T08:08:02.533Z" },
+]
+
+[package.optional-dependencies]
+pycountry = [
+    { name = "pycountry" },
+]
+
 [[package]]
 name = "pydantic-settings"
 version = "2.14.1"
@@ -1489,6 +2597,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/0b/d7/1959b9648791274998a9c3526f6d0ec8fd2233e4d4acce81bbae76b44b2a/python_dotenv-1.2.2-py3-none-any.whl", hash = "sha256:1d8214789a24de455a8b8bd8ae6fe3c6b69a5e3d64aa8a8e5d68e694bbcb285a", size = 22101, upload-time = "2026-03-01T16:00:25.09Z" },
 ]
 
+[[package]]
+name = "python-json-logger"
+version = "4.1.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/f7/ff/3cc9165fd44106973cd7ac9facb674a65ed853494592541d339bdc9a30eb/python_json_logger-4.1.0.tar.gz", hash = "sha256:b396b9e3ed782b09ff9d6e4f1683d46c83ad0d35d2e407c09a9ebbf038f88195", size = 17573, upload-time = "2026-03-29T04:39:56.805Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/27/be/0631a861af4d1c875f096c07d34e9a63639560a717130e7a87cbc82b7e3f/python_json_logger-4.1.0-py3-none-any.whl", hash = "sha256:132994765cf75bf44554be9aa49b06ef2345d23661a96720262716438141b6b2", size = 15021, upload-time = "2026-03-29T04:39:55.266Z" },
+]
+
 [[package]]
 name = "python-multipart"
 version = "0.0.30"
@@ -1544,6 +2661,43 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/1a/08/67bd04656199bbb51dbed1439b7f27601dfb576fb864099c7ef0c3e55531/pyyaml-6.0.3-cp312-cp312-win_arm64.whl", hash = "sha256:64386e5e707d03a7e172c0701abfb7e10f0fb753ee1d773128192742712a98fd", size = 140344, upload-time = "2025-09-25T21:32:22.617Z" },
 ]
 
+[[package]]
+name = "pyzmq"
+version = "27.1.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "cffi", marker = "implementation_name == 'pypy'" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/04/0b/3c9baedbdf613ecaa7aa07027780b8867f57b6293b6ee50de316c9f3222b/pyzmq-27.1.0.tar.gz", hash = "sha256:ac0765e3d44455adb6ddbf4417dcce460fc40a05978c08efdf2948072f6db540", size = 281750, upload-time = "2025-09-08T23:10:18.157Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/92/e7/038aab64a946d535901103da16b953c8c9cc9c961dadcbf3609ed6428d23/pyzmq-27.1.0-cp312-abi3-macosx_10_15_universal2.whl", hash = "sha256:452631b640340c928fa343801b0d07eb0c3789a5ffa843f6e1a9cee0ba4eb4fc", size = 1306279, upload-time = "2025-09-08T23:08:03.807Z" },
+    { url = "https://files.pythonhosted.org/packages/e8/5e/c3c49fdd0f535ef45eefcc16934648e9e59dace4a37ee88fc53f6cd8e641/pyzmq-27.1.0-cp312-abi3-manylinux2014_i686.manylinux_2_17_i686.whl", hash = "sha256:1c179799b118e554b66da67d88ed66cd37a169f1f23b5d9f0a231b4e8d44a113", size = 895645, upload-time = "2025-09-08T23:08:05.301Z" },
+    { url = "https://files.pythonhosted.org/packages/f8/e5/b0b2504cb4e903a74dcf1ebae157f9e20ebb6ea76095f6cfffea28c42ecd/pyzmq-27.1.0-cp312-abi3-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:3837439b7f99e60312f0c926a6ad437b067356dc2bc2ec96eb395fd0fe804233", size = 652574, upload-time = "2025-09-08T23:08:06.828Z" },
+    { url = "https://files.pythonhosted.org/packages/f8/9b/c108cdb55560eaf253f0cbdb61b29971e9fb34d9c3499b0e96e4e60ed8a5/pyzmq-27.1.0-cp312-abi3-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:43ad9a73e3da1fab5b0e7e13402f0b2fb934ae1c876c51d0afff0e7c052eca31", size = 840995, upload-time = "2025-09-08T23:08:08.396Z" },
+    { url = "https://files.pythonhosted.org/packages/c2/bb/b79798ca177b9eb0825b4c9998c6af8cd2a7f15a6a1a4272c1d1a21d382f/pyzmq-27.1.0-cp312-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:0de3028d69d4cdc475bfe47a6128eb38d8bc0e8f4d69646adfbcd840facbac28", size = 1642070, upload-time = "2025-09-08T23:08:09.989Z" },
+    { url = "https://files.pythonhosted.org/packages/9c/80/2df2e7977c4ede24c79ae39dcef3899bfc5f34d1ca7a5b24f182c9b7a9ca/pyzmq-27.1.0-cp312-abi3-musllinux_1_2_i686.whl", hash = "sha256:cf44a7763aea9298c0aa7dbf859f87ed7012de8bda0f3977b6fb1d96745df856", size = 2021121, upload-time = "2025-09-08T23:08:11.907Z" },
+    { url = "https://files.pythonhosted.org/packages/46/bd/2d45ad24f5f5ae7e8d01525eb76786fa7557136555cac7d929880519e33a/pyzmq-27.1.0-cp312-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:f30f395a9e6fbca195400ce833c731e7b64c3919aa481af4d88c3759e0cb7496", size = 1878550, upload-time = "2025-09-08T23:08:13.513Z" },
+    { url = "https://files.pythonhosted.org/packages/e6/2f/104c0a3c778d7c2ab8190e9db4f62f0b6957b53c9d87db77c284b69f33ea/pyzmq-27.1.0-cp312-abi3-win32.whl", hash = "sha256:250e5436a4ba13885494412b3da5d518cd0d3a278a1ae640e113c073a5f88edd", size = 559184, upload-time = "2025-09-08T23:08:15.163Z" },
+    { url = "https://files.pythonhosted.org/packages/fc/7f/a21b20d577e4100c6a41795842028235998a643b1ad406a6d4163ea8f53e/pyzmq-27.1.0-cp312-abi3-win_amd64.whl", hash = "sha256:9ce490cf1d2ca2ad84733aa1d69ce6855372cb5ce9223802450c9b2a7cba0ccf", size = 619480, upload-time = "2025-09-08T23:08:17.192Z" },
+    { url = "https://files.pythonhosted.org/packages/78/c2/c012beae5f76b72f007a9e91ee9401cb88c51d0f83c6257a03e785c81cc2/pyzmq-27.1.0-cp312-abi3-win_arm64.whl", hash = "sha256:75a2f36223f0d535a0c919e23615fc85a1e23b71f40c7eb43d7b1dedb4d8f15f", size = 552993, upload-time = "2025-09-08T23:08:18.926Z" },
+]
+
+[[package]]
+name = "quack-kernels"
+version = "0.5.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "apache-tvm-ffi" },
+    { name = "einops" },
+    { name = "nvidia-cutlass-dsl" },
+    { name = "torch" },
+    { name = "torch-c-dlpack-ext" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/e1/94/ee76e3a3dc74d986b7b24c5928f1d14b01bd5152375688c2ede369f6d19b/quack_kernels-0.5.0.tar.gz", hash = "sha256:c7c7338b67243397b6ca166e648bba161076e99f3858b532e1c877dcc6eaa03d", size = 366426, upload-time = "2026-05-29T05:00:25.985Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/2d/2b/a8f171d5e172880885571bf89e93204aaf231a0e92c4c84714eaf18c271a/quack_kernels-0.5.0-py3-none-any.whl", hash = "sha256:08821ebfb8e638cc20308d5c59410c6dbb3b637ccc7b07bd57c7a9261a06af74", size = 327709, upload-time = "2026-05-29T05:00:24.679Z" },
+]
+
 [[package]]
 name = "referencing"
 version = "0.37.0"
@@ -1636,6 +2790,43 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/a0/3d/55c17d3ebdf3cd81356002afe5bef9bb8af631db2819785b6eac845b925b/rich_rst-2.0.1-py3-none-any.whl", hash = "sha256:7ee15f345ce25fa02b582c272a6cdbaf0c21243e38061cea273cff659bf3ef61", size = 272922, upload-time = "2026-05-16T00:47:55.508Z" },
 ]
 
+[[package]]
+name = "rich-toolkit"
+version = "0.20.1"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "click" },
+    { name = "rich" },
+    { name = "typing-extensions" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/29/63/3e427c62f1992945c997d4ec31e2fcb37d26aadbe5aa44ae5b29f7f64d26/rich_toolkit-0.20.1.tar.gz", hash = "sha256:c7336ae281f435c785acecaedc4b71d4b663dc73d9c8079fea96372527e822a4", size = 203473, upload-time = "2026-06-05T08:56:57.679Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/00/88/309f07d08155da2ba1d5ceb42d270fb42fbe34a807684543e3ffc10fe713/rich_toolkit-0.20.1-py3-none-any.whl", hash = "sha256:2a6d5f8e15759b9eba5a9ee63da10b275359ead20e5a0fc92bd5b4dbae8ce4bf", size = 35525, upload-time = "2026-06-05T08:56:58.586Z" },
+]
+
+[[package]]
+name = "rignore"
+version = "0.7.6"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/e5/f5/8bed2310abe4ae04b67a38374a4d311dd85220f5d8da56f47ae9361be0b0/rignore-0.7.6.tar.gz", hash = "sha256:00d3546cd793c30cb17921ce674d2c8f3a4b00501cb0e3dd0e82217dbeba2671", size = 57140, upload-time = "2025-11-05T21:41:21.968Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/0b/0e/012556ef3047a2628842b44e753bb15f4dc46806780ff090f1e8fe4bf1eb/rignore-0.7.6-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:03e82348cb7234f8d9b2834f854400ddbbd04c0f8f35495119e66adbd37827a8", size = 883488, upload-time = "2025-11-05T20:42:41.359Z" },
+    { url = "https://files.pythonhosted.org/packages/93/b0/d4f1f3fe9eb3f8e382d45ce5b0547ea01c4b7e0b4b4eb87bcd66a1d2b888/rignore-0.7.6-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:b9e624f6be6116ea682e76c5feb71ea91255c67c86cb75befe774365b2931961", size = 820411, upload-time = "2025-11-05T20:42:24.782Z" },
+    { url = "https://files.pythonhosted.org/packages/4a/c8/dea564b36dedac8de21c18e1851789545bc52a0c22ece9843444d5608a6a/rignore-0.7.6-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:bda49950d405aa8d0ebe26af807c4e662dd281d926530f03f29690a2e07d649a", size = 897821, upload-time = "2025-11-05T20:40:52.613Z" },
+    { url = "https://files.pythonhosted.org/packages/b3/2b/ee96db17ac1835e024c5d0742eefb7e46de60020385ac883dd3d1cde2c1f/rignore-0.7.6-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:b5fd5ab3840b8c16851d327ed06e9b8be6459702a53e5ab1fc4073b684b3789e", size = 873963, upload-time = "2025-11-05T20:41:07.49Z" },
+    { url = "https://files.pythonhosted.org/packages/a5/8c/ad5a57bbb9d14d5c7e5960f712a8a0b902472ea3f4a2138cbf70d1777b75/rignore-0.7.6-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:ced2a248352636a5c77504cb755dc02c2eef9a820a44d3f33061ce1bb8a7f2d2", size = 1169216, upload-time = "2025-11-05T20:41:23.73Z" },
+    { url = "https://files.pythonhosted.org/packages/80/e6/5b00bc2a6bc1701e6878fca798cf5d9125eb3113193e33078b6fc0d99123/rignore-0.7.6-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:a04a3b73b75ddc12c9c9b21efcdaab33ca3832941d6f1d67bffd860941cd448a", size = 942942, upload-time = "2025-11-05T20:41:39.393Z" },
+    { url = "https://files.pythonhosted.org/packages/85/e5/7f99bd0cc9818a91d0e8b9acc65b792e35750e3bdccd15a7ee75e64efca4/rignore-0.7.6-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d24321efac92140b7ec910ac7c53ab0f0c86a41133d2bb4b0e6a7c94967f44dd", size = 959787, upload-time = "2025-11-05T20:42:09.765Z" },
+    { url = "https://files.pythonhosted.org/packages/55/54/2ffea79a7c1eabcede1926347ebc2a81bc6b81f447d05b52af9af14948b9/rignore-0.7.6-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:73c7aa109d41e593785c55fdaa89ad80b10330affa9f9d3e3a51fa695f739b20", size = 984245, upload-time = "2025-11-05T20:41:54.062Z" },
+    { url = "https://files.pythonhosted.org/packages/41/f7/e80f55dfe0f35787fa482aa18689b9c8251e045076c35477deb0007b3277/rignore-0.7.6-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:1734dc49d1e9501b07852ef44421f84d9f378da9fbeda729e77db71f49cac28b", size = 1078647, upload-time = "2025-11-05T21:40:13.463Z" },
+    { url = "https://files.pythonhosted.org/packages/d4/cf/2c64f0b6725149f7c6e7e5a909d14354889b4beaadddaa5fff023ec71084/rignore-0.7.6-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:5719ea14ea2b652c0c0894be5dfde954e1853a80dea27dd2fbaa749618d837f5", size = 1139186, upload-time = "2025-11-05T21:40:31.27Z" },
+    { url = "https://files.pythonhosted.org/packages/75/95/a86c84909ccc24af0d094b50d54697951e576c252a4d9f21b47b52af9598/rignore-0.7.6-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:8e23424fc7ce35726854f639cb7968151a792c0c3d9d082f7f67e0c362cfecca", size = 1117604, upload-time = "2025-11-05T21:40:48.07Z" },
+    { url = "https://files.pythonhosted.org/packages/7f/5e/13b249613fd5d18d58662490ab910a9f0be758981d1797789913adb4e918/rignore-0.7.6-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:3efdcf1dd84d45f3e2bd2f93303d9be103888f56dfa7c3349b5bf4f0657ec696", size = 1127725, upload-time = "2025-11-05T21:41:05.804Z" },
+    { url = "https://files.pythonhosted.org/packages/c7/28/fa5dcd1e2e16982c359128664e3785f202d3eca9b22dd0b2f91c4b3d242f/rignore-0.7.6-cp312-cp312-win32.whl", hash = "sha256:ccca9d1a8b5234c76b71546fc3c134533b013f40495f394a65614a81f7387046", size = 646145, upload-time = "2025-11-05T21:41:51.096Z" },
+    { url = "https://files.pythonhosted.org/packages/26/87/69387fb5dd81a0f771936381431780b8cf66fcd2cfe9495e1aaf41548931/rignore-0.7.6-cp312-cp312-win_amd64.whl", hash = "sha256:c96a285e4a8bfec0652e0bfcf42b1aabcdda1e7625f5006d188e3b1c87fdb543", size = 726090, upload-time = "2025-11-05T21:41:36.485Z" },
+    { url = "https://files.pythonhosted.org/packages/24/5f/e8418108dcda8087fb198a6f81caadbcda9fd115d61154bf0df4d6d3619b/rignore-0.7.6-cp312-cp312-win_arm64.whl", hash = "sha256:a64a750e7a8277a323f01ca50b7784a764845f6cce2fe38831cb93f0508d0051", size = 656317, upload-time = "2025-11-05T21:41:25.305Z" },
+]
+
 [[package]]
 name = "rpds-py"
 version = "2026.5.1"
@@ -1659,6 +2850,30 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/cb/53/6c3419d85eb2ec5938a37627c585b42d76a63bb731d6e42ed4b079ebf486/rpds_py-2026.5.1-cp312-cp312-win_arm64.whl", hash = "sha256:1841d067089e117142d79b98aa0df2f08b52f2ecc1819dd2700636c0db74a473", size = 223967, upload-time = "2026-05-28T11:59:32.318Z" },
 ]
 
+[[package]]
+name = "safetensors"
+version = "0.8.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/45/06/f955dbbb1859e3bd23c8ac6141af5106e7ad5fedec4a3a6e3d60f94b7001/safetensors-0.8.0.tar.gz", hash = "sha256:fabaf3e0f18a6618d9b36560682562157f77c2b71fcffc7b432be2baed9d753d", size = 325846, upload-time = "2026-06-09T07:52:25.563Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/39/a0/f718cda65b05407d228f97602cf60dca269c979867aa5beb25410de26cd3/safetensors-0.8.0-cp310-abi3-macosx_10_12_x86_64.whl", hash = "sha256:c554f85858e05226d3c2828e32395e677434685d6d94594a41643361c5e837f0", size = 473568, upload-time = "2026-06-09T07:52:18.829Z" },
+    { url = "https://files.pythonhosted.org/packages/f5/b1/fa7c600e7dceae12e9606c7578cbc9ff1e1ed55844883ee5c92205e86226/safetensors-0.8.0-cp310-abi3-macosx_11_0_arm64.whl", hash = "sha256:c80201d22cbf405b80647a60ada77bba06c8fba2da2743ba1e89cdcc39a81f25", size = 484562, upload-time = "2026-06-09T07:52:17.518Z" },
+    { url = "https://files.pythonhosted.org/packages/09/7d/65a7de0af421317bb36a067241e4235fff194eed60b961ed6d3f59a3fc60/safetensors-0.8.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:7a46e5ff292c356d6991e60942ba7f79817682d3a2cef0702136448cb9c4d235", size = 502844, upload-time = "2026-06-09T07:52:07.624Z" },
+    { url = "https://files.pythonhosted.org/packages/91/4f/3175c9d75634e0e0dda0082794193521035edd7c70a6f212bf33ca06ddf4/safetensors-0.8.0-cp310-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:4124502b78f03534117c848f87a39b8f31e577b15eff423bf8bfb95f2a8c30d0", size = 511823, upload-time = "2026-06-09T07:52:09.565Z" },
+    { url = "https://files.pythonhosted.org/packages/20/87/846c289e7aa2299eff406335717cf43ce8777194ece8aad75772e0411615/safetensors-0.8.0-cp310-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:7bc0a787ba8a35be368ee3574edfa2b1ad389eebd0a72e482ae275490e3f6c98", size = 633461, upload-time = "2026-06-09T07:52:11.128Z" },
+    { url = "https://files.pythonhosted.org/packages/76/22/8d64d9df2c45d5ded401df889d0ad90882804ca172d79ec4f0df8f727fe0/safetensors-0.8.0-cp310-abi3-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:040070828e36dc8e122178bbbd5830ff9e97920affb84cbe0f46442497bed358", size = 545148, upload-time = "2026-06-09T07:52:13.603Z" },
+    { url = "https://files.pythonhosted.org/packages/28/50/f203ff3a3ddfe19308efc83c5a3a29ed02bf786732ec35e68bf9162f3365/safetensors-0.8.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:fd6f3f93c9a0a7cc2788ee63fb763353d4bd2e89b0751bc78fcf7dda00bea774", size = 516040, upload-time = "2026-06-09T07:52:16.29Z" },
+    { url = "https://files.pythonhosted.org/packages/46/fb/cdaed17ceb2948784fd9c36b6fd3e951b608547cea81a48e8ee6f8cfdfcb/safetensors-0.8.0-cp310-abi3-manylinux_2_31_riscv64.whl", hash = "sha256:fcdd41ec4628fee5799f807c73c353629130fbd942aa23d83c623dd6c9d52d78", size = 513832, upload-time = "2026-06-09T07:52:12.37Z" },
+    { url = "https://files.pythonhosted.org/packages/0d/49/1e15de264dcc3b77943d2d0c56a95809956883b1c2d6d585c792523f180b/safetensors-0.8.0-cp310-abi3-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:8e9f537aa183a38ace122d27303dcd986b26bd2a7591f9181d7f0c396f4677ca", size = 559930, upload-time = "2026-06-09T07:52:14.743Z" },
+    { url = "https://files.pythonhosted.org/packages/2a/43/bf38443278eab4b1be1fce2931e2b012ad9cb7df52ada751d0aab8f7659a/safetensors-0.8.0-cp310-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:87eec7ffed2b809f05a398a8becb7d013f19f7837cd15d9748580d6cf30dbaf4", size = 678670, upload-time = "2026-06-09T07:52:20.032Z" },
+    { url = "https://files.pythonhosted.org/packages/72/e3/68cd3fa5b48488e84add63e04cb12f3bc28ae4638c06d4508c6e88823d0e/safetensors-0.8.0-cp310-abi3-musllinux_1_2_armv7l.whl", hash = "sha256:4a95ae2b05d7726d751da4ebf626a2ca782b706e101bd894c95bc2450b1cffcc", size = 786679, upload-time = "2026-06-09T07:52:21.322Z" },
+    { url = "https://files.pythonhosted.org/packages/29/4b/1c19c509d56e01f4fbb3d0a2e597450f6cc04d1d56cf52defb0a62dfd715/safetensors-0.8.0-cp310-abi3-musllinux_1_2_i686.whl", hash = "sha256:3ae091f16662658bdc019a4ff6cb4c085bb7d725eb5978b183ffd265863b6d2d", size = 765683, upload-time = "2026-06-09T07:52:22.594Z" },
+    { url = "https://files.pythonhosted.org/packages/27/43/41c1621732edd934d868a00d1b891584c892a7b62a9aab82ea5a0a5623ee/safetensors-0.8.0-cp310-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:8e080062fcde23be189565e1c3305d16751a218ecf9412c8601e64204eb6f846", size = 722361, upload-time = "2026-06-09T07:52:23.924Z" },
+    { url = "https://files.pythonhosted.org/packages/8e/3f/73ccf82579412b4a71c4ca673f10b5f1f888d7cf5af7fe24f27d30307be4/safetensors-0.8.0-cp310-abi3-win32.whl", hash = "sha256:2ddf52eac562eda224f99acfa7889d02968c1fd59a5b011ae7d8137c37e9c02d", size = 342401, upload-time = "2026-06-09T07:52:28.895Z" },
+    { url = "https://files.pythonhosted.org/packages/1b/6d/3fba214c1e5e0f69991677ec3bc17023f0421776975e1de0c682dca475e2/safetensors-0.8.0-cp310-abi3-win_amd64.whl", hash = "sha256:096ec1a98435df7beb08853bb5aa9081a84f23d0adc67ed1a0a10550f608373f", size = 355540, upload-time = "2026-06-09T07:52:27.832Z" },
+    { url = "https://files.pythonhosted.org/packages/8d/fc/7eedc3510d97878876e32774eebbeb61c43f148a96e915c84229a3e967aa/safetensors-0.8.0-cp310-abi3-win_arm64.whl", hash = "sha256:f7838e5135a406ad3e02efdcb8cf2e5397d368b0154537c4fec682dbc544d452", size = 340500, upload-time = "2026-06-09T07:52:26.745Z" },
+]
+
 [[package]]
 name = "secretstorage"
 version = "3.5.0"
@@ -1689,6 +2904,62 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/1f/bc/885047e975e996cb317db31c4551caa915aafc6befea990f082c7233adc2/selenium-4.44.0-py3-none-any.whl", hash = "sha256:d01ea3e5ecad8149460a765f7cf5177194c21dcc0173093fc05427c289b1bf24", size = 9654291, upload-time = "2026-05-12T22:48:16.836Z" },
 ]
 
+[[package]]
+name = "sentencepiece"
+version = "0.2.1"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/15/15/2e7a025fc62d764b151ae6d0f2a92f8081755ebe8d4a64099accc6f77ba6/sentencepiece-0.2.1.tar.gz", hash = "sha256:8138cec27c2f2282f4a34d9a016e3374cd40e5c6e9cb335063db66a0a3b71fad", size = 3228515, upload-time = "2025-08-12T07:00:51.718Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/4a/be/32ce495aa1d0e0c323dcb1ba87096037358edee539cac5baf8755a6bd396/sentencepiece-0.2.1-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:57cae326c8727de58c85977b175af132a7138d84c764635d7e71bbee7e774133", size = 1943152, upload-time = "2025-08-12T06:59:40.048Z" },
+    { url = "https://files.pythonhosted.org/packages/88/7e/ff23008899a58678e98c6ff592bf4d368eee5a71af96d0df6b38a039dd4f/sentencepiece-0.2.1-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:56dd39a3c4d6493db3cdca7e8cc68c6b633f0d4195495cbadfcf5af8a22d05a6", size = 1325651, upload-time = "2025-08-12T06:59:41.536Z" },
+    { url = "https://files.pythonhosted.org/packages/19/84/42eb3ce4796777a1b5d3699dfd4dca85113e68b637f194a6c8d786f16a04/sentencepiece-0.2.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:d9381351182ff9888cc80e41c632e7e274b106f450de33d67a9e8f6043da6f76", size = 1253645, upload-time = "2025-08-12T06:59:42.903Z" },
+    { url = "https://files.pythonhosted.org/packages/89/fa/d3d5ebcba3cb9e6d3775a096251860c41a6bc53a1b9461151df83fe93255/sentencepiece-0.2.1-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:99f955df238021bf11f0fc37cdb54fd5e5b5f7fd30ecc3d93fb48b6815437167", size = 1316273, upload-time = "2025-08-12T06:59:44.476Z" },
+    { url = "https://files.pythonhosted.org/packages/04/88/14f2f4a2b922d8b39be45bf63d79e6cd3a9b2f248b2fcb98a69b12af12f5/sentencepiece-0.2.1-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:0cdfecef430d985f1c2bcbfff3defd1d95dae876fbd0173376012d2d7d24044b", size = 1387881, upload-time = "2025-08-12T06:59:46.09Z" },
+    { url = "https://files.pythonhosted.org/packages/fd/b8/903e5ccb77b4ef140605d5d71b4f9e0ad95d456d6184688073ed11712809/sentencepiece-0.2.1-cp312-cp312-win32.whl", hash = "sha256:a483fd29a34c3e34c39ac5556b0a90942bec253d260235729e50976f5dba1068", size = 999540, upload-time = "2025-08-12T06:59:48.023Z" },
+    { url = "https://files.pythonhosted.org/packages/2d/81/92df5673c067148c2545b1bfe49adfd775bcc3a169a047f5a0e6575ddaca/sentencepiece-0.2.1-cp312-cp312-win_amd64.whl", hash = "sha256:4cdc7c36234fda305e85c32949c5211faaf8dd886096c7cea289ddc12a2d02de", size = 1054671, upload-time = "2025-08-12T06:59:49.895Z" },
+    { url = "https://files.pythonhosted.org/packages/fe/02/c5e3bc518655d714622bec87d83db9cdba1cd0619a4a04e2109751c4f47f/sentencepiece-0.2.1-cp312-cp312-win_arm64.whl", hash = "sha256:daeb5e9e9fcad012324807856113708614d534f596d5008638eb9b40112cd9e4", size = 1033923, upload-time = "2025-08-12T06:59:51.952Z" },
+]
+
+[[package]]
+name = "sentry-sdk"
+version = "2.63.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "certifi" },
+    { name = "urllib3" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/ba/c8/b3c970a5b186722d276cd40a05b3254e03bccc0208560aff20f612e018e8/sentry_sdk-2.63.0.tar.gz", hash = "sha256:2a1502bf864769275dbc8c2c9fc7a0f7f5e18358180b615d262d13a31ffba216", size = 912449, upload-time = "2026-06-16T12:45:57.553Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/7b/57/cb205f7d93373120f666b9c5736dc0815524d96a9b278e7a728f018dc22a/sentry_sdk-2.63.0-py3-none-any.whl", hash = "sha256:3a9b5ddd403f79eb73bd670f75f04485819db53d28f76ced7bc09041cb0dfd6a", size = 495950, upload-time = "2026-06-16T12:45:55.819Z" },
+]
+
+[[package]]
+name = "setproctitle"
+version = "1.3.7"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/8d/48/49393a96a2eef1ab418b17475fb92b8fcfad83d099e678751b05472e69de/setproctitle-1.3.7.tar.gz", hash = "sha256:bc2bc917691c1537d5b9bca1468437176809c7e11e5694ca79a9ca12345dcb9e", size = 27002, upload-time = "2025-09-05T12:51:25.278Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/fb/f0/2dc88e842077719d7384d86cc47403e5102810492b33680e7dadcee64cd8/setproctitle-1.3.7-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:2dc99aec591ab6126e636b11035a70991bc1ab7a261da428491a40b84376654e", size = 18049, upload-time = "2025-09-05T12:49:36.241Z" },
+    { url = "https://files.pythonhosted.org/packages/f0/b4/50940504466689cda65680c9e9a1e518e5750c10490639fa687489ac7013/setproctitle-1.3.7-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:cdd8aa571b7aa39840fdbea620e308a19691ff595c3a10231e9ee830339dd798", size = 13079, upload-time = "2025-09-05T12:49:38.088Z" },
+    { url = "https://files.pythonhosted.org/packages/d0/99/71630546b9395b095f4082be41165d1078204d1696c2d9baade3de3202d0/setproctitle-1.3.7-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:2906b6c7959cdb75f46159bf0acd8cc9906cf1361c9e1ded0d065fe8f9039629", size = 32932, upload-time = "2025-09-05T12:49:39.271Z" },
+    { url = "https://files.pythonhosted.org/packages/50/22/cee06af4ffcfb0e8aba047bd44f5262e644199ae7527ae2c1f672b86495c/setproctitle-1.3.7-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:6915964a6dda07920a1159321dcd6d94fc7fc526f815ca08a8063aeca3c204f1", size = 33736, upload-time = "2025-09-05T12:49:40.565Z" },
+    { url = "https://files.pythonhosted.org/packages/5c/00/a5949a8bb06ef5e7df214fc393bb2fb6aedf0479b17214e57750dfdd0f24/setproctitle-1.3.7-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:cff72899861c765bd4021d1ff1c68d60edc129711a2fdba77f9cb69ef726a8b6", size = 35605, upload-time = "2025-09-05T12:49:42.362Z" },
+    { url = "https://files.pythonhosted.org/packages/b0/3a/50caca532a9343828e3bf5778c7a84d6c737a249b1796d50dd680290594d/setproctitle-1.3.7-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:b7cb05bd446687ff816a3aaaf831047fc4c364feff7ada94a66024f1367b448c", size = 33143, upload-time = "2025-09-05T12:49:43.515Z" },
+    { url = "https://files.pythonhosted.org/packages/ca/14/b843a251296ce55e2e17c017d6b9f11ce0d3d070e9265de4ecad948b913d/setproctitle-1.3.7-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:3a57b9a00de8cae7e2a1f7b9f0c2ac7b69372159e16a7708aa2f38f9e5cc987a", size = 34434, upload-time = "2025-09-05T12:49:45.31Z" },
+    { url = "https://files.pythonhosted.org/packages/c8/b7/06145c238c0a6d2c4bc881f8be230bb9f36d2bf51aff7bddcb796d5eed67/setproctitle-1.3.7-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:d8828b356114f6b308b04afe398ed93803d7fca4a955dd3abe84430e28d33739", size = 32795, upload-time = "2025-09-05T12:49:46.419Z" },
+    { url = "https://files.pythonhosted.org/packages/ef/dc/ef76a81fac9bf27b84ed23df19c1f67391a753eed6e3c2254ebcb5133f56/setproctitle-1.3.7-cp312-cp312-win32.whl", hash = "sha256:b0304f905efc845829ac2bc791ddebb976db2885f6171f4a3de678d7ee3f7c9f", size = 12552, upload-time = "2025-09-05T12:49:47.635Z" },
+    { url = "https://files.pythonhosted.org/packages/e2/5b/a9fe517912cd6e28cf43a212b80cb679ff179a91b623138a99796d7d18a0/setproctitle-1.3.7-cp312-cp312-win_amd64.whl", hash = "sha256:9888ceb4faea3116cf02a920ff00bfbc8cc899743e4b4ac914b03625bdc3c300", size = 13247, upload-time = "2025-09-05T12:49:49.16Z" },
+]
+
+[[package]]
+name = "setuptools"
+version = "80.10.2"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/76/95/faf61eb8363f26aa7e1d762267a8d602a1b26d4f3a1e758e92cb3cb8b054/setuptools-80.10.2.tar.gz", hash = "sha256:8b0e9d10c784bf7d262c4e5ec5d4ec94127ce206e8738f29a437945fbc219b70", size = 1200343, upload-time = "2026-01-25T22:38:17.252Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/94/b8/f1f62a5e3c0ad2ff1d189590bfa4c46b4f3b6e49cef6f26c6ee4e575394d/setuptools-80.10.2-py3-none-any.whl", hash = "sha256:95b30ddfb717250edb492926c92b5221f7ef3fbcc2b07579bcd4a27da21d0173", size = 1064234, upload-time = "2026-01-25T22:38:15.216Z" },
+]
+
 [[package]]
 name = "shellingham"
 version = "1.5.4"
@@ -1760,6 +3031,36 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/1c/54/196d0c1db10af76baa4f64894448505d60d3cdf70ef92cbb35f46a4e4c71/starlette-1.2.1-py3-none-any.whl", hash = "sha256:4de0082d08c8f6764a85a54cf1120d6939507a19905c7768acad2a9f875d2b89", size = 73350, upload-time = "2026-05-31T01:07:50.09Z" },
 ]
 
+[[package]]
+name = "supervisor"
+version = "4.3.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/a9/b5/37e7a3706de436a8a2d75334711dad1afb4ddffab09f25e31d89e467542f/supervisor-4.3.0.tar.gz", hash = "sha256:4a2bf149adf42997e1bb44b70c43b613275ec9852c3edacca86a9166b27e945e", size = 468912, upload-time = "2025-08-23T18:25:02.418Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/0e/65/5e726c372da8a5e35022a94388b12252710aad0c2351699c3d76ae8dba78/supervisor-4.3.0-py2.py3-none-any.whl", hash = "sha256:0bcb763fddafba410f35cbde226aa7f8514b9fb82eb05a0c85f6588d1c13f8db", size = 320736, upload-time = "2025-08-23T18:25:00.767Z" },
+]
+
+[[package]]
+name = "sympy"
+version = "1.14.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "mpmath" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/83/d3/803453b36afefb7c2bb238361cd4ae6125a569b4db67cd9e79846ba2d68c/sympy-1.14.0.tar.gz", hash = "sha256:d3d3fe8df1e5a0b42f0e7bdf50541697dbe7d23746e894990c030e2b05e72517", size = 7793921, upload-time = "2025-04-27T18:05:01.611Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/a2/09/77d55d46fd61b4a135c444fc97158ef34a095e5681d0a6c10b75bf356191/sympy-1.14.0-py3-none-any.whl", hash = "sha256:e091cc3e99d2141a0ba2847328f5479b05d94a6635cb96148ccb3f34671bd8f5", size = 6299353, upload-time = "2025-04-27T18:04:59.103Z" },
+]
+
+[[package]]
+name = "tabulate"
+version = "0.10.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/46/58/8c37dea7bbf769b20d58e7ace7e5edfe65b849442b00ffcdd56be88697c6/tabulate-0.10.0.tar.gz", hash = "sha256:e2cfde8f79420f6deeffdeda9aaec3b6bc5abce947655d17ac662b126e48a60d", size = 91754, upload-time = "2026-03-04T18:55:34.402Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/99/55/db07de81b5c630da5cbf5c7df646580ca26dfaefa593667fc6f2fe016d2e/tabulate-0.10.0-py3-none-any.whl", hash = "sha256:f0b0622e567335c8fabaaa659f1b33bcb6ddfe2e496071b743aa113f8774f2d3", size = 39814, upload-time = "2026-03-04T18:55:31.284Z" },
+]
+
 [[package]]
 name = "tenacity"
 version = "9.1.4"
@@ -1799,29 +3100,112 @@ wheels = [
 
 [[package]]
 name = "tokenizers"
-version = "0.23.1"
+version = "0.22.2"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
     { name = "huggingface-hub" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/c1/60/21f715d9faba5f5407ff759472ade058ec4a507ad62bcea47cb847239a73/tokenizers-0.23.1.tar.gz", hash = "sha256:1feeeadf865a7915adc25445dea30e9933e593c31bb96c277cee36de227c8bfa", size = 365748, upload-time = "2026-04-27T14:43:25.606Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/73/6f/f80cfef4a312e1fb34baf7d85c72d4411afde10978d4657f8cdd811d3ccc/tokenizers-0.22.2.tar.gz", hash = "sha256:473b83b915e547aa366d1eee11806deaf419e17be16310ac0a14077f1e28f917", size = 372115, upload-time = "2026-01-05T10:45:15.988Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/92/97/5dbfabf04c7e348e655e907ed27913e03db0923abb5dfdd120d7b25630e1/tokenizers-0.22.2-cp39-abi3-macosx_10_12_x86_64.whl", hash = "sha256:544dd704ae7238755d790de45ba8da072e9af3eea688f698b137915ae959281c", size = 3100275, upload-time = "2026-01-05T10:41:02.158Z" },
+    { url = "https://files.pythonhosted.org/packages/2e/47/174dca0502ef88b28f1c9e06b73ce33500eedfac7a7692108aec220464e7/tokenizers-0.22.2-cp39-abi3-macosx_11_0_arm64.whl", hash = "sha256:1e418a55456beedca4621dbab65a318981467a2b188e982a23e117f115ce5001", size = 2981472, upload-time = "2026-01-05T10:41:00.276Z" },
+    { url = "https://files.pythonhosted.org/packages/d6/84/7990e799f1309a8b87af6b948f31edaa12a3ed22d11b352eaf4f4b2e5753/tokenizers-0.22.2-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:2249487018adec45d6e3554c71d46eb39fa8ea67156c640f7513eb26f318cec7", size = 3290736, upload-time = "2026-01-05T10:40:32.165Z" },
+    { url = "https://files.pythonhosted.org/packages/78/59/09d0d9ba94dcd5f4f1368d4858d24546b4bdc0231c2354aa31d6199f0399/tokenizers-0.22.2-cp39-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:25b85325d0815e86e0bac263506dd114578953b7b53d7de09a6485e4a160a7dd", size = 3168835, upload-time = "2026-01-05T10:40:38.847Z" },
+    { url = "https://files.pythonhosted.org/packages/47/50/b3ebb4243e7160bda8d34b731e54dd8ab8b133e50775872e7a434e524c28/tokenizers-0.22.2-cp39-abi3-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:bfb88f22a209ff7b40a576d5324bf8286b519d7358663db21d6246fb17eea2d5", size = 3521673, upload-time = "2026-01-05T10:40:56.614Z" },
+    { url = "https://files.pythonhosted.org/packages/e0/fa/89f4cb9e08df770b57adb96f8cbb7e22695a4cb6c2bd5f0c4f0ebcf33b66/tokenizers-0.22.2-cp39-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:1c774b1276f71e1ef716e5486f21e76333464f47bece56bbd554485982a9e03e", size = 3724818, upload-time = "2026-01-05T10:40:44.507Z" },
+    { url = "https://files.pythonhosted.org/packages/64/04/ca2363f0bfbe3b3d36e95bf67e56a4c88c8e3362b658e616d1ac185d47f2/tokenizers-0.22.2-cp39-abi3-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:df6c4265b289083bf710dff49bc51ef252f9d5be33a45ee2bed151114a56207b", size = 3379195, upload-time = "2026-01-05T10:40:51.139Z" },
+    { url = "https://files.pythonhosted.org/packages/2e/76/932be4b50ef6ccedf9d3c6639b056a967a86258c6d9200643f01269211ca/tokenizers-0.22.2-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:369cc9fc8cc10cb24143873a0d95438bb8ee257bb80c71989e3ee290e8d72c67", size = 3274982, upload-time = "2026-01-05T10:40:58.331Z" },
+    { url = "https://files.pythonhosted.org/packages/1d/28/5f9f5a4cc211b69e89420980e483831bcc29dade307955cc9dc858a40f01/tokenizers-0.22.2-cp39-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:29c30b83d8dcd061078b05ae0cb94d3c710555fbb44861139f9f83dcca3dc3e4", size = 9478245, upload-time = "2026-01-05T10:41:04.053Z" },
+    { url = "https://files.pythonhosted.org/packages/6c/fb/66e2da4704d6aadebf8cb39f1d6d1957df667ab24cff2326b77cda0dcb85/tokenizers-0.22.2-cp39-abi3-musllinux_1_2_armv7l.whl", hash = "sha256:37ae80a28c1d3265bb1f22464c856bd23c02a05bb211e56d0c5301a435be6c1a", size = 9560069, upload-time = "2026-01-05T10:45:10.673Z" },
+    { url = "https://files.pythonhosted.org/packages/16/04/fed398b05caa87ce9b1a1bb5166645e38196081b225059a6edaff6440fac/tokenizers-0.22.2-cp39-abi3-musllinux_1_2_i686.whl", hash = "sha256:791135ee325f2336f498590eb2f11dc5c295232f288e75c99a36c5dbce63088a", size = 9899263, upload-time = "2026-01-05T10:45:12.559Z" },
+    { url = "https://files.pythonhosted.org/packages/05/a1/d62dfe7376beaaf1394917e0f8e93ee5f67fea8fcf4107501db35996586b/tokenizers-0.22.2-cp39-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:38337540fbbddff8e999d59970f3c6f35a82de10053206a7562f1ea02d046fa5", size = 10033429, upload-time = "2026-01-05T10:45:14.333Z" },
+    { url = "https://files.pythonhosted.org/packages/fd/18/a545c4ea42af3df6effd7d13d250ba77a0a86fb20393143bbb9a92e434d4/tokenizers-0.22.2-cp39-abi3-win32.whl", hash = "sha256:a6bf3f88c554a2b653af81f3204491c818ae2ac6fbc09e76ef4773351292bc92", size = 2502363, upload-time = "2026-01-05T10:45:20.593Z" },
+    { url = "https://files.pythonhosted.org/packages/65/71/0670843133a43d43070abeb1949abfdef12a86d490bea9cd9e18e37c5ff7/tokenizers-0.22.2-cp39-abi3-win_amd64.whl", hash = "sha256:c9ea31edff2968b44a88f97d784c2f16dc0729b8b143ed004699ebca91f05c48", size = 2747786, upload-time = "2026-01-05T10:45:18.411Z" },
+    { url = "https://files.pythonhosted.org/packages/72/f4/0de46cfa12cdcbcd464cc59fde36912af405696f687e53a091fb432f694c/tokenizers-0.22.2-cp39-abi3-win_arm64.whl", hash = "sha256:9ce725d22864a1e965217204946f830c37876eee3b2ba6fc6255e8e903d5fcbc", size = 2612133, upload-time = "2026-01-05T10:45:17.232Z" },
+]
+
+[[package]]
+name = "torch"
+version = "2.10.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "cuda-bindings", version = "12.9.4", source = { registry = "https://pypi.org/simple" }, marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+    { name = "filelock" },
+    { name = "fsspec" },
+    { name = "jinja2" },
+    { name = "networkx" },
+    { name = "nvidia-cublas-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+    { name = "nvidia-cuda-cupti-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+    { name = "nvidia-cuda-nvrtc-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+    { name = "nvidia-cuda-runtime-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+    { name = "nvidia-cudnn-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+    { name = "nvidia-cufft-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+    { name = "nvidia-cufile-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+    { name = "nvidia-curand-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+    { name = "nvidia-cusolver-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+    { name = "nvidia-cusparse-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+    { name = "nvidia-cusparselt-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+    { name = "nvidia-nccl-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+    { name = "nvidia-nvjitlink-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+    { name = "nvidia-nvshmem-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+    { name = "nvidia-nvtx-cu12", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+    { name = "setuptools" },
+    { name = "sympy" },
+    { name = "triton", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+    { name = "typing-extensions" },
+]
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/d3/54/a2ba279afcca44bbd320d4e73675b282fcee3d81400ea1b53934efca6462/torch-2.10.0-2-cp312-none-macosx_11_0_arm64.whl", hash = "sha256:13ec4add8c3faaed8d13e0574f5cd4a323c11655546f91fbe6afa77b57423574", size = 79498202, upload-time = "2026-02-10T21:44:52.603Z" },
+    { url = "https://files.pythonhosted.org/packages/b3/7a/abada41517ce0011775f0f4eacc79659bc9bc6c361e6bfe6f7052a6b9363/torch-2.10.0-3-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:98c01b8bb5e3240426dcde1446eed6f40c778091c8544767ef1168fc663a05a6", size = 915622781, upload-time = "2026-03-11T14:17:11.354Z" },
+    { url = "https://files.pythonhosted.org/packages/cc/af/758e242e9102e9988969b5e621d41f36b8f258bb4a099109b7a4b4b50ea4/torch-2.10.0-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:5fd4117d89ffd47e3dcc71e71a22efac24828ad781c7e46aaaf56bf7f2796acf", size = 145996088, upload-time = "2026-01-21T16:24:44.171Z" },
+    { url = "https://files.pythonhosted.org/packages/23/8e/3c74db5e53bff7ed9e34c8123e6a8bfef718b2450c35eefab85bb4a7e270/torch-2.10.0-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:787124e7db3b379d4f1ed54dd12ae7c741c16a4d29b49c0226a89bea50923ffb", size = 915711952, upload-time = "2026-01-21T16:23:53.503Z" },
+    { url = "https://files.pythonhosted.org/packages/6e/01/624c4324ca01f66ae4c7cd1b74eb16fb52596dce66dbe51eff95ef9e7a4c/torch-2.10.0-cp312-cp312-win_amd64.whl", hash = "sha256:2c66c61f44c5f903046cc696d088e21062644cbe541c7f1c4eaae88b2ad23547", size = 113757972, upload-time = "2026-01-21T16:24:39.516Z" },
+    { url = "https://files.pythonhosted.org/packages/c9/5c/dee910b87c4d5c0fcb41b50839ae04df87c1cfc663cf1b5fca7ea565eeaa/torch-2.10.0-cp312-none-macosx_11_0_arm64.whl", hash = "sha256:6d3707a61863d1c4d6ebba7be4ca320f42b869ee657e9b2c21c736bf17000294", size = 79498198, upload-time = "2026-01-21T16:24:34.704Z" },
+]
+
+[[package]]
+name = "torch-c-dlpack-ext"
+version = "0.1.5"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "torch" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/37/de/921b6491efce5c389a5ef9bbed3d2d6660005840dae488124173180859ab/torch_c_dlpack_ext-0.1.5.tar.gz", hash = "sha256:d06f0357d575d22a168cc77acb9020fc4bae30968ceb6718a055dcbe92bacabe", size = 12913, upload-time = "2026-01-12T11:25:08.484Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/b1/67/10d236698525d7b7db4d74ec0a4b01f5b2db33968995fdd9ac6b4635e327/torch_c_dlpack_ext-0.1.5-cp312-cp312-macosx_14_0_arm64.whl", hash = "sha256:c0f2bd51fcd99c0e5b50314e1985f2728c4941bfa821f065e6c30951d1f995ca", size = 5291237, upload-time = "2026-01-12T11:24:44.011Z" },
+    { url = "https://files.pythonhosted.org/packages/87/06/8d760997307a5c3be4384424667bf31aae0a42060838c532c7d846516175/torch_c_dlpack_ext-0.1.5-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:3562ee411258676f9c38b8ad39306d1c8d027b6a86f6a87c920d2d009a9d1510", size = 443069, upload-time = "2026-01-12T11:24:45.451Z" },
+    { url = "https://files.pythonhosted.org/packages/e2/79/a914539b4785f3e44f891aa012a886edb8bc10fe081c440981c57543ce21/torch_c_dlpack_ext-0.1.5-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e6f9da4bb9af70e27facc777458be62e10dbbbddda7672d16138db0553c5a524", size = 897846, upload-time = "2026-01-12T11:24:48.168Z" },
+    { url = "https://files.pythonhosted.org/packages/3a/e6/7d7a97a3953208d6d6ce749180c34d1dab48464ded9a76cecabe9d021ce6/torch_c_dlpack_ext-0.1.5-cp312-cp312-win_amd64.whl", hash = "sha256:670fbbab70123cc228bed41693a3720757af57a0ad22669063c9db25321e8f55", size = 1482855, upload-time = "2026-01-12T11:24:49.581Z" },
+]
+
+[[package]]
+name = "torchaudio"
+version = "2.10.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "torch" },
+]
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/0f/36/28a6f3e857616cf7576bdbf8170e483b8c5d0a1f8d349ecb2b75921236aa/torchaudio-2.10.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:9d0fbdbfd2f621c51d28571050d6d0c7287791034e5c7303b31480af1258f33f", size = 737144, upload-time = "2026-01-21T16:28:44.189Z" },
+    { url = "https://files.pythonhosted.org/packages/ea/3f/df620439a76ece170472d41438d11a1545d5db5dc9f1eaeab8c6e055a328/torchaudio-2.10.0-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:42b148a0921a3721abd1f6ae098b1ec9f89703e555c4f7a0d44da87b8decbcb9", size = 391973, upload-time = "2026-01-21T16:28:39.732Z" },
+    { url = "https://files.pythonhosted.org/packages/98/25/e55a30d7138f8fe56ed006df25b0a3c27681f0ec7bc9989e1778e6d559c3/torchaudio-2.10.0-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:0e77b2956448d63790a99beed0b74ac8b8cd3a94dcdd9ad01974411078f46278", size = 1895234, upload-time = "2026-01-21T16:28:37.034Z" },
+    { url = "https://files.pythonhosted.org/packages/be/a0/da53c7d20fac15f66f8838653b91162de1bf21fb40fee88cf839e4ef5174/torchaudio-2.10.0-cp312-cp312-win_amd64.whl", hash = "sha256:7f76a01ecebf1869e1f2c50a261f1cf07e5fccb24402b4e9bbb82d6725b9c7dd", size = 475470, upload-time = "2026-01-21T16:28:40.615Z" },
+]
+
+[[package]]
+name = "torchvision"
+version = "0.25.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "numpy" },
+    { name = "pillow" },
+    { name = "torch" },
+]
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/87/39/b87a87d5bb9470610b80a2d31df42fcffeaf35118b8b97952b2aff598cc7/tokenizers-0.23.1-cp310-abi3-macosx_10_12_x86_64.whl", hash = "sha256:e03d6ffcbe0d56ee9c1ccd070e70a13fa750727c0277e138152acbc0252c2224", size = 3146732, upload-time = "2026-04-27T14:43:15.427Z" },
-    { url = "https://files.pythonhosted.org/packages/e2/6a/068ed9f6e444c9d7e9d55ce134181325700f3d7f30410721bdc8f848d727/tokenizers-0.23.1-cp310-abi3-macosx_11_0_arm64.whl", hash = "sha256:e0948bbb1ac1d7cdfc9fb6d62c596e3b7550036ad60ecd654a66ad273326324e", size = 3054954, upload-time = "2026-04-27T14:43:13.745Z" },
-    { url = "https://files.pythonhosted.org/packages/6c/36/e006edf031154cba92b8416057d92c3abe3635e4c4b0aa0b5b9bb39dde70/tokenizers-0.23.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:1bf13402aff9bc533c89cb849ec3b412dc3fbeacc9744840e423d7bf3f7dc0e3", size = 3374081, upload-time = "2026-04-27T14:43:01.241Z" },
-    { url = "https://files.pythonhosted.org/packages/a2/ef/7735d226f9c7f874a6bee5e3f27fb25ecabdf207d37b8cf45286d0795893/tokenizers-0.23.1-cp310-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:f836ca703b89ae07919a309f9651f7a88fd5a33d5f718ba5ad0870ec0256bad6", size = 3247641, upload-time = "2026-04-27T14:43:03.856Z" },
-    { url = "https://files.pythonhosted.org/packages/b9/d9/24827036f6e21297bfffda0768e58eb6096a4f411e932964a01707857931/tokenizers-0.23.1-cp310-abi3-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:ae848657742035523fdf261773630cb819a26995fcd3d9ecae0c1daf6e5a4959", size = 3585624, upload-time = "2026-04-27T14:43:10.664Z" },
-    { url = "https://files.pythonhosted.org/packages/0c/9a/22f3582b3a4f49358293a5206e25317621ee4526bfe9cdaa0f07a12e770e/tokenizers-0.23.1-cp310-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:53b09e85775d5187941e7bab30e941b4134ab4a7dd8c68e783d231fb7ca27c51", size = 3844062, upload-time = "2026-04-27T14:43:05.643Z" },
-    { url = "https://files.pythonhosted.org/packages/7e/65/b8f8814eef95800f20721384136d9a1d22241d50b2874357cb70542c392f/tokenizers-0.23.1-cp310-abi3-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:ea5a0ce170074329faaa8ea3f6400ecde604b6678192688533af80980daae71a", size = 3460098, upload-time = "2026-04-27T14:43:08.854Z" },
-    { url = "https://files.pythonhosted.org/packages/0d/d5/1353e5f677ec27c2494fb6a6725e82d56c985f53e90ec511369e7e4f02c6/tokenizers-0.23.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:5075b405006415ea148a992d093699c66eb01952bf59f4d5727089a98bda45a4", size = 3346235, upload-time = "2026-04-27T14:43:12.377Z" },
-    { url = "https://files.pythonhosted.org/packages/71/89/39b6b8fc073fb6d413d0147aa333dc7eff7be65639ac9d19930a0b21bf33/tokenizers-0.23.1-cp310-abi3-manylinux_2_31_riscv64.whl", hash = "sha256:56f3a77de629917652f876294dc9fe6bad4a0c43bc229dc72e59bb23a0f4729a", size = 3426398, upload-time = "2026-04-27T14:43:07.264Z" },
-    { url = "https://files.pythonhosted.org/packages/0f/80/127c854da64827e5b79264ce524993a90dddcb320e5cd42412c5c02f9e8a/tokenizers-0.23.1-cp310-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:9d10a6d957ef01896dc274e890eee27d41bd0e74ef31e60616f0fc311345184e", size = 9823279, upload-time = "2026-04-27T14:43:17.222Z" },
-    { url = "https://files.pythonhosted.org/packages/fe/ba/44c2502feb1a058f096ddfb4e0996ef3225a01a388e1a9b094e91689fe93/tokenizers-0.23.1-cp310-abi3-musllinux_1_2_armv7l.whl", hash = "sha256:1974288a609c343774f1b897c8b482c791ab17b75ab5c8c2b1737565c1d82288", size = 9644986, upload-time = "2026-04-27T14:43:19.45Z" },
-    { url = "https://files.pythonhosted.org/packages/9e/c1/464019a9fb059870bfe4eebb4ba12208f3042035e258bf5e782906bd3847/tokenizers-0.23.1-cp310-abi3-musllinux_1_2_i686.whl", hash = "sha256:120468fb4c24faf0543c835a4fabafa4deb3f20a035c9b6e83d0b553a97615d4", size = 9976181, upload-time = "2026-04-27T14:43:21.463Z" },
-    { url = "https://files.pythonhosted.org/packages/79/94/3ac1432bda31626071e9b6a12709b97ae05131c804b94c8f3ac622c5da32/tokenizers-0.23.1-cp310-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:e3d8f40ea6268047de7046906326abed5134f27d4e8447b23763afe5808c8a96", size = 10113853, upload-time = "2026-04-27T14:43:23.617Z" },
-    { url = "https://files.pythonhosted.org/packages/6a/dd/631b21433c771b1382535326f0eca80b9c9cee2e64961dd993bc9ac4669e/tokenizers-0.23.1-cp310-abi3-win32.whl", hash = "sha256:93120a930b919416da7cd10a2f606ac9919cc69cacae7980fa2140e277660948", size = 2536263, upload-time = "2026-04-27T14:43:29.888Z" },
-    { url = "https://files.pythonhosted.org/packages/97/c9/2553f72aaf65a2797d4229e37fa7fbe38ffbf3e32912d31bdd78b3323e59/tokenizers-0.23.1-cp310-abi3-win_amd64.whl", hash = "sha256:e7bfaf995c1bdbbd21d13539decb6650967013759318627d85daeb7881af16b7", size = 2798223, upload-time = "2026-04-27T14:43:28.51Z" },
-    { url = "https://files.pythonhosted.org/packages/cd/2b/2be299bab55fc595e3d38567edb1a87f86e594842968fa9515a07bdcf422/tokenizers-0.23.1-cp310-abi3-win_arm64.whl", hash = "sha256:a26197957d8e4425dfba746315f3c425ea00cfa8367c5fbc4ec73447893dcea9", size = 2664127, upload-time = "2026-04-27T14:43:26.949Z" },
+    { url = "https://files.pythonhosted.org/packages/56/3a/6ea0d73f49a9bef38a1b3a92e8dd455cea58470985d25635beab93841748/torchvision-0.25.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:c2abe430c90b1d5e552680037d68da4eb80a5852ebb1c811b2b89d299b10573b", size = 1874920, upload-time = "2026-01-21T16:27:45.348Z" },
+    { url = "https://files.pythonhosted.org/packages/51/f8/c0e1ef27c66e15406fece94930e7d6feee4cb6374bbc02d945a630d6426e/torchvision-0.25.0-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:b75deafa2dfea3e2c2a525559b04783515e3463f6e830cb71de0fb7ea36fe233", size = 2344556, upload-time = "2026-01-21T16:27:40.125Z" },
+    { url = "https://files.pythonhosted.org/packages/68/2f/f24b039169db474e8688f649377de082a965fbf85daf4e46c44412f1d15a/torchvision-0.25.0-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:f25aa9e380865b11ea6e9d99d84df86b9cc959f1a007cd966fc6f1ab2ed0e248", size = 8072351, upload-time = "2026-01-21T16:27:21.074Z" },
+    { url = "https://files.pythonhosted.org/packages/ad/16/8f650c2e288977cf0f8f85184b90ee56ed170a4919347fc74ee99286ed6f/torchvision-0.25.0-cp312-cp312-win_amd64.whl", hash = "sha256:f9c55ae8d673ab493325d1267cbd285bb94d56f99626c00ac4644de32a59ede3", size = 4303059, upload-time = "2026-01-21T16:27:11.08Z" },
 ]
 
 [[package]]
@@ -1854,6 +3238,27 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/8a/b6/097367f180b6383a3581ca1b86fcae284e52075fa941d1232df35293363c/trafilatura-2.0.0-py3-none-any.whl", hash = "sha256:77eb5d1e993747f6f20938e1de2d840020719735690c840b9a1024803a4cd51d", size = 132557, upload-time = "2024-12-03T15:23:21.41Z" },
 ]
 
+[[package]]
+name = "transformers"
+version = "4.57.6"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "filelock" },
+    { name = "huggingface-hub" },
+    { name = "numpy" },
+    { name = "packaging" },
+    { name = "pyyaml" },
+    { name = "regex" },
+    { name = "requests" },
+    { name = "safetensors" },
+    { name = "tokenizers" },
+    { name = "tqdm" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/c4/35/67252acc1b929dc88b6602e8c4a982e64f31e733b804c14bc24b47da35e6/transformers-4.57.6.tar.gz", hash = "sha256:55e44126ece9dc0a291521b7e5492b572e6ef2766338a610b9ab5afbb70689d3", size = 10134912, upload-time = "2026-01-16T10:38:39.284Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/03/b8/e484ef633af3887baeeb4b6ad12743363af7cce68ae51e938e00aaa0529d/transformers-4.57.6-py3-none-any.whl", hash = "sha256:4c9e9de11333ddfe5114bc872c9f370509198acf0b87a832a0ab9458e2bd0550", size = 11993498, upload-time = "2026-01-16T10:38:31.289Z" },
+]
+
 [[package]]
 name = "trio"
 version = "0.33.0"
@@ -1885,6 +3290,14 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/c7/19/eb640a397bba49ba49ef9dbe2e7e5c04202ba045b6ce2ec36e9cadc51e04/trio_websocket-0.12.2-py3-none-any.whl", hash = "sha256:df605665f1db533f4a386c94525870851096a223adcb97f72a07e8b4beba45b6", size = 21221, upload-time = "2025-02-25T05:16:57.545Z" },
 ]
 
+[[package]]
+name = "triton"
+version = "3.6.0"
+source = { registry = "https://pypi.org/simple" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/ab/a8/cdf8b3e4c98132f965f88c2313a4b493266832ad47fb52f23d14d4f86bb5/triton-3.6.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:74caf5e34b66d9f3a429af689c1c7128daba1d8208df60e81106b115c00d6fca", size = 188266850, upload-time = "2026-01-20T16:00:43.041Z" },
+]
+
 [[package]]
 name = "typer"
 version = "0.25.1"
@@ -1978,6 +3391,108 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/01/be/72532be3da7acc5fdfbccdb95215cd04f995a0886532a5b423f929cda4cc/uvicorn-0.48.0-py3-none-any.whl", hash = "sha256:48097851328b87ec36117d3d575234519eb58c2b22d79666e9bbc6c49a761dad", size = 71410, upload-time = "2026-05-24T12:08:40.258Z" },
 ]
 
+[package.optional-dependencies]
+standard = [
+    { name = "colorama", marker = "sys_platform == 'win32'" },
+    { name = "httptools" },
+    { name = "python-dotenv" },
+    { name = "pyyaml" },
+    { name = "uvloop", marker = "platform_python_implementation != 'PyPy' and sys_platform != 'cygwin' and sys_platform != 'win32'" },
+    { name = "watchfiles" },
+    { name = "websockets" },
+]
+
+[[package]]
+name = "uvloop"
+version = "0.22.1"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/06/f0/18d39dbd1971d6d62c4629cc7fa67f74821b0dc1f5a77af43719de7936a7/uvloop-0.22.1.tar.gz", hash = "sha256:6c84bae345b9147082b17371e3dd5d42775bddce91f885499017f4607fdaf39f", size = 2443250, upload-time = "2025-10-16T22:17:19.342Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/3d/ff/7f72e8170be527b4977b033239a83a68d5c881cc4775fca255c677f7ac5d/uvloop-0.22.1-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:fe94b4564e865d968414598eea1a6de60adba0c040ba4ed05ac1300de402cd42", size = 1359936, upload-time = "2025-10-16T22:16:29.436Z" },
+    { url = "https://files.pythonhosted.org/packages/c3/c6/e5d433f88fd54d81ef4be58b2b7b0cea13c442454a1db703a1eea0db1a59/uvloop-0.22.1-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:51eb9bd88391483410daad430813d982010f9c9c89512321f5b60e2cddbdddd6", size = 752769, upload-time = "2025-10-16T22:16:30.493Z" },
+    { url = "https://files.pythonhosted.org/packages/24/68/a6ac446820273e71aa762fa21cdcc09861edd3536ff47c5cd3b7afb10eeb/uvloop-0.22.1-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:700e674a166ca5778255e0e1dc4e9d79ab2acc57b9171b79e65feba7184b3370", size = 4317413, upload-time = "2025-10-16T22:16:31.644Z" },
+    { url = "https://files.pythonhosted.org/packages/5f/6f/e62b4dfc7ad6518e7eff2516f680d02a0f6eb62c0c212e152ca708a0085e/uvloop-0.22.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:7b5b1ac819a3f946d3b2ee07f09149578ae76066d70b44df3fa990add49a82e4", size = 4426307, upload-time = "2025-10-16T22:16:32.917Z" },
+    { url = "https://files.pythonhosted.org/packages/90/60/97362554ac21e20e81bcef1150cb2a7e4ffdaf8ea1e5b2e8bf7a053caa18/uvloop-0.22.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:e047cc068570bac9866237739607d1313b9253c3051ad84738cbb095be0537b2", size = 4131970, upload-time = "2025-10-16T22:16:34.015Z" },
+    { url = "https://files.pythonhosted.org/packages/99/39/6b3f7d234ba3964c428a6e40006340f53ba37993f46ed6e111c6e9141d18/uvloop-0.22.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:512fec6815e2dd45161054592441ef76c830eddaad55c8aa30952e6fe1ed07c0", size = 4296343, upload-time = "2025-10-16T22:16:35.149Z" },
+]
+
+[[package]]
+name = "vllm"
+version = "0.19.0"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "aiohttp" },
+    { name = "anthropic" },
+    { name = "blake3" },
+    { name = "cachetools" },
+    { name = "cbor2" },
+    { name = "cloudpickle" },
+    { name = "compressed-tensors" },
+    { name = "depyf" },
+    { name = "diskcache" },
+    { name = "einops" },
+    { name = "fastapi", extra = ["standard"] },
+    { name = "filelock" },
+    { name = "flashinfer-cubin" },
+    { name = "flashinfer-python" },
+    { name = "gguf" },
+    { name = "ijson" },
+    { name = "lark" },
+    { name = "llguidance", marker = "platform_machine == 'aarch64' or platform_machine == 'arm64' or platform_machine == 'ppc64le' or platform_machine == 's390x' or platform_machine == 'x86_64'" },
+    { name = "lm-format-enforcer" },
+    { name = "mcp" },
+    { name = "mistral-common", extra = ["image"] },
+    { name = "model-hosting-container-standards" },
+    { name = "msgspec" },
+    { name = "ninja" },
+    { name = "numba" },
+    { name = "numpy" },
+    { name = "nvidia-cudnn-frontend" },
+    { name = "nvidia-cutlass-dsl" },
+    { name = "openai" },
+    { name = "openai-harmony" },
+    { name = "opencv-python-headless" },
+    { name = "opentelemetry-api" },
+    { name = "opentelemetry-exporter-otlp" },
+    { name = "opentelemetry-sdk" },
+    { name = "opentelemetry-semantic-conventions-ai" },
+    { name = "outlines-core" },
+    { name = "partial-json-parser" },
+    { name = "pillow" },
+    { name = "prometheus-client" },
+    { name = "prometheus-fastapi-instrumentator" },
+    { name = "protobuf" },
+    { name = "psutil" },
+    { name = "py-cpuinfo" },
+    { name = "pybase64" },
+    { name = "pydantic" },
+    { name = "python-json-logger" },
+    { name = "pyyaml" },
+    { name = "pyzmq" },
+    { name = "quack-kernels" },
+    { name = "regex" },
+    { name = "requests" },
+    { name = "sentencepiece" },
+    { name = "setproctitle" },
+    { name = "setuptools" },
+    { name = "six" },
+    { name = "tiktoken" },
+    { name = "tokenizers" },
+    { name = "torch" },
+    { name = "torchaudio" },
+    { name = "torchvision" },
+    { name = "tqdm" },
+    { name = "transformers" },
+    { name = "typing-extensions" },
+    { name = "watchfiles" },
+    { name = "xgrammar", marker = "platform_machine == 'aarch64' or platform_machine == 'arm64' or platform_machine == 'ppc64le' or platform_machine == 's390x' or platform_machine == 'x86_64'" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/03/14/c330a72309051f762b357a2e41d5015bedbb106ad1e16a231bdfda2e2163/vllm-0.19.0.tar.gz", hash = "sha256:81e59cf87175e7a62eb8d9acf5989484bbd17089d5eface353f89067bda282d9", size = 31071745, upload-time = "2026-04-03T04:04:52.833Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/c8/51/467e7a8cb4838022daa731b7f8b34c228691e36f938e1803c3a702c7bd69/vllm-0.19.0-cp38-abi3-manylinux_2_31_aarch64.whl", hash = "sha256:6ab90ccca5d7ca3bd2c8f90133f0fac85e8f4af582a1c67c6cc3f63c615521e3", size = 384650557, upload-time = "2026-04-03T04:05:52.513Z" },
+    { url = "https://files.pythonhosted.org/packages/b7/08/6a431731e4c163bc1fab85b63e269d84104aad0fba98dac1af34fdc5077f/vllm-0.19.0-cp38-abi3-manylinux_2_31_x86_64.whl", hash = "sha256:2d0e5fae45367bdbf111fcad68f4c0f8fdddd2f2fb643e52f0f2daebef7b41cf", size = 432281473, upload-time = "2026-04-03T04:05:22.07Z" },
+]
+
 [[package]]
 name = "watchfiles"
 version = "1.2.0"
@@ -2044,6 +3559,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/6f/28/258ebab549c2bf3e64d2b0217b973467394a9cea8c42f70418ca2c5d0d2e/websockets-16.0-py3-none-any.whl", hash = "sha256:1637db62fad1dc833276dded54215f2c7fa46912301a24bd94d45d46a011ceec", size = 171598, upload-time = "2026-01-10T09:23:45.395Z" },
 ]
 
+[[package]]
+name = "win32-setctime"
+version = "1.2.0"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/b3/8f/705086c9d734d3b663af0e9bb3d4de6578d08f46b1b101c2442fd9aecaa2/win32_setctime-1.2.0.tar.gz", hash = "sha256:ae1fdf948f5640aae05c511ade119313fb6a30d7eabe25fef9764dca5873c4c0", size = 4867, upload-time = "2024-12-07T15:28:28.314Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/e1/07/c6fe3ad3e685340704d314d765b7912993bcb8dc198f0e7a89382d37974b/win32_setctime-1.2.0-py3-none-any.whl", hash = "sha256:95d644c4e708aba81dc3704a116d8cbc974d70b3bdb8be1d150e36be6e9d1390", size = 4083, upload-time = "2024-12-07T15:28:26.465Z" },
+]
+
 [[package]]
 name = "wsproto"
 version = "1.3.2"
@@ -2056,6 +3580,28 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/a4/f5/10b68b7b1544245097b2a1b8238f66f2fc6dcaeb24ba5d917f52bd2eed4f/wsproto-1.3.2-py3-none-any.whl", hash = "sha256:61eea322cdf56e8cc904bd3ad7573359a242ba65688716b0710a5eb12beab584", size = 24405, upload-time = "2025-11-20T18:18:00.454Z" },
 ]
 
+[[package]]
+name = "xgrammar"
+version = "0.2.2"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "apache-tvm-ffi" },
+    { name = "numpy" },
+    { name = "pydantic" },
+    { name = "torch" },
+    { name = "transformers" },
+    { name = "triton", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
+    { name = "typing-extensions" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/fe/b7/86178b241be0b29ce9ee991a44d4b7eddb0f0c98310c0067fe83afc897d1/xgrammar-0.2.2.tar.gz", hash = "sha256:42bcc5c4187ff9cb6edc44ff5f56030d8b69d62c7576f674a5a074f71b68b1fa", size = 2430492, upload-time = "2026-06-11T19:02:49.624Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/31/48/c50e00cb390702153a740d6eb163efb6771c2c18087401160132b1be1bdb/xgrammar-0.2.2-cp312-cp312-macosx_10_14_x86_64.whl", hash = "sha256:f457e20099768203f313dfc4bdfcb80ebdd98040b8df7a4f25adf5b0ddaa2cd4", size = 23292820, upload-time = "2026-06-11T19:01:40.152Z" },
+    { url = "https://files.pythonhosted.org/packages/93/b9/d6004ab97588bf680fc5524a7a07b9e6c58caecd5de303c9e74d6b70267d/xgrammar-0.2.2-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:c4e46e6c62c534c8e539447008fc1f77b592d92d304a92961852e9540aa8f0cc", size = 23236355, upload-time = "2026-06-11T19:01:42.817Z" },
+    { url = "https://files.pythonhosted.org/packages/d5/20/84f3d71964cf2142c3470025731c0ccb3fb8c54b192c1435f10b79d72d31/xgrammar-0.2.2-cp312-cp312-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5432dee479702807e0af956b3804d4c5b7ddafb61b1d97d437574be6837a4e15", size = 44091972, upload-time = "2026-06-11T19:01:45.96Z" },
+    { url = "https://files.pythonhosted.org/packages/b1/40/eb28b7343b019350a83cde47fe876a79f3ee19e412c1af451200b6351fe8/xgrammar-0.2.2-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d3f0195ffc978bc2b44e4d5b682ff3a5ee86c378e8934b0f8582ddd0b00c24f0", size = 44588349, upload-time = "2026-06-11T19:01:49.062Z" },
+    { url = "https://files.pythonhosted.org/packages/d4/c6/01182ac2d0cf3943f8c990824027cb13b914867c32990c5e648a09f2d8fa/xgrammar-0.2.2-cp312-cp312-win_amd64.whl", hash = "sha256:2560d07b6b194334dd612a937c5a27b04e34f5b73aba8aea5d6860f758bcdff8", size = 7436669, upload-time = "2026-06-11T19:01:51.674Z" },
+]
+
 [[package]]
 name = "xxhash"
 version = "3.7.0"

From 50396c21b6465e07da08094be062a719ca1992b4 Mon Sep 17 00:00:00 2001
From: Zhifei Li <andylizf@outlook.com>
Date: Thu, 25 Jun 2026 20:45:52 -0700
Subject: [PATCH 04/13] fix(eval): grade NQ/NQ-Tables with the LLM judge to
 match the paper
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The paper's published NQ/NQT numbers use the gpt-4.1 LLM judge (semantic match), not strict
exact-match. The reader gives short answers but paraphrases the gold span, so strict exact-match
scores ~20pp lower (≈ the naive number) even when the answer is correct. Measured on NQ base:
29% exact-match vs 53% LLM-judge (paper: 57.9).

- grader: add a --llm-judge flag; for nq/nq_tables/triviaqa it grades with the existing
  gpt-4.1 judge (ground truth = "Any of: <gold aliases>") instead of exact-match. The
  default stays strict exact-match (no API key needed).
- reproduce.sh: pass --llm-judge for nq/nqt so the turnkey reproduction matches the paper.
- README: note NQ/NQT paper numbers are LLM-judge; update the cell table + grader section.
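
To illustrate why a correct paraphrase fails strict exact-match, here is a minimal sketch. It
assumes a common SQuAD-style normalization (lowercase, strip punctuation and articles); the
names `normalize`, `exact_match` and `judge_ground_truth` and the sample answers are
illustrative only, not the actual helpers or data in eval/lib/grader.py.

```python
import re
import string


def normalize(text: str) -> str:
    """SQuAD-style normalization: lowercase, drop punctuation and articles, squeeze spaces."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, golds: list[str]) -> bool:
    """Strict exact-match: the normalized prediction must equal some normalized gold."""
    return normalize(prediction) in {normalize(g) for g in golds}


def judge_ground_truth(golds: list[str]) -> str:
    """Ground-truth string for the LLM judge, in the "Any of: a | b" form."""
    kept = [str(g) for g in golds if g]
    return "Any of: " + " | ".join(kept) if kept else ""


# Made-up example data (not taken from NQ).
golds = ["William Shakespeare", "Shakespeare"]
verbatim = exact_match("The William Shakespeare.", golds)         # matches after normalization
paraphrase = exact_match("It was written by Shakespeare", golds)  # right answer, but no match
ground_truth = judge_ground_truth(golds)
print(verbatim, paraphrase, ground_truth)
```

The paraphrased answer names the right person yet scores as wrong under strict equality; a
semantic judge given the "Any of: ..." string can accept it.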
---
 eval/README.md     | 14 +++++++++++---
 eval/lib/grader.py | 27 ++++++++++++++++++++++++---
 eval/reproduce.sh  |  6 +++++-
 3 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/eval/README.md b/eval/README.md
index 50357e9..dd0a10d 100644
--- a/eval/README.md
+++ b/eval/README.md
@@ -94,7 +94,7 @@ against the **public API** (mode 2), drive `run_bench.py` directly — `reproduc
 `localhost`, so it can't point at a remote serve:
 
 ```bash
-# NQ via the public base endpoint (exact-match grading — no OpenAI key needed):
+# NQ via the public base endpoint (then grade — see §5; the paper used --llm-judge):
 .venv/bin/python run_bench.py --task nq --model Qwen/Qwen3.5-4B \
   --api-base "$READER_URL" --api-key dummy --no-think \
   --retrieval-top-k 5 --reader-top-k 3 --num-examples 1000 --max-tokens 200 \
@@ -111,7 +111,7 @@ Per-cell config is locked inside `reproduce.sh`:
 
 | bench | think | max_tokens | n | grader | notes |
 |-------|-------|-----------|---|--------|-------|
-| nq / nqt | no-think | 200 | 1000 / all | exact-match | |
+| nq / nqt | no-think | 200 | 1000 / all | LLM-judge¹ | ¹paper numbers used the gpt-4.1 judge; `reproduce.sh` passes `--llm-judge` (needs the OpenAI key) |
 | sqa | no-think | 200 | 1000 | SimpleQA judge | nprobe 2000 |
 | mms (base/lora/traf) | **think** | 16384 | all | WorldVQA judge | pixel instr = V1 "Retrieve images or text relevant to the user's query." |
 | mms (naive) | no-think | 200 | all | WorldVQA judge | |
@@ -133,6 +133,11 @@ Paper Table 1 (Qwen3.5-4B, k=3):
 
 On H100, this harness reproduces the pixel cells (LiveVQA/MMS/EVQA base+lora) within ~1pp.
 
+NOTE on NQ/NQ-Tables grading: the paper's published numbers use the **gpt-4.1 LLM judge**
+(semantic match), not strict exact-match. The reader gives short answers but paraphrases the
+gold span, so strict exact-match scores ~20pp lower (≈ the naive number) even when the answer
+is right. Grade these cells with `--llm-judge` (what `reproduce.sh` does) to match the paper.
+
 NOTE on traf (text retrieval): `reproduce.sh` passes `--no-query-image` to match the paper's
 text-only text retrieval.
 
@@ -142,7 +147,10 @@ text-only text retrieval.
 - WorldVQA judge (mmsearch / encyclopedic_vqa): prompt verbatim, GT for EVQA =
   `"Any of: " + " | ".join(reference_list)` (any reference matches → correct), `<think>` stripped,
   judge gpt-4.1 temp 0 + `system="You are a helpful assistant."` + `seed=42` + `max_tokens=1000`.
-- exact-match (nq / nq_tables): SQuAD-style normalize + match against the gold answer list.
+- nq / nq_tables: **default** strict exact-match (SQuAD normalize + equality vs the gold list,
+  no API key). Pass `--llm-judge` to grade with the gpt-4.1 judge instead — that is what the
+  paper used for its published NQ/NQT numbers (strict exact-match runs ~20pp lower because the
+  reader paraphrases). `reproduce.sh` uses `--llm-judge` for these cells.
 - SimpleQA judge (simpleqa): the SimpleQA `GRADER_TEMPLATE` → A/B/C.
 
 ```bash
diff --git a/eval/lib/grader.py b/eval/lib/grader.py
index a939ad4..ba54c62 100644
--- a/eval/lib/grader.py
+++ b/eval/lib/grader.py
@@ -63,6 +63,10 @@ def strip_think(text: str) -> str:
 
 def build_ground_truth(task: str, original_data: dict) -> str:
     """Match evaluate.py convert_to_evaluate_format."""
+    if task in EXACT_MATCH_TASKS:
+        # nq / nq_tables / triviaqa carry a list of acceptable gold spans/aliases.
+        golds = [str(g) for g in (_golds_for(task, original_data) or []) if g]
+        return "Any of: " + " | ".join(golds) if golds else ""
     if task == "encyclopedic_vqa":
         refs = original_data.get("reference_list") or []
         if refs:
@@ -144,8 +148,11 @@ async def grade_file(
     path: str,
     grader_model: str = DEFAULT_GRADER_MODEL,
     concurrency: int = 16,
+    llm_judge: bool = False,
 ) -> dict:
-    if task in EXACT_MATCH_TASKS:
+    # nq/nq_tables/triviaqa default to strict exact-match (no API key needed). The paper's
+    # published numbers for these used the LLM judge (semantic match) — pass --llm-judge.
+    if task in EXACT_MATCH_TASKS and not llm_judge:
         return grade_exact_match(path)
     from openai import AsyncOpenAI
 
@@ -222,12 +229,26 @@ def main():
     ap.add_argument("jsonl", help="responses jsonl from run_bench.py")
     ap.add_argument("--grader-model", default=DEFAULT_GRADER_MODEL)
     ap.add_argument("--concurrency", type=int, default=16)
+    ap.add_argument(
+        "--llm-judge",
+        action="store_true",
+        help="For nq/nq_tables/triviaqa: grade with the gpt-4.1 LLM judge (semantic "
+        "match — what the paper used for its published numbers) instead of strict "
+        "exact-match. Requires OPENAI_API_KEY.",
+    )
     args = ap.parse_args()
     res = asyncio.run(
-        grade_file(args.task, args.jsonl, args.grader_model, args.concurrency)
+        grade_file(
+            args.task, args.jsonl, args.grader_model, args.concurrency, args.llm_judge
+        )
+    )
+    mode = (
+        "exact-match"
+        if args.task in EXACT_MATCH_TASKS and not args.llm_judge
+        else "LLM-judge"
     )
     print(
-        f"{Path(res['file']).name}: {res['correct']}/{res['n']} = {res['score']:.4f} "
+        f"{Path(res['file']).name} [{mode}]: {res['correct']}/{res['n']} = {res['score']:.4f} "
         f"(C={res['correct']} I={res['incorrect']} U={res['unattempted']} err={res['errors']})"
     )
     print(f"Score: {res['score']:.3f}")
diff --git a/eval/reproduce.sh b/eval/reproduce.sh
index 9d3169a..72cc84c 100755
--- a/eval/reproduce.sh
+++ b/eval/reproduce.sh
@@ -115,4 +115,8 @@ echo ">>> [$BENCH/$RETR] run_bench: reader=$READER_URL task=$TASK think=$THINK m
     $EXTRA "${RFLAGS[@]}"
 
 echo ">>> [$BENCH/$RETR] grading ($GRADE)"
-PYTHONPATH=. "$PY" -m lib.grader "$GRADE" "$OUT"
+# nq/nqt: the paper's published numbers used the LLM judge (semantic match), not strict
+# exact-match — grade with --llm-judge to match. (For strict exact-match, drop the flag.)
+JUDGEFLAG=""; case "$GRADE" in nq|nq_tables) JUDGEFLAG="--llm-judge" ;; esac
+# shellcheck disable=SC2086
+PYTHONPATH=. "$PY" -m lib.grader "$GRADE" "$OUT" $JUDGEFLAG

From f9c29a5e304d531b0e43436238133a16d489c14d Mon Sep 17 00:00:00 2001
From: Zhifei Li <andylizf@outlook.com>
Date: Thu, 25 Jun 2026 20:53:18 -0700
Subject: [PATCH 05/13] fix(serve)+docs(eval): add torchvision to the serve
 extra; document self-hosting the search serve
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- The `serve` extra was missing torchvision, which transformers' Qwen3-VL processor imports —
  self-hosting the search serve failed until it was added by hand. Add it (cu129 via the
  existing tool.uv.sources, matching the embed extra).
- eval/README: add a "self-hosting the search serve" note — the [serve] install, the
  ~217G index + ~220G RAM, where articles.json lives (the pixelrag-tiles dataset), and the
  tiles options (full corpus vs --render-on-demand from a kiwix ZIM).
---
 eval/README.md | 12 ++++++++++++
 pyproject.toml |  1 +
 uv.lock        |  5 +++++
 3 files changed, 18 insertions(+)

diff --git a/eval/README.md b/eval/README.md
index dd0a10d..344d6c6 100644
--- a/eval/README.md
+++ b/eval/README.md
@@ -77,6 +77,18 @@ reach it three ways — pick one:
 
 In modes 2 and 3 the reader needs no local tiles, so leave `TILES_DIR` empty.
 
+**Self-hosting the search serve (modes 1 & 3) — what you need:**
+- `pip install -e '..[serve]'` (faiss + torch/torchvision + transformers + the query encoder
+  `Qwen/Qwen3-VL-Embedding-2B`). This is **separate from the eval client and the reader** — a
+  CUDA GPU box with plenty of RAM.
+- The FAISS index: `search_index_normed_v2` (~217G download from `StarTrail-org/pixelrag-faiss-indexes`,
+  ~220G RAM to load — `serve_up.sh` fetches it).
+- `articles.json` (the article-id → wiki-slug map the serve needs to resolve hit URLs) lives in
+  the **`StarTrail-org/pixelrag-tiles`** dataset, not the faiss-indexes one.
+- Tiles for the reader: either the full corpus at `TILES_DIR` (mode 1), or start the serve with
+  `--render-on-demand --kiwix-url <kiwix-serve>` so it renders only the retrieved pages from a
+  kiwix ZIM (mode 3) — no ~4T corpus.
+
 ## 3. Run a cell
 
 ```bash
diff --git a/pyproject.toml b/pyproject.toml
index 3346f8e..af4d814 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -39,6 +39,7 @@ serve = [
     "faiss-cpu>=1.9.0",
     "transformers>=4.57.0",
     "torch>=2.9.0",
+    "torchvision>=0.24.0",  # transformers' Qwen3-VL processor needs it; cu129 via tool.uv.sources
     "qwen-vl-utils",
     "pydantic>=2.0.0",
 ]
diff --git a/uv.lock b/uv.lock
index 8e97dc3..45481f0 100644
--- a/uv.lock
+++ b/uv.lock
@@ -1496,6 +1496,9 @@ serve = [
     { name = "qwen-vl-utils", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
     { name = "torch", version = "2.9.1+cu129", source = { registry = "https://download.pytorch.org/whl/cu129" }, marker = "sys_platform == 'linux'" },
     { name = "torch", version = "2.12.1", source = { registry = "https://pypi.org/simple" }, marker = "sys_platform == 'darwin'" },
+    { name = "torchvision", version = "0.24.1", source = { registry = "https://download.pytorch.org/whl/cu129" }, marker = "platform_machine == 'aarch64' and sys_platform == 'linux'" },
+    { name = "torchvision", version = "0.24.1+cu129", source = { registry = "https://download.pytorch.org/whl/cu129" }, marker = "platform_machine != 'aarch64' and sys_platform == 'linux'" },
+    { name = "torchvision", version = "0.27.1", source = { registry = "https://pypi.org/simple" }, marker = "sys_platform == 'darwin'" },
     { name = "transformers", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
     { name = "uvicorn", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
 ]
@@ -1534,7 +1537,9 @@ requires-dist = [
     { name = "torch", marker = "sys_platform != 'linux' and extra == 'embed'", specifier = ">=2.9.0" },
     { name = "torch", marker = "sys_platform != 'linux' and extra == 'serve'", specifier = ">=2.9.0" },
     { name = "torchvision", marker = "sys_platform == 'linux' and extra == 'embed'", specifier = ">=0.24.0", index = "https://download.pytorch.org/whl/cu129" },
+    { name = "torchvision", marker = "sys_platform == 'linux' and extra == 'serve'", specifier = ">=0.24.0", index = "https://download.pytorch.org/whl/cu129" },
     { name = "torchvision", marker = "sys_platform != 'linux' and extra == 'embed'", specifier = ">=0.24.0" },
+    { name = "torchvision", marker = "sys_platform != 'linux' and extra == 'serve'", specifier = ">=0.24.0" },
     { name = "tqdm", marker = "extra == 'embed'", specifier = ">=4.60.0" },
     { name = "tqdm", marker = "extra == 'eval'", specifier = ">=4.60" },
     { name = "trafilatura", marker = "extra == 'eval'", specifier = ">=1.6" },

From 41bfb3765c72dbb8d704f945262c0d9cb3de1ef5 Mon Sep 17 00:00:00 2001
From: Zhifei Li <andylizf@outlook.com>
Date: Fri, 26 Jun 2026 03:36:02 -0700
Subject: [PATCH 06/13] fix(serve,render): make Mode-3 on-demand rendering
 actually work

Two real bugs that made the serve's --render-on-demand path (Mode 3 in the eval README)
return empty tiles, surfaced by a from-scratch reproduction:

- serve: the async /search handler called _ondemand_chunk_b64 synchronously, which runs
  render_url -> asyncio.run() inside the running event loop -> RuntimeError, swallowed, so
  image_base64 was always empty (the reader fell back to closed-book). Offload it with
  asyncio.to_thread so render_url's asyncio.run() runs in a thread with no running loop.
- render: the cdp/fast_cdp backends force GPU rasterization, which needs GPU device access
  (the `render` group on lab machines: `sudo usermod -aG render $USER`). On a headless box
  without it the renderer crashes and CDP capture hangs forever. Add PIXELSHOT_DISABLE_GPU=1
  to fall back to CPU rasterization; the default (GPU) path is unchanged.
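
The first bug can be reproduced in isolation. The sketch below is a stand-alone illustration,
not the serve's code: `render_sync`, `handler_direct` and `handler_offloaded` are hypothetical
stand-ins for render_url and the /search handler. It shows that a blocking function which calls
asyncio.run() fails inside a running event loop and succeeds when offloaded with
asyncio.to_thread.

```python
import asyncio


async def _render_async(url: str) -> str:
    """Stand-in for the real async rendering work."""
    await asyncio.sleep(0)
    return f"tile-for:{url}"


def render_sync(url: str) -> str:
    """Blocking wrapper that starts its own event loop via asyncio.run()."""
    coro = _render_async(url)
    try:
        return asyncio.run(coro)
    except RuntimeError:
        coro.close()  # avoid a "coroutine was never awaited" warning
        raise


async def handler_direct(url: str) -> str:
    """Buggy pattern: call the blocking wrapper on the event loop and swallow the error."""
    try:
        return render_sync(url)
    except RuntimeError:
        return ""  # the caller only ever sees an empty image


async def handler_offloaded(url: str) -> str:
    """Fixed pattern: run the wrapper in a worker thread, which has no running loop."""
    return await asyncio.to_thread(render_sync, url)


direct = asyncio.run(handler_direct("https://example.org/page"))
offloaded = asyncio.run(handler_offloaded("https://example.org/page"))
print(repr(direct), repr(offloaded))
```

asyncio.to_thread requires Python 3.9 or newer. Swallowing the RuntimeError is what turned the
failure into a silent empty tile rather than a visible error.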
---
 render/src/pixelrag_render/backends/cdp.py      | 12 ++++++++++--
 render/src/pixelrag_render/backends/fast_cdp.py | 11 +++++++++--
 serve/src/pixelrag_serve/api.py                 |  6 +++++-
 3 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/render/src/pixelrag_render/backends/cdp.py b/render/src/pixelrag_render/backends/cdp.py
index df09fc6..a7606c3 100644
--- a/render/src/pixelrag_render/backends/cdp.py
+++ b/render/src/pixelrag_render/backends/cdp.py
@@ -40,6 +40,15 @@
 VIEWPORT_W = 875
 VIEWPORT_H = 1080
 
+# GPU rasterization is faster but needs GPU device access — on lab machines that means
+# membership in the `render` group (`sudo usermod -aG render $USER`). Without GPU access
+# (or headless with no GPU compositor) these flags make the renderer crash and CDP capture
+# hangs forever. Set PIXELSHOT_DISABLE_GPU=1 to fall back to CPU rasterization.
+_GPU_ARGS = (
+    ["--disable-gpu"]
+    if os.environ.get("PIXELSHOT_DISABLE_GPU")
+    else ["--enable-gpu-rasterization", "--force-gpu-rasterization"]
+)
 BROWSER_ARGS = [
     "--disable-dev-shm-usage",
     "--no-sandbox",
@@ -47,8 +56,7 @@
     "--disable-backgrounding-occluded-windows",
     "--disable-background-networking",
     "--disable-features=Translate,MediaRouter,OptimizationHints",
-    "--enable-gpu-rasterization",
-    "--force-gpu-rasterization",
+    *_GPU_ARGS,
 ]
 
 
diff --git a/render/src/pixelrag_render/backends/fast_cdp.py b/render/src/pixelrag_render/backends/fast_cdp.py
index 0427ddb..8bca69b 100644
--- a/render/src/pixelrag_render/backends/fast_cdp.py
+++ b/render/src/pixelrag_render/backends/fast_cdp.py
@@ -39,11 +39,18 @@
 VIEWPORT_WIDTH = 875
 TILE_HEIGHT = 8192
 
+# GPU rasterization needs GPU device access (the `render` group on lab machines). Without it
+# (or headless with no GPU compositor) it crashes the renderer — set PIXELSHOT_DISABLE_GPU=1
+# to fall back to CPU rasterization.
+_GPU_ARGS = (
+    ["--disable-gpu"]
+    if os.environ.get("PIXELSHOT_DISABLE_GPU")
+    else ["--enable-gpu-rasterization", "--force-gpu-rasterization"]
+)
 CHROME_ARGS = [
     "--no-sandbox",
     "--disable-dev-shm-usage",
-    "--enable-gpu-rasterization",
-    "--force-gpu-rasterization",
+    *_GPU_ARGS,
     "--disable-renderer-backgrounding",
     "--disable-backgrounding-occluded-windows",
     "--disable-background-networking",
diff --git a/serve/src/pixelrag_serve/api.py b/serve/src/pixelrag_serve/api.py
index 65b6d40..211c90a 100644
--- a/serve/src/pixelrag_serve/api.py
+++ b/serve/src/pixelrag_serve/api.py
@@ -32,6 +32,7 @@
 """
 
 import argparse
+import asyncio
 import base64
 import contextvars
 import functools
@@ -462,7 +463,10 @@ async def search(req: SearchRequest):
                 with open(tile_path, "rb") as fp:
                     img_b64 = base64.b64encode(fp.read()).decode()
             elif req.include_images and _state.get("ondemand") is not None:
-                img_b64 = _ondemand_chunk_b64(aid, ti, ci, th)
+                # Render off the event loop: _ondemand_chunk_b64 -> render_url uses
+                # asyncio.run(), which raises "cannot be called from a running event
+                # loop" if invoked directly here. Offload to a worker thread.
+                img_b64 = await asyncio.to_thread(_ondemand_chunk_b64, aid, ti, ci, th)
             # Expose a relative tile path, not the absolute server filesystem
             # path (avoids leaking the host's directory layout; clients fetch
             # tiles via /tile/{article_id}/{tile_index}/{chunk_index}).

From a770ad7dfc58fb426775ecae333bea70009eaa16 Mon Sep 17 00:00:00 2001
From: Zhifei Li <andylizf@outlook.com>
Date: Fri, 26 Jun 2026 04:00:31 -0700
Subject: [PATCH 07/13] fix(render): default to CPU rasterization (the render
 machines have no GPU)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Our render/serve machines are GPU-less (the serve runs --device cpu; nvidia-smi has no
driver), and the stable production capture config is GPU-rasterization-incompatible anyway.
Forcing `--enable-gpu-rasterization` by default makes Chrome crash / CDP capture hang on a
box without real GPU device access.

Flip the default to `--disable-gpu` (CPU rasterization — works on GPU-less and headless
boxes, the common case). On a properly configured graphics-GPU render box, set
PIXELSHOT_ENABLE_GPU=1 for ~2x throughput (GPU rasterization can also produce blank
captures, so verify output there).
---
 render/src/pixelrag_render/backends/cdp.py      | 15 ++++++++-------
 render/src/pixelrag_render/backends/fast_cdp.py | 13 +++++++------
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/render/src/pixelrag_render/backends/cdp.py b/render/src/pixelrag_render/backends/cdp.py
index a7606c3..36976f5 100644
--- a/render/src/pixelrag_render/backends/cdp.py
+++ b/render/src/pixelrag_render/backends/cdp.py
@@ -40,14 +40,15 @@
 VIEWPORT_W = 875
 VIEWPORT_H = 1080
 
-# GPU rasterization is faster but needs GPU device access — on lab machines that means
-# membership in the `render` group (`sudo usermod -aG render $USER`). Without GPU access
-# (or headless with no GPU compositor) these flags make the renderer crash and CDP capture
-# hangs forever. Set PIXELSHOT_DISABLE_GPU=1 to fall back to CPU rasterization.
+# Default to CPU rasterization (`--disable-gpu`): it works on GPU-less and headless boxes —
+# the common case. Forcing GPU rasterization needs real GPU device access (a graphics GPU +
+# the `render` group, `sudo usermod -aG render $USER`); without it the renderer crashes and
+# CDP capture hangs forever. On a properly-configured GPU render box, set PIXELSHOT_ENABLE_GPU=1
+# for ~2x throughput (note: GPU rasterization can also produce blank captures — verify output).
 _GPU_ARGS = (
-    ["--disable-gpu"]
-    if os.environ.get("PIXELSHOT_DISABLE_GPU")
-    else ["--enable-gpu-rasterization", "--force-gpu-rasterization"]
+    ["--enable-gpu-rasterization", "--force-gpu-rasterization"]
+    if os.environ.get("PIXELSHOT_ENABLE_GPU")
+    else ["--disable-gpu"]
 )
 BROWSER_ARGS = [
     "--disable-dev-shm-usage",
diff --git a/render/src/pixelrag_render/backends/fast_cdp.py b/render/src/pixelrag_render/backends/fast_cdp.py
index 8bca69b..2c0660a 100644
--- a/render/src/pixelrag_render/backends/fast_cdp.py
+++ b/render/src/pixelrag_render/backends/fast_cdp.py
@@ -39,13 +39,14 @@
 VIEWPORT_WIDTH = 875
 TILE_HEIGHT = 8192
 
-# GPU rasterization needs GPU device access (the `render` group on lab machines). Without it
-# (or headless with no GPU compositor) it crashes the renderer — set PIXELSHOT_DISABLE_GPU=1
-# to fall back to CPU rasterization.
+# Default to CPU rasterization (works on GPU-less / headless boxes — the common case).
+# GPU rasterization needs real GPU device access (graphics GPU + the `render` group); without
+# it the renderer crashes. On a configured GPU render box, set PIXELSHOT_ENABLE_GPU=1 (~2x,
+# but it can produce blank captures — verify output).
 _GPU_ARGS = (
-    ["--disable-gpu"]
-    if os.environ.get("PIXELSHOT_DISABLE_GPU")
-    else ["--enable-gpu-rasterization", "--force-gpu-rasterization"]
+    ["--enable-gpu-rasterization", "--force-gpu-rasterization"]
+    if os.environ.get("PIXELSHOT_ENABLE_GPU")
+    else ["--disable-gpu"]
 )
 CHROME_ARGS = [
     "--no-sandbox",

From 4879a1307316a2df43193ba89ed980d6bb051ec7 Mon Sep 17 00:00:00 2001
From: Zhifei Li <andylizf@outlook.com>
Date: Fri, 26 Jun 2026 05:39:18 -0700
Subject: [PATCH 08/13] revert(render): keep the original GPU-rasterization
 flags (no-GPU boxes render fine)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

A previous commit flipped the cdp/fast_cdp default to --disable-gpu on the theory that
--enable-gpu-rasterization crashes on GPU-less / headless machines. That was wrong: verified
on a no-GPU box, the original flags render correctly and fast (Chrome falls back to software
rasterization, no crash). The crash seen on one borrowed cluster box was specific to that
host (patched headless_shell + datacenter GPU), not a general problem — not a reason to change
the default. Restore cdp.py / fast_cdp.py to the original BROWSER_ARGS.

The genuine Mode-3 bug (the serve's async handler calling render_url -> asyncio.run inside a
running event loop) stays fixed via asyncio.to_thread in serve/api.py — that is unrelated to
the GPU flags.
---
 render/src/pixelrag_render/backends/cdp.py      | 13 ++-----------
 render/src/pixelrag_render/backends/fast_cdp.py | 12 ++----------
 2 files changed, 4 insertions(+), 21 deletions(-)

diff --git a/render/src/pixelrag_render/backends/cdp.py b/render/src/pixelrag_render/backends/cdp.py
index 36976f5..df09fc6 100644
--- a/render/src/pixelrag_render/backends/cdp.py
+++ b/render/src/pixelrag_render/backends/cdp.py
@@ -40,16 +40,6 @@
 VIEWPORT_W = 875
 VIEWPORT_H = 1080
 
-# Default to CPU rasterization (`--disable-gpu`): it works on GPU-less and headless boxes —
-# the common case. Forcing GPU rasterization needs real GPU device access (a graphics GPU +
-# the `render` group, `sudo usermod -aG render $USER`); without it the renderer crashes and
-# CDP capture hangs forever. On a properly-configured GPU render box, set PIXELSHOT_ENABLE_GPU=1
-# for ~2x throughput (note: GPU rasterization can also produce blank captures — verify output).
-_GPU_ARGS = (
-    ["--enable-gpu-rasterization", "--force-gpu-rasterization"]
-    if os.environ.get("PIXELSHOT_ENABLE_GPU")
-    else ["--disable-gpu"]
-)
 BROWSER_ARGS = [
     "--disable-dev-shm-usage",
     "--no-sandbox",
@@ -57,7 +47,8 @@
     "--disable-backgrounding-occluded-windows",
     "--disable-background-networking",
     "--disable-features=Translate,MediaRouter,OptimizationHints",
-    *_GPU_ARGS,
+    "--enable-gpu-rasterization",
+    "--force-gpu-rasterization",
 ]
 
 
diff --git a/render/src/pixelrag_render/backends/fast_cdp.py b/render/src/pixelrag_render/backends/fast_cdp.py
index 2c0660a..0427ddb 100644
--- a/render/src/pixelrag_render/backends/fast_cdp.py
+++ b/render/src/pixelrag_render/backends/fast_cdp.py
@@ -39,19 +39,11 @@
 VIEWPORT_WIDTH = 875
 TILE_HEIGHT = 8192
 
-# Default to CPU rasterization (works on GPU-less / headless boxes — the common case).
-# GPU rasterization needs real GPU device access (graphics GPU + the `render` group); without
-# it the renderer crashes. On a configured GPU render box, set PIXELSHOT_ENABLE_GPU=1 (~2x,
-# but it can produce blank captures — verify output).
-_GPU_ARGS = (
-    ["--enable-gpu-rasterization", "--force-gpu-rasterization"]
-    if os.environ.get("PIXELSHOT_ENABLE_GPU")
-    else ["--disable-gpu"]
-)
 CHROME_ARGS = [
     "--no-sandbox",
     "--disable-dev-shm-usage",
-    *_GPU_ARGS,
+    "--enable-gpu-rasterization",
+    "--force-gpu-rasterization",
     "--disable-renderer-backgrounding",
     "--disable-backgrounding-occluded-windows",
     "--disable-background-networking",

From 0cf430ad85663779c97ffcef342813373f9c222f Mon Sep 17 00:00:00 2001
From: Zhifei Li <andylizf@outlook.com>
Date: Fri, 26 Jun 2026 05:41:08 -0700
Subject: [PATCH 09/13] fix(eval): configurable retrieval timeout so a slow
 serve doesn't silently go closed-book
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

LocalAPIRetriever/TextAPIRetriever POST to the search serve with a hardcoded 600s timeout and
batch_size 32. Against a slow serve (e.g. --render-on-demand, which renders a page per tile),
a batch exceeds 600s, the request times out, the batch is cached EMPTY, and the reader answers
closed-book — a silently invalid run (looks like a bad score, not an error). Make the timeout
configurable via PIXELRAG_RETRIEVAL_TIMEOUT (default 600, unchanged for fast serves); document
raising it for mode 3 in the README.
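
For illustration, a slightly stricter way to read the variable is sketched below. This is an
assumption-laden variant, not what eval/lib/retrieval.py does (the patch simply calls float()
on the value): the helper name `retrieval_timeout` and the rejection of non-positive values are
hypothetical additions.

```python
import os

DEFAULT_TIMEOUT_S = 600.0  # same default as the patch


def retrieval_timeout(env=None) -> float:
    """Return the retrieval POST timeout in seconds from PIXELRAG_RETRIEVAL_TIMEOUT."""
    env = os.environ if env is None else env
    raw = env.get("PIXELRAG_RETRIEVAL_TIMEOUT", "").strip()
    if not raw:
        return DEFAULT_TIMEOUT_S
    value = float(raw)  # raises ValueError on non-numeric input
    if not value > 0:   # also rejects NaN
        raise ValueError(f"PIXELRAG_RETRIEVAL_TIMEOUT must be positive, got {raw!r}")
    return value


fast_serve = retrieval_timeout({})                                     # unset: default
slow_serve = retrieval_timeout({"PIXELRAG_RETRIEVAL_TIMEOUT": "7200"})  # mode 3 setting
print(fast_serve, slow_serve)
```

Failing loudly on a bad value keeps a misconfigured run from degrading into the same silent
closed-book behaviour this commit is meant to prevent.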
---
 eval/README.md        |  3 +++
 eval/lib/retrieval.py | 10 ++++++++--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/eval/README.md b/eval/README.md
index 344d6c6..118612f 100644
--- a/eval/README.md
+++ b/eval/README.md
@@ -88,6 +88,9 @@ In modes 2 and 3 the reader needs no local tiles, so leave `TILES_DIR` empty.
 - Tiles for the reader: either the full corpus at `TILES_DIR` (mode 1), or start the serve with
   `--render-on-demand --kiwix-url <kiwix-serve>` so it renders only the retrieved pages from a
   kiwix ZIM (mode 3) — no ~4T corpus.
+- On-demand render (mode 3) is slow (one page rendered per retrieved tile). Raise the retrieval
+  client timeout so a batch doesn't time out into an empty (closed-book) result:
+  `PIXELRAG_RETRIEVAL_TIMEOUT=7200`.
 
 ## 3. Run a cell
 
diff --git a/eval/lib/retrieval.py b/eval/lib/retrieval.py
index 25db01c..1b46cd5 100644
--- a/eval/lib/retrieval.py
+++ b/eval/lib/retrieval.py
@@ -17,6 +17,12 @@
 
 logger = logging.getLogger(__name__)
 
+# POST timeout (seconds) for a retrieval call to the search serve. The default suits a fast
+# serve; raise it for a slow one — e.g. on-demand render, where a batch can take minutes:
+# `PIXELRAG_RETRIEVAL_TIMEOUT=7200`. Otherwise the request times out, the batch is cached
+# empty, and the reader silently falls back to closed-book (looks like a bad score, not an error).
+_RETRIEVAL_TIMEOUT = float(os.environ.get("PIXELRAG_RETRIEVAL_TIMEOUT", "600"))
+
 
 @dataclass
 class RetrievalResult:
@@ -2223,7 +2229,7 @@ async def prefetch(self, examples: list[dict]):
                     async with session.post(
                         self.api_url,
                         json=payload,
-                        timeout=aiohttp.ClientTimeout(total=600),
+                        timeout=aiohttp.ClientTimeout(total=_RETRIEVAL_TIMEOUT),
                     ) as response:
                         if response.status != 200:
                             error_text = await response.text()
@@ -2917,7 +2923,7 @@ async def prefetch(self, examples: list[dict]):
                     async with session.post(
                         self.api_url,
                         json=payload,
-                        timeout=aiohttp.ClientTimeout(total=600),
+                        timeout=aiohttp.ClientTimeout(total=_RETRIEVAL_TIMEOUT),
                     ) as response:
                         if response.status != 200:
                             error_text = await response.text()

From dd21843c702737d1e26dcc4034365d39c87d7e5a Mon Sep 17 00:00:00 2001
From: Zhifei Li <andylizf@outlook.com>
Date: Fri, 26 Jun 2026 08:31:04 -0700
Subject: [PATCH 10/13] fix(serve): render on-demand tiles in a subprocess so
 the serve doesn't wedge

The on-demand renderer called render_url() in-process. render_url uses asyncio.run() +
multiprocessing.Pool (fork); the SECOND in-process call deadlocks (fork-in-threaded-process), so
a long-lived serve renders the first retrieved page fine and then wedges on the next request.

Verified on a no-GPU box: same-process 2nd render hangs (SIGKILL after 40s), a fresh subprocess
per render runs 3/3 clean (~1.85s each).

Run each render in a subprocess (timeout via PIXELRAG_RENDER_TIMEOUT, default 120s). Pairs with
the to_thread offload in api.py that keeps the blocking subprocess off the event loop.
---
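Note (not part of the commit): a minimal sketch of the isolation pattern on its own, assuming
nothing about pixelrag. `run_isolated` and the three stand-in child commands are hypothetical;
a real caller would pass the render command instead. It shows the two failure paths the patch
handles the same way: a non-zero exit and a hard timeout.

```python
import subprocess
import sys


def run_isolated(argv: list[str], timeout_s: float) -> bool:
    """Run one job in a fresh child process; True on exit code 0, False otherwise."""
    try:
        subprocess.run(
            argv,
            check=True,          # non-zero exit raises CalledProcessError
            timeout=timeout_s,   # child is killed, then TimeoutExpired is raised
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
        return False
    return True


# Stand-in children (hypothetical): one succeeds, one fails, one hangs.
ok_child = [sys.executable, "-c", "pass"]
failing_child = [sys.executable, "-c", "import sys; sys.exit(3)"]
hung_child = [sys.executable, "-c", "import time; time.sleep(30)"]
```

Each call starts from a clean interpreter, so no state left over from an earlier job (pools,
event loops, threads) can affect the next one. The price is one interpreter startup per job.
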
 serve/src/pixelrag_serve/render_ondemand.py | 43 +++++++++++++++++----
 1 file changed, 36 insertions(+), 7 deletions(-)

diff --git a/serve/src/pixelrag_serve/render_ondemand.py b/serve/src/pixelrag_serve/render_ondemand.py
index c22adfc..5c29bd2 100644
--- a/serve/src/pixelrag_serve/render_ondemand.py
+++ b/serve/src/pixelrag_serve/render_ondemand.py
@@ -11,12 +11,17 @@
 and the shared ``pixelrag_embed.chunk`` slicer (1024px chunks).
 """
 
+import glob
 import os
 import shutil
+import subprocess
+import sys
 import threading
 from urllib.parse import quote
 
 _render_lock = threading.Lock()  # one Chrome render at a time per process
+# Hard timeout (seconds) for a single page render subprocess.
+_RENDER_TIMEOUT = float(os.environ.get("PIXELRAG_RENDER_TIMEOUT", "120"))
 
 
 class OnDemandTiles:
@@ -58,22 +63,46 @@ def chunk_path(
         return cpath if os.path.exists(cpath) else None
 
     def _render_and_chunk(self, article_id: int, title: str) -> None:
-        from pixelrag_render import render_url
         from pixelrag_embed.chunk import chunk_article
 
         url = f"{self.kiwix_url}/content/{self.book}/{quote(title, safe='')}"
         staging = os.path.join(self.cache_dir, f".render_{article_id}")
         shutil.rmtree(staging, ignore_errors=True)
-        dirs = render_url(
-            url,
-            staging,
-            viewport_width=self.viewport_width,
-            tile_height=self.tile_height,
+        os.makedirs(staging, exist_ok=True)
+        # Render in a SEPARATE PROCESS. render_url internally uses asyncio.run() +
+        # multiprocessing.Pool (fork); calling it a second time in this long-lived
+        # serve process deadlocks (fork-in-threaded-process). A fresh subprocess per
+        # render is the reliable fix — verified: same-process 2nd render hangs, a
+        # fresh process per render does not.
+        code = (
+            "import sys; from pixelrag_render import render_url; "
+            "render_url(sys.argv[1], sys.argv[2], "
+            "viewport_width=int(sys.argv[3]), tile_height=int(sys.argv[4]))"
         )
+        try:
+            subprocess.run(
+                [
+                    sys.executable,
+                    "-c",
+                    code,
+                    url,
+                    staging,
+                    str(self.viewport_width),
+                    str(self.tile_height),
+                ],
+                check=True,
+                timeout=_RENDER_TIMEOUT,
+                stdout=subprocess.DEVNULL,
+                stderr=subprocess.DEVNULL,
+            )
+        except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
+            shutil.rmtree(staging, ignore_errors=True)
+            return
+        dirs = glob.glob(os.path.join(staging, "*.png.tiles"))
         if not dirs:
             shutil.rmtree(staging, ignore_errors=True)
             return
-        rendered = str(dirs[0])  # <sanitized-url>.png.tiles/ (has tiles.json)
+        rendered = dirs[0]  # <sanitized-url>.png.tiles/ (has tiles.json)
         chunk_article(rendered)  # writes chunk_XXXX_YY.png + chunks.json
         dest = self._article_dir(article_id)
         shutil.rmtree(dest, ignore_errors=True)

From d8bf1f93b74cb33165127c64499d16da358b92a4 Mon Sep 17 00:00:00 2001
From: Zhifei Li <andylizf@outlook.com>
Date: Fri, 26 Jun 2026 08:49:26 -0700
Subject: [PATCH 11/13] refactor(serve): render on-demand tiles via the
 pixelshot CLI, not inline python -c

The pixelshot CLI already exposes --viewport-width / --tile-height (defaults 875/8192), so use
the standard module entry (python -m pixelrag_render.render --output ... --viewport-width ...
--tile-height ... --workers 1) instead of an inline python -c string. Same subprocess isolation
that fixes the wedge, cleaner invocation.

Verified: 2 consecutive renders 1.67s/1.57s, correct *.png.tiles output, no hang.
---
 serve/src/pixelrag_serve/render_ondemand.py | 24 ++++++++++-----------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/serve/src/pixelrag_serve/render_ondemand.py b/serve/src/pixelrag_serve/render_ondemand.py
index 5c29bd2..3de351f 100644
--- a/serve/src/pixelrag_serve/render_ondemand.py
+++ b/serve/src/pixelrag_serve/render_ondemand.py
@@ -69,26 +69,26 @@ def _render_and_chunk(self, article_id: int, title: str) -> None:
         staging = os.path.join(self.cache_dir, f".render_{article_id}")
         shutil.rmtree(staging, ignore_errors=True)
         os.makedirs(staging, exist_ok=True)
-        # Render in a SEPARATE PROCESS. render_url internally uses asyncio.run() +
-        # multiprocessing.Pool (fork); calling it a second time in this long-lived
-        # serve process deadlocks (fork-in-threaded-process). A fresh subprocess per
-        # render is the reliable fix — verified: same-process 2nd render hangs, a
-        # fresh process per render does not.
-        code = (
-            "import sys; from pixelrag_render import render_url; "
-            "render_url(sys.argv[1], sys.argv[2], "
-            "viewport_width=int(sys.argv[3]), tile_height=int(sys.argv[4]))"
-        )
+        # Render in a SEPARATE PROCESS via the pixelshot CLI. render_url internally uses
+        # asyncio.run() + multiprocessing.Pool (fork); calling it a second time in this
+        # long-lived serve process deadlocks (fork-in-threaded-process). A fresh subprocess
+        # per render is the reliable fix — verified: same-process 2nd render hangs, a fresh
+        # process per render does not.
         try:
             subprocess.run(
                 [
                     sys.executable,
-                    "-c",
-                    code,
+                    "-m",
+                    "pixelrag_render.render",
                     url,
+                    "--output",
                     staging,
+                    "--viewport-width",
                     str(self.viewport_width),
+                    "--tile-height",
                     str(self.tile_height),
+                    "--workers",
+                    "1",
                 ],
                 check=True,
                 timeout=_RENDER_TIMEOUT,

From c3eda5aaf58fb1b79ad49e13472511b77413adb7 Mon Sep 17 00:00:00 2001
From: Zhifei Li <andylizf@outlook.com>
Date: Fri, 26 Jun 2026 08:56:46 -0700
Subject: [PATCH 12/13] fix(render): default GPU rasterization OFF (it was an
 inherited no-op that crashes some boxes)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

git shows --enable-gpu-rasterization/--force-gpu-rasterization have been in the cdp/fast_cdp
backends since the initial release (82c5794, then carried through PR #38), inherited on the
assumption that GPU rasterization speeds up capture.

It doesn't: headless Chrome falls back to the software renderer and ignores the flags. Verified
on a no-GPU box: enable == disable == 1.7s; the bottleneck is capture IPC, not rasterization
(see docs/screenshot-throughput-optimization.md).

On a box that has a GPU device but no access to it (e.g. /dev/dri without the render group),
Chrome tries the GPU, the GPU process crashes on init, and capture hangs.

Default to --disable-gpu; opt in with PIXELSHOT_ENABLE_GPU=1 only on a real graphics-GPU box
with device access. Default render verified working (1.62s/tile).
---
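Note (not part of the commit): in this patch any non-empty PIXELSHOT_ENABLE_GPU value,
including `0`, selects the GPU flags. The sketch below is a stricter variant, offered as one
possible follow-up rather than what the patch does. `gpu_args` and the accepted spellings in
`_TRUTHY` are assumptions; only the three Chrome flags come from the diff.

```python
import os
from typing import Mapping, Optional

GPU_ON = ["--enable-gpu-rasterization", "--force-gpu-rasterization"]
GPU_OFF = ["--disable-gpu"]
_TRUTHY = {"1", "true", "yes", "on"}  # assumed set of opt-in spellings


def gpu_args(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Hypothetical helper: pick Chrome GPU flags; only an explicit truthy value opts in."""
    env = os.environ if env is None else env
    raw = env.get("PIXELSHOT_ENABLE_GPU", "")
    enabled = raw.strip().lower() in _TRUTHY
    return list(GPU_ON if enabled else GPU_OFF)
```

With this variant, `PIXELSHOT_ENABLE_GPU=0` and an unset variable both yield `--disable-gpu`.
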
 render/src/pixelrag_render/backends/cdp.py      | 16 ++++++++++++++--
 render/src/pixelrag_render/backends/fast_cdp.py | 12 ++++++++++--
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/render/src/pixelrag_render/backends/cdp.py b/render/src/pixelrag_render/backends/cdp.py
index df09fc6..8803c1c 100644
--- a/render/src/pixelrag_render/backends/cdp.py
+++ b/render/src/pixelrag_render/backends/cdp.py
@@ -40,6 +40,19 @@
 VIEWPORT_W = 875
 VIEWPORT_H = 1080
 
+# GPU rasterization: default OFF. Headless Chrome can't actually GPU-rasterize — it falls
+# back to the software renderer and ignores these flags (no-op), so they never sped anything
+# up (verified: enable == disable timing; the bottleneck is capture IPC, not rasterization).
+# Worse, on a box that HAS a GPU device but no access to it (e.g. /dev/dri without the render
+# group), Chrome tries the GPU, the GPU process crashes on init, and capture hangs. The
+# `--enable-gpu-rasterization` pair was inherited from the initial release on the assumption
+# it would help; it doesn't. Default to `--disable-gpu`; opt in with PIXELSHOT_ENABLE_GPU=1
+# only on a real graphics-GPU box with device access.
+_GPU_ARGS = (
+    ["--enable-gpu-rasterization", "--force-gpu-rasterization"]
+    if os.environ.get("PIXELSHOT_ENABLE_GPU")
+    else ["--disable-gpu"]
+)
 BROWSER_ARGS = [
     "--disable-dev-shm-usage",
     "--no-sandbox",
@@ -47,8 +60,7 @@
     "--disable-backgrounding-occluded-windows",
     "--disable-background-networking",
     "--disable-features=Translate,MediaRouter,OptimizationHints",
-    "--enable-gpu-rasterization",
-    "--force-gpu-rasterization",
+    *_GPU_ARGS,
 ]
 
 
diff --git a/render/src/pixelrag_render/backends/fast_cdp.py b/render/src/pixelrag_render/backends/fast_cdp.py
index 0427ddb..fea9d56 100644
--- a/render/src/pixelrag_render/backends/fast_cdp.py
+++ b/render/src/pixelrag_render/backends/fast_cdp.py
@@ -39,11 +39,19 @@
 VIEWPORT_WIDTH = 875
 TILE_HEIGHT = 8192
 
+# GPU rasterization default OFF — headless Chrome falls back to software and ignores these
+# flags (no-op, never sped anything up), but on a GPU box without device access it crashes the
+# GPU process and hangs capture. Inherited-from-initial-release assumption that didn't hold.
+# Opt in with PIXELSHOT_ENABLE_GPU=1 only on a real graphics-GPU box with device access.
+_GPU_ARGS = (
+    ["--enable-gpu-rasterization", "--force-gpu-rasterization"]
+    if os.environ.get("PIXELSHOT_ENABLE_GPU")
+    else ["--disable-gpu"]
+)
 CHROME_ARGS = [
     "--no-sandbox",
     "--disable-dev-shm-usage",
-    "--enable-gpu-rasterization",
-    "--force-gpu-rasterization",
+    *_GPU_ARGS,
     "--disable-renderer-backgrounding",
     "--disable-backgrounding-occluded-windows",
     "--disable-background-networking",

From a8cad0745524fdc402f0e81c7f1ecd21c111ae0b Mon Sep 17 00:00:00 2001
From: Zhifei Li <andylizf@outlook.com>
Date: Fri, 26 Jun 2026 10:04:06 -0700
Subject: [PATCH 13/13] docs(render): note throughput is capture-only; measure
 with the bench harness

Clarify that the documented t/s excludes Chrome startup (the bench timer starts after
strategy.setup()). A hand-rolled end-to-end loop that counts the ~49s 48-worker startup reports
~13 t/s, which is the startup tax, not the capture rate.

Re-measured on the reference EPYC 7763 box with the harness: 130 t/s capture-only (200 maxi-ZIM
pages, 48w, raw).
---
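Note (not part of the commit): a back-of-envelope sketch of why the two numbers differ so much.
The 130 t/s capture rate and the ~49s startup are from the message above; the 700-tile workload
is an ASSUMED figure chosen for illustration, since the tile count of the 200-page run is not
stated. `end_to_end_rate` is a hypothetical helper.

```python
def end_to_end_rate(tiles: int, capture_rate: float, startup_s: float) -> float:
    """Tiles per second when worker startup is (wrongly) included in the timer."""
    capture_s = tiles / capture_rate       # time spent actually capturing
    return tiles / (startup_s + capture_s)  # startup dominates for short runs


# 130 t/s and 49 s come from the commit message; 700 tiles is an assumption.
naive = end_to_end_rate(tiles=700, capture_rate=130.0, startup_s=49.0)
print(f"{naive:.1f} t/s end-to-end vs 130.0 t/s capture-only")
```

Under that assumption the end-to-end figure lands near 13 t/s: the run spends roughly nine
times longer starting workers than capturing, so the naive number mostly measures startup.
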
 docs/screenshot-throughput-optimization.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/docs/screenshot-throughput-optimization.md b/docs/screenshot-throughput-optimization.md
index ef52be7..6c6b58d 100644
--- a/docs/screenshot-throughput-optimization.md
+++ b/docs/screenshot-throughput-optimization.md
@@ -141,6 +141,14 @@ sub-frames). None achieved 100% correct at 48 workers.
 
 ## Reproducing
 
+**Measure with the bench harness, not a hand-rolled loop.** `tiles_per_s` here is
+**capture-only**: the bench's timer starts *after* `strategy.setup()` brings the Chrome
+workers up. At 48 workers, that startup is ~49s of serial Chrome launches — it is setup
+cost, not throughput, so it must not be counted. A naive end-to-end loop that includes the
+48-worker startup reports ~13 t/s (the startup tax), which is not the capture rate.
+Re-measured on the reference box (EPYC 7763, 128c) with the harness below: **130 t/s
+capture-only** (200 maxi-ZIM pages, 48 workers, `fmt="raw"`).
+
 ```python
 from pixelrag_render.strategies.cdp_phased import CDPPhasedStrategy
 from pixelrag_render.bench import Bench