Offload traffic to static workers and use node as the proxy #13366
freddyaboulton wants to merge 45 commits into
🦄 change detected
This Pull Request includes changes to the following packages.
✅ Changeset approved by @freddyaboulton
import numpy as np

prediction_traces = [
Needed for benchmarking. The upload trace has to be kept separate because it doesn't include the rest of the lifecycle stages (e.g. pre/postprocess).
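A minimal sketch of why the upload trace is summarized on its own, assuming each trace is a dict of stage durations in milliseconds. The field names (`preprocess`, `fn`, `postprocess`, `total`), the sample values, and the `summarize` helper are hypothetical and not taken from the PR; the PR's profiling code imports numpy, while this sketch sticks to the standard library.

```python
from statistics import mean, median

# Hypothetical traces: predictions record every lifecycle stage,
# uploads only record a total duration.
prediction_traces = [
    {"preprocess": 2.0, "fn": 40.0, "postprocess": 3.0, "total": 45.0},
    {"preprocess": 4.0, "fn": 60.0, "postprocess": 5.0, "total": 69.0},
]
upload_traces = [{"total": 120.0}, {"total": 80.0}]


def summarize(traces, stages):
    """Return mean and median per stage, reading only the stages listed."""
    return {
        stage: {
            "mean": mean(t[stage] for t in traces),
            "median": median(t[stage] for t in traces),
        }
        for stage in stages
    }


# Each group is summarized with the stages it actually has, so uploads
# never need placeholder values for preprocess/postprocess.
prediction_summary = summarize(
    prediction_traces, ["preprocess", "fn", "postprocess", "total"]
)
upload_summary = summarize(upload_traces, ["total"])
```

Mixing both groups into one list would either fail on the missing keys or force placeholder values that skew the per-stage statistics.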
import functools
import hashlib
import hmac
import importlib.resources
There was a lot of duplicated code between server.py and static_server.py. I consolidated the overlap into util functions here. They cover file I/O, such as serving assets from the various build dirs.
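As an illustration of the kind of shared file helper described here, the sketch below resolves a requested asset path inside a build directory and rejects anything that escapes it. `resolve_static_path` is a hypothetical name and not one of the PR's actual util functions.

```python
from pathlib import Path
from typing import Optional


def resolve_static_path(build_dir: Path, requested: str) -> Optional[Path]:
    """Return the file under build_dir for `requested`, or None if unsafe or missing.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    root = build_dir.resolve()
    # Treat the request as relative to the build dir, then normalize it.
    candidate = (root / requested.lstrip("/")).resolve()
    # Reject anything that escapes the build directory (e.g. "../secret").
    if candidate != root and root not in candidate.parents:
        return None
    return candidate if candidate.is_file() else None
```

Keeping a check like this in one place means the main server and the static workers cannot drift apart in how they validate paths.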
@@ -0,0 +1,101 @@
+import http from "node:http";
This is the node proxy. It gets copied to the build directory to replace the index.js that SvelteKit generates. The handler is also generated by SvelteKit, and it's how we serve the SvelteKit page.
  "ts:check": "svelte-check --tsconfig tsconfig.json --threshold error",
  "test": "pnpm --filter @gradio/client build && vitest dev --config .config/vitest.config.ts",
- "test:run": "pnpm --filter @gradio/client build && vitest run --config .config/vitest.config.ts",
+ "test:run": "pnpm --filter @gradio/client build && vitest run --config .config/vitest.config.ts && node --test js/app/proxy_routes.test.js",
Added the proxy unit tests to the test:run command.
@@ -1,24 +0,0 @@
-"""Audio generation app — exercises file caching and downloading on output."""
Moving everything here to gradio-app/hf-perftest repo
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit 483df89.
    max_fields=1000,
    max_file_size=max_file_size,
    upload_id=upload_id,
)
Upload progress tracking completely broken after refactoring
High Severity
The refactored upload_fn does not pass upload_progress to GradioMultiPartParser, and the calling code in routes.py no longer calls file_upload_statuses.track(upload_id) before invoking it. The GradioMultiPartParser uses upload_progress to report chunk-level progress, and file_upload_statuses.track() is required for the SSE-based /upload_progress endpoint to work. This breaks client-side upload progress bars entirely.
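The ordering this report describes can be sketched with toy stand-ins. `UploadStatusTracker`, `parse_upload`, and `handle_upload` below are hypothetical and only mimic the roles of `file_upload_statuses`, `GradioMultiPartParser`, and the route handler; they are not gradio's actual classes or signatures.

```python
class UploadStatusTracker:
    """Hypothetical stand-in for the tracker behind a progress endpoint."""

    def __init__(self):
        self._events = {}

    def track(self, upload_id):
        # Register the upload so progress events have somewhere to go.
        self._events[upload_id] = []

    def append(self, upload_id, filename, chunk_size):
        self._events[upload_id].append((filename, chunk_size))

    def events(self, upload_id):
        return list(self._events[upload_id])


def parse_upload(chunks, upload_id=None, upload_progress=None):
    """Toy parser: consumes (filename, bytes) chunks and reports each one."""
    total = 0
    for filename, data in chunks:
        total += len(data)
        # Without upload_progress, no chunk-level progress is ever reported.
        if upload_id and upload_progress is not None:
            upload_progress.append(upload_id, filename, len(data))
    return total


def handle_upload(chunks, upload_id, statuses):
    if upload_id:
        statuses.track(upload_id)  # must happen before parsing starts
    return parse_upload(chunks, upload_id=upload_id, upload_progress=statuses)
```

Dropping either the `track` call or the `upload_progress` argument leaves the progress stream with nothing to emit, which matches the failure mode reported above.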
except MultiPartException as exc:
    code = 413 if "maximum allowed size" in exc.message else 400
    return PlainTextResponse(exc.message, status_code=code)
return output_files
Static worker silently loses files on cross-filesystem uploads
Medium Severity
The static worker's upload handler calls upload_fn with force_move=False but discards the files_to_copy and locations return values (using _, _). When os.rename fails (e.g., cross-filesystem temp-to-upload move), files remain at the temp location while the returned paths point to destinations that were never written. Subsequent file downloads will 404.
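One possible way to avoid the stranded-file case is to fall back to a copy when the rename fails. This is a sketch under that assumption, not the fix adopted in the PR; `move_into_place` is a hypothetical helper, and its `rename` parameter exists only so the fallback path can be exercised without a second filesystem.

```python
import os
import shutil


def move_into_place(src, dest, rename=os.rename):
    """Move src to dest, copying when a plain rename is not possible.

    Hypothetical helper for illustration; not the PR's upload_fn.
    """
    parent = os.path.dirname(dest)
    if parent:
        os.makedirs(parent, exist_ok=True)
    try:
        rename(src, dest)
    except OSError:
        # os.rename fails across filesystems (errno EXDEV):
        # copy the bytes, then remove the temp file.
        shutil.copyfile(src, dest)
        os.remove(src)
    return dest
```

The alternative the report points at is for the worker to act on the `files_to_copy` and `locations` values it currently discards; either way, the returned path should only be handed back once the file exists at that location.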
Very cool @freddyaboulton! Will review this. Would it be possible to include benchmarks that compare an app built with |
@abidlabs I'm working on a FastAPI comparison now. The main thing is ensuring the comparison is apples-to-apples, since gradio does a lot of things FastAPI doesn't, like SSR and queueing.


Description
Similar to #13351 but using the built-in node server for ssr mode as the proxy
Architecture
flowchart TD
    Client[Client / Browser]
    Client --> Node["<b>node :7860</b><br/>reverse proxy"]
    Node -- "/upload, /file=, /static,<br/>/assets, /svelte, /favicon.ico" --> Pool
    Node -- "/queue/*, /api/*, SSE" --> Main
    Node -- "/, /_app/*" --> node
    subgraph Pool["Static Workers (round-robin)"]
        W1["Worker 1 :7862<br/>uploads, downloads, static assets"]
        W2["Worker 2 :7863<br/>uploads, downloads, static assets"]
    end
    Main["<b>Main Server :7861</b><br/>queue, SSE, API, session state, ML inference"]
    Pool --> FS
    Main --> FS
    FS[("Shared filesystem<br/>/tmp/gradio/")]
AI Disclosure
We encourage the use of AI tooling in creating PRs, but any non-trivial use of AI needs to be disclosed. E.g. if you used Claude to write a first draft, you should mention that. Trivial tab-completion doesn't need to be disclosed. You should self-review all PRs, especially if they were generated with AI.
🎯 PRs Should Target Issues
Before you create a PR, please check to see if there is an existing issue for this change. If not, please create an issue before you create this PR, unless the fix is very small.
Not adhering to this guideline will result in the PR being closed.
Testing and Formatting Your Code
PRs will only be merged if tests pass on CI. We recommend at least running the backend tests locally; please set up your Gradio environment locally and run the backend tests:
bash scripts/run_backend_tests.sh
Please run these bash scripts to automatically format your code:
bash scripts/format_backend.sh, and (if you made any changes to non-Python files) bash scripts/format_frontend.sh
Note
High Risk
High risk because it changes the SSR launch/network architecture (Node becomes the user-facing reverse proxy) and introduces multi-process static/upload serving, which can affect routing, ports, and file security behavior.
Overview
In SSR mode, switches to a new Node-as-front-proxy architecture: Node now binds the user-facing port and proxies requests to an internal Python port, with optional round-robin routing of static/upload/file routes to a background worker pool.
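The routing decision described above can be sketched as follows. The real proxy in this PR is a Node entrypoint; this Python version only illustrates the logic. The route prefixes come from the PR description and the ports from its architecture diagram, while the `Router` class, the fallback to the Python server when no workers are configured, and the treatment of unlisted paths are assumptions. SSE detection is left out.

```python
from itertools import cycle

# Prefixes listed in the PR description; "/file" covers both /file= and /file*.
STATIC_PREFIXES = ("/upload", "/file", "/static", "/assets", "/svelte", "/favicon.ico")
SVELTEKIT_PREFIXES = ("/_app/",)


class Router:
    """Hypothetical routing table for illustration; not the PR's proxy code."""

    def __init__(self, main_port, worker_ports):
        self.main_port = main_port
        # Round-robin over the static workers, if any were started.
        self._workers = cycle(worker_ports) if worker_ports else None

    def target_for(self, path):
        if path.startswith(STATIC_PREFIXES):
            if self._workers is None:
                # Assumption: without a worker pool, Python serves these routes.
                return ("python", self.main_port)
            return ("worker", next(self._workers))
        if path == "/" or path.startswith(SVELTEKIT_PREFIXES):
            # Handled by the SvelteKit handler inside the Node process.
            return ("sveltekit", None)
        # /queue/*, /api/* and, by assumption, everything else.
        return ("python", self.main_port)
```

For example, with `Router(7861, [7862, 7863])`, consecutive static requests alternate between the two workers while `/queue/join` always goes to port 7861.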
Adds num_workers/GRADIO_NUM_WORKERS to start per-process static servers (StaticWorkerPool) that handle /static, /assets, /svelte, /upload, and /file* traffic, and refactors shared file/upload helpers into route_utils so both the main app and workers reuse the same safe path/mimetype/range handling.
Updates the Node build/runtime to include a custom proxy entrypoint (via http-proxy) with routing logic + unit tests, adjusts SSR startup/health checks/logging and shutdown cleanup, expands server mode to propagate SSR/proxy env, and removes legacy Python→Node proxy middleware/caching. Also tweaks profiling summaries to separate upload timing from prediction traces and deletes the old benchmark tooling directory.
Reviewed by Cursor Bugbot for commit 6c71719. Bugbot is set up for automated code reviews on this repo.