Improve deployment setup: requirements.txt, install.sh, Dockerfile, and README updates by Hardik-369 · Pull Request #41 · baidu/Unlimited-OCR

Hardik-369 · 2026-06-26T17:10:00Z

Closes #27

This PR addresses several of the deployment blockers raised in #27.

Changes

requirements.txt — Pinned dependencies for both Transformers and SGLang inference paths. Uses --find-links wheel to resolve the custom SGLang wheel.
install.sh — Automated uv-based environment creation and dependency installation. Includes a prominent warning that the custom wheel is required.
Dockerfile — Container image based on nvidia/cuda:12.9.0-runtime-ubuntu24.04 with Python 3.12, the custom SGLang wheel, and all dependencies.
README updates — Added a prominent warning about the custom SGLang wheel, a Quick Start section, and Docker usage examples.

…el warning

Addresses multiple deployment blockers from issue baidu#27:

- Add requirements.txt with all pinned dependencies (Transformers and SGLang paths)
- Add install.sh for automated uv-based environment setup
- Add Dockerfile for containerized deployment (CUDA 12.9 + Python 3.12)
- Update README with prominent warning about the custom SGLang wheel, quick-start section, and Docker usage instructions

Copilot

Pull request overview

This PR improves deployment reproducibility and onboarding by adding pinned dependency installation, an automated setup script, containerization via Docker, and clearer README setup/run instructions—addressing the deployment blockers described in #27 (especially the custom SGLang wheel and lack of Docker support).

Changes:

Added a pinned requirements.txt that points pip to the bundled wheel/ directory for the custom SGLang wheel.
Introduced install.sh to automate environment creation and dependency installation using uv.
Added a CUDA 12.9 + Python 3.12 Dockerfile and expanded README with prerequisites, quick start, and Docker usage.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 5 comments.

File	Description
requirements.txt	Adds a dependency set intended to be reproducible and to resolve the bundled SGLang wheel via `--find-links`.
install.sh	Automates venv creation + installs the bundled SGLang wheel and requirements using `uv`.
Dockerfile	Provides a CUDA runtime image that installs the custom wheel + dependencies and launches an SGLang server by default.
README.md	Documents prerequisites, quick start installation paths, and Docker build/run examples.


+#!/usr/bin/env bash
+set -euo pipefail
+


+ENV PATH="/app/.venv/bin:$PATH"
+ENV CUDA_VISIBLE_DEVICES=0


+ENTRYPOINT ["python", "-m", "sglang.launch_server", \
+    "--model", "baidu/Unlimited-OCR", \
+    "--served-model-name", "Unlimited-OCR", \
+    "--attention-backend", "fa3", \
+    "--page-size", "1", \
+    "--mem-fraction-static", "0.8", \
+    "--context-length", "32768", \
+    "--enable-custom-logit-processor", \
+    "--disable-overlap-schedule", \
+    "--skip-server-warmup", \
+    "--host", "0.0.0.0", \
+    "--port", "10000"]


+# Use a local model directory instead of downloading from Hugging Face
+docker run --gpus all -p 10000:10000 \
+    -v /path/to/local-model:/model \
+    unlimited-ocr --model-dir /model


+# Batch inference with infer.py (launches server automatically)
+docker run --gpus all \
+    -v /path/to/images:/data \
+    -v /path/to/outputs:/app/outputs \
+    unlimited-ocr python infer.py --image-dir /data --output-dir /app/outputs


- install.sh: remove set -euo pipefail to avoid changing caller's shell state
- Dockerfile: remove ENV CUDA_VISIBLE_DEVICES=0 to let users control GPU at runtime
- Dockerfile: change ENTRYPOINT to CMD so infer.py override works
- README: fix --model-dir to --model in Docker examples
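
The ENTRYPOINT-to-CMD change matters because of how Docker assembles the final command: arguments given after the image name in docker run are appended to an exec-form ENTRYPOINT but replace CMD. The sketch below models only that rule; `container_argv` is a hypothetical helper for illustration, not part of Docker or this repository, and the server command is shortened.

```python
def container_argv(entrypoint, cmd, run_args):
    """Approximate the argv Docker executes for exec-form ENTRYPOINT/CMD.

    Arguments passed after the image name replace CMD, but they are
    appended to ENTRYPOINT rather than replacing it.
    """
    return list(entrypoint) + list(run_args if run_args else cmd)


# Shortened server command, for illustration only.
server = ["python", "-m", "sglang.launch_server", "--model", "baidu/Unlimited-OCR"]
override = ["python", "infer.py", "--image_dir", "/data"]

# With ENTRYPOINT, the override is appended to the server command.
with_entrypoint = container_argv(server, [], override)

# With CMD, the override replaces the server command entirely.
with_cmd = container_argv([], server, override)

print(with_entrypoint)
print(with_cmd)
```

With ENTRYPOINT, the batch-inference example would have launched the server with `python infer.py ...` as extra arguments; with CMD, infer.py runs on its own.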

kushdab · 2026-06-27T00:02:03Z

Solid direction -- the install.sh + requirements.txt combo is exactly what new users need, and the README warning about the custom SGLang wheel will save a lot of confusion. A few issues worth fixing before this lands:

Bugs

1. `--trust-remote-code` missing from Docker `CMD`

baidu/Unlimited-OCR ships custom model code in modeling_unlimitedocr.py. Without this flag, SGLang refuses to load it and exits immediately:

Error: Loading baidu/Unlimited-OCR requires --trust-remote-code

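Add it to the CMD as a new "--trust-remote-code" line (Dockerfiles do not allow a trailing comment on a continued line, so none is shown here):

CMD ["python", "-m", "sglang.launch_server", \
    "--model", "baidu/Unlimited-OCR", \
    "--trust-remote-code", \
    ...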
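
A check like the following could catch a missing flag before an image is built. It is a minimal sketch that only handles exec-form (JSON array) instructions; `exec_form_args` is a hypothetical helper, and the sample Dockerfile text is abbreviated for illustration rather than copied from the PR.

```python
import json


def exec_form_args(dockerfile_text, instruction="CMD"):
    """Return the JSON-array arguments of the last exec-form `instruction`.

    Returns None when the instruction is absent or written in shell form.
    """
    # Join backslash-continued lines so each instruction sits on one line.
    joined = dockerfile_text.replace("\\\n", " ")
    args = None
    for line in joined.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith(instruction + " "):
            payload = stripped[len(instruction):].strip()
            if payload.startswith("["):
                args = json.loads(payload)
    return args


# Abbreviated sample, not the full Dockerfile from this PR.
sample = '''\
FROM nvidia/cuda:12.9.0-runtime-ubuntu24.04
CMD ["python", "-m", "sglang.launch_server", \\
    "--model", "baidu/Unlimited-OCR", \\
    "--trust-remote-code"]
'''

cmd_args = exec_form_args(sample)
print("--trust-remote-code" in cmd_args)
```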

2. `kernels==0.11.7` in `requirements.txt` causes the SGLang ValueError

The custom SGLang wheel bundles its own sgl_kernel. Installing kernels==0.11.7 alongside it downgrades sgl_kernel, which breaks SGLang's unlimited_ocr.py model registration at import time. SGLang swallows the import error and starts anyway, then crashes at inference with ValueError: UnlimitedOCRForCausalLM is not supported by SGLang (the root cause of issue #12). PR #34 removes this line from the README; the same fix applies here:

-kernels==0.11.7

This affects all three install paths: install.sh, direct pip install -r requirements.txt, and the Dockerfile.
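
Because the pin reaches all three install paths through the same file, a small guard over requirements.txt may be enough to keep it from coming back. The sketch below is an assumption-laden illustration: `find_pins` is a hypothetical helper, the sample text is invented for the example (including the `example-kernels` line), and the comment handling is deliberately naive.

```python
import re


def find_pins(requirements_text, package):
    """Return requirement lines that name `package` (case-insensitive)."""
    # Match the package name only when followed by a specifier, extras,
    # an environment marker, or end of line, so "kernels-foo" does not match.
    pattern = re.compile(
        rf"^{re.escape(package)}\s*(?:[=<>!~\[;]|$)", re.IGNORECASE
    )
    hits = []
    for line in requirements_text.splitlines():
        stripped = line.split("#", 1)[0].strip()  # naive comment stripping
        if stripped and pattern.match(stripped):
            hits.append(stripped)
    return hits


# Invented sample; not the actual requirements.txt from this PR.
sample_requirements = """\
--find-links wheel
kernels==0.11.7
example-kernels==1.0  # hypothetical package, must not match
"""

offending = find_pins(sample_requirements, "kernels")
print(offending)
```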

3. `--image-dir` / `--output-dir` in README Docker examples should be `--image_dir` / `--output_dir`

infer.py uses underscore-delimited argparse flags. The Docker README examples use hyphens, which will produce unrecognized arguments errors:

# As written (broken):
docker run ... unlimited-ocr python infer.py --image-dir /data --output-dir /app/outputs

# Fix:
docker run ... unlimited-ocr python infer.py --image_dir /data --output_dir /app/outputs
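
The mismatch can be reproduced with a stand-in parser. This sketch assumes infer.py declares its options with underscores, as described above; the parser here is not the real one and may omit other options. argparse does not treat hyphens and underscores in option names as interchangeable, so the hyphenated spellings are left unrecognized.

```python
import argparse


def build_parser():
    # Stand-in for infer.py's parser; the real script may define more options.
    parser = argparse.ArgumentParser(prog="infer.py")
    parser.add_argument("--image_dir")
    parser.add_argument("--output_dir")
    return parser


parser = build_parser()

# parse_known_args collects unrecognized tokens instead of exiting,
# which makes the mismatch visible without raising SystemExit.
ok, ok_extra = parser.parse_known_args(
    ["--image_dir", "/data", "--output_dir", "/app/outputs"]
)
bad, bad_extra = parser.parse_known_args(
    ["--image-dir", "/data", "--output-dir", "/app/outputs"]
)

print(ok.image_dir, ok_extra)
print(bad.image_dir, bad_extra)
```

With plain parse_args, the second call would exit with an unrecognized arguments error instead of returning the leftover tokens.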

Security / image quality

4. Container runs as root

The Dockerfile has no useradd/USER directive, so the container runs as root. This is a security concern for production deployments. Adding a non-root user is straightforward:

RUN groupadd --gid 1000 ocr && \
    useradd --uid 1000 --gid 1000 --create-home --shell /bin/bash ocr && \
    chown -R ocr:ocr /app
USER ocr
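
As a complement to the USER directive, a startup script could warn when it still finds itself running as root, for example if someone overrides the user at run time. This is a minimal sketch; `running_as_root` is a hypothetical helper and nothing like it exists in the repository as far as this thread shows.

```python
import os


def running_as_root(euid=None):
    """Return True when the effective user ID is 0 (root)."""
    if euid is None:
        euid = os.geteuid()  # Unix-only; fine inside a Linux container
    return euid == 0


if running_as_root():
    print("warning: running as root; consider the non-root 'ocr' user")
```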

5. `COPY assets/ assets/` adds documentation images to the runtime image

The assets/ folder contains the README GIF and other images used only for documentation. Copying them into the container adds unnecessary size. Either drop the COPY assets/ line (nothing in infer.py reads from assets/ at runtime) or add a .dockerignore to keep the build context clean.

A minimal .dockerignore:

.git/
__pycache__/
*.py[cod]
.venv/
outputs/
log/
assets/
*.pdf

Comparison with PR #39

Both PRs address the same deployment need (#27). Key differences:

	PR #39 (marcelMaier)	PR #41 (this PR)
Base image	`devel` (wrong, ~7.5 GB)	`runtime` (correct, ~3.5 GB)
Multi-stage build	Yes	No
Non-root user	Yes	No
`install.sh`	No	Yes
`requirements.txt`	No	Yes
`--trust-remote-code` in CMD	Missing	Missing
`kernels==0.11.7`	Present (wrong)	Present (wrong)

This PR's nvidia/cuda:12.9.0-runtime-ubuntu24.04 base is the correct choice (PR #39 mistakenly uses devel). If the maintainers want to merge one Docker implementation, cherry-picking the runtime base, install.sh, and requirements.txt from this PR while adopting the multi-stage build and non-root user from PR #39 would give the best combined result.

What's done well

nvidia/cuda:12.9.0-runtime-ubuntu24.04 (runtime, not devel) is the right base -- cuts final image size roughly in half vs. PR #39
Layer ordering is cache-optimized: wheel first (changes rarely), requirements second, application code last
--find-links wheel in requirements.txt cleanly handles the local wheel without manual path instructions
install.sh is the clearest new-user onboarding addition in any open PR right now -- the prominent warning about the custom wheel alone will close several support issues
Quick Start section in README is well-structured and covers both uv and manual pip paths

- Add --trust-remote-code to Docker CMD (model uses custom code)
- Remove kernels==0.11.7 from requirements.txt (conflicts with SGLang wheel)
- Switch Dockerfile to non-root user (ocr:ocr)
- Drop COPY assets/ from Dockerfile and add .dockerignore
- Fix --image-dir/--output-dir to --image_dir/--output_dir in all docs

Copilot AI review requested due to automatic review settings June 26, 2026 17:10

Hardik-369 wants to merge 3 commits into baidu:main from Hardik-369:improve/deployment-setup