Improve deployment setup: requirements.txt, install.sh, Dockerfile, and README updates #41

Open: Hardik-369 wants to merge 3 commits into baidu:main from Hardik-369:improve/deployment-setup

@Hardik-369

Closes #27

This PR addresses several of the deployment blockers raised in #27.

Changes

  • requirements.txt — Pinned dependencies for both Transformers and SGLang inference paths. Uses --find-links wheel to resolve the custom SGLang wheel.
  • install.sh — Automated uv-based environment creation and dependency installation. Includes a prominent warning that the custom wheel is required.
  • Dockerfile — Container image based on nvidia/cuda:12.9.0-runtime-ubuntu24.04 with Python 3.12, the custom SGLang wheel, and all dependencies.
  • README updates — Added a prominent warning about the custom SGLang wheel, a Quick Start section, and Docker usage examples.

…el warning

Addresses multiple deployment blockers from issue baidu#27:

- Add requirements.txt with all pinned dependencies (Transformers and SGLang paths)
- Add install.sh for automated uv-based environment setup
- Add Dockerfile for containerized deployment (CUDA 12.9 + Python 3.12)
- Update README with prominent warning about the custom SGLang wheel,
  quick-start section, and Docker usage instructions
Copilot AI review requested due to automatic review settings June 26, 2026 17:10

Copilot AI left a comment

Pull request overview

This PR improves deployment reproducibility and onboarding by adding pinned dependency installation, an automated setup script, containerization via Docker, and clearer README setup/run instructions—addressing the deployment blockers described in #27 (especially the custom SGLang wheel and lack of Docker support).

Changes:

  • Added a pinned requirements.txt that points pip to the bundled wheel/ directory for the custom SGLang wheel.
  • Introduced install.sh to automate environment creation and dependency installation using uv.
  • Added a CUDA 12.9 + Python 3.12 Dockerfile and expanded README with prerequisites, quick start, and Docker usage.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| requirements.txt | Adds a dependency set intended to be reproducible and to resolve the bundled SGLang wheel via --find-links. |
| install.sh | Automates venv creation + installs the bundled SGLang wheel and requirements using uv. |
| Dockerfile | Provides a CUDA runtime image that installs the custom wheel + dependencies and launches an SGLang server by default. |
| README.md | Documents prerequisites, quick start installation paths, and Docker build/run examples. |

Comment thread install.sh Outdated
Comment on lines +1 to +3
#!/usr/bin/env bash
set -euo pipefail

Comment thread Dockerfile Outdated
Comment on lines +42 to +43
ENV PATH="/app/.venv/bin:$PATH"
ENV CUDA_VISIBLE_DEVICES=0

Comment thread Dockerfile Outdated
Comment on lines +47 to +58
ENTRYPOINT ["python", "-m", "sglang.launch_server", \
"--model", "baidu/Unlimited-OCR", \
"--served-model-name", "Unlimited-OCR", \
"--attention-backend", "fa3", \
"--page-size", "1", \
"--mem-fraction-static", "0.8", \
"--context-length", "32768", \
"--enable-custom-logit-processor", \
"--disable-overlap-schedule", \
"--skip-server-warmup", \
"--host", "0.0.0.0", \
"--port", "10000"]

Comment thread README.md Outdated
Comment on lines +314 to +317
# Use a local model directory instead of downloading from Hugging Face
docker run --gpus all -p 10000:10000 \
-v /path/to/local-model:/model \
unlimited-ocr --model-dir /model

Comment thread README.md Outdated
Comment on lines +319 to +323
# Batch inference with infer.py (launches server automatically)
docker run --gpus all \
-v /path/to/images:/data \
-v /path/to/outputs:/app/outputs \
unlimited-ocr python infer.py --image-dir /data --output-dir /app/outputs

- install.sh: remove set -euo pipefail to avoid changing caller's shell state
- Dockerfile: remove ENV CUDA_VISIBLE_DEVICES=0 to let users control GPU at runtime
- Dockerfile: change ENTRYPOINT to CMD so infer.py override works
- README: fix --model-dir to --model in Docker examples
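
The ENTRYPOINT-to-CMD change matters because of how Docker assembles a container's command line from exec-form instructions and the arguments given to `docker run`. A minimal Python sketch of that rule follows; `effective_argv` is a hypothetical helper written for illustration, not part of Docker or this repository, and the server command is abbreviated.

```python
from typing import List, Optional


def effective_argv(
    entrypoint: Optional[List[str]],
    cmd: Optional[List[str]],
    run_args: List[str],
) -> List[str]:
    """Model how Docker combines exec-form ENTRYPOINT, CMD, and `docker run` arguments.

    Arguments given after the image name replace CMD. ENTRYPOINT, if set,
    always stays in front (the --entrypoint override is not modelled here).
    """
    tail = run_args if run_args else (cmd or [])
    return (entrypoint or []) + tail


# Abbreviated server command and the batch-inference override from the README.
server = ["python", "-m", "sglang.launch_server", "--model", "baidu/Unlimited-OCR"]
override = ["python", "infer.py"]

# With ENTRYPOINT, the override is appended to the server command as extra arguments.
with_entrypoint = effective_argv(server, None, override)
# With CMD, the override replaces the server command entirely.
with_cmd = effective_argv(None, server, override)

print(with_entrypoint)
print(with_cmd)
```

Under this model, only the CMD variant lets `docker run ... unlimited-ocr python infer.py ...` actually start infer.py instead of passing it to the server launcher as stray arguments.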

kushdab commented Jun 27, 2026

Solid direction -- the install.sh + requirements.txt combo is exactly what new users need, and the README warning about the custom SGLang wheel will save a lot of confusion. A few issues worth fixing before this lands:

Bugs

1. --trust-remote-code missing from Docker CMD

baidu/Unlimited-OCR ships custom model code in modeling_unlimitedocr.py. Without this flag, SGLang refuses to load it and exits immediately:

Error: Loading baidu/Unlimited-OCR requires --trust-remote-code

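Add it to the CMD, directly after the model arguments (exec-form CMD arrays cannot carry inline comments, so the new line is shown without one):

CMD ["python", "-m", "sglang.launch_server", \
    "--model", "baidu/Unlimited-OCR", \
    "--trust-remote-code", \
    ...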
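
A missing flag like this is easy to catch mechanically. The sketch below parses the last exec-form CMD out of a Dockerfile and checks for `--trust-remote-code`. It assumes the CMD is written as a JSON array with backslash continuations, as in this PR; `exec_form_cmd` and the sample Dockerfile text are hypothetical and abbreviated.

```python
import json
import re


def exec_form_cmd(dockerfile_text: str) -> list:
    """Return the last exec-form CMD as a list of strings, or [] if none is found."""
    # Join backslash-continued lines so a multi-line CMD becomes one logical line.
    joined = re.sub(r"\\[ \t]*\n", " ", dockerfile_text)
    argv = []
    for line in joined.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("CMD ") and "[" in stripped:
            # Exec form is a JSON array, so the standard JSON parser can read it.
            argv = json.loads(stripped[stripped.index("["):])
    return argv


# Hypothetical Dockerfile fragment, abbreviated from the flags quoted in this thread.
SAMPLE = '''\
FROM nvidia/cuda:12.9.0-runtime-ubuntu24.04
CMD ["python", "-m", "sglang.launch_server", \\
     "--model", "baidu/Unlimited-OCR", \\
     "--host", "0.0.0.0", \\
     "--port", "10000"]
'''

argv = exec_form_cmd(SAMPLE)
missing = "--trust-remote-code" not in argv
print("missing --trust-remote-code:", missing)
```

A check of this kind could run in CI before the image is built; it does not handle shell-form CMD lines.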

2. kernels==0.11.7 in requirements.txt causes the SGLang ValueError

The custom SGLang wheel bundles its own sgl_kernel. Installing kernels==0.11.7 alongside it downgrades sgl_kernel, which breaks SGLang's unlimited_ocr.py model registration at import time. SGLang swallows the import error and starts anyway, then crashes at inference with ValueError: UnlimitedOCRForCausalLM is not supported by SGLang. This is the root cause of issue #12. PR #34 removes this line from the README; the same fix applies here:

-kernels==0.11.7

This affects all three install paths: install.sh, direct pip install -r requirements.txt, and the Dockerfile.
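
Because the pin reaches all three install paths through requirements.txt, a small guard on that file covers them all. The following is a sketch only: `forbidden_pins` is a hypothetical helper, the sample file content is made up apart from the `--find-links wheel` line and the `kernels` pin, and the name parsing is deliberately simplistic.

```python
import re

# Packages that should not be listed next to the custom SGLang wheel,
# according to the review comment above.
FORBIDDEN = {"kernels"}


def forbidden_pins(requirements_text: str, forbidden=FORBIDDEN) -> list:
    """Return requirement lines whose package name is in `forbidden`."""
    hits = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The package name ends at the first version, extras, or marker character.
        name = re.split(r"[<>=!~\[;\s]", line, maxsplit=1)[0]
        if name.lower().replace("_", "-") in forbidden:
            hits.append(line)
    return hits


# Hypothetical requirements.txt; "example-package" is a placeholder name.
SAMPLE = """\
--find-links wheel
example-package==1.0.0
kernels==0.11.7  # flagged in review
"""

print(forbidden_pins(SAMPLE))
```

The match is on the exact package name, so a differently named package that merely contains "kernel" is not flagged.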

3. --image-dir / --output-dir in README Docker examples should be --image_dir / --output_dir

infer.py uses underscore-delimited argparse flags. The Docker README examples use hyphens, which will produce unrecognized arguments errors:

# As written (broken):
docker run ... unlimited-ocr python infer.py --image-dir /data --output-dir /app/outputs

# Fix:
docker run ... unlimited-ocr python infer.py --image_dir /data --output_dir /app/outputs
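
The failure can be reproduced without Docker. The sketch below builds a hypothetical parser that mirrors only the two flag names cited here (the real infer.py parser is not shown in this thread and may differ) and uses `parse_known_args` so the leftover tokens can be inspected instead of exiting.

```python
import argparse

# Hypothetical parser mirroring the flag names cited in the review;
# the real infer.py may define more options or different defaults.
parser = argparse.ArgumentParser(prog="infer.py")
parser.add_argument("--image_dir")
parser.add_argument("--output_dir")

# Underscore spelling: both values are parsed and nothing is left over.
ok, ok_extra = parser.parse_known_args(
    ["--image_dir", "/data", "--output_dir", "/app/outputs"]
)

# Hyphen spelling: argparse does not treat "-" and "_" as interchangeable in
# option strings, so nothing is parsed and every token is left over.
bad, bad_extra = parser.parse_known_args(
    ["--image-dir", "/data", "--output-dir", "/app/outputs"]
)

print(ok.image_dir, ok.output_dir, ok_extra)
print(bad.image_dir, bad.output_dir, bad_extra)
```

With the usual `parse_args()` call, those leftover tokens are what argparse reports as unrecognized arguments before exiting with a usage error.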

Security / image quality

4. Container runs as root

The Dockerfile has no useradd/USER directive, so the container runs as root. This is a security concern for production deployments. Adding a non-root user is straightforward:

RUN groupadd --gid 1000 ocr && \
    useradd --uid 1000 --gid 1000 --create-home --shell /bin/bash ocr && \
    chown -R ocr:ocr /app
USER ocr
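
One way to confirm that the USER directive took effect is a startup guard in the application itself. This is a hypothetical addition, not something present in the repository: `running_as_root` and `check_not_root` are illustrative names, and `os.geteuid()` is available on Unix only, which matches a Linux container.

```python
import os


def running_as_root(euid: int) -> bool:
    """Return True when the effective user ID is 0 (root)."""
    return euid == 0


def check_not_root() -> None:
    """Exit early if the process runs as root (Unix only)."""
    if running_as_root(os.geteuid()):
        raise SystemExit("refusing to run as root; add a USER directive to the image")


# UID 0 is root; UID 1000 is the ocr user created in the snippet above.
print(running_as_root(0), running_as_root(1000))
```

Whether to fail hard or only log a warning is a design choice; a hard exit is stricter but would also block users who deliberately run the container as root.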

5. COPY assets/ assets/ adds documentation images to the runtime image

The assets/ folder contains the README GIF and other images used only for documentation. Copying them into the container adds unnecessary size. Either drop the COPY assets/ line (nothing in infer.py reads from assets/ at runtime) or add a .dockerignore to keep the build context clean.

A minimal .dockerignore:

.git/
__pycache__/
*.py[cod]
.venv/
outputs/
log/
assets/
*.pdf
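
To see which paths such a file would drop from the build context, the patterns can be approximated in a few lines. This sketch assumes Docker's documented behaviour that patterns are matched relative to the context root and that excluding a directory excludes its contents; it ignores `**` wildcards and `!` exceptions, and Python's `fnmatch` is only an approximation of Docker's matcher. `is_ignored` and the sample path `assets/demo.gif` are hypothetical.

```python
from fnmatch import fnmatchcase


def is_ignored(path: str, patterns: list) -> bool:
    """Approximate .dockerignore matching for simple patterns.

    Patterns are matched component by component from the root of the build
    context, and a match on a directory also excludes everything below it.
    `**` wildcards and `!` exceptions are deliberately not handled.
    """
    parts = path.strip("/").split("/")
    for pattern in patterns:
        pat_parts = pattern.strip("/").split("/")
        if len(pat_parts) > len(parts):
            continue
        # zip stops at the pattern's length, so a directory match covers its children.
        if all(fnmatchcase(p, q) for p, q in zip(parts, pat_parts)):
            return True
    return False


# The patterns from the .dockerignore suggested above.
PATTERNS = [".git/", "__pycache__/", "*.py[cod]", ".venv/",
            "outputs/", "log/", "assets/", "*.pdf"]

print(is_ignored("assets/demo.gif", PATTERNS))
print(is_ignored("infer.py", PATTERNS))
```

Under these assumptions, un-anchored patterns such as `*.pdf` only apply at the top level of the context; a `**/` prefix would be the way to reach nested directories.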

Comparison with PR #39

Both PRs address the same deployment need (#27). Key differences:

| | PR #39 (marcelMaier) | PR #41 (this PR) |
| --- | --- | --- |
| Base image | devel (wrong, ~7.5 GB) | runtime (correct, ~3.5 GB) |
| Multi-stage build | Yes | No |
| Non-root user | Yes | No |
| install.sh | No | Yes |
| requirements.txt | No | Yes |
| --trust-remote-code in CMD | Missing | Missing |
| kernels==0.11.7 | Present (wrong) | Present (wrong) |

This PR's nvidia/cuda:12.9.0-runtime-ubuntu24.04 base is the correct choice (PR #39 mistakenly uses devel). If the maintainers want to merge one Docker implementation, cherry-picking the runtime base, install.sh, and requirements.txt from this PR while adopting the multi-stage build and non-root user from PR #39 would give the best combined result.

What's done well

  • nvidia/cuda:12.9.0-runtime-ubuntu24.04 (runtime, not devel) is the right base -- cuts final image size roughly in half vs. PR #39
  • Layer ordering is cache-optimized: wheel first (changes rarely), requirements second, application code last
  • --find-links wheel in requirements.txt cleanly handles the local wheel without manual path instructions
  • install.sh is the clearest new-user onboarding addition in any open PR right now -- the prominent warning about the custom wheel alone will close several support issues
  • Quick Start section in README is well-structured and covers both uv and manual pip paths

- Add --trust-remote-code to Docker CMD (model uses custom code)
- Remove kernels==0.11.7 from requirements.txt (conflicts with SGLang wheel)
- Switch Dockerfile to non-root user (ocr:ocr)
- Drop COPY assets/ from Dockerfile and add .dockerignore
- Fix --image-dir/--output-dir to --image_dir/--output_dir in all docs

Development

Successfully merging this pull request may close these issues.

High-level deployment blockers for community adoption
