devenv

Containerized dev environment for compressed-tensors, llm-compressor, speculators, and vLLM.

Why

On shared GPU servers, keeping a consistent dev setup across the team is hard: Python versions differ, dependencies conflict, and system packages break. This repo provides a reproducible, pre-built container with everything needed to develop and test across all four repos. It starts with one command, needs no manual install steps, and keeps your changes from breaking the host.

Quick start

# Run the onboarding script (auto-detects bare-metal vs cluster)
git clone https://github.com/neuralmagic/devenv.git ~/devenv
cd ~/devenv
./setup.sh           # clones repos, sets up auth, creates PVCs (cluster), adds shell aliases
source ~/.bashrc     # loads aliases, env vars
devenv               # pulls image, starts container, attaches tmux

Container

Based on vllm/vllm-openai:latest, with compressed-tensors, llm-compressor, speculators, and vLLM installed. Also includes Python, torch, transformers, pytest, ruff, Claude Code, VS Code CLI, gcloud CLI, gh CLI, uv, and tmux.

Usage

The launch script auto-detects the environment (bare-metal vs OpenShift cluster).

devenv                             # start container + attach tmux
devenv --down                      # stop/delete container
devenv --restart                   # restart with latest image
devenv --gpus 4                    # request specific GPU count (cluster mode)
devenv --gpu-type h100             # target specific GPU type (cluster mode)
devenv --name exp1                 # named instance (for multiple pods)
devenv --fast-storage              # mount tier 1 NVMe at /data (cluster mode)
devenv-status                      # show pods, GPU availability, PVCs (cluster mode)

Re-running devenv after an SSH disconnect reattaches to the existing tmux session.
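
The reattach behavior can be pictured as a check for an existing tmux session before creating a new one. This is a minimal sketch, not the launch script itself; the function name attach_or_create and the default session name devenv are assumptions made for illustration.

```shell
# Hypothetical helper: attach to a tmux session if it exists, else create it.
# The default session name "devenv" is an assumption, not taken from launch.sh.
attach_or_create() {
    local session="${1:-devenv}"
    if tmux has-session -t "$session" 2>/dev/null; then
        # The session survived the disconnect: reattach to it.
        tmux attach-session -t "$session"
    else
        # No session yet: start a fresh one under that name.
        tmux new-session -s "$session"
    fi
}
```

tmux can also do this in one step with `tmux new-session -A -s <name>`, which attaches when the named session already exists.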

Per-repo venvs

Each repo gets its own venv (with --system-site-packages to inherit torch/CUDA from the base image). This isolates conflicting deps like transformers across repos.

use speculators       # activates venv + cd into repo
use vllm
use llm-compressor
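
As a rough picture of what a helper like use might do, the sketch below creates the venv on first use, activates it, and changes into the repo. The name use_repo and the WORKSPACE_ROOT variable are hypothetical; the real alias installed by setup.sh may work differently.

```shell
# Hypothetical stand-in for the `use` helper; the real one may differ.
WORKSPACE_ROOT="${WORKSPACE_ROOT:-/workspace}"

use_repo() {
    local repo="${1:-}"
    local dir="$WORKSPACE_ROOT/$repo"
    if [ -z "$repo" ] || [ ! -d "$dir" ]; then
        echo "use_repo: no such repo: $dir" >&2
        return 1
    fi
    # First use: create a venv that can still see torch/CUDA from the base image.
    if [ ! -f "$dir/.venv/bin/activate" ]; then
        python3 -m venv --system-site-packages "$dir/.venv" || return 1
    fi
    # Activate the venv in the current shell, then enter the repo.
    . "$dir/.venv/bin/activate" && cd "$dir"
}
```

Because it sources the activate script, the function has to run in the current shell (defined in ~/.bashrc, for example) rather than as a separate script.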

Venvs live at /workspace/<repo>/.venv and persist on the PVC. They are rebuilt only when pyproject.toml or setup.py changes.
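
One way to implement a rebuild-on-change rule is to store a hash of the dependency manifests next to the venv and compare it on each start. The sketch below assumes that approach; the stamp file name .deps-hash and all three function names are hypothetical, and the actual entrypoint may detect changes differently.

```shell
# Hash the dependency manifests of a repo; missing files are skipped.
deps_hash() {
    local f
    for f in "$1/pyproject.toml" "$1/setup.py"; do
        if [ -f "$f" ]; then cat "$f"; fi
    done | sha256sum | cut -d' ' -f1
}

# Succeeds (exit 0) when the venv has no stamp or the manifests changed.
# The stamp file name .deps-hash is hypothetical.
venv_is_stale() {
    local stamp="$1/.venv/.deps-hash"
    [ ! -f "$stamp" ] || [ "$(cat "$stamp")" != "$(deps_hash "$1")" ]
}

# Record the current hash after a successful (re)build.
mark_venv_fresh() {
    mkdir -p "$1/.venv" && deps_hash "$1" > "$1/.venv/.deps-hash"
}
```

A caller would rebuild only when `venv_is_stale /workspace/<repo>` succeeds, then call `mark_venv_fresh` on the same path.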

Bare-metal setup

Prerequisites

Docker (or Podman) with NVIDIA GPU support
NVIDIA GPUs with CDI configured (nvidia.com/gpu=all)
Repos cloned at ~/repos/{compressed-tensors,llm-compressor,speculators,vllm}

Repos are bind-mounted from ~/repos/ into /workspace/ — edits on the host appear instantly in the container.
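
To make the prerequisites concrete, the sketch below prints (without running) a container command that requests all GPUs through CDI and bind-mounts each repo into /workspace. It is an illustration only: the function name, the DEVENV_IMAGE placeholder, and the 8000:8000 port mapping are assumptions, and the real launch script may pass different options.

```shell
# Print (not run) a container command with CDI GPUs and repo bind mounts.
# Assumptions: DEVENV_IMAGE is a placeholder image reference, and the
# 8000:8000 port mapping is a guess at how the server port is exposed.
print_run_cmd() {
    local engine="${1:-docker}"
    local repos_dir="${2:-$HOME/repos}"
    local image="${DEVENV_IMAGE:-<devenv-image>}"
    local cmd="$engine run --rm -it --device nvidia.com/gpu=all -p 8000:8000"
    local repo
    for repo in compressed-tensors llm-compressor speculators vllm; do
        # Host checkout -> /workspace/<repo>, so host edits show up immediately.
        cmd="$cmd -v $repos_dir/$repo:/workspace/$repo"
    done
    echo "$cmd $image"
}
```

The output is a plain string, so paths containing spaces would need extra quoting. If the NVIDIA Container Toolkit is installed, `nvidia-ctk cdi list` on the host is one way to confirm that nvidia.com/gpu=all is registered.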

Auth

Authenticate on the host or inside the container — gcloud and gh configs are bind-mounted and shared between both:

gcloud auth login
gcloud config set project itpc-gcp-ai-eng-claude
gcloud auth application-default login
gcloud auth application-default set-quota-project cloudability-it-gemini
gh auth login
hf auth login

vLLM server

The container does not auto-start a server — start it manually:

use vllm
python -m vllm.entrypoints.openai.api_server --model <model> --port 8000

The server is reachable at localhost:8000 (also exposed to the host).
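
Model loading can take a while, so it helps to wait until the server answers before sending requests. The sketch below polls a URL with curl; the function name wait_for_server and its defaults are assumptions, while /health and /v1/models are endpoints of vLLM's OpenAI-compatible server.

```shell
# Poll a URL until it answers with a 2xx status or the attempts run out.
# Arguments: URL, number of attempts, delay between attempts in seconds.
wait_for_server() {
    local url="${1:-http://localhost:8000/health}"
    local tries="${2:-60}"
    local delay="${3:-2}"
    local i=0
    while [ "$i" -lt "$tries" ]; do
        # -f turns HTTP errors into a non-zero exit; -s keeps curl quiet.
        if curl -sf --max-time 2 -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, `wait_for_server && curl -s http://localhost:8000/v1/models` lists the served model once the server is up.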

OpenShift cluster setup

Prerequisites

oc CLI, logged into the cluster
Namespace: machine-learning

First-time setup

Run ./setup.sh on the bastion — it handles oc login, PVC creation, and cloning repos onto the PVC.

Daily use

devenv --gpus 2                    # auto-detects cluster when oc is available
devenv --gpu-type h100             # target H100 nodes
devenv --down                      # delete the pod
devenv-status                      # show pods, GPU availability, PVCs
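
The cluster auto-detection mentioned above can be approximated by checking for a usable oc login. This is a sketch under that assumption; the function name detect_mode is hypothetical and the real launch script may use other signals.

```shell
# Hypothetical detection: "cluster" when oc is installed and logged in,
# otherwise "bare-metal". The real launch script may use other signals.
detect_mode() {
    if command -v oc >/dev/null 2>&1 && oc whoami >/dev/null 2>&1; then
        echo cluster
    else
        echo bare-metal
    fi
}
```

Checking `oc whoami` in addition to the binary avoids treating a machine with oc installed but no active login as a cluster.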

Auth persistence

Auth credentials (gcloud, gh, huggingface, Claude Code) and SSH keys persist across pod restarts via a config PVC mounted at /root/.config, /root/.claude, and /root/.ssh. Authenticate once inside the pod:

gcloud auth login
gcloud config set project itpc-gcp-ai-eng-claude
gcloud auth application-default login
gcloud auth application-default set-quota-project cloudability-it-gemini
gh auth login
hf auth login
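
After a pod restart, a quick check can confirm that the persisted credentials are still picked up. The sketch below is one possible way to do that and is not part of the repo; check_auth and check_all_auth are hypothetical names.

```shell
# Run a command quietly and report whether it succeeded.
# A missing tool is reported the same way as missing credentials.
check_auth() {
    local name="$1"
    shift
    if "$@" >/dev/null 2>&1; then
        echo "$name: ok"
    else
        echo "$name: not authenticated"
    fi
}

# Probe gcloud and gh; both commands fail without valid credentials.
check_all_auth() {
    check_auth gcloud gcloud auth print-access-token
    check_auth gh gh auth status
}
```

Run `check_all_auth` inside the pod to see one status line per tool. hf is left out because this sketch does not assume that the exit status of `hf auth whoami` reflects the login state.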

Jira skills (optional)

After authenticating with gh, clone mle-jira and link the skills:

git clone https://github.com/neuralmagic/mle-jira.git /workspace/mle-jira
mkdir -p ~/.claude/skills
ln -sfn /workspace/mle-jira/skills/* ~/.claude/skills/
claude mcp add --transport http atlassian-mcp-server https://mcp.atlassian.com/v1/mcp

This setup persists across pod restarts via the workspace and config PVCs.

VS Code remote editing

A VS Code tunnel starts automatically on pod startup. On first use, authenticate via:

tmux switch -t tunnel    # follow the GitHub auth link

After that, connect from VS Code on your laptop using the Remote - Tunnels extension.

Multiple instances

Use --name to run multiple independent pods. Each instance gets its own repos PVC (auto-created with cloned repos on first launch), while hf-cache, pip-cache, and config remain shared.

devenv --name exp1 --gpus 2
devenv --name exp2 --gpus 1    # separate repos PVC

To tear down, pass the same --name:

devenv --name exp1 --down      # deletes pod, keeps repos PVC
oc delete pvc devenv-workspace-$USER-exp1 -n machine-learning   # delete repos PVC
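
When scripting cleanup across several instances, it can help to derive the PVC name instead of typing it. The sketch below follows the devenv-workspace-$USER-<name> pattern from the command above; the function name is hypothetical, and the name used for an unnamed instance (no suffix) is an assumption.

```shell
# Build the repos PVC name for an instance.
# Named instances follow devenv-workspace-<user>-<name>, as in the oc command.
# Assumption: an unnamed instance drops the trailing -<name> suffix.
workspace_pvc_name() {
    local name="${1:-}"
    local user="${2:-${USER:-$(id -un)}}"
    if [ -n "$name" ]; then
        echo "devenv-workspace-$user-$name"
    else
        echo "devenv-workspace-$user"
    fi
}
```

For example, `oc get pvc "$(workspace_pvc_name exp1)" -n machine-learning` checks that the PVC exists before deleting it.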
