Enclave

Self-hosted LLM infrastructure with OpenAI-compatible API. CPU-optimized. Source-available.

Product page · Wiki · Latest release · Changelog

Enclave runs LLMs on your hardware. OpenAI-compatible API, Ollama backend, zero cloud dependencies.

What's new — Architecture-aware orchestration (Phases 1–6): per-host detection of memory + deployment topology, four-tier keep_alive resolver with arch-detected defaults, scheduler facade with feasibility validation, and tick-based parallel DAG dispatch that uses the arch to decide what to run concurrently. Plus: installable Python wheel + sdist, mirrored Docker image on GHCR, Linux source tarball with SHA256/SHA512, n8n release-update workflow, and a curated Wiki seed. See the CHANGELOG for the full PR-by-PR detail.

What it does

OpenAI-compatible API — drop-in replacement. Point your existing code at localhost:8000
CPU-optimized inference — GGUF quantized models via Ollama. 7B at 40-50 tok/s, 13B at 25-30 tok/s
Model management — download, configure, and switch between 18+ models from the registry
Multi-agent workflows — YAML-defined step pipelines with role-based model selection
Web dashboard — monitor models, system health, and API status
macOS app — native desktop wrapper with setup wizard
No telemetry by default — no data leaves your machine unless you opt in. Error reporting is optional and operator-owned: it goes to your own sink, and redaction is mandatory (see docs/deployment/error-reporting.md). Inference needs no internet connection.

Quick start

Three paths — pick one:

macOS app (DMG) — for end users

Download Enclave.dmg from the latest release, or grab the rolling nightly build for the freshest master.
Open the DMG and drag Enclave.app to /Applications.
On first launch, macOS Gatekeeper will show a warning because the app is not currently signed or notarized. Bypass it once with:
```
xattr -dr com.apple.quarantine /Applications/Enclave.app
```
Then double-click Enclave in Launchpad.
The native window opens the first-run setup wizard (/setup) which installs Ollama if needed and pulls a starter model. After that you land on the dashboard.

Requirements: macOS 12.0 (Monterey) or later. ~6 GB free disk for the bundled runtime + a small starter model. Ollama is installed automatically by the wizard if missing.

Docker — any platform with Docker Desktop

For non-developers on Linux / Windows, or anyone who wants Enclave fully isolated in containers. No Python, no virtualenv, no manual Ollama install.

Install Docker Desktop (or Docker Engine on Linux) and make sure Docker is running (with Docker Desktop, the whale icon is visible in the menu bar or system tray).
Clone or download this repo, open a terminal in the project folder, and run:
```
./run.sh
```
The script verifies Docker, brings up the stack (ollama + api), pulls a small starter model on first run (llama3.2:3b, ~2 GB), and opens the dashboard in your browser.

Service	URL
Enclave SPA (the application)	`http://localhost:8000`
API docs	`http://localhost:8000/docs`
Open WebUI (opt-in)	`http://localhost:8081` (start with `docker compose -f docker-compose.yml -f docker-compose.webui.yml up -d`)

To stop, run ./stop.sh (data is preserved), or ./stop.sh --reset to wipe models and chat history.

Requirements: ~4 GB free RAM and ~3 GB free disk for the starter model. Pick a different starter with ENCLAVE_DEFAULT_MODEL=qwen2.5:3b ./run.sh.

Prefer to pull the published image directly? (Substitute <version> with the latest tag.)

```
# Docker Hub (canonical)
docker pull hankthebldrr/local-ai-platfrom:<version>

# GHCR mirror: same digest, no Hub account required
docker pull ghcr.io/hankthebldr/enclave:<version>
```

pip install — embed in an existing Python app

For developers who want to use the Enclave engine inside another Python service. Bundles the FastAPI app, workflow engine, RAG pipeline, and CLI dispatcher.

```
# From a GitHub Release asset (no PyPI required)
pip install https://github.com/hankthebldr/local-ai-platform/releases/download/v<version>/enclave-<version>-py3-none-any.whl

# Then run the API server with the same uvicorn settings the DMG uses:
enclave-api                 # starts FastAPI on 127.0.0.1:8000
enclave --help              # CLI dispatcher (chat, workflow, query, api)
```

You still need an Ollama runtime reachable at OLLAMA_URL (defaults to http://localhost:11434). The Python package does not install Ollama for you — see the Wiki › Deployment page for production setups.
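Before starting the API from a pip install, it can help to confirm that Ollama is actually reachable. The following is a minimal sketch, not part of the Enclave package: the helper names are hypothetical, and it assumes Ollama's standard `GET /api/tags` endpoint (which lists locally installed models) is available at the resolved URL.

```python
import os
import urllib.error
import urllib.request

DEFAULT_OLLAMA_URL = "http://localhost:11434"  # default stated in this README


def resolve_ollama_url(env=None):
    """Return OLLAMA_URL from the environment, or the default, without a trailing slash."""
    env = os.environ if env is None else env
    return (env.get("OLLAMA_URL") or DEFAULT_OLLAMA_URL).rstrip("/")


def ollama_is_reachable(url, timeout=3.0):
    """Return True if Ollama answers GET /api/tags with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False
```

Calling `ollama_is_reachable(resolve_ollama_url())` returns `False` when nothing is listening, which is a cheap signal to start Ollama before launching `enclave-api`.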

From source — for developers

```
# Install (creates ./venv, installs core+dev deps, sets up systemd unit on Linux)
./setup/install.sh

# Boot Ollama + API + auto-open the dashboard in your browser
./scripts/start.sh

# Or, on macOS, exercise the same native pywebview window the DMG ships
./scripts/start_desktop.sh

# Verify everything boots and every UX route renders
./scripts/verify_local.sh
```

API at http://localhost:8000 · Dashboard at http://localhost:8000/ · Docs at http://localhost:8000/docs · First-run wizard at http://localhost:8000/setup.
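For a quick check from Python that these routes respond, something like the sketch below may be enough. It is only a rough stand-in for the idea behind `verify_local.sh`, not a description of what that script does; the function names and the injectable `fetch` parameter are illustrative assumptions.

```python
import urllib.error
import urllib.request

ROUTES = ["/", "/docs", "/setup"]  # the routes listed above


def http_status(url, timeout=5.0):
    """Return the HTTP status code for a GET, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None


def probe_routes(base_url="http://localhost:8000", routes=ROUTES, fetch=http_status):
    """Map each route to True when it answers with HTTP 200."""
    base = base_url.rstrip("/")
    return {route: fetch(base + route) == 200 for route in routes}
```

With the server running, `probe_routes()` returns a dictionary keyed by route; any `False` entry points at a route that did not answer with HTTP 200.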

Models

```
# List available models
python models/download.py --list

# Download a model
python models/download.py dolphin-mixtral

# List installed
ollama list
```

Default quantization: Q4_K_M (best quality/speed balance). See MODELS.md for the full registry.

API usage

```
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

Compatible with any OpenAI SDK client.
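The same request can be issued from Python without any third-party dependency. This is a minimal sketch using only the standard library; it assumes the server is listening on the default address shown above and that a model named `mistral` is installed. The helper names are illustrative and not part of Enclave, and the response shape is assumed to follow the OpenAI chat completions format.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: default address from this README


def build_chat_request(prompt, model="mistral", base_url=BASE_URL):
    """Build (but do not send) a POST to the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, model="mistral", base_url=BASE_URL, timeout=120):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(prompt, model, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    # Assumed OpenAI-style shape: choices[0].message.content
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Hello"))` should print the model's reply. Splitting request construction from sending keeps the payload easy to inspect without touching the network.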

Code-level artifacts

What ships in this repo, and where to find it:

Surface	Path	Notes
FastAPI server (OpenAI-compatible)	api/main.py	16 routers under `api/routers/`, services under `api/services/`
Web dashboard + setup wizard	api/static/	Served at `/` and `/setup` by the FastAPI app
CLI chat / query / workflow	cli/	Rich-formatted; `python -m cli.chat`, `cli/workflow.py`
Multi-agent workflow engine	api/services/workflow_engine.py	YAML pipelines under workflows/
Custom agents (Gems)	agents/ + api/routers/agents.py	YAML-defined personas with pinned context
Model registry	models/download.py	18+ models — see MODELS.md
macOS desktop wrapper	desktop/app.py	pywebview window around the FastAPI server
DMG builder	scripts/build_mac.sh	Bundles a self-contained `.app` + dmg
Local dev scripts	scripts/	`start.sh`, `start_desktop.sh`, `verify_local.sh`, `status.sh`, `test.sh`

Build the DMG yourself

The same script CI uses on tag pushes:

```
brew install librsvg create-dmg     # one-time
./scripts/generate-icons.sh         # regenerate icns from SVG
./scripts/build_mac.sh              # produces dist/Enclave.app + dist/Enclave.dmg
open dist/Enclave.app               # smoke-test the bundle
```

The build script reads ENCLAVE_VERSION (or falls back to git describe) and stamps it into Info.plist. Override for a one-off custom build:

```
ENCLAVE_VERSION=v1.2.3-local ./scripts/build_mac.sh
```
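The version fallback described above can be pictured with a short Python sketch. The real logic lives in the shell build script, so this is only an illustration: the function names are hypothetical, and the `--tags --always` flags passed to `git describe` are an assumption rather than something taken from `build_mac.sh`.

```python
import os
import subprocess


def git_describe():
    """Ask git for a version string; return None outside a git checkout."""
    try:
        out = subprocess.run(
            ["git", "describe", "--tags", "--always"],  # flags are an assumption
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return out.stdout.strip() or None


def resolve_version(env=None, describe=git_describe):
    """Prefer ENCLAVE_VERSION; otherwise fall back to git describe."""
    env = os.environ if env is None else env
    return env.get("ENCLAVE_VERSION") or describe()
```

An explicit `ENCLAVE_VERSION` always wins, so a one-off local build never depends on the state of the checkout.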

Release pipeline

Trigger	Workflow	Artifact
Tag push `v*.*.*`	release.yml	Stable GitHub Release with signed DMG
Push to `master`	release.yml	Rolling `nightly` pre-release (replaced each merge)
PR / push to `master`	ci.yml	pytest + lint + macOS `.app` smoke build (boots and probes UX routes)
Tag push or release publish	pages.yml	Updates hankthebldr.github.io/local-ai-platform with the latest release version

Every master merge re-publishes a freshly smoke-tested DMG to the nightly release. Stable releases are cut by pushing a vX.Y.Z tag.

Hardware targets

Machine	RAM	Role	Throughput
Mac M4 Pro	48GB	Development	7B @ 50 tok/s
MS-01 (Ryzen 9 7945HX)	64GB	API serving	34B @ 12 tok/s
BD790i (Ryzen 9 7945HX)	96GB	Research / 70B-class workflows	70B @ 5 tok/s

The BD790i is the only host in the fleet that can exercise the full 1.3.0 MCP & Skills co-scheduler against 70B-class models and multi-GB MCP resident memory (RSS) at the same time. Bring-up and benchmark recipes are in docs/deployment/bd790i-testing.md.
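To turn the throughput column into a feel for response latency, a back-of-the-envelope calculation is enough. The sketch below uses only the figures from the table above; the function names are illustrative, and the estimate ignores prompt processing and model load time.

```python
# Throughput figures copied from the hardware table above (tokens per second).
THROUGHPUT_TOK_S = {
    ("Mac M4 Pro", "7B"): 50,
    ("MS-01", "34B"): 12,
    ("BD790i", "70B"): 5,
}


def generation_seconds(tokens, tok_per_s):
    """Approximate decode time for `tokens` output tokens at a steady rate."""
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    return tokens / tok_per_s


def estimate_all(tokens=500):
    """Estimate decode time on every host/model pair in the table."""
    return {
        host_model: generation_seconds(tokens, rate)
        for host_model, rate in THROUGHPUT_TOK_S.items()
    }
```

At the listed rates, a 500-token reply takes about 10 seconds for 7B on the M4 Pro, roughly 42 seconds for 34B on the MS-01, and 100 seconds for 70B on the BD790i.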

Documentation

The canonical operator-facing docs live on the GitHub Wiki (sourced from docs/wiki/ on every tag). Highlights:

Quickstart — first 60 seconds
Architecture — request flow, services, workflow engine, arch-aware dispatch
Workflows — authoring YAML pipelines + composite step kinds
Agents — Gems-style YAML personas
Models — registry, quantization, throughput
Deployment — DMG · Docker · pip · source · systemd
Configuration — env vars, auth, CORS, perf knobs
Troubleshooting — common failure modes
Release notes

Source-of-truth references inside the repo:

MODELS.md — model registry and selection
CLAUDE.md — developer guide
CHANGELOG.md — every release, every PR
docs/ — design docs, plans, deployment guides
Product page: hankthebldr.github.io/local-ai-platform
