Off Grid AI

Private, on-device AI. Your models, your data — no cloud, no accounts, no API keys.

A local-first AI runtime + studio — run open models (text, vision, image, voice, speech) entirely on your machine, behind one OpenAI-compatible gateway. Plus an always-on layer that sees, remembers, reflects, and acts — all on-device.

Download (macOS · Windows) · Features · getoffgridai.co · Pro early access

What it is

Off Grid AI is a local-first AI runtime for your desktop. Download open models from the built-in catalog (or any GGUF from Hugging Face) and use them across every modality — all inference runs on your hardware via bundled llama.cpp, stable-diffusion.cpp, whisper.cpp, and Kokoro. Nothing routes through a server we own; your conversations, files, and models never leave your device.

Three things in one app:

A studio — chat (text + vision + reasoning), on-device image generation, voice in/out, live artifacts/canvas, projects with RAG, and in-chat tools — a local Claude/LM-Studio/Ollama with everything on-device.
A gateway — one local OpenAI-compatible API (http://127.0.0.1:7878/v1, no key) for chat, vision, image, audio, and embeddings. Run it headless as just the gateway.
Off Grid Pro — an always-on private layer that sees your work (screen → OCR), remembers it, helps you reflect, and acts with your approval. On-device, opt-in.

A look inside

Chat: text, vision, reasoning, artifacts, on-device
Models: curated catalog + Hugging Face search
The Gateway: one local OpenAI-compatible API
Projects: group chats, RAG over your docs
Connectors (MCP): add servers, use them in chat
Artifacts: HTML, React, SVG & Mermaid in a local sandbox
Off Grid Pro: the sees/remembers/reflects/acts layer
Private by default: runs on your machine, no account

Features (free & open source)

The free, open app is a complete on-device AI studio:

Chat — text + vision, streaming, with a reasoning ("thinking") mode and per-chat model settings (temperature, context window).
Image generation — text→image and image→image via stable-diffusion.cpp (Metal). Ships SDXL-Lightning (few-step, fast), SDXL, SD 1.5/2.1, and Z-Image-Turbo (2026 flagship, ~8-step). Live per-step preview, progress + ETA, cancel, lightbox, and an artifacts gallery of everything you've generated.
Voice — speech-to-text (whisper) and text-to-speech (Kokoro-82M, multilingual), plus a hands-free voice mode.
Artifacts / canvas — the model's HTML, React/JSX, SVG, Mermaid, and Markdown render live in a sandboxed iframe (no network/file access); Code/Preview toggle, download, saved per chat & project.
Projects — group chats, upload documents (txt/md/PDF/DOCX, image, audio, video) and chat grounded in them (RAG with cited sources); per-project instructions.
Tools in chat — an agentic loop calls local tools mid-conversation: built-ins (calculator, datetime) plus any MCP connector you've added.
Connectors (MCP) — add Model Context Protocol servers (none / token / OAuth) and use them right inside chat. Preset catalog included.
Model catalog — curated, size-bucketed recommendations + direct Hugging Face search; download, manage, and set the active model per modality.
The Gateway — one OpenAI-compatible endpoint for everything; see below.
Auto-update — signed releases update themselves.

A full breakdown is in docs/FEATURES.md.
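
To make the "Tools in chat" item above concrete, here is a toy version of an agentic loop: the model either answers or asks for a tool, the tool result is appended to the conversation, and the loop repeats. This illustrates the general pattern only and is not Off Grid AI's implementation; the message shapes, the `run_agent` name, and the two stand-in tools are all assumptions.

```python
import ast
import operator
from datetime import datetime, timezone
from typing import Callable

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def calculator(expression: str) -> str:
    """Evaluate +, -, *, / on numbers without using eval()."""
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")
    return str(ev(ast.parse(expression, mode="eval").body))


def current_datetime(_: str = "") -> str:
    """Return the current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


# Hypothetical tool table: name -> callable taking and returning a string.
TOOLS: dict[str, Callable[[str], str]] = {
    "calculator": calculator,
    "datetime": current_datetime,
}


def run_agent(model: Callable[[list[dict]], dict], prompt: str, max_steps: int = 5) -> str:
    """Call `model` until it returns a final answer or the step budget runs out.

    `model` takes the message list and returns either
    {"content": "..."} or {"tool": "<name>", "input": "..."} (assumed shapes).
    """
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_steps):
        reply = model(messages)
        if "tool" not in reply:
            return reply["content"]
        result = TOOLS[reply["tool"]](reply["input"])
        messages.append({"role": "tool", "name": reply["tool"], "content": result})
    raise RuntimeError("no final answer within max_steps")
```

In a real setup `model` would wrap a call to the chat endpoint; passing it in as a function keeps the loop itself easy to test with a stub.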

The Gateway

One local server (http://127.0.0.1:7878) speaks the OpenAI API:

| Capability | Endpoint |
| --- | --- |
| Chat (text + vision) | `POST /v1/chat/completions` |
| Text → Image | `POST /v1/images` (`/generations`, `/edits`) |
| Speech → Text | `POST /v1/audio/transcriptions` |
| Text → Speech | `POST /v1/audio/speech` |
| Embeddings | `POST /v1/embeddings` |
| Models | `GET /v1/models` |
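
As a rough illustration of the image row above, the sketch below builds a text-to-image request with only the Python standard library. It assumes the gateway mirrors OpenAI's image API: a JSON body with `prompt`, and a response of the form `{"data": [{"b64_json": "..."}]}`. The `model` value, the `response_format` field, and the helper names are illustrative assumptions, so check /docs for the actual schema.

```python
import base64
import json
import urllib.request

GATEWAY = "http://127.0.0.1:7878/v1"  # address from this README


def build_image_request(prompt: str, model: str = "local") -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/images/generations."""
    # Field names follow OpenAI's image API; the gateway's schema is assumed to match.
    body = {"model": model, "prompt": prompt, "response_format": "b64_json"}
    return urllib.request.Request(
        f"{GATEWAY}/images/generations",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decode_first_image(response_json: dict) -> bytes:
    """Return the raw bytes of the first image in an OpenAI-style response."""
    return base64.b64decode(response_json["data"][0]["b64_json"])


def generate_image(prompt: str, out_path: str) -> None:
    """Send the request to a running gateway and write the image to disk."""
    with urllib.request.urlopen(build_image_request(prompt)) as resp:
        payload = json.load(resp)
    with open(out_path, "wb") as fh:
        fh.write(decode_first_image(payload))
```

With the gateway running and an image model active, `generate_image("a lighthouse at dusk", "lighthouse.png")` would write the result to disk.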

```bash
curl http://127.0.0.1:7878/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"local","messages":[{"role":"user","content":"Hello!"}]}'
```

```python
from openai import OpenAI

# The gateway needs no key; the api_key value is only a placeholder.
client = OpenAI(base_url="http://127.0.0.1:7878/v1", api_key="not-needed")
print(client.chat.completions.create(model="local",
      messages=[{"role":"user","content":"Hello!"}]).choices[0].message.content)
```
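
Chat in the app streams tokens, and OpenAI-compatible servers usually expose the same behaviour through `"stream": true`. The following is a minimal sketch that assumes the gateway emits OpenAI-style server-sent events (`data: {...}` lines ending with `data: [DONE]`). That wire format is an assumption here, not something this README confirms, and the helper names are hypothetical.

```python
import json
import urllib.request
from typing import Iterable, Iterator

GATEWAY = "http://127.0.0.1:7878/v1"


def iter_deltas(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = choices[0].get("delta", {}).get("content")
        if text:
            yield text


def stream_chat(prompt: str, model: str = "local") -> Iterator[str]:
    """POST a streaming chat request to a running gateway and yield text as it arrives."""
    body = {
        "model": model,
        "stream": True,
        "messages": [{"role": "user", "content": prompt}],
    }
    req = urllib.request.Request(
        f"{GATEWAY}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        yield from iter_deltas(resp)  # the response object iterates line by line
```

Typical use would be `for piece in stream_chat("Hello!"): print(piece, end="", flush=True)`.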

Interactive API reference + an OpenAPI spec are served at /docs and /openapi.json.
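
Because the spec is served as plain JSON, it can be used to discover what a given build actually exposes. The sketch below assumes /openapi.json sits at the server root, as written above, and follows the standard OpenAPI layout in which `paths` maps each route to its HTTP methods. `list_operations` and `fetch_spec` are hypothetical helper names.

```python
import json
import urllib.request

BASE = "http://127.0.0.1:7878"
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_operations(spec: dict) -> list[str]:
    """Return sorted 'METHOD /path' strings from an OpenAPI document."""
    ops = []
    for path, item in spec.get("paths", {}).items():
        for method in item:
            if method.lower() in HTTP_METHODS:  # ignore keys like "parameters"
                ops.append(f"{method.upper()} {path}")
    return sorted(ops)


def fetch_spec(base: str = BASE) -> dict:
    """Download the spec from a running gateway."""
    with urllib.request.urlopen(f"{base}/openapi.json") as resp:
        return json.load(resp)
```

With the gateway up, `print("\n".join(list_operations(fetch_spec())))` would list every route the running version serves.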

Run just the gateway (headless)

You don't need the desktop UI to serve models — run only the gateway (no UI, no capture) and point any OpenAI client at it. Ideal for a server, a homelab box, or wiring local models into your own apps:

```bash
# from a built app
/Applications/Off\ Grid\ AI.app/Contents/MacOS/Off\ Grid\ AI --server-only
# or from source
OFFGRID_SERVER_ONLY=1 npm run gateway
```
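
When the headless gateway is launched from a script, the port may take a moment to open. One simple approach, sketched here, is to poll `GET /v1/models` until it responds. The `wait_until_ready` helper, its timeout values, and the injectable `probe` parameter are illustrative assumptions.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

MODELS_URL = "http://127.0.0.1:7878/v1/models"


def http_probe(url: str = MODELS_URL) -> bool:
    """Return True if the gateway answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False  # not listening yet, or still starting up


def wait_until_ready(
    probe: Callable[[], bool] = http_probe,
    timeout: float = 30.0,
    interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it succeeds or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

A launcher script could start the gateway process, call `wait_until_ready()`, and only then begin sending requests.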

It's self-sufficient — manage models over HTTP, no UI required:

| Action | Endpoint |
| --- | --- |
| List the catalog | `GET /v1/models/catalog` |
| List installed | `GET /v1/models/installed` |
| Active model per modality | `GET /v1/models/active` |
| Pull a model | `POST /v1/models/pull` `{ "id": "…" }` → poll `GET /v1/models/pull/status?id=…` |
| Activate a model | `POST /v1/models/activate` `{ "id": "…", "kind"?: "image\|speech\|transcription" }` |
| Delete a model | `POST /v1/models/delete` `{ "id": "…" }` |

```bash
# pull a model into a headless gateway, then chat
curl -X POST http://127.0.0.1:7878/v1/models/pull \
  -H 'Content-Type: application/json' -d '{"id":"unsloth/gemma-4-E4B-it-GGUF"}'
curl -X POST http://127.0.0.1:7878/v1/models/activate \
  -H 'Content-Type: application/json' -d '{"id":"unsloth/gemma-4-E4B-it-GGUF"}'
```
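
The same pull-then-activate flow can be scripted, including the status polling that the curl example leaves out. The endpoints come from the table above, but this README does not document the shape of the status response, so the sketch takes a caller-supplied `is_done` predicate rather than guessing field names. It also assumes every endpoint replies with a JSON body. `call`, `pull_and_activate`, and the polling interval are hypothetical.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Callable, Optional

GATEWAY = "http://127.0.0.1:7878/v1"


def call(method: str, path: str, body: Optional[dict] = None) -> dict:
    """Send a JSON request to a running gateway and return the decoded reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        f"{GATEWAY}{path}",
        data=data,
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # assumes a JSON response body


def pull_and_activate(
    model_id: str,
    is_done: Callable[[dict], bool],
    request: Callable[..., dict] = call,
    interval: float = 2.0,
    max_polls: int = 900,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Pull `model_id`, poll until `is_done(status)` is true, then activate it."""
    request("POST", "/models/pull", {"id": model_id})
    status_path = "/models/pull/status?id=" + urllib.parse.quote(model_id, safe="")
    for _ in range(max_polls):
        if is_done(request("GET", status_path)):
            return request("POST", "/models/activate", {"id": model_id})
        sleep(interval)
    raise TimeoutError(f"pull of {model_id} did not finish")
```

The `is_done` predicate has to match whatever the status endpoint really returns; inspect one response (or /docs) before writing it.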

Off Grid Pro — coming July 2026

The free app runs models. Pro adds the always-on layer that turns your own work into private, on-device memory — and an assistant that helps you act on it. Everything is explicit opt-in, with a visible recording indicator, and nothing leaves the device.

Sees — screen capture → OCR → on-device LLM distill into observations + entities. Multi-monitor aware, consumption-vs-work classification, blank/locked frames skipped.
Remembers — Day (a persisted journal with time blocks), Entities (a private CRM-for-everything: people, projects, companies, auto-built with synthesis summaries), and Replay (a "movie of your day" you can scrub and play back).
Reflects — mind-share, balance, context-switching, and Day/Week trends.
Meetings — records Google Meet + Zoom (screen + system audio + mic), transcribes locally with whisper, and folds an LLM title/summary/attendees into your timeline.
Acts (with approval) — action items detected from your communication, an approval queue + audit log (nothing executes without a logged approval), MCP connectors as authoritative sources, and a skills framework (trigger → action) — on the roadmap toward a proactive secretary and a prospective "Ahead" view of your day.

Pro launches July 2026 — already paid? You're first in line when it ships. Pro features live in a separate private package (a pro/ submodule); the open core never imports it — see Architecture.

Day: your day, planned from real activity
Entities: a private CRM, auto-built
Reflect: where your attention actually went
Search: unified search across everything
Meetings: recorded + transcribed locally
Replay: rewind your screen like a film

Pro screens shown with synthetic demo data.

→ Join early access (free) — or pay now for lifetime free + first access.

Install

Grab the latest build from Releases:

macOS (Apple Silicon) — signed + notarized .dmg
Windows (x64) — .exe installer

Linux (AppImage/deb) is in progress.

Build from source

```bash
git clone https://github.com/off-grid-ai/desktop.git
cd desktop
npm install
npm run dev          # full app
npm run gateway      # headless gateway only (:7878)
npm run build:mac    # package a macOS app
```

Stack: Electron 39 + React 19 + Tailwind v4 (electron-vite), better-sqlite3-multiple-ciphers (encrypted local DB), @lancedb/lancedb (vectors), and bundled llama.cpp / whisper.cpp / stable-diffusion.cpp / ffmpeg binaries in resources/bin. Shared @offgrid/* packages (design, models, rag) come from the workspace. Run npm run typecheck to verify your changes.

Architecture — open core

This repository is the open, AGPL core: the model runner, gateway, studio (chat, image, voice, artifacts), projects, connectors, and the model catalog. Pro features live in a separate private package loaded as a git submodule (pro/). The core never imports pro — pro registers itself through small registries (an activate() pattern) and is simply absent in this build, so the open app compiles and runs entirely on its own.
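
The one-way dependency described above can be pictured with a small registry. The sketch is in Python only to match the other examples here (the app itself is built on Electron), and every name in it (`FeatureRegistry`, `load_optional`, the module name passed in) is hypothetical rather than taken from the repository. Something still has to load the optional package at startup; a single dynamic import by name is one possible arrangement and is an assumption, not a description of how this code base does it.

```python
import importlib
from typing import Callable


class FeatureRegistry:
    """Core-owned registry; optional packages add entries, core only reads them."""

    def __init__(self) -> None:
        self._features: dict[str, Callable[[], str]] = {}

    def register(self, name: str, factory: Callable[[], str]) -> None:
        self._features[name] = factory

    def names(self) -> list[str]:
        return sorted(self._features)


def load_optional(module_name: str, registry: FeatureRegistry) -> bool:
    """Try to import an optional package and let it register itself.

    The core never references the package's internals: it only calls
    `activate(registry)` if the module happens to be installed.
    """
    try:
        module = importlib.import_module(module_name)
    except ModuleNotFoundError:
        return False  # open build: the package is simply absent
    module.activate(registry)
    return True
```

The point of the design is that removing the optional package changes nothing in the core: the registry is just emptier, and everything that reads from it keeps working.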

Privacy

All model inference is local. Your conversations, documents, and models stay on your device — there's no cloud inference, no account, and no API key. You can run it fully offline.
