AIToolkit - Deployment Guide

This documentation covers the deployment of a (mostly) GDPR-compliant Multi-Provider AI Suite with:

💬 Chat (Multiple LLM providers with context-aware history, tool calling, web search)
🔍 Web Search (Agentic tool-calling loop via Tavily, Brave Search, DuckDuckGo, or self-hosted SearXNG)
📧 Microsoft Exchange (E-Mail, Calendar, Contacts CRUD, via EWS)
🎙️ Audio Transcription (Gladia, Whisper, Mistral, Deepgram, AssemblyAI, CrispASR 26+ backends; Speaker Diarization; Chunking; real-time Dictation; Audio Q&A; punctuation/truecasing/hotwords)
🔊 Text-to-Speech (20+ TTS engines via CrispASR: Kokoro, Qwen3-TTS, Orpheus, Chatterbox, Piper, etc.; voice cloning)
📄 Content Extraction for uploads (text extraction from PPTX, DOCX, XLSX, PDF; OCR for images/scans) and OCR (Mistral OCR, CrispEmbed DBNet+TrOCR, local Tesseract, Vision LLMs)
🔤 Translation (NMT + LLM-based DOCX translation via translator.py; offline text translation via CrispASR M2M100/MADLAD 419 languages)
🗣️ Language Detection (text LID via CLD3/GlotLID, audio LID via CrispASR)
🔎 Semantic Search (transcription + image history search via CrispEmbed embeddings)
🔍 AI Watermark Detection (AudioSeal via CrispASR)
📊 Presentation Generation (AI-powered PPTX creation via LLM + pptx_engine)
📋 Format-Transplant (Apply formatting/styles of one DOCX to the content of another)
📦 Cloud Storage (Hetzner Storage Box integration via SMB)
🔄 Resumable Jobs (Job tracking for large files)
👁️ Vision Analysis (Image understanding)
🎨 Image Generation (e.g. FLUX models via Nebius and directly per Black Forest Labs API)
💾 Database Integration (SQLite with per-user AES-256-GCM encryption)
📱 PWA Support (Progressive Web App for mobile installation)
🛡️ Security (Gradio auth= with httpOnly cookies, Fail2Ban, EU-only mode, per-user AES key wrapping, systemd sandboxing, security headers)

The application runs as a Python/Gradio service on localhost:7860, managed by systemd, and exposed via Apache reverse proxy with SSL.

🌍 LLM Providers & EU Compliance

The suite supports multiple LLM providers. EU_ONLY_MODE = True in config.py restricts which providers regular users can access. Admins and Medienverwalter bypass this restriction.

Provider	EU Status	Notes
Mistral	🇫🇷 EU (Paris)	Chat, Vision, OCR, Audio
Scaleway	🇫🇷 EU (Paris)	Chat, Vision, Transcription
Nebius	🇳🇱 EU (Amsterdam)	Chat, Vision, Image Generation
Gladia	🇫🇷 EU	Transcription
Deepgram	🇪🇺 EU endpoint	Transcription (`api.eu.deepgram.com`)
AssemblyAI	🇪🇺 EU endpoint	Transcription (`api.eu.assemblyai.com`)
BFL	🇩🇪 EU (Germany)	Image Generation (FLUX)
Requesty	🇪🇺 EU router	Chat — EU-filtered (see below)
Langdock	🇩🇪 EU (Hamburg, Azure)	Chat — DSGVO-konform, OpenAI-compatible
OpenRouter	🇺🇸 US	Chat, Vision, Image — restricted in EU mode
Groq	🇺🇸 US	Chat, Transcription — restricted in EU mode
Poe	🇺🇸 US	Chat, Vision — restricted in EU mode

Requesty EU Router

Requesty (router.eu.requesty.ai) is a model aggregator routing to 20+ backend providers. The suite applies client-side EU filtering based on model ID prefix and @region suffix:

EU-native (always allowed): nebius/…, mistral/…
EU-regional (allowed only with explicit EU region in model ID):
- azure/model@swedencentral, azure/model@francecentral, azure/model@uksouth
- bedrock/model@eu-central-1, bedrock/model@eu-west-1, etc.
- vertex/model@europe-west1, vertex/model@europe-central2, etc.
Blocked: openai/…, anthropic/…, novita/…, groq/…, xai/…, moonshot/…, deepseek/…, alibaba/…, and others

Role-based access: Admins and Medienverwalter see all 430+ Requesty models (incl. non-EU). Regular users see only the ~117 EU-routed models.

Langdock

Langdock is a German company (Hamburg) providing an OpenAI-compatible API backed by EU Azure infrastructure, fully DSGVO-compliant.

All models served by Langdock are chat models (no image-gen, no embeddings)
No additional EU filtering required — all models are on EU Azure by definition
Use Fetch Models to always get the live list from Langdock's API

🔍 Web Search / Tool Calling

The chat supports an agentic tool-calling loop via tool_executor.py. When enabled, the model can call tools mid-conversation and use the results to ground its final answer.

Supported providers for tool calling: Mistral, Scaleway, Nebius, OpenRouter

Available Tools

Tool	Description
`web_search`	Real-time web search (Tavily, Brave, DuckDuckGo)
`fetch_url`	Fetch and read text content of a URL
`calculate`	Precise math via sympy (algebra, calculus, matrices, …)
`python_exec`	Execute Python 3 code in a sandboxed subprocess
`shell`	BusyBox shell in persistent sandbox (optional, Mayflower)
`generate_image`	AI image generation via FLUX models (Nebius / BFL)
`analyze_image_url`	Vision analysis of an image at a URL
`exchange_list`	Search and list Exchange emails
`exchange_show`	Show full content of a specific email
`exchange_send`	Send email or save as draft (To/CC/BCC/importance/HTML)
`exchange_folders`	View Exchange mailbox folder hierarchy
`exchange_calendar_list`	List calendar events (past, future, all-future, by type)
`exchange_calendar_show`	Full event details by UID or index
`exchange_calendar_add`	Create events / send meeting invitations
`exchange_calendar_edit`	Edit events or recurring series (UID-based)
`exchange_contact_search`	Search contacts: personal, GAL, autocomplete cache, mailbox mining
`exchange_contact_show`	Full contact details + optional email-signature extraction
`exchange_contact_create`	Create a new Exchange contact
`exchange_contact_edit`	Edit an existing Exchange contact by UID
`library_search`	Search bibliographic catalogs (DNB, LoC, BnF, ZDB, IxTheo) — CrispLib
`resolve_identifier`	Resolve DOI / PMID / ISBN / URL → citation metadata — CrispLib
`list_library_endpoints`	List the available bibliographic endpoints — CrispLib

Search backends (selectable in the ⚙️ Konfiguration tab):

Backend	Key Required	Notes
`tavily`	`TAVILY_API_KEY`	Default. AI-optimised results, free 1 000 req/month
`brave_web`	`BRAVE_SEARCH_API_KEY`	Web snippets, own index, EU-friendly, ~1 000 free/month via $5 credit
`brave_llm`	`BRAVE_SEARCH_API_KEY`	Pre-extracted text chunks optimised for LLM grounding / RAG
`brave_answers`	`BRAVE_ANSWERS_API_KEY`	AI-generated answer with citations (separate Brave subscription)
`brave_images`	`BRAVE_SEARCH_API_KEY`	Image search
`duckduckgo`	—	No key needed, rate-limited, production fallback
`searxng`	`SEARXNG_URL`	Self-hosted meta-search (future option)

Fallback chain: if the primary backend fails, the router automatically tries duckduckgo before surfacing an error.

Options exposed per search call: freshness (24h/7d/31d/365d), country, search_lang, safesearch, extra_snippets, max_tokens (LLM context), enable_citations (Answers).

The tool_executor.py module is standalone-testable without running the full app:

python tool_executor.py --query "What is the latest news about Mistral AI?" --backend tavily
python tool_executor.py --query "..." --backend brave_web --provider Scaleway --model mistral-small-3.2-24b-instruct-2506

📚 Bibliographic Search (CrispLib Integration)

Three additional tools (library_search, resolve_identifier, list_library_endpoints) are provided by CrispLib, a sibling repo. They let the LLM search academic library catalogs and resolve known identifiers to structured citation metadata.

Capabilities

library_search — search SRU (DNB, LoC, BnF, ZDB) and IxTheo (Index Theologicus) by title, author, isbn, subject, year, or free-text. Returns up to 25 normalized records (title, authors, year, publisher, journal, volume, ISBN, DOI, …).
resolve_identifier — DOI → Crossref CSL-JSON · PMID → NCBI esummary · ISBN → Open Library · URL → Wikipedia Citoid. Auto-detects type. Pure-stdlib + requests, no heavy deps.
list_library_endpoints — discovery so the model can pick the right catalog before calling library_search.

Installation

CrispLib is loaded via sys.path injection rather than vendored, so upstream fixes propagate automatically. Two options:

# Option A — sibling clone (default)
cd "$(dirname "$APP_DIR")"        # parent of your install directory
git clone https://github.com/CrispStrobe/CrispLib.git

# Option B — custom location
export CRISPLIB_PATH=/opt/CrispLib

The adapter crisplib_tool.py looks for ../CrispLib relative to AIToolkit, or honours CRISPLIB_PATH if set. If CrispLib is missing, the three tools simply don't register and the rest of the app keeps working.

Hardening: defusedxml>=0.7.1 is now in requirements.txt — CrispLib's SRU/OAI-PMH parsers prefer it over stdlib xml.etree.ElementTree to avoid XXE / billion-laughs / quadratic-blowup attacks against untrusted catalog responses. The stdlib path remains as a fallback if defusedxml is unavailable.

Per-user toggle

The Konfiguration tab has a 📚 Bibliothek (CrispLib) checkbox. Enabling it adds all three CrispLib tools to that user's enabled_tools list, persisted in UserSettings.tool_preferences_json. Default is off — opt-in per user. To make it on-by-default for new users, append the tool names to DEFAULT_TOOLS_STANDARD in tool_executor.py.

Standalone testing

# Resolve a DOI
python crisplib_tool.py --resolve 10.1038/nature12373

# Search DNB for a title
python crisplib_tool.py --search-title "Python" --endpoint dnb

# List endpoints
python crisplib_tool.py --list

# CrispLib's own resolver (no AIToolkit needed)
cd ../CrispLib && python identifier_resolver.py 9783658310844 isbn

🎙️ Local ASR (CrispASR Integration)

Offline, API-key-free audio transcription is powered by CrispASR, a sibling repo — one C++ binary, 22+ ASR model families, zero Python dependencies. AIToolkit talks to CrispASR via two paths:

Local binary (CrispASR provider) — spawns the crispasr CLI directly. Works offline, no network.
Remote HTTP (CrispASR-HTTP provider) — POSTs to any OpenAI-compatible HTTP server (/v1/audio/transcriptions, /load, /health). Works against CrispASR's own C++ server (examples/server), the Gradio HF space, or a custom FastAPI/Express wrapper. Lets a fleet share one GPU host.

Both paths share the same backend list and accept the same lang / diarize options.

Backends exposed in the UI

The provider dropdown lists 22 backends (kept in sync with CrispASR/README.md). Models marked auto-DL are fetched by the binary on first use into ~/.cache/crispasr/.

Backend	Notes
parakeet	NVIDIA Parakeet-TDT-0.6B — fast, multilingual, auto-DL (~467 MB)
whisper	OpenAI Whisper — 99 languages, needs a `ggml-*.bin` in `CrispASR/models/`
canary	NVIDIA Canary-1B — explicit language + speech translation, auto-DL
cohere	Cohere Transcribe — lowest WER, auto-DL
qwen3	Qwen3-ASR-0.6B — 30+ languages + Chinese dialects, auto-DL
voxtral / voxtral4b	Mistral Voxtral-Mini speech-LLM (3B / 4B realtime), auto-DL
granite / granite-4.1 / granite-4.1-plus / granite-4.1-nar	IBM Granite Speech (Apache-2.0); 4.1 variants run encoder as a single ggml graph (~2.5× faster on M1)
fastconformer-ctc	NeMo FastConformer + CTC, lightweight
wav2vec2	Wav2Vec2 CTC — single-language, manual GGUF needed
moonshine / moonshine-streaming	Edge-targeted, sliding-window streaming variant
omniasr	wav2vec2-style + 24–48L transformer + CTC, 1600+ languages
firered-asr	Mandarin + 20 Chinese dialects, includes LID
glm-asr	GLM-ASR-Nano — Mandarin / English / Cantonese (17 langs)
kyutai-stt	Kyutai Mimi codec + causal LM (en, fr)
mimo-asr	Xiaomi MiMo — Mandarin Wu/Cantonese/Hokkien/Sichuanese + EN code-switch
vibevoice	Microsoft VibeVoice-ASR — diarization + hotwords (50+ langs)
gemma4-e2b	Google Gemma-4 E2B — USM Conformer + Gemma decoder, 140+ langs

Select CrispASR (local binary) or CrispASR-HTTP (remote server) as the transcription engine, then pick a backend from the Modell dropdown.

Installation (fresh clone)

CrispASR is compiled on first use — no manual setup required:

# Option A — sibling clone (default, auto-detected)
cd "$(dirname "$APP_DIR")"        # parent of your install directory
git clone https://github.com/CrispStrobe/CrispASR.git

# The next time a CrispASR transcription is started, AIToolkit will
# cmake-configure and build the binary automatically (requires cmake + C++17).
# The build helper prefers the Ninja generator + `crispasr` target into
# build-ninja-compile/ and falls back to make + whisper-cli into build/.

# Option B — custom location
export CRISPASR_PATH=/opt/CrispASR

The adapter crispasr_tool.py looks for ../CrispASR relative to AIToolkit, or honours CRISPASR_PATH if set. It probes both build-ninja-compile/bin/ (the active upstream layout) and the legacy build/bin/ for an existing binary. If neither source nor binary is found, AIToolkit attempts a git clone of the repo and then a cmake build. If the build fails (missing cmake, missing compiler), the CrispASR engine is simply absent from the provider list and all other engines keep working.

Build prerequisites:

# Ubuntu / Debian
sudo apt install cmake g++ build-essential ninja-build

# macOS (Homebrew)
brew install cmake ninja
# Xcode CLT already provide clang

# Windows
# Install Visual Studio Build Tools ≥ 2022 + CMake from cmake.org

Optional GPU acceleration:

cmake -B build-ninja-compile -G Ninja -DCMAKE_BUILD_TYPE=Release -DGGML_CUDA=ON   # NVIDIA
cmake -B build-ninja-compile -G Ninja -DCMAKE_BUILD_TYPE=Release -DGGML_METAL=ON  # Apple Silicon

Remote HTTP server

To use the CrispASR-HTTP provider, point AIToolkit at any OpenAI-compatible CrispASR server. Out of the box, that's:

CrispASR/examples/server — CrispASR's own C++ cpp-httplib server (built with the crispasr-server target).
CrispASR/hf-space/ — the Gradio HuggingFace space wrapper.
Any custom FastAPI / Express wrapper that proxies to the CrispASR binary.

The server must expose POST /v1/audio/transcriptions, POST /load, and GET /health.

# In .env or systemd EnvironmentFile
CRISPASR_REMOTE_HTTP_URL=https://asr.example.com
CRISPASR_REMOTE_HTTP_KEY=<bearer-token-or-empty>

Per-request, AIToolkit:

Pings /health and aborts fast on failure.
Calls /load (idempotent) to switch the server to the requested backend.
Transcodes non-native audio (.m4a/.mp4/.webm/…) to 16 kHz mono PCM via ffmpeg.
Uploads the file to /v1/audio/transcriptions with response_format=verbose_json and surfaces speaker labels / per-segment timestamps when the server provides them.

Optional whisper model search paths

For the whisper backend AIToolkit looks for a ggml-*.bin in (first match wins):

$CRISPASR_WHISPER_MODEL — explicit override
<crispasr_root>/models/
/Volumes/backups/ai/crispasr-models/ — only if the volume is mounted (matches CrispASR's CLAUDE.md)
~/.cache/crispasr/ — the binary's own auto-DL cache

Standalone testing

# Verify adapter + auto-build
python crispasr_tool.py

# Test a transcription (binary must be built first)
../CrispASR/build-ninja-compile/bin/crispasr --backend parakeet -m auto -f /tmp/test.wav

📧 Microsoft Exchange Integration

The suite provides direct Exchange Web Services (EWS) access via exchange_cli.py (using exchangelib). It works both as a standalone CLI tool and as a set of LLM-callable tools in the chat interface.

Features

E-Mail

List, filter, and search inbox (by sender, subject, body, date, read status, regex)
Read full email content
Send emails: To, CC, BCC, Reply-To, importance (Low/Normal/High), plain text or HTML body
Save as draft instead of sending
Move emails between folders
Browse folder hierarchy (incl. hidden/technical folders)

Kalender

List events: upcoming, past, all-future (up to 10 years), by type (meeting/appointment/recurring/allday)
Show full event details: attendees, description, location, OLX plugin fields, recurrence pattern
Create events with attendees (meeting invitations via EWS)
Edit events: date/time, subject, description, location, categories, attendees, OLX fields
Recurring series support: edit single occurrence or entire series master
UID-based lookup (stable across pagination/filter changes)
Automatic chunking of Exchange's 2-year view limit (720-day windows, transparent to the caller)

Kontakte

Search across multiple sources:
- personal — all sub-folders of the personal contacts folder
- gal — Global Address List (Active Directory) via resolve_names
- suggested — Outlook autocomplete cache: Recipient Cache, Suggested Contacts, AllContacts, RelevantContacts, Contact Search, MyContactsExtended, GAL Contacts, Organizational Contacts
- mailbox — mine inbox/sent for unique senders and recipients
- all (default) — personal + GAL + suggested
- full — all sources including mailbox scan
Show full contact details: phones, emails, physical addresses, notes, categories
Signature intelligence: scan inbox/sent for the contact's emails and extract phone, title, website from signature blocks
Create and edit contacts (name, company, department, job title, emails, phones, addresses, notes, categories)
Deduplication by item ID across sources

CLI Usage

# E-Mail
python exchange_cli.py list --limit 20 --unread
python exchange_cli.py list --sender boss@company.de --start 2026-01-01
python exchange_cli.py show 3
python exchange_cli.py send --to colleague@company.de --subject "Meeting" --body "Hi..." \
    --cc cc@company.de --importance High
python exchange_cli.py send --to draft@company.de --subject "Draft" --body "..." --draft
python exchange_cli.py folders

# Kalender
python exchange_cli.py calendar-list --limit 20
python exchange_cli.py calendar-list --start 2026-04-01 --end 2026-06-30
python exchange_cli.py calendar-list --all-future --limit 50
python exchange_cli.py calendar-list --past --limit 10
python exchange_cli.py calendar-show --uid <UID>
python exchange_cli.py calendar-show 2 --start 2026-04-01
python exchange_cli.py calendar-add --subject "Team Meeting" --start "2026-05-01 10:00" \
    --end "2026-05-01 11:00" --attendees a@x.de,b@x.de --location "Raum 3"
python exchange_cli.py calendar-edit --uid <UID> --new-start "2026-05-02 10:00" \
    --new-end "2026-05-02 11:00" --send-updates

# Kontakte
python exchange_cli.py contact-search "Müller"
python exchange_cli.py contact-search "Dominic" --source suggested
python exchange_cli.py contact-search "sales" --source full --limit 50
python exchange_cli.py contact-show 1 --query "Müller"
python exchange_cli.py contact-show --uid <UID> --with-signatures
python exchange_cli.py contact-create "Max Mustermann" --company "ACME" \
    --email-primary max@acme.de --phone-business "+49 711 123456"
python exchange_cli.py contact-edit <UID> --job-title "Senior Developer"

Configuration

Exchange credentials are read from environment variables (.env file or systemd EnvironmentFile):

EXCHANGE_SERVER=mail.company.de
EXCHANGE_USER=DOMAIN\username      # or UPN: user@company.de
EXCHANGE_PASSWORD=secret
EXCHANGE_EMAIL=user@company.de     # primary SMTP address (sender address)

Per-user encrypted credentials can be stored in the database (AES-256-GCM with per-user key wrapping) and passed at runtime, overriding the global env vars.

Installation

pip install exchangelib

exchangelib is the only additional dependency. The rest of the Exchange module uses stdlib only.

Testing

python -m pytest tests/test_exchange.py tests/test_exchange_calendar.py tests/test_exchange_contacts.py -v

All tests use stubs and do not require a real Exchange server.

📋 Prerequisites

OS: Ubuntu 20.04 LTS or newer
RAM: Minimum 8GB (for handling audio files)
Root/Sudo Access
Domain: DNS A-Record pointing to server IP (e.g., ai.yourdomain.de)
Storage: Hetzner Storage Box with sub-account access
API Keys (stored in .env file):
- MISTRAL_API_KEY — multipurpose: chat, OCR, vision, audio
- SCALEWAY_API_KEY — chat, transcription
- NEBIUS_API_KEY — chat, image generation
- GLADIA_API_KEY — long-form transcription
- DEEPGRAM_API_KEY — transcription, EU endpoint
- ASSEMBLYAI_API_KEY — transcription, EU endpoint
- GROQ_API_KEY — transcription, chat (US, restricted in EU mode)
- REQUESTY_API_KEY — EU router: chat across many providers
- LANGDOCK_API_KEY — chat, DSGVO-konform, German company, EU Azure
- OPENROUTER_API_KEY — optional, additional models (US, restricted in EU mode)
- BFL_API_KEY — optional, FLUX image generation
- POE_API_KEY — optional, additional models (US, restricted in EU mode)
- TAVILY_API_KEY — web search, free 1 000 req/month
- BRAVE_SEARCH_API_KEY — web search: snippets, LLM context, images
- BRAVE_ANSWERS_API_KEY — web search: AI-generated answers (separate subscription)
- SEARXNG_URL — optional, self-hosted SearXNG base URL (e.g. http://localhost:8888)
- EXCHANGE_SERVER — Exchange server hostname (for Exchange integration)
- EXCHANGE_USER — Exchange username (DOMAIN\user or UPN)
- EXCHANGE_PASSWORD — Exchange password
- EXCHANGE_EMAIL — Exchange primary SMTP address

🔧 Deployment Variables

Define these variables before following the deployment steps. All bash blocks below reference them:

APP_DIR=/var/www/aitoolkit         # ← change to your install path
APP_USER=www-data                  # ← service account (non-root)
DOMAIN=ai.yourdomain.de            # ← your FQDN
STORAGE_MOUNT=/mnt/storage         # ← Storage Box mount point
VENV=$APP_DIR/venv

🛠️ Step 1: System Dependencies & Security

CRITICAL: FFmpeg must be in PATH. cifs-utils is required for Storage Box. OCR and document tools are required for the Content Extractor.

sudo apt update
sudo apt upgrade -y

# Install all required system packages
sudo apt install -y \
    wget \
    python3-pip \
    ffmpeg \
    apache2 \
    certbot \
    python3-certbot-apache \
    sqlite3 \
    fail2ban \
    cifs-utils \
    tesseract-ocr \
    tesseract-ocr-deu \
    poppler-utils \
    pandoc \
    cmake \
    g++ \
    build-essential \
    git

CrispASR: cmake, g++, build-essential, and git are required to auto-build the local ASR binary. If omitted, CrispASR is silently unavailable while all cloud transcription engines keep working.

Firewall Configuration (UFW)

CRITICAL: You must allow SSH before enabling the firewall, or you will lock yourself out.

# 1. Allow incoming SSH connections
sudo ufw allow OpenSSH

# 2. Allow Web Traffic (HTTP/HTTPS)
sudo ufw allow 'Apache Full'

# 3. Enable the Firewall
sudo ufw enable

# 4. Verify Status
sudo ufw status

Configure Swap Space (prevent OOM crashes)

sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab

Verify critical installations:

which ffmpeg                    # Must show: /usr/bin/ffmpeg
sudo systemctl status fail2ban  # Must be: active (running)

📦 Step 2: Storage Box Mounting (Hetzner)

2.1 Create Mount Point

sudo mkdir -p "$STORAGE_MOUNT"

2.2 Create Credentials File

sudo nano /etc/cifs-credentials

Content:

username=u12345-sub1
password=YOUR_SUBACCOUNT_PASSWORD

sudo chmod 600 /etc/cifs-credentials

2.3 Configure Auto-Mount via fstab

sudo nano /etc/fstab

Add (single line):

//u12345-sub1.your-storagebox.de/u12345-sub1 /mnt/storage cifs credentials=/etc/cifs-credentials,uid=$(id -u www-data),gid=$(id -g www-data),file_mode=0770,dir_mode=0770,nounix,vers=3.0,x-systemd.automount,x-systemd.idle-timeout=60 0 0

Note: Replace www-data with your $APP_USER if different. The uid/gid values ensure the service account owns mounted files. Use id -u $APP_USER and id -g $APP_USER to look up the numeric IDs, then paste those literal numbers into fstab (shell expansions do not work in fstab). Replace /mnt/storage with your $STORAGE_MOUNT path.

2.4 Mount and Verify

sudo systemctl daemon-reload
sudo systemctl restart remote-fs.target
ls -la "$STORAGE_MOUNT"  # Should list files from Storage Box

🐍 Step 3: Application Setup

3.1 Create Directory Structure

sudo mkdir -p "$APP_DIR"
sudo mkdir -p "$APP_DIR/static"
sudo mkdir -p "$APP_DIR/generated_images"
sudo mkdir -p "$APP_DIR/jobs"
cd "$APP_DIR"

3.2 Clone Repository or Copy Files

Option A: Clone from GitHub

git clone https://github.com/YOUR_USERNAME/YOUR_REPO.git .

# Optional sibling repos (auto-detected by path, no extra config needed):
#   CrispLib — bibliographic search tools
git clone https://github.com/CrispStrobe/CrispLib.git ../CrispLib
#   CrispASR — offline/local ASR (auto-built on first use if cmake is available)
git clone https://github.com/CrispStrobe/CrispASR.git ../CrispASR

Option B: Manual File Upload

Upload these files:

app.py — Top-level Gradio Blocks assembly + login/logout wiring (now ~2 900 lines, was ~6 800 before the Phase 7 per-tab split)
app_helpers.py — Pure-Python helpers (login, file-explorer paths, provider dropdown updaters)
config.py — Central provider & key configuration
crypto_utils.py — Encryption logic & key wrapping
db_models.py — SQLAlchemy models & schema migrations
db_ops.py — Database CRUD operations
provider_utils.py — LLM provider clients, compliance, routing
context_utils.py — Token estimation, context pruning, chunking
image_utils.py — Image encoding & resize helpers
image_gen_utils.py — Image generation & vision analysis
transcription_utils.py — Transcription dispatcher + per-provider runners
transcription_youtube.py — YouTube + generic-URL audio download (split out from transcription_utils)
storage_utils.py — File storage, pCloud, per-user directories
chat_handlers.py — Chat logic, content extraction (UniversalExtractor), attachments
ocr_utils.py — OCR engines (Mistral, Groq, OpenRouter, Ollama/GLM, Vision LLM)
translation_utils.py — Async DOCX translation with progress streaming
export_utils.py — Export to .docx / .md / .txt
tool_executor.py — Tool-calling router (web search, Exchange, image gen, …)
exchange_cli.py — Microsoft Exchange integration (EWS: mail, calendar, contacts)
translator.py — DOCX translation engine (NMT + LLM)
format_transplant.py — DOCX format-transplant engine
dictation_manager.py — Real-time dictation (Deepgram, AssemblyAI, Gladia, Mistral Realtime, Faster-Whisper, CrispASR local + HTTP)
crispasr_tool.py — CrispASR adapter: binary discovery, auto-clone, auto-build
ui/ package — Per-tab Gradio modules. Each ui/<tab>_tab.py exposes a build_<tab>_tab(session_state) function called from app.py. Currently 9 modules: chat_tab, transcription_tab, vision_tab, image_gen_tab, ocr_tab, dictation_tab, doc_translator_tab, format_transplant_tab, admin_tab.
tests/ — pytest suite (76 tests). Run with python -m pytest tests/.
pyproject.toml — ruff config (line length, ignored rules) + Python tooling.
requirements.txt — Python dependencies
static/ folder: custom.css, manifest.json, pwa.js, service-worker.js, icon-192.png, icon-512.png

3.3 Create Environment File

CRITICAL: Store all API keys in a secured .env file.

sudo nano "$APP_DIR/.env"

Required content:

# LLM Providers
MISTRAL_API_KEY=...
SCALEWAY_API_KEY=...
NEBIUS_API_KEY=...
OPENROUTER_API_KEY=...
REQUESTY_API_KEY=rqsty-sk-...
LANGDOCK_API_KEY=sk-...

# Transcription
GLADIA_API_KEY=...
DEEPGRAM_API_KEY=...
ASSEMBLYAI_API_KEY=...
GROQ_API_KEY=gsk_...
# CrispASR remote HTTP server (optional — local binary works without these)
CRISPASR_REMOTE_HTTP_URL=https://asr.example.com
CRISPASR_REMOTE_HTTP_KEY=...

# Image Generation
BFL_API_KEY=...

# Web Search (tool calling)
TAVILY_API_KEY=tvly-...          # Free 1 000 req/month — https://tavily.com
BRAVE_SEARCH_API_KEY=BSA...      # ~1 000 free/month via $5 credit — https://brave.com/search/api/
BRAVE_ANSWERS_API_KEY=BSA...     # Separate Brave Answers subscription
# SEARXNG_URL=http://localhost:8888  # Optional: self-hosted SearXNG

# Microsoft Exchange (EWS)
EXCHANGE_SERVER=mail.company.de
EXCHANGE_USER=DOMAIN\username    # or UPN: user@company.de
EXCHANGE_PASSWORD=...
EXCHANGE_EMAIL=user@company.de

# Optional / US providers (restricted in EU mode)
POE_API_KEY=...
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

# Misc
GRADIO_ANALYTICS_ENABLED=False

Secure it:

sudo chmod 600 "$APP_DIR/.env"

Local development: The app automatically loads .env via python-dotenv when running locally (python app.py). On the VPS the env vars are loaded by systemd's EnvironmentFile= directive instead.

3.4 Create Virtual Environment (Miniconda Method)

# 1. Download and Install Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p /opt/miniconda

# 2. Initialize Conda
/opt/miniconda/bin/conda init bash
source ~/.bashrc

# Accept TOS
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/main
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/r

# 3. Create the 'ak_suite' environment with Python 3.10
/opt/miniconda/bin/conda create -n ak_suite python=3.10 -y

# 4. Link it to the app directory
cd "$APP_DIR"
rm -rf venv
ln -s /opt/miniconda/envs/ak_suite ./venv
conda activate ak_suite

3.5 Install Python Dependencies

conda activate ak_suite
pip install --upgrade pip wheel
pip install -r "$APP_DIR/requirements.txt"

Optional but recommended:

pip install pqcrypto   # Post-quantum encryption (Kyber-512)

Alternatively, for single installs without activating the environment:

$VENV/bin/pip install [package]

3.6 Initialize Permissions

sudo chown -R $APP_USER:$APP_USER "$APP_DIR"
sudo chmod 600 "$APP_DIR/.env"
sudo chmod 640 "$APP_DIR/.master_key"
sudo chmod 660 "$APP_DIR"/*.db
sudo touch "$APP_DIR/app.log"
sudo chown $APP_USER:$APP_USER "$APP_DIR/app.log"
sudo chmod 640 "$APP_DIR/app.log"

Note: Do not use chmod -R 755 or chmod 666 on logs. Restrictive defaults prevent information disclosure and accidental writes.

🌐 Step 4: Apache Configuration

This configuration serves PWA files directly via Apache and proxies the Gradio app with proper HTTPS headers.

4.1 Enable Required Modules

sudo a2enmod proxy proxy_http proxy_wstunnel rewrite headers ssl

4.2 SSL Certificate Setup

sudo certbot --apache -d "$DOMAIN"

4.3 HTTP Config (Port 80 - HTTPS Redirect)

Edit /etc/apache2/sites-available/aitoolkit.conf:

<VirtualHost *:80>
    ServerName ${DOMAIN}
    ErrorLog ${APACHE_LOG_DIR}/aitoolkit_error.log
    CustomLog ${APACHE_LOG_DIR}/aitoolkit_access.log combined
    RewriteEngine On
    RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,NE,R=permanent]
</VirtualHost>

Note: Replace ${DOMAIN} with your actual FQDN in the Apache config files. Apache does not expand shell variables — the ${DOMAIN} notation here is a placeholder for documentation.

4.4 HTTPS Config (Port 443 - Main Config)

Edit /etc/apache2/sites-available/aitoolkit-le-ssl.conf:

<IfModule mod_ssl.c>
<VirtualHost *:443>
    ServerName ${DOMAIN}

    # =================================================
    # SECURITY HEADERS
    # =================================================
    Header always unset Server
    Header always unset X-Powered-By
    Header unset Server
    Header unset X-Powered-By

    # HSTS — enforce HTTPS for 1 year, include subdomains
    Header always set Strict-Transport-Security "max-age=31536000; includeSubDomains"
    # Prevent clickjacking
    Header always set X-Frame-Options "DENY"
    # Prevent MIME-type sniffing
    Header always set X-Content-Type-Options "nosniff"
    # Referrer policy — send origin only on cross-origin
    Header always set Referrer-Policy "strict-origin-when-cross-origin"
    # Permissions policy — disable unused browser features
    Header always set Permissions-Policy "camera=(), microphone=(self), geolocation=(), payment=()"

    # =================================================
    # BLOCK OPENAPI & SENSITIVE ENDPOINTS - FIRST PRIORITY
    # =================================================
    RewriteEngine On
    RewriteRule ^/openapi\.json$ - [F,L]
    RewriteRule ^/docs/?$ - [F,L]
    RewriteRule ^/redoc/?$ - [F,L]
    RewriteRule ^/api(/.*)?$ - [F,L]
    RewriteRule ^/gradio_api/openapi\.json$ - [F,L]

    # Block Gradio info/config endpoints (prevent app metadata leakage)
    RewriteRule ^/gradio_api/info$ - [F,L]
    RewriteRule ^/config$ - [F,L]

    # =================================================
    # 1. STATIC FILES (Served by Apache)
    # =================================================
    # NOTE: Replace ${APP_DIR} below with your actual install path
    Alias /static ${APP_DIR}/static
    <Directory ${APP_DIR}/static>
        Require all granted
        Options -Indexes
        AddType text/css .css
        AddType application/javascript .js
        AddType image/png .png
        Header set Cache-Control "public, max-age=31536000, immutable"
    </Directory>

    Alias /manifest.json ${APP_DIR}/static/manifest.json
    <Files "manifest.json">
        Require all granted
        Header set Content-Type "application/manifest+json"
        Header set Cache-Control "no-cache"
    </Files>

    Alias /service-worker.js ${APP_DIR}/static/service-worker.js
    <Files "service-worker.js">
        Require all granted
        Header set Content-Type "application/javascript"
        Header set Cache-Control "no-cache"
        Header set Service-Worker-Allowed "/"
    </Files>

    # =================================================
    # 2. PROXY SETTINGS (Gradio)
    # =================================================
    ProxyPreserveHost On
    RequestHeader set X-Forwarded-Proto "https"
    RequestHeader set X-Forwarded-Port "443"
    RequestHeader set X-Forwarded-Host "${DOMAIN}"

    RewriteCond %{HTTP:Upgrade} =websocket [NC]
    RewriteRule /(.*)           ws://127.0.0.1:7860/$1 [P,L]

    ProxyPass /static !
    ProxyPass /manifest.json !
    ProxyPass /service-worker.js !
    ProxyPass / http://127.0.0.1:7860/
    ProxyPassReverse / http://127.0.0.1:7860/

    # =================================================
    # 3. SSL CONFIGURATION
    # =================================================
    # NOTE: Replace ${DOMAIN} with your actual FQDN in the cert paths
    SSLCertificateFile /etc/letsencrypt/live/${DOMAIN}/fullchain.pem
    SSLCertificateKeyFile /etc/letsencrypt/live/${DOMAIN}/privkey.pem
    Include /etc/letsencrypt/options-ssl-apache.conf

    LimitRequestBody 1048576000
    ProxyTimeout 600
    TimeOut 600
</VirtualHost>
</IfModule>

4.5 Enable Sites and Restart Apache

sudo a2ensite aitoolkit.conf
sudo a2ensite aitoolkit-le-ssl.conf
sudo apache2ctl configtest  # Should show "Syntax OK"
sudo systemctl restart apache2

⚙️ Step 5: Systemd Service

Create /etc/systemd/system/aitoolkit.service:

[Unit]
Description=Gradio App
After=network.target

[Service]
Type=simple
User=www-data
WorkingDirectory=${APP_DIR}
Environment="PATH=${VENV}/bin:/usr/local/bin:/usr/bin:/bin"
Environment="GRADIO_ANALYTICS_ENABLED=False"
Environment="GRADIO_SERVER_NAME=127.0.0.1"
EnvironmentFile=${APP_DIR}/.env
ExecStart=${VENV}/bin/python app.py
Restart=always
RestartSec=10

# Sandboxing — restrict what the service can do
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
ReadWritePaths=${APP_DIR}
ReadWritePaths=${STORAGE_MOUNT}

[Install]
WantedBy=multi-user.target

Note: Replace ${APP_DIR}, ${VENV}, and ${STORAGE_MOUNT} with your actual paths in the unit file. systemd does not expand shell variables — these are placeholders for documentation.

Key Points:

EnvironmentFile=${APP_DIR}/.env — loads API keys from .env (replaces load_dotenv for production)
User=www-data — runs as a non-root service account (never run as root in production)
NoNewPrivileges=true — prevents privilege escalation
ProtectSystem=strict — makes /usr, /boot, /etc read-only
ProtectHome=true — hides /home, /root, /run/user
PrivateTmp=true — isolates /tmp to a private namespace
ReadWritePaths= — only the app and storage directories are writable
Restart=always — auto-restart on crashes

Enable and start:

sudo systemctl daemon-reload
sudo systemctl enable aitoolkit
sudo systemctl start aitoolkit
sudo systemctl status aitoolkit

🛡️ Step 6: Fail2Ban Configuration

6.1 Create Local Configuration

sudo cp /etc/fail2ban/jail.conf /etc/fail2ban/jail.local
sudo nano /etc/fail2ban/jail.local

6.2 Enable Apache Protections

[sshd]
enabled = true

[apache-auth]
enabled = true

[apache-badbots]
enabled = true

[apache-noscript]
enabled = true

[apache-overflows]
enabled = true

[apache-404-scan]
enabled  = true
port     = http,https
filter   = apache-404-scan
logpath  = /var/log/apache2/*access.log
maxretry = 5
findtime = 600
bantime  = 3600

6.3 Restart Fail2Ban

sudo systemctl restart fail2ban
sudo fail2ban-client status

🔄 Step 7: Log Rotation

sudo nano /etc/logrotate.d/aitoolkit

Content (replace the path with your $APP_DIR):

# Replace with your $APP_DIR path:
/var/www/aitoolkit/app.log {
    daily
    missingok
    rotate 14
    compress
    delaycompress
    notifempty
    create 640 www-data www-data
    copytruncate
}

Note: Replace the log path and www-data with your $APP_DIR/app.log and $APP_USER respectively. Mode 640 ensures only the service account and its group can read the log.

Test:

sudo logrotate -d /etc/logrotate.d/aitoolkit

💻 Local Development

The app runs locally without any server configuration:

# Clone repo
git clone https://github.com/YOUR_USERNAME/YOUR_REPO.git
cd AIToolkit

# Install dependencies
pip install -r requirements.txt
pip install python-dotenv ddgs exchangelib

# Create .env with your API keys (see Step 3.3 for key names)
cp .env.example .env   # or create manually

# Run
python app.py
# → http://127.0.0.1:7860

Local vs VPS differences:

.env is loaded automatically via python-dotenv locally; on VPS it's loaded by systemd
Storage mounts to ./storage/ locally instead of $STORAGE_MOUNT
Encryption keys saved to ./.master_key locally instead of $APP_DIR/.master_key
Both directories are created automatically if missing

Reset local passwords (example usernames — replace with your own):

python -c "
import bcrypt, sqlite3
db = sqlite3.connect('aitoolkit.db')
# NOTE: 'admin123' and 'user123' are example usernames for local dev only.
# Replace with your actual usernames.
for user, pw in [('admin123', 'admin123'), ('user123', 'user123')]:
    h = bcrypt.hashpw(pw.encode(), bcrypt.gensalt(12)).decode()
    db.execute('UPDATE users SET password_hash=? WHERE username=?', (h, user))
    print(f'Reset {user}')
db.commit(); db.close()
"

Test web search standalone:

python tool_executor.py --query "latest Mistral AI news" --backend tavily
python tool_executor.py --query "..." --backend brave_web --provider Scaleway

Test Exchange CLI standalone:

python exchange_cli.py list --limit 5
python exchange_cli.py calendar-list
python exchange_cli.py contact-search "Mustermann"

Run tests:

# Full suite (76 tests + Exchange suite)
python -m pytest tests/ -v

# Just the gr.Blocks structural sentinel (catches dropped event bindings):
python -m pytest tests/test_app_blocks.py -v

# Skip the Exchange tests (need a real EWS server):
python -m pytest tests/ -v \
  --ignore=tests/test_exchange.py \
  --ignore=tests/test_exchange_calendar.py \
  --ignore=tests/test_exchange_contacts.py \
  --ignore=tests/test_integration.py

Lint / format:

# Medium+ severity rules (matches CI gate)
ruff check --select=F821,S105,S324,B904,B008,E711,E712,E722,RUF012,S607,S101 .

# Auto-format
ruff format .

CI: .github/workflows/ci.yml runs ruff (medium+ rule set), bandit (-ll), mypy on the typed-cluster modules, and the full pytest suite (excluding Exchange tests) on every push and PR.

📁 Final File Structure

$APP_DIR/
├── app.py                      # Top-level Blocks assembly + login/logout wiring
├── app_helpers.py              # Pure-Python helpers (auth, dropdowns, file-explorer paths)
│
│   ── Per-tab UI modules (Phase 7 split — each tab is a standalone module) ──
├── ui/
│   ├── __init__.py
│   ├── chat_tab.py             # 💬 Chat (provider/model + tools + history + attachments)
│   ├── transcription_tab.py    # 🎙️ Transkription (upload / Storage Box / URL download)
│   ├── vision_tab.py           # 👁️ Vision (image Q&A via vision LLMs)
│   ├── image_gen_tab.py        # 🎨 Bilderzeugung (text→image)
│   ├── ocr_tab.py              # 📄 OCR (Mistral / Groq / Vision LLM / OpenRouter)
│   ├── dictation_tab.py        # 🎙️ Diktieren (real-time speech-to-text streaming)
│   ├── doc_translator_tab.py   # 📄 Dokument-Übersetzer (DOCX/PPTX with format preservation)
│   ├── format_transplant_tab.py # 📋 Format-Transplant (DOCX layout transplant + LLM stylepass)
│   └── admin_tab.py            # 👥 Benutzerverwaltung (admin-only user management)
│   #  (📚 Verlauf & Verwaltung still inline in app.py — pending split)
│
│   ── Core ──────────────────────────────────────────────────────
├── config.py                   # Provider & key configuration
├── crypto_utils.py             # AES-256-GCM encryption & key wrapping
├── db_models.py                # SQLAlchemy models & schema migrations
├── db_ops.py                   # Database CRUD operations
│
│   ── Provider / Inference ──────────────────────────────────────
├── provider_utils.py           # LLM provider clients, compliance, routing
├── context_utils.py            # Token estimation, context pruning, chunking
├── image_utils.py              # Image encoding & resize helpers
├── image_gen_utils.py          # Image generation & vision analysis
│
│   ── Feature Modules ──────────────────────────────────────────
├── transcription_utils.py      # Transcription dispatcher + per-provider runners
├── transcription_youtube.py    # YouTube + URL download (split from transcription_utils)
├── storage_utils.py            # File storage, pCloud, per-user directories
├── chat_handlers.py            # Chat logic, UniversalExtractor, attachments
├── ocr_utils.py                # OCR engines (Mistral, Groq, OpenRouter, Ollama/GLM, Vision LLM)
├── translation_utils.py        # Async DOCX translation with progress streaming
├── export_utils.py             # Export to .docx / .md / .txt
│
│   ── Standalone Tools ─────────────────────────────────────────
├── tool_executor.py            # Tool-calling router: web search, Exchange, image gen, … (standalone-testable)
├── exchange_cli.py             # Microsoft Exchange EWS: mail, calendar, contacts (standalone CLI + LLM tools)
├── translator.py               # DOCX translation engine (NMT + LLM)
├── format_transplant.py        # DOCX format-transplant engine
├── dictation_manager.py        # Real-time dictation (DG, AAI, Gladia, Mistral, Whisper, CrispASR)
├── crispasr_tool.py            # CrispASR adapter: binary discovery, auto-clone, auto-build
├── fn2md.py                    # Function-to-markdown helper
│
│   ── Tests (76 unit/integration + Exchange suite) ────────────
├── tests/
│   ├── conftest.py                 # sys.path + env defaults shared across tests
│   ├── test_app_blocks.py          # gr.Blocks structural sentinel — catches dropped event bindings
│   ├── test_app_helpers.py         # Pure-helper unit tests (auth, file-explorer paths, dispatchers)
│   ├── test_provider_utils.py      # EU mode, restricted providers, role-based filtering
│   ├── test_crispasr_tool.py       # Binary path probing, build dirs, model search precedence
│   ├── test_crispasr_remote.py     # HTTP runner via `responses`, async dictation via `httpx.MockTransport`
│   ├── test_transcription_youtube.py # URL classifier, channel-id extraction, whitelist
│   ├── test_exchange.py            # Mail/folder/send tests (stubbed EWS)
│   ├── test_exchange_calendar.py   # Calendar CRUD, UID lookup, chunking (100 tests)
│   ├── test_exchange_contacts.py   # Contact search, GAL, suggested, mailbox, signatures
│   └── test_integration.py         # Credential encryption, modular account flow
│
│   ── Tracking docs ────────────────────────────────────────────
├── PLAN.md                     # Active optimization roadmap
├── HISTORY.md                  # Append-only log of completed work
├── LEARNINGS.md                # Durable insights (loaded into every Claude session)
│
│   ── Runtime ───────────────────────────────────────────────────
├── pyproject.toml              # ruff config (line-length, ignored rules)
├── requirements.txt            # Python dependencies
├── .env                        # API keys (NOT in repo — create manually)
├── .master_key                 # Global encryption key (CRITICAL — backup!)
├── .pq_keypair                 # Optional post-quantum keypair
├── aitoolkit.db           # SQLite database (auto-created)
├── app.log                     # Application logs
├── venv/                       # Python virtual environment (symlink to conda)
├── jobs/                       # Resume job manifests (auto-created)
├── generated_images/           # AI-generated images (auto-created)
├── storage/                    # Local fallback when Storage Box unavailable
└── static/                     # PWA assets (owned by $APP_USER)
    ├── custom.css
    ├── pwa.js
    ├── manifest.json
    ├── service-worker.js
    ├── icon-192.png
    └── icon-512.png

$STORAGE_MOUNT/                 # Hetzner Storage Box (mounted via CIFS)
└── users/[username]/           # Per-user storage

⚠️ Critical backup: Always back up .master_key together with aitoolkit.db. The database is encrypted — without the key, data is unrecoverable.

🔍 Troubleshooting

Service Not Starting

sudo systemctl status aitoolkit
sudo journalctl -u aitoolkit -f
tail -f "$APP_DIR/app.log"

# Common issues:
# - Missing .env → Create $APP_DIR/.env
# - Missing FFmpeg → sudo apt install ffmpeg
# - Port in use → sudo lsof -i :7860

API Keys Not Loading Locally

The app uses python-dotenv to load .env at startup. Verify:

python -c "from dotenv import load_dotenv; load_dotenv(); import os; print(os.environ.get('MISTRAL_API_KEY','MISSING')[:8])"

Web Search Not Working

# Test each backend directly
python tool_executor.py --query "test" --backend tavily
python tool_executor.py --query "test" --backend duckduckgo

# Check keys are loaded
python -c "from config import API_KEYS; print('TAVILY:', bool(API_KEYS.get('TAVILY')))"

Exchange Not Connecting

# Test CLI directly (shows detailed EWS errors)
python exchange_cli.py --debug list --limit 1

# Common issues:
# - Wrong server name → check EXCHANGE_SERVER (must be EWS host, not webmail URL)
# - Wrong username format → try both DOMAIN\user and user@company.de
# - Self-signed cert → add BaseProtocol.HTTP_ADAPTER_CLS override in exchange_cli.py
# - Exchange 2-year limit → handled automatically (720-day chunking)

OpenRouter 404 "No endpoints available"

This happens when your OpenRouter account privacy settings conflict with the selected model. The app automatically adds allow_fallbacks: true to all OpenRouter requests to mitigate this. If it persists, check your OpenRouter privacy settings at https://openrouter.ai/settings/privacy.

PWA Not Installing on Mobile

curl -I "https://$DOMAIN/manifest.json"
curl -I "https://$DOMAIN/service-worker.js"
curl -I "https://$DOMAIN/static/icon-192.png"
# All should return 200 OK

Storage Box Not Mounting

mount | grep storage
sudo mount -t cifs //u12345-sub1.your-storagebox.de/u12345-sub1 "$STORAGE_MOUNT" \
    -o credentials=/etc/cifs-credentials,uid=$(id -u $APP_USER),gid=$(id -g $APP_USER)

Apache Configuration Issues

sudo apache2ctl configtest
sudo tail -f /var/log/apache2/error.log
sudo systemctl restart apache2

📝 Maintenance

Update Application

cd "$APP_DIR"
git pull
source "$VENV/bin/activate"
pip install -r requirements.txt --upgrade
sudo systemctl restart aitoolkit

Backup Database & Keys

mkdir -p /var/backups/aitoolkit
DATE=$(date +%Y%m%d)
cp "$APP_DIR/aitoolkit.db" /var/backups/aitoolkit/db-$DATE.bak
cp "$APP_DIR/.master_key" /var/backups/aitoolkit/master_key-$DATE.bak
[ -f "$APP_DIR/.pq_keypair" ] && \
    cp "$APP_DIR/.pq_keypair" /var/backups/aitoolkit/pq_keypair-$DATE.bak

SSL Certificate Renewal

sudo certbot renew --dry-run

Monitor Disk Space

df -h "$STORAGE_MOUNT"
df -h "$APP_DIR"

🚀 Quick Reference

Service Management

sudo systemctl start aitoolkit
sudo systemctl stop aitoolkit
sudo systemctl restart aitoolkit
sudo systemctl status aitoolkit

View Logs

sudo journalctl -u aitoolkit -f
tail -f "$APP_DIR/app.log"
sudo tail -f /var/log/apache2/error.log

Critical Files

Application: $APP_DIR/app.py — UI & event wiring
Chat logic: $APP_DIR/chat_handlers.py
OCR engines: $APP_DIR/ocr_utils.py
Transcription: $APP_DIR/transcription_utils.py
Tool Router: $APP_DIR/tool_executor.py
Exchange CLI: $APP_DIR/exchange_cli.py
Provider Config: $APP_DIR/config.py
Environment: $APP_DIR/.env — secrets
Encryption Key: $APP_DIR/.master_key — backup this!
Service: /etc/systemd/system/aitoolkit.service
Apache SSL: /etc/apache2/sites-available/aitoolkit-le-ssl.conf

⚠️ Security Checklist

✅ Service runs as non-root user ($APP_USER, not root)
✅ systemd sandboxing enabled (NoNewPrivileges, ProtectSystem=strict, ProtectHome, PrivateTmp)
✅ Security headers configured (HSTS, X-Frame-Options DENY, X-Content-Type-Options nosniff, Referrer-Policy, Permissions-Policy)
✅ .env permissions set to 600
✅ .master_key mode 640, owned by root:$APP_USER — backup regularly with the DB
✅ Database files (*.db) mode 660
✅ /etc/cifs-credentials permissions set to 600
✅ Fail2Ban enabled and running
✅ SSL certificate valid and auto-renewing
✅ Firewall configured (only ports 22, 80, 443 open)
✅ Apache upload limits configured (1GB)
✅ Storage Box mounted with restricted permissions (owned by $APP_USER)
✅ EU-only mode enabled (EU_ONLY_MODE = True in config.py)
✅ Requesty configured with EU router endpoint (router.eu.requesty.ai)
✅ OpenRouter allow_fallbacks: true set to avoid data-policy 404s
✅ Web search keys stored in .env, never hardcoded
✅ Exchange credentials stored in .env or encrypted per-user in DB (never plaintext in code)
✅ Gradio API info/config endpoints blocked via Apache RewriteRules

Last Updated: June 2026 — Security hardening: non-root service account, systemd sandboxing, security headers (HSTS, X-Frame-Options, CSP), hardened file permissions, parameterized deployment paths

Name		Name	Last commit message	Last commit date
Latest commit History 372 Commits
.github/workflows		.github/workflows
pptx_engine		pptx_engine
providers		providers
static		static
tests		tests
tools		tools
ui		ui
.env.example		.env.example
.gitignore		.gitignore
.pylintrc		.pylintrc
HISTORY.md		HISTORY.md
LEARNINGS.md		LEARNINGS.md
LICENSE		LICENSE
PLAN.md		PLAN.md
PPTX_INTEGRATION.md		PPTX_INTEGRATION.md
README.md		README.md
TODO.md		TODO.md
app.py		app.py
app.py.pre-auth-migration		app.py.pre-auth-migration
app_helpers.py		app_helpers.py
chat_handlers.py		chat_handlers.py
config.py		config.py
context_utils.py		context_utils.py
crispasr_tool.py		crispasr_tool.py
crispembed_tool.py		crispembed_tool.py
crisplib_tool.py		crisplib_tool.py
crypto_utils.py		crypto_utils.py
db_models.py		db_models.py
db_ops.py		db_ops.py
deploy.sh		deploy.sh
dictation_manager.py		dictation_manager.py
exchange_cli.py		exchange_cli.py
export_utils.py		export_utils.py
fn2md.py		fn2md.py
format_transplant.py		format_transplant.py
gladia.py		gladia.py
gladia_handler.py		gladia_handler.py
image_gen_utils.py		image_gen_utils.py
image_utils.py		image_utils.py
migrate_db.py		migrate_db.py
migrate_master_key.py		migrate_master_key.py
ocr_utils.py		ocr_utils.py
pcloud_dl.py		pcloud_dl.py
pptx_tool.py		pptx_tool.py
provider_utils.py		provider_utils.py
pyproject.toml		pyproject.toml
requirements.lock		requirements.lock
requirements.txt		requirements.txt
service-worker.js.save		service-worker.js.save
ssrf_guard.py		ssrf_guard.py
storage_utils.py		storage_utils.py
test_pptx_tool.py		test_pptx_tool.py
tool_executor.py		tool_executor.py
tr_gemma.py		tr_gemma.py
transcription_audio.py		transcription_audio.py
transcription_jobs.py		transcription_jobs.py
transcription_runners.py		transcription_runners.py
transcription_utils.py		transcription_utils.py
transcription_youtube.py		transcription_youtube.py
translation_utils.py		translation_utils.py
translator-app.py		translator-app.py
translator-readme.md		translator-readme.md
translator.py		translator.py
tts_utils.py		tts_utils.py
ui_constants.py		ui_constants.py
yt_tk.py		yt_tk.py
yt_tk_vps.py		yt_tk_vps.py

Folders and files

Latest commit

History

Repository files navigation

AIToolkit - Deployment Guide

🌍 LLM Providers & EU Compliance

Requesty EU Router

Langdock

🔍 Web Search / Tool Calling

Available Tools

📚 Bibliographic Search (CrispLib Integration)

Capabilities

Installation

Per-user toggle

Standalone testing

🎙️ Local ASR (CrispASR Integration)

Backends exposed in the UI

Installation (fresh clone)

Remote HTTP server

Optional whisper model search paths

Standalone testing

📧 Microsoft Exchange Integration

Features

CLI Usage

Configuration

Installation

Testing

📋 Prerequisites

🔧 Deployment Variables

🛠️ Step 1: System Dependencies & Security

Firewall Configuration (UFW)

Configure Swap Space (prevent OOM crashes)

📦 Step 2: Storage Box Mounting (Hetzner)

2.1 Create Mount Point

2.2 Create Credentials File

2.3 Configure Auto-Mount via fstab

2.4 Mount and Verify

🐍 Step 3: Application Setup

3.1 Create Directory Structure

3.2 Clone Repository or Copy Files

3.3 Create Environment File

3.4 Create Virtual Environment (Miniconda Method)

3.5 Install Python Dependencies

3.6 Initialize Permissions

🌐 Step 4: Apache Configuration

4.1 Enable Required Modules

4.2 SSL Certificate Setup

4.3 HTTP Config (Port 80 - HTTPS Redirect)

4.4 HTTPS Config (Port 443 - Main Config)

4.5 Enable Sites and Restart Apache

⚙️ Step 5: Systemd Service

🛡️ Step 6: Fail2Ban Configuration

6.1 Create Local Configuration

6.2 Enable Apache Protections

6.3 Restart Fail2Ban

🔄 Step 7: Log Rotation

💻 Local Development

📁 Final File Structure

🔍 Troubleshooting

Service Not Starting

API Keys Not Loading Locally

Web Search Not Working

Exchange Not Connecting

OpenRouter 404 "No endpoints available"

PWA Not Installing on Mobile

Storage Box Not Mounting

Apache Configuration Issues

📝 Maintenance

Update Application

Backup Database & Keys

SSL Certificate Renewal

Monitor Disk Space

🚀 Quick Reference

Service Management

View Logs

Critical Files

⚠️ Security Checklist

About

Topics

Resources

Packages