This documentation covers the deployment of a (mostly) GDPR-compliant Multi-Provider AI Suite with:
- 💬 Chat (Multiple LLM providers with context-aware history, tool calling, web search)
- 🔍 Web Search (Agentic tool-calling loop via Tavily, Brave Search, DuckDuckGo, or self-hosted SearXNG)
- 📧 Microsoft Exchange (E-Mail, Calendar, Contacts CRUD, via EWS)
- 🎙️ Audio Transcription (Gladia, Whisper, Mistral, Deepgram, AssemblyAI, CrispASR 26+ backends; Speaker Diarization; Chunking; real-time Dictation; Audio Q&A; punctuation/truecasing/hotwords)
- 🔊 Text-to-Speech (20+ TTS engines via CrispASR: Kokoro, Qwen3-TTS, Orpheus, Chatterbox, Piper, etc.; voice cloning)
- 📄 Content Extraction for uploads (text extraction from PPTX, DOCX, XLSX, PDF; OCR for images/scans) and OCR (Mistral OCR, CrispEmbed DBNet+TrOCR, local Tesseract, Vision LLMs)
- 🔤 Translation (NMT + LLM-based DOCX translation via
translator.py; offline text translation via CrispASR M2M100/MADLAD 419 languages) - 🗣️ Language Detection (text LID via CLD3/GlotLID, audio LID via CrispASR)
- 🔎 Semantic Search (transcription + image history search via CrispEmbed embeddings)
- 🔍 AI Watermark Detection (AudioSeal via CrispASR)
- 📊 Presentation Generation (AI-powered PPTX creation via LLM +
pptx_engine) - 📋 Format-Transplant (Apply formatting/styles of one DOCX to the content of another)
- 📦 Cloud Storage (Hetzner Storage Box integration via SMB)
- 🔄 Resumable Jobs (Job tracking for large files)
- 👁️ Vision Analysis (Image understanding)
- 🎨 Image Generation (e.g. FLUX models via Nebius and directly per Black Forest Labs API)
- 💾 Database Integration (SQLite with per-user AES-256-GCM encryption)
- 📱 PWA Support (Progressive Web App for mobile installation)
- 🛡️ Security (Gradio
auth=with httpOnly cookies, Fail2Ban, EU-only mode, per-user AES key wrapping, systemd sandboxing, security headers)
The application runs as a Python/Gradio service on localhost:7860, managed by systemd, and exposed via Apache reverse proxy with SSL.
The suite supports multiple LLM providers. EU_ONLY_MODE = True in config.py restricts which providers regular users can access. Admins and Medienverwalter bypass this restriction.
| Provider | EU Status | Notes |
|---|---|---|
| Mistral | 🇫🇷 EU (Paris) | Chat, Vision, OCR, Audio |
| Scaleway | 🇫🇷 EU (Paris) | Chat, Vision, Transcription |
| Nebius | 🇳🇱 EU (Amsterdam) | Chat, Vision, Image Generation |
| Gladia | 🇫🇷 EU | Transcription |
| Deepgram | 🇪🇺 EU endpoint | Transcription (api.eu.deepgram.com) |
| AssemblyAI | 🇪🇺 EU endpoint | Transcription (api.eu.assemblyai.com) |
| BFL | 🇩🇪 EU (Germany) | Image Generation (FLUX) |
| Requesty | 🇪🇺 EU router | Chat — EU-filtered (see below) |
| Langdock | 🇩🇪 EU (Hamburg, Azure) | Chat — DSGVO-konform, OpenAI-compatible |
| OpenRouter | 🇺🇸 US | Chat, Vision, Image — restricted in EU mode |
| Groq | 🇺🇸 US | Chat, Transcription — restricted in EU mode |
| Poe | 🇺🇸 US | Chat, Vision — restricted in EU mode |
Requesty (router.eu.requesty.ai) is a model aggregator routing to 20+ backend providers. The suite applies client-side EU filtering based on model ID prefix and @region suffix:
- EU-native (always allowed):
nebius/…,mistral/… - EU-regional (allowed only with explicit EU region in model ID):
azure/model@swedencentral,azure/model@francecentral,azure/model@uksouthbedrock/model@eu-central-1,bedrock/model@eu-west-1, etc.vertex/model@europe-west1,vertex/model@europe-central2, etc.
- Blocked:
openai/…,anthropic/…,novita/…,groq/…,xai/…,moonshot/…,deepseek/…,alibaba/…, and others
Role-based access: Admins and Medienverwalter see all 430+ Requesty models (incl. non-EU). Regular users see only the ~117 EU-routed models.
Langdock is a German company (Hamburg) providing an OpenAI-compatible API backed by EU Azure infrastructure, fully DSGVO-compliant.
- All models served by Langdock are chat models (no image-gen, no embeddings)
- No additional EU filtering required — all models are on EU Azure by definition
- Use Fetch Models to always get the live list from Langdock's API
The chat supports an agentic tool-calling loop via tool_executor.py. When enabled, the model can call tools mid-conversation and use the results to ground its final answer.
Supported providers for tool calling: Mistral, Scaleway, Nebius, OpenRouter
| Tool | Description |
|---|---|
web_search |
Real-time web search (Tavily, Brave, DuckDuckGo) |
fetch_url |
Fetch and read text content of a URL |
calculate |
Precise math via sympy (algebra, calculus, matrices, …) |
python_exec |
Execute Python 3 code in a sandboxed subprocess |
shell |
BusyBox shell in persistent sandbox (optional, Mayflower) |
generate_image |
AI image generation via FLUX models (Nebius / BFL) |
analyze_image_url |
Vision analysis of an image at a URL |
exchange_list |
Search and list Exchange emails |
exchange_show |
Show full content of a specific email |
exchange_send |
Send email or save as draft (To/CC/BCC/importance/HTML) |
exchange_folders |
View Exchange mailbox folder hierarchy |
exchange_calendar_list |
List calendar events (past, future, all-future, by type) |
exchange_calendar_show |
Full event details by UID or index |
exchange_calendar_add |
Create events / send meeting invitations |
exchange_calendar_edit |
Edit events or recurring series (UID-based) |
exchange_contact_search |
Search contacts: personal, GAL, autocomplete cache, mailbox mining |
exchange_contact_show |
Full contact details + optional email-signature extraction |
exchange_contact_create |
Create a new Exchange contact |
exchange_contact_edit |
Edit an existing Exchange contact by UID |
library_search |
Search bibliographic catalogs (DNB, LoC, BnF, ZDB, IxTheo) — CrispLib |
resolve_identifier |
Resolve DOI / PMID / ISBN / URL → citation metadata — CrispLib |
list_library_endpoints |
List the available bibliographic endpoints — CrispLib |
Search backends (selectable in the ⚙️ Konfiguration tab):
| Backend | Key Required | Notes |
|---|---|---|
tavily |
TAVILY_API_KEY |
Default. AI-optimised results, free 1 000 req/month |
brave_web |
BRAVE_SEARCH_API_KEY |
Web snippets, own index, EU-friendly, ~1 000 free/month via $5 credit |
brave_llm |
BRAVE_SEARCH_API_KEY |
Pre-extracted text chunks optimised for LLM grounding / RAG |
brave_answers |
BRAVE_ANSWERS_API_KEY |
AI-generated answer with citations (separate Brave subscription) |
brave_images |
BRAVE_SEARCH_API_KEY |
Image search |
duckduckgo |
— | No key needed, rate-limited, production fallback |
searxng |
SEARXNG_URL |
Self-hosted meta-search (future option) |
Fallback chain: if the primary backend fails, the router automatically tries duckduckgo before surfacing an error.
Options exposed per search call: freshness (24h/7d/31d/365d), country, search_lang, safesearch, extra_snippets, max_tokens (LLM context), enable_citations (Answers).
The tool_executor.py module is standalone-testable without running the full app:
python tool_executor.py --query "What is the latest news about Mistral AI?" --backend tavily
python tool_executor.py --query "..." --backend brave_web --provider Scaleway --model mistral-small-3.2-24b-instruct-2506Three additional tools (library_search, resolve_identifier, list_library_endpoints) are provided by CrispLib, a sibling repo. They let the LLM search academic library catalogs and resolve known identifiers to structured citation metadata.
library_search— search SRU (DNB, LoC, BnF, ZDB) and IxTheo (Index Theologicus) bytitle,author,isbn,subject,year, or free-text. Returns up to 25 normalized records (title, authors, year, publisher, journal, volume, ISBN, DOI, …).resolve_identifier— DOI → Crossref CSL-JSON · PMID → NCBI esummary · ISBN → Open Library · URL → Wikipedia Citoid. Auto-detects type. Pure-stdlib +requests, no heavy deps.list_library_endpoints— discovery so the model can pick the right catalog before callinglibrary_search.
CrispLib is loaded via sys.path injection rather than vendored, so upstream fixes propagate automatically. Two options:
# Option A — sibling clone (default)
cd "$(dirname "$APP_DIR")" # parent of your install directory
git clone https://github.com/CrispStrobe/CrispLib.git
# Option B — custom location
export CRISPLIB_PATH=/opt/CrispLibThe adapter crisplib_tool.py looks for ../CrispLib relative to AIToolkit, or honours CRISPLIB_PATH if set. If CrispLib is missing, the three tools simply don't register and the rest of the app keeps working.
Hardening: defusedxml>=0.7.1 is now in requirements.txt — CrispLib's SRU/OAI-PMH parsers prefer it over stdlib xml.etree.ElementTree to avoid XXE / billion-laughs / quadratic-blowup attacks against untrusted catalog responses. The stdlib path remains as a fallback if defusedxml is unavailable.
The Konfiguration tab has a 📚 Bibliothek (CrispLib) checkbox. Enabling it adds all three CrispLib tools to that user's enabled_tools list, persisted in UserSettings.tool_preferences_json. Default is off — opt-in per user. To make it on-by-default for new users, append the tool names to DEFAULT_TOOLS_STANDARD in tool_executor.py.
# Resolve a DOI
python crisplib_tool.py --resolve 10.1038/nature12373
# Search DNB for a title
python crisplib_tool.py --search-title "Python" --endpoint dnb
# List endpoints
python crisplib_tool.py --list
# CrispLib's own resolver (no AIToolkit needed)
cd ../CrispLib && python identifier_resolver.py 9783658310844 isbnOffline, API-key-free audio transcription is powered by CrispASR, a sibling repo — one C++ binary, 22+ ASR model families, zero Python dependencies. AIToolkit talks to CrispASR via two paths:
- Local binary (
CrispASRprovider) — spawns thecrispasrCLI directly. Works offline, no network. - Remote HTTP (
CrispASR-HTTPprovider) — POSTs to any OpenAI-compatible HTTP server (/v1/audio/transcriptions,/load,/health). Works against CrispASR's own C++ server (examples/server), the Gradio HF space, or a custom FastAPI/Express wrapper. Lets a fleet share one GPU host.
Both paths share the same backend list and accept the same lang / diarize options.
The provider dropdown lists 22 backends (kept in sync with CrispASR/README.md). Models marked auto-DL are fetched by the binary on first use into ~/.cache/crispasr/.
| Backend | Notes |
|---|---|
| parakeet | NVIDIA Parakeet-TDT-0.6B — fast, multilingual, auto-DL (~467 MB) |
| whisper | OpenAI Whisper — 99 languages, needs a ggml-*.bin in CrispASR/models/ |
| canary | NVIDIA Canary-1B — explicit language + speech translation, auto-DL |
| cohere | Cohere Transcribe — lowest WER, auto-DL |
| qwen3 | Qwen3-ASR-0.6B — 30+ languages + Chinese dialects, auto-DL |
| voxtral / voxtral4b | Mistral Voxtral-Mini speech-LLM (3B / 4B realtime), auto-DL |
| granite / granite-4.1 / granite-4.1-plus / granite-4.1-nar | IBM Granite Speech (Apache-2.0); 4.1 variants run encoder as a single ggml graph (~2.5× faster on M1) |
| fastconformer-ctc | NeMo FastConformer + CTC, lightweight |
| wav2vec2 | Wav2Vec2 CTC — single-language, manual GGUF needed |
| moonshine / moonshine-streaming | Edge-targeted, sliding-window streaming variant |
| omniasr | wav2vec2-style + 24–48L transformer + CTC, 1600+ languages |
| firered-asr | Mandarin + 20 Chinese dialects, includes LID |
| glm-asr | GLM-ASR-Nano — Mandarin / English / Cantonese (17 langs) |
| kyutai-stt | Kyutai Mimi codec + causal LM (en, fr) |
| mimo-asr | Xiaomi MiMo — Mandarin Wu/Cantonese/Hokkien/Sichuanese + EN code-switch |
| vibevoice | Microsoft VibeVoice-ASR — diarization + hotwords (50+ langs) |
| gemma4-e2b | Google Gemma-4 E2B — USM Conformer + Gemma decoder, 140+ langs |
Select CrispASR (local binary) or CrispASR-HTTP (remote server) as the transcription engine, then pick a backend from the Modell dropdown.
CrispASR is compiled on first use — no manual setup required:
# Option A — sibling clone (default, auto-detected)
cd "$(dirname "$APP_DIR")" # parent of your install directory
git clone https://github.com/CrispStrobe/CrispASR.git
# The next time a CrispASR transcription is started, AIToolkit will
# cmake-configure and build the binary automatically (requires cmake + C++17).
# The build helper prefers the Ninja generator + `crispasr` target into
# build-ninja-compile/ and falls back to make + whisper-cli into build/.
# Option B — custom location
export CRISPASR_PATH=/opt/CrispASRThe adapter crispasr_tool.py looks for ../CrispASR relative to AIToolkit, or honours CRISPASR_PATH if set. It probes both build-ninja-compile/bin/ (the active upstream layout) and the legacy build/bin/ for an existing binary. If neither source nor binary is found, AIToolkit attempts a git clone of the repo and then a cmake build. If the build fails (missing cmake, missing compiler), the CrispASR engine is simply absent from the provider list and all other engines keep working.
Build prerequisites:
# Ubuntu / Debian
sudo apt install cmake g++ build-essential ninja-build
# macOS (Homebrew)
brew install cmake ninja
# Xcode CLT already provide clang
# Windows
# Install Visual Studio Build Tools ≥ 2022 + CMake from cmake.orgOptional GPU acceleration:
cmake -B build-ninja-compile -G Ninja -DCMAKE_BUILD_TYPE=Release -DGGML_CUDA=ON # NVIDIA
cmake -B build-ninja-compile -G Ninja -DCMAKE_BUILD_TYPE=Release -DGGML_METAL=ON # Apple SiliconTo use the CrispASR-HTTP provider, point AIToolkit at any OpenAI-compatible CrispASR server. Out of the box, that's:
CrispASR/examples/server— CrispASR's own C++ cpp-httplib server (built with thecrispasr-servertarget).CrispASR/hf-space/— the Gradio HuggingFace space wrapper.- Any custom FastAPI / Express wrapper that proxies to the CrispASR binary.
The server must expose POST /v1/audio/transcriptions, POST /load, and GET /health.
# In .env or systemd EnvironmentFile
CRISPASR_REMOTE_HTTP_URL=https://asr.example.com
CRISPASR_REMOTE_HTTP_KEY=<bearer-token-or-empty>Per-request, AIToolkit:
- Pings
/healthand aborts fast on failure. - Calls
/load(idempotent) to switch the server to the requested backend. - Transcodes non-native audio (
.m4a/.mp4/.webm/…) to 16 kHz mono PCM via ffmpeg. - Uploads the file to
/v1/audio/transcriptionswithresponse_format=verbose_jsonand surfaces speaker labels / per-segment timestamps when the server provides them.
For the whisper backend AIToolkit looks for a ggml-*.bin in (first match wins):
$CRISPASR_WHISPER_MODEL— explicit override<crispasr_root>/models//Volumes/backups/ai/crispasr-models/— only if the volume is mounted (matches CrispASR's CLAUDE.md)~/.cache/crispasr/— the binary's own auto-DL cache
# Verify adapter + auto-build
python crispasr_tool.py
# Test a transcription (binary must be built first)
../CrispASR/build-ninja-compile/bin/crispasr --backend parakeet -m auto -f /tmp/test.wavThe suite provides direct Exchange Web Services (EWS) access via exchange_cli.py (using exchangelib). It works both as a standalone CLI tool and as a set of LLM-callable tools in the chat interface.
- List, filter, and search inbox (by sender, subject, body, date, read status, regex)
- Read full email content
- Send emails: To, CC, BCC, Reply-To, importance (Low/Normal/High), plain text or HTML body
- Save as draft instead of sending
- Move emails between folders
- Browse folder hierarchy (incl. hidden/technical folders)
Kalender
- List events: upcoming, past, all-future (up to 10 years), by type (meeting/appointment/recurring/allday)
- Show full event details: attendees, description, location, OLX plugin fields, recurrence pattern
- Create events with attendees (meeting invitations via EWS)
- Edit events: date/time, subject, description, location, categories, attendees, OLX fields
- Recurring series support: edit single occurrence or entire series master
- UID-based lookup (stable across pagination/filter changes)
- Automatic chunking of Exchange's 2-year view limit (720-day windows, transparent to the caller)
Kontakte
- Search across multiple sources:
personal— all sub-folders of the personal contacts foldergal— Global Address List (Active Directory) viaresolve_namessuggested— Outlook autocomplete cache:Recipient Cache,Suggested Contacts,AllContacts,RelevantContacts,Contact Search,MyContactsExtended,GAL Contacts,Organizational Contactsmailbox— mine inbox/sent for unique senders and recipientsall(default) — personal + GAL + suggestedfull— all sources including mailbox scan
- Show full contact details: phones, emails, physical addresses, notes, categories
- Signature intelligence: scan inbox/sent for the contact's emails and extract phone, title, website from signature blocks
- Create and edit contacts (name, company, department, job title, emails, phones, addresses, notes, categories)
- Deduplication by item ID across sources
# E-Mail
python exchange_cli.py list --limit 20 --unread
python exchange_cli.py list --sender boss@company.de --start 2026-01-01
python exchange_cli.py show 3
python exchange_cli.py send --to colleague@company.de --subject "Meeting" --body "Hi..." \
--cc cc@company.de --importance High
python exchange_cli.py send --to draft@company.de --subject "Draft" --body "..." --draft
python exchange_cli.py folders
# Kalender
python exchange_cli.py calendar-list --limit 20
python exchange_cli.py calendar-list --start 2026-04-01 --end 2026-06-30
python exchange_cli.py calendar-list --all-future --limit 50
python exchange_cli.py calendar-list --past --limit 10
python exchange_cli.py calendar-show --uid <UID>
python exchange_cli.py calendar-show 2 --start 2026-04-01
python exchange_cli.py calendar-add --subject "Team Meeting" --start "2026-05-01 10:00" \
--end "2026-05-01 11:00" --attendees a@x.de,b@x.de --location "Raum 3"
python exchange_cli.py calendar-edit --uid <UID> --new-start "2026-05-02 10:00" \
--new-end "2026-05-02 11:00" --send-updates
# Kontakte
python exchange_cli.py contact-search "Müller"
python exchange_cli.py contact-search "Dominic" --source suggested
python exchange_cli.py contact-search "sales" --source full --limit 50
python exchange_cli.py contact-show 1 --query "Müller"
python exchange_cli.py contact-show --uid <UID> --with-signatures
python exchange_cli.py contact-create "Max Mustermann" --company "ACME" \
--email-primary max@acme.de --phone-business "+49 711 123456"
python exchange_cli.py contact-edit <UID> --job-title "Senior Developer"Exchange credentials are read from environment variables (.env file or systemd EnvironmentFile):
EXCHANGE_SERVER=mail.company.de
EXCHANGE_USER=DOMAIN\username # or UPN: user@company.de
EXCHANGE_PASSWORD=secret
EXCHANGE_EMAIL=user@company.de # primary SMTP address (sender address)Per-user encrypted credentials can be stored in the database (AES-256-GCM with per-user key wrapping) and passed at runtime, overriding the global env vars.
pip install exchangelibexchangelib is the only additional dependency. The rest of the Exchange module uses stdlib only.
python -m pytest tests/test_exchange.py tests/test_exchange_calendar.py tests/test_exchange_contacts.py -vAll tests use stubs and do not require a real Exchange server.
- OS: Ubuntu 20.04 LTS or newer
- RAM: Minimum 8GB (for handling audio files)
- Root/Sudo Access
- Domain: DNS A-Record pointing to server IP (e.g.,
ai.yourdomain.de) - Storage: Hetzner Storage Box with sub-account access
- API Keys (stored in
.envfile):MISTRAL_API_KEY— multipurpose: chat, OCR, vision, audioSCALEWAY_API_KEY— chat, transcriptionNEBIUS_API_KEY— chat, image generationGLADIA_API_KEY— long-form transcriptionDEEPGRAM_API_KEY— transcription, EU endpointASSEMBLYAI_API_KEY— transcription, EU endpointGROQ_API_KEY— transcription, chat (US, restricted in EU mode)REQUESTY_API_KEY— EU router: chat across many providersLANGDOCK_API_KEY— chat, DSGVO-konform, German company, EU AzureOPENROUTER_API_KEY— optional, additional models (US, restricted in EU mode)BFL_API_KEY— optional, FLUX image generationPOE_API_KEY— optional, additional models (US, restricted in EU mode)TAVILY_API_KEY— web search, free 1 000 req/monthBRAVE_SEARCH_API_KEY— web search: snippets, LLM context, imagesBRAVE_ANSWERS_API_KEY— web search: AI-generated answers (separate subscription)SEARXNG_URL— optional, self-hosted SearXNG base URL (e.g.http://localhost:8888)EXCHANGE_SERVER— Exchange server hostname (for Exchange integration)EXCHANGE_USER— Exchange username (DOMAIN\useror UPN)EXCHANGE_PASSWORD— Exchange passwordEXCHANGE_EMAIL— Exchange primary SMTP address
Define these variables before following the deployment steps. All bash blocks below reference them:
APP_DIR=/var/www/aitoolkit # ← change to your install path
APP_USER=www-data # ← service account (non-root)
DOMAIN=ai.yourdomain.de # ← your FQDN
STORAGE_MOUNT=/mnt/storage # ← Storage Box mount point
VENV=$APP_DIR/venvCRITICAL: FFmpeg must be in PATH. cifs-utils is required for Storage Box. OCR and document tools are required for the Content Extractor.
sudo apt update
sudo apt upgrade -y
# Install all required system packages
sudo apt install -y \
wget \
python3-pip \
ffmpeg \
apache2 \
certbot \
python3-certbot-apache \
sqlite3 \
fail2ban \
cifs-utils \
tesseract-ocr \
tesseract-ocr-deu \
poppler-utils \
pandoc \
cmake \
g++ \
build-essential \
gitCrispASR:
cmake,g++,build-essential, andgitare required to auto-build the local ASR binary. If omitted, CrispASR is silently unavailable while all cloud transcription engines keep working.
CRITICAL: You must allow SSH before enabling the firewall, or you will lock yourself out.
# 1. Allow incoming SSH connections
sudo ufw allow OpenSSH
# 2. Allow Web Traffic (HTTP/HTTPS)
sudo ufw allow 'Apache Full'
# 3. Enable the Firewall
sudo ufw enable
# 4. Verify Status
sudo ufw statussudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstabVerify critical installations:
which ffmpeg # Must show: /usr/bin/ffmpeg
sudo systemctl status fail2ban # Must be: active (running)sudo mkdir -p "$STORAGE_MOUNT"sudo nano /etc/cifs-credentialsContent:
username=u12345-sub1
password=YOUR_SUBACCOUNT_PASSWORDsudo chmod 600 /etc/cifs-credentialssudo nano /etc/fstabAdd (single line):
//u12345-sub1.your-storagebox.de/u12345-sub1 /mnt/storage cifs credentials=/etc/cifs-credentials,uid=$(id -u www-data),gid=$(id -g www-data),file_mode=0770,dir_mode=0770,nounix,vers=3.0,x-systemd.automount,x-systemd.idle-timeout=60 0 0
Note: Replace
www-datawith your$APP_USERif different. Theuid/gidvalues ensure the service account owns mounted files. Useid -u $APP_USERandid -g $APP_USERto look up the numeric IDs, then paste those literal numbers into fstab (shell expansions do not work in fstab). Replace/mnt/storagewith your$STORAGE_MOUNTpath.
sudo systemctl daemon-reload
sudo systemctl restart remote-fs.target
ls -la "$STORAGE_MOUNT" # Should list files from Storage Boxsudo mkdir -p "$APP_DIR"
sudo mkdir -p "$APP_DIR/static"
sudo mkdir -p "$APP_DIR/generated_images"
sudo mkdir -p "$APP_DIR/jobs"
cd "$APP_DIR"Option A: Clone from GitHub
git clone https://github.com/YOUR_USERNAME/YOUR_REPO.git .
# Optional sibling repos (auto-detected by path, no extra config needed):
# CrispLib — bibliographic search tools
git clone https://github.com/CrispStrobe/CrispLib.git ../CrispLib
# CrispASR — offline/local ASR (auto-built on first use if cmake is available)
git clone https://github.com/CrispStrobe/CrispASR.git ../CrispASROption B: Manual File Upload
Upload these files:
app.py— Top-level Gradio Blocks assembly + login/logout wiring (now ~2 900 lines, was ~6 800 before the Phase 7 per-tab split)app_helpers.py— Pure-Python helpers (login, file-explorer paths, provider dropdown updaters)config.py— Central provider & key configurationcrypto_utils.py— Encryption logic & key wrappingdb_models.py— SQLAlchemy models & schema migrationsdb_ops.py— Database CRUD operationsprovider_utils.py— LLM provider clients, compliance, routingcontext_utils.py— Token estimation, context pruning, chunkingimage_utils.py— Image encoding & resize helpersimage_gen_utils.py— Image generation & vision analysistranscription_utils.py— Transcription dispatcher + per-provider runnerstranscription_youtube.py— YouTube + generic-URL audio download (split out from transcription_utils)storage_utils.py— File storage, pCloud, per-user directorieschat_handlers.py— Chat logic, content extraction (UniversalExtractor), attachmentsocr_utils.py— OCR engines (Mistral, Groq, OpenRouter, Ollama/GLM, Vision LLM)translation_utils.py— Async DOCX translation with progress streamingexport_utils.py— Export to .docx / .md / .txttool_executor.py— Tool-calling router (web search, Exchange, image gen, …)exchange_cli.py— Microsoft Exchange integration (EWS: mail, calendar, contacts)translator.py— DOCX translation engine (NMT + LLM)format_transplant.py— DOCX format-transplant enginedictation_manager.py— Real-time dictation (Deepgram, AssemblyAI, Gladia, Mistral Realtime, Faster-Whisper, CrispASR local + HTTP)crispasr_tool.py— CrispASR adapter: binary discovery, auto-clone, auto-buildui/package — Per-tab Gradio modules. Eachui/<tab>_tab.pyexposes abuild_<tab>_tab(session_state)function called fromapp.py. Currently 9 modules:chat_tab,transcription_tab,vision_tab,image_gen_tab,ocr_tab,dictation_tab,doc_translator_tab,format_transplant_tab,admin_tab.tests/— pytest suite (76 tests). Run withpython -m pytest tests/.pyproject.toml— ruff config (line length, ignored rules) + Python tooling.requirements.txt— Python dependenciesstatic/folder:custom.css,manifest.json,pwa.js,service-worker.js,icon-192.png,icon-512.png
CRITICAL: Store all API keys in a secured .env file.
sudo nano "$APP_DIR/.env"Required content:
# LLM Providers
MISTRAL_API_KEY=...
SCALEWAY_API_KEY=...
NEBIUS_API_KEY=...
OPENROUTER_API_KEY=...
REQUESTY_API_KEY=rqsty-sk-...
LANGDOCK_API_KEY=sk-...
# Transcription
GLADIA_API_KEY=...
DEEPGRAM_API_KEY=...
ASSEMBLYAI_API_KEY=...
GROQ_API_KEY=gsk_...
# CrispASR remote HTTP server (optional — local binary works without these)
CRISPASR_REMOTE_HTTP_URL=https://asr.example.com
CRISPASR_REMOTE_HTTP_KEY=...
# Image Generation
BFL_API_KEY=...
# Web Search (tool calling)
TAVILY_API_KEY=tvly-... # Free 1 000 req/month — https://tavily.com
BRAVE_SEARCH_API_KEY=BSA... # ~1 000 free/month via $5 credit — https://brave.com/search/api/
BRAVE_ANSWERS_API_KEY=BSA... # Separate Brave Answers subscription
# SEARXNG_URL=http://localhost:8888 # Optional: self-hosted SearXNG
# Microsoft Exchange (EWS)
EXCHANGE_SERVER=mail.company.de
EXCHANGE_USER=DOMAIN\username # or UPN: user@company.de
EXCHANGE_PASSWORD=...
EXCHANGE_EMAIL=user@company.de
# Optional / US providers (restricted in EU mode)
POE_API_KEY=...
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
# Misc
GRADIO_ANALYTICS_ENABLED=FalseSecure it:
sudo chmod 600 "$APP_DIR/.env"Local development: The app automatically loads
.envviapython-dotenvwhen running locally (python app.py). On the VPS the env vars are loaded by systemd'sEnvironmentFile=directive instead.
# 1. Download and Install Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p /opt/miniconda
# 2. Initialize Conda
/opt/miniconda/bin/conda init bash
source ~/.bashrc
# Accept TOS
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/main
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/r
# 3. Create the 'ak_suite' environment with Python 3.10
/opt/miniconda/bin/conda create -n ak_suite python=3.10 -y
# 4. Link it to the app directory
cd "$APP_DIR"
rm -rf venv
ln -s /opt/miniconda/envs/ak_suite ./venv
conda activate ak_suiteconda activate ak_suite
pip install --upgrade pip wheel
pip install -r "$APP_DIR/requirements.txt"Optional but recommended:
pip install pqcrypto # Post-quantum encryption (Kyber-512)Alternatively, for single installs without activating the environment:
$VENV/bin/pip install [package]sudo chown -R $APP_USER:$APP_USER "$APP_DIR"
sudo chmod 600 "$APP_DIR/.env"
sudo chmod 640 "$APP_DIR/.master_key"
sudo chmod 660 "$APP_DIR"/*.db
sudo touch "$APP_DIR/app.log"
sudo chown $APP_USER:$APP_USER "$APP_DIR/app.log"
sudo chmod 640 "$APP_DIR/app.log"Note: Do not use
chmod -R 755orchmod 666on logs. Restrictive defaults prevent information disclosure and accidental writes.
This configuration serves PWA files directly via Apache and proxies the Gradio app with proper HTTPS headers.
sudo a2enmod proxy proxy_http proxy_wstunnel rewrite headers sslsudo certbot --apache -d "$DOMAIN"Edit /etc/apache2/sites-available/aitoolkit.conf:
<VirtualHost *:80>
ServerName ${DOMAIN}
ErrorLog ${APACHE_LOG_DIR}/aitoolkit_error.log
CustomLog ${APACHE_LOG_DIR}/aitoolkit_access.log combined
RewriteEngine On
RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,NE,R=permanent]
</VirtualHost>Note: Replace
${DOMAIN}with your actual FQDN in the Apache config files. Apache does not expand shell variables — the${DOMAIN}notation here is a placeholder for documentation.
Edit /etc/apache2/sites-available/aitoolkit-le-ssl.conf:
<IfModule mod_ssl.c>
<VirtualHost *:443>
ServerName ${DOMAIN}
# =================================================
# SECURITY HEADERS
# =================================================
Header always unset Server
Header always unset X-Powered-By
Header unset Server
Header unset X-Powered-By
# HSTS — enforce HTTPS for 1 year, include subdomains
Header always set Strict-Transport-Security "max-age=31536000; includeSubDomains"
# Prevent clickjacking
Header always set X-Frame-Options "DENY"
# Prevent MIME-type sniffing
Header always set X-Content-Type-Options "nosniff"
# Referrer policy — send origin only on cross-origin
Header always set Referrer-Policy "strict-origin-when-cross-origin"
# Permissions policy — disable unused browser features
Header always set Permissions-Policy "camera=(), microphone=(self), geolocation=(), payment=()"
# =================================================
# BLOCK OPENAPI & SENSITIVE ENDPOINTS - FIRST PRIORITY
# =================================================
RewriteEngine On
RewriteRule ^/openapi\.json$ - [F,L]
RewriteRule ^/docs/?$ - [F,L]
RewriteRule ^/redoc/?$ - [F,L]
RewriteRule ^/api(/.*)?$ - [F,L]
RewriteRule ^/gradio_api/openapi\.json$ - [F,L]
# Block Gradio info/config endpoints (prevent app metadata leakage)
RewriteRule ^/gradio_api/info$ - [F,L]
RewriteRule ^/config$ - [F,L]
# =================================================
# 1. STATIC FILES (Served by Apache)
# =================================================
# NOTE: Replace ${APP_DIR} below with your actual install path
Alias /static ${APP_DIR}/static
<Directory ${APP_DIR}/static>
Require all granted
Options -Indexes
AddType text/css .css
AddType application/javascript .js
AddType image/png .png
Header set Cache-Control "public, max-age=31536000, immutable"
</Directory>
Alias /manifest.json ${APP_DIR}/static/manifest.json
<Files "manifest.json">
Require all granted
Header set Content-Type "application/manifest+json"
Header set Cache-Control "no-cache"
</Files>
Alias /service-worker.js ${APP_DIR}/static/service-worker.js
<Files "service-worker.js">
Require all granted
Header set Content-Type "application/javascript"
Header set Cache-Control "no-cache"
Header set Service-Worker-Allowed "/"
</Files>
# =================================================
# 2. PROXY SETTINGS (Gradio)
# =================================================
ProxyPreserveHost On
RequestHeader set X-Forwarded-Proto "https"
RequestHeader set X-Forwarded-Port "443"
RequestHeader set X-Forwarded-Host "${DOMAIN}"
RewriteCond %{HTTP:Upgrade} =websocket [NC]
RewriteRule /(.*) ws://127.0.0.1:7860/$1 [P,L]
ProxyPass /static !
ProxyPass /manifest.json !
ProxyPass /service-worker.js !
ProxyPass / http://127.0.0.1:7860/
ProxyPassReverse / http://127.0.0.1:7860/
# =================================================
# 3. SSL CONFIGURATION
# =================================================
# NOTE: Replace ${DOMAIN} with your actual FQDN in the cert paths
SSLCertificateFile /etc/letsencrypt/live/${DOMAIN}/fullchain.pem
SSLCertificateKeyFile /etc/letsencrypt/live/${DOMAIN}/privkey.pem
Include /etc/letsencrypt/options-ssl-apache.conf
LimitRequestBody 1048576000
ProxyTimeout 600
TimeOut 600
</VirtualHost>
</IfModule>sudo a2ensite aitoolkit.conf
sudo a2ensite aitoolkit-le-ssl.conf
sudo apache2ctl configtest # Should show "Syntax OK"
sudo systemctl restart apache2Create /etc/systemd/system/aitoolkit.service:
[Unit]
Description=Gradio App
After=network.target
[Service]
Type=simple
User=www-data
WorkingDirectory=${APP_DIR}
Environment="PATH=${VENV}/bin:/usr/local/bin:/usr/bin:/bin"
Environment="GRADIO_ANALYTICS_ENABLED=False"
Environment="GRADIO_SERVER_NAME=127.0.0.1"
EnvironmentFile=${APP_DIR}/.env
ExecStart=${VENV}/bin/python app.py
Restart=always
RestartSec=10
# Sandboxing — restrict what the service can do
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
ReadWritePaths=${APP_DIR}
ReadWritePaths=${STORAGE_MOUNT}
[Install]
WantedBy=multi-user.targetNote: Replace
${APP_DIR},${VENV}, and${STORAGE_MOUNT}with your actual paths in the unit file. systemd does not expand shell variables — these are placeholders for documentation.
Key Points:
EnvironmentFile=${APP_DIR}/.env— loads API keys from .env (replacesload_dotenvfor production)User=www-data— runs as a non-root service account (never run as root in production)NoNewPrivileges=true— prevents privilege escalationProtectSystem=strict— makes/usr,/boot,/etcread-onlyProtectHome=true— hides/home,/root,/run/userPrivateTmp=true— isolates/tmpto a private namespaceReadWritePaths=— only the app and storage directories are writableRestart=always— auto-restart on crashes
Enable and start:
sudo systemctl daemon-reload
sudo systemctl enable aitoolkit
sudo systemctl start aitoolkit
sudo systemctl status aitoolkitsudo cp /etc/fail2ban/jail.conf /etc/fail2ban/jail.local
sudo nano /etc/fail2ban/jail.local[sshd]
enabled = true
[apache-auth]
enabled = true
[apache-badbots]
enabled = true
[apache-noscript]
enabled = true
[apache-overflows]
enabled = true
[apache-404-scan]
enabled = true
port = http,https
filter = apache-404-scan
logpath = /var/log/apache2/*access.log
maxretry = 5
findtime = 600
bantime = 3600sudo systemctl restart fail2ban
sudo fail2ban-client statussudo nano /etc/logrotate.d/aitoolkitContent (replace the path with your $APP_DIR):
# Replace with your $APP_DIR path:
/var/www/aitoolkit/app.log {
daily
missingok
rotate 14
compress
delaycompress
notifempty
create 640 www-data www-data
copytruncate
}
Note: Replace the log path and
www-datawith your$APP_DIR/app.logand$APP_USERrespectively. Mode640ensures only the service account and its group can read the log.
Test:
sudo logrotate -d /etc/logrotate.d/aitoolkitThe app runs locally without any server configuration:
# Clone repo
git clone https://github.com/YOUR_USERNAME/YOUR_REPO.git
cd AIToolkit
# Install dependencies
pip install -r requirements.txt
pip install python-dotenv ddgs exchangelib
# Create .env with your API keys (see Step 3.3 for key names)
cp .env.example .env # or create manually
# Run
python app.py
# → http://127.0.0.1:7860Local vs VPS differences:
.envis loaded automatically viapython-dotenvlocally; on VPS it's loaded by systemd- Storage mounts to
./storage/locally instead of$STORAGE_MOUNT - Encryption keys saved to
./.master_keylocally instead of$APP_DIR/.master_key - Both directories are created automatically if missing
Reset local passwords (example usernames — replace with your own):
python -c "
import bcrypt, sqlite3
db = sqlite3.connect('aitoolkit.db')
# NOTE: 'admin123' and 'user123' are example usernames for local dev only.
# Replace with your actual usernames.
for user, pw in [('admin123', 'admin123'), ('user123', 'user123')]:
h = bcrypt.hashpw(pw.encode(), bcrypt.gensalt(12)).decode()
db.execute('UPDATE users SET password_hash=? WHERE username=?', (h, user))
print(f'Reset {user}')
db.commit(); db.close()
"Test web search standalone:
python tool_executor.py --query "latest Mistral AI news" --backend tavily
python tool_executor.py --query "..." --backend brave_web --provider ScalewayTest Exchange CLI standalone:
python exchange_cli.py list --limit 5
python exchange_cli.py calendar-list
python exchange_cli.py contact-search "Mustermann"Run tests:
# Full suite (76 tests + Exchange suite)
python -m pytest tests/ -v
# Just the gr.Blocks structural sentinel (catches dropped event bindings):
python -m pytest tests/test_app_blocks.py -v
# Skip the Exchange tests (need a real EWS server):
python -m pytest tests/ -v \
--ignore=tests/test_exchange.py \
--ignore=tests/test_exchange_calendar.py \
--ignore=tests/test_exchange_contacts.py \
--ignore=tests/test_integration.pyLint / format:
# Medium+ severity rules (matches CI gate)
ruff check --select=F821,S105,S324,B904,B008,E711,E712,E722,RUF012,S607,S101 .
# Auto-format
ruff format .CI: .github/workflows/ci.yml runs ruff (medium+ rule set), bandit (-ll),
mypy on the typed-cluster modules, and the full pytest suite (excluding
Exchange tests) on every push and PR.
$APP_DIR/
├── app.py # Top-level Blocks assembly + login/logout wiring
├── app_helpers.py # Pure-Python helpers (auth, dropdowns, file-explorer paths)
│
│ ── Per-tab UI modules (Phase 7 split — each tab is a standalone module) ──
├── ui/
│ ├── __init__.py
│ ├── chat_tab.py # 💬 Chat (provider/model + tools + history + attachments)
│ ├── transcription_tab.py # 🎙️ Transkription (upload / Storage Box / URL download)
│ ├── vision_tab.py # 👁️ Vision (image Q&A via vision LLMs)
│ ├── image_gen_tab.py # 🎨 Bilderzeugung (text→image)
│ ├── ocr_tab.py # 📄 OCR (Mistral / Groq / Vision LLM / OpenRouter)
│ ├── dictation_tab.py # 🎙️ Diktieren (real-time speech-to-text streaming)
│ ├── doc_translator_tab.py # 📄 Dokument-Übersetzer (DOCX/PPTX with format preservation)
│ ├── format_transplant_tab.py # 📋 Format-Transplant (DOCX layout transplant + LLM stylepass)
│ └── admin_tab.py # 👥 Benutzerverwaltung (admin-only user management)
│ # (📚 Verlauf & Verwaltung still inline in app.py — pending split)
│
│ ── Core ──────────────────────────────────────────────────────
├── config.py # Provider & key configuration
├── crypto_utils.py # AES-256-GCM encryption & key wrapping
├── db_models.py # SQLAlchemy models & schema migrations
├── db_ops.py # Database CRUD operations
│
│ ── Provider / Inference ──────────────────────────────────────
├── provider_utils.py # LLM provider clients, compliance, routing
├── context_utils.py # Token estimation, context pruning, chunking
├── image_utils.py # Image encoding & resize helpers
├── image_gen_utils.py # Image generation & vision analysis
│
│ ── Feature Modules ──────────────────────────────────────────
├── transcription_utils.py # Transcription dispatcher + per-provider runners
├── transcription_youtube.py # YouTube + URL download (split from transcription_utils)
├── storage_utils.py # File storage, pCloud, per-user directories
├── chat_handlers.py # Chat logic, UniversalExtractor, attachments
├── ocr_utils.py # OCR engines (Mistral, Groq, OpenRouter, Ollama/GLM, Vision LLM)
├── translation_utils.py # Async DOCX translation with progress streaming
├── export_utils.py # Export to .docx / .md / .txt
│
│ ── Standalone Tools ─────────────────────────────────────────
├── tool_executor.py # Tool-calling router: web search, Exchange, image gen, … (standalone-testable)
├── exchange_cli.py # Microsoft Exchange EWS: mail, calendar, contacts (standalone CLI + LLM tools)
├── translator.py # DOCX translation engine (NMT + LLM)
├── format_transplant.py # DOCX format-transplant engine
├── dictation_manager.py # Real-time dictation (DG, AAI, Gladia, Mistral, Whisper, CrispASR)
├── crispasr_tool.py # CrispASR adapter: binary discovery, auto-clone, auto-build
├── fn2md.py # Function-to-markdown helper
│
│ ── Tests (76 unit/integration + Exchange suite) ────────────
├── tests/
│ ├── conftest.py # sys.path + env defaults shared across tests
│ ├── test_app_blocks.py # gr.Blocks structural sentinel — catches dropped event bindings
│ ├── test_app_helpers.py # Pure-helper unit tests (auth, file-explorer paths, dispatchers)
│ ├── test_provider_utils.py # EU mode, restricted providers, role-based filtering
│ ├── test_crispasr_tool.py # Binary path probing, build dirs, model search precedence
│ ├── test_crispasr_remote.py # HTTP runner via `responses`, async dictation via `httpx.MockTransport`
│ ├── test_transcription_youtube.py # URL classifier, channel-id extraction, whitelist
│ ├── test_exchange.py # Mail/folder/send tests (stubbed EWS)
│ ├── test_exchange_calendar.py # Calendar CRUD, UID lookup, chunking (100 tests)
│ ├── test_exchange_contacts.py # Contact search, GAL, suggested, mailbox, signatures
│ └── test_integration.py # Credential encryption, modular account flow
│
│ ── Tracking docs ────────────────────────────────────────────
├── PLAN.md # Active optimization roadmap
├── HISTORY.md # Append-only log of completed work
├── LEARNINGS.md # Durable insights (loaded into every Claude session)
│
│ ── Runtime ───────────────────────────────────────────────────
├── pyproject.toml # ruff config (line-length, ignored rules)
├── requirements.txt # Python dependencies
├── .env # API keys (NOT in repo — create manually)
├── .master_key # Global encryption key (CRITICAL — backup!)
├── .pq_keypair # Optional post-quantum keypair
├── aitoolkit.db # SQLite database (auto-created)
├── app.log # Application logs
├── venv/ # Python virtual environment (symlink to conda)
├── jobs/ # Resume job manifests (auto-created)
├── generated_images/ # AI-generated images (auto-created)
├── storage/ # Local fallback when Storage Box unavailable
└── static/ # PWA assets (owned by $APP_USER)
├── custom.css
├── pwa.js
├── manifest.json
├── service-worker.js
├── icon-192.png
└── icon-512.png
$STORAGE_MOUNT/ # Hetzner Storage Box (mounted via CIFS)
└── users/[username]/ # Per-user storage
⚠️ Critical backup: Always back up.master_keytogether withaitoolkit.db. The database is encrypted — without the key, data is unrecoverable.
sudo systemctl status aitoolkit
sudo journalctl -u aitoolkit -f
tail -f "$APP_DIR/app.log"
# Common issues:
# - Missing .env → Create $APP_DIR/.env
# - Missing FFmpeg → sudo apt install ffmpeg
# - Port in use → sudo lsof -i :7860The app uses python-dotenv to load .env at startup. Verify:
python -c "from dotenv import load_dotenv; load_dotenv(); import os; print(os.environ.get('MISTRAL_API_KEY','MISSING')[:8])"# Test each backend directly
python tool_executor.py --query "test" --backend tavily
python tool_executor.py --query "test" --backend duckduckgo
# Check keys are loaded
python -c "from config import API_KEYS; print('TAVILY:', bool(API_KEYS.get('TAVILY')))"# Test CLI directly (shows detailed EWS errors)
python exchange_cli.py --debug list --limit 1
# Common issues:
# - Wrong server name → check EXCHANGE_SERVER (must be EWS host, not webmail URL)
# - Wrong username format → try both DOMAIN\user and user@company.de
# - Self-signed cert → add BaseProtocol.HTTP_ADAPTER_CLS override in exchange_cli.py
# - Exchange 2-year limit → handled automatically (720-day chunking)This happens when your OpenRouter account privacy settings conflict with the selected model. The app automatically adds allow_fallbacks: true to all OpenRouter requests to mitigate this. If it persists, check your OpenRouter privacy settings at https://openrouter.ai/settings/privacy.
curl -I "https://$DOMAIN/manifest.json"
curl -I "https://$DOMAIN/service-worker.js"
curl -I "https://$DOMAIN/static/icon-192.png"
# All should return 200 OKmount | grep storage
sudo mount -t cifs //u12345-sub1.your-storagebox.de/u12345-sub1 "$STORAGE_MOUNT" \
-o credentials=/etc/cifs-credentials,uid=$(id -u $APP_USER),gid=$(id -g $APP_USER)sudo apache2ctl configtest
sudo tail -f /var/log/apache2/error.log
sudo systemctl restart apache2cd "$APP_DIR"
git pull
source "$VENV/bin/activate"
pip install -r requirements.txt --upgrade
sudo systemctl restart aitoolkitmkdir -p /var/backups/aitoolkit
DATE=$(date +%Y%m%d)
cp "$APP_DIR/aitoolkit.db" /var/backups/aitoolkit/db-$DATE.bak
cp "$APP_DIR/.master_key" /var/backups/aitoolkit/master_key-$DATE.bak
[ -f "$APP_DIR/.pq_keypair" ] && \
cp "$APP_DIR/.pq_keypair" /var/backups/aitoolkit/pq_keypair-$DATE.baksudo certbot renew --dry-rundf -h "$STORAGE_MOUNT"
df -h "$APP_DIR"sudo systemctl start aitoolkit
sudo systemctl stop aitoolkit
sudo systemctl restart aitoolkit
sudo systemctl status aitoolkitsudo journalctl -u aitoolkit -f
tail -f "$APP_DIR/app.log"
sudo tail -f /var/log/apache2/error.log- Application:
$APP_DIR/app.py— UI & event wiring - Chat logic:
$APP_DIR/chat_handlers.py - OCR engines:
$APP_DIR/ocr_utils.py - Transcription:
$APP_DIR/transcription_utils.py - Tool Router:
$APP_DIR/tool_executor.py - Exchange CLI:
$APP_DIR/exchange_cli.py - Provider Config:
$APP_DIR/config.py - Environment:
$APP_DIR/.env— secrets - Encryption Key:
$APP_DIR/.master_key— backup this! - Service:
/etc/systemd/system/aitoolkit.service - Apache SSL:
/etc/apache2/sites-available/aitoolkit-le-ssl.conf
- ✅ Service runs as non-root user (
$APP_USER, not root) - ✅ systemd sandboxing enabled (
NoNewPrivileges,ProtectSystem=strict,ProtectHome,PrivateTmp) - ✅ Security headers configured (HSTS, X-Frame-Options DENY, X-Content-Type-Options nosniff, Referrer-Policy, Permissions-Policy)
- ✅
.envpermissions set to600 - ✅
.master_keymode640, owned byroot:$APP_USER— backup regularly with the DB - ✅ Database files (
*.db) mode660 - ✅
/etc/cifs-credentialspermissions set to600 - ✅ Fail2Ban enabled and running
- ✅ SSL certificate valid and auto-renewing
- ✅ Firewall configured (only ports
22, 80, 443open) - ✅ Apache upload limits configured (1GB)
- ✅ Storage Box mounted with restricted permissions (owned by
$APP_USER) - ✅ EU-only mode enabled (
EU_ONLY_MODE = Trueinconfig.py) - ✅ Requesty configured with EU router endpoint (
router.eu.requesty.ai) - ✅ OpenRouter
allow_fallbacks: trueset to avoid data-policy 404s - ✅ Web search keys stored in
.env, never hardcoded - ✅ Exchange credentials stored in
.envor encrypted per-user in DB (never plaintext in code) - ✅ Gradio API info/config endpoints blocked via Apache RewriteRules
Last Updated: June 2026 — Security hardening: non-root service account, systemd sandboxing, security headers (HSTS, X-Frame-Options, CSP), hardened file permissions, parameterized deployment paths