Docker-based IMAP spam filter for any IMAP server that supports IDLE.
Designed to run 24/7 on Unraid (templates included) or any Linux host with
Docker. Per-account modes, move-based Bayes training, never deletes mail.
Optional read-only web dashboard with per-user access levels.
Multi-arch image (linux/amd64, linux/arm64).
Four containers on a shared spamnet Docker network:
| Container | Image | Role |
|---|---|---|
| spamfilter-redis | redis:8-alpine | Persists rspamd Bayes tokens, fuzzy hashes, neural weights (AOF + RDB). |
| spamfilter-unbound | mvance/unbound:latest | Local recursive DNS. Keeps DNSBL lookups out of shared-resolver quotas. |
| spamfilter-rspamd | rspamd/rspamd:latest | Scores messages: Bayes / fuzzy / neural / RBL. No autolearn. |
| spamfilter | this repo (custom) | Python service. One thread per account, IDLE on Inbox, polls Junk, scores, moves, learns. |
Per-account operating modes (set in accounts.yml, promoted manually):
- shadow - scan + log only, no mailbox writes
- flag - shadow + sets \Flagged on suspect mail
- move - flag + after move_grace_seconds, MOVEs to Junk
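The modes are cumulative: each one does everything the previous one does, plus one more step. A minimal sketch of that ladder follows. The `planned_actions` helper and its action strings are hypothetical and only illustrate the list above; they are not taken from `filter.py`.

```python
# Hypothetical helper (not from filter.py): maps a mode to the actions
# the filter would take for one scanned message.
MODES = ("shadow", "flag", "move")


def planned_actions(mode: str, is_spam: bool) -> list:
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    actions = ["log"]  # every mode scans and logs
    if is_spam and mode in ("flag", "move"):
        actions.append("set \\Flagged")
    if is_spam and mode == "move":
        actions.append("MOVE to Junk after move_grace_seconds")
    return actions
```

In shadow mode the result is always just `["log"]`, which is why it is safe to leave an account there while you watch the scores.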
Move-based training (no special folders needed in daily use):
- Inbox -> Junk = learn as spam (after learn_grace_seconds, default 300s)
- Junk -> Inbox = learn as ham (after learn_grace_seconds)
- IMAP keyword $Junk / $NotJunk skips the grace window
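The grace window gives you time to undo an accidental drag before Bayes is updated. A rough sketch of the decision, under two assumptions that the text above does not spell out: that `$Junk` short-circuits the spam direction and `$NotJunk` the ham direction, and that the filter knows how long ago it first saw the move. `learn_decision` is a hypothetical name.

```python
from typing import Optional


def learn_decision(src: str, dst: str, seconds_since_move: float,
                   keywords: frozenset = frozenset(),
                   learn_grace_seconds: int = 300) -> Optional[str]:
    """Return 'spam', 'ham', or None when there is nothing to learn (yet)."""
    if (src, dst) == ("Inbox", "Junk"):
        label, skip_keyword = "spam", "$Junk"
    elif (src, dst) == ("Junk", "Inbox"):
        label, skip_keyword = "ham", "$NotJunk"
    else:
        return None  # any other move is not a training signal
    # Assumption: the matching keyword skips the undo window entirely.
    if skip_keyword in keywords or seconds_since_move >= learn_grace_seconds:
        return label
    return None  # still inside the grace window
```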
Folder-based training (bootstrap and bulk corrections):
- Move spam to Junk/Train-Spam -> filter learns, moves to Junk/Trained-Spam
- Copy (never move) known-good mail to Junk/Train-Ham -> filter learns, moves to Junk/Trained-Ham. The copy is destroyed by retention; the original in your sorted folder stays untouched. Moving ham here means the only copy will eventually end up in Trash.
- Both Junk/Trained-* folders are swept to Trash after trained_retention_days (default 7)
Hard rules:
- Never deletes. Only IMAP MOVE. Trash retention is the mail provider's job.
- Fails closed. rspamd unreachable / parse error / folder missing => message stays put.
- No autolearn. Bayes only learns from explicit user moves or the Train-Spam / Train-Ham folders.
The filter cares about seven folders per account: Inbox, Junk, Trash, Junk/Train-Spam, Junk/Trained-Spam, Junk/Train-Ham, Junk/Trained-Ham. On day one, most users don't need to touch any of them.
On connect the filter runs LIST "" "*" and reads the server's hierarchy
delimiter (typically / or .). Folder names in accounts.yml use / and
get rewritten at runtime: Junk/Train-Spam becomes Junk.Train-Spam on a
Dovecot server that uses . as its delimiter. The detected delimiter is logged once at connect:
connected, delimiter='/', mode=shadow
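For illustration, the rewrite can be sketched in a few lines. Both function names are hypothetical, and the regular expression only handles the common form of a LIST response line (a single-character quoted delimiter or NIL); it is not the filter's actual parser.

```python
import re
from typing import Optional

# Matches e.g.  (\HasNoChildren) "." "INBOX"
LIST_RE = re.compile(
    r'^\((?P<flags>[^)]*)\)\s+(?:"(?P<delim>[^"\\])"|NIL)\s+(?P<name>.+)$'
)


def parse_delimiter(list_line: str) -> Optional[str]:
    """Extract the hierarchy delimiter from one LIST response line."""
    match = LIST_RE.match(list_line)
    return match.group("delim") if match else None


def to_server_name(config_name: str, delimiter: str) -> str:
    """Rewrite a '/'-separated config name to the server's delimiter."""
    return config_name.replace("/", delimiter)
```

With a `.` delimiter, `to_server_name("Junk/Train-Spam", ".")` gives `Junk.Train-Spam`; with `/` the name passes through unchanged.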
If your IMAP server advertises folders with RFC 6154 SPECIAL-USE attribute
flags (most modern servers do), the filter uses those names regardless of
what your mail client decided to call them locally. So if Apple Mail picked
Spam as the junk folder (and the server tagged it \Junk), the filter
follows the same folder automatically. No config edit required.
Logged at connect when an override happens:
auto-detected junk folder via SPECIAL-USE: Spam (was Junk)
auto-detected trash folder via SPECIAL-USE: Bin (was Trash)
remapped spam_train: Junk/Train-Spam -> Spam/Train-Spam
remapped trained_spam: Junk/Trained-Spam -> Spam/Trained-Spam
remapped ham_train: Junk/Train-Ham -> Spam/Train-Ham
remapped trained_ham: Junk/Trained-Ham -> Spam/Trained-Ham
Controlled by auto_special_folders (default true). Set it to false
in accounts.yml if you want the filter to use the literal junk: /
trash: values you configured.
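The remapping shown in the log lines above can be sketched as follows. This is an illustration under assumptions: `remap_for_special_use` is a hypothetical name, the folder dict uses the key names from accounts.yml, and the LIST output is assumed to be already parsed into `(flags, name)` pairs.

```python
TRAIN_KEYS = ("spam_train", "trained_spam", "ham_train", "trained_ham")


def remap_for_special_use(folders: dict, listing: list,
                          auto_special_folders: bool = True) -> dict:
    """folders: config key -> '/'-separated name.
    listing: (set_of_flags, folder_name) pairs from LIST."""
    result = dict(folders)
    if not auto_special_folders:
        return result  # use the literal junk: / trash: values
    old_junk = folders["junk"]
    for flags, name in listing:
        if "\\Junk" in flags:
            result["junk"] = name
        if "\\Trash" in flags:
            result["trash"] = name
    new_junk = result["junk"]
    if new_junk != old_junk:
        # Training folders follow the junk parent they were configured under.
        for key in TRAIN_KEYS:
            if result[key].startswith(old_junk + "/"):
                result[key] = new_junk + result[key][len(old_junk):]
    return result
```

A training folder that was explicitly configured outside the Junk subtree does not start with the old junk prefix, so it is left alone.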
The filter only auto-creates folders it owns:
- Junk/Train-Spam, Junk/Trained-Spam, Junk/Train-Ham, Junk/Trained-Ham (under the server-detected junk parent)
It refuses to create the core user folders:
- INBOX (required by the IMAP RFC, must already exist)
- Junk and Trash
If Junk or Trash are missing after SPECIAL-USE detection, the filter
aborts with a clear error rather than silently creating duplicate
hierarchies in your mailbox (this is how an account ends up with two
parallel Junk/ and Spam/ trees and the filter trains under the wrong
one). Either enable SPECIAL-USE on your IMAP server, or set explicit
junk: / trash: names in accounts.yml that match folders that
already exist.
To pin specific folder names (e.g. you want the filter to use a folder
named Spam-quarantine rather than whatever the server thinks is \Junk):
accounts:
  - name: your_name
    imap_host: ...
    user: ...
    password: "..."
    auto_special_folders: false
    junk: Spam-quarantine
    trash: Bin

spam_train, trained_spam, ham_train, and trained_ham follow the
(possibly auto-detected) junk parent automatically; you only need to
override them if you want them somewhere outside the Junk subtree.
Each container installs as a normal Unraid Docker app. No Compose Manager
required. All four containers run on a shared user-defined Docker network
called spamnet so they can resolve each other by name.
Install User Scripts from Community Apps if you don't already have it.
Settings -> User Scripts -> Add New Script, name it spamfilter-bootstrap,
paste in the contents of
unraid/bootstrap.sh:
The script is idempotent and does all of the following:
- Creates the user-defined spamnet Docker network
- Creates the /mnt/user/appdata/spamfilter/{redis,redis-config,state,rspamd/data,rspamd/local.d} layout (the redis/ and rspamd/data/ dirs are owned by the images' internal uids, mode 750)
- Downloads the rspamd local.d/* configs from this repo (only if missing)
- Seeds accounts.yml from accounts.yml.example (only if missing)
- Generates random passwords into state/controller.password (rspamd controller) and state/redis.password (Redis auth), only if missing
- Renders worker-controller.inc, the rspamd redis.conf client config, and the Redis server config into redis-config/redis.conf with those passwords substituted in
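The "only if missing" rule is what makes re-running the script safe: existing passwords are never rotated behind your back. bootstrap.sh itself is a shell script; the Python below is only a sketch of the same idea, and `ensure_secret`, the token length, and the 0600 file mode are illustrative assumptions rather than details taken from the script.

```python
import os
import secrets
from pathlib import Path


def ensure_secret(path: Path, nbytes: int = 32) -> str:
    """Return the secret stored at path, generating it once if absent."""
    if path.exists():
        return path.read_text().strip()  # keep the existing value
    path.parent.mkdir(parents=True, exist_ok=True)
    value = secrets.token_urlsafe(nbytes)
    # O_EXCL: fail instead of overwriting if another run created the file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(value + "\n")
    return value
```

Calling it twice for the same path returns the same value, which is the property the rendered configs depend on.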
Set the schedule to "At First Array Start Only" and click Run Script once to bootstrap immediately. It'll re-run on every array start, so the network/layout are recreated automatically after a USB reformat or migration.
If you'd rather not use User Scripts, run the same script over SSH:
curl -fsSL https://raw.githubusercontent.com/marcelverdult/imap-spamfilter/main/unraid/bootstrap.sh | bash

nano /mnt/user/appdata/spamfilter/accounts.yml

Set imap_host, fill in each account's user / password, and leave
mode: shadow for the first week. This is the only file you have to
edit by hand.
In the Unraid web UI (Docker tab -> Add Container -> Template -> "Add" a
local template), import each XML from this repo's unraid/ directory:
- unraid/spamfilter-redis.xml -> install (no prompts beyond the data path)
- unraid/spamfilter-unbound.xml -> install
- unraid/spamfilter-rspamd.xml -> install (controller password is read from the bootstrap-generated state/controller.password file)
- unraid/spamfilter.xml -> set DEFAULT_JUNK_RETENTION_DAYS and DEFAULT_TRAINED_RETENTION_DAYS if you want non-defaults (defaults 10 / 7), then install
Each template defaults its paths under /mnt/user/appdata/spamfilter/<service>,
matches typical Unraid conventions, and references Network=spamnet.
Tip. If you'd rather get the templates without cloning the repo, point Unraid's "Template repositories" setting (Docker tab -> Advanced view) at this GitHub repo URL. The XMLs in unraid/ will then show up in the standard "Add Container" template picker.
After spamfilter starts, watch the logs:
docker logs -f spamfilter

Expect:
[main] loaded 1 account(s): your_name
[your_name] connecting to imap.your-mail-provider.example:993 as you@your-domain.example
[your_name] connected, delimiter='.', mode=shadow
The rspamd controller (port 11334) is not published to the host — the
stack deliberately keeps it on the internal spamnet network only. To
reach rspamd's own web UI, either add an 11334:11334 port mapping to the
rspamd container yourself, or use the read-only dashboard (see below). The
controller password, if you need it, is the auto-generated value:
cat /mnt/user/appdata/spamfilter/state/controller.password

Bayes is roughly useless until ~200 spam and ~200 ham are learned. Two ways to feed it in bulk:
a) Drop into the auto-created training folders (easiest, no CLI)
The filter creates four folders on each account:
- Train-Spam - move spam here (originals are spam, OK to lose)
- Trained-Spam - filter archives here after learning; retention -> Trash
- Train-Ham - copy only known-good mail here (never move)
- Trained-Ham - filter archives here after learning; retention -> Trash
Why copy for ham: the copy in Train-Ham is destroyed by retention.
If you move a legitimate mail into Train-Ham you lose your only
copy. Bulk-train ham by selecting a known-good folder in your mail
client (e.g. Archive/Family) and Copy to Train-Ham - the
originals in your folders stay untouched.
b) bootstrap_train.py CLI (faster for one-off bulk runs)
docker exec -it spamfilter python bootstrap_train.py your_name Train-Spam spam --dry-run
docker exec -it spamfilter python bootstrap_train.py your_name Train-Spam spam --move-to Trained-Spam
docker exec -it spamfilter python bootstrap_train.py your_name Train-Ham ham --move-to Trained-Ham

After ~1 week in shadow:
vi /mnt/user/appdata/spamfilter/accounts.yml # change mode: shadow -> flag
docker restart spamfilter

After another week, promote to move. Promote each family member
independently. Modify the file by hand any time - changes take effect on
container restart.
The published ghcr.io/marcelverdult/imap-spamfilter image is multi-arch
(linux/amd64 and linux/arm64), so the same stack runs on Synology,
generic Linux servers, Raspberry Pi 4/5, ARM mini-PCs, etc.
git clone https://github.com/marcelverdult/imap-spamfilter
cd imap-spamfilter
# Pick a host path for appdata and point docker-compose.yml at it:
export SPAMFILTER_APP=/srv/spamfilter
sed -i "s|/mnt/user/appdata/spamfilter|$SPAMFILTER_APP|g" docker-compose.yml
# Run the bootstrap: it creates the directory layout, downloads and
# renders the rspamd + redis configs, and generates the rspamd
# controller and Redis passwords. SPAMFILTER_APP tells it where.
bash unraid/bootstrap.sh
# Edit the seeded account list (the only file you must touch by hand):
nano $SPAMFILTER_APP/accounts.yml
docker compose pull # use the prebuilt ghcr image
docker compose up -d
docker compose logs -f spamfilter

bootstrap.sh is the single source of the rendered configs (the
rspamd worker-controller.inc, the rspamd Redis client config, and
the Redis server config), so the compose stack just mounts what it
produced, exactly like the Unraid path. Re-run it after pulling config
changes from the repo. .env is optional: leave RSPAMD_PASSWORD
unset and the filter reads the bootstrap-generated
state/controller.password.
The compose file matches the Unraid layout one-for-one, so backups, docs, and the SQLite audit queries all apply the same way. Pick one install path; don't run both against the same mailbox.
Items the spec did not pin. Confirm before trusting the filter in move mode.
- IMAP hostname - get from your mail provider's account/admin panel. Set as imap_host: in accounts.yml. Verify by opening port 993 with openssl s_client -connect host:993 if unsure.
- IMAP folder hierarchy delimiter - auto-detected on connect, logged as connected, delimiter='.', mode=shadow. Folder names in config use /; filter rewrites to whatever the server actually uses.
- $Junk / $NotJunk keyword spelling - RFC 5788 names. Apple Mail and most modern clients set them literally. If your client uses a vendor variant, edit JUNK_KEYWORD / NOTJUNK_KEYWORD in filter/filter.py.
- Filter container image - prebuilt and pushed to ghcr.io/marcelverdult/imap-spamfilter:latest by the GitHub Actions workflow in .github/workflows/build.yml on every push to main and every v* tag. After the first push you must flip the GHCR package visibility to public (GitHub -> your profile -> Packages -> imap-spamfilter -> Package settings -> Change visibility -> Public), otherwise Unraid will fail to pull with manifest unknown. If you'd rather build locally:
  docker build -t imap-spamfilter:local ./filter
  then edit the Unraid template's Repository field.
Every field below is optional unless marked required. Per-account values
override defaults: values; both override built-in defaults from filter.py.
| Key | Example | Notes |
|---|---|---|
| name | marcel | label used in logs and SQLite, must be unique |
| imap_host | imap.example.de | hostname only, no scheme |
| user | you@example.de | login username (usually the full address) |
| password | "..." | quote to keep YAML happy with special chars |
| Key | Default | Notes |
|---|---|---|
| imap_port | 993 | port |
| ssl | true | false = use port 143 with STARTTLS instead |
| Key | Default | Notes |
|---|---|---|
| inbox | INBOX | RFC-mandated; do not change |
| junk | Junk | auto-detected via RFC 6154 if server advertises \Junk |
| trash | Trash | auto-detected via \Trash |
| spam_train | Junk/Train-Spam | drop spam here for the filter to learn |
| trained_spam | Junk/Trained-Spam | post-learn archive (auto-trashed by retention) |
| ham_train | Junk/Train-Ham | drop (or copy) known-good mail here for ham training |
| trained_ham | Junk/Trained-Ham | post-learn archive for ham (auto-trashed by retention) |
| auto_special_folders | true | set false to use literal junk/trash names |
All four train/trained folders live under the server's junk parent and are
auto-relocated together when SPECIAL-USE remaps the junk name (e.g. when
the server actually flags Spam as \Junk, the four become
Spam/Train-Spam, Spam/Train-Ham, etc.). The filter refuses to create
Junk / Trash themselves to avoid duplicate junk hierarchies in the
user's mailbox.
| Key | Default | Notes |
|---|---|---|
| mode | shadow | shadow, flag, or move |
| threshold | 8.0 | rspamd score >= this counts as spam |
| min_threshold_allowed | 5.0 | startup refuses to run if threshold is below this |
| reject_score_above | 100.0 | scores outside ±this are treated as failed scan |
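Taken together, these keys describe a small fail-closed decision. The sketch below is one possible reading of the table, not the code in `filter.py`; `check_threshold` and `classify` are hypothetical names, and treating a missing or NaN score as a failed scan is an added assumption in the same spirit.

```python
import math
from typing import Optional


def check_threshold(threshold: float, min_threshold_allowed: float = 5.0) -> None:
    """Refuse to start with a threshold that is suspiciously low."""
    if threshold < min_threshold_allowed:
        raise ValueError(
            f"threshold {threshold} is below min_threshold_allowed "
            f"{min_threshold_allowed}"
        )


def classify(score: Optional[float], threshold: float = 8.0,
             reject_score_above: float = 100.0) -> str:
    """Return 'spam', 'ham', or 'failed' (message stays put)."""
    if score is None or math.isnan(score) or abs(score) > reject_score_above:
        return "failed"  # implausible or missing score: do nothing
    return "spam" if score >= threshold else "ham"
```

A `failed` result means no flag, no move, and no learn, which matches the "fails closed" hard rule.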
| Key | Default | Notes |
|---|---|---|
| move_grace_seconds | 60 | delay between flag and move (mode=move); 0 = move instantly |
| learn_grace_seconds | 300 | undo window before any Bayes update |
| idle_timeout | 1500 | IMAP IDLE re-issue interval (must be < 30 min) |
| poll_interval | 600 | fallback poll when IDLE not supported |
| junk_poll_interval | 120 | how often to scan Junk for user moves |
| retention_check_interval | 3600 | how often retention sweeps run |
| Key | Default | Notes |
|---|---|---|
| max_moves_per_hour | 30 | breach triggers safe-mode for the account |
| max_learns_per_hour | 50 | breach triggers learning-only safe-mode |
| max_train_per_run | 100 | cap per drain_train_spam batch |
| safe_mode_unseen_cap | 500 | sticky safe-mode if Inbox UNSEEN exceeds this; raise for accounts that normally keep many unread |
| Key | Default | Notes |
|---|---|---|
| junk_retention_days | 10 | Junk -> Trash after N days, 0 disables |
| trained_retention_days | 7 | Trained-Spam and Trained-Ham -> Trash after N days |
| learn_from_moves | true | set false to disable all learning (scan-only) |
The DEFAULT_JUNK_RETENTION_DAYS and DEFAULT_TRAINED_RETENTION_DAYS
environment variables on the filter container override defaults: for
those two keys (useful for the Unraid template form).
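The layering described in this section can be pictured as a chain of dict merges. This is a sketch, not the loader in `filter.py`: `effective_config` is a hypothetical name, only a handful of built-in defaults are shown, and letting a per-account value win over the environment variable is an assumption, since the text only says the variables override `defaults:`.

```python
import os
from typing import Mapping

# Subset of the built-in defaults from the tables above.
BUILTIN = {
    "mode": "shadow",
    "threshold": 8.0,
    "junk_retention_days": 10,
    "trained_retention_days": 7,
}
ENV_OVERRIDES = {
    "junk_retention_days": "DEFAULT_JUNK_RETENTION_DAYS",
    "trained_retention_days": "DEFAULT_TRAINED_RETENTION_DAYS",
}


def effective_config(account: dict, defaults: dict,
                     environ: Mapping = os.environ) -> dict:
    merged = {**BUILTIN, **defaults}      # defaults: beat built-ins
    for key, var in ENV_OVERRIDES.items():
        if environ.get(var):              # env beats defaults: for two keys
            merged[key] = int(environ[var])
    merged.update(account)                # assumption: account wins last
    return merged
```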
rspamd's Bayes classifier in this project runs with users_enabled = true,
which means tokens are stored per-recipient. By default each IMAP account
in accounts.yml trains its own Bayes namespace keyed by its user.
| Key | Default | Notes |
|---|---|---|
| bayes_user | unset | rspamd User header used for both scan and learn; overrides the per-recipient default |
Use bayes_user to pool training across several of your own mailboxes
while leaving other users (e.g. family members) isolated. Set the same
bayes_user value on every account that should share data. Accounts that
omit the field stay isolated under their own IMAP user.
accounts:
  - name: marcel_main
    user: marcel@verdult.de
    password: "..."
    bayes_user: marcel-pool   # shared
  - name: marcel_work
    user: work@verdult.de
    password: "..."
    bayes_user: marcel-pool   # shared (same value)
  - name: family_member
    user: kid@verdult.de
    password: "..."
    # no bayes_user -> isolated, keyed by kid@verdult.de

Switching an existing account from per-recipient to a bayes_user value
(or vice versa) starts a fresh Bayes namespace. The prior tokens stay in
Redis under the old key but are no longer consulted. Either re-train under
the new identity (drop mail back into Train-Spam) or migrate the keys in
Redis manually.
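The identity rule in this section is a simple fallback: use `bayes_user` when set, otherwise the account's IMAP `user`. A small sketch that also groups accounts by the namespace they would train into; both function names are hypothetical.

```python
from collections import defaultdict


def bayes_identity(account: dict) -> str:
    """Value assumed to be sent as the rspamd User header."""
    return account.get("bayes_user") or account["user"]


def bayes_pools(accounts: list) -> dict:
    """Map each Bayes identity to the account names that share it."""
    grouped = defaultdict(list)
    for account in accounts:
        grouped[bayes_identity(account)].append(account["name"])
    return dict(grouped)
```

Running `bayes_pools` over your own accounts.yml entries before a change is a quick way to see which accounts would end up sharing training data.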
Everything stateful lives under /mnt/user/appdata/spamfilter/:
/mnt/user/appdata/spamfilter/
├── accounts.yml                # account list and per-account overrides (SECRETS)
├── redis/                      # Bayes corpus, fuzzy hashes, neural weights
├── redis-config/redis.conf     # rendered Redis server config (bootstrap.sh)
├── state/
│   ├── spamfilter.db           # SQLite audit log + state
│   ├── heartbeat               # epoch updated each loop (healthcheck source)
│   ├── controller.password     # generated rspamd controller password
│   ├── redis.password          # generated Redis password
│   ├── dashboard_secret        # generated dashboard session secret
│   └── dashboard_users         # dashboard logins (present once the dashboard is used)
└── rspamd/
    ├── local.d/                # rspamd configs (downloaded + rendered by bootstrap.sh)
    └── data/                   # rspamd-managed caches
Back up redis/, state/, and accounts.yml. Skip rspamd/data/ and
redis-config/ (both regenerate; the latter is re-rendered by
bootstrap.sh from state/redis.password). The Appdata Backup plugin
from Community Apps, pointed at the appdata path, is sufficient.
A small read-only Flask dashboard is available for at-a-glance stats: a health banner, filter KPIs, a 14-day scan trend, rspamd Bayes progress, recent scans/learns, and per-account activity. Responsive, dark-mode aware. Off by default. No actions, no buttons — read-only.
The container always listens internally on port 8080; pick any free host port in your orchestrator's port mapping.
Access is gated by a real login form with a server-side session (no more browser-cached basic auth). Add users with the bundled helper inside the container:
docker exec -it spamfilter python dashboard.py
# prompts for a username + password, writes state/dashboard_users

It adds (or updates) the user in state/dashboard_users: one
username:hash:scope line per user, passwords pbkdf2-hashed, #
comments allowed. The dashboard re-reads that file on every login, so
adding or changing a user takes effect without a restart. You can
edit the file by hand too.
Access levels. The helper also asks for a scope:
- admin - sees everything: all accounts plus the system-wide rspamd lifetime totals and Bayes stats.
- one or more account names (the name: values from accounts.yml) - a restricted user who sees only those accounts' scans, learns, events and per-account stats, and not the global rspamd section. Handy for letting a household member watch the filter on their own mailbox.

Usernames are free-form, so an email address works fine as the login name.
The helper lists the account names from accounts.yml and rejects an
unknown one, so a typo cannot silently bind a user to nothing. A line
with no scope field defaults to admin.
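If you edit state/dashboard_users by hand, it can help to picture how a `username:hash:scope` line is read. The sketch below is a literal reading of that format and rests on assumptions: `parse_dashboard_users` is a hypothetical name, comments are assumed to be whole lines starting with `#`, and the hash field is assumed to contain no colon. The scope is kept as a raw string because the separator for multiple account names is not documented here. Check `dashboard.py` for the real parser before relying on any of this.

```python
def parse_dashboard_users(text: str) -> dict:
    """Parse username:hash:scope lines into {username: {hash, scope}}."""
    users = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        parts = line.split(":", 2)
        if len(parts) < 2:
            continue  # malformed line, ignored in this sketch
        username, pw_hash = parts[0], parts[1]
        scope = parts[2].strip() if len(parts) == 3 else ""
        users[username] = {"hash": pw_hash, "scope": scope or "admin"}
    return users
```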
Two env-var alternatives also work, if you prefer config over a file:
DASHBOARD_USERS (comma-separated username:hash:scope entries) and
the legacy single-user DASHBOARD_USER + DASHBOARD_PASSWORD
(plaintext, admin). All three sources merge.
The session signing secret is generated once into
state/dashboard_secret and reused across restarts.
The dashboard starts once at least one user exists (file or env). On
Unraid set the host port in the "Dashboard port" mapping and Apply;
via docker-compose uncomment the ports: block. Open
http://<host>:<port>/.
There is no TLS in the dashboard itself — terminate HTTPS at a
reverse proxy if you expose it beyond your LAN. If the proxy forwards
HTTPS, set DASHBOARD_COOKIE_SECURE=1 so the session cookie is
marked Secure.
Pages:
- / - health banner + filter KPIs (24h / 7d, spam-catch rate) + 14-day scan-per-day trend + rspamd lifetime totals (scanned, spam/ham counts, fuzzy hashes, connections, action breakdown) + Bayes learn progress bars with per-class status and learn-balance check + active safe-mode + recent learns
- /messages - last 200 scored msgs with score-band filter
- /learned - last 300 learn / learn_failed / learn_giveup events
- /events - tail of the full events table
- /accounts - per-account scan / learn / fail counts, total spam & ham learns, plus safe-mode
The SQLite DB is WAL mode; safe to query while the filter is running.
sqlite3 /mnt/user/appdata/spamfilter/state/spamfilter.db

Useful queries:
-- 50 most recent events for an account
SELECT datetime(ts, 'unixepoch', 'localtime'), event, substr(message_id,1,40), detail
FROM events WHERE account='your_name' ORDER BY ts DESC LIMIT 50;
-- "I think I lost a mail" - search by subject across all folders
SELECT datetime(last_seen, 'unixepoch', 'localtime'),
current_folder, our_score, our_action, learned_as, sender, subject
FROM messages WHERE account='your_name' AND subject LIKE '%invoice%';
-- per-account rate consumption in the last hour
SELECT account, action, COUNT(*) FROM rate_limit
WHERE ts >= strftime('%s','now','-1 hour') GROUP BY account, action;

Two distinct mechanisms protect the account from runaway behaviour:
max_moves_per_hour, max_learns_per_hour, and max_train_per_run
cap the number of mailbox-modifying actions per hour. When a limit is
hit the filter logs a warning once per minute, refuses that action
for the rest of the rolling-hour window, then resumes automatically as
old entries roll out of the window. No DB state, no manual recovery.
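A rolling-hour cap of this kind can be sketched with a queue of timestamps. The class below is an in-memory illustration only: the name `RollingHourLimit` is hypothetical, and the real filter appears to record actions in the `rate_limit` table queried earlier rather than in memory.

```python
import time
from collections import deque
from typing import Optional


class RollingHourLimit:
    """Allow at most max_per_window actions in any rolling window."""

    def __init__(self, max_per_window: int, window_seconds: float = 3600.0):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self._stamps = deque()

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Drop entries that have rolled out of the window.
        while self._stamps and self._stamps[0] <= now - self.window_seconds:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_per_window:
            return False  # refuse; resumes on its own as entries expire
        self._stamps.append(now)
        return True
```

Because refused actions are not recorded, the limiter recovers automatically once the oldest entries age out, with nothing to reset by hand.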
The only sticky safe-mode is scope="all" triggered when Inbox UNSEEN
exceeds safe_mode_unseen_cap (default 500) - a sanity check that
something is wrong with the mailbox (mass-import, server restored from
backup, etc.). Scanning halts for that account. Raise the per-account
override in accounts.yml for inboxes that legitimately keep a large
unread backlog.
It auto-exits on the next scan_inbox pass once UNSEEN drops back
under the cap, so the typical "I marked everything read" recovery is
hands-off. To clear manually anyway:
SELECT account, scope, datetime(entered_at, 'unixepoch', 'localtime'), reason
FROM safe_mode;
DELETE FROM safe_mode WHERE account='your_name';
-- or to clear all: DELETE FROM safe_mode;

Everything stateful lives under /mnt/user/appdata/spamfilter/. Nightly
backups of that whole tree preserve:
- SQLite state DB (filter's per-account messages, events, rate-limit, safe-mode, uidvalidity tables)
- Redis AOF + RDB (Bayes tokens, i.e. the actual training)
- rspamd /var/lib/rspamd cache (incl. neural-meta weights, which take days of confident decisions to rebuild from scratch)
- accounts.yml, the rspamd controller password, and the Redis password (state/controller.password, state/redis.password)
On Unraid, install the Appdata Backup Community App (by KluthR)
and schedule it nightly. Set "Stop container before backup" for all
four spamfilter* containers - downtime is ~30 s while the tar runs,
and the filter reconnects automatically via IDLE. Keep e.g. 14 daily
snapshots; expect 50-150 MB raw per snapshot, ~10-40 MB after zst
compression.
Restore is the reverse: stop the four containers, extract the tar over
/mnt/user/appdata/spamfilter/, start the containers.
- No allowlist. Intentional. rspamd's DKIM/SPF symbols already give negative score to aligned mail. Fix misclassifications by training, not by allowlisting.
- Dashboard is read-only. It shows activity; it has no controls to move, learn, or change config. Inspect deeper via SQLite if needed.
- IDLE re-issued every idle_timeout (default 1500s). Lower it if your server drops idle connections faster.
- No multi-host coordination. Don't run two filter instances against the same mailbox.
.
├── README.md
├── LICENSE
├── docker-compose.yml          # alt install path
├── .env.example
├── .gitignore
├── accounts.yml.example
├── filter/                     # custom Python service
│   ├── Dockerfile
│   ├── requirements.txt
│   ├── filter.py
│   ├── dashboard.py
│   └── bootstrap_train.py
├── redis/                      # Redis server config template
├── rspamd/local.d/             # rspamd config templates + static configs
└── unraid/                     # bootstrap.sh + Unraid Docker templates
    ├── bootstrap.sh
    ├── spamfilter-redis.xml
    ├── spamfilter-unbound.xml
    ├── spamfilter-rspamd.xml
    └── spamfilter.xml
.env, accounts.yml, state/, rspamd/data/, the appdata redis/ data
dir and redis-config/, and the rendered secret-bearing configs
(worker-controller.inc, rspamd/local.d/redis.conf) are gitignored. Only
the *.template files are tracked; nothing in version control contains
secrets.