Merged
8 changes: 8 additions & 0 deletions .dockerignore
@@ -2,11 +2,19 @@ node_modules
npm-debug.log
Dockerfile
docker-compose.yml
docker-compose.app.yml
.dockerignore
.git
.gitignore
.github
.idea

.env
.env.*
!.env.example

data/backups
logs
docs
*.md
INFRA.md
2 changes: 1 addition & 1 deletion .github/dependabot.yml
@@ -22,7 +22,7 @@ updates:
directory: /
schedule:
interval: daily
-open-pull-requests-limit: 0
+open-pull-requests-limit: 5
labels:
- dependencies
- npm
5 changes: 3 additions & 2 deletions .github/workflows/provision-and-deploy.yml
@@ -302,6 +302,7 @@ jobs:
printf 'SILENT=%s\n' "${BOT_SILENT:-false}"
printf 'CONTAINER_NAME=%s\n' "${CONTAINER_NAME}"
printf 'IMAGE_TAG=%s\n' "${IMAGE_TAG}"
printf 'GHCR_IMAGE=%s\n' "ghcr.io/$(echo '${{ github.repository }}' | tr '[:upper:]' '[:lower:]')"
} > "${RUNNER_TEMP}/bot.env"

- name: Sync deploy artifacts to VPS
@@ -325,8 +326,8 @@ jobs:
GHCR_PULL_TOKEN: ${{ secrets.GHCR_PULL_TOKEN }}
run: |
if [ -n "${GHCR_PULL_TOKEN}" ]; then
-ssh -i ~/.ssh/bot_deploy_key deploy@"${BOT_HOST}" \
-"echo '${GHCR_PULL_TOKEN}' | docker login ghcr.io -u '${{ github.actor }}' --password-stdin"
+printf '%s\n' "${GHCR_PULL_TOKEN}" | ssh -i ~/.ssh/bot_deploy_key deploy@"${BOT_HOST}" \
+"docker login ghcr.io -u '${{ github.actor }}' --password-stdin"
fi

- name: Deploy bot
2 changes: 2 additions & 0 deletions Dockerfile
@@ -13,4 +13,6 @@ FROM node:24-alpine AS runtime
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN chown -R node:node /app
USER node
ENTRYPOINT ["npm", "start"]
59 changes: 56 additions & 3 deletions INFRA.md
@@ -128,9 +128,14 @@ and the droplet pulls it during deploy. Make the pull work one of two ways:
- **Public package (simplest):** in the GHCR package settings, set the package
visibility to public. No extra secret is needed.
- **Private package:** create a classic PAT with the `read:packages` scope and
-save it as the repository secret `GHCR_PULL_TOKEN`. The deploy job uses it to
-`docker login ghcr.io` on the droplet. If the secret is empty, the login step
-is skipped (so it is safe to leave unset for a public package).
+save it as the repository secret `GHCR_PULL_TOKEN`. The deploy job pipes the
+token to `docker login --password-stdin` on the droplet over SSH (the token is
+never embedded in the remote command string). If the secret is empty, the login
+step is skipped (so it is safe to leave unset for a public package).

The deploy job also writes `GHCR_IMAGE=ghcr.io/<owner>/<repo>` into the server
`.env` so `docker-compose.app.yml` can pull the correct registry path without
hardcoding it in the compose file.
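
As a rough illustration of how the `GHCR_IMAGE` value is derived, the sketch below mirrors the workflow's `tr '[:upper:]' '[:lower:]'` step in TypeScript. The function name `ghcrImage` and the sample repository name are hypothetical and not part of this repo; the lowercasing matters because Docker image references require lowercase repository names, while GitHub owner and repo names may contain uppercase letters.

```typescript
// Hypothetical helper (not in the repo): builds the registry path the same way
// the workflow does, by lowercasing the "<owner>/<repo>" string.
function ghcrImage(repository: string): string {
  return `ghcr.io/${repository.toLowerCase()}`;
}

// Example with a made-up repository name.
const image: string = ghcrImage("Example-Owner/Queue-Bot");
console.log(image); // ghcr.io/example-owner/queue-bot
```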

## 5. Optional GitHub Variables

@@ -147,6 +152,7 @@ default below.
| `DO_DROPLET_NAME` | repo | `event-queue-bot` |
| `DO_ENABLE_BACKUPS` | repo | `false` |
| `DO_SWAP_SIZE` | repo | `1G` |
| `SSH_ALLOW_IPS` | repo | empty (SSH open to all) |
| `APP_PATH` | env | branch-derived (see above) |
| `BOT_TOP_GG_TOKEN` | env | empty |
| `BOT_PATCH_NOTES_CHANNEL_ID` | env | empty |
@@ -189,6 +195,16 @@ a re-provision. Firewall rules are reconciled by `scripts/ensure-firewall.sh`
in both the `provision` and `deploy` jobs, so firewall changes apply even when
provision is skipped.

Production containers run as the `node` user (see `Dockerfile`), with per-service
CPU/memory limits in `docker-compose.app.yml` suited to a 1 GB droplet running
both prod and dev bots. Patch notes and other stdin prompts are disabled in
production compose (`stdin_open` / `tty` are local-only in `docker-compose.yml`);
set `BOT_FORCE_SEND_PATCH_NOTES=true` on the environment when you want patch
notes sent without an interactive prompt.

Local `.env` files are excluded from the Docker build context (`.dockerignore`)
so secrets are not baked into images.

Future pushes to `master` deploy to dev automatically; each run pauses at
`gate` for `dev-gate` reviewer approval before `build-and-push`, `discover`,
`provision`, and `deploy` proceed. Prod is reached only by merging
@@ -242,6 +258,19 @@ Prod and dev share one droplet but no state: separate containers
(`queue-bot` vs `queue-bot-nightly`), separate app dirs and `data/main.sqlite`,
separate Discord applications.

## Single-instance requirement

Each environment should run **one** bot container against **one** SQLite database.
SQLite write concurrency is poor with multiple writers on the same file.

**Event sync** (`EventSyncLock`) uses a row in `event_sync_lock` so two processes
that accidentally share a database will not run `syncEventQueues` /
`reconcileRoomChannels` in parallel. Stale locks older than 10 minutes are cleared
at startup.

**Scheduled occurrence jobs** (`node-schedule` in `event-jobs.registry`) remain
process-local — do not run multiple bot processes against the same DB.
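
The locking behavior described above can be sketched as follows. This is a minimal illustration only: the real `EventSyncLock` implementation is not shown in this diff, so the class name `SyncLockSketch`, its methods, and the in-memory `Map` standing in for the `event_sync_lock` table are all assumptions. The key shape follows the table's `(guild_id, event_id)` primary key, and the threshold follows the 10-minute stale rule.

```typescript
// Assumption: an in-memory Map stands in for the `event_sync_lock` table.
// The actual lock lives in SQLite and may be implemented differently.
const STALE_AFTER_MS: number = 10 * 60 * 1000; // 10 minutes

class SyncLockSketch {
  // key "<guild_id>:<event_id>" -> locked_at (epoch milliseconds)
  private rows = new Map<string, number>();

  private key(guildId: string, eventId: number): string {
    return `${guildId}:${eventId}`;
  }

  // Returns true if the lock was taken, false if a row already exists.
  tryAcquire(guildId: string, eventId: number, now: number): boolean {
    const k = this.key(guildId, eventId);
    if (this.rows.has(k)) return false;
    this.rows.set(k, now);
    return true;
  }

  // Removes the lock row after the sync finishes.
  release(guildId: string, eventId: number): void {
    this.rows.delete(this.key(guildId, eventId));
  }

  // Startup sweep: drops rows older than the stale threshold and
  // returns how many were removed.
  clearStale(now: number): number {
    let removed = 0;
    this.rows.forEach((lockedAt, k) => {
      if (now - lockedAt > STALE_AFTER_MS) {
        this.rows.delete(k);
        removed++;
      }
    });
    return removed;
  }
}
```

A second `tryAcquire` for the same guild and event returns `false` until the holder calls `release` or the startup sweep removes the stale row. In a real database the check-and-insert would need to be a single atomic statement, which the primary key on `(guild_id, event_id)` makes possible.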

## Re-provisioning via the CLI

When cloud-init changes (deploy script, sudoers, swap size, etc.), delete the
@@ -264,6 +293,30 @@ including **`droplet:delete`** for teardown. Typical sequence:
4. Restore each database if needed (stop container, copy `main.sqlite` back,
restart).

## SSH access and hardening

SSH (port 22) is reachable from the public internet by default. The DigitalOcean
cloud firewall created by `scripts/ensure-firewall.sh` allows inbound TCP/22 from
`0.0.0.0/0` and `::/0` unless you restrict it.

**Mitigations in this repo:**

- **fail2ban** — installed on first boot via cloud-init with an `sshd` jail
(5 failures → 1 hour ban). Requires a **re-provision** to apply on an
existing droplet.
- **Optional IP allowlist** — set the repository variable `SSH_ALLOW_IPS` to a
comma-separated list of CIDRs (e.g. `203.0.113.10/32,198.51.100.0/24`). The
next deploy run updates the DO firewall to allow SSH only from those addresses.
Useful when your admin IP or a VPN egress range is stable. GitHub Actions
runners use varying IPs, so do not rely on this alone for CI unless you also
allow the ranges you need for deploy SSH.
- **Project-level controls** — consider a DigitalOcean project firewall or
Tailscale-only SSH. Password authentication is already disabled via cloud-init.

Treat an open SSH port as a residual risk: keep the OS patched, rotate deploy
keys if compromised, and prefer restricting SSH at the network layer when
feasible.
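
To make the `SSH_ALLOW_IPS` behavior concrete, here is a minimal sketch of how a comma-separated value could be turned into firewall source ranges. The function name `sshSources` is hypothetical, and `scripts/ensure-firewall.sh` is not shown in this diff, so its actual parsing may differ. The fallback to `0.0.0.0/0` and `::/0` reflects the default described in this section.

```typescript
// Hypothetical parser (not in the repo): splits SSH_ALLOW_IPS into CIDRs and
// falls back to the open default when the variable is empty or unset.
function sshSources(allowIps: string | undefined): string[] {
  const cidrs = (allowIps ?? "")
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
  return cidrs.length > 0 ? cidrs : ["0.0.0.0/0", "::/0"];
}

console.log(sshSources("203.0.113.10/32, 198.51.100.0/24"));
console.log(sshSources(undefined));
```

The sketch does not validate that each entry is a well-formed CIDR; a real script would likely want to reject malformed entries rather than pass them to the firewall API.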

## Connect to the Droplet

Get the droplet IPv4 from the latest workflow's `discover` job, `doctl compute
6 changes: 6 additions & 0 deletions README-dev.md
@@ -159,6 +159,12 @@ npm run lint

This project is designed to run without compiling thanks to `@swc-node/register/esm`.

### Dependencies

**Dual schedulers:** Both `node-cron` and `node-schedule` are intentional. `node-cron` runs recurring cron expressions (queue `/schedule` jobs and DB maintenance in `db-scheduled-tasks.ts`). `node-schedule` runs one-shot `Date`-based jobs for event occurrence lifecycle (open, lock, cleanup, room pings/pulls in `event.utils.ts`). Consolidating would require reimplementing date-specific scheduling on top of cron or vice versa.

**`drizzle-kit`** is a dev dependency only — migrations are generated at build time and applied at runtime by `drizzle-orm`'s migrator.

## Migrating from the legacy project (pre June 2024)

Open a terminal and navigate to the following directory in this project: `data/migrations/legacy-export`.
38 changes: 38 additions & 0 deletions data/migrations/0015_early_black_knight.sql
@@ -0,0 +1,38 @@
DROP INDEX `event_occurrence_room_ping_occurrence_id_index`;--> statement-breakpoint
ALTER TABLE `event_occurrence_room_ping` ADD `guild_id` text REFERENCES guild(guild_id);--> statement-breakpoint
UPDATE `event_occurrence_room_ping` SET `guild_id` = (SELECT `guild_id` FROM `event_occurrence` WHERE `event_occurrence`.`id` = `event_occurrence_room_ping`.`occurrence_id`);--> statement-breakpoint
PRAGMA foreign_keys=OFF;--> statement-breakpoint
CREATE TABLE `__new_event_occurrence_room_ping` (
`guild_id` text NOT NULL,
`occurrence_id` integer NOT NULL,
`event_queue_id` integer NOT NULL,
`handled_at` integer NOT NULL,
PRIMARY KEY(`occurrence_id`, `event_queue_id`),
FOREIGN KEY (`guild_id`) REFERENCES `guild`(`guild_id`) ON UPDATE no action ON DELETE cascade,
FOREIGN KEY (`occurrence_id`) REFERENCES `event_occurrence`(`id`) ON UPDATE no action ON DELETE cascade,
FOREIGN KEY (`event_queue_id`) REFERENCES `event_queue`(`id`) ON UPDATE no action ON DELETE cascade
);
--> statement-breakpoint
INSERT INTO `__new_event_occurrence_room_ping`(`guild_id`, `occurrence_id`, `event_queue_id`, `handled_at`) SELECT `guild_id`, `occurrence_id`, `event_queue_id`, `handled_at` FROM `event_occurrence_room_ping`;--> statement-breakpoint
DROP TABLE `event_occurrence_room_ping`;--> statement-breakpoint
ALTER TABLE `__new_event_occurrence_room_ping` RENAME TO `event_occurrence_room_ping`;--> statement-breakpoint
CREATE INDEX `event_occurrence_room_ping_guild_id_occurrence_id_index` ON `event_occurrence_room_ping` (`guild_id`,`occurrence_id`);--> statement-breakpoint
DROP INDEX `event_occurrence_room_pull_occurrence_id_index`;--> statement-breakpoint
ALTER TABLE `event_occurrence_room_pull` ADD `guild_id` text REFERENCES guild(guild_id);--> statement-breakpoint
UPDATE `event_occurrence_room_pull` SET `guild_id` = (SELECT `guild_id` FROM `event_occurrence` WHERE `event_occurrence`.`id` = `event_occurrence_room_pull`.`occurrence_id`);--> statement-breakpoint
CREATE TABLE `__new_event_occurrence_room_pull` (
`guild_id` text NOT NULL,
`occurrence_id` integer NOT NULL,
`event_queue_id` integer NOT NULL,
`handled_at` integer NOT NULL,
PRIMARY KEY(`occurrence_id`, `event_queue_id`),
FOREIGN KEY (`guild_id`) REFERENCES `guild`(`guild_id`) ON UPDATE no action ON DELETE cascade,
FOREIGN KEY (`occurrence_id`) REFERENCES `event_occurrence`(`id`) ON UPDATE no action ON DELETE cascade,
FOREIGN KEY (`event_queue_id`) REFERENCES `event_queue`(`id`) ON UPDATE no action ON DELETE cascade
);
--> statement-breakpoint
INSERT INTO `__new_event_occurrence_room_pull`(`guild_id`, `occurrence_id`, `event_queue_id`, `handled_at`) SELECT `guild_id`, `occurrence_id`, `event_queue_id`, `handled_at` FROM `event_occurrence_room_pull`;--> statement-breakpoint
DROP TABLE `event_occurrence_room_pull`;--> statement-breakpoint
ALTER TABLE `__new_event_occurrence_room_pull` RENAME TO `event_occurrence_room_pull`;--> statement-breakpoint
CREATE INDEX `event_occurrence_room_pull_guild_id_occurrence_id_index` ON `event_occurrence_room_pull` (`guild_id`,`occurrence_id`);--> statement-breakpoint
PRAGMA foreign_keys=ON;
6 changes: 6 additions & 0 deletions data/migrations/0016_real_loki.sql
@@ -0,0 +1,6 @@
CREATE TABLE `event_sync_lock` (
`guild_id` text NOT NULL,
`event_id` integer NOT NULL,
`locked_at` integer NOT NULL,
PRIMARY KEY(`guild_id`, `event_id`)
);