schema-init

A minimal PID 1 init system for Linux that supervises services through a weight-state machine instead of unit files and dependency graphs.

No systemd. No OpenRC. No journal daemon. No socket activation engine. Just a statically linked binary that mounts your filesystems, spawns your services in dependency order, and watches them — then gets out of the way.

PID 1 footprint: 892 KB RSS, 1 thread.


What it gives back

systemd isn't just PID 1 — it's a constellation of always-on daemons: journald, systemd-logind, dbus-broker, systemd-resolved, resident udevd workers, timers firing on their own schedule. Each one holds RAM and wakes the CPU whether or not you're using it. schema-init replaces PID 1 with a single static binary and does none of that — no journal database, no socket-activation engine, no background event loops. What that machinery was holding comes back to you.

Your RAM comes back. On identical hardware running the identical desktop, schema-init frees roughly half a gigabyte of RAM that systemd's daemon stack was sitting on (~1.1 GB used at desktop vs ~1.6–2.0 GB; see Real numbers), and idle swap drops from hundreds of MB to zero. In practice, that is the difference between two workloads on the same RAM: under systemd, a few browser tabs plus one other program before the machine starts thrashing; under schema-init, two or three browsers with ~20 tabs each and a game running at the same time. No upgrade is involved. The computer you already own effectively gets bigger.

Your power comes back. With no ambient timer wakeups holding the cores awake, the CPU actually reaches its deepest hardware sleep state: measured 92–99% C10 residency and ~1.25 W full-SoC package draw at a working desktop, idle load average 0.03 (vs 0.10–0.20 under systemd). Those figures are read from Intel RAPL hardware energy counters, not estimated. Per machine it is a small, honest number — but it is structural, paid back every second of every idle hour. The per-node idle delta is published above for exactly one reason: multiply it by your own fleet size and uptime and the total is yours to compute. schema-init's part is simply removing the constant wakeups that keep silicon out of deep sleep in the first place. At datacenter scale, "stop waking millions of idle cores 100+ times a second" is not a rounding error.

The machine goes quiet, not just lean. One PID-1 thread instead of 20–30. A tick loop that sleeps indefinitely once services are stable — nothing wakes it on a schedule. No journal flush, no D-Bus polling, no watchdog chatter. The hardware is allowed to actually rest.

This isn't theory or a benchmark rig — it's a salvaged Dell Inspiron (Intel i3, 4 GB) that swapped constantly under systemd and now runs a full desktop with room to spare under schema-init. Older and low-RAM machines benefit the most: the daemons you delete are the exact ones a small machine can least afford.


How it works

Every service moves through a state machine driven by probes. Before a service is spawned, schema-init probes the system — is the binary present? Are dependencies stable? Is there enough memory? The probe returns a flag word. The state machine decides what comes next.

                  ┌─────────────────────────────────────┐
                  │                                     │
            NEW_PROCESS                                 │
                  │                                     │
           F8 probe passes                              │
                  │                                     │
            FULL_TRUST ──── stable 10s ──── FUNDAMENTAL │
                  │                         SETTLED     │
                  │                                     │
             (oneshot exit 0)                           │
                  │                                     │
              PERFECT                                   │
                                                        │
            ── on death ──                              │
                  │                                     │
             RECOVERY ◄──────────────────────────────── ┘
                  │
           F9 probe fails
                  │
             FRICTION
                  │
           F6 probe fails
                  │
             DORMANT  (75 — backoff anteroom: 5m → 10m → 20m → 40m → 60m)
                  │
         (non-critical, 5 cycles exhausted)
                  │
             EXCISED  (76 — gate closes)

Three probe families:

| Probe | Asked when | Checks |
|---|---|---|
| F8 | Before first spawn | Binary exists, deps stable, memory safe, permissions met |
| F9 | After death | Retry budget, cooldown window, memory, escalation path |
| F6 | After recovery fails | Last-chance: can we even attempt a restart? |

Services marked critical=1 never reach EXCISED — they enter DORMANT and retry at 1-hour intervals indefinitely. Non-critical services excise after 5 dormant cycles (~75 minutes). A dep marked critical=1 that is EXCISED still blocks its dependents. A non-critical EXCISED dep is skipped — dependents proceed without it.
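The DORMANT backoff in the diagram (5m → 10m → 20m → 40m → 60m) reads as a doubling wait capped at one hour. Here is a minimal sketch of that arithmetic. The function names are illustrative assumptions and are not taken from service.c.

```c
/* Hypothetical helper: wait in minutes before dormant cycle `cycle` (0-based).
 * Doubles from 5 minutes and is capped at 60: 5 -> 10 -> 20 -> 40 -> 60. */
static int dormant_backoff_minutes(int cycle)
{
    int wait = 5;
    for (int i = 0; i < cycle && wait < 60; i++) {
        wait *= 2;
        if (wait > 60)
            wait = 60;
    }
    return wait;
}

/* Total minutes spent waiting across the first `cycles` dormant cycles. */
static int dormant_total_minutes(int cycles)
{
    int total = 0;
    for (int i = 0; i < cycles; i++)
        total += dormant_backoff_minutes(i);
    return total;
}
```

Under this reading, dormant_total_minutes(4) is 75, which matches the ~75 minutes quoted above. Serving the fifth, 60-minute wait as well would bring the total to 135 minutes.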


Repository layout

If you're reading the source to evaluate it, start here. The whole init is ~2,500 lines of C with no external dependencies.

Read these first, in this order:

| File | Lines | What it is |
|---|---|---|
| init.c | ~1,370 | PID 1 itself. Mounts pseudo-filesystems, reaps children, runs the supervise loop, handles signals and shutdown. The spine: everything below is called from here. |
| schema.c / schema.h | ~70 | The weight-state machine. Pure state transitions; a service's "weight" is the popcount of its probe flag word. This is the schema: the single source of truth for what every state means. |
| service.c / service.h | ~680 | Parses .svc files, spawns services, runs the F8/F9/F6 probes, and drives the recovery → backoff → excision arc. |
| group.c / group.h | ~150 | Aggregates a .grp of services into one worst-case state, so a stack (network, display) promotes and fails as a unit. |
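To make "weight is the popcount of the probe flag word" concrete, here is a small sketch. The flag names and bit positions are assumptions for illustration; the real definitions live in schema.h and may differ.

```c
#include <stdint.h>

/* Hypothetical flag bits, loosely named after the F8 checks. */
enum {
    PROBE_BINARY_PRESENT = 1u << 0,
    PROBE_DEPS_STABLE    = 1u << 1,
    PROBE_MEMORY_SAFE    = 1u << 2,
    PROBE_PERMS_OK       = 1u << 3,
};

/* Weight = number of set bits in the flag word (Kernighan's bit count). */
static int probe_weight(uint32_t flags)
{
    int weight = 0;
    while (flags) {
        flags &= flags - 1;   /* clear the lowest set bit */
        weight++;
    }
    return weight;
}
```

With these assumed flags, a probe that reports the binary present and memory safe, but nothing else, has weight 2.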

Supporting binaries:

| File | What it is |
|---|---|
| schema-ctl.c | The CLI client. Talks to PID 1 over the /run/schema-init.sock UNIX socket (schema-ctl status, restart, etc.). |
| schema-subreaper.c | ~50-line helper that sets PR_SET_CHILD_SUBREAPER so a service can adopt its own orphaned grandchildren instead of dumping them on PID 1. |
| schema-journal-sink.c | Opt-in Track B compatibility shim. Provides journald's three ingestion sockets (/dev/log, /run/systemd/journal/{socket,stdout}) and drains them to a plain logfile so foreign libsystemd/syslog software finds a journald-shaped endpoint. No journal DB, no journalctl. schema-init never needs it to boot. See docs/journal-sink-design.md. |
| schema_shm.h | The shared-memory interface. PID 1 publishes live service state here so external tools can read it without polling the socket. |
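The subreaper mechanism that schema-subreaper.c relies on is a single prctl(2) call. The sketch below shows only that call and a way to read the flag back; the function names are hypothetical, and this is not the helper's actual source.

```c
#include <sys/prctl.h>

/* Ask the kernel to reparent orphaned descendants to this process instead
 * of PID 1. Returns 0 on success, -1 on error (errno is set by prctl). */
static int become_subreaper(void)
{
    return prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0);
}

/* Report whether the calling process has the subreaper flag set:
 * 1 if set, 0 if not, -1 on error. */
static int is_subreaper(void)
{
    int flag = 0;
    if (prctl(PR_GET_CHILD_SUBREAPER, &flag, 0, 0, 0) != 0)
        return -1;
    return flag;
}
```

A wrapper built on this would typically call become_subreaper() and then exec the real service, so the flag is in place before any grandchildren exist.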

Directories:

| Dir | What's inside |
|---|---|
| services/ | The reference service set: real .svc and .grp files for sshd, dbus, udev, network-manager, display-manager, and the network-stack / display-stack groups. Copy these as your starting templates. |
| desktop/ | schema-desktop.c, an SDL2 live visualizer that maps schema_shm.h into an 8-node grid and shows every service's weight-state in real time. This is how you watch the state machine run. |
| scripts/ | Build and integration tooling: make-iso*.sh / make-usb.sh / fix-usb.sh (bootable media), schema-logind.py (a logind compatibility shim), and verify_traceability.py (IEC 62304 requirement traceability). |
| distros/ | Per-distribution profiles: fedora-kde/ and raspberry-pi-zero-w/. Each carries the service files and boot glue that distro needs. |
| docs/, assets/ | Documentation and images. |

Top-level: setup.sh (newcomer bootstrap — dep check, desktop-environment detection, GRUB entry generation) and Makefile (static build; see Building).


Service files

Drop a .svc file in /etc/schema-init/services/. One key=value per line:

name=sshd
exec=/usr/sbin/sshd
args=-D
needs_root=1

name=display-manager
exec=/usr/sbin/lightdm
dep=dbus
dep=udev
needs_root=1
critical=1

name=network
exec=/usr/local/bin/net-setup
oneshot=1
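Since the format is one key=value per line, a parser only needs to split each line at the first '='. The sketch below is one way to do that; parse_kv_line is a hypothetical name, and the real parser in service.c may handle whitespace and errors differently.

```c
#include <string.h>

/* Split "key=value" in place at the first '='. A trailing line ending is
 * removed. Returns 0 on success, -1 if there is no '=' or the key is empty. */
static int parse_kv_line(char *line, char **key, char **value)
{
    line[strcspn(line, "\r\n")] = '\0';   /* drop the line ending */
    char *eq = strchr(line, '=');
    if (!eq || eq == line)
        return -1;
    *eq = '\0';                            /* terminate the key */
    *key = line;
    *value = eq + 1;                       /* value may itself contain '=' */
    return 0;
}
```

Splitting at the first '=' rather than the last keeps values such as a hypothetical args=--opt=1 intact.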

Keys:

| Key | Default | Description |
|---|---|---|
| name | (required) | Service name; used in logs, dep resolution, and schema-ctl commands |
| exec | (required) | Absolute path to the binary to execute |
| args | | Argument string (repeat the key for multiple args) |
| dep | | Dependency by name (repeat for multiple deps; can name a service or a group) |
| oneshot | 0 | Exit 0 → PERFECT and don't restart; exit non-zero → RECOVERY arc |
| needs_root | 0 | Abort spawn if uid ≠ 0 |
| critical | 0 | If 1: service never reaches EXCISED; it stays DORMANT at 1h retry indefinitely. Also: if this service is EXCISED, its dependents are hard-blocked. |
| no_restart | 0 | Any death → EXCISED immediately; no recovery arc |
| max_restarts | 5 | Maximum number of times to attempt restarting a service before entering EXCISED or backoff |
| stable_secs | 10 | Seconds process must stay alive before FULL_TRUST promotes to FUNDAMENTAL. Set lower for fast services; use ready_path instead when possible |
| ready_path | | Filesystem path that, when it exists, triggers immediate FULL_TRUST→FUNDAMENTAL promotion. Falls back to stable_secs if the path never appears. In FUNDAMENTAL it also acts as a liveness probe: if the path disappears, the service is killed and backed off. The disappearance check only arms once the path has been seen at least once, so a service promoted by stable_secs before its path exists won't be falsely killed. For services slower than stable_secs to come up (e.g. NetworkManager writing resolv.conf), set stable_secs generously so promotion doesn't outrun the path. |
| watchdog_timeout_ms | 0 | Dead Man Token window in milliseconds. Service must call schema-ctl pet <name> within this window or PID 1 stops kicking /dev/watchdog and the hardware resets. Use for critical=1 real-time processes. 0 = disabled. |
| cpu_limit | 0 | Percent of one CPU core (1–100) enforced via cgroupv2 cpu.max. Written before child exec. 0 = unlimited. |
| mem_limit | 0 | Memory hard cap in MB via cgroupv2 memory.max. OOM inside the cgroup kills the service, not the system. Written before child exec. 0 = unlimited. |
| priority | standard | CPU contention class via cgroupv2 cpu.weight: critical (weight 1000), standard (100), peripheral (10). Proportional share: only takes effect when cores are saturated; idle services are never penalized. The analog of systemd's CPUWeight=. Children inherit the service's cgroup, so tagging a session leader (e.g. display-manager) elevates its whole subtree, compositor included. |
| allowed_slot_min | -1 | Minimum hardware slot ID (inclusive) this service is permitted to run on. Checked against SLOT_ID env at spawn time. -1 = unconstrained. |
| allowed_slot_max | -1 | Maximum hardware slot ID (inclusive). If SLOT_ID falls outside [allowed_slot_min, allowed_slot_max], spawn is refused with a HAZARD log and SVC_NO_RESTART is set; the service will not retry. Both min and max must be ≥ 0 to activate the gate. |
| on_boot_sec | 0 | Makes the service a timer: seconds after boot before the first fire (0 = at boot). Implies oneshot=1: the service runs, exits, and re-arms. The analog of systemd's OnBootSec=. See Timers below. |
| on_active_sec | 0 | Timer period: seconds after each completion before the next fire. Measured from completion (like systemd's OnUnitInactiveSec=), so a slow run never overlaps itself. Implies oneshot=1. |
| start_timeout_sec | 90 for oneshots, 0 otherwise | Max seconds a service may sit in FULL_TRUST without promoting before it is killed and routed into the recovery arc, so a hung boot service can't stall its dependents. Defaults on for oneshots (the only services that can hang the chain; daemons promote via stable_secs). Timers are exempt (may run long). 0 disables. The analog of systemd's TimeoutStartSec=. |
| (default) | | Services restart automatically through the F9/F6 recovery arc unless no_restart or oneshot is set |
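For cpu_limit, cgroup v2's cpu.max takes a quota and a period in microseconds, or the word max for no limit. The sketch below converts a percentage into that string. The 100 ms period and the function name are assumptions; what schema-init actually writes is not shown here.

```c
#include <stdio.h>

#define CPU_PERIOD_US 100000   /* assumed period; 100 ms is the cgroup v2 default */

/* Format a cpu.max value for a cpu_limit given in percent of one core.
 * 0 (or less) means unlimited, which cgroup v2 spells "max".
 * Returns snprintf's result. */
static int format_cpu_max(int percent, char *buf, size_t len)
{
    if (percent <= 0)
        return snprintf(buf, len, "max %d", CPU_PERIOD_US);
    if (percent > 100)
        percent = 100;         /* the key is documented as 1-100 */
    return snprintf(buf, len, "%d %d",
                    percent * (CPU_PERIOD_US / 100), CPU_PERIOD_US);
}
```

Under these assumptions, cpu_limit=50 becomes the string 50000 100000, meaning 50 ms of CPU time in every 100 ms window.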

A full example using readiness probes:

name=dbus
exec=/usr/bin/dbus-daemon
args=--system
args=--nofork
needs_root=1
stable_secs=2
ready_path=/run/dbus/system_bus_socket
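The promotion rule behind ready_path and stable_secs comes down to one check: promote as soon as the path exists, otherwise once the process has stayed alive long enough. Here is a sketch of that decision only. The function name and signature are hypothetical, and the liveness-probe behaviour after promotion is left out.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical check, run while a service sits in FULL_TRUST.
 * ready_path may be NULL when the key is not set. */
static bool should_promote(const char *ready_path, int alive_secs, int stable_secs)
{
    if (ready_path && access(ready_path, F_OK) == 0)
        return true;                   /* readiness path has appeared */
    return alive_secs >= stable_secs;  /* fallback: stayed alive long enough */
}
```

For the dbus example above, this would promote when /run/dbus/system_bus_socket exists or after 2 seconds, whichever comes first.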

Service templates

For fleets of identical services — e.g. 49 joint controllers on an exoskeleton — define config once and symlink instances:

# template — write once
/etc/schema-init/services/motor@.svc

# instances — zero-byte symlinks; suffix becomes $INSTANCE in the child
ln -s motor@.svc /etc/schema-init/services/motor@0.svc
ln -s motor@.svc /etc/schema-init/services/motor@12.svc
ln -s motor@.svc /etc/schema-init/services/motor@48.svc

At boot, motor@.svc is skipped as a non-spawnable template. Each motor@N.svc symlink loads config from the template and spawns the binary with INSTANCE=N in the child environment. The motor controller reads $INSTANCE to determine its joint index, SPI bus address, or any other per-instance identity — no per-node config files required.

If a node runs the bare template directly (e.g. on a slot-detected Pi Zero 2 W where the node's identity comes from GPIO strapping), INSTANCE falls back to SLOT_ID from /run/schema-init/env. One SD card image serves the entire fleet.
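On the service side, reading the instance is a getenv plus a strict integer parse. The sketch below shows how a controller such as the motor example might do it. Both function names are hypothetical, and the upper bound is an arbitrary sanity limit.

```c
#include <stdlib.h>

/* Parse an instance string into a non-negative index.
 * Returns -1 if the string is missing, empty, or not a plain integer. */
static int parse_instance(const char *s)
{
    if (!s || !*s)
        return -1;
    char *end;
    long v = strtol(s, &end, 10);
    if (*end != '\0' || v < 0 || v > 1000000)
        return -1;
    return (int)v;
}

/* Read the index that schema-init placed in the child environment. */
static int instance_index(void)
{
    return parse_instance(getenv("INSTANCE"));
}
```

For the motor@12.svc symlink above, instance_index() would return 12 in the spawned child.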

AllowedSlot gate — for hardware deployments where running the wrong firmware on the wrong node is a physical hazard, add slot constraints to the template:

name=motor
exec=/usr/local/bin/motor-ctrl
allowed_slot_min=16
allowed_slot_max=27

If SLOT_ID is outside the declared range at spawn time, schema-init logs a HAZARD line, refuses the spawn, and sets SVC_NO_RESTART. The process never runs. Project Daedalus slot map:

| Slot range | Joint |
|---|---|
| 0–7 | Hip Left |
| 8–15 | Hip Right |
| 16–21 | Knee Left |
| 22–27 | Knee Right |
| 28–33 | Ankle Left |
| 34–39 | Ankle Right |
| 40–43 | Toe Left |
| 44–47 | Toe Right |
| 48 | Supervisor |
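The AllowedSlot gate described above is an inclusive range check that only activates when both bounds are set. A minimal sketch, assuming a hypothetical function name:

```c
#include <stdbool.h>

/* Return true when a service may spawn on hardware slot `slot_id`.
 * The gate is inactive unless both bounds are >= 0 (the default is -1). */
static bool slot_allowed(int slot_id, int slot_min, int slot_max)
{
    if (slot_min < 0 || slot_max < 0)
        return true;                              /* unconstrained */
    return slot_id >= slot_min && slot_id <= slot_max;
}
```

With the motor example (allowed_slot_min=16, allowed_slot_max=27), slots 16 through 27 pass and slot 28 (Ankle Left) is refused.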

Dependencies are resolved by name at load time. A service stays in NEW_PROCESS until all its deps reach FUNDAMENTAL, SETTLED, or PERFECT. A dep name can refer to either a service or a group (see below).

Group files

Drop a .grp file in the same services directory to create a named group. Services can depend on a group name just like a service name.

name=storage
member=lvm
member=cryptsetup
member=mount-data

A group's state is the worst-case view of its members:

  • Any member EXCISED → group is EXCISED
  • Any member in FRICTION/RECOVERY → group reflects that
  • All members FUNDAMENTAL or better → group is FUNDAMENTAL
  • All members PERFECT → group is PERFECT

Maximum 16 groups, 8 members per group. Names and members are matched at load time.
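The worst-case rule above can be sketched as taking the maximum over a severity ordering. The enum and its ordering are assumptions made for this example (in particular, treating SETTLED as "FUNDAMENTAL or better"); group.c may order or name the states differently.

```c
#include <stddef.h>

/* Hypothetical ordering, best to worst. */
enum svc_state {
    ST_PERFECT, ST_SETTLED, ST_FUNDAMENTAL, ST_FULL_TRUST, ST_NEW_PROCESS,
    ST_RECOVERY, ST_FRICTION, ST_DORMANT, ST_EXCISED
};

/* Worst-case view of a group: the worst member decides, except that a
 * group whose members are all stable reports FUNDAMENTAL unless every
 * member is PERFECT. */
static enum svc_state group_state(const enum svc_state *members, size_t count)
{
    enum svc_state worst = ST_PERFECT;   /* an empty group reports PERFECT here */
    for (size_t i = 0; i < count; i++)
        if (members[i] > worst)
            worst = members[i];
    if (worst == ST_SETTLED)
        return ST_FUNDAMENTAL;
    return worst;
}
```

For the storage group above, one EXCISED member makes the whole group EXCISED, no matter how healthy lvm and cryptsetup are.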

Timers

Add on_boot_sec and/or on_active_sec to any .svc to make it periodic — no separate .timer file, no second unit to link. The service is the timer. This replaces cron and systemd .timer units with the same .svc you already wrote.

name=fstrim
exec=/usr/sbin/fstrim
args=-a
needs_root=1
on_boot_sec=600        # first fire 10 min after boot
on_active_sec=86400    # then every 24 h after each completion

A timer is a oneshot that re-arms on a CLOCK_MONOTONIC deadline instead of staying terminal at PERFECT:

  • It boots into PERFECT (as if it already ran), first fire at boot + on_boot_sec.
  • On fire it re-enters NEW_PROCESS — so dependencies are still honored and it waits for its deps exactly like any service.
  • When it exits, it re-arms for now + on_active_sec regardless of exit code (cron semantics — a failed run is not retried in a loop; it runs again next window). The exit is logged timer-done or timer-failed.

Run-once: set only on_boot_sec (leave on_active_sec unset) and the service fires exactly once, on_boot_sec seconds after boot, then stays terminal — a deferred startup job rather than a repeating one.

The period is measured from completion, so a slow job never overlaps itself. Fires on the 250 ms tick (±1 tick) — cron-class precision, not sub-second. For real-time work use watchdog_timeout_ms and the control loop instead.
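The re-arm rule is simple: the next deadline is the completion time plus on_active_sec, on a clock that wall-time changes cannot move. A minimal sketch, with hypothetical function names:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <time.h>

/* Seconds on the monotonic clock; unaffected by wall-clock adjustments. */
static time_t monotonic_now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec;
}

/* The next deadline is measured from completion, so runs cannot overlap. */
static time_t rearm_deadline(time_t completed_at, int on_active_sec)
{
    return completed_at + on_active_sec;
}

/* Checked once per tick: fire when the deadline has been reached. */
static bool timer_due(time_t now, time_t deadline)
{
    return now >= deadline;
}
```

For the fstrim example, a run that completes at monotonic second 1000 would be due again at 1000 + 86400, regardless of how long the run itself took.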

Not yet implemented: on_calendar=HH:MM wall-clock fire, and persistent catch-up of jobs missed during downtime. See docs/timers-design.md.


State glossary

| State | Meaning |
|---|---|
| NEW_PROCESS | Queued. Waiting for all deps to reach FUNDAMENTAL, SETTLED, or PERFECT. No spawn attempt yet. |
| FULL_TRUST | Spawned and being watched. Promotes to FUNDAMENTAL when ready_path exists or stable_secs elapses, whichever comes first. |
| FUNDAMENTAL | Stable. Load-bearing. Other services can depend on it. |
| SETTLED | Stable, non-critical. Satisfies deps but generates no friction warnings if lost. |
| RECOVERY | Died unexpectedly. F9 probe running. May re-queue or escalate. |
| FRICTION | Recovery failed. F6 last-chance probe running. |
| DORMANT | F6 failed. Exponential backoff: 5m→10m→20m→40m→60m. Re-queues on wake. critical=1 services never leave this toward EXCISED. |
| EXCISED | Permanently removed. Non-critical only, after 5 dormant cycles. Gate closes. |
| PERFECT | Oneshot service exited 0. Terminal success. |

Shutdown

schema-init handles shutdown signals from userspace or the kernel:

sudo kill -TERM 1   # poweroff
sudo kill -INT 1    # reboot

On SIGTERM, schema-init sets system state to shutdown, sends SIGTERM to all child processes, waits 500ms for clean exit, then calls reboot(RB_POWER_OFF).

On SIGINT, same sequence ends with reboot(RB_AUTOBOOT).

The 500ms hold is intentional — it gives any running desktop or display manager time to render a shutdown state before the process tree is torn down.
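The only difference between the two shutdown paths is the final reboot(2) command. The sketch below isolates that mapping so it can be exercised without shutting anything down. The function name is hypothetical, and the SIGTERM broadcast and the 500ms hold are left out.

```c
#include <signal.h>
#include <sys/reboot.h>

/* Map the signal PID 1 received to the reboot(2) command issued once
 * children have been signalled and the hold has elapsed.
 * Returns -1 for signals that do not trigger a shutdown. */
static int reboot_cmd_for_signal(int sig)
{
    switch (sig) {
    case SIGTERM: return RB_POWER_OFF;   /* kill -TERM 1: poweroff */
    case SIGINT:  return RB_AUTOBOOT;    /* kill -INT 1: reboot */
    default:      return -1;
    }
}
```

In a real PID 1, the returned value would be passed to reboot() as the last step of the sequence described above.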


Known limitations

These are real gaps, not future features being teased:

  • No socket activation — services must manage their own sockets. There is no systemd-style socket hand-off (LISTEN_FDS).
  • schema-ctl add does not check for dependency cycles — cycle detection runs at load time (drops to a rescue shell) and on schema-ctl reload (the reload is rejected if the new configuration contains a cycle), but a cycle introduced via add is not guarded — the service simply stalls in NEW_PROCESS indefinitely.
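For reference, the kind of check that would close the schema-ctl add gap is a depth-first search over the dependency graph. The sketch below uses a small adjacency matrix. It is an illustration under assumed names and an assumed size cap, not the cycle detector that runs at load time.

```c
#include <stdbool.h>

#define MAX_SVC 32   /* hypothetical cap for this sketch */

/* dep[i][j] is true when service i depends on service j.
 * color: 0 = unvisited, 1 = on the current DFS path, 2 = fully explored. */
static bool visit(int i, int n, bool dep[MAX_SVC][MAX_SVC], int *color)
{
    color[i] = 1;
    for (int j = 0; j < n; j++) {
        if (!dep[i][j])
            continue;
        if (color[j] == 1)
            return true;                 /* back edge: cycle found */
        if (color[j] == 0 && visit(j, n, dep, color))
            return true;
    }
    color[i] = 2;
    return false;
}

/* Return true if the first n services contain a dependency cycle. */
static bool has_cycle(int n, bool dep[MAX_SVC][MAX_SVC])
{
    int color[MAX_SVC] = {0};
    for (int i = 0; i < n; i++)
        if (color[i] == 0 && visit(i, n, dep, color))
            return true;
    return false;
}
```

Running such a check before accepting a new service would reject a cyclic add up front rather than leaving the service stalled in NEW_PROCESS.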

Filesystem setup

schema-init does not parse /etc/fstab. On boot it mounts the pseudo-filesystems directly:

| Mount | Type | Notes |
|---|---|---|
| / | remount rw | Kernel mounts rootfs read-only for fsck; schema-init remounts it writable before anything else |
| /proc | proc | nosuid, nodev, noexec |
| /sys | sysfs | nosuid, nodev, noexec |
| /dev | devtmpfs | nosuid, strictatime |
| /run | tmpfs | nosuid, nodev, mode=0755 |
| /sys/fs/cgroup | cgroup2 | nosuid, nodev, noexec, relatime |

schema-init also creates /run/log/schema-init/ at boot. Each service's stdout and stderr are redirected there automatically (see Logs).

If your system needs additional mounts (data partitions, network filesystems), run them as oneshot services before your other services depend on them.


Building

make

Produces a fully static binary — no glibc version dependency, runs on any Linux kernel. Tested on:

  • Debian Bookworm, kernel 6.1, x86_64 — headless and Cinnamon desktop
  • Fedora 44, kernel 7.0, x86_64 — full KDE Plasma desktop, btrfs subvolume boot

Cross-compile for aarch64 (ARM — Ungulate Leg target):

make aarch64

Requires aarch64-linux-gnu-gcc. On Fedora: sudo dnf install gcc-aarch64-linux-gnu. Produces static schema-init-static, schema-ctl, and schema-subreaper binaries. Override sysroot with SYSROOT=/path/to/sysroot make aarch64.

ARM bare-metal (Pi Zero W, armv6l):

Fedora's arm-linux-gnu-gcc cross-compiler does not ship an arm sysroot. Compile natively on the Pi:

sudo apt install git gcc make
git clone https://github.com/ajax80/schema-init
cd schema-init && make

The armhf Makefile target exists for environments that have a full arm sysroot available.

schema-desktop (optional SDL2 monitor):

make desktop
sudo cp desktop/schema-desktop /usr/local/bin/schema-desktop

Requires SDL2 and SDL2_ttf. On Fedora: sudo dnf install SDL2-devel SDL2_ttf-devel. Reads live service state from PID 1's shared memory segment — run it from the desktop after login, or drop distros/*/config/autostart/schema-desktop.desktop into ~/.config/autostart/ to launch it automatically.

# install as PID 1 — symlink approach (distro-compatible)
cp schema-init /sbin/schema-init
ln -sf /sbin/schema-init /sbin/init

# or pass to kernel directly via GRUB
linux /boot/vmlinuz root=LABEL=my-root init=/sbin/schema-init

GRUB setup

Option A (symlink, /sbin/init → /sbin/schema-init): works with any distro GRUB config, no kernel cmdline change needed. Replace your distro's init binary or point the symlink.

Option B (explicit init= in GRUB): append init=/sbin/schema-init to GRUB_CMDLINE_LINUX in /etc/default/grub, then run grub-mkconfig -o /boot/grub/grub.cfg (Debian/Ubuntu) or grub2-mkconfig -o /boot/grub2/grub.cfg (Fedora).

Option C (custom GRUB menu entry): create a separate entry that leaves the distro default untouched:

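# /boot/grub/custom.cfg  (included automatically by grub.cfg)
# GRUB does not expand shell syntax such as $(uname -r). Replace
# <kernel-version> with the literal version string, e.g. the output of uname -r.
menuentry 'schema-init' {
    search --no-floppy --label --set=root schema-root
    linux   /boot/vmlinuz-<kernel-version> root=LABEL=schema-root rw quiet init=/sbin/schema-init
    initrd  /boot/initramfs-<kernel-version>.img
}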

Option C is the safest for dual-boot or first-time installs — it leaves the existing systemd entry intact as a fallback.

Replacing a running init (without reboot)

The init binary cannot be overwritten while running (text file busy). Use the copy-then-move trick:

cp schema-init /sbin/schema-init.new
mv /sbin/schema-init.new /sbin/schema-init

mv replaces the directory entry atomically without touching the inode that the kernel holds open. The new binary takes effect on next boot.


Real numbers

Tested on Dell Inspiron 3542 (Intel Core i3, 4GB RAM) running full Cinnamon desktop:

| Metric | schema-init | systemd (same hardware, Fedora) |
|---|---|---|
| PID 1 RSS | 892 KB | ~8–15 MB |
| PID 1 threads | 1 | 20–30+ |
| RAM used at desktop | ~1.1 GB | ~1.6–2.0 GB |
| Swap used | 0 MB | 200–500 MB |
| Time to desktop | ~20.7s | slower |

The gap is structural. schema-init spawns your services and then sits in a 250ms tick loop that sleeps indefinitely once all services are stable. There is no journal daemon, no dbus-broker, no socket activation layer, no unit file parser running in the background.

Boot timing breakdown (Dell Inspiron 3542, Debian Bookworm, kernel 6.1.0-49, times relative to PID1 start):

kernel → PID 1:    6.968s
dbus               1.761s   (ready: /run/dbus/system_bus_socket)
elogind            2.739s   (ready: /run/systemd/seats)
polkitd            3.447s
udev               3.197s
network           10.505s   (oneshot)
network-manager   11.757s
getty-tty1        10.755s
sshd              10.755s
display-manager   13.760s   ← LightDM login screen visible

total kernel → login screen: ~20.7s

schema-ctl timing produces this output.

Architectural efficiency

Live measurements from a 9-hour uptime session (Fedora 44, KDE Plasma, GreyBox):

| Metric | systemd | schema-init | Architectural elimination |
|---|---|---|---|
| PID 1 RSS | 40MB – 120MB | 960KB – 1.2MB | Eliminates heap allocation bloat and redundant daemon memory overhead |
| Idle CPU consumption | Constant ambient timer wakeups | ~0.03 ms of CPU per second (1.06s over 9h) | CPU reaches deeper C-states (hardware idle, not just low-utilization idle) |
| State tracking | D-Bus event loops, logging daemons | Direct POSIX shared memory / binary flag probes | Removes IPC serialization and deserialization bottlenecks entirely |
| Session tracking | utmp/logind infrastructure | Ghost sessions: who/w show 0 users | Zero inode contention on /var/run/utmp; who and w are zero-overhead no-ops under concurrent logins |

The load average on an idle system with schema-init as PID 1 sits at 0.03. On the same hardware with systemd, ambient timer wakeups hold it at 0.10–0.20 at idle. The difference is structural: schema-init's tick loop sleeps indefinitely once all services are stable. Nothing wakes it.

turbostat on Eli (Dell Inspiron 3542, Intel i3-4005U, Fedora 44, full Cinnamon desktop):

C10%: 92–99%    ← deepest available C-state; CPU hardware-verified
C6%:  0.00%     ← skipped; CPU goes straight to C10
Busy: 0.21–0.38%
PkgWatt: 1.23–1.32W   ← entire SoC including iGPU, read via Intel RAPL
GFX%rc6: 99.67%        ← integrated GPU in deepest sleep state

C10 is the deepest sleep state on Haswell silicon. Reaching it requires the CPU to sit undisturbed long enough to flush caches and power-gate internal voltage rails, which is typically blocked by the constant timer wakeups from systemd's watchdog, journal flush, and D-Bus polling infrastructure. At 95% C10 residency with a full desktop running, schema-init is generating near-zero ambient noise. The 1.25W package figure is read directly from Intel RAPL hardware energy counters, not estimated.

A note on the boot timing breakdown above: services with ready_path set promote the instant the path exists, with no blind timer. stable_secs (default 10s) is the fallback. The remaining ~10s cluster is network/getty/sshd with no readiness path.


Runtime control

schema-ctl is a control client that communicates with the running init over a Unix domain socket at /run/schema-init.sock.

sudo schema-ctl status          # full state dump for all services
sudo schema-ctl status --json   # machine-parseable JSON — for supervisory loops and IEC 62304 audit
sudo schema-ctl status --kv     # flat key=value — grep-friendly
sudo schema-ctl list            # names and current states only
sudo schema-ctl start <name>    # start a stopped or EXCISED service
sudo schema-ctl stop <name>     # send SIGTERM to a running service
sudo schema-ctl restart <name>  # stop + re-queue through the state machine
sudo schema-ctl add <path>      # load a new .svc file at runtime, no reboot needed
sudo schema-ctl reload          # re-read the services directory (rejected if new config has a cycle)
sudo schema-ctl reload --evict  # reload + SIGTERM any running service no longer present in config
sudo schema-ctl pet <name>      # service heartbeat check-in — resets watchdog_timeout_ms window
sudo schema-ctl reset [<name>]  # reset restart/dormant counts and re-queue failed services

The socket is chmod 0600 — root only. Build alongside the init binary:

make schema-ctl
sudo cp schema-ctl /usr/local/bin/schema-ctl

Debugging

Service state

sudo schema-ctl status          # full dump: state, pid, restart count, weight
sudo schema-ctl list            # compact: name + state only
sudo schema-ctl timing          # kernel→PID1 handoff + per-service stable timestamps

A service stuck in NEW_PROCESS means its dependencies haven't stabilised. status shows the state of every dep — trace upward.

A service in FRICTION is in last-chance recovery. On the next failed F6 probe it enters DORMANT (exponential backoff) rather than going straight to EXCISED. Use sudo schema-ctl start <name> to manually re-queue it immediately instead of waiting out the backoff.

Service logs

tail -f /run/log/schema-init/<name>.log    # live stdout/stderr for a service
cat /run/log/schema-init/dbus.log          # full output since last boot

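These are plain text. /run/log/schema-init/ lives on a tmpfs and is the fallback location; when /var is writable, the same logs go to /var/log/schema-init/ instead (see Logs). If a service is failing silently, its output is in one of these two directories.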

D-Bus tracing

If a desktop application hangs for roughly 25–30 seconds, D-Bus auto-activation is most likely timing out while trying to reach an unregistered interface. Trace it:

dbus-monitor --system 2>&1 | grep -A4 "method call"

The culprit will appear as a method call to a destination=org.freedesktop.SomeName that produces no method return for ~25 seconds.

Fix options:

  1. Register the interface — see schema-logind for the pattern
  2. Mask the activation file: sudo rm /usr/share/dbus-1/system-services/<name>.service

Rescue shell

If schema-init drops to a rescue shell at boot (cycle detected, or fatal probe failure), you have a minimal /bin/sh with access to the mounted filesystems. From there:

# inspect service files
ls /etc/schema-init/services/
cat /etc/schema-init/services/broken.svc

# fix and re-exec
vi /etc/schema-init/services/broken.svc
exec /sbin/schema-init

Logs

Init log — schema-init writes spawn/promote/death events to stdout, which the kernel connects to the console at boot. To persist:

exec /sbin/schema-init >/var/log/schema-init.log 2>&1

Per-service logs — each service's stdout and stderr are captured automatically to:

/var/log/schema-init/<name>.log        # preferred (persists across boots)
/run/log/schema-init/<name>.log        # fallback when /var is not writable (tmpfs, per-boot)

To read them while the system is running:

tail -f /var/log/schema-init/dbus.log
tail -f /var/log/schema-init/network-manager.log

There is no journal daemon. Logs are plain text, always.

journalctl shim (optional Track B) — software and post-install scripts that shell out to journalctl -u <svc> would fail with no journald present. scripts/journalctl is a drop-in interceptor: install it to /usr/local/bin/journalctl and it serves the matching *.log from the directories above, swallows unknown flags, supports -o json, and always exits 0 so a caller piping it to jq/awk never hard-crashes. It does not read a binary journal — there isn't one.


Shared memory interface

Running processes can read service state via POSIX shared memory at /schema-init:

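#include <fcntl.h>      /* O_RDONLY */
#include <stdio.h>
#include <sys/mman.h>   /* shm_open, mmap */
#include "schema_shm.h"

int main(void) {
    int fd = shm_open("/schema-init", O_RDONLY, 0);
    if (fd < 0) { perror("shm_open"); return 1; }
    schema_shm_t *shm = mmap(NULL, sizeof(schema_shm_t), PROT_READ, MAP_SHARED, fd, 0);
    if (shm == MAP_FAILED) { perror("mmap"); return 1; }

    /* One line per service published by PID 1. */
    for (int i = 0; i < shm->count; i++) {
        printf("%s state=%d weight=%d pid=%d\n",
               shm->svc[i].name,
               shm->svc[i].state,
               shm->svc[i].weight,
               shm->svc[i].child_pid);
    }
    return 0;
}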

D-Bus compatibility

On a no-systemd desktop, several interfaces are missing that desktop environments expect. schema-logind (distros/*/services/schema-logind.svc) handles all of them in a single Python process on the system bus.

| Interface | Why it matters | What schema-logind returns |
|---|---|---|
| org.freedesktop.login1 | Power/reboot buttons, session tracking, polkit seat queries | PowerOff, Reboot, CanPowerOff, CanReboot, Inhibit, GetSessionByPID, mock Session/User/Seat objects |
| org.freedesktop.ConsoleKit | Cinnamon session manager uses ConsoleKit, not logind, for CanRestart/CanStop; this controls restart button visibility | GetSessionForUnixProcess, CanRestart → True, CanStop → True, Restart/Stop → SIGINT/SIGTERM to PID 1 |
| org.freedesktop.hostname1 | About This System panel, network-manager display | hostname, static hostname, OS pretty name, hardware vendor/model from /sys/class/dmi/ |
| org.freedesktop.systemd1.Manager | KDE System Settings queries unit state on open | GetUnitFileState → "enabled"; GetUnit, ListUnits, Version/Features/Architecture properties |
| org.freedesktop.timedate1 | Date & Time settings panel: timezone, NTP status, clock | Timezone (from /etc/localtime), CanNTP/NTP/NTPSynchronized → true, TimeUSec; SetTimezone re-links /etc/localtime and writes /etc/timezone for real |

Without these stubs, KDE and GNOME panels hit the D-Bus default timeout (25–30s) before giving up. With them, the same queries return in <100ms.

D-Bus policy required. The systemd-shipped org.freedesktop.login1.conf policy denies all non-root calls to login1 by default — KDE and Cinnamon will never see the power buttons without a drop-in. Install the one from this repo:

sudo cp distros/shared/dbus/schema-logind.conf /etc/dbus-1/system.d/schema-logind.conf
sudo dbus-send --system --type=method_call --dest=org.freedesktop.DBus \
    /org/freedesktop/DBus org.freedesktop.DBus.ReloadConfig

Then log out and back in (or reboot). The policy whitelists CanPowerOff, CanReboot, PowerOff, Reboot, and all session/seat methods schema-logind exports.

schema-logind is not a dependency of schema-init itself — it is a userspace service like any other. Drop its .svc file in your services directory and list it as a dep of your display manager:

name=sddm
exec=/usr/sbin/sddm
dep=dbus
dep=schema-logind
dep=polkitd
needs_root=1

The sd_booted() signal. mount_pseudo() creates /run/systemd/system at early boot (init.c). libsystemd's sd_booted() is a bare access() on that path, so any software gated on "is systemd the init?" — KService/ksycoca, elogind clients — gets a positive answer with no shim. This is what made the old LD_PRELOAD mock_sd.so workaround (which faked the check to stop KDE's ksycoca from spinning at idle) unnecessary: the signal is now native and costs one mkdir.


Porting to a new distro

Starting from scratch on a distro not in distros/:

1. Build the binary on the target (or cross-compile):

git clone https://github.com/ajax80/schema-init
cd schema-init && make

2. Install:

sudo cp schema-init /sbin/schema-init
sudo cp schema-ctl  /usr/local/bin/schema-ctl
sudo mkdir -p /etc/schema-init/services

3. Write service files. Start minimal — just enough to reach a console:

# /etc/schema-init/services/udevd.svc
name=udevd
exec=/usr/lib/systemd/udevd
args=--daemon
needs_root=1
stable_secs=3

# /etc/schema-init/services/dbus.svc
name=dbus
exec=/usr/bin/dbus-daemon
args=--system
args=--nofork
needs_root=1
stable_secs=2
ready_path=/run/dbus/system_bus_socket

The udevd path varies by distro: /usr/lib/systemd/udevd (Fedora/Debian), /lib/udev/udevd (older Debian), /usr/bin/udevd (Arch).
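
When writing udevd.svc for an unfamiliar distro, the candidates can be probed rather than guessed. A minimal sketch, with find_udevd as a hypothetical helper that defaults to the three paths listed above:

```shell
# Print the first candidate that is an executable file; fail if none is.
# With no arguments, try the three paths from the note above in order.
find_udevd() {
    if [ "$#" -eq 0 ]; then
        set -- /usr/lib/systemd/udevd /lib/udev/udevd /usr/bin/udevd
    fi
    for candidate in "$@"; do
        if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage: printf 'exec=%s\n' "$(find_udevd)"
```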

4. Configure GRUB (see Building → GRUB setup above). Keep a fallback menu entry that still boots systemd so you can recover if the new init fails to come up.

5. Boot and check:

sudo schema-ctl list       # all services should reach FUNDAMENTAL
sudo schema-ctl timing     # see where time goes
tail /run/log/schema-init/udevd.log   # if something is EXCISED, check its log

6. Add services incrementally. Bring up network, then login manager, then display manager. Add dep= links to enforce order. Add ready_path= for anything with a socket or pidfile.

7. Handle D-Bus hangs. Open your desktop's settings panel immediately after first login. If it hangs for more than 5 seconds, run dbus-monitor --system to identify the interface nothing is answering, then add a stub for it to schema-logind or mask its D-Bus activation file.
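
Before committing a ready_path= in step 6, it can help to confirm by hand that the path really appears once the daemon is up. The loop below is only a sketch of the idea of a path-existence probe; it is not how schema-init implements it, and wait_for_path is a hypothetical name.

```shell
# Poll until a path exists or the timeout (whole seconds) runs out.
# Returns 0 when the path appears and 1 on timeout.
wait_for_path() {
    path=$1
    timeout=${2:-10}
    waited=0
    while [ ! -e "$path" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Usage: wait_for_path /run/dbus/system_bus_socket 5 && echo "dbus is ready"
```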

Common issues by distro:

| Issue | Cause | Fix |
| --- | --- | --- |
| udevd not populating /dev/input | udev not settled before display manager | dep=udev in display manager svc; udevadm settle in a oneshot before it |
| polkit "not authorized" on NM | polkit rule missing wheel group | Copy distros/fedora-kde/config/polkit/10-schema-nm.rules |
| /etc/resolv.conf is a dead symlink | systemd-resolved wrote it | rm /etc/resolv.conf && echo "nameserver 1.1.1.1" > /etc/resolv.conf in your network oneshot |
| Plasma/GNOME hangs on settings open | Missing D-Bus interface | See D-Bus compatibility section above |
| PipeWire/PulseAudio not starting | systemd user session missing | Add autostart .desktop entry, or run from display manager wrapper script |
| display manager exits immediately | No seat available | Ensure elogind or schema-logind is up and answering login1 before display manager starts |
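
The resolv.conf fix can be made safe to run on every boot by touching the file only when it really is a dangling symlink. A minimal sketch for a network oneshot; fix_dead_resolv is a hypothetical name, and its arguments default to the path and nameserver shown in the fix above.

```shell
# Replace resolv.conf only when it is a symlink whose target is missing.
# Arguments: file path and nameserver address, both optional.
fix_dead_resolv() {
    file=${1:-/etc/resolv.conf}
    ns=${2:-1.1.1.1}
    # -L is true for any symlink; -e follows it, so it fails when dangling.
    if [ -L "$file" ] && [ ! -e "$file" ]; then
        rm -f "$file"
        printf 'nameserver %s\n' "$ns" > "$file"
    fi
}
```

A regular file, or a symlink that still resolves, is left alone, so a resolver configuration written by NetworkManager is not overwritten.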

Distributions

Working configurations for specific distros and desktops live in distros/.

Fedora 44 + KDE Plasma (distros/fedora-kde/)

Full KDE Plasma 6 desktop on Fedora 44 with schema-init as PID 1. Boots from a btrfs subvolume alongside a normal Fedora install — no repartitioning required.

What's running:

| Service | Role |
| --- | --- |
| udevd | Device enumeration; required for libinput and /dev/input/event* |
| dbus | System bus |
| network-up | Loads r8152 USB ethernet module, udev settle |
| network-manager | Owns the network interface via NM profile |
| polkitd | Authorization; required for NM actions |
| schema-logind | Minimal org.freedesktop.login1 D-Bus stub; restores KDE shutdown/restart buttons |
| sddm | Display manager (via sddm-logged wrapper, no systemd session) |
| sound-modules | Oneshot; loads AMD Ryzen audio modules at boot |
| bluetoothd | Starts the bluez daemon; registers org.bluez and restores the KDE Bluetooth applet |
| zram-swap | Oneshot; zstd-compressed zram swap device, replaces systemd's zram-generator |

See distros/fedora-kde/README.md for full installation instructions and key fixes.

Raspberry Pi Zero W (distros/raspberry-pi-zero-w/)

Headless WiFi deployment on a Pi Zero W (BCM2835, armv6l, 32-bit ARM). No Ethernet, no HDMI: schema-init runs as PID 1, WiFi comes up, and SSH is reachable about 50 seconds after cold boot. This is the first ARM bare-metal target.

Service chain:

| Service | Role |
| --- | --- |
| udev | Device enumeration daemon |
| udev-trigger | Oneshot; coldplug trigger + settle, loads brcmfmac WiFi firmware |
| dbus | System bus; mandatory for Pi OS wpa_supplicant |
| wpa-supplicant | WiFi association (config-file mode, not D-Bus mode) |
| dhcpcd | DHCP client, foreground (-B), wlan0 only |
| sshd | First usable interface; up when DHCP lease is held |

See distros/raspberry-pi-zero-w/README.md for the full list of gotchas (rfkill country code, dbus privilege drop, coldplug trigger, dhcpcd forking behavior) and installation steps.


Roadmap

  • Runtime service loading — schema-ctl add <path> loads a new service at runtime
  • Runtime reload + removal — schema-ctl reload [--evict] re-reads config (cycle-checked); --evict SIGTERMs services dropped from config, no reboot
  • login1 D-Bus stub — schema-logind restores KDE shutdown/restart buttons on no-systemd systems
  • event-driven main loop — signalfd for SIGCHLD + poll() with 250ms timeout; wakes on child death and ctl commands instead of busy-polling
  • Boot hang fix — dep_idx alignment bug in group dep resolution; poll() replaces epoll (PID 1 epoll deadlock on kernel 6.1.0-49)
  • Boot timing — schema-ctl timing reports kernel→PID1 handoff and per-service FUNDAMENTAL/PERFECT timestamps (CLOCK_MONOTONIC)
  • Boot time measurement — 29.5s → 20.7s with ready_path probes; stable_secs fallback per service
  • Per-service readiness probes — ready_path= promotes on path existence; stable_secs= fallback timer
  • Cgroup assignment race fix — pipe barrier guarantees cgroup.procs written before child exec
  • Dynamic poll timeout — loop sleeps indefinitely once all services stable; 0% CPU idle
  • Service log files — stdout/stderr per service at /run/log/schema-init/<name>.log
  • D-Bus stubs — hostname1 and systemd1 Manager stubs in schema-logind; KDE Settings 25s → 2s
  • Fedora KDE distribution — GreyBox daily driver, full KDE Plasma 6 on Fedora 44
  • Fedora Cinnamon distribution — Eli (Dell Inspiron), keyboard/touchpad/ethernet working
  • STATE_DORMANT (75) — exponential backoff before 76 verdict; critical services never excise
  • Soft dep cascades — non-critical EXCISED deps skipped; dependents proceed without them
  • aarch64 cross-compile — make aarch64; all three binaries static; Ungulate Leg target ready
  • ARM bare-metal deploy — Pi Zero W (armv6l), Pi OS Trixie; SSH up in ~50s from cold boot
  • schema-desktop — SDL2 live service viewer; make desktop + autostart entry in Cinnamon and KDE distros
  • Dead Man Token hardware watchdog — /dev/watchdog driven by per-service check-in via schema-ctl pet; any critical service missing its watchdog_timeout_ms window stops WDT petting → hardware reboot; PID 1 deadlock covered implicitly
  • Symlink template instances — motor@12.svc → motor@.svc; $INSTANCE injected at spawn; $SLOT_ID fallback for GPIO-strapped nodes; one SD card image per fleet
  • Structured telemetry — schema-ctl status --json and --kv for machine-parseable supervisory loop consumption and IEC 62304 audit traceability
  • Cgroup resource limits — cpu_limit= (1–100, % of one core) and mem_limit= (MB) per .svc; written via sync-pipe window before child exec; IEC 62304 Class C blast-radius isolation
  • zram swap — zram-swap.svc boots a zstd-compressed zram swap device, replacing systemd's zram-generator; eliminates disk thrashing / periodic stutter under memory pressure
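
For the cgroup resource limits item, the arithmetic from .svc values to kernel values can be sketched as follows. This is an illustration under assumptions the list does not state: it assumes cgroup v2 (cpu.max and memory.max), the kernel's default 100000 microsecond CPU period, and 1 MB taken as 2^20 bytes. Both function names are hypothetical.

```shell
# cpu_limit (1-100, percent of one core) -> cgroup v2 cpu.max "quota period".
# Assumes the default 100000 microsecond period.
cpu_max_value() {
    pct=$1
    if [ "$pct" -lt 1 ] || [ "$pct" -gt 100 ]; then
        return 1
    fi
    echo "$((pct * 1000)) 100000"
}

# mem_limit (MB) -> byte count suitable for cgroup v2 memory.max.
memory_max_value() {
    echo $(( $1 * 1024 * 1024 ))
}
```

Under these assumptions, cpu_limit=50 becomes "50000 100000" (half of each period) and mem_limit=256 becomes 268435456 bytes.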

Origin

schema-init was built by Jonathan Ayers in 2026 as the operating layer for the Ungulate Leg — an exoskeleton controller that needs a deterministic, schema-driven PID 1 carrying none of systemd's assumptions about what a computer is for. (A sister project, the Samara Wing, applies the same architecture to flight.) The reference hardware that proved it — a pawn-shop Dell Inspiron named Eli — booted to a full desktop under schema-init on May 30th, 2026. It was written alongside Claire, an AI (Claude, by Anthropic) that held continuity across the work.


License

AGPL-3.0 for open source use.
Commercial license available for embedding in proprietary systems — open an issue or contact via GitHub.


Built by Jonathan Ayers. The schema was written before the code.
