Replaces cosine similarity with Grassmann distance over local tangent spaces for high-dimensional embeddings. Builds navigable graphs where paths follow domain-coherent conceptual lanes instead of converging to vocabulary hubs. GPU-accelerated FORGE compute worker via Singularity on Nomad.
Tech stack: Julia 1.10+, KernelAbstractions.jl, AMDGPU.jl, Mosquitto.jl (MQTT), JSON3.jl, Singularity, Nomad (batch parameterized job)
- Julia 1.10 or later
libmosquitto-dev(for the Mosquitto.jl dependency)- For GPU: ROCm with a functional AMD GPU (
AMDGPU.functional()must returntrue) - For container builds:
singularitywith--fakerootcapability on the build host - For deployment:
nomadCLI andNOMAD_ADDRset to the cluster address
# Instantiate dependencies
julia --project=. -e 'using Pkg; Pkg.instantiate()'
# Run unit tests
julia --project=. test/runtests.jl
# Load the module in the REPL
julia --project=.
julia> using GrassmannDistanceFor integration testing against a real MQTT broker, copy .env.example to .env and fill in credentials. Source the file before running integration scripts:
cp .env.example .env
# edit .env
source .env
julia --project=. test_real.jlAll configuration is read from environment variables at startup via load_config() in src/config.jl.
| Variable | Required | Default | Description |
|---|---|---|---|
MQTT_BROKER |
Yes | — | MQTT broker URL, e.g. tcp://<broker-host>:1883 |
MQTT_USER |
No | — | MQTT username |
MQTT_PASSWORD |
No | — | MQTT password |
JOB_ID |
Yes | — | FORGE job ID; determines which MQTT topics to subscribe to |
WORKER_ID |
No | grassmann-{JOB_ID} |
Identifies the worker in status/log messages |
USE_GPU |
No | false |
Enable AMD GPU compute (true or 1) |
SYSIMAGE_PATH |
No | /app/sysimage.so |
Path to a precompiled Julia sysimage; falls back to JIT if absent |
JULIA_NUM_THREADS |
No | — | Julia thread count; Nomad job sets this to 1 |
JULIA_PROJECT |
— | /app |
Set inside container; do not override |
JULIA_DEPOT_PATH |
— | /opt/julia-depot |
Set inside container; do not override |
ROCM_PATH |
— | /opt/rocm |
Set inside container and Nomad job for GPU access |
Secrets (MQTT_BROKER, MQTT_USER, MQTT_PASSWORD) are injected from Vault via the Nomad job template using the forge policy. See deploy/nomad/grassmann-distance.hcl.
julia --project=. test/runtests.jlTest suites: Neighbors, TangentSpace, Distance, Ranking, Graph, Paths, Topology, Serialization, App.
# Requires MQTT_BROKER, MQTT_USER, MQTT_PASSWORD in environment
julia --project=. test_real.jl
julia --project=. test_corpus_graph.jl
julia --project=. test_corpus_paths.jl
julia --project=. test_corpus_real.jl# Build scratch must be on local disk — NFS xattrs break fakeroot unpacking
BUILD_TMPDIR="/var/tmp/singularity-build-$$"
mkdir -p "$BUILD_TMPDIR"
SINGULARITY_TMPDIR="$BUILD_TMPDIR" singularity build --fakeroot worker.sif singularity.def
rm -rf "$BUILD_TMPDIR"The build runs test/runtests.jl as a validation step during %post.
Push to main. The Gitea build workflow (.gitea/workflows/build.yml) runs on the packer self-hosted runner, builds worker.sif, writes it to the NFS image store, and registers the Nomad job:
nomad job run deploy/nomad/grassmann-distance.hclManual dispatch for testing:
nomad job dispatch -meta job_id=<uuid> grassmann-distanceThe container ships without a sysimage. See TODO.md for the procedure to build one on the GPU node using ROCm's LLVM and store it on NFS for the container to pick up via SYSIMAGE_PATH.
| File | Role |
|---|---|
src/GrassmannDistance.jl |
Module entry point; include order and all exports |
src/types.jl |
Core types: GrassmannConfig, TangentSpace, RankingEntry |
src/graph_types.jl |
Graph types: GraphConfig, Entity, GrassmannGraph, ConceptualPath, Community, Basin |
src/job_types.jl |
FORGE contract types: all JSON-serializable request/response structs |
src/config.jl |
WorkerConfig, load_config() — reads env vars at startup |
src/neighbors.jl |
knn() — brute-force k-nearest neighbors (SIMD inner loop) |
src/tangent.jl |
estimate_tangent_space(), estimate_tangent_spaces() — local PCA via SVD |
src/distance.jl |
principal_angles(), grassmann_distance() — geodesic and chordal variants |
src/ranking.jl |
rank_candidates() — full pipeline: kNN + tangent spaces + ranked distances |
src/entity_distance.jl |
entity_distance() — chunk-averaged Grassmann distance between entities |
src/graph.jl |
build_graph() — assembles GrassmannGraph from raw embeddings |
src/paths.jl |
find_greedy_path(), find_shortest_path(), reachable() |
src/topology.jl |
communities(), basins(), bridges(), hub_centrality(), hub_concentration(), bidirectional_edges() |
src/serialization.jl |
JSON3 struct mappings, parse_job(), serialize_result(), serialize_graph(), deserialize_graph() |
src/app.jl |
process_job() — dispatches build/query modes; entry logic for julia_main() |
src/mqtt.jl |
subscribe_and_process!(), julia_main() — MQTT client loop and worker entry point |
src/gpu.jl |
build_graph_gpu(), select_backend() — GPU-accelerated kNN, tangent space estimation, entity distance matrix |
The worker is a parameterized batch job — one Nomad dispatch per compute request. It does not poll for multiple jobs; it exits after completing one.
On startup, the worker:
- Reads
JOB_IDfrom the environment - Subscribes to
compute/jobs/{job_id}/params(QoS 1) - Waits for a single JSON payload
- Publishes status to
compute/jobs/{job_id}/status - Processes the job (
buildorquerymode) - Publishes the result to
compute/jobs/{job_id}/result - Publishes final status and disconnects
Modes:
build— takesentities(id + embeddings) and config parameters; returns a base64-encoded serializedGrassmannGraphplus entity/chunk counts.query— takes an encoded graph and aQuerySpec; returns path, reachability, or topology results depending onquery.type.
query.type values: greedy_path, shortest_path, reachable, communities, basins, topology.
The graph blob is opaque to FORGE — it is written and read only by this Julia worker. Clients pass it through unchanged between build and query dispatches.
See docs/messaging.md for the full MQTT protocol specification, JSON examples, and topic map.
- CONCEPTS.md — mathematical background: why cosine fails, how tangent spaces and Grassmann distance work, design decisions
- ARCHITECTURE.md — system structure, component inventory, data flows, invariants
- docs/development.md — dev setup, all commands, troubleshooting
- docs/messaging.md — MQTT protocol, JSON contract, topic map
deploy/nomad/grassmann-distance.hcl— Nomad job definition (constraints, resources, Vault policy, Singularity invocation)TODO.md— sysimage build procedure and open research questions