GitHub - ion-elgreco/rivers: Rivers is an orchestration platform for data and ML pipelines, written in Rust for native performance with a Python-first development experience.

Orchestration platform for tasks and assets, fully backed by Rust.

rivers is a Rust-powered orchestration platform built around data assets. Define pipelines in Python; rivers resolves the graph, plans execution - no Python interpreter on the control plane.

Documentation · Issues · Discussions

Key features

Asset-based orchestration — define data assets as Python functions; rivers resolves the dependency graph automatically.
Rust core — graph resolution, execution planning, partition logic, and the scheduler all run in compiled Rust.
Multiple asset types — single, multi-output, graph (composing Tasks into sub-DAGs), and external assets.
Partitioning — static, time-window (daily/hourly/custom cron), multi-dimensional, and runtime-extensible dynamic partitions.
Pluggable IO — built-in handlers for in-memory, pickle (any object store), and Delta Lake with merge support.
Parallel & distributed execution — Executor.parallel() for concurrent subprocess workers, Executor.kubernetes() for one-pod-per-step on K8s.
Schedules, sensors, and automation conditions — declarative triggers (cron, event-driven, dep-aware) executed by the rivers daemon.
Backfills — partition-range execution with multi-run, single-run, and per-dimension strategies.
Persistent storage — embedded SurrealDB + RocksDB for local dev, SurrealDB server for production.
Concurrency control — run-queue limits, tag concurrency, and step-level concurrency pools.
Single-binary dev experience — rivers dev <module> boots SurrealDB (embedded RocksDB), the scheduler, and the web UI on :3000 in one process.

Performance

Hot paths run in compiled Rust: graph resolution, partition mapping, execution planning, the scheduler. Python is the API surface only. Plan times stay sub-millisecond on graphs with thousands of nodes. The UI is Rust too — Leptos SSR + WASM on axum, state read straight from SurrealDB and pushed to the browser via Server-Sent Events.

Kubernetes-native

rivers ships with a Kubernetes operator and CRDs. Declare a repo as a CodeLocation:

apiVersion: rivers.io/v1alpha1
kind: CodeLocation
metadata:
  name: analytics
spec:
  image: ghcr.io/acme/pipelines
  tag: v0.2.0
  module: pipelines.analytics

The operator resolves the image to a digest, reconciles a Deployment + Service running rivers serve, registers it with the UI's discovery registry, and re-polls the registry to keep the digest fresh. Multi-arch images (linux/amd64, linux/arm64) and Helm charts are published to ghcr.io on every release with SLSA build-provenance attestations.

See the installation guide for the full setup — helm install commands, common values, and an architecture overview with the reconciliation and run sequence diagrams.

Install

pip install rivers

Optional extras for IO handlers:

pip install rivers[delta]     # Delta Lake support
pip install rivers[pyarrow]   # PyArrow table support
pip install rivers[polars]    # Polars DataFrame support

Quick example

import rivers as rs

@rs.Asset
def raw_data():
    return {"users": 100, "events": 5000}

@rs.Asset
def summary(raw_data: dict):
    return f"{raw_data['users']} users, {raw_data['events']} events"

repo = rs.CodeRepository(assets=[raw_data, summary])
result = repo.materialize()

print(repo.load_node("summary"))  # "100 users, 5000 events"

See the Getting Started guide for partitioning, jobs, IO handlers, and the K8s executor.

Contributing

Contributions are welcome. See CONTRIBUTING.md for development setup (just develop, just test, just pre-commit), code conventions, and the test matrix. The docs/ directory hosts both the user-facing guides and architectural notes for contributors.

Name		Name	Last commit message	Last commit date
Latest commit History 32 Commits
.cargo		.cargo
.github		.github
assets		assets
deploy		deploy
dev/k3d		dev/k3d
docs		docs
examples		examples
proto		proto
python		python
rust		rust
.dockerignore		.dockerignore
.gitignore		.gitignore
CLA.md		CLA.md
CONTRIBUTING.md		CONTRIBUTING.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md
_typos.toml		_typos.toml
justfile		justfile
mkdocs.yml		mkdocs.yml
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Key features

Performance

Kubernetes-native

Install

Quick example

Contributing

About

Uh oh!

Releases 3

Sponsor this project

Uh oh!

Packages

Uh oh!

Contributors

Uh oh!

Languages

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

Key features

Performance

Kubernetes-native

Install

Quick example

Contributing

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 3

Sponsor this project

Uh oh!

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages