Replayable, high-throughput blockchain indexing pipeline. Ingests blocks (mock or real chain), transforms them into domain tables, and writes to ClickHouse with checkpointing and replay support.

- Source (`src/source/`): Block ingestion. Default is a deterministic mock generator for local dev; optional RPC adapter for real chains.
- Processor (`src/processor/`): Parse → normalize → enrich. Deterministic transforms and stable sorting.
- Sink (`src/sink/`): Batch writes to ClickHouse with retries and backpressure; idempotent upserts via ReplacingMergeTree.
- Checkpoint (`src/checkpoint/`): Last processed block height and timestamp for `--resume` and `--replay`.
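
The four stages above can be pictured as a producer and a consumer joined by a bounded channel. The following is a minimal sketch using only the standard library, not the project's actual code: `Block`, `mock_block`, `process`, and `run_pipeline` are hypothetical names, the in-memory `Vec` stands in for the ClickHouse sink, and the bounded channel is only assumed to play the role of `channel_cap`.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical, simplified block type; the real types live in `src/` and are not shown here.
#[derive(Debug, Clone, PartialEq)]
pub struct Block {
    pub height: u64,
    pub hash: String,
    pub txs: Vec<String>,
}

/// Source stage: a deterministic mock generator (same height, same block).
pub fn mock_block(height: u64) -> Block {
    // Emit transactions in reverse order so the processor has something to sort.
    let txs = (0..3).rev().map(|i| format!("tx-{}-{}", height, i)).collect();
    Block { height, hash: format!("hash-{:08}", height), txs }
}

/// Processor stage: a deterministic transform with a stable row order.
pub fn process(mut block: Block) -> Block {
    block.txs.sort();
    block
}

/// Runs source -> processor -> sink over `from..=to`.
/// Returns the rows "written" and the last checkpointed height.
pub fn run_pipeline(from: u64, to: u64, channel_cap: usize) -> (Vec<String>, Option<u64>) {
    // A bounded channel: `send` blocks while the channel is full,
    // which is how the source feels backpressure from a slow sink.
    let (tx, rx) = sync_channel::<Block>(channel_cap);
    let source = thread::spawn(move || {
        for height in from..=to {
            if tx.send(mock_block(height)).is_err() {
                break; // receiver went away; stop producing
            }
        }
    });

    let mut sink: Vec<String> = Vec::new(); // stand-in for ClickHouse
    let mut checkpoint = None;
    for block in rx {
        let block = process(block);
        sink.extend(block.txs.iter().cloned());
        // Advance the checkpoint only after the rows are written.
        checkpoint = Some(block.height);
    }
    source.join().expect("source thread panicked");
    (sink, checkpoint)
}
```

Calling `run_pipeline(1, 50, 10_000)` mirrors the ingest range used in the commands further down. Because the generator and transform are deterministic, two runs over the same range produce identical rows regardless of the channel capacity.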

| Table | Key fields | Engine |
|---|---|---|
| `blocks` | height, hash, ts | MergeTree |
| `transactions` | tx_hash, block_height, from, to, value, fee | ReplacingMergeTree |
| `transfers` | tx_hash, from, to, asset, amount | ReplacingMergeTree |
| `contracts` | address, creator, created_height | ReplacingMergeTree |
| `events` | tx_hash, event_index, kind, data_json | ReplacingMergeTree |
| `indexwave_checkpoint` | last_block_height, last_committed_ts | ReplacingMergeTree |

- Start ClickHouse: `make up`
- Ingest a range (creates DB and tables if missing): `cargo run -- ingest --config config.toml --from-height 1 --to-height 50`
- Stats: `cargo run -- stats --config config.toml`
- Replay a range (re-upsert): `cargo run -- replay --config config.toml --from 1 --to 50`
- Resume from checkpoint: `cargo run -- ingest --config config.toml --resume`
- Doctor (config + DB check): `cargo run -- doctor --config config.toml`

- Checkpoint: Stored in `indexwave_checkpoint` (latest row by `last_block_height`). Updated after each successful batch.
- `--resume`: Loads the checkpoint and continues from `last_block_height + 1` (respects `--from-height`/`--to-height` if set).
- `--replay --from N --to M`: Reprocesses blocks N–M and re-inserts them into ClickHouse. ReplacingMergeTree deduplicates by (block_height, tx_hash, event_index), so replays are idempotent.
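
The resume rule above can be sketched as a small pure function. This is an illustration, not the project's implementation: the `Checkpoint` struct and both function names are hypothetical, and two behaviours are assumptions, namely that an explicit `--from-height` can only move the start forward and that a run with no checkpoint and no flag starts at height 1.

```rust
/// Hypothetical row type mirroring the `indexwave_checkpoint` columns.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Checkpoint {
    pub last_block_height: u64,
    pub last_committed_ts: u64,
}

/// Picks the latest checkpoint: the row with the greatest `last_block_height`.
pub fn latest(rows: &[Checkpoint]) -> Option<Checkpoint> {
    rows.iter().copied().max_by_key(|c| c.last_block_height)
}

/// First height to process on `--resume`.
pub fn resume_start(checkpoint: Option<Checkpoint>, from_height: Option<u64>) -> u64 {
    let next = checkpoint.map(|c| c.last_block_height + 1);
    match (next, from_height) {
        // Assumption: never go back before the checkpoint on resume.
        (Some(n), Some(f)) => n.max(f),
        (Some(n), None) => n,
        (None, Some(f)) => f,
        // Assumption: with nothing to go on, start at height 1.
        (None, None) => 1,
    }
}
```

With a checkpoint at height 50 and no flags, `resume_start` yields 51, matching the `last_block_height + 1` rule. Reprocessing heights at or below the checkpoint is what `--replay` is for.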

config.toml:

    [database]
    url = "http://localhost:18123"
    database = "indexwave"
    batch_size = 1000
    max_retries = 3
    [ingest]
    channel_cap = 10000
    [metrics]
    port = 9090

- `make up` – start ClickHouse (and wait for health)
- `make down` – stop services
- `make lint` – `cargo fmt --check` and `cargo clippy --all-targets -- -D warnings`
- `make test` – `cargo test`
- `make run` – up + ingest 1–50
- `make e2e` – up + ingest 1–30 + stats + replay 1–30 + stats
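
As an illustration of how the `batch_size` and `max_retries` settings from `config.toml` might interact in the sink, here is a minimal sketch. The function `write_batches` is hypothetical, the `write` closure stands in for a ClickHouse insert, and it is an assumption that `max_retries` counts retries after the first attempt. A real sink would probably also wait between attempts; that is left out to keep the sketch deterministic.

```rust
/// Splits `rows` into batches of at most `batch_size` and writes each one,
/// retrying a failed batch up to `max_retries` times before giving up.
/// Returns the number of batches written, or the last error.
pub fn write_batches<T, E>(
    rows: &[T],
    batch_size: usize,
    max_retries: u32,
    mut write: impl FnMut(&[T]) -> Result<(), E>,
) -> Result<usize, E> {
    let mut batches_written = 0;
    // `chunks` panics on a size of 0, so clamp to at least 1.
    for batch in rows.chunks(batch_size.max(1)) {
        let mut attempt = 0;
        loop {
            match write(batch) {
                Ok(()) => break,
                // Out of retries: surface the error to the caller.
                Err(e) if attempt >= max_retries => return Err(e),
                // Otherwise try the same batch again.
                Err(_) => attempt += 1,
            }
        }
        batches_written += 1;
    }
    Ok(batches_written)
}
```

With `batch_size = 1000`, 2,500 rows become three batches. Because a retried batch is re-sent whole, this approach relies on the sink being idempotent, which is the role ReplacingMergeTree plays here.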

- DB connection refused: Ensure ClickHouse is up (`make up`, then `curl -s http://localhost:8123/ping`).
- Port 8123 in use: Change `database.url` in config and the `ports` in `docker-compose.yml`.
- Database / table errors: Run `doctor`; `ensure_schema` creates the database and tables on first ingest.
- Replay row count: ReplacingMergeTree merges in the background; an immediate `count()` may show duplicates until the merge runs. Use `FINAL` or wait for the background merge for exact counts.
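
The replay row-count note can be illustrated with a small in-memory model. This is a sketch of the deduplication idea only, not of ClickHouse itself: `EventRow`, `event`, `raw_count`, and `final_rows` are hypothetical names, and it assumes the table's sorting key is the (block_height, tx_hash, event_index) tuple mentioned earlier, with the last inserted row winning.

```rust
use std::collections::BTreeMap;

/// Assumed dedup key: (block_height, tx_hash, event_index).
pub type Key = (u64, String, u32);

#[derive(Debug, Clone, PartialEq)]
pub struct EventRow {
    pub key: Key,
    pub data_json: String,
}

/// Hypothetical helper that builds one deterministic event row.
pub fn event(height: u64, index: u32) -> EventRow {
    EventRow {
        key: (height, format!("tx-{}", height), index),
        data_json: format!("{{\"height\":{}}}", height),
    }
}

/// Unmerged view: every inserted row is counted, as a plain `count()`
/// might report right after a replay.
pub fn raw_count(parts: &[EventRow]) -> usize {
    parts.len()
}

/// Merged view: one row per key, last insert wins, comparable to reading
/// with `FINAL` or after the background merge has run.
pub fn final_rows(parts: &[EventRow]) -> Vec<EventRow> {
    let mut by_key: BTreeMap<Key, EventRow> = BTreeMap::new();
    for row in parts {
        by_key.insert(row.key.clone(), row.clone());
    }
    by_key.into_values().collect()
}
```

Ingesting three rows and then replaying the same three gives a raw count of 6 but only 3 merged rows, identical to the rows from the first ingest. That is the sense in which a replay is idempotent even while the unmerged count is temporarily inflated.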
MIT. See LICENSE.