Add simdjson chunked parse + domain rewards optimization#77

Merged: Alann27 merged 3 commits into main from feature/domain-service-summary on Mar 24, 2026

@jorgecuesta

Collaborator

Summary

  • simdjson integration for large block_results parsing: Settlement blocks (1.5GB+, 6-9M events) that previously crashed the indexer now parse in ~40s using a custom C++ addon with chunked JSON.parse
  • Domain-based rewards computation: Added domain rewards tracking and optimized DB batch operations
  • Refactored modToAcctTransfers handling: Improved efficiency of module-to-account transfer processing

simdjson Details

  • Added vendor/simdjson submodule (fork of simdjson_nodejs with Buffer support + findChunkBoundaries)
  • Two-path strategy: <500MB uses V8 JSON.parse, >=500MB uses simdjson chunked parse
  • lazyParse/valueForKeyPath are not used (they segfault on Node 18 with >1GB buffers)
  • Improved HttpClient logging: UTC timestamps, request height, full timing breakdown (headers/body/concat/parse)
  • Production verified on mainnet: block_results@679773 total=199s (headers=77s body=80s concat=2s parse=40s) size=1542.8MB
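The two-path strategy above can be sketched roughly as follows. This is a pure-TypeScript illustration, not the addon's code: `findElementBoundaries` is a hypothetical stand-in for the C++ `findChunkBoundaries` state machine, and `parseArrayChunked`, `parseLargeArray`, and the 64MB chunk size are assumptions made for the example. It also assumes the payload is a top-level JSON array, which is simpler than a real block_results response. Chunking is presumably needed because a buffer of this size cannot be decoded into a single V8 string.

```typescript
// Hypothetical cutoff mirroring the 500MB threshold in the PR description.
const CHUNKED_THRESHOLD_BYTES = 500 * 1024 * 1024;

const isWhitespace = (c: number): boolean =>
  c === 0x20 || c === 0x0a || c === 0x0d || c === 0x09;

/** Returns [start, end) byte offsets of each top-level element of a JSON array. */
function findElementBoundaries(buf: Uint8Array): Array<[number, number]> {
  const bounds: Array<[number, number]> = [];
  let depth = 0;        // nesting depth; 1 means "directly inside the outer array"
  let inString = false;
  let escaped = false;
  let start = -1;       // start offset of the element being scanned, or -1
  for (let i = 0; i < buf.length; i++) {
    const c = buf[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (c === 0x5c) escaped = true;    // backslash escapes the next byte
      else if (c === 0x22) inString = false;  // closing quote
      continue;
    }
    if (c === 0x22) {                         // opening quote
      inString = true;
      if (depth === 1 && start < 0) start = i;
    } else if (c === 0x5b || c === 0x7b) {    // '[' or '{'
      if (depth === 1 && start < 0) start = i;
      depth++;
    } else if (c === 0x5d || c === 0x7d) {    // ']' or '}'
      depth--;
      if (depth === 0 && start >= 0) { bounds.push([start, i]); start = -1; }
    } else if (depth === 1) {
      if (c === 0x2c) {                       // comma ends the current element
        if (start >= 0) { bounds.push([start, i]); start = -1; }
      } else if (start < 0 && !isWhitespace(c)) {
        start = i;                            // first byte of a scalar element
      }
    }
  }
  return bounds;
}

/** Parses a top-level JSON array in chunks of whole elements. */
function parseArrayChunked(buf: Uint8Array, maxChunkBytes: number): unknown[] {
  const decoder = new TextDecoder();
  const bounds = findElementBoundaries(buf);
  const out: unknown[] = [];
  let i = 0;
  while (i < bounds.length) {
    const chunkStart = bounds[i][0];
    let j = i;
    // Extend the chunk while the next element still fits in the byte budget.
    while (j + 1 < bounds.length && bounds[j + 1][1] - chunkStart <= maxChunkBytes) j++;
    const text = "[" + decoder.decode(buf.subarray(chunkStart, bounds[j][1])) + "]";
    for (const item of JSON.parse(text) as unknown[]) out.push(item);
    i = j + 1;
  }
  return out;
}

/** Small payloads take the plain JSON.parse path; large ones are chunked. */
function parseLargeArray(
  buf: Uint8Array,
  threshold: number = CHUNKED_THRESHOLD_BYTES,
  maxChunkBytes: number = 64 * 1024 * 1024,
): unknown[] {
  if (buf.length < threshold) {
    return JSON.parse(new TextDecoder().decode(buf)) as unknown[];
  }
  return parseArrayChunked(buf, maxChunkBytes);
}
```

Because chunks always end on element boundaries, each chunk is valid JSON once wrapped in brackets, and an element larger than the budget is simply parsed on its own.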

Test plan

  • Docker build passes (Node 18 Alpine)
  • simdjson loads and works in container
  • Normal blocks parse correctly (<500MB fast path)
  • Settlement blocks parse correctly (>=500MB chunked path)
  • Data equivalence verified: all 6.7M events match original parser output
  • Deployed to mainnet and beta — running stable
  • Benchmarks in benchmarks/json-parse/

Alann27 and others added 3 commits March 17, 2026 15:06
- Introduced `domain_service_daily_rewards` summary table and refresh function to process daily relay and reward aggregates by domain and service.
- Added functions to fetch supplier and reward statistics grouped by domain.
- Implemented performance indexes to optimize query execution for claims, blocks, and supplier service configurations.
- Updated GraphQL schema to include extracted domains and remove unused Relay entity and references.
- Extended block processing to compute and store domain-based reward summaries.
- Introduced `createModToAcctTransfersTableFn` to define and manage the `mod_to_acct_transfers` table outside of SubQuery, removing auto-generated GiST indexes and enabling selective index addition.
- Added a `rawBulkInsert` utility for optimized batched inserts into non-SubQuery-managed tables.
- Replaced GraphQL `ModToAcctTransfer` entity with direct SQL bulk insert.
- Refactored `_handleEventClaimSettled` to use `bulkInsertModToAcctTransfers` for improved idempotency and batch handling.
- Updated associated imports, dependencies, and schema files for consistency.
- Add vendor/simdjson submodule (fork: jorgecuesta/simdjson_nodejs, branch: buffer-support)
  with Buffer support and findChunkBoundaries C++ state machine
- Add simdjson to vendor-config.yaml and root package.json
- Update subql-cosmos with simdjson-based HttpClient.ts
- Add benchmarks for JSON parse strategies (benchmarks/json-parse/)
- Production verified: 40s parse for 1.5GB/6.7M events (stream-json was crashing)
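For the `rawBulkInsert` utility mentioned in the commit notes above, a batched insert could look something like the sketch below. It is not the PR's implementation: `buildBulkInserts`, the `InsertStatement` shape, the default batch size, and the use of `ON CONFLICT DO NOTHING` for idempotency are all assumptions, and it targets PostgreSQL-style `$n` placeholders. It only builds the statements; executing them is left to whatever database client the project uses.

```typescript
// Hypothetical shape: parameterized SQL text plus its bound values.
interface InsertStatement {
  text: string;
  values: unknown[];
}

/**
 * Splits rows into batches and builds one multi-row INSERT per batch.
 * PostgreSQL allows at most 65535 bind parameters per statement, so
 * batchSize * columns.length should stay below that limit.
 */
function buildBulkInserts(
  table: string,
  columns: string[],
  rows: unknown[][],
  batchSize: number = 1000,
): InsertStatement[] {
  // Quote identifiers so table and column names cannot break the statement.
  const quote = (id: string): string => '"' + id.replace(/"/g, '""') + '"';
  const columnList = columns.map(quote).join(", ");
  const statements: InsertStatement[] = [];

  for (let offset = 0; offset < rows.length; offset += batchSize) {
    const batch = rows.slice(offset, offset + batchSize);
    const values: unknown[] = [];
    const tuples = batch.map((row) => {
      if (row.length !== columns.length) {
        throw new Error("row length does not match column count");
      }
      // Placeholders are numbered continuously across the whole batch.
      const placeholders = row.map((value) => {
        values.push(value);
        return "$" + values.length;
      });
      return "(" + placeholders.join(", ") + ")";
    });
    statements.push({
      // Assumption: re-processing a block should not fail on duplicate rows.
      text:
        `INSERT INTO ${quote(table)} (${columnList}) ` +
        `VALUES ${tuples.join(", ")} ON CONFLICT DO NOTHING`,
      values,
    });
  }
  return statements;
}
```

Each returned statement can be run inside one transaction, so a block's transfers are written in a handful of round trips instead of one per row.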
@jorgecuesta jorgecuesta requested a review from Alann27 March 19, 2026 23:07
@jorgecuesta jorgecuesta self-assigned this Mar 19, 2026
@jorgecuesta jorgecuesta added the bug and enhancement labels Mar 19, 2026
@Alann27 Alann27 merged commit 0ab1826 into main Mar 24, 2026
3 checks passed
@Alann27 Alann27 deleted the feature/domain-service-summary branch March 24, 2026 22:41