feat(index): parallelize full index build #3

Merged: mohamedameen-io merged 2 commits into main from feat/parallel-index-build on Jun 10, 2026

@mohamedameen-io

Owner

Summary

  • Parallelize the full index build using a forkserver pool and a bulk-load FTS rebuild
  • Add config schema fields to control parallel indexing behavior
  • Update CLI commands (execute, init, plan, resume) to pass parallelism settings

Test Plan

  • New parallel index tests pass (test_state_file_index_parallel.py, 347 lines)
  • CLI init with index tests pass (test_cli_init_with_index.py)
  • All 13 tests passing locally

…d FTS rebuild)

build_full now parses files across a forkserver process Pool (symbol extraction is CPU- and GIL-bound) and feeds the results to a single bulk-load writer. The writer drops the FTS triggers for the duration of the load, inserts in batches via executemany with periodic commits (which bounds the WAL), runs one symbols_fts('rebuild'), and then recreates the triggers. The serial and parallel paths share the same writer, so their output is identical by construction.
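The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the schema, the `symbols_ai` trigger, `parse_file`, `bulk_load`, and the in-memory `(path, text)` inputs are all assumptions made to keep the example self-contained. Only the overall shape (pool of parsers, one writer, dropped triggers, batched `executemany`, a single FTS `'rebuild'`) follows the description.

```python
import multiprocessing as mp
import re
import sqlite3

# Hypothetical schema: an FTS5 index kept in sync with `symbols` by a trigger.
SCHEMA = """
CREATE TABLE IF NOT EXISTS symbols(id INTEGER PRIMARY KEY, path TEXT, name TEXT);
CREATE VIRTUAL TABLE IF NOT EXISTS symbols_fts
    USING fts5(name, content='symbols', content_rowid='id');
"""
TRIGGER = """
CREATE TRIGGER IF NOT EXISTS symbols_ai AFTER INSERT ON symbols BEGIN
    INSERT INTO symbols_fts(rowid, name) VALUES (new.id, new.name);
END;
"""
INSERT_SQL = "INSERT INTO symbols(path, name) VALUES (?, ?)"


def parse_file(item):
    """Stand-in for CPU-bound symbol extraction; returns (path, name) rows."""
    path, text = item
    return [(path, name) for name in re.findall(r"def (\w+)", text)]


def bulk_load(conn, parsed, batch_size=1000):
    """Single writer: consumes parsed rows from either the serial or parallel path."""
    conn.executescript(SCHEMA)
    conn.execute("DROP TRIGGER IF EXISTS symbols_ai")  # no per-row FTS work during load
    batch = []
    for rows in parsed:
        batch.extend(rows)
        if len(batch) >= batch_size:
            conn.executemany(INSERT_SQL, batch)
            conn.commit()  # periodic commit keeps the WAL from growing unbounded
            batch.clear()
    if batch:
        conn.executemany(INSERT_SQL, batch)
    # One rebuild of the FTS index from the content table, then restore the trigger.
    conn.execute("INSERT INTO symbols_fts(symbols_fts) VALUES('rebuild')")
    conn.executescript(TRIGGER)
    conn.commit()


def build_full(conn, items, workers=1, batch_size=1000):
    """Serial and parallel paths differ only in how `parsed` is produced."""
    if workers <= 1:
        bulk_load(conn, map(parse_file, items), batch_size)
        return
    ctx = mp.get_context("forkserver")  # not available on Windows
    with ctx.Pool(workers) as pool:
        # imap preserves input order, so the rows arrive in the same order as serial.
        bulk_load(conn, pool.imap(parse_file, items, chunksize=64), batch_size)
```

Because `imap` yields results in input order, the writer sees the same row sequence either way, which is what makes the "identical by construction" argument work in this sketch. When trying the parallel branch from a script, call `build_full(..., workers=N)` under an `if __name__ == "__main__":` guard, since forkserver workers re-import the main module.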

Adds the index_build_workers and index_build_batch_size config fields and flips the index_huge_repo_async_init default to synchronous. The workers and batch-size settings are threaded through init, execute, plan, and resume. Also adds the missing __main__ 'build-full' entry; without it, the async build path was a silent no-op.
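As a rough picture of how these settings might be modeled, here is a hypothetical sketch. The three field names come from the commit message; the `IndexBuildConfig` class, the default values for workers and batch size, the "0 means use all CPUs" convention, and the `resolved_workers` helper are assumptions for illustration and are not taken from the actual config schema.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexBuildConfig:
    # Assumed convention: 0 means "pick a worker count from the CPU count".
    index_build_workers: int = 0
    # Assumed default; the PR does not state the real value.
    index_build_batch_size: int = 5000
    # Per the commit message, the default is now synchronous (not async).
    index_huge_repo_async_init: bool = False

    def resolved_workers(self) -> int:
        """Return the concrete worker count to hand to the index build."""
        if self.index_build_workers < 0:
            raise ValueError("index_build_workers must be >= 0")
        return self.index_build_workers or (os.cpu_count() or 1)
```

A command such as init or execute could then pass `cfg.resolved_workers()` and `cfg.index_build_batch_size` down to the build call, which is the kind of threading the commit message describes.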

Verified that files, symbols, symbols_fts, and files_fts are byte-identical to the serial path's output on a 358k-file repo; the parallel build is ~132x faster (45s vs ~99min).
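The PR does not say how the comparison was performed. One possible way to check that two index databases hold the same rows is to hash each table in rowid order, as in the sketch below. `table_digest` and `same_tables` are hypothetical helpers, and this compares row contents rather than raw file bytes.

```python
import hashlib
import sqlite3


def table_digest(conn, table):
    """SHA-256 over every row of `table`, visited in rowid order."""
    h = hashlib.sha256()
    # The table name is interpolated, so only pass trusted, known table names.
    for row in conn.execute(f'SELECT rowid, * FROM "{table}" ORDER BY rowid'):
        h.update(repr(row).encode("utf-8"))
    return h.hexdigest()


def same_tables(conn_a, conn_b, tables):
    """Map each table name to True if both databases hold identical rows."""
    return {t: table_digest(conn_a, t) == table_digest(conn_b, t) for t in tables}
```

For example, `same_tables(serial_conn, parallel_conn, ["files", "symbols"])` would return a per-table verdict for two connections opened on the serial and parallel outputs.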
@mohamedameen-io mohamedameen-io merged commit 67c7498 into main Jun 10, 2026
1 check passed
@mohamedameen-io mohamedameen-io deleted the feat/parallel-index-build branch June 10, 2026 20:14