Try to reproduce the following performance on 26.02 and see if problems disappear on latest branches/PRs
Vector Dimensions: 64
Vector Similarity: DOT_PRODUCT (per the code comment, vectors are assumed to be pre-normalized)
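Why the pre-normalization assumption matters: for unit-length vectors the dot product equals cosine similarity, so DOT_PRODUCT only gives cosine-style rankings if the input really is normalized. The following is a minimal JDK-only sketch of that check, not code from StandaloneIndexer; the class and helper names are hypothetical.

```java
import java.util.Random;

// Hypothetical helper, not part of the benchmark code.
class DotProductVsCosine {
    static final int DIMS = 64; // vector dimension from the settings above

    /** Scales v to unit L2 length (assumes v is not the zero vector). */
    static float[] normalize(float[] v) {
        double sumSq = 0;
        for (float x : v) sumSq += (double) x * x;
        float inv = (float) (1.0 / Math.sqrt(sumSq));
        float[] out = new float[v.length];
        for (int i = 0; i < v.length; i++) out[i] = v[i] * inv;
        return out;
    }

    static double dot(float[] a, float[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += (double) a[i] * b[i];
        return s;
    }

    static double cosine(float[] a, float[] b) {
        return dot(a, b) / Math.sqrt(dot(a, a) * dot(b, b));
    }

    static float[] randomVector(Random rnd) {
        float[] v = new float[DIMS];
        for (int i = 0; i < DIMS; i++) v[i] = (float) rnd.nextGaussian();
        return v;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        float[] a = normalize(randomVector(rnd));
        float[] b = normalize(randomVector(rnd));
        // After normalization the two values should agree up to float rounding.
        System.out.printf("dot=%.6f cosine=%.6f%n", dot(a, b), cosine(a, b));
    }
}
```

If the dataset turns out not to be normalized, the two values diverge and recall comparisons between runs may not be meaningful.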
GPU (cuVS CAGRA) Settings — from StandaloneIndexer.createIndexWriterConfig():
- CagraGraphBuildAlgo: IVF_PQ (enum value only; no sub-parameters such as nlist, nprobe, or PQ bits are set, so cuVS defaults apply)
- graphDegree: 32 (default is 64)
- intermediateGraphDegree: 48 (default is 128)
- maxConn (HNSW M): 16
- beamWidth: 100
- numMergeWorkers: 16
- Merge executor: a fixed thread pool of 16
CPU (Lucene HNSW) Settings:
- Lucene99HnswVectorsFormat(maxConn=16, beamWidth=100, mergeWorkers=16, mergePool)
- Codec: Lucene101Codec with BEST_SPEED mode
IndexWriter Config (shared):
- RAMBufferSizeMB: 30 GB (30 * 1024 MB)
- maxBufferedDocs: disabled (auto-flush off)
- useCompoundFile: false
- openMode: CREATE
Flush / Commit Behavior:
- GPU mode: commits are skipped during loading (!useGpu guard on line ~180 of StandaloneIndexer). All docs accumulate in RAM, then close() triggers a single commit() → forceMerge(4) → commit().
- CPU mode: commits every 200,000 docs (COMMIT_BATCH_SIZE = 200000)
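To make the difference in commit cadence concrete, here is a small JDK-only sketch that counts how many commits happen while documents are being loaded in each mode. It only models the guard described above; the class name, method name, and loop shape are assumptions, and the final commit() → forceMerge(4) → commit() sequence in close() is not modeled.

```java
// Hypothetical model of the commit guard, not the actual StandaloneIndexer code.
class CommitCadence {
    static final int COMMIT_BATCH_SIZE = 200_000; // from the CPU-mode setting above

    /** Number of commits issued during loading, before close() is called. */
    static long commitsDuringLoad(long totalDocs, boolean useGpu) {
        long commits = 0;
        for (long indexed = 1; indexed <= totalDocs; indexed++) {
            // Assumed shape of the guard: only the CPU path commits mid-load.
            if (!useGpu && indexed % COMMIT_BATCH_SIZE == 0) {
                commits++;
            }
        }
        return commits;
    }

    public static void main(String[] args) {
        long docs = 1_000_000L; // illustrative document count, not the benchmark's
        System.out.println("CPU commits during load: " + commitsDuringLoad(docs, false));
        System.out.println("GPU commits during load: " + commitsDuringLoad(docs, true));
    }
}
```

The point for the comparison is that the CPU run produces segments incrementally, while the GPU run holds everything in the RAM buffer until close(), so the two modes are not timing the same flush and merge pattern.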
Indexing Threads:
- GPU: 2 threads (code says useGpu ? 2 : MERGE_WORKERS), though the Quip doc notes 1 thread was optimal and 2 was slower
- CPU: 16 threads
Queue size: 150,000 documents
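The thread-count choice and the bounded queue can be sketched with plain java.util.concurrent types. This is an illustration under assumptions, not the StandaloneIndexer implementation: the class name, the poison-pill shutdown, and the use of integers as stand-ins for documents are all hypothetical; only the `useGpu ? 2 : MERGE_WORKERS` expression and the 150,000 queue size come from the notes above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical pipeline sketch; integers stand in for documents.
class IndexingPipelineSketch {
    static final int MERGE_WORKERS = 16;
    static final int QUEUE_SIZE = 150_000;
    static final int POISON = -1; // assumed shutdown marker, one per worker

    static int indexingThreads(boolean useGpu) {
        return useGpu ? 2 : MERGE_WORKERS;
    }

    /** Pushes totalDocs placeholders through a bounded queue; returns how many were consumed. */
    static long run(boolean useGpu, int totalDocs) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(QUEUE_SIZE);
        int threads = indexingThreads(useGpu);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong consumed = new AtomicLong();
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        int doc = queue.take();
                        if (doc == POISON) return;
                        consumed.incrementAndGet(); // stand-in for adding the doc to the index
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        try {
            // put() blocks when the queue is full, which throttles the producer.
            for (int i = 0; i < totalDocs; i++) queue.put(i);
            for (int t = 0; t < threads; t++) queue.put(POISON);
            pool.shutdown();
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return consumed.get();
    }

    public static void main(String[] args) {
        System.out.println("GPU threads: " + indexingThreads(true));
        System.out.println("CPU threads: " + indexingThreads(false));
        System.out.println("consumed: " + run(true, 10_000));
    }
}
```

Since the Quip doc reports that 1 GPU indexing thread was faster than 2, making the thread count a parameter rather than a hard-coded ternary would make that comparison easier to rerun.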
IVF_PQ sub-parameters: The code uses IVF_PQ as the CagraGraphBuildAlgo enum value but does not set any IVF_PQ-specific parameters (like nlist, nprobe, PQ sub-quantizers, PQ bits). These all use cuVS defaults. The AcceleratedHNSWParams.Builder only exposes withCagraGraphBuildAlgo, withGraphDegree, withIntermediateGraphDegree, withMaxConn, withBeamWidth, withNumMergeWorkers, and withMergeExecutorService.
My hypothesis is that most of these issues are resolved by using better IVF_PQ settings and switching to the latest main branches of cuvs and cuvs-lucene (with my cuvs-lucene PRs #141 and #140). There are also some timing issues that we need to fix in the SearchScale benchmarking repo (SearchScale/vectorsearch-benchmarks#14, SearchScale/vectorsearch-benchmarks#16). I will verify this (#143) or continue investigating after resolving SearchScale/vectorsearch-benchmarks#17.