Skip to content

Rebase 32x-default-compression on main#3371

Merged
naveentatikonda merged 19 commits into
feature/32x-default-compressionfrom
main
Jun 18, 2026
Merged

Rebase 32x-default-compression on main#3371
naveentatikonda merged 19 commits into
feature/32x-default-compressionfrom
main

Conversation

@kotwanikunal

Copy link
Copy Markdown
Member

Description

  • Rebases 32x-default-compression feature branch on main

Related Issues

  • None

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

kotwanikunal and others added 19 commits May 20, 2026 11:30
Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>
Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>
…fields (#3324)

Signed-off-by: Navneet Verma <navneev@amazon.com>
Signed-off-by: shreyah963 <shreyab963@gmail.com>
Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>
Co-authored-by: Tejas Shah <shatejas@amazon.com>
Signed-off-by: Sayali Gaikawad <gaiksaya@amazon.com>
Signed-off-by: opensearch-ci-bot <opensearch-infra@amazon.com>
* Fix derived source for mixed-case vector fields

Signed-off-by: Wonjae Lee <wonjae.lee@dremio.com>

* Add BWC coverage for derived source field casing

Signed-off-by: Wonjae Lee <wonjae.lee@dremio.com>

* Add changelog entry for mixed-case derived source fix

Signed-off-by: Wonjae Lee <wonjae.lee@dremio.com>

* Handle case-insensitive conflicts by preferring vector field

Signed-off-by: Wonjae Lee <wonjae.lee@dremio.com>

* Avoid stream wrappers for derived field lookup

Signed-off-by: Wonjae Lee <wonjae.lee@dremio.com>

* Handle ambiguous case-insensitive matches without vector hints

Signed-off-by: Wonjae Lee <wonjae.lee@dremio.com>

* Update src/main/java/org/opensearch/knn/index/codec/KNN10010Codec/KNN10010DerivedSourceStoredFieldsFormat.java

Co-authored-by: Tejas Shah <shatejas@amazon.com>
Signed-off-by: Wonjae Lee <38933452+leewjae@users.noreply.github.com>

* Apply spotless formatting for derived source field resolution

Signed-off-by: Wonjae Lee <wonjae.lee@dremio.com>

* Avoid guessing when case-insensitive matches lack vector hints

Signed-off-by: Wonjae Lee <wonjae.lee@dremio.com>

* Simplify case-insensitive derived field matching

Signed-off-by: Wonjae Lee <wonjae.lee@dremio.com>

* Trigger CI rerun for BWC investigation

Signed-off-by: Wonjae Lee <wonjae.lee@dremio.com>

* Add native engine field info coverage

Signed-off-by: Wonjae Lee <wonjae.lee@dremio.com>

---------

Signed-off-by: Wonjae Lee <wonjae.lee@dremio.com>
Signed-off-by: Wonjae Lee <38933452+leewjae@users.noreply.github.com>
Signed-off-by: Tejas Shah <shatejas@amazon.com>
Co-authored-by: Tejas Shah <shatejas@amazon.com>
Co-authored-by: Navneet Verma <navneev@amazon.com>
* Fixes RescoreParser to pass the rescore flag

For multinode or coordinator-data node setup, rescore set to false is
not passed through streams. This causes rescoring to execute even when
its not disabled explicitly by user

Signed-off-by: Tejas Shah <shatejas@amazon.com>

* Updates Changelogs, improves code cov

Signed-off-by: Tejas Shah <shatejas@amazon.com>

* Makes the coordinator port dynamic

Signed-off-by: Tejas Shah <shatejas@amazon.com>

* Adds BWC test for mode and compression

Signed-off-by: Tejas Shah <shatejas@amazon.com>

* Does not create compressed indices before 2.18

Signed-off-by: Tejas Shah <shatejas@amazon.com>

* Fixes bwc

Signed-off-by: Tejas Shah <shatejas@amazon.com>

---------

Signed-off-by: Tejas Shah <shatejas@amazon.com>
* Rescoring after radial search on quantized index. [Task 1 - 4] (#3300)

* Bumped gradle to 9.4.1 and jacoco to 0.8.14 (#3308)

Signed-off-by: Andrew Klepchick <aklepchi@amazon.com>

* Use KNN1040ScalarQuantizedVectorsFormat for Faiss SQ flat format (#3302)

The Faiss SQ format was using Lucene's Lucene104ScalarQuantizedVectorsFormat
directly, which lacks the prefetch-enabled raw vector reader that
KNN1040ScalarQuantizedVectorsFormat provides. This meant exact search
rescoring was missing I/O prefetch during graph traversal.

Changes:
- Switch faissSqFlatFormat from Lucene104ScalarQuantizedVectorsFormat to
  KNN1040ScalarQuantizedVectorsFormat in Faiss1040ScalarQuantizedKnnVectorsFormat
- Add @VisibleForTesting getFlatVectorsReader() to
  Faiss1040ScalarQuantizedKnnVectorsReader to replace reflection in tests
- Add testGetRandomVectorScorer_returnsPrefetchableScorer in
  KNN1040ScalarQuantizedVectorsFormatTests verifying the scorer is
  PrefetchableRandomVectorScorer via a real write/read cycle
- Replace reflection with getter in
  Faiss1040ScalarQuantizedKnnVectorsFormatTests.testFieldsReader_thenWrapsFlatReaderWithPrefetchSupport

Signed-off-by: Vijayan Balasubramanian <balasvij@amazon.com>

* Allow minScore, maxDistance for 32x SQ index.

Signed-off-by: Dooyong Kim <kdooyong@amazon.com>

Pass compression and quantization config to RNN query builder.

Signed-off-by: Dooyong Kim <kdooyong@amazon.com>

Added RescoreRadialSearchQuery.

Signed-off-by: Dooyong Kim <kdooyong@amazon.com>

Wiring `RescoreRadialSearchQuery` wrapper in `RNNQueryFactory`

Signed-off-by: Dooyong Kim <kdooyong@amazon.com>

---------

Signed-off-by: Andrew Klepchick <aklepchi@amazon.com>
Signed-off-by: Vijayan Balasubramanian <balasvij@amazon.com>
Signed-off-by: Dooyong Kim <kdooyong@amazon.com>
Co-authored-by: Andrew Klepchick <aklepchi@amazon.com>
Co-authored-by: Vijayan Balasubramanian <balasvij@amazon.com>

* Rescore radial search quantized complete (#3337)

* Added exact search logic after radial.

Signed-off-by: Dooyong Kim <kdooyong@amazon.com>

* Adding 2nd rescoring after radial search on quantized index.

Signed-off-by: Dooyong Kim <kdooyong@amazon.com>

---------

Signed-off-by: Dooyong Kim <kdooyong@amazon.com>

* Update changelog

Signed-off-by: Dooyong Kim <kdooyong@amazon.com>

---------

Signed-off-by: Andrew Klepchick <aklepchi@amazon.com>
Signed-off-by: Vijayan Balasubramanian <balasvij@amazon.com>
Signed-off-by: Dooyong Kim <kdooyong@amazon.com>
Co-authored-by: Andrew Klepchick <aklepchi@amazon.com>
Co-authored-by: Vijayan Balasubramanian <balasvij@amazon.com>
Signed-off-by: Navneet Verma <navneev@amazon.com>
Signed-off-by: Divya Madala <divyaasm@amazon.com>
Co-authored-by: Tejas Shah <shatejas@amazon.com>
Signed-off-by: Andrew Klepchick <aklepchi@amazon.com>
Vectors can now be indexed as base64-encoded strings in addition to JSON
arrays. Float vectors use little-endian byte encoding (symmetric with
the doc_values binary output format), while byte/binary vectors use raw
byte encoding. This enables efficient bulk ingestion pipelines that
avoid JSON array serialization overhead.

Signed-off-by: Navneet Verma <navneev@amazon.com>
…NotSupportedException. (#3344)

Signed-off-by: Dooyong Kim <kdooyong@amazon.com>
Signed-off-by: Doo Yong Kim <kdooyong@amazon.com>
…3367)

Signed-off-by: Navneet Verma <navneev@amazon.com>
@github-actions

Copy link
Copy Markdown

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 80c40d9.

'Diff too large, requires skip by maintainers after manual review'


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@kotwanikunal kotwanikunal added the skip-diff-analyzer Maintainer to skip code-diff-analyzer check, after reviewing issues in AI analysis. label Jun 17, 2026
@github-actions

Copy link
Copy Markdown

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

Possible Issue

In testConstructor_whenExactSearcherNotInitialized_thenThrows, the test temporarily sets the singleton to null and then restores it in a finally block. If the test fails before reaching the finally block (e.g., due to an assertion failure), the singleton remains null, breaking all subsequent tests in the suite. This is a test isolation issue that can cause cascading failures.

// Temporarily set singleton to null
RescoreRadialSearchQuery.initialize(null);
try {
    NullPointerException e = expectThrows(
        NullPointerException.class,
        () -> new RescoreRadialSearchQuery(
            new MatchAllDocsQuery(),
            FIELD_NAME,
            QUERY_VECTOR,
            RADIUS,
            false,
            MAX_RESULTS_RADIAL_RESCORING
        )
    );
    assertTrue(e.getMessage().contains("Exact searcher was not initialized"));
} finally {
    // Restore for other tests
    RescoreRadialSearchQuery.initialize(new ExactSearcher(mock(ModelDao.OpenSearchKNNModelDao.class)));
}
Possible Issue

In getMergeTopN, when expandNestedDocs is true OR (knnQuery.isMemoryOptimizedSearch() && effectiveK == k), the method returns the sum of all doc counts. However, the condition logic appears inverted for the memory-optimized case. If effectiveK > k (ef_search expansion occurred), we want to trim to k, so we should return k. If effectiveK == k (no expansion), we want to preserve all results, so we should return sum. The current condition returns sum when effectiveK == k, which is correct, but the comment on line 234 says 'For MOS with ef_search expansion: returns k to trim excess results', which contradicts the code. Either the comment or the condition is wrong.

private int getMergeTopN(TopDocs[] topDocs, int k, int effectiveK) {
    if (expandNestedDocs || (knnQuery.isMemoryOptimizedSearch() && effectiveK == k)) {
        int sum = 0;
        for (TopDocs topDoc : topDocs) {
            sum += topDoc.scoreDocs.length;
        }
        return sum;
    }
Possible Issue

In testDoToQuery_whenRadialSearchOnDiskMode_thenException, the test now expects radial search on BQ (binary quantization) to throw an exception, but the test setup creates a mapping config with QuantizationConfig.builder().quantizationType(ScalarQuantizationType.ONE_BIT), which is BQ. However, the test name and old code suggest this was originally testing 'disk mode' (on_disk), not BQ specifically. After the PR changes, the guard logic now distinguishes between BQ (blocked) and 32x SQ (allowed). The test should be renamed to testDoToQuery_whenRadialSearchOnBinaryQuantization_thenException to reflect what it actually tests, or the test should be updated to test the original 'on_disk' scenario if that was the intent.

public void testDoToQuery_whenRadialSearchOnDiskMode_thenException() {
    // Given: a radial search query on a BQ quantized index (QuantizationConfig != EMPTY)
    float[] queryVector = { 1.0f };
    KNNQueryBuilder knnQueryBuilder = KNNQueryBuilder.builder()
        .fieldName(FIELD_NAME)
        .vector(queryVector)
        .maxDistance(MAX_DISTANCE)
        .build();

    Index dummyIndex = new Index("dummy", "dummy");
    QueryShardContext mockQueryShardContext = mock(QueryShardContext.class);
    KNNVectorFieldType mockKNNVectorField = mock(KNNVectorFieldType.class);
    when(mockQueryShardContext.index()).thenReturn(dummyIndex);
    when(mockQueryShardContext.fieldMapper(anyString())).thenReturn(mockKNNVectorField);

    MethodComponentContext methodComponentContext = new MethodComponentContext(
        org.opensearch.knn.common.KNNConstants.METHOD_HNSW,
        ImmutableMap.of()
    );
    KNNMethodContext knnMethodContext = new KNNMethodContext(KNNEngine.FAISS, SpaceType.L2, methodComponentContext);
    KNNMappingConfig bqMappingConfig = new KNNMappingConfig() {
        @Override
        public Optional<KNNMethodContext> getKnnMethodContext() {
            return Optional.of(knnMethodContext);
Possible Issue

In testBase64Indexing_invalidInput_throwsException, the test for binary data type creates a vector with 3 bytes for dimension=16 bits (which expects 2 bytes). The test expects an IllegalArgumentException, but the error message assertion is missing. If the implementation changes to throw a different exception or a different message, the test will pass incorrectly. The test should assert on the exception message to ensure it's the expected dimension mismatch error, not some other error like a base64 decoding failure.

// --- Binary: wrong dimension (dimension=16 bits expects 2 bytes, provide 3) ---
int binaryDimension = 16;
byte[] wrongDimBinaryVector = new byte[] { 1, 2, 3 };
String wrongDimBinaryBase64 = java.util.Base64.getEncoder().encodeToString(wrongDimBinaryVector);
EngineFieldMapper binaryMapper = createFieldMapperForBase64Test(VectorDataType.BINARY, binaryDimension);
ParseContext binaryCtx = createParseContextForBase64(wrongDimBinaryBase64);
expectThrows(
    IllegalArgumentException.class,
    () -> binaryMapper.getBytesFromContext(binaryCtx, binaryDimension, VectorDataType.BINARY)
);

@github-actions

Copy link
Copy Markdown

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Remove invalid nextValue() call

Calling nextValue() on an empty leaf (after advanceExact returns false) violates the
DocValueFetcher contract and may throw an exception or return undefined behavior.
Only call nextValue() after a successful advanceExact(docId) that returns true.
Remove the invalid nextValue() call.

src/test/java/org/opensearch/knn/index/KNNVectorDVLeafFieldDataTests.java [465]

-public void testGetLeafValueFetcher_fieldNotInSegment_returnsEmptyLeaf() throws IOException {
-    ...
-    for (int docId = 0; docId < numDocs; docId++) {
-        assertFalse("advanceExact should fail for doc " + docId + " in segment without vector field", leaf.advanceExact(docId));
-        assertEquals("docValueCount should be 0 for doc " + docId, 0, leaf.docValueCount());
-    }
+for (int docId = 0; docId < numDocs; docId++) {
+    assertFalse("advanceExact should fail for doc " + docId + " in segment without vector field", leaf.advanceExact(docId));
+    assertEquals("docValueCount should be 0 for doc " + docId, 0, leaf.docValueCount());
+}
 
-    // Verify nextValue returns null on the empty leaf
-    assertNull("Empty leaf nextValue should return null", leaf.nextValue());
-
Suggestion importance[1-10]: 9

__

Why: Calling nextValue() on an empty leaf after advanceExact returns false violates the DocValueFetcher contract and may cause exceptions or undefined behavior. This is a correctness issue that could lead to test instability or incorrect validation.

High
Use anyLong() for primitive long

The test mocks innerScorerSupplier.get(any(Long.class)) but the actual code calls
get(long leadCost) with a primitive long, not Long. Mockito's any(Long.class) may
not match primitive long arguments, causing the mock to return null instead of the
expected empty scorer. Use anyLong() instead to match primitive long arguments.

src/test/java/org/opensearch/knn/index/query/RescoreRadialSearchQueryTests.java [119]

-public void testScorerSupplier_whenInnerReturnsEmpty_thenEmptyScorer() throws IOException {
-    // Inner query returns an empty scorer
-    Scorer innerScorer = KNNScorer.emptyScorer();
+when(innerScorerSupplier.get(anyLong())).thenReturn(innerScorer);
 
-    ScorerSupplier innerScorerSupplier = mock(ScorerSupplier.class);
-    when(innerScorerSupplier.get(any(Long.class))).thenReturn(innerScorer);
-    when(innerScorerSupplier.cost()).thenReturn(0L);
-    ...
-
Suggestion importance[1-10]: 8

__

Why: The mock uses any(Long.class) which may not match primitive long arguments in the actual code. Using anyLong() ensures the mock correctly matches the method signature, preventing potential null returns that would cause test failures.

Medium
General
Validate maxResultsSize is positive

The maxResultsSize parameter should be validated to ensure it is positive. Allowing
zero or negative values could lead to unexpected behavior in the TopKnnCollector or
downstream processing. Add a validation check to prevent invalid values.

src/main/java/org/opensearch/knn/index/query/RescoreRadialSearchQuery.java [95-110]

 public RescoreRadialSearchQuery(
     final Query innerQuery,
     final String field,
     final float[] queryVector,
     float radius,
     final boolean memoryOptimizedSearchEnabled,
     final int maxResultsSize
 ) {
     this.innerQuery = Objects.requireNonNull(innerQuery);
     this.field = Objects.requireNonNull(field);
     this.queryVector = Objects.requireNonNull(queryVector);
     this.radius = radius;
     this.memoryOptimizedSearchEnabled = memoryOptimizedSearchEnabled;
+    if (maxResultsSize <= 0) {
+        throw new IllegalArgumentException("maxResultsSize must be positive, got: " + maxResultsSize);
+    }
     this.maxResultsSize = maxResultsSize;
     Objects.requireNonNull(EXACT_SEARCHER_SINGLETON, "Exact searcher was not initialized.");
 }
Suggestion importance[1-10]: 8

__

Why: The constructor accepts maxResultsSize without validating it is positive. Allowing zero or negative values could cause issues in TopKnnCollector or downstream processing. Adding validation prevents invalid configurations and improves error handling.

Medium
Remove production-unsafe assertion

The assertion assert (iterator.cost() > maxResultsSize) may fail in production if
the cost estimate is incorrect or if the iterator returns fewer documents than
expected. Consider replacing the assertion with a conditional check or removing it,
as assertions are typically disabled in production environments.

src/main/java/org/opensearch/knn/index/query/RescoreRadialSearchQuery.java [295-304]

 private TopDocs collectTopDocs(final Scorer scorer) throws IOException {
     final TopKnnCollector collector = new TopKnnCollector(maxResultsSize, Integer.MAX_VALUE);
     final DocIdSetIterator iterator = scorer.iterator();
-    assert (iterator.cost() > maxResultsSize);
     int docId;
     while ((docId = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
         collector.collect(docId, scorer.score());
     }
     return collector.topDocs();
 }
Suggestion importance[1-10]: 7

__

Why: The assertion assert (iterator.cost() > maxResultsSize) on line 298 may fail in production if the cost estimate is incorrect. Since assertions are typically disabled in production, this could lead to unexpected behavior. Removing or replacing with a conditional check improves robustness.

Medium
Reorder validation checks for clarity

The validation logic checks knnEngine first, then vectorDataType, then
quantizationConfig. If the engine check fails, the method throws immediately without
checking other conditions. Consider reordering checks to validate all conditions and
provide a comprehensive error message, or ensure the order reflects the most
critical validation first.

src/main/java/org/opensearch/knn/index/mapper/KNNVectorFieldType.java [263-269]

 public void validateSupportRadialSearch(final KNNEngine knnEngine) {
+    if (getVectorDataType() == VectorDataType.BINARY) {
+        throw new UnsupportedOperationException("Binary data type does not support radial search");
+    }
     if (ENGINES_SUPPORTING_RADIAL_SEARCH.contains(knnEngine) == false) {
         throw new UnsupportedOperationException(String.format(Locale.ROOT, "Engine [%s] does not support radial search", knnEngine));
-    }
-    if (getVectorDataType() == VectorDataType.BINARY) {
-        throw new UnsupportedOperationException(String.format(Locale.ROOT, "Binary data type does not support radial search"));
     }
     ...
 }
Suggestion importance[1-10]: 4

__

Why: The validation order checks knnEngine before vectorDataType. While functionally correct, reordering to check vectorDataType first (a simpler, field-level check) before engine-level checks could improve code clarity and provide more intuitive error messages. However, the current order is also reasonable, so the impact is minor.

Low
Clarify operator precedence with parentheses

The condition candidate.hasVectorValues() && !matchedFieldInfo.hasVectorValues()
lacks parentheses, making operator precedence ambiguous. Add explicit parentheses to
clarify the intended logic and prevent potential bugs if the condition is modified
later.

src/main/java/org/opensearch/knn/index/codec/KNN10010Codec/KNN10010DerivedSourceStoredFieldsFormat.java [105]

 FieldInfo matchedFieldInfo = null;
 for (FieldInfo candidate : fieldInfos) {
     if (candidate.name.equalsIgnoreCase(field) == false) {
         continue;
     }
-    if (matchedFieldInfo == null || candidate.hasVectorValues() && !matchedFieldInfo.hasVectorValues()) {
+    if (matchedFieldInfo == null || (candidate.hasVectorValues() && !matchedFieldInfo.hasVectorValues())) {
         matchedFieldInfo = candidate;
     } else if (matchedFieldInfo.hasVectorValues() && candidate.hasVectorValues()) {
         log.warn(
             "Skipping derived vector field [{}] because field infos [{}] and [{}] both match case-insensitively",
             field,
             matchedFieldInfo.name,
             candidate.name
         );
         return null;
     }
 }
Suggestion importance[1-10]: 4

__

Why: The suggestion correctly identifies that adding parentheses around candidate.hasVectorValues() && !matchedFieldInfo.hasVectorValues() improves code clarity. However, the current code is already correct due to Java's operator precedence (logical AND binds tighter than OR), so this is a minor style improvement rather than a bug fix.

Low
Remove unnecessary empty block

The empty block { } after opening the IndexReader serves no purpose and reduces code
clarity. Remove the unnecessary braces to improve readability and maintain
consistent code structure across the test methods.

src/test/java/org/opensearch/knn/index/query/RescoreRadialSearchQueryTests.java [426-428]

-public void testRescore_withLuceneIndex_filtersOutsideRadius() {
+try (IndexReader reader = DirectoryReader.open(directory)) {
+    IndexSearcher searcher = new IndexSearcher(reader);
     ...
-    try (Directory directory = newDirectory()) {
-        try (IndexWriter w = new IndexWriter(directory, newIndexWriterConfig())) {
-            for (float[] vector : vectors) {
-                Document doc = new Document();
-                doc.add(new KnnFloatVectorField(FIELD_NAME, vector, VectorSimilarityFunction.EUCLIDEAN));
-                w.addDocument(doc);
-            }
-            w.commit();
-        }
 
-        try (IndexReader reader = DirectoryReader.open(directory)) {
-            {
-
-                IndexSearcher searcher = new IndexSearcher(reader);
-                ...
-
Suggestion importance[1-10]: 3

__

Why: The empty block { } after opening the IndexReader serves no functional purpose and slightly reduces readability. Removing it improves code clarity without affecting behavior.

Low
Use imported class names consistently

The test uses fully qualified class names (java.nio.ByteBuffer, java.nio.ByteOrder,
java.util.Base64) instead of imports. Since ByteBuffer and ByteOrder are already
imported at the top of the file, use the short names for consistency and
readability.

src/test/java/org/opensearch/knn/index/mapper/KNNVectorFieldMapperTests.java [2871-2874]

-public void testBase64Indexing_allDataTypes_parsesCorrectly() throws IOException {
-    // --- Float ---
-    float[] expectedFloat = TEST_VECTOR;
-    java.nio.ByteBuffer buffer = java.nio.ByteBuffer.allocate(expectedFloat.length * Float.BYTES)
-        .order(java.nio.ByteOrder.LITTLE_ENDIAN);
-    buffer.asFloatBuffer().put(expectedFloat);
-    String floatBase64 = java.util.Base64.getEncoder().encodeToString(buffer.array());
-    ...
+ByteBuffer buffer = ByteBuffer.allocate(expectedFloat.length * Float.BYTES)
+    .order(ByteOrder.LITTLE_ENDIAN);
+buffer.asFloatBuffer().put(expectedFloat);
+String floatBase64 = Base64.getEncoder().encodeToString(buffer.array());
Suggestion importance[1-10]: 2

__

Why: The test uses fully qualified class names (java.nio.ByteBuffer, java.nio.ByteOrder, java.util.Base64) instead of the imported short names. Using the imported names improves consistency and readability, though this is a minor style issue.

Low
Verify TotalHits semantics for accuracy

The TotalHits constructor is called with topDocs.scoreDocs.length as the hit count,
but this may not accurately reflect the total number of vectors visited during
search. Verify that this aligns with the intended semantics for hitCount in the
context of memory-optimized search.

src/main/java/org/opensearch/knn/index/query/memoryoptsearch/MemoryOptimizedKNNWeight.java [214]

+// Preserve the original totalHits relation to indicate if the search budget was exhausted
 topDocs = new TopDocs(new TotalHits(topDocs.scoreDocs.length, topDocs.totalHits.relation()), topDocs.scoreDocs);
Suggestion importance[1-10]: 2

__

Why: The suggestion asks to verify the semantics, but the PR code already includes a comment explaining the decision to preserve totalHits.relation() to track if the search budget was exhausted. The suggestion's improved code only adds a comment that essentially duplicates what's already documented in the PR, providing minimal value.

Low

@codecov

codecov Bot commented Jun 18, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 96.71053% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.73%. Comparing base (67033ce) to head (80c40d9).

Files with missing lines Patch % Lines
...arch/knn/index/query/RescoreRadialSearchQuery.java 91.66% 4 Missing and 2 partials ⚠️
...java/org/opensearch/knn/index/query/KNNWeight.java 87.50% 1 Missing and 1 partial ⚠️
...Codec/KNN10010DerivedSourceStoredFieldsFormat.java 96.29% 0 Missing and 1 partial ⚠️
...rg/opensearch/knn/index/query/RNNQueryFactory.java 97.36% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@                          Coverage Diff                          @@
##             feature/32x-default-compression    #3371      +/-   ##
=====================================================================
+ Coverage                              83.53%   83.73%   +0.19%     
- Complexity                              4287     4358      +71     
=====================================================================
  Files                                    450      453       +3     
  Lines                                  15552    15773     +221     
  Branches                                2015     2056      +41     
=====================================================================
+ Hits                                   12991    13207     +216     
- Misses                                  1770     1782      +12     
+ Partials                                 791      784       -7     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@kotwanikunal kotwanikunal marked this pull request as ready for review June 18, 2026 16:00
@naveentatikonda naveentatikonda merged commit d5f120f into feature/32x-default-compression Jun 18, 2026
98 of 100 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

skip-diff-analyzer Maintainer to skip code-diff-analyzer check, after reviewing issues in AI analysis.

Projects

None yet

Development

Successfully merging this pull request may close these issues.