Skip to content

SAI-6282: preferredMergePolicyFactory to allow canary merge policy#321

Open
patsonluk wants to merge 5 commits into
patsonluk/SAI-6231/time-routed-segmentfrom
patsonluk/SAI-6282/canary
Open

SAI-6282: preferredMergePolicyFactory to allow canary merge policy#321
patsonluk wants to merge 5 commits into
patsonluk/SAI-6231/time-routed-segmentfrom
patsonluk/SAI-6282/canary

Conversation

@patsonluk

@patsonluk patsonluk commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator

Description

Currently we cannot easily test different merge policy on a node, as:

  1. We don't want to (probably cannot) change the existing collection to another config set, so we are stuck with _FS7
  2. If we directly change the solrconfig.xml under _FS7 then nodes that are NOT being canaried, then Solr will run into classloading issue on restart/new collections as they are not aware of TemporalMergePolicy

Solution

In solrconfig.xml, under <indexConfig>, add a new node <preferredMergePolicyFactory> which has exactly the same structure and parse logic as <mergePolicyFactory>, and if both defined <preferredMergePolicyFactory> would take precendence

For node that still runs the old hash, such <preferredMergePolicyFactory> would just be ignored. if a node is running the new hash (selected for canary), then they will pick up the preferred factory and log the contents on startup Using <preferredMergePolicyFactory>...

Remarks

This is likely NOT something we want in long term. For simplicity we can still merge to fs/branch_9_7 (there's no harm at all). But after time routed segment project concludes, we should just revert this change here?

Test

  1. Updated the _FS7 solrconfig.xml on playpen c92 to below
 <indexConfig aggregateNodeLevelMetricsEnabled="true">
        <mergePolicyFactory class="org.apache.solr.index.SortingMergePolicyFactory">
            ...
         </mergePolicyFactory>
	 <!-- preferredMergePolicyFactory is for canary test. Non test solr hash should just ignore it -->
        <preferredMergePolicyFactory class="org.apache.solr.index.SortingMergePolicyFactory">
            <str name="sort">IndvId asc, EventStart asc</str>
            <str name="wrapped.prefix">inner</str>
            <str name="inner.class">mn.fs.solr.index.TemporalMergePolicyFactory</str>
            <str name="inner.temporalField">EventStart</str>
            <int name="inner.minThreshold">4</int>
            <int name="inner.maxThreshold">8</int>
            <double name="inner.compactionRatio">1.5</double>
            <bool name="inner.useExponentialBuckets">true</bool>
        </preferredMergePolicyFactory> 
  1. Without updating fs-solr hash, restart playpen c92 cluster. This emulates nodes that are NOT selected for canary, which the existing hash reading the new config (should be ignored)
  2. All cores should load and query on local returns same # of docs
  3. Now add fs-solr hash override (3843d3c071917900b7077ada0d0ecfdf7d534da9) to playpen c92-1 to use this PR. Restart the cluster again. This emulates a canary node
  4. solr-c92-1 startup log prints out o.a.s.u.SolrIndexConfig Using <preferredMergePolicyFactory> ({type = preferredMergePolicyFactory,class = org.apache.solr.index.SortingMergePolicyFactory,attributes = {class=org.apache.solr.index.SortingMergePolicyFactory},args = {inner.useExponentialBuckets=true, inner.minThreshold=4, inner.maxThreshold=8, sort=IndvId asc, EventStart asc, wrapped.prefix=inner, inner.class=mn.fs.solr.index.TemporalMergePolicyFactory, inner.temporalField=EventStart, inner.compactionRatio=1.5}}) instead of <mergePolicyFactory>
  5. Commented out overrides. So others can test on c92 playpen w/o surprises

Note

Medium Risk
Changes which merge policy factory builds the index on cores that enable preferred config, affecting merge and indexing behavior; backward compatibility relies on older binaries ignoring the new XML element.

Overview
Adds optional preferredMergePolicyFactory under indexConfig in solrconfig.xml, parsed like mergePolicyFactory. When it has a class, SolrIndexConfig uses it for mergePolicyFactoryInfo and logs that choice at startup; otherwise behavior stays on mergePolicyFactory.

Configsets can define both so older Solr builds (that do not know the new element) keep the legacy factory, while newer canary nodes run a different merge policy (e.g. custom temporal policy) without changing configset for everyone.

Includes a test fixture and testPreferredMergePolicyFactoryOverridesMergePolicyFactory. PrometheusMetricsServlet only reformats some CoreMetric enum constructor arguments (no metric behavior change).

Reviewed by Cursor Bugbot for commit a117e89. Bugbot is set up for automated code reviews on this repo. Configure here.

@patsonluk patsonluk force-pushed the patsonluk/SAI-6282/canary branch from 7729001 to 94f6b0e Compare June 26, 2026 18:19
@patsonluk patsonluk changed the base branch from patsonluk/SAI-6231/time-routed-segment to fs/branch_9_7 June 26, 2026 19:43
@patsonluk patsonluk changed the base branch from fs/branch_9_7 to patsonluk/SAI-6231/time-routed-segment June 26, 2026 19:44
@patsonluk patsonluk force-pushed the patsonluk/SAI-6282/canary branch from 0fea567 to a117e89 Compare June 26, 2026 19:45
@patsonluk patsonluk marked this pull request as ready for review June 26, 2026 19:47

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Want higher recall? High effort reviews run extra passes and find more bugs. A team admin can switch effort levels in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit a117e89. Configure here.

preferredMergePolicyFactory);
} else {
mergePolicyFactoryInfo = mergePolicyFactory;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing class still overrides factory

Medium Severity

The new javadoc says mergePolicyFactory is used when preferredMergePolicyFactory is absent or lacks a class, but the constructor treats any existing preferred node as authoritative. A preferred element without a usable class replaces a valid mergePolicyFactory and core load fails instantiating a null factory class.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit a117e89. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants