RANGER-5647: Fix date-bound tag policy engine test fixtures by ramackri · Pull Request #1018 · apache/ranger

ramackri · 2026-06-15T05:44:18Z

This PR contains two CI fixes for agents-common:

1. Date-bound tag test fixtures: testPolicyEngine_hiveForTag_filebased fails on/after 2026-06-15
2. Thread-safe dynamic path matching: intermittent testPolicyEngine_hdfs_resourcespec failure under concurrent policy evaluation

1. Tag fixture dates

Tag policy tests use RESTRICTED-FINAL with a deny exception conditioned on ctx.isAccessedBefore('activation_date'). Fixture dates were 2026/06/15; once the system clock reaches that date, the condition stops holding, the deny exception no longer applies, and CI fails.

Fix: change 2026/06/15 → 2099/12/31 in:

agents-common/src/test/resources/policyengine/resourceTags.json
agents-common/src/test/resources/policyengine/ACLResourceTags.json
agents-common/src/test/resources/policyengine/plugin/resourceTags.json
agents-common/src/test/resources/policyengine/descendant_tags.json
agents-common/src/test/resources/policyengine/test_policyengine_tag_hive.json
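To illustrate why a near-term fixture date makes the test depend on the wall clock, the sketch below models the condition as a plain date comparison. The `isAccessedBefore` method here is a hypothetical stand-in written for this example, not Ranger's implementation, and the date format is assumed from the fixture values quoted above.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical stand-in for a date-bound condition; not Ranger's actual code.
final class DateBoundFixtureSketch {
    // Assumed format, inferred from fixture values such as 2026/06/15.
    private static final DateTimeFormatter FIXTURE_FORMAT = DateTimeFormatter.ofPattern("uuuu/MM/dd");

    // True only while the access date is strictly before the tag's date.
    static boolean isAccessedBefore(LocalDate accessDate, String tagDate) {
        return accessDate.isBefore(LocalDate.parse(tagDate, FIXTURE_FORMAT));
    }

    public static void main(String[] args) {
        LocalDate dayBefore = LocalDate.of(2026, 6, 14);
        LocalDate sameDay   = LocalDate.of(2026, 6, 15);

        // The old fixture date holds the day before, then flips on the day itself.
        System.out.println(isAccessedBefore(dayBefore, "2026/06/15"));
        System.out.println(isAccessedBefore(sameDay, "2026/06/15"));
        // A far-future fixture date keeps the condition stable.
        System.out.println(isAccessedBefore(sameDay, "2099/12/31"));
    }
}
```

Under these assumptions the outcome changes only because the clock moved, which is why pushing the fixture date far into the future stabilizes the test without touching the policy under test.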

2. Concurrent `{USER}` path matching (`hdfs_resourcespec`)

Symptom

TestPolicyEngine.testPolicyEngine_hdfs_resourcespec
ALLOW 'read /home/user1/tmp/sales.db' for user=user1
expected: true but was: false

TestPolicyEngine.runTestCaseTests() uses parallelStream() on a shared RangerPolicyEngine. That is intentional — production plugins evaluate concurrently on one engine. The fix belongs in product code, not in serializing tests.

Code path (end-to-end)

Policy under test (test_policyengine_hdfs_resourcespec.json, policy id 2):

"resources": { "path": { "values": ["/home/{USER}/"], "isRecursive": true } },
"policyItems": [ { "users": ["{USER}"], "accesses": [ { "type": "read" } ] } ]

Step 1 — Matcher built once at policy-engine init

RangerPathResourceMatcher.buildResourceMatchers() turns /home/{USER}/ into /home/{USER}/* (recursive + trailing /), then creates one RecursiveWildcardResourceMatcher per policy value. Because {USER} is present, setDelimiters() sets tokenReplacer, so getNeedsDynamicEval() returns true — the path cannot be pre-expanded at init.

// RangerPathResourceMatcher.java — constructor: cache split only for STATIC paths
if (!getNeedsDynamicEval()) {
    wildcardPathElements = StringUtils.split(value, pathSeparatorChar);
}
// For /home/{USER}/* → getNeedsDynamicEval()==true → wildcardPathElements stays null

This matcher object is stored inside the policy evaluator and reused for every request that may hit policy id 2.

Step 2 — Each request gets its own token context

RangerDefaultRequestProcessor.preProcess() sets per-request context:

RangerAccessRequestUtil.setCurrentUserInContext(request.getContext(), request.getUser());
// → context["token:USER"] = "user1" or "user2"

Each RangerAccessRequest has its own context Map — that part is thread-safe.

Step 3 — Evaluation reaches the shared matcher

RangerDefaultPolicyEvaluator.evaluate() calls:

resourceMatcher.getMatchType(request.getResource(), scopes, request.getContext());

which eventually calls RecursiveWildcardResourceMatcher.isMatch(resourceValue, evalContext) where:

resourceValue = "/home/user1/tmp/sales.db" (same for both parallel sub-tests)
evalContext = that request's context (different token:USER per thread)

Step 4 — Token expansion (per-request, always correct)

ResourceMatcher.getExpandedValue(evalContext) replaces {USER} from context:

if (tokenReplacer != null) {
    ret = tokenReplacer.replaceTokens(ret, evalContext);
    // user1 → "/home/user1/*"
    // user2 → "/home/user2/*"
}
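As a rough illustration of per-request token expansion, the sketch below substitutes `{NAME}` tokens from a context map keyed by `token:NAME`. The class, the regular expression, and the choice to leave unknown tokens untouched are assumptions made for this example; Ranger's own token replacer may behave differently.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative token expansion; names and behaviour are assumptions, not Ranger's API.
final class TokenExpansionSketch {
    private static final Pattern TOKEN = Pattern.compile("\\{([A-Za-z_]+)\\}");

    // Replaces each {NAME} with context["token:NAME"]; unknown tokens are left as-is (assumption).
    static String expand(String value, Map<String, Object> context) {
        Matcher m = TOKEN.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object replacement = context.get("token:" + m.group(1));
            String text = (replacement != null) ? replacement.toString() : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> ctxA = new HashMap<>();
        ctxA.put("token:USER", "user1");
        Map<String, Object> ctxB = new HashMap<>();
        ctxB.put("token:USER", "user2");

        System.out.println(expand("/home/{USER}/*", ctxA)); // /home/user1/*
        System.out.println(expand("/home/{USER}/*", ctxB)); // /home/user2/*
    }
}
```

Because the result depends only on the arguments, two threads calling `expand` with different contexts cannot affect each other.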

Step 5 — Segment-by-segment matching

isMatch() passes the expanded path and split segments into RangerPathResourceMatcher.isRecursiveWildCardMatch(), which walks path components:

// isRecursiveWildCardMatch() — compares each segment of the request path
// against wildcardPathElements[pathElementIndex]
for (String p : pathElements) {          // request: "", "home", "user1", "tmp", "sales.db"
    String wp = wildcardPathElements[pathElementIndex];  // policy: "", "home", "user1", "*"
    // segment "user1" must equal wp "user1" for user1 to match
    // segment "user1" vs wp "user2" → mismatch → deny
}
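The segment walk can be sketched in a much simplified form as follows. This is not the real `isRecursiveWildCardMatch()`: it only handles literal segments plus a trailing `*`, and it splits with plain `String.split`, both of which are simplifications chosen for this example.

```java
// Simplified segment-by-segment comparison; an illustration, not Ranger's matcher.
final class SegmentMatchSketch {
    // Policy segments must equal request segments one by one;
    // a trailing "*" accepts whatever remains of the request path.
    static boolean matches(String[] requestSegments, String[] policySegments) {
        for (int i = 0; i < policySegments.length; i++) {
            String wp = policySegments[i];
            boolean last = (i == policySegments.length - 1);
            if (last && wp.equals("*")) {
                return true;
            }
            if (i >= requestSegments.length || !wp.equals(requestSegments[i])) {
                return false;
            }
        }
        // No trailing wildcard: require an exact segment-for-segment match.
        return requestSegments.length == policySegments.length;
    }

    public static void main(String[] args) {
        String[] request = "/home/user1/tmp/sales.db".split("/");

        System.out.println(matches(request, "/home/user1/*".split("/"))); // true
        System.out.println(matches(request, "/home/user2/*".split("/"))); // false
    }
}
```

The second call shows the failure mode in miniature: the same request path is rejected as soon as the policy segments belong to a different user.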

The bug was not in token expansion or in per-request context — it was in where the split policy segments were stored between steps 4 and 5.

Before fix — race on shared instance field

RecursiveWildcardResourceMatcher declared split segments as an instance field:

static class RecursiveWildcardResourceMatcher extends AbstractPathResourceMatcher {
    String[] wildcardPathElements;   // ONE field per matcher object, shared by all threads

    boolean isMatch(String resourceValue, Map<String, Object> evalContext) {
        String expandedValue;
        if (getNeedsDynamicEval()) {
            expandedValue        = getExpandedValue(evalContext);   // per-thread: correct
            wildcardPathElements = StringUtils.split(expandedValue, pathSeparatorChar); // BUG: writes shared field
        } else {
            expandedValue = value;
        }
        return function.apply(resourceValue, expandedValue, pathSeparatorChar, ioCase, wildcardPathElements);
        //                                                      reads shared field
    }
}

Two threads evaluating policy id 2 concurrently both enter isMatch() on the same RecursiveWildcardResourceMatcher instance:

Shared object: RecursiveWildcardResourceMatcher for policy "/home/{USER}/*"
               instance field: wildcardPathElements  ← single slot, both threads write here

Thread A (user1)                              Thread B (user2)
─────────────────────────────────────────────────────────────────────────────
isMatch("/home/user1/tmp/sales.db", ctxA)

expandedValue = getExpandedValue(ctxA)
  ctxA["token:USER"] = "user1"
  → expandedValue = "/home/user1/*"           isMatch("/home/user1/tmp/sales.db", ctxB)

                                              expandedValue = getExpandedValue(ctxB)
                                                ctxB["token:USER"] = "user2"
                                                → expandedValue = "/home/user2/*"

wildcardPathElements =                        wildcardPathElements =
  ["", "home", "user1", "*"]                    ["", "home", "user2", "*"]
  ▲ write to shared field                       ▲ overwrites A's array

function.apply(
  "/home/user1/tmp/sales.db",
  "/home/user1/*",              ← A's expandedValue is still correct (local var)
  wildcardPathElements)         ← but reads ["", "home", "user2", "*"]  WRONG

isRecursiveWildCardMatch compares:
  request segment[2] = "user1"
  policy  segment[2] = "user2"   ← from B's overwrite
  → mismatch at index 2
  → return false
  → isAllowed = false            ✗  (expected true for user1)

Why expandedValue being local wasn't enough: expandedValue is a local String and stays correct per thread, but isRecursiveWildCardMatch() uses both the full wildcardPath string and the wildcardPathElements array for the optimized segment walk. When the array is corrupted by another thread, the segment comparison fails even though the string /home/user1/* on the stack is right.

Why it is intermittent: This is a data race — it only fails when Thread B writes wildcardPathElements between Thread A's write and Thread A's function.apply(). Single-threaded runs and low-contention local tests usually pass; CI full-suite load (agents-common, 502 tests, ForkJoin pool) hits it more often.
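The interleaving above can be replayed deterministically without real threads by performing the two writes and the read in the failing order. The class below is a hypothetical miniature of the pre-fix shape (one mutable field shared by all callers); it assumes the policy path ends in `*` and is not the actual Ranger code.

```java
// Deterministic replay of the racy interleaving; an illustration only.
final class SharedFieldRaceSketch {
    // Mirrors the pre-fix design: a single mutable slot shared by every caller.
    private String[] wildcardPathElements;

    void splitIntoSharedField(String expandedValue) {
        wildcardPathElements = expandedValue.split("/");
    }

    // Compares the leading segments; the final policy segment is assumed to be "*".
    boolean matchAgainstSharedField(String resourceValue) {
        String[] request = resourceValue.split("/");
        String[] policy  = wildcardPathElements;
        for (int i = 0; i < policy.length - 1; i++) {
            if (i >= request.length || !policy[i].equals(request[i])) {
                return false;
            }
        }
        return true;
    }

    // Runs Thread A's steps, optionally with Thread B's write in between.
    static boolean replay(boolean threadBInterleaves) {
        SharedFieldRaceSketch matcher = new SharedFieldRaceSketch();
        matcher.splitIntoSharedField("/home/user1/*");          // Thread A: write
        if (threadBInterleaves) {
            matcher.splitIntoSharedField("/home/user2/*");      // Thread B: overwrite
        }
        return matcher.matchAgainstSharedField("/home/user1/tmp/sales.db"); // Thread A: read
    }

    public static void main(String[] args) {
        System.out.println(replay(false)); // true: A reads its own segments
        System.out.println(replay(true));  // false: A reads B's segments
    }
}
```

With real threads the second ordering occurs only occasionally, which matches the intermittent nature of the CI failure.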

After fix — local `pathElements` per `isMatch()` call

boolean isMatch(String resourceValue, Map<String, Object> evalContext) {
    String   expandedValue;
    String[] pathElements;                        // local variable: one per call, per thread stack

    if (getNeedsDynamicEval()) {
        expandedValue = getExpandedValue(evalContext);
        pathElements  = StringUtils.split(expandedValue, pathSeparatorChar);  // no shared write
    } else {
        expandedValue = value;
        pathElements  = wildcardPathElements;     // static paths: read-only cache from init
    }
    return function.apply(resourceValue, expandedValue, pathSeparatorChar, ioCase, pathElements);
}

Static paths (no {USER}): unchanged — wildcardPathElements is still computed once in the constructor and reused read-only.

Dynamic paths ({USER}): split result lives on the thread stack for the duration of one isMatch() call; the instance field is never written during evaluation.

Shared object: RecursiveWildcardResourceMatcher for policy "/home/{USER}/*"
               instance field wildcardPathElements = null (never written for dynamic paths)

Thread A (user1)                              Thread B (user2)
─────────────────────────────────────────────────────────────────────────────
expandedValue_A = "/home/user1/*"             expandedValue_B = "/home/user2/*"

pathElements_A =                              pathElements_B =
  ["", "home", "user1", "*"]                    ["", "home", "user2", "*"]
  (stack-local)                                 (stack-local)
  A cannot see B's array                        B cannot see A's array

function.apply(..., pathElements_A)           function.apply(..., pathElements_B)

isRecursiveWildCardMatch:                      isRecursiveWildCardMatch:
  "user1" == "user1" ✓                            "user1" vs "user2" ✗
  under /home/user1/* ✓                           not under /home/user2/*
  → MATCH → isAllowed=true ✓                      → NO MATCH → isAllowed=false ✓

No shared mutable state between threads during the match.
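One way to check this property is to hammer a single shared matcher from a parallel stream and count wrong answers. The sketch below is a hypothetical miniature, not the Ranger class: it substitutes `{USER}` with `String.replace`, assumes the policy path ends in `*`, and keeps every intermediate value in local variables.

```java
import java.util.stream.IntStream;

// Miniature of the post-fix design: immutable fields, per-call locals only.
final class LocalSplitMatcherSketch {
    private final String policyValue; // never modified after construction

    LocalSplitMatcherSketch(String policyValue) {
        this.policyValue = policyValue;
    }

    boolean isMatch(String resourceValue, String user) {
        String   expandedValue = policyValue.replace("{USER}", user); // local to this call
        String[] pathElements  = expandedValue.split("/");            // local to this call
        String[] request       = resourceValue.split("/");
        // Compare leading segments; the final policy segment is assumed to be "*".
        for (int i = 0; i < pathElements.length - 1; i++) {
            if (i >= request.length || !pathElements[i].equals(request[i])) {
                return false;
            }
        }
        return true;
    }

    // Evaluates user1 and user2 concurrently on one shared matcher and counts mismatches.
    static long countWrongResults(int iterations) {
        LocalSplitMatcherSketch shared = new LocalSplitMatcherSketch("/home/{USER}/*");
        return IntStream.range(0, iterations).parallel().filter(i -> {
            String  user     = (i % 2 == 0) ? "user1" : "user2";
            boolean expected = user.equals("user1");
            return shared.isMatch("/home/user1/tmp/sales.db", user) != expected;
        }).count();
    }

    public static void main(String[] args) {
        System.out.println(countWrongResults(100_000)); // 0
    }
}
```

Since `isMatch` reads only a final field and its own locals, the count stays at zero regardless of how the calls interleave.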

Static vs dynamic paths — which failed?

The CI failure was on a dynamic path only. Static paths in the same test file were never affected.

Failed sub-test (dynamic path — policy id 2):

"resources": { "path": { "values": ["/home/{USER}/"], "isRecursive": true } },
"policyItems": [ { "users": ["{USER}"], "accesses": [ { "type": "read" } ] } ]

- {USER} is a runtime token → getNeedsDynamicEval() returns true
- Path is expanded per request from context["token:USER"]:
  - user1 → /home/user1/*
  - user2 → /home/user2/*
- Split segments were written to the shared instance field wildcardPathElements on every isMatch() call → race under parallel evaluation
- This is the sub-test that failed: ALLOW 'read /home/user1/tmp/sales.db' for user=user1

Passing sub-test in same file (static path — policy id 1):

"resources": { "path": { "values": ["/finance/rest*ricted/"], "isRecursive": true } },
"policyItems": [ { "groups": ["finance"], "accesses": [ { "type": "read" } ] } ]

- No {USER} or other tokens → getNeedsDynamicEval() returns false
- Segments split once in the constructor and stored in wildcardPathElements read-only
- Never rewritten during isMatch() → no race, unaffected by this bug

| | Static path | Dynamic path (`{USER}`) |
|---|---|---|
| Example in `hdfs_resourcespec` | `/finance/rest*ricted/` (policy id 1) | `/home/{USER}/` (policy id 2) |
| `getNeedsDynamicEval()` | `false` | `true` |
| When segments are split | Once at matcher init | Every `isMatch()` call |
| Where segments stored (before fix) | Instance field, write-once | Instance field, rewritten per request |
| CI failure? | No | Yes (intermittent under `parallelStream`) |
| Fix changes this path? | No, still uses init-time cache | Yes, uses local `pathElements` per call |
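The static/dynamic split can be sketched as a constructor that caches segments only when no token is present. Everything in this example is hypothetical: the class name, the `{USER}`-only token check, and the `segmentsFor` helper are illustrative simplifications rather than Ranger's API.

```java
import java.util.Arrays;

// Illustration of init-time caching for static paths versus per-call splitting for dynamic ones.
final class StaticVsDynamicSegmentsSketch {
    private final String   value;
    private final boolean  needsDynamicEval;
    private final String[] cachedSegments; // stays null for dynamic paths

    StaticVsDynamicSegmentsSketch(String value) {
        this.value            = value;
        // Simplification: only the {USER} token is recognized here.
        this.needsDynamicEval = value.contains("{USER}");
        this.cachedSegments   = needsDynamicEval ? null : value.split("/");
    }

    // Static paths return the read-only cache; dynamic paths get a fresh array per call.
    String[] segmentsFor(String user) {
        if (needsDynamicEval) {
            return value.replace("{USER}", user).split("/");
        }
        return cachedSegments;
    }

    public static void main(String[] args) {
        StaticVsDynamicSegmentsSketch staticPath  = new StaticVsDynamicSegmentsSketch("/finance/rest*ricted/*");
        StaticVsDynamicSegmentsSketch dynamicPath = new StaticVsDynamicSegmentsSketch("/home/{USER}/*");

        System.out.println(Arrays.toString(staticPath.segmentsFor("user1")));
        System.out.println(Arrays.toString(dynamicPath.segmentsFor("user1")));
        System.out.println(Arrays.toString(dynamicPath.segmentsFor("user2")));
    }
}
```

In this shape the only array shared across calls is the static cache, which is written once in the constructor and never again.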

Behaviour impact

| Scenario | Before | After |
|---|---|---|
| Single-threaded evaluation | Correct | Correct (identical logic) |
| Static paths (no tokens) | Correct | Correct (same init-time cache) |
| Concurrent `{USER}` evaluation | Intermittently wrong | Correct |

No intentional semantic change — only removes incorrect cross-thread corruption of split path segments on dynamic paths.

File: agents-common/src/main/java/org/apache/ranger/plugin/resourcematcher/RangerPathResourceMatcher.java

Test plan

mvn test -pl agents-common -Dtest=TestPolicyEngine#testPolicyEngine_hiveForTag_filebased
mvn test -pl agents-common -Dtest=TestPolicyEngine#testPolicyEngine_hdfs_resourcespec (30 consecutive runs)
mvn test -pl agents-common -Dtest=RangerPathResourceMatcherTest
CI build-17 green on agents-common

RESTRICTED-FINAL deny-exception uses isAccessedBefore(activation_date). Fixture dates were 2026/06/15, so TestPolicyEngine.testPolicyEngine_hiveForTag_filebased failed on/after that day. Use 2099/12/31 in tag test JSON so CI stays stable without changing policy semantics under test. Co-authored-by: Cursor <cursoragent@cursor.com>

RecursiveWildcardResourceMatcher reused a shared wildcardPathElements field when expanding {USER} tokens, causing intermittent deny results under concurrent policy evaluation (e.g. hdfs_resourcespec in CI). Co-authored-by: Cursor <cursoragent@cursor.com>

PR apache#1018 updated slash-format tag fixture dates to 2099/12/31 but left 2026-06-15 ISO expiry_date values in test_policyengine_tag_hive.json. After 2026-06-15, TestPolicyEngine.testPolicyEngine_hiveForTag fails CI with isAllowed expected true but was false for EXPIRES_ON SELF match. Co-authored-by: Cursor <cursoragent@cursor.com>

…1021) PR #1018 updated slash-format tag fixture dates to 2099/12/31 but left 2026-06-15 ISO expiry_date values in test_policyengine_tag_hive.json. After 2026-06-15, TestPolicyEngine.testPolicyEngine_hiveForTag fails CI with isAllowed expected true but was false for EXPIRES_ON SELF match. Co-authored-by: ramk <ramk@cloudera.com> Co-authored-by: Cursor <cursoragent@cursor.com>

ramackri mentioned this pull request Jun 15, 2026

RANGER-5645: Add audit-ingestor service-user allowlist for Docker plugins #1017

Merged

3 tasks

ramackri requested a review from mneethiraj June 15, 2026 05:45

mneethiraj approved these changes Jun 15, 2026

ramackri merged commit 38d2153 into master Jun 15, 2026
11 of 12 checks passed

This was referenced Jun 16, 2026

RANGER-5642: Exclude duplicate Jersey JARs from Kafka plugin packaging #1020

Open

RANGER-5647: Fix remaining ISO EXPIRES_ON dates in hive tag tests #1021

Merged

ramackri merged 2 commits into master from RANGER-5647-patch
