Skip to content

HPCC-32908 Initial stab at implementing a IPropertyTree visitor#21054

Open
ghalliday wants to merge 19 commits into
hpcc-systems:masterfrom
ghalliday:ptree_visitor
Open

HPCC-32908 Initial stab at implementing a IPropertyTree visitor#21054
ghalliday wants to merge 19 commits into
hpcc-systems:masterfrom
ghalliday:ptree_visitor

Conversation

@ghalliday

Copy link
Copy Markdown
Member

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

ghalliday added 16 commits March 3, 2026 14:19
…IPropertyTree

- Add VisitResult enum class (Continue, Stop, SkipChildren) to jptree.hpp
- Add IPropertyTreeVisitor interface with single visit(node, attrName) method
  where attrName is nullptr for element matches and the attribute name when
  the xpath tail resolves to an attribute
- Add pure virtual visit() declaration to IPropertyTree interface
- Add visit() override declaration to PTree in jptree.ipp
Add PTree::visit(const char *xpath, IPropertyTreeVisitor &) which walks all
nodes matching the xpath and calls visitor.visit() for each match without
creating any heap-allocated IPropertyTreeIterator objects.

Xpath dispatch mirrors getElements():
  - null/empty      -> visit this node
  - .               -> self reference, continue with remainder
  - /               -> separator, skip; error if root
  - //              -> recursive descent: match at this level then descend
                       into all children via visitAllChildren()
  - [N]             -> numeric index into CPTArray
  - [expr]          -> checkPattern() qualifier on this node
  - tag / tag*      -> ChildMap lookup (exact or wildcard), CPTArray inline
                       expansion, optional qualifier via checkPattern(),
                       recurse on remainder
  - attr tail       -> pass attrName to visitor.visit() instead of nullptr

Add two private static PTree helpers (declared in jptree.ipp):
  - PTree::visitChildContainer() - expands CPTArray or visits single element
  - PTree::visitAllChildren()    - iterates ChildMap for // descent

VisitResult::Stop propagates immediately up the call stack.
VisitResult::SkipChildren is respected at the leaf visitNode level by not
recursing further (the caller controls recursion via the return value).

Note: checkPattern() still uses getElements() internally for complex
qualifiers ([childTag], [childTag=val]); this is acceptable for v1 and
will be addressed in a future stage using nested visit() calls.
SkipChildren fix:
  SkipChildren is now consumed at the level where it is returned, preventing
  it from propagating up to a sibling loop and incorrectly halting iteration.
  Fixed in visitChildContainer (array and single-element paths), in the [N]
  index case inside visit(), and in the visitContainer lambda array loop.
  Only Stop is allowed to propagate upward across container/sibling boundaries.

Stage 4 comment (in checkPattern):
  Documents how the two getElements()/getElements() iterator patterns inside
  checkPattern() can be replaced with nested visit() calls using lightweight
  boolean/comparison callback visitors, eliminating all iterator allocation
  from qualifier evaluation.

Stage 5 comment (before PTree::visit):
  Documents the XPathSegments pre-parsed structure design: parse xpath once
  into a vector of XPathSegment values, then walk by segment index on
  recursive calls to eliminate repeated string scanning for child nodes.
  Includes the proposed struct layout, parseXPath() signature, and the
  single-segment fast-path that avoids any allocation.
Signed-off-by: Gavin Halliday <gavin.halliday@lexisnexis.com>
Signed-off-by: Gavin Halliday <gavin.halliday@lexisnexis.com>
Signed-off-by: Gavin Halliday <gavin.halliday@lexisnexis.com>
Signed-off-by: Gavin Halliday <gavin.halliday@lexisnexis.com>
Signed-off-by: Gavin Halliday <gavin.halliday@lexisnexis.com>
Signed-off-by: Gavin Halliday <gavin.halliday@lexisnexis.com>
Signed-off-by: Gavin Halliday <gavin.halliday@lexisnexis.com>
Signed-off-by: Gavin Halliday <gavin.halliday@lexisnexis.com>
Signed-off-by: Gavin Halliday <gavin.halliday@lexisnexis.com>
Signed-off-by: Gavin Halliday <gavin.halliday@lexisnexis.com>
Copilot AI review requested due to automatic review settings March 3, 2026 21:02

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Introduces an IPropertyTree::visit() API and supporting visitor interface to traverse IPropertyTree nodes matching an xpath, with internal optimizations to avoid iterator allocations for common cases.

Changes:

  • Added VisitResult and IPropertyTreeVisitor, plus a new IPropertyTree::visit() method.
  • Implemented PTree::visit() with helpers for tags, qualifiers, wildcards, and // recursive descent.
  • Added ChildMap helpers to iterate buckets directly (wildcard/all) without allocating IPropertyTreeIterator.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File Description
system/jlib/jptree.hpp Adds visitor types (VisitResult, IPropertyTreeVisitor) and the new IPropertyTree::visit() API.
system/jlib/jptree.ipp Declares visit() in PTree and adds private helper declarations plus ChildMap visit helpers.
system/jlib/jptree.cpp Implements traversal logic (PTree::visit*, ChildMap::visit*) and refactors qualifier matching via MatchVisitor.

Comment thread system/jlib/jptree.cpp
Comment on lines +2930 to +2938
if (qualifier)
{
const char *q = qualifier + 1; // skip '['
if (isdigit(*q))
{
// Numeric [N] on an array is handled in the outer [N] case of visit();
// reaching here means no numeric qualifier applies per-element.
break;
}

Copilot AI Mar 3, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The numeric qualifier [N] on a tag segment that resolves to a CPTArray will result in visiting no elements because the loop breaks. Since visitWithTag() forwards the qualifier to visitMatchedNode() (and does not otherwise handle [N] on that segment), foo[3]-style xpaths against arrays won’t work. Parse the numeric index here and visit only the selected element (1-based), then continue evaluation on remainder and propagate Stop if requested.

Copilot uses AI. Check for mistakes.
Comment thread system/jlib/jptree.cpp
Comment on lines +3096 to +3123
// Scan past all chained qualifiers to find where the real remainder starts
const char *scan = remainder + 1;
for (;;)
{
char quote = 0;
while (']' != *scan || quote)
{
switch (*scan)
{
case '\"':
case '\'':
if (quote)
{
if (*scan == quote) quote = 0;
}
else
quote = *scan;
break;
case '\0':
throw MakeXPathException(_xpath, PTreeExcpt_XPath_ParseError, scan-_xpath, "Qualifier brace unclosed");
}
++scan;
}
++scan; // skip ']'
if ('[' != *scan)
break;
++scan; // skip '[' of next qualifier
}

Copilot AI Mar 3, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code explicitly supports chained qualifiers (e.g. foo[@a][@b]/...) when computing remainder, but only a single qualifier pointer (to the first [) is forwarded into visitMatchedNode(). As a result, only the first qualifier is evaluated and subsequent qualifiers are ignored. Consider passing the full qualifier span (from first [ up to remainder) and iterating through each [...] in visitMatchedNode() (or adding a helper that applies all qualifiers sequentially) so chained qualifiers behave as expected.

Suggested change
// Scan past all chained qualifiers to find where the real remainder starts
const char *scan = remainder + 1;
for (;;)
{
char quote = 0;
while (']' != *scan || quote)
{
switch (*scan)
{
case '\"':
case '\'':
if (quote)
{
if (*scan == quote) quote = 0;
}
else
quote = *scan;
break;
case '\0':
throw MakeXPathException(_xpath, PTreeExcpt_XPath_ParseError, scan-_xpath, "Qualifier brace unclosed");
}
++scan;
}
++scan; // skip ']'
if ('[' != *scan)
break;
++scan; // skip '[' of next qualifier
}
// Scan past the first qualifier to find where the remainder starts
const char *scan = remainder + 1;
char quote = 0;
while (']' != *scan || quote)
{
switch (*scan)
{
case '\"':
case '\'':
if (quote)
{
if (*scan == quote)
quote = 0;
}
else
quote = *scan;
break;
case '\0':
throw MakeXPathException(_xpath, PTreeExcpt_XPath_ParseError, scan-_xpath, "Qualifier brace unclosed");
}
++scan;
}
// 'scan' now points at the closing ']'; move remainder to first char after it
++scan;

Copilot uses AI. Check for mistakes.
Comment thread system/jlib/jptree.cpp
Comment on lines +3157 to +3161
// NULL or empty xpath — visit this node directly (no further navigation)
if (nullptr == xpath || '\0' == *xpath)
return visitor.visit(*this, nullptr);
if ('@' == *xpath)
return visitor.visit(*this, xpath);

Copilot AI Mar 3, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTree::visit() can return VisitResult::SkipChildren directly to callers in these base cases, but the new IPropertyTree::visit() contract in the header comment says it returns Stop for early stop and otherwise Continue. If SkipChildren is intended to be purely an internal traversal control, normalize it to Continue at the API boundary (including these base cases) to match the documented behavior.

Copilot uses AI. Check for mistakes.
Comment thread system/jlib/jptree.hpp
// attrName - nullptr for element matches; the attribute name (e.g. "@foo") when the
// xpath tail resolves to an attribute
interface jlib_decl IPropertyTreeVisitor
{

Copilot AI Mar 3, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IPropertyTreeVisitor is a polymorphic interface but does not declare a virtual destructor. If a visitor is ever deleted via an IPropertyTreeVisitor*, that would be undefined behavior. Add a virtual ~IPropertyTreeVisitor() = default; (or an equivalent protected virtual destructor) to make the interface safe for typical C++ usage.

Suggested change
{
{
virtual ~IPropertyTreeVisitor() = default;

Copilot uses AI. Check for mistakes.
Comment thread system/jlib/jptree.cpp
Comment on lines +3814 to +3817
// MatchVisitor encapsulates the parsing and evaluation of a single qualifier expression
// of the form used inside [...] in an xpath. Call matches() once; it parses the expression
// from xxpath, evaluates it against the supplied node, advances xxpath past the closing ']',
// and returns true if the node satisfies the qualifier.

Copilot AI Mar 3, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says matches() “advances xxpath past the closing ']'”, but the implementation sets path = xpath; where xpath is positioned at ']' (and callers like visitWithQualifier() explicitly check ']' after checkPattern()). Update the comment to reflect the actual contract (i.e., leaves the pointer at ']' for the caller to consume), or adjust the code to match the documented behavior.

Copilot uses AI. Check for mistakes.
@github-actions

github-actions Bot commented Mar 4, 2026

Copy link
Copy Markdown

Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-32908

Jirabot Action Result:
Assigning user: gavin.halliday@lexisnexisrisk.com
Workflow Transition To: Merge Pending
Updated PR

Signed-off-by: Gavin Halliday <gavin.halliday@lexisnexis.com>
Signed-off-by: Gavin Halliday <gavin.halliday@lexisnexis.com>
Signed-off-by: Gavin Halliday <gavin.halliday@lexisnexis.com>
@ghalliday ghalliday requested a review from Copilot March 4, 2026 17:09

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 9 comments.

Comment thread system/jlib/jptree.cpp
Comment on lines +3153 to +3177
// If the remainder is just '/' with nothing after, treat as empty
if ('/' == *remainder && '\0' == *(remainder+1))
remainder = remainder + 1;

if (!checkChildren())
return VisitResult::Continue;

if (wild)
{
// Wildcard — walk every bucket in the ChildMap directly (no iterator allocation)
VisitResult r = children->visitWild(id, isnocase(), qualifier, remainder, visitor);
if (r == VisitResult::Stop)
return VisitResult::Stop;
}
else
{
// Exact name lookup
IPropertyTree *child = children->query(id);
if (child)
{
VisitResult r = static_cast<PTree *>(child)->visitMatchedNode(qualifier, remainder, visitor);
if (r == VisitResult::Stop)
return VisitResult::Stop;
}
}

Copilot AI Mar 4, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

visitWithTag() passes remainder through to deeper visit() calls including the leading '/' separator for multi-segment xpaths (e.g. "a/b" produces remainder "/b"). In PTree::visit(), a leading '/' with root==true throws "Root specifier "/" is not supported", so normal multi-segment paths will fail. Fix by stripping a single leading '/' from remainder before recursing (but preserve the '//' case), and apply the same normalization for remainders produced after qualifier parsing.

Copilot uses AI. Check for mistakes.
Comment thread system/jlib/jptree.cpp
return r;
}
else if (root)
throw MakeXPathException(xpath, PTreeExcpt_XPath_Unsupported, 0, "Root specifier \"/\" is not supported");

Copilot AI Mar 4, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

visitWithTag() passes remainder through to deeper visit() calls including the leading '/' separator for multi-segment xpaths (e.g. "a/b" produces remainder "/b"). In PTree::visit(), a leading '/' with root==true throws "Root specifier "/" is not supported", so normal multi-segment paths will fail. Fix by stripping a single leading '/' from remainder before recursing (but preserve the '//' case), and apply the same normalization for remainders produced after qualifier parsing.

Suggested change
throw MakeXPathException(xpath, PTreeExcpt_XPath_Unsupported, 0, "Root specifier \"/\" is not supported");
{
// Treat a single leading '/' at the root as a normal path separator.
// This avoids rejecting multi-segment paths where recursion passes
// a remainder like "/b" into visit(), while preserving the '//' case above.
root = false;
if ('\0' == *xpath)
return visitor.visit(*this, nullptr);
goto restart;
}

Copilot uses AI. Check for mistakes.
Comment thread system/jlib/jptree.cpp
Comment on lines +2953 to +2957

if (']' == *q) q++; // skip ']'
}
else
matched = elem->newCheckPattern(q); // advances q past ']'

Copilot AI Mar 4, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MatchVisitor::matches() claims (and call sites assume) it advances the input pointer past the closing ], but it currently assigns path = xpath without consuming the ]. This breaks chained qualifiers for non-numeric expressions (e.g. tag[@a][@b]), because q won't advance to the next '[' and the loop will stop after the first qualifier. Make pointer advancement consistent: either (a) have matches() consume the closing ] before updating path, or (b) update all callers/comments to explicitly consume ] and continue scanning for the next qualifier.

Suggested change
if (']' == *q) q++; // skip ']'
}
else
matched = elem->newCheckPattern(q); // advances q past ']'
if (']' == *q) q++; // skip ']'
}
else
{
matched = elem->newCheckPattern(q); // evaluate non-numeric qualifier
// Ensure we always consume the closing ']' before looking for the next qualifier
if (matched && ']' == *q)
q++; // skip ']'
}

Copilot uses AI. Check for mistakes.
Comment thread system/jlib/jptree.cpp
Comment on lines +3841 to +3843
// of the form used inside [...] in an xpath. Call matches() once; it parses the expression
// from xxpath, evaluates it against the supplied node, advances xxpath past the closing ']',
// and returns true if the node satisfies the qualifier.

Copilot AI Mar 4, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MatchVisitor::matches() claims (and call sites assume) it advances the input pointer past the closing ], but it currently assigns path = xpath without consuming the ]. This breaks chained qualifiers for non-numeric expressions (e.g. tag[@a][@b]), because q won't advance to the next '[' and the loop will stop after the first qualifier. Make pointer advancement consistent: either (a) have matches() consume the closing ] before updating path, or (b) update all callers/comments to explicitly consume ] and continue scanning for the next qualifier.

Copilot uses AI. Check for mistakes.
Comment thread system/jlib/jptree.cpp
else
visit(node, nullptr);
}

Copilot AI Mar 4, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MatchVisitor::matches() claims (and call sites assume) it advances the input pointer past the closing ], but it currently assigns path = xpath without consuming the ]. This breaks chained qualifiers for non-numeric expressions (e.g. tag[@a][@b]), because q won't advance to the next '[' and the loop will stop after the first qualifier. Make pointer advancement consistent: either (a) have matches() consume the closing ] before updating path, or (b) update all callers/comments to explicitly consume ] and continue scanning for the next qualifier.

Suggested change
if (*xpath == ']')
xpath++;

Copilot uses AI. Check for mistakes.
Comment thread system/jlib/jptree.cpp
Comment on lines +3054 to +3055
if (']' != *xpath)
throw MakeXPathException(_xpath, PTreeExcpt_XPath_ParseError, xpath-_xpath, "Qualifier brace unclosed");

Copilot AI Mar 4, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error text says "Qualifier brace unclosed" but the syntax uses brackets [...]. Consider changing "brace" to "bracket" to make the message more precise and consistent with the XPath syntax being parsed.

Copilot uses AI. Check for mistakes.
Comment thread system/jlib/jptree.cpp
quote = *scan;
break;
case '\0':
throw MakeXPathException(_xpath, PTreeExcpt_XPath_ParseError, scan-_xpath, "Qualifier brace unclosed");

Copilot AI Mar 4, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error text says "Qualifier brace unclosed" but the syntax uses brackets [...]. Consider changing "brace" to "bracket" to make the message more precise and consistent with the XPath syntax being parsed.

Suggested change
throw MakeXPathException(_xpath, PTreeExcpt_XPath_ParseError, scan-_xpath, "Qualifier brace unclosed");
throw MakeXPathException(_xpath, PTreeExcpt_XPath_ParseError, scan-_xpath, "Qualifier bracket unclosed");

Copilot uses AI. Check for mistakes.
Comment thread system/jlib/jptree.cpp
{
// Multiple siblings in a CPTArray: check qualifier per-element, then recurse
unsigned n = val->elements();
unsigned indexCounts[8] = {0}; // Support up to 8 chained numeric qualifiers

Copilot AI Mar 4, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The limit 8 for chained numeric qualifiers is a magic number duplicated in multiple places. Consider introducing a named constant (e.g. constexpr unsigned maxChainedIndexQualifiers = 8;) to centralize the limit and avoid accidental divergence if it needs to change.

Copilot uses AI. Check for mistakes.
Comment thread system/jlib/jptree.cpp
StringAttr idxstr;
q = readIndex(q, idxstr);
unsigned requiredIdx = atoi(idxstr.get());
if (numQualIndex < 8)

Copilot AI Mar 4, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The limit 8 for chained numeric qualifiers is a magic number duplicated in multiple places. Consider introducing a named constant (e.g. constexpr unsigned maxChainedIndexQualifiers = 8;) to centralize the limit and avoid accidental divergence if it needs to change.

Copilot uses AI. Check for mistakes.
@GordonSmith GordonSmith force-pushed the master branch 3 times, most recently from e724e28 to cc22aa6 Compare May 23, 2026 07:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants