Fix C/C++ debug hover on intermediate members of a dereferenced expression #14540

Open
tieo wants to merge 5 commits into microsoft:main from tieo:cpp-debug-hover-leading-deref

Conversation

tieo commented Jun 23, 2026

Problem

While debugging C/C++, hovering an intermediate member of a */&-prefixed access chain (e.g. a or b in if (*a.b.c == X) where a.b is a struct) shows no value, and hovering a member that comes after an array subscript (e.g. c in a.b[i].c) shows no value either.

Cause

No EvaluatableExpressionProvider is registered for C/C++, so VS Code's built-in data-tip expression detection is used. It keeps the leading */& and clips on the right, so hovering b in *a.b.c evaluates *a.b — a dereference of the non-pointer member a.b — which errors. It also stops at [ and ], so hovering c in a.b[i].c evaluates a broken fragment after the ].

Fix

Register an EvaluatableExpressionProvider for C/C++. Once a provider is registered, VS Code uses it instead of its built-in detection (it does not fall back to the default when the provider returns nothing), so the provider reproduces the built-in behavior for ordinary tokens and additionally fixes the access-chain cases:

  • A leading */& applies to the whole access chain (the postfix ., ->, [] operators bind tighter). It is dropped for an interior member of a . chain, where *a.b would dereference the struct a.b, and kept on the final member and before -> (so *ptr->member is unchanged).
  • Array subscripts and :: are kept in the token, so members after a subscript (a.b[i].c) and scoped names (ns::var) resolve. Hovering a subscript bracket evaluates the indexed element; hovering inside [...] evaluates the index on its own.

The provider is registered during debugger activation (so it also works when cpptools IntelliSense is disabled), and the expression computation lives in a vscode-free module (evaluatableExpression.ts) with unit tests.

Note: I created the code in this PR with the help of an LLM, and tested it manually in VS Code plus with the added unit tests.

…enced members

VS Code's default debug data-tip keeps a leading `*`/`&` and clips on the right, so hovering
an intermediate member of e.g. `*a.b.c` evaluates `*a.b` (a dereference of the struct `a.b`)
and shows no value.

Register an EvaluatableExpressionProvider that, only for that case, returns the expression
without the leading operator. Every other expression returns undefined so the default
behavior is unchanged.
@tieo tieo requested a review from a team as a code owner June 23, 2026 10:35
@github-project-automation github-project-automation Bot moved this to Pull Request in cpptools Jun 23, 2026
@tieo

tieo commented Jun 23, 2026

Author

@microsoft-github-policy-service agree

Comment thread Extension/src/LanguageServer/client.ts Outdated
Comment thread Extension/src/LanguageServer/Providers/evaluatableExpressionProvider.ts Outdated
sean-mcmanus

This comment was marked as resolved.

@sean-mcmanus sean-mcmanus added this to the 1.33.2 milestone Jun 23, 2026
@tieo

tieo commented Jun 25, 2026

Author

Addressed the review feedback in the latest commit:

  • Moved the provider to Extension/src/Debugger/ and register it during debugger activation in Debugger/extension.ts instead of the language client.

On the Copilot suggestion to gate with line.charAt(clipEnd) === '.': I went a different way. The problem with that gate is it leaves *ptr->member doing what the default does, but that's already broken — -> has higher precedence than the unary *, so the expression is really *(ptr->member), and when you hover ptr you want to see ptr, not *ptr.

What it does now is keep the leading */& only when you're hovering the last segment of the chain, and drop it everywhere in the middle. That ends up being right whether the chain uses ., ->, or []. I also had to keep subscripts in the token because the default detection stops at the [, which is why hovering something like the c in a.b[i].c gave you nothing before. Function calls never get pulled in, so there's no risk of evaluating something with side effects.

This comment was marked as resolved.

sean-mcmanus

This comment was marked as resolved.

@tieo tieo force-pushed the cpp-debug-hover-leading-deref branch from 5433ec8 to b080bf6 Compare June 26, 2026 11:47
@tieo

tieo commented Jun 26, 2026

Author

Thanks — addressed the feedback and rewrote the description, since the old one no longer matched the code.

On the "returns undefined otherwise" point: that was the original intent, but it doesn't actually work. Once an EvaluatableExpressionProvider is registered, VS Code uses it instead of its built-in detection and does not fall back to the default when the provider returns undefined — the data-tip just shows nothing. I verified this by running the resolver's own logic: with a provider registered that returns undefined, it returns null (no tip); the built-in extraction only runs when no provider is registered at all. So the provider has to return a sensible expression for ordinary tokens, not undefined. I updated the description and comments to say this.
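
The resolution order described above can be modelled in a few lines. This is a minimal sketch under stated assumptions, not VS Code's source: `HoverProvider`, `builtInDetect` and `resolveHoverExpression` are hypothetical names, and the stand-in for the built-in detection is deliberately crude.

```typescript
// Simplified model of the resolution order described above. This is NOT VS Code's
// source; HoverProvider, builtInDetect and resolveHoverExpression are hypothetical names.
type HoverProvider = (line: string, column: number) => string | undefined;

// Stand-in for the built-in detection: the whitespace-delimited run under the column.
function builtInDetect(line: string, column: number): string | undefined {
    let start = column;
    while (start > 0 && !/\s/.test(line.charAt(start - 1))) { start--; }
    let end = column;
    while (end < line.length && !/\s/.test(line.charAt(end))) { end++; }
    return end > start ? line.slice(start, end) : undefined;
}

function resolveHoverExpression(providers: HoverProvider[], line: string, column: number): string | null {
    // The built-in path is taken only when no provider is registered at all.
    if (providers.length === 0) {
        const builtIn = builtInDetect(line, column);
        return builtIn === undefined ? null : builtIn;
    }
    // With at least one provider registered, an undefined result means "no tip";
    // there is no fallback to builtInDetect.
    for (const provider of providers) {
        const result = provider(line, column);
        if (result !== undefined) { return result; }
    }
    return null;
}
```

In this model a provider that always returns undefined suppresses the data-tip entirely, which is why the provider in this PR has to return an expression for ordinary tokens as well.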

On `::`: added to the tokenizer, so ns::var stays together.

On <= vs < in the hit-test: I kept <=. VS Code's own getExactExpressionStartAndEnd uses an inclusive end (l >= e), which maps to <= here — so <= matches the built-in detection, and < would diverge by failing to select a token when hovering its trailing edge.
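
To make the difference concrete, here is a small sketch of the two hit-test variants. `TokenSpan` and `findToken` are hypothetical names for illustration, with `end` assumed to be the offset one past the token's last character; the PR's actual code may be shaped differently.

```typescript
// Hypothetical span type: start is the first character's offset, end is one past the last.
interface TokenSpan { start: number; end: number; }

// Returns the first span that contains the column. With inclusiveEnd the column may sit
// on the token's trailing edge (column === end); without it that position misses the token.
function findToken(spans: TokenSpan[], column: number, inclusiveEnd: boolean): TokenSpan | undefined {
    for (const span of spans) {
        const beforeEnd = inclusiveEnd ? column <= span.end : column < span.end;
        if (span.start <= column && beforeEnd) { return span; }
    }
    return undefined;
}
```

For a three-character token spanning offsets 0 to 3, column 3 selects the token only in the inclusive variant.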

On tests: moved the computation into a vscode-free module (evaluatableExpression.ts) and added unit tests covering *a.b.c interior vs final, ->, ::, subscripts (a.b[i].c, bracket vs index), and the no-token case.

@sean-mcmanus sean-mcmanus self-requested a review June 26, 2026 16:14
Comment thread Extension/test/unit/evaluatableExpression.test.ts Fixed

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comment thread Extension/src/Debugger/evaluatableExpression.ts Outdated
Comment thread Extension/test/unit/evaluatableExpression.test.ts
sean-mcmanus

This comment was marked as resolved.

Register the EvaluatableExpressionProvider during debugger activation
instead of through the language client, so it also works when IntelliSense
is disabled, and move it next to the other debugger code.

The expression computation lives in a vscode-free module
(evaluatableExpression.ts) so it can be unit tested directly. Registering
a provider replaces VS Code's built-in data-tip expression detection, so
it reproduces that for ordinary tokens and additionally handles access
chains the built-in detection gets wrong for C/C++:

- A leading */& binds to the whole access chain (the postfix ., ->, []
  operators bind tighter), so it is dropped for an interior member of a
  dot chain, where *a.b would dereference the struct a.b, and kept on the
  final member and before ->.

- Array subscripts and :: are kept in the token, so members after a
  subscript (a.b[i].c) and scoped names (ns::var) resolve instead of
  producing a broken fragment.
@tieo tieo force-pushed the cpp-debug-hover-leading-deref branch from b080bf6 to e21453c Compare June 29, 2026 08:42
The leading */& is kept before -> / [] and at the end of a chain because
those left operands are provably pointers/arrays, so *ptr->m, *a.b[i] and
&a[i] evaluate without error; it is dropped only before . where the left
operand may be a struct. Documents these keep-leading outcomes as deliberate
per review feedback.
@tieo

tieo commented Jun 29, 2026

Author

Thanks for the detailed review. I've addressed the remaining points.

Connector-leading fragments. The provider now declines these explicitly: after any leading */&, if the token begins with . or ->, it returns undefined. So foo().bar, (*this).member, and obj->fn()->field no longer produce .bar, .member, or ->field. I added regression tests for all three. I guard at the start of the token rather than anchoring the regex to an identifier, because that also keeps a legitimate leading :: global-scope qualifier (e.g. ::var) working, which an identifier anchor would reject.
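
The guard described above might look roughly like the following. This is a sketch, not the code in the PR: `declineConnectorLeading` is a hypothetical name, and it assumes the token has already been extracted as a string with any leading `*`/`&` still attached.

```typescript
// Hypothetical helper: returns the token unchanged, or undefined when, after skipping any
// leading * or & operators, the token starts with a connector (. or ->).
function declineConnectorLeading(token: string): string | undefined {
    const body = token.replace(/^[*&]+/, "");
    if (/^(\.|->)/.test(body)) {
        // Fragment such as ".bar" left over from foo().bar: nothing sensible to evaluate.
        return undefined;
    }
    // A leading :: (global scope) is not a connector here, so ::var passes through.
    return token;
}
```

Checking the start of the token, rather than requiring an identifier there, is what lets `::var` through while `.bar` and `->field` are declined.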

The .-only drop of the leading operator. This is intentional, and I've added a comment explaining why. . is the only connector whose left operand is not provably a pointer or array, so keeping the leading operator there (*a.b) would dereference a struct and error. Before -> or [], and at the end of the chain, the left operand is a pointer or array, so the leading operator stays valid. That makes *ptr->member, and deliberately *a.b[i] and &a[i], evaluate without error (the array decays); they show the element or address rather than the hovered operand, which matches the built-in keep-leading-and-clip-right behavior.

CodeQL. The test helper now slices at the marker index instead of using String.replace, and I tightened the tokenizer to avoid a nested-quantifier pattern that the ReDoS query flags.
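
As an illustration of slicing at the marker index, a test helper could be written along these lines. The name `splitAtMarker` and the `|` cursor marker are assumptions made for this sketch; the helper in the PR's test file may use a different marker and shape.

```typescript
// Hypothetical test helper: the input contains one marker showing the hover position.
// Returns the line with the marker removed and the column where the marker was.
function splitAtMarker(textWithMarker: string, marker: string = "|"): { line: string; column: number } {
    const column = textWithMarker.indexOf(marker);
    if (column < 0) {
        throw new Error("marker not found in test input");
    }
    // Slice around the marker index instead of calling String.replace on the whole text.
    const line = textWithMarker.slice(0, column) + textWithMarker.slice(column + marker.length);
    return { line, column };
}
```

For example, `splitAtMarker("*a.|b.c")` yields the line `*a.b.c` with column 3, the offset of `b`.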

On the optional suggestion to require an identifier start so numeric literals like 1.5 or 0x1F aren't tokenized as member chains: I left this as-is, since those evaluate without error in a data-tip and match the built-in detection. Happy to add it if you'd prefer the stricter parity.

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment thread Extension/src/Debugger/evaluatableExpression.ts Outdated
tieo added 2 commits June 29, 2026 10:59
Inside [...] the token spans whitespace and operators (e.g. a[i + j]), so
returning the nearest identifier evaluated the wrong expression when the
cursor was on an operator or space. Return undefined unless the cursor is on
the index identifier itself, and add tests.

Replace the token regex with a manual scanner so nested subscripts (a[b[i]])
stay in one token, fix hovering past the last identifier truncating a trailing
subscript, and decline non-token positions inside [...] (operators/whitespace).
Keep the leading * only on the final dereferenced segment (including a final
subscript element like *a.b[i]); drop it on interior segments and drop a leading
& always. Removes the previous regex (and its ReDoS surface).
@tieo

tieo commented Jun 29, 2026

Author

Pushed a follow-up that reworks the token detection after testing against real debug sessions. Summary of what changed and why:

Balanced-bracket scanner instead of a regex. The token is now found by a small manual scanner rather than tokenRegExp. This fixes nested subscripts (a[b[i]], which the regex split at the inner [), and it removes the previous regex entirely — so the earlier js/redos concern is moot.
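
A balanced-bracket scanner of the kind described above can be sketched as follows. This is an illustration only, not the scanner in evaluatableExpression.ts: `scanTokens` and `ScanSpan` are hypothetical names, leading `*`/`&` handling is left out, and an unbalanced `[` is assumed to simply run to the end of the line.

```typescript
// Hypothetical span type: end is the offset one past the token's last character.
interface ScanSpan { start: number; end: number; }

function isWordChar(ch: string): boolean {
    return /[A-Za-z0-9_]/.test(ch);
}

// Splits a line into access-chain tokens: identifiers joined by ., -> and ::, plus
// subscripts whose brackets are tracked with a depth counter so a[b[i]] stays whole.
function scanTokens(line: string): ScanSpan[] {
    const spans: ScanSpan[] = [];
    let i = 0;
    while (i < line.length) {
        const start = i;
        let depth = 0;
        while (i < line.length) {
            const ch = line.charAt(i);
            const two = line.slice(i, i + 2);
            if (depth > 0) {
                // Inside a subscript: accept everything, only tracking bracket depth.
                if (ch === "[") { depth++; } else if (ch === "]") { depth--; }
                i++;
            } else if (ch === "[" && i > start) {
                // A subscript may continue a token but never start one.
                depth++;
                i++;
            } else if (two === "->" || two === "::") {
                i += 2;
            } else if (isWordChar(ch) || ch === ".") {
                i++;
            } else {
                break;
            }
        }
        if (i > start) {
            spans.push({ start, end: i });
        } else {
            i++; // Not part of any token; skip this character.
        }
    }
    return spans;
}
```

Because the depth counter only returns to zero at the matching `]`, the inner `[` in `a[b[i]]` no longer ends the token the way a flat character-class regex would.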

Leading-operator rule, corrected. My earlier comment described *a.b[i] / &a[i] keeping the leading operator as "deliberate." Testing in a real debugger showed that was wrong: *a.b[i] dereferences the array base (element 0) and &a[i] shows an address, neither of which is what you hovered. The rule is now:

  • A leading * is kept only on the final dereferenced segment: *a.b.c (final member), and the final subscript element *a.b[i] (the dereferenced element you're pointing at).
  • It's dropped on any interior segment: hovering b in *a.b.c → a.b, and b in *a.b[i] → a.b (the array, not *a.b).
  • A leading & is always dropped, so hovering the variable shows its value, not its address.
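
The three rules above can be condensed into one small function. This is a sketch under assumptions rather than the PR's implementation: `expressionForHover` is a hypothetical name, `token` is the full chain including any leading operators, and `hoveredEnd` is the offset within `token` just past the hovered segment (which the caller is assumed to compute correctly).

```typescript
// Hypothetical helper: clips the token on the right at the hovered segment and decides
// whether the leading operator survives.
function expressionForHover(token: string, hoveredEnd: number): string {
    const match = /^[*&]+/.exec(token);
    const leading = match ? match[0] : "";
    const clipped = token.slice(leading.length, hoveredEnd);
    const isFinalSegment = hoveredEnd >= token.length;
    // Keep the leading operator only if it is purely '*' and the hover is on the final
    // segment; interior segments lose it, and any '&' causes it to be dropped.
    const keepLeading = isFinalSegment && /^\*+$/.test(leading);
    return (keepLeading ? leading : "") + clipped;
}
```

With `*a.b.c`, an offset of 4 (just past `b`) gives `a.b`, while an offset of 6 (the end of the token) gives `*a.b.c`; with `&a[i]`, the `&` is dropped at every position.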

Subscript edge cases. Non-token positions inside [...] (operators/whitespace, e.g. a[i + j]) return undefined; hovering past the last identifier no longer truncates a trailing subscript.
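
The "decline unless the cursor is on an identifier" check for positions inside `[...]` could be sketched like this. `identifierAt` is a hypothetical name, and the sketch assumes identifiers consist of ASCII letters, digits and underscores only.

```typescript
// Hypothetical helper: returns the identifier under the column, or undefined when the
// column is on an operator, whitespace, a bracket, or outside the text.
function identifierAt(text: string, column: number): string | undefined {
    const isWord = (ch: string): boolean => /[A-Za-z0-9_]/.test(ch);
    if (column < 0 || column >= text.length || !isWord(text.charAt(column))) {
        return undefined;
    }
    let start = column;
    while (start > 0 && isWord(text.charAt(start - 1))) { start--; }
    let end = column + 1;
    while (end < text.length && isWord(text.charAt(end))) { end++; }
    return text.slice(start, end);
}
```

In `a[i + j]`, the columns of `i` and `j` return those identifiers, while the `+` and the spaces around it return undefined instead of falling back to the nearest identifier.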

Unit tests cover all of the above (interior vs final *, &, nested subscripts, the index, connector-leading fragments, and the operator/whitespace cases).

Separately: while testing I hit a data-tip rendering issue where a uint32_t scalar shows as an address (the memory icon reads the value as an address). That turned out to be in the debug adapter's memoryReference handling, not expression detection — cpptools' TypeScript never sets memoryReference, so it's out of scope for this PR. I'll raise it separately.

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

@sean-mcmanus sean-mcmanus modified the milestones: 1.33.3, 1.33.4 Jun 29, 2026
4 participants