[cDAC] Stackwalk DacDbi APIs#129091
Conversation
|
Tagging subscribers to this area: @steveisok, @tommcdon, @dotnet/dotnet-diag |
There was a problem hiding this comment.
Pull request overview
This PR expands the cDAC StackWalk and Debugger contracts to support runtime hijack-stub detection and hijacked-context recovery, updates context retrieval to use caller-provided aligned buffers with HRESULT-based error reporting, and wires up corresponding DacDbi stack-walk APIs with updated tests and documentation.
Changes:
- Add Debugger contract support for hijack stub range table (
RgHijackFunction+MemoryRange) and implementIDebugger.IsRuntimeUnwindableStub. - Extend StackWalk contract with hijack-context recovery and seed-context stack-walk support; change
GetContextto fill a caller-provided, 16-byte-aligned buffer and return HRESULTs. - Implement DacDbi stack-walk methods (create/delete/set/get/unwind/check) on top of the cDAC contracts, with test and dump-test updates for alignment and signature changes.
Show a summary per file
| File | Description |
|---|---|
| src/native/managed/cdac/tests/UnitTests/DebuggerTests.cs | Adds unit tests and mock-target plumbing for hijack-stub detection via RgHijackFunction/MemoryRange. |
| src/native/managed/cdac/tests/UnitTests/DacDbiImplTests.cs | Updates tests to match CheckContext signature change (byte* instead of nint). |
| src/native/managed/cdac/tests/TestInfrastructure/TestPlaceholderTarget.cs | Updates TryGetThreadContext override to return HRESULT (int). |
| src/native/managed/cdac/tests/DumpTests/StackWalkDumpTests.cs | Updates dump test to use aligned native memory and validate HRESULT-based GetContext. |
| src/native/managed/cdac/tests/DumpTests/DacDbi/DacDbiStackWalkDumpTests.cs | Updates dump tests to use aligned buffers when calling contract GetContext. |
| src/native/managed/cdac/tests/DataGenerator/TestTarget.cs | Updates TryGetThreadContext override to return HRESULT (int). |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader/ContractDescriptorTarget.cs | Changes TryGetThreadContext to return HRESULT from the underlying data-target delegate. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Legacy/Dbi/IDacDbiInterface.cs | Updates COM surface signatures for stack-walk context pointers (byte*) and adds CorDebugSetContextFlags. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Legacy/Dbi/Helpers/StackWalkHandleData.cs | Introduces managed handle state for cDAC-backed DacDbi stack walking. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Legacy/Dbi/DacDbiImpl.cs | Implements DacDbi stack-walk APIs using cDAC StackWalk + Debugger contracts, with optional legacy mirroring/debug validation. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Microsoft.Diagnostics.DataContractReader.Contracts.csproj | Adds System.HResults to enable HRESULT constants in contracts. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/DataType.cs | Adds DataType.MemoryRange for hijack table entries. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/MemoryRange.cs | Adds cDAC MemoryRange data descriptor type (start + size). |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/Debugger.cs | Adds RgHijackFunction field for hijack function range table pointer. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/StackWalk/StackWalk_1.cs | Adds seed-context CreateStackWalk, hijacked-context recovery, HRESULT-buffer GetContext, and aligned scratch usage for OS context retrieval. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/StackWalk/Context/IPlatformAgnosticContext.cs | Adds ContextAlignment constant (16) for CONTEXT buffer alignment requirements. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/Debugger_1.cs | Implements IDebugger.IsRuntimeUnwindableStub using hijack range table scanning. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Constants.cs | Adds MaxHijackFunctions global name used by Debugger hijack-table scanning. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Abstractions/Target.cs | Changes TryGetThreadContext to return HRESULT instead of bool (API contract shift). |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Abstractions/Contracts/IStackWalk.cs | Adds seed-context CreateStackWalk overload and changes GetContext to HRESULT + caller buffer; adds RetrieveHijackedContext. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Abstractions/Contracts/IDebugger.cs | Adds IsRuntimeUnwindableStub API to the Debugger contract abstraction. |
| src/coreclr/vm/datadescriptor/datadescriptor.inc | Exposes Debugger RgHijackFunction, introduces MemoryRange type descriptor, and adds MaxHijackFunctions global literal. |
| src/coreclr/inc/memoryrange.h | Adds cDAC exposure (cdac_data<MemoryRange>) for MemoryRange field offsets. |
| src/coreclr/debug/ee/debugger.h | Annotates hijack enum constant dependency; exposes hijack table pointer and constant via cdac_data<Debugger>. |
| src/coreclr/debug/di/rsstackwalk.cpp | Avoids fetching current context after unwind when already at end-of-stack. |
| docs/design/datacontracts/StackWalk.md | Updates contract documentation for new StackWalk APIs and aligned, caller-buffer GetContext. |
| docs/design/datacontracts/Debugger.md | Documents IsRuntimeUnwindableStub and the hijack-table globals/data descriptors. |
Copilot's findings
- Files reviewed: 27/27 changed files
- Comments generated: 6
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
| handleData = new StackWalkHandleData(_target.Contracts.StackWalk, threadData); | ||
| SeedHandleFromNativeContext(handleData, pInternalContextBuffer, isFirst: true); | ||
| *ppSFIHandle = handleData.GetHandle(); | ||
| } | ||
| catch (System.Exception ex) | ||
| { | ||
| hr = ex.HResult; | ||
| } |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
| | --- | --- | --- | --- | | ||
| | `ExceptionFlags` (`exstatecommon.h`) | `Ex_UnwindHasStarted` | `0x00000004` | Bit flag in `ExceptionInfo.ExceptionFlags` indicating exception unwinding (2nd pass) has started. Used by `IsInStackRegionUnwoundBySpecifiedException` to skip ExInfo trackers still in the 1st pass. | | ||
| | `InlinedCallFrameMarker` (`exceptionhandling.h`) | `ExceptionHandlingHelper` | `2 (64-bit), 1(32-bit)` | Used to determine whether an active call on an InlinedCallFrame is an EH helper. | | ||
| | `Debugger::s_hijackFunction` (`debugger.h`) | `UnhandledExceptionHijackIndex` | `0` | Index of the unhandled-exception hijack entry in the runtime's hijack-function table. Used by `RetrieveHijackedContext` to choose between the SP-based and FP/SP-offset-based CONTEXT recovery paths, and by `IDebugger.IsRuntimeUnwindableStub` to set its `isUnhandledException` out parameter. | |
| | `CLRJitAttachState` | TargetPointer | Pointer to the CLR JIT attach state flags | | ||
| | `CORDebuggerControlFlags` | TargetPointer | Pointer to `g_CORDebuggerControlFlags` | | ||
| | `MetadataUpdatesApplied` | TargetPointer | Pointer to the g_metadataUpdatesApplied flag | | ||
| | `MaxHijackFunctions` | uint32 | Number of entries in the hijack function array. Zero on platforms/configurations where the hijack-function table is not present. | |
| /// <returns>true if successful, false otherwise</returns> | ||
| public abstract bool TryGetThreadContext(ulong threadId, uint contextFlags, Span<byte> buffer); | ||
| /// <returns>HRESULT indicating success or failure</returns> | ||
| public abstract int TryGetThreadContext(ulong threadId, uint contextFlags, Span<byte> buffer); |
There was a problem hiding this comment.
I'd like to be consistent about the return type across the reading APIs. Is there a reason we need change behavior on E_NOTIMPL vs another failure code?
There was a problem hiding this comment.
Yes. If the failure code is E_NOTIMPL it means we have to try again via the Frames. If the failure code is anything else (for example, E_FAIL from improperly aligned bytes), we should not retry.
There was a problem hiding this comment.
Are there actual cases when GetThreadContext will be E_NOTIMPL? Or can we always fall back on failure? I'm wondering if we can simplify this.
There was a problem hiding this comment.
Yes, I tried to remove the E_NOTIIMPL case and then we realized it was required for non-Windows. #128499
| // Required alignment, in bytes, of any memory buffer passed to Win32 GetThreadContext | ||
| // (or its cross-platform equivalents) as a CONTEXT*. | ||
| public const uint ContextAlignment = 16; |
There was a problem hiding this comment.
I'm not sure the cDAC should care about this. I think the underlying reader should account for this.
There was a problem hiding this comment.
How do you think the reader should account for this? Would you have the reader allocate aligned memory and then copy? I believe the allocator of context-sized bytes should be responsible for aligning them, as it currently is by wrapping the CONTEXT structs in declspec(16).
|
|
||
| return RunStackWalk(context, state, frameIterator, threadData, isFirst, matchedIsFirst, matchedIsInterrupted, skipFrames); | ||
| } | ||
| private IEnumerable<IStackDataFrameHandle> RunStackWalk( |
There was a problem hiding this comment.
nit: missing space between methods
| FrameIterator frameIterator, | ||
| ThreadData threadData, | ||
| bool initialIsFirst = true, | ||
| bool? isFirstOverride = null, |
There was a problem hiding this comment.
Why do we need this override? Can't the caller just set initialIsFirst = false?
| bool initialIsFirst = true, | ||
| bool? isFirstOverride = null, | ||
| bool? isInterruptedOverride = null, | ||
| bool skipFrames = true) |
There was a problem hiding this comment.
I'm not sure I understand this/what it mirrors in the native SFI. It seems to only contract checking for the first skipped frame, not interior skipped frames, and I don't see how the DacDbi stackwalk differs in this behavior compared to the IXCLR stackwalk.
| if (handleData.IsValid) | ||
| { | ||
| IStackWalk sw = _target.Contracts.StackWalk; | ||
| byte[] context = sw.GetRawContext(handleData.Current!); |
There was a problem hiding this comment.
Lets avoid using the null-forgiving operator by adding a https://learn.microsoft.com/en-us/dotnet/api/system.diagnostics.codeanalysis.membernotnullwhenattribute?view=net-10.0 attribute to IsValid
| public void Reset(byte[] contextBuffer, bool isFirst) | ||
| { | ||
| _enumerator?.Dispose(); | ||
| _enumerator = _stackWalk.CreateStackWalk(_threadData, contextBuffer, isFirst, false).GetEnumerator(); |
There was a problem hiding this comment.
Instead of having the separate AdvanceForUnwind loop, can we use an IEnumerable filter?
_enumerator = _stackWalk
.CreateStackWalk(_threadData, contextBuffer, isFirst, skipFrames: false)
.Where(h => h.State is not (StackWalkState.Frame or StackWalkState.SkippedFrame))
.GetEnumerator();| byte[] contextBuf = new Span<byte>(pContext, (int)contextSize).ToArray(); | ||
| handleData.Reset(contextBuf, isFirst); |
There was a problem hiding this comment.
The underlying calls here don't require a byte[], this is eventually fed into IPlatformAgnosticContext.FillFromBuffer(Span<byte> buffer). Can we change the intermediary calls to accept a Span so we don't need to allocate?
| } | ||
|
|
||
| byte[] IStackWalk.GetContext(ThreadData threadData, ThreadContextSource contextSource, uint contextFlags) | ||
| int IStackWalk.GetContext(ThreadData threadData, ThreadContextSource contextSource, uint contextFlags, Span<byte> contextBuffer) |
There was a problem hiding this comment.
I don't think we have any other contract layer APIs that use HResults. I'm wondering if we can avoid HResults in the pure managed code, if we can't I think we should create a typed enum to better express the meaning and avoid magic constants.
There was a problem hiding this comment.
On second thought, I don't think we need to deal with the span here. The only caller that uses the span still converts it to an array with SeedHandleFromNativeContext.
There was a problem hiding this comment.
What do you mean by "deal with the span"?
There was a problem hiding this comment.
We could keep this API returning a byte[] and copy into the user provided span inside of the DacDbiImpl.cs
There was a problem hiding this comment.
We could also have it return a caller-allocated Span<> and not have to copy.
…er.Abstractions/Contracts/IStackWalk.cs Co-authored-by: Max Charlamb <44248479+max-charlamb@users.noreply.github.com>
| IPlatformAgnosticContext context = IPlatformAgnosticContext.GetContextForPlatform(_target); | ||
| byte[] bytes = new byte[context.Size]; | ||
| Span<byte> buffer = new Span<byte>(bytes); | ||
| if (contextBuffer.Length < context.Size) | ||
| return unchecked((int)0x80070057); // E_INVALIDARG | ||
| Span<byte> buffer = contextBuffer.Slice(0, (int)context.Size); |
|
|
||
| // Fall back to deriving a context from the explicit Frame chain stored in the Thread object. | ||
| return GetContextFromFrames(threadData); | ||
| else if ((uint)hr == 0x80004001 /* E_NOTIMPL */) | ||
| { | ||
| WriteContextFromFrames(threadData, buffer); | ||
| return 0; | ||
| } | ||
| return hr; |
| IEnumerable<IStackDataFrameHandle> IStackWalk.CreateStackWalk(ThreadData threadData, byte[] contextBuffer, bool isFirst, bool skipFrames) | ||
| { | ||
| IPlatformAgnosticContext context = IPlatformAgnosticContext.GetContextForPlatform(_target); | ||
| context.FillFromBuffer(contextBuffer); | ||
|
|
| // CONTEXT requires 16-byte alignment for the OS GetThreadContext path | ||
| void* scratch = NativeMemory.AlignedAlloc(context.Size, IPlatformAgnosticContext.ContextAlignment); | ||
| try | ||
| { | ||
| Span<byte> buffer = new(scratch, (int)context.Size); |
| @@ -84,7 +84,8 @@ public interface IStackWalk : IContract | |||
| bool IsExceptionHandlingHelperInlinedCallFrame(TargetPointer frameAddress) => throw new NotImplementedException(); | |||
| DebuggerEvalData GetDebuggerEvalData(TargetPointer funcEvalFrameAddress) => throw new NotImplementedException(); | |||
| TargetPointer GetRedirectedContextPointer(ThreadData threadData) => throw new NotImplementedException(); | |||
| byte[] GetContext(ThreadData threadData, ThreadContextSource contextSource, uint contextFlags) => throw new NotImplementedException(); | |||
| int GetContext(ThreadData threadData, ThreadContextSource contextSource, uint contextFlags, Span<byte> contextBuffer) => throw new NotImplementedException(); | |||
| byte[] RetrieveHijackedContext(IStackDataFrameHandle stackDataFrameHandle, bool isUnhandledException) => throw new NotImplementedException(); | |||
| } | |||
|
|
||
| public override bool TryGetThreadContext(ulong threadId, uint contextFlags, Span<byte> bufferToFill) => throw new NotImplementedException(); | ||
| public override int TryGetThreadContext(ulong threadId, uint contextFlags, Span<byte> bufferToFill) => throw new NotImplementedException(); | ||
|
|
IsRuntimeUnwindableStubDebugger cDAC APIRetrieveHijackedContextStackWalk cDAC APICreateStackWalkthat accepts an arbitrary seed contextTryGetThreadContextHRESULT for more accurate fallback handlingCreateStackWalk,DeleteStackWalk,SetStackWalkCurrentStackFrame,GetStackWalkCurrentStackFrame, andUnwindStackWalkFrameNote to Copilot code revise: Please DO NOT make ANY comments about breaking public APIs!!!