Compile runtime async versions of synchronous task-returning methods#128384
Compile runtime async versions of synchronous task-returning methods#128384jakobbotsch wants to merge 95 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
This PR extends the JIT↔EE async infrastructure so the JIT can compile a dedicated “runtime async version” of synchronous Task/ValueTask-returning methods, including a tail-position optimization that turns eligible tail calls/returns into runtime-async awaits.
Changes:
- Adds a new JIT↔EE interface API (
getAwaitReturnCall) and plumbs it through SuperPMI and generated wrappers/shims. - Updates CoreCLR async thunk IL generation and
AsyncHelpersto separate “suspend” helpers from typedTransparentAwait(...)helpers used by the JIT. - Updates the JIT importer to recognize “async-version tail await” patterns and to wrap async-version returns in an await via the new EE callback.
Reviewed changes
Copilot reviewed 33 out of 33 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| src/tests/async/reflection/reflection.cs | Disables several assertions related to reflection/stack behavior for await-on-task-returning paths. |
| src/coreclr/vm/prestub.cpp | Adjusts IL header retrieval logic for async variant methods. |
| src/coreclr/vm/method.hpp | Changes IL-header eligibility rules for async/thunk/return-dropping methods; exposes GetAsyncThunkResultTypeSig. |
| src/coreclr/vm/metasig.h | Adds metasig entries for Task/ValueTask TransparentAwait helper signatures. |
| src/coreclr/vm/jitinterface.h | Adds helper declarations related to runtime lookup computation for new await-return support. |
| src/coreclr/vm/jitinterface.cpp | Implements getAwaitReturnCall and runtime-lookup computation for generic await helpers; sets CORINFO_ASYNC_VERSION for applicable methods. |
| src/coreclr/vm/corelib.h | Adds/updates CoreLibBinder entries for new suspend/await AsyncHelpers methods with proper signatures. |
| src/coreclr/vm/asyncthunks.cpp | Switches thunk-emitted calls from TransparentAwait* to TransparentSuspendFor* helpers. |
| src/coreclr/tools/superpmi/superpmi/icorjitinfo.cpp | Records/replays the new getAwaitReturnCall API for SuperPMI. |
| src/coreclr/tools/superpmi/superpmi-shim-simple/icorjitinfo_generated.cpp | Forwards the new API in the simple shim. |
| src/coreclr/tools/superpmi/superpmi-shim-counter/icorjitinfo_generated.cpp | Counts/forwards the new API in the counter shim. |
| src/coreclr/tools/superpmi/superpmi-shim-collector/icorjitinfo.cpp | Records the new API in the collector shim. |
| src/coreclr/tools/superpmi/superpmi-shared/spmirecordhelper.h | Makes lookup restore helpers take const& and updates corresponding implementations. |
| src/coreclr/tools/superpmi/superpmi-shared/methodcontext.h | Adds recording/replay plumbing for getAwaitReturnCall. |
| src/coreclr/tools/superpmi/superpmi-shared/methodcontext.cpp | Implements record/dump/replay for getAwaitReturnCall. |
| src/coreclr/tools/superpmi/superpmi-shared/lwmlist.h | Registers the new lightweight-map packet for GetAwaitReturnCall. |
| src/coreclr/tools/superpmi/superpmi-shared/agnostic.h | Adds agnostic structs for CORINFO_LOOKUP* and the await-return result payload. |
| src/coreclr/tools/Common/TypeSystem/IL/Stubs/AsyncThunks.cs | Updates IL stub emitter to use TransparentSuspendFor* helper names. |
| src/coreclr/tools/Common/JitInterface/ThunkGenerator/ThunkInput.txt | Adds the new interface method to thunk generation input. |
| src/coreclr/tools/Common/JitInterface/CorInfoImpl.cs | Adds a stub managed implementation of getAwaitReturnCall for the managed JIT interface. |
| src/coreclr/tools/Common/JitInterface/CorInfoImpl_generated.cs | Adds unmanaged callback plumbing for getAwaitReturnCall. |
| src/coreclr/tools/aot/jitinterface/jitinterface_generated.h | Adds the new callback and wrapper method to the AOT jitinterface wrapper. |
| src/coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.CoreCLR.cs | Renames “suspend” helpers and adds typed TransparentAwait(...) overloads used by the JIT. |
| src/coreclr/jit/importercalls.cpp | Adds tail-await handling for async-version tail-await prefix and blocks inlining of async-version callees. |
| src/coreclr/jit/importer.cpp | Adds async-version tail-call recognition, wraps async-version returns in an await via getAwaitReturnCall, and introduces impWrapTopOfStackInAwait. |
| src/coreclr/jit/ICorJitInfo_wrapper_generated.hpp | Adds wrapper forwarding for getAwaitReturnCall. |
| src/coreclr/jit/ICorJitInfo_names_generated.h | Adds name entry for getAwaitReturnCall. |
| src/coreclr/jit/fginline.cpp | Minor whitespace change near async flag handling for inlinee compilation. |
| src/coreclr/jit/compiler.h | Introduces PREFIX_IS_ASYNC_VERSION_TAIL_AWAIT, impWrapTopOfStackInAwait, and compIsAsyncVersion(). |
| src/coreclr/jit/compiler.cpp | Adds verbose printing when compiling an async-version method. |
| src/coreclr/inc/jiteeversionguid.h | Updates JIT↔EE version GUID due to interface change. |
| src/coreclr/inc/icorjitinfoimpl_generated.h | Adds getAwaitReturnCall override to the generated ICorJitInfo impl header. |
| src/coreclr/inc/corinfo.h | Adds CORINFO_ASYNC_VERSION and the new ICorStaticInfo::getAwaitReturnCall method. |
|
Another pattern we may want to recognize: |
| { 11, CEE_CALL}, | ||
| { 16, CEE_CALL }, | ||
| { 19, CEE_ILLEGAL } // End marker | ||
| { 21, CEE_ILLEGAL } // End marker |
There was a problem hiding this comment.
Not sure if I am missing something or if we really have no coverage of this case -- as far as I can see we use this value to figure out how many IL bytes to skip after the pattern match, so this would almost certainly cause problems.
| public override bool IsInitLocals => _ecmaIL.IsInitLocals; | ||
| public override int MaxStack => _ecmaIL.MaxStack; | ||
|
|
||
| public static MethodIL GetWrappedIfAsyncVersion(MethodDesc method, Compilation compilation) |
There was a problem hiding this comment.
Could this logic be moved into the NativeAotILProvider.GetMethodIL() and ReadyToRunILProvider.GetMethodIL()? Or do we run into issues elsewhere if we do that?
There was a problem hiding this comment.
This was on @MichalStrehovsky's suggestion -- since the IL from the synchronous version is mistyped (returns a Task<T> when the async variant expects T) we did not want this to leak out of the JIT-EE and IL scanner.
| private static T TransparentAwaitWithResult<T>(ValueTask<T> task) | ||
| { | ||
| if (!task.IsCompleted) | ||
| { | ||
| TailAwait(); | ||
| return TransparentAwaitValueTaskOfT(task); | ||
| } | ||
|
|
||
| return task.Result; | ||
| } |
There was a problem hiding this comment.
I want to rename these in a follow-up -- we have TransparentAwait already that always suspends and TransparentAwaitWithResult that does a check and then suspends. I think the "always suspend" variant should be called TransparentSuspend instead of TransparentAwait, and this version should be called TransparentAwait to match what the await keyword does. But I want to do that separately since this PR is already large.
| _dependencies.Add(_factory.MethodEntrypoint(asyncHelpers.GetKnownMethod("FinishSuspensionNoContinuationContext"u8, null)), asyncReason); | ||
| _dependencies.Add(_factory.MethodEntrypoint(asyncHelpers.GetKnownMethod("FinishSuspensionWithContinuationContext"u8, null)), asyncReason); | ||
| } | ||
|
|
There was a problem hiding this comment.
Previously this relied on seeing a function call with async calling convention -- normally AsyncHelpers.Await -- but for the new async versions we don't see any calls like that. Instead we now do this from inside ImportCall when we did end up with an actual async variant to import.
| // Currently crossgen2 does not support compiling async versions of synchronous Task-returning functions. | ||
| // We would compile a wrapper thunk but that comes with different perf characteristics and diagnostics | ||
| // that we do not want to deal with. | ||
| if (methodNeedingCode.SupportsAsyncVersionCodegen()) | ||
| { | ||
| return true; | ||
| } |
There was a problem hiding this comment.
Hoping to leave this for a follow-up.
| if ((codeAddr + sz < codeEndp) && (getU1LittleEndian(codeAddr + sz) == CEE_RET)) | ||
| { | ||
| JITDUMP("\nRecognized tail-call in async version\n"); | ||
| awaitOffset = (IL_OFFSET)(codeAddr - info.compCode); | ||
| isAwait = true; |
| // Get information about which await call to use to await the return type | ||
| // of the non-async version of an async call. | ||
| virtual CORINFO_METHOD_HANDLE getAwaitReturnCall(CORINFO_METHOD_HANDLE callerHandle, CORINFO_LOOKUP* instArg) = 0; | ||
|
|
| &callInfo); | ||
|
|
||
| if (isAwait && (callInfo.kind == CORINFO_CALL)) | ||
| // TODO: crossgen2 cannot handle us removing this |
There was a problem hiding this comment.
@jtschuster I was still running into some problems here in the full tests, even with the crossgen2 fix suggested. I would suggest we remove this and add the crossgen2 support separately in a follow up.
| // Consume the ret opcode. Note `codeAddr` points at the unconsumed token; | ||
| // the main loop will still do `codeAddr += sz` (token size) after this case. | ||
| codeAddrAfterMatch = codeAddr + 1; | ||
| } |
There was a problem hiding this comment.
We need handling for methods that end with tail. Call() and JMP, here and in the interpreter.
|
|
||
| asyncInfo.IsTailAwait = inlCall->GetAsyncInfo().IsTailAwait; | ||
| asyncInfo.IsTailAwait = | ||
| inlCall->GetAsyncInfo().IsTailAwait && (m_nextAwaitIsTail || (call->gtReturnType == info.compRetType)); |
There was a problem hiding this comment.
We should consider whether we should call canTailcall JIT-EE function to get these checks:
runtime/src/coreclr/vm/jitinterface.cpp
Lines 8238 to 8259 in 1547c9a
Otherwise these methods won't show up in async stacktraces. OTOH tail-awaiting is a significantly more impactful optimization than tail-calling since it removes a continuation from the chain.
|
I've marked this as ready, but there are a couple of caveats:
Task Foo()
{
if (x)
return Bar();
return Baz();
}into Task Foo()
{
Task t;
if (x)
{
t = Bar();
goto RET;
}
t = Baz();
goto RET;
RET: return t;
}In general the solution here is not ideal, but after some discussion with @davidwrighton I do believe it is the most pragmatic way of solving the problem for 11.0. We can consider some more robust solutions in the future where we provide some kind of syntax that allows users to make these wrapper functions async and still retain the old EH/context behavior. I can also do something best-effort in the JIT/interpreter to recognize some of those specific patterns emitted, even in debug IL codegen, but like @jkotas has said before we do not really want to have to start recognizing specific Roslyn IL patterns.
NativeAOT test sizes
cc @AndyAyersMS @VSadov @jkotas @davidwrighton @MichalStrehovsky @jtschuster @rcj1 |
| } | ||
|
|
||
| if (TryGenerateAsyncThunk(resolver, methodILDecoder)) | ||
| if (TryGenerateUnsafeAccessor(resolver, methodILDecoder)) |
There was a problem hiding this comment.
Why do we need to swap the order here? Unsafe accessors are less common than async thunks, so this is deoptimization.
There was a problem hiding this comment.
The situation from the test looks like:
public class PrivateAsync
{
private static async Task a_task1(int i)
{
}
}
public class PrivateAsync2
{
private static async Task a_task2(int i)
{
return await Accessors1.accessor(null, i - 1);
}
}
public class Accessors1
{
[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name = "a_task1")]
public extern static Task accessor(PrivateAsync1 o, int i);
}
public class Accessors2
{
[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name = "a_task2")]
public extern static Task accessor(PrivateAsync2 o, int i);
}The call in a_task2 ends up requesting an async variant of Accessors1.accessor. That was going into this path and hitting TryGenerateAsyncThunk, which ends up creating the delegating thunks that we should be able to remove the code for with this PR. If we swap the order we instead create an async unsafe accessor that directly calls the async variant of a_task1.
Let me try an alternative which instead compiles the async variant of Accessors1.accessor as an async version that is passed the IL of the Task unsafe accessor. Then we shouldn't end up in this path at all, but it needs a bit more refactoring.
There was a problem hiding this comment.
Actually there is a simpler fix in TryGenerateAsyncThunk
| // Get information about which await call to use to await the return type | ||
| // of the non-async version of an async call. | ||
| virtual CORINFO_METHOD_HANDLE getAwaitReturnCall(CORINFO_METHOD_HANDLE callerHandle, CORINFO_LOOKUP* instArg) = 0; | ||
|
|
| // For a Runtime Async method the methoddef maps to a Task-returning thunk with runtime-provided implementation, | ||
| // while the default IL belongs to the Async implementation variant. | ||
| // By default the config captures the default methoddesc, which would be a thunk, thus no IL header. | ||
| // So, if config provides no header and we see an implementation method desc, then just ask the method desc itself. | ||
| if (ilHeader == NULL && pMD->IsAsyncVariantMethod()) | ||
| { | ||
| _ASSERTE(!pMD->IsAsyncThunkMethod()); | ||
| ilHeader = pMD->GetILHeader(); | ||
| } |
| BOOL MayHaveILHeader() | ||
| { | ||
| LIMITED_METHOD_DAC_CONTRACT; | ||
|
|
||
| // methods with transient IL bodies do not have IL headers | ||
| return IsIL() && !IsUnboxingStub() && !IsAsyncThunkMethod(); | ||
| return IsIL() && !IsUnboxingStub() && (!IsAsyncThunkMethod() || SupportsAsyncVersionCodegen()); | ||
| } |
Instead of delegating from runtime async callable thunks to the original task returning methods this PR compiles a fully separate runtime async version. It then adds a guaranteed optimization to make tail calls in the synchronous task-returning methods into runtime async calls.
This is one potential approach to fix #115771.