We need to find a way to 'mark up' subroutines that need to be compiled for GPU (i.e. subroutines called from a DSL kernel). While (kernel module) inlining might work, some functions are called from many different kernels, so inlining them everywhere would increase (possibly significantly) the size of the compiled binary.
This issue affects both the build system and PSyclone (probably the build system more), but I am opening the discussion here since we are all dealing with it, and it particularly affects our GPU coverage for LFRic.
My approach was guided by the following requirements:
It needs to work when PSyclone runs in parallel (PSyclone is typically invoked in parallel, so we can't, for example, write all data to one file, or at least we would need synchronisation).
It should not tightly couple PSyclone and Fab (or any other build system), i.e. it should potentially be useful with other build systems as well, and PSyclone should have no direct dependency on anything in Fab.
We don't want to add further dependencies to PSyclone (e.g. sqlite).
Given this, my current suggested solution works like this:
Any PSyclone script (i.e. one controlled by the user, not part of PSyclone) can write a dependency-information file. This file can, for example, store the list of 'secondary' subroutines to be marked up in YAML format. PSyclone might provide a convenience function to convert the static call tree information into YAML (which a user script can use to simplify the work).
The filename might be provided as a parameter (e.g. an environment variable, or Parameters for transformation scripts #2757), or a script might use its own name and location to derive a name.
The build system will check these files once the PSyclone run is finished. It can merge the information contained in these files (Fab might contain a convenience function for that, which might be usable stand-alone by other build systems), and so get the list of files to mark up. This can then be done in a separate phase of the build system with a custom script.
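To make the steps above concrete, here is a minimal sketch of both halves: a helper a PSyclone script could call to write its own dependency-information file, and a stand-alone merge step a build system could run afterwards. Everything named here is an assumption for illustration, not an existing PSyclone or Fab API: the function names, the `gpu_subroutines` key, and the `*.deps.json` naming. It writes JSON from the standard library rather than YAML to avoid a new dependency; since YAML 1.2 is designed as a superset of JSON, a YAML reader on the build-system side should still be able to load these files.

```python
import json
import os
import tempfile
from pathlib import Path


def write_dependency_file(path, subroutines):
    """Write the 'secondary' subroutines found by ONE PSyclone script run.

    Hypothetical helper: each run writes its own file, so parallel
    PSyclone invocations never share a file and need no locking.
    """
    path = Path(path)
    payload = {"gpu_subroutines": sorted(set(subroutines))}
    # Write to a temporary file and rename it into place, so the merge
    # step can never read a half-written file.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)
    os.replace(tmp_name, path)


def merge_dependency_files(directory, pattern="*.deps.json"):
    """Merge all per-script files into one sorted, de-duplicated list.

    Hypothetical build-system step, run once all PSyclone runs finish.
    """
    merged = set()
    for dep_file in sorted(Path(directory).glob(pattern)):
        with open(dep_file, encoding="utf-8") as handle:
            merged.update(json.load(handle).get("gpu_subroutines", []))
    return sorted(merged)
```

A transformation script would call `write_dependency_file` with a path taken from its parameter or derived from its own name, and the build system would pass the result of `merge_dependency_files` to whatever custom script performs the actual mark-up.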
The advantages of this approach:
There is no need to manually mark up source files beforehand, i.e. it follows the PSyclone philosophy.
It minimises the number of files to be compiled for GPU (and CPU) to the ones really required.
It leaves control of the markup to PSyclone.
I am aware that Sergi had a different approach in mind. I would be interested in going into more detail in this discussion, to see whether we can combine the two approaches or whether one solution is better.
@arporter, @LonelyCat124, @sergisiso feedback welcome.