Skip to content

CUDA support via the Adapt mechanism#56

Closed
RainerHeintzmann wants to merge 24 commits into
mainfrom
cuda_via_adapt
Closed

CUDA support via the Adapt mechanism#56
RainerHeintzmann wants to merge 24 commits into
mainfrom
cuda_via_adapt

Conversation

@RainerHeintzmann

Copy link
Copy Markdown
Member

This is another fresh attempt to gain CUDA support.
It needs the very recent version of NDTools.jl to work (using the select_region_view() function).
Also the classes internal to FourierTools.jl have been extended via the Adapt mechanism.
The direction paramater was also moved to a Val() type.

The CircShiftedArray.jl was copied from ShiftedArrays.jl and modified a little by also including it into the Adapt mechanism which is based on type Unions. It might be preferable to rather adapt the original implementation.

@codecov

codecov Bot commented Apr 5, 2025

Copy link
Copy Markdown

Codecov Report

Attention: Patch coverage is 69.68085% with 57 lines in your changes missing coverage. Please review.

Project coverage is 89.98%. Comparing base (60de955) to head (5220a1f).

Files with missing lines Patch % Lines
ext/CUDASupportExt_FT.jl 25.00% 30 Missing ⚠️
ext/CUDASupportExt_SA.jl 28.57% 25 Missing ⚠️
src/custom_fourier_types.jl 93.93% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #56      +/-   ##
==========================================
- Coverage   94.03%   89.98%   -4.05%     
==========================================
  Files          19       21       +2     
  Lines        1173     1278     +105     
==========================================
+ Hits         1103     1150      +47     
- Misses         70      128      +58     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment thread ext/CUDASupportExt_FT.jl Outdated
Comment thread ext/CUDASupportExt_FT.jl
CUDA.CuArrayStyle{N,CD}()
end

function Base.copy(s::AllShiftedAndViews)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is type piracy?

Because you define a copy method from Base with types owned by base. So this should not be done.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is type piracy here. AllShiftedAndViews only contains types or unions which are defined in this package. However there are arguments one could make with the CUDASupportExt_SA.jl. In that case, I really don't know what to do, since the author of ShiftedArrays.jl is non-responsive.
JuliaArrays/ShiftedArrays.jl#70
JuliaArrays/ShiftedArrays.jl#64
Maybe I just dublicate the code for the circshiftedArray.jl into MutableShiftedArrays.jl and host the extension there and then remove the dependence on ShiftedArrays.jl entirely.?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the necessary code to MutableShiftedArrays.jl and opened a PR there:
RainerHeintzmann/MutableShiftedArrays.jl#2
Once this is merged and a new version released, one can remove the dependency on ShiftedArrays.jl entirely.

Comment thread src/fourier_shifting.jl Outdated
@roflmaostc

Copy link
Copy Markdown
Member

I think the main issue is that you define methods such as Base.copy with types from base SubArray. This is type piracy and should be avoided at all costs.

Comment thread ext/CUDASupportExt_SA.jl

# Define the BroadcastStyle for SubArray of MutableShiftedArray with CuArray

function Base.Broadcast.BroadcastStyle(::Type{T}) where {N, CD, T<:AllSubArrayTypeCu{N, CD}}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is type piracy. Because you use Base types and a base method.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plan is to entirely remove CUDASupportExt_SA.jl since this is then covered in MutableShiftedArrays.jl.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you remove it?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above "The plan is". Once I merged the other package, I can simply remove it here. The decision to be made: A) have circshifted array exist in ShiftedArrays.jl and MutableShiftedArrays.jl with currently identical code, or B) rename CircShiftedArray to MutableCircShiftedArray with identical functionality and currently identical code (except for the CUDA support extension. Or C) import ShiftedArrays.jl in MutableShiftedArrays.jl and reexport with the extension (I think this would still be type piracy). My preference: option A). What is yours?

Comment thread src/fourier_resizing.jl Outdated
Comment on lines +92 to +94
# tmp = similar(mat, (8, 10))
# tmp .= 1
# return rft_fix_after(tmp, old_size, new_size)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be deleted?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Comment thread src/fourier_resizing.jl Outdated
Comment on lines +99 to +130
# """
# select_region(mat; new_size)

# performs the necessary Fourier-space operations of resampling
# in the space of ft (meaning the already circshifted version of fft).

# `new_size`.
# The size of the array view after the operation finished.

# `center`.
# Specifies the center of the new view in coordinates of the old view. By default an alignment of the Fourier-centers is assumed.
# # Examples
# ```jldoctest
# julia> using FFTW, FourierTools

# julia> select_region(ones(3,3),new_size=(7,7),center=(1,3))
# 7×7 PaddedView(0.0, OffsetArray(::Matrix{Float64}, 4:6, 2:4), (Base.OneTo(7), Base.OneTo(7))) with eltype Float64:
# 0.0 0.0 0.0 0.0 0.0 0.0 0.0
# 0.0 0.0 0.0 0.0 0.0 0.0 0.0
# 0.0 0.0 0.0 0.0 0.0 0.0 0.0
# 0.0 1.0 1.0 1.0 0.0 0.0 0.0
# 0.0 1.0 1.0 1.0 0.0 0.0 0.0
# 0.0 1.0 1.0 1.0 0.0 0.0 0.0
# 0.0 0.0 0.0 0.0 0.0 0.0 0.0
# ```
# """
# function select_region(mat; new_size=size(mat), center=ft_center_diff(size(mat)).+1, pad_value=zero(eltype(mat)))
# new_size = Tuple(expand_size(new_size, size(mat)))
# center = Tuple(expand_size(center, ft_center_diff(size(mat)) .+ 1))
# oldcenter = ft_center_diff(new_size) .+ 1
# PaddedView(pad_value, mat, new_size, oldcenter .- center.+1);
# end

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be deleted?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Comment thread src/fourier_rotate.jl
end
arr = select_region(arr, new_size=old_size .+ extra_size, pad_value=pad_value)

arr = select_region_view(arr, new_size=old_size .+ extra_size, pad_value=pad_value)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this changed to the view?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... to avoid copying too much and to reduce memory requirements.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you benchmark? Did it change anything?
I would not change any unaffected code by this introduction.

Comment thread src/utils.jl
```jldoctest
julia> FourierTools.bc_size(rand(5, 2, 3), rand(1, 2))
(5, 2, 3)
"""

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing ```

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment thread src/utils.jl
end

"""
get_idxrng_around_center(arr_1, arr_2)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is idxrng?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the documentation. idxrng stands for "index_range".

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe give it the verbose name then. it's not exported anyway

Comment thread src/utils.jl Outdated
Comment on lines 373 to 378
# out_is = []
# for i = 1:ndims(arr_large)
# a, b = get_indices_around_center(size(arr_large)[i], size(arr_small)[i])
# push!(out_is, a:b)
# end

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment thread test/convolutions.jl
N = 5
psf = abs.(randn((N, N, 2)))
img = randn((N, N, 2))
psf = opt_cu(abs.(randn((N, N, 2))), use_cuda)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is opt_cu?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the mechanism used in the tests to locally test with CUDA but upon deployment test without CUDA. It is defined at the beginning of runtests.jl.

@roflmaostc

Copy link
Copy Markdown
Member

We need tests for everything and be sure that nothing breaks.

@roflmaostc

Copy link
Copy Markdown
Member

You copied some code, did you check the license? Probably MIT and we need to attribute or mention it where it was copied from.

Comment thread ext/CUDASupportExt_FT.jl
Comment on lines +71 to +75
function Base.isapprox(y::AbstractArray, x::AllShiftedAndViewsCu; atol=0, rtol=atol>0 ? 0 : sqrt(eps(real(eltype(x)))), va...)
atol = (atol != 0) ? atol : rtol * maximum(abs.(x))
return all(abs.(x .- y) .<= atol)
end

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are those methods defined again? Base should be able to handle this generically?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am pretty sure there was a reason. The CUDA-tests did not run properly without this definition, which is why I defined it.

Comment thread ext/CUDASupportExt_FT.jl
Comment on lines +62 to +64
function Base.:(==)(x::AllShiftedAndViewsCu, y::AllShiftedAndViewsCu)
return all(x .== y)
end

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't base able to handle this?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am pretty sure there was a reason. The CUDA-tests did not run properly without this definition, which is why I defined it.

struct FourierJoin{T,N, AA<:AbstractArray{T, N}, D} <: AbstractArray{T, N}
parent::AA
D::Int # dimension along which to apply to copy
# D::Int # dimension along which to apply to copy

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be deleted?

Comment on lines +105 to +107
# function FourierJoin(parent::AA, D::Int, L1::Int, do_split::Bool) where {T,N, AA<:AbstractArray{T, N}}
# FourierJoin(parent, Val(D), L1, do_split)
# end

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete?

@roflmaostc

Copy link
Copy Markdown
Member

Overall, shouldn't this go in a separate package? It seems a bit annoying to have all this weird CUDA handling in this package instead of somewhere else.

@RainerHeintzmann

Copy link
Copy Markdown
Member Author

Overall, shouldn't this go in a separate package? It seems a bit annoying to have all this weird CUDA handling in this package instead of somewhere else.

I dont understand this extension and the changes are done to make FourierTools.jl work with CUDA, which is then does. There is very little change to the core code now and it adds a very important functionality. So what do you suggest to achieve with a different package?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants