Hi! :) I think I've stumbled across a correctness issue w/ the similar function that leads to performance problems:
As far as I can tell, even when actually running within an @diff, similar(AutoGrad.Result{Array}) returns an Array. I would think instead that it should return a Result{Array}.
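A minimal sketch of how this could be checked, assuming AutoGrad.jl is installed. The 3x2 shape of `coef` and the names `col` and `buf` are made up for illustration; the printed types are the thing to look at, and no particular output is claimed here.

```julia
using AutoGrad

# Hypothetical stand-in for the real parameter: a 3x2 matrix of coefficients.
coef = Param(rand(3, 2))

tape = @diff begin
    col = coef[:, 1]      # indexing a Param inside @diff should be tracked
    buf = similar(col)    # the call in question
    println("typeof(col) = ", typeof(col))
    println("typeof(buf) = ", typeof(buf))
    sum(col)              # @diff needs a scalar result
end
```

If `typeof(buf)` prints a plain `Array` while `typeof(col)` prints a `Result`, that matches the behavior described above.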
I came across this trying to help a coworker diagnose why their AD code is slow. I saw from @btime that their code has a lot of allocations, which seem to grow with the size of the data (so it's not a one-time overhead related to compiling the extra methods).
I used Cthulhu.jl to look through the @code_warntype of the functions being differentiated, and found this:
%108 = invoke similar(::AutoGrad.Result{Array{Float64,1}})::Array{Float64,1}
That seems surprising to me! They were trying to use similar to prevent type-instability in this for-loop:
# coef is passed in as an AutoGrad.Param{Array{Float64,1}}
term = fill!(similar(coef[:,ndim]), 0) # Originally this was `zeros(ndim)`, but I'm trying to prevent type instability
for (idx,feat) in enumerate(feature)
    term += (feat .* coef[:,idx]) .^(iorder)
end
As the comment explains, first we had term = zeros(ndim), but were concerned about the type instability inside the for-loop (the type of term would change after adding the Result{Array} to it). So we tried using fill!(similar(...), 0) to create a Result{Array} to start with, so that adding the coef terms to it doesn't change its type. But surprisingly, this returned the same thing as zeros().
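One possible way to sidestep the initial-value question entirely is to let `sum` build the accumulator from the first term, so that `term` never starts out as a plain Array. This is only a sketch: `sum_terms` and the sample values are hypothetical, and it is shown on plain arrays; I have not verified how it behaves or performs under @diff.

```julia
# Hypothetical stand-ins for the variables used in the loop above.
feature = [1.0, 2.0, 3.0]
coef = [1.0 2.0 3.0;
        4.0 5.0 6.0]      # one column of coefficients per feature
iorder = 2

# Sum the per-feature terms without a pre-allocated accumulator:
# the first generated term determines the type of the running sum.
sum_terms(feature, coef, iorder) =
    sum((feat .* coef[:, idx]) .^ iorder for (idx, feat) in enumerate(feature))

term = sum_terms(feature, coef, iorder)
```

Every element produced by the generator has the same type, so the running sum should not change type partway through, whatever type the individual terms turn out to be.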
Shouldn't similar(::AutoGrad.Result{Array{Float64,1}}) also return an AutoGrad.Result{Array{Float64,1}}, or am I misunderstanding? Thanks! 🙂