Hello,
I noticed a possible issue with nvbit_get_related_functions in the 1.7.6 release. For each launched kernel, it seems to return the kernel itself and other, unrelated kernels as “related functions,” which I didn’t expect.
Here’s a minimal example:
...
__global__ void fp64_nan_to_mem(double a, double b, double* res) {
    res[threadIdx.x] = a + b;
}
__global__ void fp32_nan_to_mem(float a, float b, float* res) {
    res[threadIdx.x] = a + b;
}
...
int main() {
    ...
    fp64_nan_to_mem<<<1, 2*warpSize>>>(INFINITY, -INFINITY, d_res);
    ...
    fp32_nan_to_mem<<<1, 2*warpSize>>>(INFINITY, -INFINITY, d_res32);
    ...
}
I wrote a small NVBit tool (test_related_functions.tgz) to print the related functions of each kernel launch. With version 1.7.6, I see:
$ LD_PRELOAD=./test_related_functions.so test/test
...
Instrumenting kernel fp64_nan_to_mem(double, double, double*)
Instrumenting related function fp64_nan_to_mem(double, double, double*)
Instrumenting related function fp32_nan_to_mem(float, float, float*)
Instrumenting kernel fp32_nan_to_mem(float, float, float*)
Instrumenting related function fp32_nan_to_mem(float, float, float*)
Instrumenting related function fp64_nan_to_mem(double, double, double*)
With version 1.7.5, the output is:
$ LD_PRELOAD=./test_related_functions.so test/test
...
Instrumenting kernel fp64_nan_to_mem(double, double, double*)
Instrumenting kernel fp32_nan_to_mem(float, float, float*)
The 1.7.5 behavior makes sense to me, since neither kernel calls any other function and so I would expect the list of related functions to be empty. Could you clarify what the expected behavior should be in 1.7.6?
Thanks,
Mark Baranowski