c++ - How to properly link a CUDA header file with device functions?
I'm trying to decouple my code a bit, and it fails. Compilation error:
error: calling a __host__ function("decoupledcallgpu") from a __global__ function("kernel") is not allowed
Code excerpt:
main.c (has a call to the CUDA host function):
#include "cuda_computations.h"
...
computesomething(&var1,&var2);
...
cuda_computations.cu (has the kernel and the host master functions, and includes the header that has the device functions):
#include "cuda_computations.h"
#include "decoupled_functions.cuh"
...
__global__ void kernel(){
    ...
    decoupledcallgpu(&var_kernel);
}
void computesomething(int *var1, int *var2){
    //allocate memory, etc.
    ...
    kernel<<<20,512>>>();
    //cleanup
    ...
}
decoupled_functions.cuh:
#ifndef _decoupledfunctions_h_
#define _decoupledfunctions_h_
void decoupledcallgpu(int *var);
#endif
decoupled_functions.cu:
#include "decoupled_functions.cuh"
__device__ void decoupledcallgpu(int *var){
    *var=0;
}
Compilation:
nvcc -g --ptxas-options=-v -arch=sm_30 -c cuda_computations.cu -o cuda_computations.o -lcudart
Question: why is decoupledcallgpu called from a host function, and not from the kernel as it was supposed to be?
P.S.: I can share the actual code behind this if you need me to.
Add the __device__ decorator to the prototype in decoupled_functions.cuh. That should take care of the error message you are seeing.
Then you'll need to use separate compilation and linking amongst your modules. So instead of compiling with -c, compile with -dc. Your link command will also need to be modified. A basic example is here.
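As a rough sketch of those two steps, assuming the file names from the question: the header gains the __device__ decorator, and each .cu file is compiled with -dc before a final nvcc link. The output name app and the main.o object are placeholders, not taken from the original post, and the build commands are shown as comments so the block stays in one language.

```cuda
// decoupled_functions.cuh: the prototype now carries the __device__ decorator,
// so every .cu file that includes it sees a device function.
#ifndef _decoupledfunctions_h_
#define _decoupledfunctions_h_

__device__ void decoupledcallgpu(int *var);

#endif

// Build sketch (shell commands as comments; "app" and "main.o" are placeholders):
//   nvcc -arch=sm_30 -dc cuda_computations.cu -o cuda_computations.o
//   nvcc -arch=sm_30 -dc decoupled_functions.cu -o decoupled_functions.o
//   nvcc -arch=sm_30 cuda_computations.o decoupled_functions.o main.o -o app
```

The -dc flag asks nvcc for relocatable device code, which is what allows the call in cuda_computations.o to be resolved against the definition in decoupled_functions.o when nvcc performs the final link.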
Your question is a bit confusing:
Question: why is decoupledcallgpu called from a host function, and not from the kernel as it was supposed to be?
I can't tell if you're tripping on English or if there is a misunderstanding here. The actual error message states:
error: calling a __host__ function("decoupledcallgpu") from a __global__ function("kernel") is not allowed
This is arising due to the fact that within the compilation unit (i.e. within the module, within the file that is being compiled, i.e. cuda_computations.cu), the only description of the function decoupledcallgpu() is the one provided in the prototype in the header:
void decoupledcallgpu(int *var);
This prototype indicates an undecorated function in CUDA C, and such functions are equivalent to __host__ (only) decorated functions:
__host__ void decoupledcallgpu(int *var);
That compilation unit has no knowledge of what is actually in decoupled_functions.cu.
Therefore, when you have kernel code like this:
__global__ void kernel(){              //<- __global__ function
    ...
    decoupledcallgpu(&var_kernel);     //<- appears as a __host__ function to the compiler
}
the compiler thinks you are trying to call a __host__ function from a __global__ function, which is illegal.
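To make the point concrete, here is a minimal single-file sketch in which the compilation unit does see a __device__ declaration before the kernel uses it. Keeping everything in one file is only for illustration (the question splits it across files), and the out parameter and the run_kernel_once helper are hypothetical names that do not appear in the original code.

```cuda
#include <cuda_runtime.h>

// Declaration visible to this compilation unit, decorated with __device__.
__device__ void decoupledcallgpu(int *var);

// Kernel: starts from a sentinel value and lets the device function overwrite it.
__global__ void kernel(int *out) {
    int var_kernel = 42;
    decoupledcallgpu(&var_kernel);  // legal: a __device__ function called from a __global__ function
    *out = var_kernel;
}

// Definition; in the question this lives in decoupled_functions.cu.
__device__ void decoupledcallgpu(int *var) {
    *var = 0;
}

// Hypothetical host helper: launches the kernel once and returns the value it
// wrote, or -1 if allocation or the copy back to the host fails.
int run_kernel_once() {
    int *d_out = nullptr;
    int h_out = -1;
    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) return -1;
    kernel<<<1, 1>>>(d_out);
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess ? h_out : -1;
}
```

Removing __device__ from the declaration line alone should reproduce the kind of error shown in the question, because the kernel would then be calling what the compiler treats as a __host__ function.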