c++ - How to properly link a CUDA header file with device functions?
I'm trying to decouple my code a bit, and it fails. Compilation error:
    error: calling a __host__ function("decoupledcallgpu") from a __global__ function("kernel") is not allowed
Code excerpt:
main.c (has the call to the CUDA host function):
    #include "cuda_computations.h"
    ...
    computesomething(&var1,&var2);
    ...
cuda_computations.cu (has the kernel and the host master functions, and includes the header that has the device functions):
    #include "cuda_computations.h"
    #include "decoupled_functions.cuh"
    ...
    __global__ void kernel(){
        ...
        decoupledcallgpu(&var_kernel);
    }
    void computesomething(int *var1, int *var2){
        //allocate memory, etc.
        ...
        kernel<<<20,512>>>();
        //cleanup
        ...
    }
decoupled_functions.cuh:
    #ifndef _decoupledfunctions_h_
    #define _decoupledfunctions_h_
    void decoupledcallgpu(int *var);
    #endif
decoupled_functions.cu:
    #include "decoupled_functions.cuh"
    __device__ void decoupledcallgpu(int *var){
        *var=0;
    }
    #endif
Compilation:
nvcc -g --ptxas-options=-v -arch=sm_30 -c cuda_computations.cu -o cuda_computations.o -lcudart
Question: why is decoupledcallgpu called from a host function, and not from a kernel as it was supposed to be?
P.S.: I can share the actual code behind it if you need me to.
Add the __device__ decorator to the prototype in decoupled_functions.cuh. That should take care of the error message you are seeing.
Then you'll need to use separate compilation and linking amongst your modules. So instead of compiling with -c, compile with -dc, and your link command will need to be modified as well. A basic example is here.
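To make the two steps above concrete, here is a minimal sketch of the corrected code, folded into a single file so it can be tried on its own. The `out` parameter of the kernel and the `run_fixed_example` wrapper are hypothetical additions for illustration and are not part of the question's code; the build commands in the comment are an assumed form of the `-dc` compile and link steps, reusing the question's file names and `-arch` flag, with `app` as a placeholder output name.

```cuda
// Single-file sketch of the corrected layout. In the question's multi-file
// version each marked part lives in its own file, built roughly like this
// (assumed commands, adapt to your project):
//   nvcc -arch=sm_30 -dc cuda_computations.cu -o cuda_computations.o
//   nvcc -arch=sm_30 -dc decoupled_functions.cu -o decoupled_functions.o
//   nvcc -arch=sm_30 main.o cuda_computations.o decoupled_functions.o -o app
#include <cuda_runtime.h>

// --- decoupled_functions.cuh: the prototype now carries __device__ ---
__device__ void decoupledcallgpu(int *var);

// --- cuda_computations.cu: the kernel calls the __device__ function ---
// The `out` parameter is a hypothetical addition so the host can see the result.
__global__ void kernel(int *out) {
    int var_kernel = -1;
    decoupledcallgpu(&var_kernel);  // device-to-device call, now legal
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        *out = var_kernel;          // one thread reports the value
    }
}

// --- decoupled_functions.cu: the definition matches the prototype ---
__device__ void decoupledcallgpu(int *var) { *var = 0; }

// Hypothetical host wrapper: launches the kernel and returns the value the
// device function wrote, or -1 if a CUDA runtime call failed.
int run_fixed_example() {
    int *d_out = nullptr;
    int h_out = -1;
    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) return -1;
    kernel<<<20, 512>>>(d_out);
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess ? h_out : -1;
}
```

In a single file the plain `nvcc file.cu` build is enough, because the definition is visible in the same compilation unit. The `-dc` flag only becomes necessary once the definition moves to its own .cu file, as in the question.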
Your question is a bit confusing:
Question: why is decoupledcallgpu called from a host function, and not from a kernel as it was supposed to be?
I can't tell if you're tripping over the English or if there is a misunderstanding here. The actual error message states:
    error: calling a __host__ function("decoupledcallgpu") from a __global__ function("kernel") is not allowed
This arises because within the compilation unit (i.e. within the module, within the file being compiled, i.e. cuda_computations.cu), the only description of the function decoupledcallgpu() is the one provided by the prototype in the header:
    void decoupledcallgpu(int *var);
This prototype indicates an undecorated function in CUDA C, and such functions are equivalent to __host__ (only) decorated functions:
    __host__ void decoupledcallgpu(int *var);
That compilation unit has no knowledge of what is actually in decoupled_functions.cu.
Therefore, when you have kernel code like this:
    __global__ void kernel(){              //<- __global__ function
        ...
        decoupledcallgpu(&var_kernel);     //<- appears as a __host__ function to the compiler
    }
the compiler thinks you are trying to call a __host__ function from a __global__ function, which is illegal.
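The equivalence described above can be seen in a small sketch. The function names `undecorated`, `decorated_host` and `call_both_from_host` are hypothetical and chosen only for this illustration. The calls inside the kernel are left commented out because, as far as I can tell, enabling either one should reproduce the error from the question.

```cuda
#include <cuda_runtime.h>

// Undecorated: nvcc treats this as a __host__ (only) function.
void undecorated(int *var) { *var = 1; }

// Explicitly decorated: same execution space as the function above.
__host__ void decorated_host(int *var) { *var = 1; }

__global__ void kernel(int *out) {
    int var_kernel = 0;
    // Both functions exist only for the host, so neither may be called
    // from device code. Uncommenting either line should trigger
    // "calling a __host__ function ... from a __global__ function":
    // undecorated(&var_kernel);
    // decorated_host(&var_kernel);
    *out = var_kernel;
}

// Host code may call both functions freely.
int call_both_from_host() {
    int a = 0;
    int b = 0;
    undecorated(&a);
    decorated_host(&b);
    return a + b;  // each call stored 1
}
```

The kernel is never launched here. It is only present so the compiler can check which calls are legal from device code.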