c++ - OpenCL struct values correct on CPU but not on GPU


I have a struct in a file which is included by both the host code and the kernel:

typedef struct {
    float x, y, z,
          dir_x, dir_y, dir_z;
    int   radius;
} workliststruct;

I build the struct in the C++ host code and pass it to the OpenCL kernel via a buffer.

If I choose a CPU device for the computation, the following print statement gives these results:

printf("item:[%f,%f,%f][%f,%f,%f]%d,%d\n",
       item.x, item.y, item.z, item.dir_x, item.dir_y, item.dir_z,
       item.radius, sizeof(float));

host:

item:[20.169043,7.000000,34.933712][0.000000,-3.000000,0.000000]1,4 

device (cpu):

item:[20.169043,7.000000,34.933712][0.000000,-3.000000,0.000000]1,4 

And if I choose a GPU device (AMD) for the computation, weird things happen:

host:

item:[58.406261,57.786015,58.137501][2.000000,2.000000,2.000000]2,4 

device (gpu):

item:[58.406261,2.000000,0.000000][0.000000,0.000000,0.000000]0,0 

Notably, even sizeof(float) is garbage on the GPU.

I assume there is a problem with the layout of floats on different devices.

Note: the struct is contained in an array of structs of this type, and every struct in this array is garbage on the GPU.

Does anyone have an idea why this is the case and how I can predict it?
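A layout assumption like this can be checked on the host at compile time. The following is only a minimal sketch, assuming a host where float and int are both 4 bytes wide (OpenCL C fixes both at 32 bits on the device; the OpenCL headers offer cl_float and cl_int for the host side). The helper element_offset is hypothetical and not part of the original code.

```cpp
#include <cstddef>

// Same field layout as the struct shared between host and kernel.
typedef struct {
    float x, y, z,
          dir_x, dir_y, dir_z;
    int   radius;
} workliststruct;

// Assumption: 4-byte float and int on the host, matching OpenCL C's
// 32-bit float and int on the device.
static_assert(sizeof(float) == 4, "host float is assumed to be 32 bits");
static_assert(sizeof(int) == 4, "host int is assumed to be 32 bits");

// With seven 4-byte members there should be no padding at all.
static_assert(offsetof(workliststruct, dir_x) == 12, "unexpected offset of dir_x");
static_assert(offsetof(workliststruct, radius) == 24, "unexpected offset of radius");
static_assert(sizeof(workliststruct) == 28, "unexpected struct size");

// Hypothetical helper: byte offset of element i in a tightly packed array.
std::size_t element_offset(std::size_t i) {
    return i * sizeof(workliststruct);
}
```

If these assertions hold on the host, a mismatch would have to come from the device side; printing sizeof and a few member offsets from the kernel would be the matching check there.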

Edit: I added a %d at the end and replaced it with 1; the result is: 1065353216

Edit: here are the two structs which I'm using:
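The value 1065353216 is 0x3F800000, which is the IEEE 754 single-precision bit pattern of 1.0f. That hints that printf read the bits of a float where it expected an integer, although the post does not confirm where that float came from. A minimal sketch to reproduce the number on the host, assuming a 32-bit float (float_bits is a hypothetical helper name):

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the raw bit pattern of a float as an
// unsigned 32-bit integer.
std::uint32_t float_bits(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "float is assumed to be 32 bits wide");
    std::uint32_t bits;
    // memcpy is the well-defined way to reinterpret the bytes.
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}
```

Calling float_bits(1.0f) gives 1065353216 on an IEEE 754 platform.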

typedef struct {
    float x, y, z,              // base coordinates
          dir_x, dir_y, dir_z;  // direction
    int   radius;               // radius
} workliststruct;

typedef struct {
    float base_x, base_y, base_z;  // base point
    float radius;                  // radius
    float dir_x, dir_y, dir_z;     // initial direction
} returnstruct;

I tested some other things, and it looks like the problem is with printf. The values seem to be right: I passed the arguments to the return struct, read them back, and these values were correct.

I don't want to post all of the related code, as it is a few hundred lines. If no one has an idea, I will compress it a bit.

Ah, and for printing I'm using #pragma OPENCL EXTENSION cl_amd_printf : enable.

Edit: it looks like the problem is with printf. I simply don't use it anymore.
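One detail of the print statement worth noting: sizeof yields a size_t, while %d expects an int. Whether that mismatch explains the GPU output is not established here, but it can shift the arguments printf reads. A minimal host-side sketch that casts the size to int so it matches %d (format_item is a hypothetical helper; the device-side printf of a given OpenCL implementation may behave differently):

```cpp
#include <cstdio>
#include <string>

// Same field layout as the struct in the question.
typedef struct {
    float x, y, z,
          dir_x, dir_y, dir_z;
    int   radius;
} workliststruct;

// Hypothetical helper: formats one item like the question's printf does,
// but casts sizeof(float) (a size_t) to int so that it matches %d.
std::string format_item(const workliststruct &item) {
    char buf[256];
    std::snprintf(buf, sizeof(buf),
                  "item:[%f,%f,%f][%f,%f,%f]%d,%d",
                  item.x, item.y, item.z,
                  item.dir_x, item.dir_y, item.dir_z,
                  item.radius, static_cast<int>(sizeof(float)));
    return std::string(buf);
}
```

The same cast can be applied inside the kernel's printf call, which avoids relying on a size_t length modifier being supported there.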

There is a simple method to check what happens:

1 - Create host-side data & initialize it:

int num_points = 128;

std::vector<workliststruct> works(num_points);
std::vector<returnstruct> returns(num_points);

for (workliststruct &work : works) {
    work = initializeitsomehow();
    std::cout << work.x << " " << work.y << " " << work.z << std::endl;
    std::cout << work.radius << std::endl;
}

// same stuff for returns
...

2 - Create device-side buffers using the CL_MEM_COPY_HOST_PTR flag, then map them & check data consistency:

cl::Buffer dev_works(..., CL_MEM_COPY_HOST_PTR, (void*)&works[0]);
cl::Buffer dev_rets(..., CL_MEM_COPY_HOST_PTR, (void*)&returns[0]);

// map to check data (pseudocode: with the C++ wrapper, mapping goes
// through cl::CommandQueue::enqueueMapBuffer)
workliststruct *mapped_works = dev_works.map(...);
returnstruct *mapped_rets = dev_rets.map(...);

// output values & unmap buffers
...

3 - Check data consistency on the device side, as you did previously.

Also, make sure that the code (presumably a header) included by both the kernel and the host-side code is pure OpenCL C (the AMD compiler can sometimes "swallow" errors), and that you've added its directory to the include search path when building the OpenCL kernel (the "-I" flag at the clBuildProgram stage).

Edited: at every step, please collect return codes (or catch exceptions). Besides that, the "-Werror" flag at the clBuildProgram stage can be helpful.
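The comparison in step 2 could look roughly like the sketch below. It is host-only: the mapped pointer that would come from the OpenCL runtime is stood in for by a plain pointer, and first_mismatch is a hypothetical helper, not part of any OpenCL API.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Same field layout as the struct in the question.
typedef struct {
    float x, y, z,
          dir_x, dir_y, dir_z;
    int   radius;
} workliststruct;

// Hypothetical helper: returns the index of the first element whose bytes
// differ between the host vector and the mapped buffer, or -1 if all match.
// A byte-wise compare is reasonable here because the struct has no padding;
// note that it treats 0.0f and -0.0f as different.
template <typename T>
long first_mismatch(const std::vector<T> &host, const T *mapped) {
    for (std::size_t i = 0; i < host.size(); ++i) {
        if (std::memcmp(&host[i], &mapped[i], sizeof(T)) != 0) {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

In real code, the second argument would be the pointer obtained from mapping dev_works, and a result other than -1 would point at the first element that did not survive the copy.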
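As a small illustration, the two build flags mentioned above go into the options string passed to clBuildProgram. The helper below is hypothetical, and it assumes an include directory without spaces:

```cpp
#include <string>

// Hypothetical helper: assembles the build options string, adding the
// include search directory ("-I") and turning warnings into errors
// ("-Werror"). Assumes include_dir contains no spaces.
std::string make_build_options(const std::string &include_dir) {
    return "-I " + include_dir + " -Werror";
}
```

The resulting string would be passed as the options argument of clBuildProgram, with the return code checked afterwards.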

