opencl - Reduce 1024 images to one -


I read a paper on reducing a 1D array to a single value in OpenCL ( http://developer.amd.com/resources/documentation-articles/articles-whitepapers/opencl-optimization-case-study-simple-reductions/ ), and I understood the concept of associative operators. Extending that concept to a 2D array should be possible.

But my problem is different: I have ~1000 images of 256x256 pixels, 16 bits each, and I want to sum these images to get the average image of them all. A typical GPU should have enough memory (~130 MB) to perform this task, but I don't see how to implement the kernel.

Just as the 1D problem extends to 2D, it can also be extended to 3D (which is what you have: 1000x256x256).

Exactly the same principles apply: 1. Try to do as much work in parallel as you can without contention with other work groups. 2. Do the reduction in stages, each of which can run in parallel.
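The two principles above can be sketched in plain C. This is only an illustration, not a tuned OpenCL implementation: each call to `sum_chunk` or `average_pixel` stands in for one work-item, and the loops in `average_images` stand in for two kernel launches. All function names, the `chunk_size` parameter, and the image-major memory layout are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Stage 1: one "work-item" per (chunk, pixel). It sums the same pixel
   across the images of its chunk, so no two work-items write to the same
   output location. In an OpenCL kernel, `pixel` and `chunk` would
   typically come from get_global_id(0) and get_global_id(1).
   Assumed layout: images[i * n_pixels + pixel]. */
static void sum_chunk(const uint16_t *images, uint32_t *partial,
                      size_t n_images, size_t n_pixels,
                      size_t chunk, size_t chunk_size, size_t pixel)
{
    size_t first = chunk * chunk_size;
    size_t last = first + chunk_size;
    if (last > n_images)
        last = n_images;

    /* 1000 images * 65535 fits comfortably in 32 bits. */
    uint32_t acc = 0;
    for (size_t i = first; i < last; ++i)
        acc += images[i * n_pixels + pixel];
    partial[chunk * n_pixels + pixel] = acc;
}

/* Stage 2: one "work-item" per pixel combines the partial sums of all
   chunks and divides by the image count (integer division). */
static void average_pixel(const uint32_t *partial, uint16_t *out,
                          size_t n_chunks, size_t n_pixels,
                          size_t n_images, size_t pixel)
{
    uint32_t acc = 0;
    for (size_t c = 0; c < n_chunks; ++c)
        acc += partial[c * n_pixels + pixel];
    out[pixel] = (uint16_t)(acc / n_images);
}

/* Host-side stand-in for the two launches. `partial` must hold
   ceil(n_images / chunk_size) * n_pixels elements. */
void average_images(const uint16_t *images, uint32_t *partial, uint16_t *out,
                    size_t n_images, size_t n_pixels, size_t chunk_size)
{
    size_t n_chunks = (n_images + chunk_size - 1) / chunk_size;

    for (size_t c = 0; c < n_chunks; ++c)
        for (size_t p = 0; p < n_pixels; ++p)
            sum_chunk(images, partial, n_images, n_pixels, c, chunk_size, p);

    for (size_t p = 0; p < n_pixels; ++p)
        average_pixel(partial, out, n_chunks, n_pixels, n_images, p);
}
```

With 65536 pixels per image there is already a lot of independent work, so a single stage (one work-item per pixel looping over all images) may be enough; the chunked first stage is there to show how the staged reduction carries over.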

You're going to be bandwidth limited, churning through 131 MB of memory, but that's not a problem. Write your kernels with coalesced reads for maximum performance.
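As a rough check of the 131 MB figure, and to show what a coalescing-friendly layout could look like, here is a small C sketch. The helper names `stack_bytes` and `pixel_index` are hypothetical, and the layout (x varying fastest, then y, then image) is an assumption: with it, work-items that handle neighbouring x values read neighbouring addresses.

```c
#include <stddef.h>
#include <stdint.h>

/* Size in bytes of a stack of 16-bit images. */
size_t stack_bytes(size_t n_images, size_t width, size_t height)
{
    return n_images * width * height * sizeof(uint16_t);
}

/* Flat index of pixel (x, y) in image i, with x as the fastest-varying
   coordinate. Consecutive x values map to consecutive elements, which is
   the access pattern coalesced reads rely on. */
size_t pixel_index(size_t i, size_t x, size_t y, size_t width, size_t height)
{
    return (i * height + y) * width + x;
}
```

For 1000 images of 256x256 pixels, `stack_bytes` gives 131,072,000 bytes, which matches the 131 MB mentioned above.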

