kenobi
1/22/2015 10:36:00 AM
Does anyone here maybe have some experience with general-purpose GPU computing / code optimisation? How much faster can loop
throughput be when implemented in CUDA/OpenCL? (From some reading I suspect it could be in the range of 10 times more throughput than a CPU core, but I'm not sure. I have only written a basic OpenCL kernel and haven't had much time to invest in real benchmarking or tests.)