#1 |
Apr 2014
85₁₀ Posts |
I want to write a simple kernel that calls a second kernel to perform a computation (evaluating a quadratic at an integer). We then reduce the result mod n (for our purposes, n is the same in every calculation). Finally, the kernel needs to read a global variable, multiply its value by the residue we found, and reduce mod n again.
The order does not matter so long as none of the updates get interrupted. Unfortunately, having every thread access a single variable is not going to yield good performance (and there also appears to be no way to do an atomic multiply). How do I solve the multiplication problem? Is it just not possible? Is there a way around it? I feel like there should be a way to accomplish this; I just want to read-multiply-store. On the atomic blocking side, my idea was to give each block a shared variable and have the block's threads do their multiplications on it; then, when all the blocks are done, multiply the per-block values together and take the final value mod n. Will this perform better because fewer threads are blocked trying to write to the global? Is there a better implementation? |
#2 |
Tribal Bullet
Oct 2004
5·709 Posts |
This is a standard example of a reduction operation, for which there's plenty of sample code available. Most reduction examples compute a sum of a bunch of numbers in a single kernel, and for a modular multiply accumulation within a single block the code is basically identical.
However, while there is an atomic add instruction, CUDA does not have an atomic modular multiply, so there is no way to accumulate multiplies safely across multiple blocks. So you have two choices:
- use CUDA code to simulate a mutex, or
- have each block accumulate its own product and store it in a global memory array, then have a single block of a separate kernel do the final accumulation of that array

The second option is much preferred. |
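The second option can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread: the names `mulmod`, `block_product`, and `modprod_gpu` are hypothetical, residues are assumed to fit in 32 bits with n > 0, the block size is assumed to be a power of two, and error checking is omitted. The inputs are read from an array here; in the original poster's case each thread would presumably compute its quadratic value instead of loading `in[i]`.

```cuda
#include <cstdint>
#include <vector>
#include <cuda_runtime.h>

// Multiply two residues mod n via a 64-bit intermediate (assumes n > 0, n < 2^32).
__host__ __device__ inline uint32_t mulmod(uint32_t a, uint32_t b, uint32_t n) {
    return (uint32_t)(((uint64_t)a * b) % n);
}

// Each block reduces its share of `in` to a single product mod n and writes
// it to out[blockIdx.x]. Assumes blockDim.x is a power of two and that
// blockDim.x * sizeof(uint32_t) bytes of dynamic shared memory are provided.
__global__ void block_product(const uint32_t *in, uint32_t *out,
                              size_t count, uint32_t n) {
    extern __shared__ uint32_t s[];
    unsigned tid = threadIdx.x;

    // Per-thread partial product over a grid-stride loop.
    uint32_t acc = 1 % n;
    for (size_t i = (size_t)blockIdx.x * blockDim.x + tid; i < count;
         i += (size_t)gridDim.x * blockDim.x)
        acc = mulmod(acc, in[i] % n, n);
    s[tid] = acc;
    __syncthreads();

    // Tree reduction in shared memory: the same shape as a sum reduction,
    // with modular multiplication in place of addition.
    for (unsigned stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride)
            s[tid] = mulmod(s[tid], s[tid + stride], n);
        __syncthreads();
    }
    if (tid == 0)
        out[blockIdx.x] = s[0];
}

// Host helper (hypothetical): returns the product of vals mod n.
uint32_t modprod_gpu(const std::vector<uint32_t> &vals, uint32_t n) {
    const int threads = 256, blocks = 64;
    const size_t shmem = threads * sizeof(uint32_t);
    uint32_t *d_in = nullptr, *d_partial = nullptr, *d_out = nullptr;
    uint32_t result = 0;

    cudaMalloc(&d_in, vals.size() * sizeof(uint32_t));
    cudaMalloc(&d_partial, blocks * sizeof(uint32_t));
    cudaMalloc(&d_out, sizeof(uint32_t));
    cudaMemcpy(d_in, vals.data(), vals.size() * sizeof(uint32_t),
               cudaMemcpyHostToDevice);

    // Pass 1: one partial product per block, stored in global memory.
    block_product<<<blocks, threads, shmem>>>(d_in, d_partial, vals.size(), n);
    // Pass 2: a separate launch with a single block combines the partials.
    block_product<<<1, threads, shmem>>>(d_partial, d_out, blocks, n);

    // cudaMemcpy waits for the preceding kernels on the default stream.
    cudaMemcpy(&result, d_out, sizeof(uint32_t), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_partial);
    cudaFree(d_out);
    return result;
}
```

No atomics are needed in this arrangement: threads within a block synchronize with `__syncthreads()`, and the boundary between the two launches serves as the synchronization point across blocks. The final pass here reuses the same kernel with one block, which is just one way to do the final accumulation.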
#3 |
Apr 2014
5×17 Posts |
Thanks, I figured that was the case.
|