msieve LA on CUDA
I finally got a chance to try the msieve LA (linear algebra) code on an NVIDIA P100. It solved a 4.5M matrix in 2 hours using 5.8 GB of GPU memory. The code needs work to bring it up to date, but it looks promising on P100 and V100.
[PASTEBIN]vivfGGRS[/PASTEBIN] |
Have you tried an MPI GPU version? I'm just concerned about the memory usage for this small a matrix.
|
The existing CUDA code is definitely not MPI aware; each MPI process can use a GPU for a smaller matrix multiply, but data transfers to and from the GPU would be required for every such operation. I've never even tried using it, so the odds are 100% that it is broken.
A better implementation would host the data buffers on GPU at all times and do direct copies from one GPU to another. Latter-day CUDA makes this possible but it has to be explicitly set up. |
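For anyone wondering what "explicitly set up" involves, here is a minimal sketch of a direct GPU-to-GPU copy using the CUDA runtime API. It is not msieve code: the buffer names, the buffer size and the use of devices 0 and 1 are assumptions made for illustration, and a real implementation would keep the matrix and vector buffers resident across iterations rather than allocate them per run.

```cuda
// Sketch only: copy a buffer from GPU 0 to GPU 1 without staging it
// through a host buffer in application code. Build with: nvcc p2p.cu -o p2p
#include <cstdio>
#include <cstdlib>
#include <cstdint>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(1);                                                  \
        }                                                             \
    } while (0)

int main() {
    int ndev = 0;
    CHECK(cudaGetDeviceCount(&ndev));
    if (ndev < 2) {
        printf("this sketch needs at least two GPUs\n");
        return 0;
    }

    // Ask whether each device can address the other's memory directly.
    int can01 = 0, can10 = 0;
    CHECK(cudaDeviceCanAccessPeer(&can01, 0, 1));
    CHECK(cudaDeviceCanAccessPeer(&can10, 1, 0));

    const size_t n = 1 << 20;                 // hypothetical vector length
    const size_t bytes = n * sizeof(uint64_t);
    uint64_t *buf0 = nullptr, *buf1 = nullptr;

    // Allocate and fill the source buffer on device 0.
    CHECK(cudaSetDevice(0));
    if (can01) CHECK(cudaDeviceEnablePeerAccess(1, 0));
    CHECK(cudaMalloc(&buf0, bytes));
    CHECK(cudaMemset(buf0, 0xAB, bytes));

    // Allocate the destination buffer on device 1.
    CHECK(cudaSetDevice(1));
    if (can10) CHECK(cudaDeviceEnablePeerAccess(0, 0));
    CHECK(cudaMalloc(&buf1, bytes));

    // Device-to-device copy. With peer access enabled this goes over
    // PCIe or NVLink directly; otherwise the runtime stages it internally.
    CHECK(cudaMemcpyPeer(buf1, 1, buf0, 0, bytes));
    CHECK(cudaDeviceSynchronize());

    // Copy the result back to the host only to verify the transfer.
    std::vector<uint64_t> host(n);
    CHECK(cudaMemcpy(host.data(), buf1, bytes, cudaMemcpyDeviceToHost));
    bool ok = true;
    for (size_t i = 0; i < n; i++)
        if (host[i] != 0xABABABABABABABABULL) { ok = false; break; }
    printf("peer access 0->1: %d, 1->0: %d, copy %s\n",
           can01, can10, ok ? "ok" : "FAILED");

    CHECK(cudaFree(buf1));
    CHECK(cudaSetDevice(0));
    CHECK(cudaFree(buf0));
    return ok ? 0 : 1;
}
```

Whether peer access is available depends on the hardware topology, so the sketch checks `cudaDeviceCanAccessPeer` first and still works without it, just more slowly.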
Hey Greg, any new updates on the above? TIA
|