![]() |
![]() |
#1 |
"Oliver"
Mar 2005
Germany
11·101 Posts |
![]()
Hello,
can somebody give me a hint how to calculate a division / remainder with software integers in an efficent way? Size of the numbers: up to ~80/160 bits (Trialfactoring of factors up to ~80 bits) Limitations: - (virtually) NO if/then/else! - only 16/32bit integers - only single precission floats (non IEEE rounding) - no assembler (no carry flag, ...), only C if/then/else is OK if (and only if) the code takes in >99.9% of the cases (different input data) the same code path. Background: I'm thinking of trial factoring on graphics cards (Nvidia). There are so many limitations for running calculations on Nvidia cards but TF might be possible ;) Currently I've done a quick&dirty implemantation with the limitations above which runs on my main CPU (C2D @ 3GHz). For 65+ bit factors the performance is about 60 times slower than prime95. I think if I can improve the speed to lets say "only" 10-15 times slower than prime95 it might be worth to try it on the GFX. e.g.: Geforce 8800GTX (128 shader units, 768MiB memory) some recommendations for optimal performance: - at least 6 threads per shader unit to hide the latency of the memory (so we will need 768 threads <=> only 1MiB per thread available) - these threads should be strictly SIMD otherwise the threads won't be done in parallel... that's why I've said no if/then/else. Loops are OK if (and only if) the number ot iterations for each thread are the same. TheJudger P.S. I'm pretty sure that the lack of double precission isn't the main reason for not doing LL on the current GFX cards. The limitations mentioned above should be even more problematic... we'll need a thread-parallel multiplication algorithm which is SIMD for all threads and which doesn't need much communication between the threads... |
![]() |
![]() |
![]() |
#2 |
Nov 2003
22×5×373 Posts |
![]() |
![]() |
![]() |
![]() |
#3 |
Dec 2005
22×23 Posts |
![]()
Why not just use GMP and test a different number in each thread? You could be doing 768 tests in parallel.
Last fiddled with by ShiningArcanine on 2007-10-18 at 03:19 |
![]() |
![]() |
![]() |
#4 |
Nov 2003
22·5·373 Posts |
![]() |
![]() |
![]() |
![]() |
#5 | |
"Oliver"
Mar 2005
Germany
11·101 Posts |
![]() Quote:
I'm pretty sure the GMP won't fit to the limitations... |
|
![]() |
![]() |
![]() |
Thread Tools | |
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
Sublinear complexity of trial division? | yih117 | Math | 5 | 2018-02-02 02:49 |
Mersenne trial division implementation | mathPuzzles | Math | 8 | 2017-04-21 07:21 |
trial division over a factor base | Peter Hackman | Factoring | 7 | 2009-10-26 18:27 |
P95 trial division strategy | SPWorley | Math | 8 | 2009-08-24 23:26 |
Need GMP trial-division timings | ewmayer | Factoring | 7 | 2008-12-11 22:12 |