20071017, 20:40  #1 
"Oliver"
Mar 2005
Germany
11×101 Posts 
division/remainder algorithm (trial factoring)
Hello,
can somebody give me a hint how to calculate a division / remainder with software integers in an efficent way? Size of the numbers: up to ~80/160 bits (Trialfactoring of factors up to ~80 bits) Limitations:  (virtually) NO if/then/else!  only 16/32bit integers  only single precission floats (non IEEE rounding)  no assembler (no carry flag, ...), only C if/then/else is OK if (and only if) the code takes in >99.9% of the cases (different input data) the same code path. Background: I'm thinking of trial factoring on graphics cards (Nvidia). There are so many limitations for running calculations on Nvidia cards but TF might be possible ;) Currently I've done a quick&dirty implemantation with the limitations above which runs on my main CPU (C2D @ 3GHz). For 65+ bit factors the performance is about 60 times slower than prime95. I think if I can improve the speed to lets say "only" 1015 times slower than prime95 it might be worth to try it on the GFX. e.g.: Geforce 8800GTX (128 shader units, 768MiB memory) some recommendations for optimal performance:  at least 6 threads per shader unit to hide the latency of the memory (so we will need 768 threads <=> only 1MiB per thread available)  these threads should be strictly SIMD otherwise the threads won't be done in parallel... that's why I've said no if/then/else. Loops are OK if (and only if) the number ot iterations for each thread are the same. TheJudger P.S. I'm pretty sure that the lack of double precission isn't the main reason for not doing LL on the current GFX cards. The limitations mentioned above should be even more problematic... we'll need a threadparallel multiplication algorithm which is SIMD for all threads and which doesn't need much communication between the threads... 
20071017, 20:52  #2 
Nov 2003
2^{2}×5×373 Posts 

20071018, 03:18  #3 
Dec 2005
2^{2}·23 Posts 
Why not just use GMP and test a different number in each thread? You could be doing 768 tests in parallel.
Last fiddled with by ShiningArcanine on 20071018 at 03:19 
20071018, 13:59  #4 
Nov 2003
2^{2}×5×373 Posts 

20071018, 19:01  #5  
"Oliver"
Mar 2005
Germany
11·101 Posts 
Quote:
I'm pretty sure the GMP won't fit to the limitations... 

Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Sublinear complexity of trial division?  yih117  Math  5  20180202 02:49 
Mersenne trial division implementation  mathPuzzles  Math  8  20170421 07:21 
trial division over a factor base  Peter Hackman  Factoring  7  20091026 18:27 
P95 trial division strategy  SPWorley  Math  8  20090824 23:26 
Need GMP trialdivision timings  ewmayer  Factoring  7  20081211 22:12 