mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Math (https://www.mersenneforum.org/forumdisplay.php?f=8)
-   -   division/remainder algorithm (trial factoring) (https://www.mersenneforum.org/showthread.php?t=9487)

TheJudger 2007-10-17 20:40

division/remainder algorithm (trial factoring)
 
Hello,

can somebody give me a hint how to calculate a division / remainder with software integers in an efficent way?
Size of the numbers: up to ~80/160 bits (Trialfactoring of factors up to ~80 bits)

Limitations:
- (virtually) NO if/then/else!
- only 16/32bit integers
- only single precission floats (non IEEE rounding)
- no assembler (no carry flag, ...), only C

if/then/else is OK if (and only if) the code takes in >99.9% of the cases (different input data) the same code path.

Background: I'm thinking of trial factoring on graphics cards (Nvidia).
There are so many limitations for running calculations on Nvidia cards but TF might be possible ;)

Currently I've done a quick&dirty implemantation with the limitations above which runs on my main CPU (C2D @ 3GHz). For 65+ bit factors the performance is about 60 times slower than prime95. I think if I can improve the speed to lets say "only" 10-15 times slower than prime95 it might be worth to try it on the GFX.

e.g.: Geforce 8800GTX (128 shader units, 768MiB memory)
some recommendations for optimal performance:
- at least 6 threads per shader unit to hide the latency of the memory (so we will need 768 threads <=> only 1MiB per thread available)
- these threads should be strictly SIMD otherwise the threads won't be done in parallel... that's why I've said no if/then/else. Loops are OK if (and only if) the number ot iterations for each thread are the same.

TheJudger

P.S. I'm pretty sure that the lack of double precission isn't the main reason for not doing LL on the current GFX cards. The limitations mentioned above should be even more problematic... we'll need a thread-parallel multiplication algorithm which is SIMD for all threads and which doesn't need much communication between the threads...

R.D. Silverman 2007-10-17 20:52

[QUOTE=TheJudger;116554]Hello,

can somebody give me a hint how to calculate a division / remainder with software integers in an efficent way?
Size of the numbers: up to ~80/160 bits (Trialfactoring of factors up to ~80 bits)

...[/QUOTE]

Read

Knuth, The Art of Computer Programming, Vol II

ShiningArcanine 2007-10-18 03:18

Why not just use GMP and test a different number in each thread? You could be doing 768 tests in parallel.

R.D. Silverman 2007-10-18 13:59

[QUOTE=ShiningArcanine;116577]Why not just use GMP and test a different number in each thread? You could be doing 768 tests in parallel.[/QUOTE]

We must have read different posts. I could swear that the O.P. asked
"how to calculate" and not "what code do I use". :missingteeth:

TheJudger 2007-10-18 19:01

[QUOTE=ShiningArcanine;116577]Why not just use GMP and test a different number in each thread? You could be doing 768 tests in parallel.[/QUOTE]

Yes, that's what I want to do.. test alot of possible factors for a single exponent at the same time.

I'm pretty sure the GMP won't fit to the limitations...


All times are UTC. The time now is 16:46.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2022, Jelsoft Enterprises Ltd.