![]() |
![]() |
#1 |
Oct 2006
Berlin, Germany
72×13 Posts |
![]()
Hello,
I run gmp-ecm with B1=43M and different input number length. Afterwards I made a chart of estimated memory (red line) and measured run time (blue line). X: number length Estimated memory is nearly linear increasing, this shows also the chart for different B1. But why is there a big jump in the line? Runtime seems to grow potencially (bah, what is the right engl. word?). Can somobod confirm or explain it? yoyo |
![]() |
![]() |
![]() |
#2 |
Bamboozled!
"๐บ๐๐ท๐ท๐ญ"
May 2003
Down not across
262378 Posts |
![]() |
![]() |
![]() |
![]() |
#3 |
(loop (#_fork))
Feb 2006
Cambridge, England
144668 Posts |
![]()
This isn't an exponential-growth case. I can't think of a nice English word for "as some power of the input"; 'polynomially' suggests it's an integer exponent which isn't the case.
|
![]() |
![]() |
![]() |
#4 | |
"Bob Silverman"
Nov 2003
North of Boston
1D2816 Posts |
![]() Quote:
A Practical Analysis of ECM Math. Comp. |
|
![]() |
![]() |
![]() |
#5 |
"William"
May 2003
New Haven
22×593 Posts |
![]()
Bob's paper would answer your question, but it's overkill for you needs, it's hard to get hold of if you aren't associated with a university, and it has an error that made it hard, at least for me, to follow (although not relevant to your question).
The jump in memory happens when the granularity of the FFT changes. Large numbers are represented as arrays of numbers - much like decimal numbers are arrays of digits. As the number gets larger, the maximum size permitted in each array element gets smaller. To make up some numbers for the sake up discussion, as the size goes from 100 to 500 digits, you are using 3 digits per element, and the memory grows because you need 34 elements for the 100 digit numbers and 167 elements for the 500 digit number. But between 500 and 600 digits you must shift to 2 digits per element, so 600 digits requires arrays of length 300 - much longer than the 200 you would get from the earlier growth. It then grows slowly until the next jump. The change in granularity is necessary because of how FFT multiplication deals with carrying. When we do multiplication by longhand, we treat the numbers as arrays of single digits and we integrate the carries in two stages. But FFT multiplication computes a convolution - this is as if we kept the results of each multiply (for 5x5, keeping the whole 25 instead of keeping the 5 and carrying the 2), AND in the addition stage we keep the whole totals instead of keeping the last digit and carrying the rest. Hence the convolution of 153 x 718 would be Code:
1 5 3 7 1 8 ---------- 8 40 24 1 5 3 7 35 21 --------------- 7 36 34 43 24 From which you could then do the carries as 24 43 34 36 7 ------ 109854 To get the product William |
![]() |
![]() |
![]() |
#6 |
"Nancy"
Aug 2002
Alexandria
2,467 Posts |
![]()
What is the unit of "length" and on what kind of system (in particular, 32 bit or 64 bit) were the timings produced?
Alex |
![]() |
![]() |
![]() |
#7 | |
Oct 2006
Berlin, Germany
72×13 Posts |
![]() Quote:
It was produced on a 64bit Suse Linux i7 processor: http://www.rechenkraft.net/yoyo/show...hp?hostid=8036 yoyo |
|
![]() |
![]() |
![]() |
#8 |
"Nancy"
Aug 2002
Alexandria
9A316 Posts |
![]()
Consider stage 1 and stage 2 separately, their run times behave rather differently for increasing input number size.
For relatively small input, usually bounded by GMP's MUL_TOOM22_THRESHOLD (28 by default on K8/K10 systems), the multiplication is simply quadratic time grammar-school, so in theory the time grows like the square of the number of words in the input. In practice the growth is a little slower due to overhead. In GMP-ECM 6.3 we also switch from GMP-ECM's own multiplication functions to GMP's which are faster for operands of more than 15-20 words (depending on CPU), so the growth will again slow down a bit, even though each algorithm is quadratic time. Above MUL_TOOM22_THRESHOLD, GMP uses asymptotically faster algorithms like Karatsuba with complexity O(n^1.59), Toom-3, etc, and eventually Schรถnhage-Strassen with O(n log n loglog n). So for large enough input, the stage 1 time eventually becomes quasi-linear. In stage 2 the time is mostly linear in the size of the inputs. There is some modular arithmetic, but the time for polynomial multiplication normally dominates. With NTT, that time is O(deg * n^2) for conversion between Z/NZ and NTT primes, and O(n * deg * log(deg)) for the transforms. The initialisation of the NTT primes is cubic time however, which will dominate for large enough input. Without NTT, stage 2 uses Schรถnhage-Strassen which results in quasi-polynomial time for pretty much all input sizes, at the cost of using more memory. So, sadly, there's no easy "grows like n^c" answer. Stage 1 is quasi-linear eventually, but almost quadratic for most numbers that we try to factor. Stage 2 becomes cubic(!) eventually when using NTT, but is pretty much linear for numbers and parameters we typically use, and stays pretty much linear if you don't use the NTT. For stage 1, here's a graph for 1, ..., 23 words with B1=1M on a Core 2 2133 MHz.The graph fits a 51x2+106x+982 function pretty well up to 20 words, where a jump occurs since we switch to GMP functions at that point which are a no faster than our own on Core 2 but incur more overhead (Note: On AMD cpus we switch to GMP functions earlier, because they actually are faster there). I haven't made a graph for stage 2 times yet, but I expect that for reasonable B2 it will be pretty linear in the number of NTT primes, which is more-or-less proportional to the number of words in the modulus. Alex Last fiddled with by akruppa on 2010-04-09 at 17:17 |
![]() |
![]() |
![]() |
Thread Tools | |
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
Is ECC memory needed? | joe a | Hardware | 35 | 2016-11-07 22:06 |
Predicting QS and NFS runtime | jasonp | Factoring | 2 | 2011-03-07 01:22 |
Memory Needed | Unregistered | Information & Answers | 2 | 2009-09-20 15:21 |
ECM Runtime and F20 | D. B. Staple | Factoring | 11 | 2007-12-12 16:52 |
amount of memory needed for Stage 2 (factor) | pin | Software | 13 | 2003-03-18 19:54 |