mersenneforum.org gmp-ecm needed memory and runtime
 Register FAQ Search Today's Posts Mark Forums Read

 2010-04-06, 20:49 #1 yoyo     Oct 2006 Berlin, Germany 72×13 Posts gmp-ecm needed memory and runtime Hello, I run gmp-ecm with B1=43M and different input number length. Afterwards I made a chart of estimated memory (red line) and measured run time (blue line). X: number length Estimated memory is nearly linear increasing, this shows also the chart for different B1. But why is there a big jump in the line? Runtime seems to grow potencially (bah, what is the right engl. word?). Can somobod confirm or explain it? yoyo Attached Thumbnails
2010-04-07, 07:14   #2
xilman
Bamboozled!

"𒉺𒌌𒇷𒆷𒀭"
May 2003
Down not across

11,423 Posts

Quote:
 Originally Posted by yoyo Runtime seems to grow potencially (bah, what is the right engl. word?).
"Exponentially".

Paul

 2010-04-07, 07:48 #3 fivemack (loop (#_fork))     Feb 2006 Cambridge, England 2×7×461 Posts This isn't an exponential-growth case. I can't think of a nice English word for "as some power of the input"; 'polynomially' suggests it's an integer exponent which isn't the case.
2010-04-07, 13:52   #4
R.D. Silverman

"Bob Silverman"
Nov 2003
North of Boston

23×3×311 Posts

Quote:
 Originally Posted by yoyo Hello, I run gmp-ecm with B1=43M and different input number length. Afterwards I made a chart of estimated memory (red line) and measured run time (blue line). X: number length Estimated memory is nearly linear increasing, this shows also the chart for different B1. But why is there a big jump in the line? Runtime seems to grow potencially (bah, what is the right engl. word?). Can somobod confirm or explain it? yoyo
read my joint paper with Sam Wagstaff:

A Practical Analysis of ECM
Math. Comp.

 2010-04-07, 16:09 #5 wblipp     "William" May 2003 New Haven 237210 Posts Bob's paper would answer your question, but it's overkill for you needs, it's hard to get hold of if you aren't associated with a university, and it has an error that made it hard, at least for me, to follow (although not relevant to your question). The jump in memory happens when the granularity of the FFT changes. Large numbers are represented as arrays of numbers - much like decimal numbers are arrays of digits. As the number gets larger, the maximum size permitted in each array element gets smaller. To make up some numbers for the sake up discussion, as the size goes from 100 to 500 digits, you are using 3 digits per element, and the memory grows because you need 34 elements for the 100 digit numbers and 167 elements for the 500 digit number. But between 500 and 600 digits you must shift to 2 digits per element, so 600 digits requires arrays of length 300 - much longer than the 200 you would get from the earlier growth. It then grows slowly until the next jump. The change in granularity is necessary because of how FFT multiplication deals with carrying. When we do multiplication by longhand, we treat the numbers as arrays of single digits and we integrate the carries in two stages. But FFT multiplication computes a convolution - this is as if we kept the results of each multiply (for 5x5, keeping the whole 25 instead of keeping the 5 and carrying the 2), AND in the addition stage we keep the whole totals instead of keeping the last digit and carrying the rest. Hence the convolution of 153 x 718 would be Code:  1 5 3 7 1 8 ---------- 8 40 24 1 5 3 7 35 21 --------------- 7 36 34 43 24 From which you could then do the carries as 24 43 34 36 7 ------ 109854 To get the product The problem is that the individual array elements must be large enough to hold the column sums before the carry stage. As the array gets longer, the potential size of the sums grows. When the sum threatens to overflow, you must make the individual elements smaller. William
 2010-04-07, 16:13 #6 akruppa     "Nancy" Aug 2002 Alexandria 1001101000112 Posts What is the unit of "length" and on what kind of system (in particular, 32 bit or 64 bit) were the timings produced? Alex
2010-04-07, 16:42   #7
yoyo

Oct 2006
Berlin, Germany

72·13 Posts

Quote:
 Originally Posted by akruppa What is the unit of "length" and on what kind of system (in particular, 32 bit or 64 bit) were the timings produced? Alex
Length=number of digits of the input number
It was produced on a 64bit Suse Linux i7 processor:
http://www.rechenkraft.net/yoyo/show...hp?hostid=8036

yoyo

 2010-04-09, 16:48 #8 akruppa     "Nancy" Aug 2002 Alexandria 2,467 Posts Consider stage 1 and stage 2 separately, their run times behave rather differently for increasing input number size. For relatively small input, usually bounded by GMP's MUL_TOOM22_THRESHOLD (28 by default on K8/K10 systems), the multiplication is simply quadratic time grammar-school, so in theory the time grows like the square of the number of words in the input. In practice the growth is a little slower due to overhead. In GMP-ECM 6.3 we also switch from GMP-ECM's own multiplication functions to GMP's which are faster for operands of more than 15-20 words (depending on CPU), so the growth will again slow down a bit, even though each algorithm is quadratic time. Above MUL_TOOM22_THRESHOLD, GMP uses asymptotically faster algorithms like Karatsuba with complexity O(n^1.59), Toom-3, etc, and eventually Schönhage-Strassen with O(n log n loglog n). So for large enough input, the stage 1 time eventually becomes quasi-linear. In stage 2 the time is mostly linear in the size of the inputs. There is some modular arithmetic, but the time for polynomial multiplication normally dominates. With NTT, that time is O(deg * n^2) for conversion between Z/NZ and NTT primes, and O(n * deg * log(deg)) for the transforms. The initialisation of the NTT primes is cubic time however, which will dominate for large enough input. Without NTT, stage 2 uses Schönhage-Strassen which results in quasi-polynomial time for pretty much all input sizes, at the cost of using more memory. So, sadly, there's no easy "grows like n^c" answer. Stage 1 is quasi-linear eventually, but almost quadratic for most numbers that we try to factor. Stage 2 becomes cubic(!) eventually when using NTT, but is pretty much linear for numbers and parameters we typically use, and stays pretty much linear if you don't use the NTT. For stage 1, here's a graph for 1, ..., 23 words with B1=1M on a Core 2 2133 MHz.The graph fits a 51x2+106x+982 function pretty well up to 20 words, where a jump occurs since we switch to GMP functions at that point which are a no faster than our own on Core 2 but incur more overhead (Note: On AMD cpus we switch to GMP functions earlier, because they actually are faster there). I haven't made a graph for stage 2 times yet, but I expect that for reasonable B2 it will be pretty linear in the number of NTT primes, which is more-or-less proportional to the number of words in the modulus. Alex Attached Thumbnails   Last fiddled with by akruppa on 2010-04-09 at 17:17

 Similar Threads Thread Thread Starter Forum Replies Last Post joe a Hardware 35 2016-11-07 22:06 jasonp Factoring 2 2011-03-07 01:22 Unregistered Information & Answers 2 2009-09-20 15:21 D. B. Staple Factoring 11 2007-12-12 16:52 pin Software 13 2003-03-18 19:54

All times are UTC. The time now is 00:47.

Fri Aug 12 00:47:34 UTC 2022 up 35 days, 19:34, 2 users, load averages: 1.20, 1.07, 1.11