mersenneforum.org Sieving with NewPGen

2005-10-02, 03:51   #34
TTn

234316 Posts

RMA.NET speed

RMA.NET timings are faster than other methods. I've used Joss's 15k example to compare a small range of 10 thousand n.

k = 210885
nMin = 60000
nMax = 70000
----------------------------
RMA.NET: 5 1/12 hours
----------------------------
Sieve until the removal rate equals the time of an average test, then test the file: 5 7/60 hours
---------------------------
Sieve until the removal rate is double the time of an average LLR test, then test the file: 5 43/180 hours
---------------------------

I fear that your theory will only get worse with a larger file. RMA.NET does not split off a band; it tests with LLR at 1% CPU while NewPGen is sieving at 99% CPU. Only when a test is complete is it split off individually.

As it stands, RMA.NET is the most efficient, since it can continually measure the progress and change accordingly. An optimization will not be in the method, but in the hardware resourcing.

Last fiddled with by TTn on 2005-10-02 at 03:55
2005-10-02, 16:04   #35
Thomas11

Feb 2003

22×32×53 Posts

I have written a few tools for LLR which you might be interested in.
You can use them to get very accurate information on the FFT lengths used and on timings. The source code is attached, and I will provide Windows binaries in a separate post below. Linux users should use the source code: simply run make on it...

Here is the content of the README file:

Code:
README file for LLRtools by Thomas Ritschel
-------------------------------------------

This is just a collection of routines to get information about the FFT lengths used by LLR
and to estimate running times (total and averaged).  Originally made for the 321search
project, it has been modified to also work for general k.  Please feel free to develop it
further.  Perhaps Jean Penne and/or George Woltman could add some of the features
to future versions of LLR/PRP.

General information:
--------------------

There are the following files:

llrtools.c  - the tools collection
llrtools.h  - header file for llrtools.c

fft_len.c   - lists the used FFT lengths for given (nmin,nmax) range (fixed k)
av_time.c   - estimates the average time per LLR test (fixed k)
get_time.c  - estimates total and average times for given LLR input file

The following two files are also required:

maxlen.txt  - table of FFT lengths and associated max. Mersenne exponents (nmers)
times.txt   - list of timings (msecs) for different FFT lengths

Depending on your CPU (SSE2 or non-SSE2), you will need to copy either "maxlen_P4.txt"
or "maxlen_Athlon.txt" to "maxlen.txt".  (The other two files, "maxlen_P4_proth.txt" and
"maxlen_Athlon_proth.txt", contain information about the FFT lengths and associated
max. Mersenne exponents for Proth tests, i.e. k*2^n+1, but this hasn't been tested so far...)

To get correct timings you will need to parametrize the file times.txt for your
specific hardware.  There are examples given for two Athlons (1 and 2 GHz) and Opteron.
There is no need to get timings for any kind of FFT length.  It's sufficient to have
data on those FFT lengths covered by your range of n.

For small and medium sized k (typically k < 2^20), you should get very accurate results
on the FFT lengths and the timings.  In cases where LLR uses zero-padding (which is the
case for very large k), the results may be off by about 10 percent, since the algorithm
used by George Woltman's gwnum is much more sophisticated and needs additional
information that we do not have here.

On the computation of total and average times:
----------------------------------------------

If we assume uniform sampling of our candidates in a given range of n, the total time
can be approximated by integrating the function t(n), and the average time by dividing
that integral by the width of the range:

                   1          n_max
t_average = --------------- * Integral t(n) dn
             n_max - n_min    n=n_min

Generally t(n) is neither a parabola nor a linear function.  Rather, it is a
discontinuous function, piecewise linear but with steps at the FFT turnover points.
We can get the area under this function by integrating it piecewise.

For fixed FFT length we'll have:

t(n) = n * msecs_per_iteration(n) ,

where msecs_per_iteration(n) is a constant value depending on the FFT length.
We will denote it by m(i), where i is an index for the FFT length.

By splitting the integral into parts at the FFT turnover points we get the following
(after integration):

                      1
t_average = ----------------- * [ m(i_max)*n_max^2 - m(i_min)*n_min^2
            2 (n_max - n_min)
                                      i_max-1
                                    -   Sum    (m(i+1) - m(i)) * n(i)^2 ]
                                      i=i_min

Here, n(i) is the max. n for FFT length i, and i_min and i_max are the indices of the
min. and max. FFT lengths used.

Therefore we need information at which n the FFT length will change as well as the
times per iteration for each of the FFT lengths.

Typical usage:
--------------

- These are command line tools, so run them manually from the command line rather than
double-clicking on them.  You wouldn't see any screen output in the latter case.

- First run fft_len to get some insight into the FFT lengths for the (nmin,nmax) range:

fft_len

Sample screen output:

k = 3
n(min) = 2000000
n(max) = 5000000

The following FFT lengths would be used:

fftlen       nmax
-----------------------
114688    2233110
131072    2560126
163840    3180158
196608    3777190
229376    4411222
262144    5056254

- Then check whether all the necessary FFT lengths are covered by the file "times.txt".
If not, get the msecs/iteration by running LLR for a few seconds or minutes on some
appropriate (k,n) pairs.

- Run av_time to get the average time for a given range of n.  If there is insufficient
information on the timings, you will get an error.

av_time

Sample screen output:

--- Athlon 1GHz ---
k = 3
n(min) = 2000000
n(max) = 5000000

average time per LLR test: 90019.175 sec

- If you already have an input file (perhaps partially sieved) for LLR, you may want to
see how long it will take to run LLR on the whole file.
This information can be obtained by running get_time.  This program takes the file name
of your sieve file as a command line parameter, so simply run it like this:

get_time input.txt          (replace "input.txt" by your file name)

Sample screen output:

--- Athlon 1GHz ---
number of (k,n) pairs in file: 9357
estimated total time for LLR testing the whole file: 100376901.970 sec
average time per LLR test: 10727.466 sec
Attached Files
 llrtools_src.zip (10.9 KB, 345 views)

Last fiddled with by Thomas11 on 2005-10-02 at 16:06

2005-10-02, 16:09   #36
Thomas11

Feb 2003

22·32·53 Posts

Okay, here come the Windows binaries, compiled with the Watcom compiler under Win98.
Attached Files
 llrtools_win32.zip (90.2 KB, 375 views)

 2005-10-02, 18:24 #37 TTn   22·3·83 Posts Hi Thomas, Thanks for the details and specs. I should be able to make some use of this. Right on, Bro! Shane F.
2005-10-03, 13:26   #38
fatphil

May 2003

3×7×11 Posts

Quote:
 Originally Posted by VBCurtis Phil's explanation makes a lot of sense to me, but I think the effect I tried to explain without rigorous math is not taken into account.
Indeed. Quite what effect the change in sieving range has, I really don't want to guess. However, I get a feeling that it could encourage the transition to PRP-ing, i.e. you sieve more shallowly than my above model indicates. This is because moving the lowest band into the PRP-ing state speeds up sieving for the rest of the candidates.

TTn's continuous comparison should help keep the process close to optimal. If his moving average is smooth enough then the deviation from optimal should be negligible. That's one of the rules of thumb when it comes to multi-stage computational tasks - if it's hard to model on paper then simply program the system with adaptive behaviour such that it finds its own local optimum.

Phil

2005-10-04, 05:15   #39
TTn

22·3·347 Posts

Quote:
 TTn's continuous comparison should help keep the process close to optimal. If his moving average is smooth enough then the deviation from optimal should be negligible. That's one of the rules of thumb when it comes to multi-stage computational tasks - if it's hard to model on paper then simply program the system with adaptive behaviour such that it finds its own local optimum.
The moving average can be modified by the user, with the timer option, in the advanced section. By default it is set to 1 hour, for titanic primes.

I set the timer to 5 minutes for the timings.
This is a little smoother, but costs 4-15 seconds depending on the size of the file, PC speed, etc.

Last fiddled with by TTn on 2005-10-04 at 05:16

2005-10-06, 05:30   #40
VBCurtis

"Curtis"
Feb 2005
Riverside, CA

4,789 Posts

I only claimed that when trying to decide if splitting off a band of numbers was worthwhile versus waiting until the usual sieve time = average LLR time remaining, you should not simply LLR a number when THAT number's LLR time was equal to the sieve time.

If you are looking to sieve and LLR a small range of exponents, the traditional average sieve = average LLR works best. If you are running a range from, say, 100k to 1 million, waiting until the sieve reaches the average time for that entire range (something like the time for n=700k) would cost you time over breaking off, say, a 100-200K piece at some point. I was only trying to take a stab at the proper time to break off a piece.

Phil's point is well taken: sieve somewhere close, and it really doesn't matter. I've likely spent more time typing these attempts to explain than my computer would save from any such optimization!

Cheers,
Curtis
2005-10-06, 06:45   #41
TTn

2·34·29 Posts

Breaking off a piece can be a costly thing too, as it widens the possible error margin. RMA.NET breaks off 1% CPU to measure the efficiency of the process. This accounts for large file sizes and hardware resources. The timer could use a slight optimization: only if the rates are equal will LLR receive 99% priority.

You can do this manually without RMA in the same way. Set NewPGen to log the numbers removed, so we can count the progress in the "NewPGen.del" file. Find the "lresults.txt" file to count the completed tests.

Alg:
1. Test and sieve the same file at once.
2. Every x hours/day, stop LLR and NewPGen.
3. Compare the count of the NewPGen.del file versus lresults.txt, and apply a handicap to the weaker one: (log count * 99) or (results count * 99).
4. Change LLR's priority from 1-3 if warranted (remember it for the next handicap): 1 for NewPGen priority, 2 for equal priority, and 3 for LLR priority.
5. Delete numbers that have already been tested from the sieve file. The completed numbers are, of course, in lresults.txt.
6. Start the alg over.

Last fiddled with by TTn on 2005-10-06 at 06:54
2005-10-27, 10:28   #42
fatphil

May 2003

3·7·11 Posts

Quote:
 Originally Posted by paulunderwood Phil's approach is better... I did say the model I suggested was an over-simplification but it works well enough on a very large "n" range, spanning many FFT changes... When investigating FFT changes, please note the FFT changes are different for non-SSE2 and SSE2-enabled computers
My recent back-filling of non-top-5000 primes has given me a good chance to investigate sieving strategy, and my advice _really_ falls to pieces as the number of FFT bands decreases. It doesn't become wrong, it just becomes inapplicable.

Basically, by the time you're down to only a few FFT ranges, the rate of decrease in PRP rate through PRP-ing larger and larger candidates is smaller than the rate of decrease in sieving rate caused by the same PRP-ing.
i.e. PRP-ing is making sieving much slower, and thus redundant, as it's noticeably decreasing the size of the remaining pool.

An adaptive technique would of course pick up on this trend, and, if possible, should be favoured.

