mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Software (https://www.mersenneforum.org/forumdisplay.php?f=10)

 rogue 2023-03-25 21:53

[QUOTE=storm5510;627328][CODE]...287 factors found at 234 sec per factor (last 163 min)...[/CODE]

From [I]srsieve2[/I]: What is this time keeping method? It is certainly not real-time like a clock, other than the elapsed time at the end. :ermm:[/QUOTE]

It tries to take CPU utilization into account. So if you removed 10 terms in 20 minutes then you would think that it would be 1 factor per 120 seconds, but if CPU utilization is only 50%, then it would compute 1 factor per 60 seconds.

I'm sure it isn't perfect.

 storm5510 2023-03-26 15:25

[QUOTE=rogue;627352]It tries to take CPU utilization into account. So if you removed 10 terms in 20 minutes then you would think that it would be 1 factor per 120 seconds, but if CPU utilization is only 50%, then it would compute 1 factor per 60 seconds.

I'm sure it isn't perfect.[/QUOTE]

I read a short article about CPU time vs. real time last week.

[QUOTE]For example, if a program accesses the CPU for one second every five seconds, then its total CPU time in the span of one minute is 12 seconds.[/QUOTE]

The above is pretty cut-and-dried. Perhaps the following mod might un-muddy the water a bit for others:

[CODE]...22 factors found at 214 [B]cpu[/B] sec per factor... [/CODE]

 Citrix 2023-03-26 21:02

Feature Request

Is it possible in the next release we could have a command line option to limit primes being tested to certain classes (similar to pfgw -f{n,+-1}). The program currently does not automatically catch these.

Thanks.

 henryzz 2023-03-27 09:54

It would also be nice to be able to see the sec/factor for shorter periods than from the start. Starting a sieve and then looking at average sec/factor makes no sense.

p/sec also seems to be based on time elapsed rather than cpu seconds. Some consistency would make sense.

 rogue 2023-03-27 12:14

[QUOTE=Citrix;627390]Is it possible in the next release we could have a command line option to limit primes being tested to certain classes (similar to pfgw -f{n,+-1}). The program currently does not automatically catch these. [/QUOTE]

Is this for all sieves or one some sieves?

This could be a fairly major effort.

 rogue 2023-03-27 12:22

[QUOTE=henryzz;627411]It would also be nice to be able to see the sec/factor for shorter periods than from the start. Starting a sieve and then looking at average sec/factor makes no sense.

p/sec also seems to be based on time elapsed rather than cpu seconds. Some consistency would make sense.[/QUOTE]

The program tries to compute sec/factor in a way that includes enough time to represent a meaningful average. Obviously the more time then the better the average. This time can be up to 5 days. The best way to compute is to increase the period of time for the calculation as the removal rate slows down.

If you have ideas on computing a better average, please share them.

As for p/sec, I see what you mean. It shouldn't be too hard to adjust.

 Citrix 2023-03-28 05:13

[QUOTE=rogue;627413]Is this for all sieves or one some sieves?

This could be a fairly major effort.[/QUOTE]

I mainly use srsieve2/srsieve2cl. I do not think the other sieves have special factor classes.

The changes would only require a few lines of code. We only need to filter out where p%n belongs to certain classes. The sieve of Eratosthenes code does not need to rewritten.

If n=2^x then the modulus step can be even faster
If n is extremely smooth (ex 5 smooth) then the modulus step can be fast as well.

 henryzz 2023-03-28 09:34

[QUOTE=rogue;627414]The program tries to compute sec/factor in a way that includes enough time to represent a meaningful average. Obviously the more time then the better the average. This time can be up to 5 days. The best way to compute is to increase the period of time for the calculation as the removal rate slows down.

If you have ideas on computing a better average, please share them.

As for p/sec, I see what you mean. It shouldn't be too hard to adjust.[/QUOTE]

When p changes by a significant factor including it in a longer average makes no sense. For many sieves p will change massively over a 5 day period. I recently did a sieve for 2k minutes(p increased by factor of 7 in sieve) which was estimating 59 sec per factor. I then stopped it and restarted for 100 minutes and the estimate was 152 seconds after 41 factors. This would have been even more extreme if I hadn't already restarted after the first 400 minutes of the sieve.

Something like the last 100 factors would make sense to me(configurable). Something like a std::deque would make it fairly easy to store the cpu time for the last x factors. My only concern is that this could be a bit dodgy when factors are being found very quickly near the beginning of a sieve. Maybe more factors should be considered then; however, the rate of factor finding will be changing rapidly then anyway so it probably doesn't matter too much.

Another option would be to report an expected sec/f based on recent p/sec. That would require the estimated f/p to be accurate. Some of the current sieves don't give accurate estimates for the number of factors. For example, I think last time I used ccsieve, it didn't adjust estimates for sieving multiple terms per candidate.

 rogue 2023-03-28 12:46

Citrix, if you have some code, send it my way and I will take a look at how to best integrate it.

henryzz, I might be able to do something along the line of what you suggested, i.e. track rate for the last xx factors as opposed to the last xx minutes, but I don't think that solves the problem. When one starts a new sieve p/sec is lower due to factor validation and other overhead associated with removing candidates due to a factor.

I think that the best option (whatever that option is) would require saving runtime and factor removal details in a file so that upon restart it could read from that file and continue as if the program was not stopped and restarted.

Since you have some concrete ideas regarding the calculation, would you mind experimenting with the code to find an algorithm that computes the rate in a way that makes more sense?

ccsieve and twinsieve both miscalculate since candidates can be removed "more than one way" for each p. If you have ideas on how to better compute the number of candidates to be removed for those sieves, please share.

 henryzz 2023-03-28 14:22

[QUOTE=rogue;627471]henryzz, I might be able to do something along the line of what you suggested, i.e. track rate for the last xx factors as opposed to the last xx minutes, but I don't think that solves the problem. When one starts a new sieve p/sec is lower due to factor validation and other overhead associated with removing candidates due to a factor.
[/QUOTE]
If you are tracking the cpu-time that each factor is found at shouldn't factors "age-out" pretty quickly if factors are being found quickly enough that validation is taking a meaningful amount of time. Is the issue that the cpu-time measurement doesn't include validation? I am struggling to see the issue. If validation time is having a significant impact still then it is very unlikely that the sieve should stop.

[QUOTE=rogue;627471]
I think that the best option (whatever that option is) would require saving runtime and factor removal details in a file so that upon restart it could read from that file and continue as if the program was not stopped and restarted.
[/QUOTE]
Saving this sort of information would enable more accurate estimation upon restart although I suspect implementation is overkill. If an estimate has to be provided on a reduced number of factors for a while that isn't too bad. The issue that exists currently is more in long runs where sec/fac has changed significantly and the average sec/fac over the run is meaningless.

[QUOTE=rogue;627471]
Since you have some concrete ideas regarding the calculation, would you mind experimenting with the code to find an algorithm that computes the rate in a way that makes more sense?
[/QUOTE]

I can have a look at experimenting. I plan to implement something for a sieve of my own soon so will probably experiment in that. My sieve doesn't verify factors(currently at least) so millage may vary a little.

[QUOTE=rogue;627471]
ccsieve and twinsieve both miscalculate since candidates can be removed "more than one way" for each p. If you have ideas on how to better compute the number of candidates to be removed for those sieves, please share.[/QUOTE]

If x% of candidates would be removed for one of the ways then I believe for n-ways approximately 1 - (1-x/100)^n candidates should be removed. This isn't perfect as it assumes independence but I believe it is probably a good enough approximation. My only concern is for very small p where the probability of removing both candidates if they were independent would be higher. My sieve sieves tuples so I should be able to test the formula there.

I have limitted time to spend on this so it may take a little while to get around to implementing everything. My aim for my siever is to support a number of different techniques for finding/removing factors so I can compare them in different situations. Tuple sieves can become very sparse which means a sieve array is not ideal but a list of candidates can take time checking if a candidate remains as well.

 Citrix 2023-03-29 04:32

[QUOTE=rogue;627471]Citrix, if you have some code, send it my way and I will take a look at how to best integrate it.
[/QUOTE]

Here is the simplest pseudocode
1. Get n from command line and classes to be tested (generally +1 or -1 or both); (if n is 2 all primes need to be tested and we do not need to filter as all primes are odd; 2 is the default value)
2. The location where il_PrimeList is initialized with primes we can insert the following lines of code to filter the primes.
(p is current prime)

[CODE]
while (prime<max)
{
if (n>2 && (p%n==+1 || p%n==-1)) {Enter this prime in the array}
else {skip this prime and go to next prime from iterator}
}
[/CODE]

3. p%n can be calculated faster for certain n values (ex. n=2^x)- though this might not provide any significant speed up. I will let you decide.

All times are UTC. The time now is 09:22.