mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Software (https://www.mersenneforum.org/forumdisplay.php?f=10)
-   -   mtsieve (https://www.mersenneforum.org/showthread.php?t=23042)

rogue 2023-04-16 15:41

The program will choose the q with the lowest value for "work". "work" is an estimate of the effort to do a discrete log for each p; the lower the "work", the more p can be tested per second. This is just an estimate, and reality is sometimes different. I recommend sieving to 1e9 (or deeper) to eliminate terms with small factors, as they will skew the results. Take the file of remaining candidates and run a range of at least 1e9 (e.g. 10e9 to 11e9) for each q that is within 20% of the q with the lowest value for work. Look at srsieve2.log or the console output to see which value for -q executed that range in the shortest period of time. That will most often be the default value for q, but not always.
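The selection step described above (time a range for every q within 20% of the lowest work estimate) can be sketched as a small helper. The work values below are invented placeholders for illustration, not real srsieve2 estimates:

```python
def candidate_qs(work_by_q, tolerance=0.20):
    """Return the q values whose estimated 'work' is within
    `tolerance` (default 20%) of the lowest work estimate."""
    lowest = min(work_by_q.values())
    cutoff = lowest * (1.0 + tolerance)
    return sorted(q for q, work in work_by_q.items() if work <= cutoff)

# Hypothetical work estimates per q (illustrative numbers only).
work = {6: 150.0, 12: 100.0, 24: 110.0, 36: 118.0, 48: 135.0}

# q=12 has the lowest work; q=24 and q=36 are within 20% of it,
# so all three are worth timing over a test range with -q.
print(candidate_qs(work))  # [12, 24, 36]
```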

As for U/V/X, those are a bit more nuanced and can impact which q has the lowest value for work. You can play around with these if you want to squeeze out more performance. Some combinations of U/V/X won't work; in other words, they might result in invalid factors (the program will terminate if that happens).

I do not have a way today to test all the various combinations to determine which is best. I have thought about adding a command line switch, but I think that would be a lot of work. For now choosing the best q/U/V/X values for each set of sequences can only be done manually.

pepi37 2023-04-16 15:50

-Q gives a report
-q can be used for user input

cxc 2023-04-17 08:39

I have a possibly odd question – are there versions of mtsieve binaries compiled for Mac Intel that can be downloaded from somewhere? I have a new machine (which is Metal) so it would be impossible to run mtsieve on the current setup as mtsieve doesn’t support Metal (yet); the older machine refuses to compile mtsieve with the old version of Xcode I have installed there, and updating Xcode doesn’t seem to fix the problem. (And if I were to try compiling on the new machine the binary wouldn’t execute on the old machine, so I seem to be stuck.)

After a day of bashing my head against the proverbial brick wall (Xcode) and finding it impervious, I thought I’d ask here to see if anyone has anything that might help. I’m looking to do searching on Fermat numbers, so specifically I think I’m after a binary of the gfn_divisor sieve.

rogue 2023-04-17 12:55

[QUOTE=cxc;628643]I have a possibly odd question – are there versions of mtsieve binaries compiled for Mac Intel that can be downloaded from somewhere? I have a new machine (which is Metal) so it would be impossible to run mtsieve on the current setup as mtsieve doesn’t support Metal (yet); the older machine refuses to compile mtsieve with the old version of Xcode I have installed there, and updating Xcode doesn’t seem to fix the problem. (And if I were to try compiling on the new machine the binary wouldn’t execute on the old machine, so I seem to be stuck.)

After a day of bashing my head against the proverbial brick wall (Xcode) and finding it impervious, I thought I’d ask here to see if anyone has anything that might help. I’m looking to do searching on Fermat numbers, so specifically I think I’m after a binary of the gfn_divisor sieve.[/QUOTE]

I can build on OS X. Please PM or e-mail to talk about the issues you are running into when compiling.

rogue 2023-04-21 19:37

I am working on an experimental change to srsieve2 that adds a -S parameter, with which one can split the input file by q. srsieve2 will determine the best q for each sequence in the file, then write that sequence (and the terms for that sequence) to one file per q. In theory each file can then be run with srsieve2. For example, if I take a file with 6108 sequences, which will have a varying best q for each k, it will spit out these files:

[code]
Split 6108 base 3 sequences into 17624 base 3^6 sequences.
1 sequences with 788 terms written to q006_b3_n.abcd
679 sequences with 873666 terms written to q012_b3_n.abcd
3 sequences with 2255 terms written to q015_b3_n.abcd
1 sequences with 1506 terms written to q016_b3_n.abcd
59 sequences with 62472 terms written to q018_b3_n.abcd
5 sequences with 10979 terms written to q020_b3_n.abcd
1579 sequences with 1635356 terms written to q024_b3_n.abcd
126 sequences with 147455 terms written to q030_b3_n.abcd
1569 sequences with 1647256 terms written to q036_b3_n.abcd
11 sequences with 17332 terms written to q040_b3_n.abcd
813 sequences with 734827 terms written to q048_b3_n.abcd
1090 sequences with 1194326 terms written to q060_b3_n.abcd
171 sequences with 112236 terms written to q072_b3_n.abcd
1 sequences with 422 terms written to q090_b3_n.abcd
[/code]

Note that only 1 sequence of the 6108 has a q of 6, yet all are being sieved with that q. Note also that srsieve2 might not choose that q when running the file associated with it. This is due to how it computes the work for the combined k in that file. With limited testing I have seen that using the q in the file name via the -q parameter actually out-performs the one that srsieve2 would choose. For example, with the q036_b3_n.abcd file above I could get 181K p/sec with -q36, but the default q of 12 only yields 142K p/sec. With all 6108 sequences srsieve2cl chooses a q of 6 and gets only 35K p/sec, which is pretty much the same speed as a q of 12 with 1569 sequences. So one quarter of the sequences gives over 5x the speed with -q36. I did run q006_b3_n.abcd, which uses Legendre tables. It ran at about 10M p/sec, which is worse than running the entire file. I'm thinking that the best option is to "peel off" the files with the most sequences and test them with the desired q to the desired depth, then combine the remaining into a single file and sieve them to the desired depth. It might even be possible that each q has a different optimal sieving depth. This needs a lot more experimentation, so when the code is ready I will post on sourceforge.
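A quick check of the arithmetic behind the figures quoted above (rates in K p/sec, as reported):

```python
# Figures quoted above, in K p/sec.
rate_q36_alone = 181.0   # q036_b3_n.abcd sieved with -q36
rate_all_6108  = 35.0    # all 6108 sequences together (q chosen as 6)
seqs_q36       = 1569    # sequences in q036_b3_n.abcd
seqs_total     = 6108

fraction = seqs_q36 / seqs_total           # ~0.257, about one quarter
speedup  = rate_q36_alone / rate_all_6108  # ~5.17, over 5x
print(f"{fraction:.3f} of the sequences at {speedup:.2f}x the speed")
```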

For new sieves, it has to sieve a bit first, as that will remove most of the n, and the remaining n have an impact on the best q. In this case it will sieve up to a maximum p of 2^16. Only the sequences are output, not the terms, as a file of terms could be very large. Here is what that output looks like:

[code]
Split 12000 base 3 sequences into 23223 base 3^3 sequences.
1212 sequences for q 12 written to q012_b3.in
10 sequences for q 15 written to q015_b3.in
3 sequences for q 16 written to q016_b3.in
105 sequences for q 18 written to q018_b3.in
2 sequences for q 20 written to q020_b3.in
3171 sequences for q 24 written to q024_b3.in
214 sequences for q 30 written to q030_b3.in
3076 sequences for q 36 written to q036_b3.in
17 sequences for q 40 written to q040_b3.in
1 sequences for q 45 written to q045_b3.in
1700 sequences for q 48 written to q048_b3.in
2166 sequences for q 60 written to q060_b3.in
320 sequences for q 72 written to q072_b3.in
1 sequences for q 80 written to q080_b3.in
2 sequences for q 90 written to q090_b3.in
[/code]
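As a sanity check on the output above, the per-q sequence counts sum back to the 12000 input sequences, i.e. each input sequence lands in exactly one per-q file:

```python
# Per-q sequence counts taken from the srsieve2 -S output above.
counts = {12: 1212, 15: 10, 16: 3, 18: 105, 20: 2, 24: 3171,
          30: 214, 36: 3076, 40: 17, 45: 1, 48: 1700, 60: 2166,
          72: 320, 80: 1, 90: 2}

total = sum(counts.values())
print(total)  # 12000: matches the number of input sequences
```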

The reason I did this is because we have some conjectures over at CRUS with many thousands, if not tens of thousands or hundreds of thousands, of sequences. Since srsieve2cl is much more memory-constrained than srsieve2, finding a way to split the various sequences optimally is very important.

This could benefit those who want to split sequences across multiple CPUs or multiple computers. It could also benefit those who still use sr2sieve, as the logic for selecting q is the same in both programs. IIRC, sr2sieve allows you to specify q on the command line. Use srsieve2 to split the sequences, then use the output file as input to sr2sieve. Note that since sr2sieve cannot start from a file of sequences, you will have to presieve to some low p with srsieve2, then use -S to split the terms.

One more thing: there is a limit of 2^15 babySteps, so too many sequences can yield an assertion error. If you have tens of thousands or hundreds of thousands of sequences, you will need to split them into smaller sets before using -S.
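That pre-splitting step can be sketched as follows, assuming the sequence file is simply one sequence per line; the 10000-per-chunk limit here is an arbitrary illustrative choice, not srsieve2's actual internal limit:

```python
def chunk_sequences(lines, max_per_chunk=10000):
    """Split a list of sequence lines (one 'k*b^n+c' form per line)
    into smaller lists, so each set stays well under the limits
    described above before running -S on each chunk separately."""
    return [lines[i:i + max_per_chunk]
            for i in range(0, len(lines), max_per_chunk)]

# 25000 hypothetical sequences -> chunks of 10000, 10000, 5000.
seqs = [f"{k}*3^n-1" for k in range(2, 25002)]
chunks = chunk_sequences(seqs)
print([len(c) for c in chunks])  # [10000, 10000, 5000]
```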

storm5510 2023-04-22 15:18

It is really difficult to create a new series from scratch with [I]srsieve2[/I], example k*1923^n-1. I wrote a script to write the series one k at a time. I end up with many millions of remaining terms in a small range, like k from 2 to 1000. I suspect that I am not using the correct program. Ideas?

rogue 2023-04-22 15:29

[QUOTE=storm5510;629119]It is really difficult to create a new series from scratch with [I]srsieve2[/I], example k*1923^n-1. I wrote a script to write the series one k at a time. I end up with many millions of remaining terms in a small range, like k from 2 to 1000. I suspect that I am not using the correct program. Ideas?[/QUOTE]

Absolutely, but all sequences must have the same base. You can use -s as many times as you want to add sequences or you can use -s with an input file with a sequence on each line. I do this all of the time. Since I have a GPU, I never use srsieve, sr1sieve, or sr2sieve.
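The input file described above (one sequence per line, all with the same base) can be generated with a short script rather than adding each k by hand. The `k*1923^n-1` form follows the example earlier in the thread; the file name is arbitrary:

```python
# Write one sequence per line, all with the same base, for use
# as a sequence input file (one sequence per line, as described).
base = 1923
with open("sequences.txt", "w") as f:
    for k in range(2, 1001):        # k from 2 to 1000
        f.write(f"{k}*{base}^n-1\n")

with open("sequences.txt") as f:
    lines = f.read().splitlines()
print(len(lines), lines[0], lines[-1])
```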

rebirther 2023-04-22 15:49

Is it possible to add a parameter to define the writing outputfile time to replace the 1h fixed code for srsieve2?

rogue 2023-04-22 17:23

[QUOTE=rebirther;629123]Is it possible to add a parameter to define the writing outputfile time to replace the 1h fixed code for srsieve2?[/QUOTE]

Yes, but I'm not inclined to add one. What is wrong with writing that file once per hour?

rebirther 2023-04-22 17:26

[QUOTE=rogue;629129]Yes, but I'm not inclined to add one. What is wrong with writing that file once per hour?[/QUOTE]

It's more user-friendly to be able to define shorter times for small bases and longer times for bigger bases.

storm5510 2023-04-22 17:58

[QUOTE=rogue;629120]Absolutely, but all sequences must have the same base. You can use -s as many times as you want to add sequences or you can use -s with an input file with a sequence on each line. I do this all of the time. Since I have a GPU, I never use srsieve, sr1sieve, or sr2sieve.[/QUOTE]

I have an RTX 2080 and can use [I]srsieve2cl[/I]. It just seemed like what I was doing was the long way around.

My input file contained the same base for all sequences. I ended up with 46 million terms on my first try with P=1e9 for k from 2 to 1000.

I have gotten really good throughput using Legendre tables with [I]srsieve2[/I]. Many times it was faster than [I]srsieve2cl[/I]. Still, I will give it a try again.

