mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Software (https://www.mersenneforum.org/forumdisplay.php?f=10)
-   -   mtsieve (https://www.mersenneforum.org/showthread.php?t=23042)

kruoli 2023-04-09 21:06

Maybe add a special character after the parameter to lock it? E.g. [C]-w 1e6![/C].

rogue 2023-04-13 22:01

I have posted 2.4.6 over at SourceForge. Here are the changes:

[code]
framework:
Support 'f' or 'F' at the end of the -w argument. This will "fix" the
number of primes per CPU workunit and not resize the workunit.

twinsieve: version 1.6.1
Do not apply -r logic to base 2 since even k are already excluded.

fbncsieve: version 1.6.1
Fix issue when generating ABCD file as it counts terms incorrectly.
[/code]

I know that 'l' was suggested and I chose 'f' instead. In any case this was a workable solution.

storm5510 2023-04-13 23:37

Like this: -W6F?

rogue 2023-04-14 02:36

[QUOTE=storm5510;628436]Like this: -W6F?[/QUOTE]

Not quite, more like -w1e6f. Use -w to specify the number of primes per worker. -W is the number of workers, which typically should not exceed the number of CPU cores.

If you are going to use that feature, then setting the value higher will improve the rate. For example, if you compare -w1e6f with -w1e8f, you will see that -w1e8f is faster. I would only recommend using this under two conditions. First, if you run out of memory, which can happen with the faster sieves. Second, if you want to see whether larger prime chunks provide better removal rates for the slower sieves. The downside is that a chunk that needs a very long time to process makes you wait longer after you press ^C, and without ^C you will likely sieve deeper than you want.
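
To make the syntax concrete, here is a minimal Python sketch of how a value such as 1e6f could be split into a prime count and a "fixed" flag. This is not mtsieve's own parsing code (mtsieve is not written in Python); the function name parse_primes_per_worker is hypothetical and only illustrates the rule that a trailing 'f' or 'F' fixes the workunit size.

```python
from typing import Tuple


def parse_primes_per_worker(arg: str) -> Tuple[int, bool]:
    """Split a -w value such as '1e6f' into (prime count, fixed flag).

    Hypothetical helper for illustration only; not part of mtsieve.
    """
    text = arg.strip()
    # A trailing 'f' or 'F' means "do not resize the workunit".
    fixed = text[-1:] in ("f", "F")
    if fixed:
        text = text[:-1]
    # float() accepts both plain integers and forms like '1e6'.
    count = int(float(text))
    if count <= 0:
        raise ValueError("primes per worker must be positive")
    return count, fixed


if __name__ == "__main__":
    for value in ("1e6f", "1e8F", "1e6"):
        print(value, parse_primes_per_worker(value))
```

With this sketch, "1e6f" gives a count of 1000000 with the fixed flag set, while "1e6" gives the same count with the flag unset.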

storm5510 2023-04-14 14:46

-W6 -w1e8f. Sorry, I fudged it. I had to go back and look at all the switches. I've seen [I]srsieve2[/I] resize, sometimes up and other times down, or both in short order. I have never had a memory problem with it.

Should your [B]^C[/B] above be something else?

rogue 2023-04-14 15:33

[QUOTE=storm5510;628462]Should your [B]^C[/B] above be something else?[/QUOTE]

No. When using ^C, some sieves will process the entire chunk they are currently working on, then terminate. Others can terminate in the middle of a chunk.
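
The "finish the current chunk, then terminate" behaviour can be sketched in Python as below. This only illustrates the general pattern and is not how mtsieve itself handles ^C; the names GracefulStop and run_chunks are made up for this example.

```python
import signal
from typing import Callable, Iterable, List


class GracefulStop:
    """Records that ^C was pressed instead of aborting immediately."""

    def __init__(self) -> None:
        self.requested = False

    def __call__(self, signum, frame) -> None:
        # Signal handlers receive (signal number, current stack frame).
        self.requested = True

    def install(self) -> None:
        # Route SIGINT (^C) to this object; call from the main thread.
        signal.signal(signal.SIGINT, self)


def run_chunks(chunks: Iterable[int],
               process_chunk: Callable[[int], None],
               stop: GracefulStop) -> List[int]:
    """Process chunks in order; the stop flag is checked only between chunks."""
    finished = []
    for chunk in chunks:
        if stop.requested:
            break
        process_chunk(chunk)  # a started chunk always runs to completion
        finished.append(chunk)
    return finished
```

Because the flag is only checked between chunks, a larger chunk means a longer wait after ^C, which matches the trade-off described for large -w values.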

storm5510 2023-04-14 16:14

[QUOTE=rogue;628469]No. When using ^C, some sieves will process the entire chunk they are currently working on, then terminate. For others it can terminate in the middle of a chunk.[/QUOTE]

OK. Mine seems to always finish the chunk then drop out to the prompt. A bit of patience is required. :smile:

pepi37 2023-04-16 01:30

I did some testing with the latest srsieve2cl on a single sequence.

Win10, RTX 3060 Ti with 8 GB VRAM

g32: 0.34 core, 6.632 Mp/s
g1000: 1 core, 13.68 Mp/s
g5000: 1 core, 13.89 Mp/s
g16834: 1 core, 13.61 Mp/s
G10 g32: 2.36 core, 15.08 Mp/s
G3 g400: 1 core, 15.58 Mp/s
G30 g10: 5.25 core, 14.56 Mp/s
G2 g1782: 1.1 core, 15.41 Mp/s

On the same sequence, the CPU with 8 workers (8 cores) gets around 17.8 Mp/s.
If nothing else, the CPU draws less power than the GPU :) The speed is nearly the same.

pepi37 2023-04-16 11:31

And a very important addition to my post above: you will get those values only [B][COLOR="Red"]if your GPU is in a PCIe x16 slot[/COLOR][/B].
I compiled srsieve2cl on my small rig, where the cards are on risers, and the fastest I can get on a 2070 Super is only 172K p/sec.

rogue 2023-04-16 13:37

Note that some of the command line switches might give you a performance boost. These same switches could hurt performance. Play around with the Q/U/V/X switches.

pepi37 2023-04-16 14:31

[QUOTE=rogue;628595]Note that some of the command line switches might give you a performance boost. These same switches could hurt performance. Play around with the Q/U/V/X switches.[/QUOTE]

[QUOTE]-U --bmmulitplier=U multiplied by 2 to compute BASE_MULTIPLE (default 15 for single 1 for multi
default BASE_MULTIPLE=30, BASE_MULTIPLE=2 for multi)
-V --prmmultiplier=V multiplied by BASE_MULTIPLE to compute POWER_RESIDUE_LCM (default 24 for single 360 for multi
default POWER_RESIDUE_LCM=360, POWER_RESIDUE_LCM=360 for multi)
-X --lbmultipler=X multiplied by POWER_RESIDUE_LCM to compute LIMIT_BASE (default 1 for single 1 for multi
default LIMIT_BASE=24, LIMIT_BASE=360 for multi)[/QUOTE]

I cannot even understand what is written here; using those switches is a big mystery to me.
Is there any manual, sample, anything?

For example:

[QUOTE]q = 2 with 16 subseq yields bs = 2679, gs = 168, work = 5375
q = 4 with 30 subseq yields bs = 2587, gs = 87, work = 5212
q = 8 with 51 subseq yields bs = 2394, gs = 47, work = 4817
q = 16 with 102 subseq yields bs = 2446, gs = 23, work = 4844
q = 32 with 203 subseq yields bs = 2344, gs = 12, work = 4884
q = 64 with 406 subseq yields bs = 2344, gs = 6, work = 4987
q = 3 with 36 subseq yields bs = 3297, gs = 91, work = 6589
q = 6 with 38 subseq yields bs = 2381, gs = 63, work = 4794
q = 12 with 65 subseq yields bs = 2206, gs = 34, work = 4450
q = 24 with 111 subseq yields bs = 2084, gs = 18, work = 4143
q = 48 with 222 subseq yields bs = 2084, gs = 9, work = 4204
q = 96 with 434 subseq yields bs = 1876, gs = 5, work = 4287
q = 192 with 868 subseq yields bs = 2344, gs = 2, work = 4562
q = 9 with 100 subseq yields bs = 3126, gs = 32, work = 6372
q = 18 with 104 subseq yields bs = 2273, gs = 22, work = 4615
q = 36 with 175 subseq yields bs = 2084, gs = 12, work = 4279
q = 72 with 297 subseq yields bs = 2084, gs = 6, work = 4035
q = 144 with 594 subseq yields bs = 2084, gs = 3, work = 4204
q = 288 with 1163 subseq yields bs = 1563, gs = 2, work = 4556
q = 576 with 2322 subseq yields bs = 1564, gs = 1, work = 5218
q = 5 with 74 subseq yields bs = 3674, gs = 49, work = 7333
q = 10 with 79 subseq yields bs = 2648, gs = 34, work = 5373
q = 20 with 147 subseq yields bs = 2648, gs = 17, work = 5220
q = 40 with 241 subseq yields bs = 2250, gs = 10, work = 4784
q = 80 with 481 subseq yields bs = 2250, gs = 5, work = 4903
q = 160 with 957 subseq yields bs = 2813, gs = 2, work = 5222
q = 320 with 1914 subseq yields bs = 2813, gs = 1, work = 5717
q = 15 with 166 subseq yields bs = 3158, gs = 19, work = 6389
q = 30 with 176 subseq yields bs = 2308, gs = 13, work = 4687
q = 60 with 299 subseq yields bs = 2143, gs = 7, work = 4398
q = 120 with 491 subseq yields bs = 1876, gs = 4, work = 4120
q = 240 with 977 subseq yields bs = 1876, gs = 2, work = 4389
q = 480 with 1908 subseq yields bs = 1876, gs = 1, work = 4883
q = 960 with 3816 subseq yields bs = 939, gs = 1, work = 6953
q = 45 with 462 subseq yields bs = 2858, gs = 7, work = 6308
q = 90 with 482 subseq yields bs = 2001, gs = 5, work = 4667
q = 180 with 805 subseq yields bs = 2501, gs = 2, work = 4559
q = 360 with 1312 subseq yields bs = 2501, gs = 1, work = 4590
q = 720 with 2612 subseq yields bs = 1251, gs = 1, work = 5412
q = 1440 with 5109 subseq yields bs = 626, gs = 1, work = 8787
q = 2880 with 10202 subseq yields bs = 314, gs = 1, work = 16613
q = 1 with 15 subseq yields bs = 3674, gs = 245, work = 7356[/QUOTE]

What parameters would you recommend, looking at this report?
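
One way to read a report like this is to parse the lines and sort them by the work column. The Python sketch below does that, on the assumption that a lower work value is preferable. Whether a lower estimate really means faster sieving, and which switch value would select a given q, is not confirmed in this thread. The names QLine, parse_report and rank_by_work are made up for this example.

```python
import re
from typing import List, NamedTuple


class QLine(NamedTuple):
    q: int
    subseq: int
    bs: int
    gs: int
    work: int


# Matches lines such as:
#   q = 72 with 297 subseq yields bs = 2084, gs = 6, work = 4035
PATTERN = re.compile(
    r"q = (\d+) with (\d+) subseq yields bs = (\d+), gs = (\d+), work = (\d+)"
)


def parse_report(text: str) -> List[QLine]:
    """Extract every 'q = ...' line from the report text."""
    return [QLine(*map(int, m.groups())) for m in PATTERN.finditer(text)]


def rank_by_work(rows: List[QLine], top: int = 3) -> List[QLine]:
    """Return the rows with the smallest work value first."""
    return sorted(rows, key=lambda row: row.work)[:top]


if __name__ == "__main__":
    sample = """
    q = 24 with 111 subseq yields bs = 2084, gs = 18, work = 4143
    q = 72 with 297 subseq yields bs = 2084, gs = 6, work = 4035
    q = 120 with 491 subseq yields bs = 1876, gs = 4, work = 4120
    q = 2880 with 10202 subseq yields bs = 314, gs = 1, work = 16613
    """
    for row in rank_by_work(parse_report(sample)):
        print(row.q, row.work)
```

Applied to the full report above, this ordering puts q = 72 (work = 4035) first, followed by q = 120 (work = 4120) and q = 24 (work = 4143).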
