#1387 | |
"Alexander"
Nov 2008
The Alamo City
3^2×113 Posts |
Quote:
Accept the attached fkbnsieve patch as an assist. I copied the template from srsieve2. No attribution is needed, as I added nothing original. |
#1388 |
Random Account
Aug 2009
Oceanus Procellarum
2^3·13·29 Posts |
In my long-run project for Gary Barnes, srsieve2 is not a real problem. It is PFGW. I know, it is not part of the framework. It simply takes too much time, IMO. I dropped the "phase=20000" from the list. 16,000 is as far as I will go with it, which is running now. I anticipate a month to six weeks just to get it that far. Maybe more...
#1389 | |
"Mark"
Apr 2003
Between here and the
2^4×467 Posts |
Quote:
#1390 | |
Romulan Interpreter
"name field"
Jun 2011
Thailand
2·5,179 Posts |
Quote:
Last fiddled with by LaurV on 2023-09-20 at 04:18 |
#1391 | |
"Gary"
May 2007
Overland Park, KS
3100_16 Posts |
Quote:
I don't think you understand the way testing works. Sieving is only 5-10% of any effort. Testing is the other 90-95% of it. If you have sieved about far enough, testing should take ~10-20 times as long as sieving does. If it takes more than 20X, then you likely have not sieved far enough. If it takes less than 10X, then you likely sieved too far. In between is the golden zone for CPU efficiency. Srbsieve does pretty well in instructing srsieve2 to hit that zone. There's nothing wrong with how long the latest versions of the testing programs such as LLR or PFGW take to test. |
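For anyone who wants to check their own runs against that rule of thumb, here is a minimal sketch. The function names and the exact 10X/20X cutoffs as hard boundaries are my own illustration of the post above, not anything srbsieve or srsieve2 actually computes.

```python
def sieve_share(ratio):
    """Fraction of total effort spent sieving when testing takes `ratio` times as long."""
    return 1.0 / (1.0 + ratio)


def classify_sieve_depth(sieve_seconds, test_seconds, low=10.0, high=20.0):
    """Compare testing time to sieving time using the 10X-20X rule of thumb.

    `low` and `high` are the assumed boundaries of the "golden zone".
    Returns (ratio, verdict).
    """
    if sieve_seconds <= 0:
        raise ValueError("sieve_seconds must be positive")
    ratio = test_seconds / sieve_seconds
    if ratio > high:
        return ratio, "likely not sieved far enough"
    if ratio < low:
        return ratio, "likely sieved too far"
    return ratio, "golden zone"


if __name__ == "__main__":
    # Hypothetical timings: 1 hour of sieving, 15 hours of testing.
    ratio, verdict = classify_sieve_depth(3600, 15 * 3600)
    print(f"testing/sieving = {ratio:.1f}X -> {verdict}")
    print(f"sieving share of total effort: {sieve_share(ratio):.1%}")
```

A 10X ratio puts sieving at about 9% of the total and a 20X ratio at about 5%, which matches the 5-10% figure quoted above.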
#1392 |
"Mark"
Apr 2003
Between here and the
2^4·467 Posts |
I have posted mtsieve 2.5.4 to sourceforge. Here are the changes:
Code:
framework:
   Fix issue where small primes might not be tested because the initial worker
   stops processing its chunk so that another worker can continue.

fkbnsieve: version 1.6
   Remove x86 asm in factor validation.

srsieve2/srsieve2cl: version 1.7.7
   Fix a performance issue when starting with -i with tens of thousands of sequences.
   If you are sieving tens of thousands of sequences avoid using input files where
   k's are not in ascending sequence. Performance for loading the sequences will take
   a noticeable hit if the input is not sorted by ascending k.
   Fix memory usage upon startup when searching for square free part of k.
   Remove x86 asm use so that it can run on ARM factors.
   Enforce generic sieving for k > 2^63.

Some workers have AVX logic (cwsieve, psieve, xyyxsieve), but that is conditionally compiled for x86 CPUs and used only if the x86 CPU supports the AVX functionality that is needed by that worker. |
#1393 | |
Random Account
Aug 2009
Oceanus Procellarum
2^3·13·29 Posts |
Quote:
All instances are using the latest releases of srbsieve, srsieve2, and PFGW. I was able to slip in the latter two on-the-fly when they were not being used. For srbsieve, I had to stop everything to replace it. There were no problems in restarting each one.

I found each console parked against the left side of the screen this morning. I usually keep them horizontally staggered in the center, which makes it easier to tell them apart. What happened there, I don't know. Each was still running though.

I have had experience with testing in past years before PRP replaced LL. TF went fast, P-1 took longer, and LL took way longer. I saw a post somewhere a few days ago where an individual was discussing running a PRP on a wavefront exponent. It was going to take 66 days. I don't think I could do that.

When these finish at 16K, I plan to put each instance into a zip file and send them to you via email. The initial single run to 500 will be included. |
#1394 |
"Mark"
Apr 2003
Between here and the
2^4×467 Posts |
I use ConsoleZ. It is a Windows app that allows you to open multiple command prompts in a single window, so you have less desktop clutter.
#1395 | |
Random Account
Aug 2009
Oceanus Procellarum
2^3·13·29 Posts |
Quote:
Windows Powershell ISE can have multiple tabs but will only run things related to it, like scripts, for example. |
#1396 | |
Dec 2011
After 1.58M nines:)
2^3·13·17 Posts |
Quote:
Code:
./srsieve2 -P 1e15 -W 28 -i b767_n.boinc -O factors.txt -f B
srsieve2 v1.7.7, a program to find factors of k*b^n+c numbers for fixed b and variable k and n
Sieving with single sequence c=1 logic for p >= 1957275377549
BASE_MULTIPLE = 30, POWER_RESIDUE_LCM = 720, LIMIT_BASE = 720
Split 1 base 767 sequence into 15 base 767^120 sequences.
Legendre summary: Approximately 1 bytes needed for Legendre tables
   1 total sequences
   1 are eligible for Legendre tables
   0 are not eligible for Legendre tables
   1 have Legendre tables in memory
   0 cannot have Legendre tables in memory
   0 have Legendre tables loaded from files
   1 required building of the Legendre tables
518400 bytes used for congruent q and ladder indices
259200 bytes used for congruent qs and ladders
Sieve started: 1957275377549 <= p <= 1e15 with 1377 terms (1000102 <= n <= 1099918, k*767^n+1) (expecting 248 factors)
Increasing worksize to 1000000000 since each chunk is tested in less than a second
p=1961325128843, 29.04M p/sec, 3 factors found at 864 sec per factor (last 2 min), 0.0% done. ETC 2024-09-02 22:02

Simple math shows that 4*8 is far less than 28*2.8.
Ryzen at 4 GHz has 27M p/sec
Xeon at 2.8 GHz has from 29 to 32M p/sec
Last fiddled with by pepi37 on 2023-09-20 at 20:56 |
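To put numbers on that "simple math", here is a rough sketch that normalizes the reported rates by cores times clock speed. It assumes "4*8" means 8 Ryzen cores at 4 GHz and that the Xeon run used all 28 workers at 2.8 GHz. The function name is made up for illustration, and cores times GHz is only a crude proxy for raw compute.

```python
def rate_per_core_ghz(rate_mps, cores, ghz):
    """Sieve rate in M p/sec divided by (cores * GHz), a crude efficiency figure."""
    return rate_mps / (cores * ghz)


if __name__ == "__main__":
    # Figures taken from the post above; the Ryzen core count is an assumption.
    ryzen = rate_per_core_ghz(27.0, cores=8, ghz=4.0)
    xeon_low = rate_per_core_ghz(29.0, cores=28, ghz=2.8)
    xeon_high = rate_per_core_ghz(32.0, cores=28, ghz=2.8)

    print(f"Ryzen: {ryzen:.3f} M p/sec per core-GHz")
    print(f"Xeon:  {xeon_low:.3f} to {xeon_high:.3f} M p/sec per core-GHz")
    print(f"Raw compute ratio (Xeon/Ryzen): {(28 * 2.8) / (8 * 4.0):.2f}")
```

Under those assumptions the Xeon has about 2.45 times the core-GHz of the Ryzen but delivers roughly the same p/sec, so something other than raw compute is limiting the 28-worker run.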
#1397 |
"Mark"
Apr 2003
Between here and the
2^4·467 Posts |
Adding threads won't always scale, and sometimes threads compete for work. The amount of available memory also has an impact: with the worksize shown in that output, each worker will be using 8 GB of memory to hold the primes.
Run with -W4 on both machines to see the impact of thrashing on the machine with 28 cores. You can also set the number of primes per chunk to a fixed value by using -w and adding 'f' to the parameter passed to it. For example -w5e8f will use about 4 GB per worker and won't increase or decrease the number of primes per chunk. You might also need it to run for a few minutes to reduce the impact of "peaks and valleys" of the rate calculation done when sieving starts. A single GPU could be faster than all of your CPU cores combined. |
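As a back-of-the-envelope check on the memory figures, here is a small sketch. The 8 bytes per prime is an assumption inferred from the numbers in this post (1e9 primes for 8 GB, 5e8 primes for about 4 GB), and the function names are illustrative only, not part of mtsieve.

```python
BYTES_PER_PRIME = 8  # assumption: inferred from 1e9 primes -> 8 GB, 5e8 -> ~4 GB


def worker_memory_gb(primes_per_chunk):
    """Approximate memory (in GB, 1 GB = 1e9 bytes) one worker needs to hold its primes."""
    return primes_per_chunk * BYTES_PER_PRIME / 1e9


def total_memory_gb(primes_per_chunk, workers):
    """Approximate memory across all workers (the value passed to -W)."""
    return worker_memory_gb(primes_per_chunk) * workers


if __name__ == "__main__":
    for chunk, workers in [(1_000_000_000, 28), (500_000_000, 28), (500_000_000, 4)]:
        print(f"{chunk:>13,} primes/chunk x {workers:>2} workers: "
              f"{worker_memory_gb(chunk):.0f} GB each, "
              f"{total_memory_gb(chunk, workers):.0f} GB total")
```

With 28 workers at the auto-increased worksize of 1e9 primes, this estimate comes to 224 GB in total, which would explain thrashing on a machine with less RAM than that. Fixing the chunk size with -w5e8f halves it.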