2017-11-29, 17:45  #1 
Feb 2014
54_{16} Posts 
Number of workers vs. number of CPUs
I'm a little confused about how many workers I should spawn in Prime95. Should I spawn one worker per CPU (socket), per core, or per hyperthread "core"?
The Prime95 program offered to run 6 workers on my Win7 virtual machine, to which I've assigned 32 GB of RAM and 20 vCPUs. Is that a good ratio? The VM appears to be running at 100% CPU. Would it be better to run 1 worker and dedicate all 20 CPUs to it? 
2017-11-29, 18:41  #2 
"Curtis"
Feb 2005
Riverside, CA
67^{2} Posts 
Boundaries that seem to apply across all CPU families:
More than one worker per physical core is not optimal (hyperthreaded "cores" should not be counted when choosing a worker count). Assigning one worker more threads than a single physical socket has is inefficient; each socket should get its own worker, at minimum. Within those two bounds, optimal production is determined by experimentation; the benchmark tools mostly automate this, but virtual machines are hard to pin down because thread assignments may land on HT cores sometimes but not others. 
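The two bounds above can be turned into a short list of worker/core splits worth benchmarking. A minimal sketch in Python (the function name and the even-split restriction are my own illustration, not anything Prime95 exposes):

```python
# Sketch of the two bounds above (names are illustrative, not a
# Prime95 API): at least one worker per socket, at most one worker
# per physical core, and only even splits of cores across workers.

def candidate_splits(sockets, cores_per_socket):
    """Yield (workers, cores_per_worker) pairs worth benchmarking."""
    total_cores = sockets * cores_per_socket
    for workers in range(sockets, total_cores + 1):
        if total_cores % workers == 0:   # even split only
            yield workers, total_cores // workers

# A dual-socket machine with 12 cores per socket (24 physical cores):
print(list(candidate_splits(2, 12)))
# [(2, 12), (3, 8), (4, 6), (6, 4), (8, 3), (12, 2), (24, 1)]
```

Hyperthreads are deliberately absent from the count; the benchmark then decides which of these splits wins.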
2017-11-29, 18:49  #3  
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
10565_{8} Posts 
Quote:
First, 100% CPU is always expected, as Prime95 is very efficient. NEVER allocate more workers than physical cores. (There is the very odd exception to this rule, but not enough to consider.) I'm guessing Prime95 thinks you have 6 physical cores... If you do indeed have 6 cores, the general rule is to run 6 workers with 1 core each. Sometimes it is slightly more efficient to run fewer workers with more cores each: for example, 3 workers with 2 cores each, or 2 workers with 3 cores each. If you want to complete a very large assignment quickly, allocate all 6 cores to 1 worker. However, the overall throughput will be up to 25% less than 6 workers with 1 core each. NOTE: a very large assignment is something like an LL test on an exponent over 100 million. If you have more or fewer physical cores, adjust appropriately. 

2017-11-29, 21:22  #4 
Feb 2014
2^{2}×3×7 Posts 
First, thank you for the quick reply!
The odd thing is that I have 2 physical CPUs (sockets), and each has 12 cores. So, if my math is correct, I have 24 cores (48 with hyperthreading). So, if I want to maximize the number of "things" I'm working on, I "could" have 24 workers, or if I wanted to maximize the speed of completing a single "thing" I could have 1 worker. Is that how I should look at this? 
2017-11-29, 21:35  #5  
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
4469_{10} Posts 
Quote:
Your limiting factor may be RAM. With 32GB and 24 workers, definitely do NOT run P-1 tests. Again, unless you are doing a REALLY big assignment, you would lose a reasonable amount of overall throughput putting all 24 cores on 1 assignment. As VBCurtis said, your best bet would be to run the benchmark tool. In version 28.x on Windows it is: Options... Benchmark. In version 29.x there are a few more options; I believe you want a "Throughput" benchmark. Maybe someone can correct me. In the end it should direct you to the best worker/core mix, and further indicate the number of physical cores. 
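To see why 32 GB gets tight, a back-of-envelope split (my simplification: RAM divided evenly across workers; Prime95's actual per-worker memory cap is configured in its memory options, and mostly matters for P-1 stage 2):

```python
# Back-of-envelope only: evenly split total RAM across workers.
# This is an illustration, not how Prime95 allocates memory.

def ram_per_worker_gb(total_gb, workers):
    return total_gb / workers

for workers in (1, 6, 12, 24):
    print(workers, round(ram_per_worker_gb(32, workers), 2))
# With 24 workers, each gets only ~1.33 GB, far less than P-1
# stage 2 likes to have.
```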

2017-11-29, 21:38  #6 
P90 years forever!
Aug 2002
Yeehaw, FL
1BF7_{16} Posts 
Options/Benchmark is your friend. Prime95 arbitrarily guessed that 4 cores/worker would be pretty good.
Do a throughput benchmark using all 24 cores, a 4M FFT size, and 2, 4, 6, 8, and 12 workers. Let us know what was best; we are a curious bunch. 
2017-11-29, 23:14  #7  
Feb 2014
2^{2}×3×7 Posts 
So, RAM is included in the calculation? That adds to the question, then... how much RAM per core should I account for? Or is it RAM per worker? I have up to 128 GB of RAM available.
Quote:
<snip>
[Wed Nov 29 14:40:46 2017]
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz
CPU speed: 1371.03 MHz, 20 cores
CPU features: Prefetch, SSE, SSE2, SSE4, AVX
L1 cache size: 32 KB
L2 cache size: 256 KB, L3 cache size: 15 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Machine topology as determined by hwloc library:
Machine#0 (total=31082972KB, Backend=Windows, hwlocVersion=1.11.6, ProcessName=prime95.exe)
  NUMANode#0 (local=15302680KB, total=15302680KB)
    Package#0 (CPUVendor=GenuineIntel, CPUFamilyNumber=6, CPUModelNumber=45, CPUModel="Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz", CPUStepping=7)
      L3 (size=15360KB, linesize=64, ways=20, Inclusive=1)
      L2 (size=256KB, linesize=64, ways=8, Inclusive=0)
      L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
      Core (cpuset: 0x00000001) PU#0 (cpuset: 0x00000001)
      Core (cpuset: 0x00000002) PU#1 (cpuset: 0x00000002)
      Core (cpuset: 0x00000004) PU#2 (cpuset: 0x00000004)
      Core (cpuset: 0x00000008) PU#3 (cpuset: 0x00000008)
      Core (cpuset: 0x00000010) PU#4 (cpuset: 0x00000010)
      Core (cpuset: 0x00000020) PU#5 (cpuset: 0x00000020)
      Core (cpuset: 0x00000040) PU#6 (cpuset: 0x00000040)
      Core (cpuset: 0x00000080) PU#7 (cpuset: 0x00000080)
      Core (cpuset: 0x00000100) PU#8 (cpuset: 0x00000100)
      Core (cpuset: 0x00000200) PU#9 (cpuset: 0x00000200)
  NUMANode#1 (local=15780292KB, total=15780292KB)
    Package#1 (CPUVendor=GenuineIntel, CPUFamilyNumber=6, CPUModelNumber=45, CPUModel="Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz", CPUStepping=7)
      L3 (size=15360KB, linesize=64, ways=20, Inclusive=1)
      L2 (size=256KB, linesize=64, ways=8, Inclusive=0)
      L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
      Core (cpuset: 0x00000400) PU#10 (cpuset: 0x00000400)
      Core (cpuset: 0x00000800) PU#11 (cpuset: 0x00000800)
      Core (cpuset: 0x00001000) PU#12 (cpuset: 0x00001000)
      Core (cpuset: 0x00002000) PU#13 (cpuset: 0x00002000)
      Core (cpuset: 0x00004000) PU#14 (cpuset: 0x00004000)
      Core (cpuset: 
0x00008000) PU#15 (cpuset: 0x00008000) Core (cpuset: 0x00010000) PU#16 (cpuset: 0x00010000) Core (cpuset: 0x00020000) PU#17 (cpuset: 0x00020000) Core (cpuset: 0x00040000) PU#18 (cpuset: 0x00040000) Core (cpuset: 0x00080000) PU#19 (cpuset: 0x00080000) Prime95 64bit version 29.4, RdtscTiming=1 Timings for 2048K FFT length (20 cores, 1 worker): 2.40 ms. Throughput: 417.35 iter/sec. Timings for 2048K FFT length (20 cores, 2 workers): 3.50, 3.51 ms. Throughput: 570.46 iter/sec. Timings for 2048K FFT length (20 cores, 6 workers): 10.70, 10.69, 8.42, 9.42, 12.00, 8.57 ms. Throughput: 611.89 iter/sec. Timings for 2048K FFT length (20 cores, 20 workers): 38.21, 38.33, 38.31, 38.31, 17.91, 35.94, 36.37, 18.15, 38.42, 38.39, 38.36, 38.32, 17.43, 38.42, 38.28, 38.19, 38.04, 38.25, 17.36, 38.39 ms. Throughput: 646.77 iter/sec. Timings for 2304K FFT length (20 cores, 1 worker): 4.15 ms. Throughput: 241.13 iter/sec. Timings for 2304K FFT length (20 cores, 2 workers): 3.85, 3.88 ms. Throughput: 517.34 iter/sec. Timings for 2304K FFT length (20 cores, 6 workers): 11.53, 10.27, 9.75, 10.23, 10.69, 10.34 ms. Throughput: 574.78 iter/sec. Timings for 2304K FFT length (20 cores, 20 workers): 40.52, 23.15, 35.27, 40.88, 40.39, 21.67, 40.95, 39.49, 31.57, 38.20, 39.93, 40.20, 19.38, 40.52, 40.27, 40.41, 19.36, 40.17, 40.23, 40.44 ms. Throughput: 601.11 iter/sec. Timings for 2400K FFT length (20 cores, 1 worker): 3.42 ms. Throughput: 292.46 iter/sec. Timings for 2400K FFT length (20 cores, 2 workers): 3.77, 3.77 ms. Throughput: 530.75 iter/sec. Timings for 2400K FFT length (20 cores, 6 workers): 12.33, 12.89, 7.86, 11.96, 9.72, 9.84 ms. Throughput: 574.05 iter/sec. Timings for 2400K FFT length (20 cores, 20 workers): 39.98, 40.55, 28.61, 40.29, 40.02, 39.59, 37.99, 24.52, 22.99, 40.43, 32.32, 40.09, 40.28, 40.18, 20.40, 40.19, 40.07, 40.01, 40.43, 23.68 ms. Throughput: 591.48 iter/sec. Timings for 2560K FFT length (20 cores, 1 worker): 3.14 ms. Throughput: 318.65 iter/sec. 
Timings for 2560K FFT length (20 cores, 2 workers): 4.35, 4.30 ms. Throughput: 462.82 iter/sec. Timings for 2560K FFT length (20 cores, 6 workers): 11.99, 13.23, 11.09, 11.03, 14.05, 11.33 ms. Throughput: 499.19 iter/sec. Timings for 2560K FFT length (20 cores, 20 workers): 39.06, 37.04, 47.32, 46.93, 22.23, 49.12, 51.32, 27.56, 48.94, 51.36, 48.66, 48.86, 48.21, 34.94, 49.27, 49.53, 49.32, 49.49, 25.27, 21.49 ms. Throughput: 513.50 iter/sec. Timings for 2688K FFT length (20 cores, 1 worker): 3.28 ms. Throughput: 304.46 iter/sec. Timings for 2688K FFT length (20 cores, 2 workers): 4.36, 4.35 ms. Throughput: 459.37 iter/sec. [Wed Nov 29 14:45:54 2017] Timings for 2688K FFT length (20 cores, 6 workers): 13.08, 13.23, 10.58, 12.76, 12.83, 11.04 ms. Throughput: 493.39 iter/sec. Timings for 2688K FFT length (20 cores, 20 workers): 24.45, 47.69, 48.00, 47.75, 28.96, 38.76, 37.22, 48.61, 48.32, 48.57, 47.70, 47.90, 23.82, 23.24, 44.10, 47.90, 48.27, 48.27, 48.01, 47.68 ms. Throughput: 506.34 iter/sec. Timings for 2880K FFT length (20 cores, 1 worker): 4.37 ms. Throughput: 228.88 iter/sec. Timings for 2880K FFT length (20 cores, 2 workers): 4.73, 4.85 ms. Throughput: 417.48 iter/sec. Timings for 2880K FFT length (20 cores, 6 workers): 16.77, 13.26, 10.49, 12.55, 14.41, 12.31 ms. Throughput: 460.71 iter/sec. Timings for 2880K FFT length (20 cores, 20 workers): 37.25, 40.22, 48.85, 33.47, 48.82, 24.89, 48.70, 49.19, 48.90, 49.24, 25.41, 46.63, 45.70, 48.11, 48.66, 48.40, 25.95, 48.13, 49.43, 49.20 ms. Throughput: 488.89 iter/sec. Timings for 3072K FFT length (20 cores, 1 worker): 3.55 ms. Throughput: 281.84 iter/sec. Timings for 3072K FFT length (20 cores, 2 workers): 5.41, 5.41 ms. Throughput: 369.54 iter/sec. Timings for 3072K FFT length (20 cores, 6 workers): 18.44, 17.31, 11.57, 11.79, 19.59, 16.38 ms. Throughput: 395.31 iter/sec. 
Timings for 3072K FFT length (20 cores, 20 workers): 62.78, 26.01, 63.60, 56.58, 67.10, 68.04, 68.02, 26.69, 66.81, 68.02, 68.14, 49.76, 36.64, 62.25, 31.23, 46.76, 52.42, 46.79, 61.31, 68.11 ms. Throughput: 402.18 iter/sec. Timings for 3200K FFT length (20 cores, 1 worker): 5.88 ms. Throughput: 169.94 iter/sec. Timings for 3200K FFT length (20 cores, 2 workers): 5.41, 6.12 ms. Throughput: 348.38 iter/sec. Timings for 3200K FFT length (20 cores, 6 workers): 14.67, 15.56, 13.00, 16.57, 14.94, 11.97 ms. Throughput: 420.22 iter/sec. Timings for 3200K FFT length (20 cores, 20 workers): 30.51, 46.17, 54.46, 39.68, 38.83, 54.05, 54.58, 54.56, 54.89, 46.56, 55.20, 54.06, 55.14, 55.12, 52.44, 54.26, 53.96, 54.72, 27.76, 28.53 ms. Throughput: 436.89 iter/sec. Timings for 3360K FFT length (20 cores, 1 worker): 3.65 ms. Throughput: 273.63 iter/sec. Timings for 3360K FFT length (20 cores, 2 workers): 5.39, 5.39 ms. Throughput: 370.87 iter/sec. Timings for 3360K FFT length (20 cores, 6 workers): 16.19, 15.84, 13.75, 14.42, 17.07, 14.15 ms. Throughput: 396.18 iter/sec. Timings for 3360K FFT length (20 cores, 20 workers): 31.55, 58.57, 58.84, 55.16, 58.61, 59.27, 58.73, 29.12, 54.68, 59.17, 58.91, 47.68, 58.45, 58.94, 34.64, 53.75, 51.15, 58.31, 31.31, 59.16 ms. Throughput: 409.43 iter/sec. Timings for 3456K FFT length (20 cores, 1 worker): 4.10 ms. Throughput: 244.11 iter/sec. [Wed Nov 29 14:51:09 2017] Timings for 3456K FFT length (20 cores, 2 workers): 5.92, 5.87 ms. Throughput: 339.44 iter/sec. Timings for 3456K FFT length (20 cores, 6 workers): 19.57, 17.84, 13.56, 16.80, 18.63, 14.50 ms. Throughput: 363.07 iter/sec. Timings for 3456K FFT length (20 cores, 20 workers): 65.46, 61.70, 63.60, 65.22, 66.48, 33.04, 52.40, 66.38, 34.28, 66.15, 66.54, 39.14, 64.78, 66.58, 64.80, 62.08, 64.60, 46.17, 65.94, 31.01 ms. Throughput: 373.41 iter/sec. Timings for 3584K FFT length (20 cores, 1 worker): 4.16 ms. Throughput: 240.15 iter/sec. 
Timings for 3584K FFT length (20 cores, 2 workers): 6.72, 6.71 ms. Throughput: 297.75 iter/sec. Timings for 3584K FFT length (20 cores, 6 workers): 20.76, 22.58, 14.98, 16.59, 24.33, 16.51 ms. Throughput: 321.14 iter/sec. Timings for 3584K FFT length (20 cores, 20 workers): 76.80, 75.28, 75.80, 80.42, 72.56, 81.19, 72.34, 33.61, 81.32, 32.69, 81.03, 80.52, 73.16, 77.12, 76.79, 33.11, 64.69, 79.31, 38.12, 75.66 ms. Throughput: 326.63 iter/sec. Timings for 3840K FFT length (20 cores, 1 worker): 5.41 ms. Throughput: 185.01 iter/sec. Timings for 3840K FFT length (20 cores, 2 workers): 6.59, 6.56 ms. Throughput: 304.39 iter/sec. Timings for 3840K FFT length (20 cores, 6 workers): 17.96, 20.20, 16.72, 15.19, 23.88, 17.07 ms. Throughput: 331.30 iter/sec. Timings for 3840K FFT length (20 cores, 20 workers): 71.68, 39.16, 53.96, 71.43, 70.50, 53.34, 71.57, 49.44, 71.82, 56.43, 68.67, 68.87, 40.21, 70.96, 71.39, 71.04, 40.10, 44.88, 71.68, 71.31 ms. Throughput: 342.12 iter/sec. Timings for 4032K FFT length (20 cores, 1 worker): 4.73 ms. Throughput: 211.54 iter/sec. Timings for 4032K FFT length (20 cores, 2 workers): 6.99, 6.98 ms. Throughput: 286.28 iter/sec. Timings for 4032K FFT length (20 cores, 6 workers): 16.60, 25.97, 18.07, 19.64, 19.20, 19.79 ms. Throughput: 307.60 iter/sec. Timings for 4032K FFT length (20 cores, 20 workers): 76.88, 79.40, 79.36, 36.92, 76.59, 60.98, 63.29, 47.68, 78.14, 78.19, 48.42, 57.99, 78.47, 61.37, 78.25, 62.23, 44.51, 79.94, 77.13, 78.79 ms. Throughput: 313.53 iter/sec. Timings for 4096K FFT length (20 cores, 1 worker): 5.18 ms. Throughput: 193.18 iter/sec. Timings for 4096K FFT length (20 cores, 2 workers): 7.31, 7.29 ms. Throughput: 274.03 iter/sec. Timings for 4096K FFT length (20 cores, 6 workers): 22.95, 20.14, 18.22, 22.82, 22.11, 16.91 ms. Throughput: 296.26 iter/sec. 
[Wed Nov 29 14:56:14 2017] Timings for 4096K FFT length (20 cores, 20 workers): 79.73, 79.14, 77.49, 78.53, 79.39, 39.13, 39.36, 79.21, 79.83, 79.68, 71.10, 66.34, 59.75, 70.98, 79.29, 55.76, 79.08, 40.79, 79.47, 78.83 ms. Throughput: 305.01 iter/sec. Timings for 4480K FFT length (20 cores, 1 worker): 5.38 ms. Throughput: 185.84 iter/sec. Timings for 4480K FFT length (20 cores, 2 workers): 7.49, 7.47 ms. Throughput: 267.44 iter/sec. Timings for 4480K FFT length (20 cores, 6 workers): 25.92, 18.35, 20.65, 20.54, 24.16, 19.28 ms. Throughput: 283.43 iter/sec. Timings for 4480K FFT length (20 cores, 20 workers): 41.13, 83.66, 41.92, 82.51, 83.57, 81.01, 83.30, 83.00, 83.33, 83.33, 48.19, 83.46, 83.62, 82.28, 82.34, 83.66, 83.56, 69.19, 74.60, 40.70 ms. Throughput: 289.94 iter/sec. Timings for 4608K FFT length (20 cores, 1 worker): 5.61 ms. Throughput: 178.17 iter/sec. Timings for 4608K FFT length (20 cores, 2 workers): 7.90, 7.89 ms. Throughput: 253.43 iter/sec. Timings for 4608K FFT length (20 cores, 6 workers): 25.14, 22.88, 19.34, 23.64, 25.61, 18.41 ms. Throughput: 270.86 iter/sec. Timings for 4608K FFT length (20 cores, 20 workers): 86.80, 86.17, 87.27, 86.26, 88.85, 42.12, 86.63, 79.47, 44.13, 88.85, 88.15, 87.34, 87.18, 42.15, 86.41, 88.15, 41.88, 86.70, 86.29, 86.48 ms. Throughput: 278.69 iter/sec. Timings for 4800K FFT length (20 cores, 1 worker): 5.66 ms. Throughput: 176.62 iter/sec. Timings for 4800K FFT length (20 cores, 2 workers): 8.23, 8.19 ms. Throughput: 243.52 iter/sec. Timings for 4800K FFT length (20 cores, 6 workers): 25.57, 26.91, 18.84, 23.79, 23.07, 22.78 ms. Throughput: 258.64 iter/sec. Timings for 4800K FFT length (20 cores, 20 workers): 93.11, 90.29, 91.82, 47.44, 59.41, 94.25, 50.78, 92.35, 92.56, 85.95, 59.33, 90.55, 94.93, 42.52, 90.34, 92.09, 91.99, 94.98, 90.94, 54.14 ms. Throughput: 268.94 iter/sec. Timings for 5120K FFT length (20 cores, 1 worker): 6.47 ms. Throughput: 154.51 iter/sec. 
Timings for 5120K FFT length (20 cores, 2 workers): 9.21, 9.16 ms. Throughput: 217.75 iter/sec. Timings for 5120K FFT length (20 cores, 6 workers): 30.00, 30.28, 20.17, 34.15, 22.34, 23.50 ms. Throughput: 232.52 iter/sec. Timings for 5120K FFT length (20 cores, 20 workers): 49.59, 101.60, 101.01, 100.88, 101.62, 99.63, 98.00, 100.16, 48.96, 101.52, 99.51, 91.20, 49.11, 101.38, 96.45, 101.02, 100.16, 100.72, 50.93, 101.14 ms. Throughput: 241.10 iter/sec. Timings for 5376K FFT length (20 cores, 1 worker): 6.53 ms. Throughput: 153.05 iter/sec. [Wed Nov 29 15:01:24 2017] Timings for 5376K FFT length (20 cores, 2 workers): 9.34, 9.31 ms. Throughput: 214.41 iter/sec. Timings for 5376K FFT length (20 cores, 6 workers): 25.42, 30.07, 24.28, 21.27, 34.93, 25.79 ms. Throughput: 228.21 iter/sec. Timings for 5376K FFT length (20 cores, 20 workers): 69.55, 77.28, 96.74, 104.01, 102.45, 58.99, 74.29, 102.78, 103.01, 94.24, 102.66, 101.69, 103.75, 71.43, 88.18, 57.02, 63.89, 103.12, 90.30, 103.89 ms. Throughput: 235.63 iter/sec. Timings for 5760K FFT length (20 cores, 1 worker): 7.09 ms. Throughput: 141.02 iter/sec. Timings for 5760K FFT length (20 cores, 2 workers): 9.66, 9.61 ms. Throughput: 207.63 iter/sec. Timings for 5760K FFT length (20 cores, 6 workers): 31.77, 30.04, 21.98, 32.62, 23.01, 27.46 ms. Throughput: 220.78 iter/sec. Timings for 5760K FFT length (20 cores, 20 workers): 63.67, 107.85, 107.65, 61.85, 106.82, 65.99, 100.42, 108.34, 108.39, 105.96, 106.88, 107.29, 107.29, 107.88, 86.25, 109.04, 107.83, 53.38, 109.05, 56.44 ms. Throughput: 225.73 iter/sec. Timings for 6144K FFT length (20 cores, 1 worker): 7.70 ms. Throughput: 129.90 iter/sec. Timings for 6144K FFT length (20 cores, 2 workers): 11.12, 11.12 ms. Throughput: 179.86 iter/sec. Timings for 6144K FFT length (20 cores, 6 workers): 28.95, 40.99, 26.71, 41.50, 35.84, 22.80 ms. Throughput: 192.23 iter/sec. 
Timings for 6144K FFT length (20 cores, 20 workers): 66.36, 125.75, 124.54, 83.76, 123.24, 123.94, 74.74, 124.56, 125.17, 100.65, 130.60, 125.03, 105.78, 123.31, 122.69, 65.99, 124.44, 130.60, 59.97, 110.87 ms. Throughput: 196.41 iter/sec. Timings for 6400K FFT length (20 cores, 1 worker): 7.74 ms. Throughput: 129.23 iter/sec. Timings for 6400K FFT length (20 cores, 2 workers): 11.50, 11.42 ms. Throughput: 174.47 iter/sec. Timings for 6400K FFT length (20 cores, 6 workers): 43.84, 33.20, 25.16, 34.96, 33.42, 28.68 ms. Throughput: 186.05 iter/sec. Timings for 6400K FFT length (20 cores, 20 workers): 67.83, 59.96, 129.29, 122.52, 133.75, 132.97, 126.52, 126.06, 133.96, 100.43, 127.68, 58.84, 135.45, 133.65, 130.27, 133.77, 135.44, 129.95, 128.17, 58.70 ms. Throughput: 190.33 iter/sec. Timings for 6720K FFT length (20 cores, 1 worker): 8.31 ms. Throughput: 120.40 iter/sec. Timings for 6720K FFT length (20 cores, 2 workers): 11.53, 11.37 ms. Throughput: 174.72 iter/sec. [Wed Nov 29 15:06:26 2017] Timings for 6720K FFT length (20 cores, 6 workers): 27.68, 43.63, 30.57, 42.24, 32.78, 25.91 ms. Throughput: 184.54 iter/sec. Timings for 6720K FFT length (20 cores, 20 workers): 129.37, 129.04, 115.29, 130.64, 68.38, 129.43, 119.89, 62.04, 128.76, 127.72, 129.64, 126.78, 127.53, 128.82, 76.40, 61.43, 128.18, 129.49, 101.97, 111.66 ms. Throughput: 189.07 iter/sec. Timings for 6912K FFT length (20 cores, 1 worker): 8.58 ms. Throughput: 116.49 iter/sec. Timings for 6912K FFT length (20 cores, 2 workers): 13.05, 12.98 ms. Throughput: 153.71 iter/sec. Timings for 6912K FFT length (20 cores, 6 workers): 35.86, 37.57, 37.19, 45.06, 46.26, 26.01 ms. Throughput: 163.65 iter/sec. Timings for 6912K FFT length (20 cores, 20 workers): 155.31, 65.91, 158.20, 158.95, 160.21, 158.13, 160.28, 158.85, 155.31, 66.22, 73.09, 152.40, 150.07, 152.12, 155.50, 123.06, 150.83, 75.56, 116.95, 157.62 ms. Throughput: 163.66 iter/sec. Timings for 7168K FFT length (20 cores, 1 worker): 8.95 ms. 
Throughput: 111.73 iter/sec. Timings for 7168K FFT length (20 cores, 2 workers): 13.20, 13.23 ms. Throughput: 151.34 iter/sec. Timings for 7168K FFT length (20 cores, 6 workers): 50.22, 34.11, 32.42, 37.41, 37.99, 36.57 ms. Throughput: 160.47 iter/sec. Timings for 7168K FFT length (20 cores, 20 workers): 69.52, 151.76, 151.72, 151.10, 153.70, 152.53, 151.18, 69.46, 153.64, 152.80, 149.58, 147.73, 144.40, 148.19, 149.92, 149.94, 68.87, 72.03, 140.11, 148.58 ms. Throughput: 164.05 iter/sec. Timings for 7680K FFT length (20 cores, 1 worker): 9.23 ms. Throughput: 108.37 iter/sec. Timings for 7680K FFT length (20 cores, 2 workers): 14.34, 14.34 ms. Throughput: 139.47 iter/sec. Timings for 7680K FFT length (20 cores, 6 workers): 50.29, 31.75, 43.11, 54.94, 33.58, 39.34 ms. Throughput: 147.98 iter/sec. Timings for 7680K FFT length (20 cores, 20 workers): 167.94, 71.75, 93.29, 109.12, 179.30, 179.36, 176.81, 171.65, 166.21, 176.98, 91.27, 102.03, 179.94, 163.99, 167.54, 168.75, 168.41, 163.89, 152.53, 81.45 ms. Throughput: 149.26 iter/sec. Timings for 8000K FFT length (20 cores, 1 worker): 9.82 ms. Throughput: 101.84 iter/sec. Timings for 8000K FFT length (20 cores, 2 workers): 14.04, 13.93 ms. Throughput: 143.02 iter/sec. Timings for 8000K FFT length (20 cores, 6 workers): 50.13, 49.67, 27.40, 49.86, 37.47, 34.30 ms. Throughput: 152.48 iter/sec. [Wed Nov 29 15:11:36 2017] Timings for 8000K FFT length (20 cores, 20 workers): 154.19, 72.59, 158.16, 163.38, 72.88, 155.24, 151.27, 148.87, 163.62, 159.39, 99.64, 155.69, 159.07, 98.74, 158.52, 163.94, 164.93, 164.83, 89.34, 113.56 ms. Throughput: 155.99 iter/sec. Timings for 8192K FFT length (20 cores, 1 worker): 10.50 ms. Throughput: 95.27 iter/sec. Timings for 8192K FFT length (20 cores, 2 workers): 15.86, 15.82 ms. Throughput: 126.27 iter/sec. Timings for 8192K FFT length (20 cores, 6 workers): 46.48, 43.33, 46.01, 60.91, 42.41, 37.49 ms. Throughput: 133.00 iter/sec. 
Timings for 8192K FFT length (20 cores, 20 workers): 184.80, 93.81, 182.43, 187.74, 145.93, 113.38, 189.01, 148.27, 187.70, 139.96, 187.41, 182.88, 189.76, 183.01, 187.30, 87.91, 166.80, 135.81, 96.36, 189.81 ms. Throughput: 134.32 iter/sec. </snip> 

2017-11-30, 00:24  #8  
P90 years forever!
Aug 2002
Yeehaw, FL
7159_{10} Posts 
Quote:
Yes, it simply is a case of maximizing the throughput (iter/sec) value, which in your case seems heavily skewed toward one core per worker. I'd try benching the 5, 10, and 20 worker cases just to be sure (I previously suggested 6 and 12 because I thought you had a 24-core machine). Assuming the 20-worker benchmark maintains the best throughput, the only question remaining is: do you have the patience to wait for 20 workers to plod along at a slow pace before getting any results? GIMPS is better off with 4 completed results after a week's time than 20 abandoned, partially completed results in a week's time. 
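Picking the winner out of a long benchmark log can be scripted. A small sketch that scans text in the "Timings for ... Throughput: ... iter/sec." format quoted above (the function name, and reading the log as a plain string, are my assumptions):

```python
import re

# Parse "Timings for <N>K FFT length (<c> cores, <w> workers): ...
# Throughput: <t> iter/sec." lines, as produced by a Prime95
# throughput benchmark, and report the best worker count per FFT size.
PAT = re.compile(
    r"Timings for (\d+)K FFT length \((\d+) cores, (\d+) workers?\):"
    r".*?Throughput: ([\d.]+) iter/sec", re.S)

def best_config(text, fft_k):
    """Return (workers, iter_per_sec) with the highest throughput."""
    best = None
    for fft, cores, workers, thr in PAT.findall(text):
        if int(fft) == fft_k:
            if best is None or float(thr) > best[1]:
                best = (int(workers), float(thr))
    return best

sample = ("Timings for 4096K FFT length (20 cores, 1 worker): 5.18 ms. "
          "Throughput: 193.18 iter/sec. "
          "Timings for 4096K FFT length (20 cores, 20 workers): ... ms. "
          "Throughput: 305.01 iter/sec.")
print(best_config(sample, 4096))   # (20, 305.01)
```

Run over the full results above, this kind of scan confirms the pattern George describes: at every FFT size the 20-worker row has the highest iter/sec.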

2017-11-30, 00:51  #9  
Feb 2014
84_{10} Posts 
Quote:
Just trying to figure out how to read the results output and decide which is best to do. 

2017-11-30, 00:54  #10  
Undefined
"The unspeakable one"
Jun 2006
My evil lair
1011011110111_{2} Posts 
Quote:
Last fiddled with by retina on 2017-11-30 at 00:54 

2017-11-30, 00:56  #11 
Feb 2014
2^{2}·3·7 Posts 
Which, from the above results output, appears to be 20 cores and 6 workers, yes? (Which happens to be the suggested number of workers when I first started the program.)
