#837 |
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
12312₈ Posts |
Code:
Timings for 2304K FFT length (8 cores, 1 worker): 1.13 ms. Throughput: 881.23 iter/sec.
Timings for 2304K FFT length (8 cores, 8 workers): 14.88, 13.80, 14.65, 15.13, 14.54, 14.41, 14.49, 13.89 ms. Throughput: 553.21 iter/sec.

This seems contrary to what I've seen on every past computer (all 4 cores).

Related to that, I tried a separate test where I ran P-1s on 2 cores only, leaving the other 6 idle. The P-1s completed in 4:41. If I run all 8 cores on P-1 (MaxHighMemWorkers=6), each P-1 takes about 9 hours.

Or are these benchmarks only legit for LL/PRP and not for P-1?

Here is a snippet from a benchmark from today:

Code:
[Jan 18 16:01] Your timings will be written to the results.bench.txt file.
[Jan 18 16:01] Compare your results to other computers at http://www.mersenne.org/report_benchmarks
[Jan 18 16:01] Benchmarking multiple workers to measure the impact of memory bandwidth
[Jan 18 16:01] Timing 2048K FFT, 8 cores, 1 worker. Average times: 0.83 ms. Total throughput: 1208.05 iter/sec.
[Jan 18 16:01] Timing 2048K FFT, 8 cores, 2 workers. Average times: 2.12, 2.22 ms. Total throughput: 921.34 iter/sec.
[Jan 18 16:01] Timing 2048K FFT, 8 cores, 4 workers. Average times: 5.31, 5.12, 5.26, 4.85 ms. Total throughput: 779.64 iter/sec.
[Jan 18 16:01] Timing 2048K FFT, 8 cores, 8 workers. Average times: 10.36, 10.62, 10.29, 9.98, 10.65, 10.65, 10.33, 10.32 ms. Total throughput: 769.56 iter/sec.

The total throughput for 4 cores with 1, 2 or 4 workers was similar; slightly better overall with 4 workers... as expected.

Last fiddled with by petrw1 on 2021-01-18 at 22:16 |
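The throughput figures in these benchmarks look like the sum of each worker's iteration rate, i.e. 1000 divided by its per-iteration time in ms. A minimal Python sketch under that assumption (the function name `total_throughput` is made up for illustration and is not part of Prime95):

```python
def total_throughput(times_ms):
    """Sum per-worker iteration rates (iter/sec) from per-iteration times in ms.

    Assumption: total throughput is simply the sum of 1000 / time for each worker.
    """
    return sum(1000.0 / t for t in times_ms)


# Per-iteration times copied from the 2304K FFT benchmark in the post.
one_worker = [1.13]
eight_workers = [14.88, 13.80, 14.65, 15.13, 14.54, 14.41, 14.49, 13.89]

for label, times in (("1 worker", one_worker), ("8 workers", eight_workers)):
    print(f"{label}: {total_throughput(times):.2f} iter/sec")
```

For the 8-worker line this lands within rounding of the reported 553.21 iter/sec. The 1-worker line comes out slightly above the reported 881.23, presumably because the 1.13 ms shown in the log is itself rounded.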
#838 |
Romulan Interpreter
"name field"
Jun 2011
Thailand
24060₈ Posts |
The benchmarks apply to P-1 too. Except for the GCD phase, P-1 is computationally the same as LL (it needs FFT multiplications and squarings). Your system seems to be memory bound, so you will get better results if you run fewer workers. This has been the "norm" for a few years now, with new CPUs and with many cores, since P95 v28 or so. What you see is normal.

I have a quad-channel X99 mobo with a 10-core i7-6950X on it and 64GB RAM (edit: in 8 sticks of 8GB)**, and I witness the same behavior as you, except the timing is a bit different. You get 4 hours wall-clock running one worker on 2 cores, but you will NOT get the same 4 hours running 2 workers, each on 2 cores (4 cores used in total), because they access the same memory channels and wait for each other. Your best bet may be to run 2 workers, each on 4 cores, or so. Try different versions.

** Edit: does your 8x4 mean "8 sticks" or "4 sticks"? (And don't answer "yes".)

Last fiddled with by LaurV on 2021-01-19 at 04:00 |
#839 | |
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
1010011001010₂ Posts |
Quote:
4x8GB per attachment. I was hoping that getting very fast (?) 3600 RAM would minimize the bottleneck. I can try different core/worker setups, but the benchmark seems to indicate that 1 worker x 8 cores will be the fastest by far. Or am I reading it wrong? |
#840 | |
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
12312₈ Posts |
Quote:
Total about 22 completions per day. A short test suggests 2 workers x 4 cores will take almost exactly 2 hours per completion. With 2 workers that is 24 completions per day. A short test suggests 1 worker x 8 cores takes 51 minutes per completion. That is 28 completions per day.

I find it hard to believe that, with 4x8GB of DDR4 DRAM at 3600MHz, the best throughput comes from 1 worker sharing 8 cores. Could something be set up wrong? Where do I start? Thanks |
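The completions-per-day figures above follow from simple arithmetic: number of workers times minutes in a day, divided by minutes per completion. A small Python sketch of that calculation, assuming every worker finishes at the same steady rate (the helper name `completions_per_day` is made up for illustration):

```python
MINUTES_PER_DAY = 24 * 60


def completions_per_day(workers, minutes_per_completion):
    """Completions per day, assuming all workers run at the same steady rate."""
    return workers * MINUTES_PER_DAY / minutes_per_completion


# (workers, minutes per completion) taken from the short tests in the post.
configs = {
    "2 workers x 4 cores": (2, 120),
    "1 worker x 8 cores": (1, 51),
}

for name, (workers, minutes) in configs.items():
    print(f"{name}: {completions_per_day(workers, minutes):.1f} per day")
```

This reproduces the 24 and roughly 28 completions per day quoted in the post.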
#841 |
Romulan Interpreter
"name field"
Jun 2011
Thailand
2⁴·643 Posts |
You are not reading it wrong. Try it and see if it is indeed so much faster. If it is not, you have an argument with George.

Edit: crosspost

Last fiddled with by LaurV on 2021-01-19 at 05:19 |
#842 |
P90 years forever!
Aug 2002
Yeehaw, FL
11×751 Posts |
#843 | |
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
2×3×887 Posts |
Quote:
See CPU-Z screen shots. RAM is in slots 1, 3, 5, 7.

1 Worker x 8 Cores overnight. Anywhere from 47 to 60 minutes per.

Code:
Magic_8_Ball 43313173 NF-PM1 2021-01-19 13:55 0.0 B1=1000000, B2=20000000 4.5044
Magic_8_Ball 43310401 NF-PM1 2021-01-19 13:07 0.0 B1=1000000, B2=20000000 4.5044
Magic_8_Ball 43311067 NF-PM1 2021-01-19 12:20 0.0 B1=1000000, B2=20000000 4.5044
Magic_8_Ball 43311071 NF-PM1 2021-01-19 11:21 0.0 B1=1000000, B2=20000000 4.5044
Magic_8_Ball 43311077 NF-PM1 2021-01-19 10:29 0.0 B1=1000000, B2=20000000 4.5044
Magic_8_Ball 43311101 NF-PM1 2021-01-19 09:29 0.0 B1=1000000, B2=20000000 4.5044
Magic_8_Ball 43311131 NF-PM1 2021-01-19 08:37 0.0 B1=1000000, B2=20000000 4.5044
Magic_8_Ball 43311269 NF-PM1 2021-01-19 07:37 0.0 B1=1000000, B2=20000000 4.5044
Magic_8_Ball 43311287 NF-PM1 2021-01-19 06:45 0.0 B1=1000000, B2=20000000 4.5044 |
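The "47 to 60 minutes" range can be checked against the result timestamps above. A minimal Python sketch, assuming the timestamps mark when each result was reported and that, with a single worker, the gap between consecutive results approximates the time per P-1 (the helper name `gaps_in_minutes` is made up for illustration):

```python
from datetime import datetime

# Timestamps copied from the results listing in the post.
stamps = [
    "2021-01-19 13:55", "2021-01-19 13:07", "2021-01-19 12:20",
    "2021-01-19 11:21", "2021-01-19 10:29", "2021-01-19 09:29",
    "2021-01-19 08:37", "2021-01-19 07:37", "2021-01-19 06:45",
]


def gaps_in_minutes(timestamps):
    """Minutes between consecutive timestamps, oldest first."""
    times = sorted(datetime.strptime(s, "%Y-%m-%d %H:%M") for s in timestamps)
    return [int((b - a).total_seconds() // 60) for a, b in zip(times, times[1:])]


gaps = gaps_in_minutes(stamps)
print(gaps)
print(f"min {min(gaps)} min, max {max(gaps)} min")
```

The shortest and longest gaps match the 47 and 60 minutes mentioned in the post.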
#844 |
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada
2×3×887 Posts |
I'm now completing these same P-1s in 45 to 47 minutes. |
#845 |
Romulan Interpreter
"name field"
Jun 2011
Thailand
2⁴×643 Posts |
Just to make it clear, my wheelbarrow is faster when I run 2 workers on 5 cores each, compared with a single worker on 10, especially for small FFTs (like the PRP-CF and CF-DC ranges). For larger FFTs, the difference is not significant, or is arguable. That's why I suggested you try both versions. |
#846 |
"Nuri, the dragon :P"
Jul 2016
Good old Germany
2³×3×37 Posts |
Quick run without water cooling.
After about 10 minutes the temperature climbed to around 95°C-100°C, so I have to stop running BOINC work until I get the water cooling running. |
#847 |
"Viliam Furík"
Jul 2018
Martin, Slovakia
2×401 Posts |