mersenneforum.org  

Old 2022-04-30, 23:27   #199
storm5510
Random Account
 
storm5510's Avatar
 
Aug 2009
Not U. + S.A.

2·3·397 Posts
Default

Quote:
Originally Posted by tshinozk View Post
With p95v307b9.win64, I cannot change the number of cores when I run the benchmark.
If I set it to 1 core, the UI shows 1 core but Prime95.exe uses all cores.

p95v308b13.win64 has the same issue.
You're specifying a range. I don't think it works this way. Just put 18.
Old 2022-05-01, 00:07   #200
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

15701₈ Posts
Default

Quote:
Originally Posted by James Heinrich View Post
Your screenshot shows 1-18 for "Number of CPU cores to benchmark".
Not a problem. See the note at the bottom of the screen capture's benchmark pane. Prime95 explicitly supports ranges, lists, or lists of ranges, benchmarking successively through 1, 2, 3, ... 18 cores on a single worker by entering 1-18 for example.
Check for other applications using lots of cycles. Firefox can be very CPU and memory intensive.
That can really distort both Prime95 benchmark results and what Task Manager CPU monitoring shows.

The best benchmarking results will be obtained when as many other processes as practical are idle or absent.
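To illustrate what "ranges, lists, or lists of ranges" means for the core-count field, here is a minimal Python sketch that expands such a spec into the successive core counts. It is not Prime95's own parser: the function name expand_core_spec and the comma as list separator are assumptions made for illustration only.

```python
def expand_core_spec(spec: str) -> list[int]:
    """Expand a spec such as "1-18" or "1,2,4-8" into a list of core counts.

    Hypothetical helper: the comma separator is an assumption, not
    something confirmed by the Prime95 dialog.
    """
    counts = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            lo, hi = (int(x) for x in part.split("-", 1))
            if lo > hi:
                raise ValueError(f"descending range: {part!r}")
            counts.extend(range(lo, hi + 1))  # ranges are inclusive
        else:
            counts.append(int(part))
    return counts


print(expand_core_spec("1-18"))     # every count from 1 through 18
print(expand_core_spec("1"))        # a single count
print(expand_core_spec("1,2,4-8"))  # a list that contains a range
```

Under this reading, entering 1-18 requests eighteen separate benchmark passes, while entering 1 requests only the single-core pass.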
Attached thumbnail: prime95benchmark.png (127.0 KB)

Last fiddled with by kriesel on 2022-05-01 at 00:08
Old 2022-05-01, 00:33   #201
James Heinrich
 
James Heinrich's Avatar
 
"James Heinrich"
May 2004
ex-Northern Ontario

2·37·53 Posts
Default

Quote:
Originally Posted by kriesel View Post
Not a problem.
It's only a problem because tshinozk claimed to have entered 1 but got benchmarks (at some point) using all cores.
Old 2022-05-01, 14:17   #202
tshinozk
 
Nov 2012
Japan

5×7 Posts
Default

I ran some older versions.
The issue can be identified from Task Manager or from the benchmark results.

p95v306b4.win64 OK (single core)
Timings for 2048K FFT length (1 core, 1 worker): 5.01 ms. Throughput: 199.74 iter/sec.

p95v307b1.win64 OK (single core), but fails to complete
p95v307b2.win64 NG (all cores)
p95v307b3.win64 NG (all cores)
p95v307b4.win64 fails to run, stops immediately
p95v307b5.win64 NG (all cores)
p95v307b7.win64 NG (all cores)
p95v307b8.win64 NG (all cores)
p95v307b9.win64 NG (all cores)

p95v308b13.win64 NG (all cores)
Timings for 2048K FFT length (1 core, 1 worker): 0.57 ms. Throughput: 1741.37 iter/sec.
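The two timing lines above can be compared directly. The following is a rough Python sketch, not part of Prime95, that parses result lines of this shape and computes the throughput ratio between the two versions. The regular expression and the name parse_timing are assumptions based only on the lines quoted in this thread.

```python
import re

# Pattern written against the result lines quoted in this thread;
# other Prime95 output formats are not covered.
LINE_RE = re.compile(
    r"Timings for (?P<fft>\d+)K FFT length "
    r"\((?P<cores>\d+) cores?, (?P<workers>\d+) workers?\): "
    r"(?P<ms>[\d.]+) ms\.\s+Throughput: (?P<ips>[\d.]+) iter/sec\."
)


def parse_timing(line):
    """Return the fields of one benchmark line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "fft_k": int(m["fft"]),
        "cores": int(m["cores"]),
        "workers": int(m["workers"]),
        "ms": float(m["ms"]),
        "iter_per_sec": float(m["ips"]),
    }


old = parse_timing(
    "Timings for 2048K FFT length (1 core, 1 worker): 5.01 ms. "
    "Throughput: 199.74 iter/sec."
)
new = parse_timing(
    "Timings for 2048K FFT length (1 core, 1 worker): 0.57 ms. "
    "Throughput: 1741.37 iter/sec."
)
ratio = new["iter_per_sec"] / old["iter_per_sec"]
print(f"v30.8b13 vs v30.6b4 throughput ratio: {ratio:.1f}x")
```

Both lines are labelled "1 core", yet the ratio is between 8 and 9, which is hard to explain by a software change alone and fits the observation that the newer build runs on many cores.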
Old 2022-05-02, 02:22   #203
tshinozk
 
Nov 2012
Japan

43₈ Posts
Default

It seems that Alder Lake has the same issue.
The 1-core result is too fast, even for an Alder Lake CPU running above 5 GHz.

12900k:
https://mersenneforum.org/showpost.p...8&postcount=64
Timings for 2048K FFT length (8 cores, 1 worker): 0.62 ms. Throughput: 1602.22 iter/sec.

https://mersenneforum.org/showpost.p...4&postcount=65
FFTlen=2048K all-complex, Type=3, Arch=8, Pass1=128, Pass2=16384, clm=4 (1 core, 1 worker): 0.62 ms. Throughput: 1624.29 iter/sec.

12700K:
https://mersenneforum.org/showpost.p...7&postcount=69
Timings for 2048K FFT length (1 core, 1 worker): 4.57 ms. Throughput: 218.94 iter/sec.
Timings for 2048K FFT length (8 cores, 1 worker): 0.64 ms. Throughput: 1564.59 iter/sec.
These 12700K results appear normal.
Old 2022-05-02, 10:28   #204
Zhangrc
 
"University student"
May 2021
Beijing, China

415₈ Posts
Default

Quote:
Originally Posted by tshinozk View Post
It seems that Alder Lake has the same issue.
The 1-core result is too fast, even for an Alder Lake CPU running above 5 GHz.

12900k:
https://mersenneforum.org/showpost.p...8&postcount=64
Timings for 2048K FFT length (8 cores, 1 worker): 0.62 ms. Throughput: 1602.22 iter/sec.

https://mersenneforum.org/showpost.p...4&postcount=65
FFTlen=2048K all-complex, Type=3, Arch=8, Pass1=128, Pass2=16384, clm=4 (1 core, 1 worker): 0.62 ms. Throughput: 1624.29 iter/sec.

It appears that this is normal.
Maybe the all-complex FFT is faster with the AVX-512 instruction set.
An FFT operates on complex numbers; if we compute on each complex number directly instead of handling its real and imaginary parts separately, we could get a speedup of over 2x.

Last fiddled with by Zhangrc on 2022-05-02 at 10:32
Old 2022-05-03, 02:00   #205
tshinozk
 
Nov 2012
Japan

5·7 Posts
Default

The "Benchmark all-complex FFTs" option is not much faster than the normal benchmark on my machine with AVX-512.

Timings for 2048K FFT length (1 core, 1 worker): 0.65 ms. Throughput: 1535.94 iter/sec.

Timings for 2048K all-complex FFT length (1 core, 1 worker): 0.65 ms. Throughput: 1546.01 iter/sec.

Both have the issue (they run on all cores).
Old 2022-05-03, 02:08   #206
James Heinrich
 
James Heinrich's Avatar
 
"James Heinrich"
May 2004
ex-Northern Ontario

2·37·53 Posts
Default

Quote:
Originally Posted by tshinozk View Post
Both have the issue (they run on all cores).
Have you fixed the issue where you're entering 1-18 for "CPU cores to benchmark"? Change that to 1 if you only want to test 1 core...
Old 2022-05-03, 02:32   #207
tshinozk
 
Nov 2012
Japan

5×7 Posts
Default

No.
Prime95.exe uses all cores, even if I enter 1 in "Number of CPU cores to benchmark" textbox.

Timings for 2048K FFT length (1 core, 1 worker): 0.61 ms. Throughput: 1627.02 iter/sec.
Timings for 2100K FFT length (1 core, 1 worker): 0.75 ms. Throughput: 1338.92 iter/sec.
Timings for 2160K FFT length (1 core, 1 worker): 0.80 ms. Throughput: 1243.79 iter/sec.


Throughput for 1 core is expected to be around 100-200 iter/sec for FFT lengths like these.
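The expectation above can be turned into a simple sanity check. This Python sketch flags results that claim 1 core but report throughput far above the 100-200 iter/sec range quoted in the post. The function name looks_multithreaded and the slack factor of 1.5 are arbitrary assumptions for illustration, not anything Prime95 provides.

```python
# Expected single-core throughput range in iter/sec, taken from the post.
EXPECTED_1CORE = (100.0, 200.0)


def looks_multithreaded(cores_reported, iter_per_sec,
                        expected=EXPECTED_1CORE, slack=1.5):
    """True if a '1 core' result is well above the expected upper bound.

    slack is an assumed tolerance so that a somewhat faster CPU
    is not flagged by mistake.
    """
    return cores_reported == 1 and iter_per_sec > expected[1] * slack


# (FFT length in K, reported cores, iter/sec) from the results in this post
results = [
    (2048, 1, 1627.02),
    (2100, 1, 1338.92),
    (2160, 1, 1243.79),
]
for fft_k, cores, ips in results:
    flag = "suspicious" if looks_multithreaded(cores, ips) else "plausible"
    print(f"{fft_k}K, {cores} core: {ips:.2f} iter/sec -> {flag}")
```

All three results from this post are flagged, whereas the 199.74 iter/sec result from p95v306b4 and the 218.94 iter/sec result from the 12700K are not.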
Old 2022-05-03, 05:08   #208
tshinozk
 
Nov 2012
Japan

5·7 Posts
Default

"FFT timings benchmark" does not have the issue.
I can see the multi-core scaling.

Timing FFTs using 1 core:
Best time for 2048K FFT length: 4.987 ms., avg: 5.001 ms.
Timing FFTs using 2 cores:
Best time for 2048K FFT length: 2.606 ms., avg: 2.891 ms.
Timing FFTs using 3 cores:
Best time for 2048K FFT length: 1.768 ms., avg: 2.066 ms.
Timing FFTs using 4 cores:
Best time for 2048K FFT length: 1.328 ms., avg: 1.853 ms.
Attached Files
File Type: txt results.bench_FFTtimings.txt (19.1 KB, 31 views)
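The multi-core scaling in these FFT timings can be quantified with a short calculation. This Python sketch uses only the "best time" values listed above; the function name scaling_table is hypothetical.

```python
# Best times in ms for the 2048K FFT, keyed by core count (from the post).
best_ms = {1: 4.987, 2: 2.606, 3: 1.768, 4: 1.328}


def scaling_table(times):
    """Return (cores, speedup, efficiency) rows relative to the 1-core time."""
    base = times[1]
    rows = []
    for n in sorted(times):
        speedup = base / times[n]
        rows.append((n, speedup, speedup / n))
    return rows


for n, speedup, efficiency in scaling_table(best_ms):
    print(f"{n} cores: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

The speedup stays close to the core count (efficiency above 90% up to 4 cores), which is the scaling one would expect when the requested number of cores is actually honored.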
Old 2022-05-04, 06:38   #209
tshinozk
 
Nov 2012
Japan

5×7 Posts
Default

I tried reducing the number of active cores in the BIOS setup.
Even with only 4 cores active (14 disabled) and Hyper-Threading off, Prime95.exe still uses all cores.


Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz
CPU speed: 3286.47 MHz, 4 cores
Timings for 2048K FFT length (1 core, 1 worker): 1.42 ms. Throughput: 703.46 iter/sec.
Attached Files
File Type: txt results.bench_reduce4Core.txt (6.7 KB, 36 views)
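As a rough cross-check, the 1.42 ms result can be compared with the per-core-count "FFT timings benchmark" values from post #208. This Python sketch picks the core count whose reference time is closest to an observed time. It assumes both runs are comparable, which may not hold exactly because the BIOS configuration and possibly the clock speed differ, and the reference table only covers 1 to 4 cores. The name closest_core_count is hypothetical.

```python
# Reference best times in ms for the 2048K FFT, from post #208.
fft_best_ms = {1: 4.987, 2: 2.606, 3: 1.768, 4: 1.328}


def closest_core_count(observed_ms, reference):
    """Return the core count whose reference time is nearest to observed_ms."""
    return min(reference, key=lambda n: abs(reference[n] - observed_ms))


print(closest_core_count(1.42, fft_best_ms))  # the "1 core" result with 4 active cores
print(closest_core_count(5.01, fft_best_ms))  # the p95v306b4 single-core result
```

Under these assumptions the 1.42 ms result sits nearest the 4-core reference time rather than the 1-core one, which matches the report that all active cores are being used.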
All times are UTC. The time now is 17:42.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
