mersenneforum.org  

Old 2020-01-05, 11:50   #23
juza89
 
Jan 2020
Finland

3 Posts

I just recently upgraded my computer to a 9900K.

I ran a couple of tests on wattage vs. throughput. I only tested the 2880K FFT because that is the size I am double-checking at the moment.

Results for the 9900K @ 3.6 GHz, all cores:
FFTlen=2880K, Type=3, Arch=4, Pass1=320, Pass2=9216, clm=4 (1 core, 1 worker): 11.45 ms. Throughput: 87.36 iter/sec.
FFTlen=2880K, Type=3, Arch=4, Pass1=320, Pass2=9216, clm=4 (2 cores, 1 worker): 5.98 ms. Throughput: 167.18 iter/sec.
FFTlen=2880K, Type=3, Arch=4, Pass1=320, Pass2=9216, clm=4 (2 cores, 2 workers): 11.71, 11.60 ms. Throughput: 171.63 iter/sec.
FFTlen=2880K, Type=3, Arch=4, Pass1=320, Pass2=9216, clm=4 (4 cores, 1 worker): 3.22 ms. Throughput: 310.47 iter/sec.
FFTlen=2880K, Type=3, Arch=4, Pass1=320, Pass2=9216, clm=4 (4 cores, 2 workers): 7.12, 7.06 ms. Throughput: 281.97 iter/sec.
FFTlen=2880K, Type=3, Arch=4, Pass1=320, Pass2=9216, clm=4 (4 cores, 4 workers): 14.38, 14.23, 14.26, 14.28 ms. Throughput: 279.98 iter/sec.
FFTlen=2880K, Type=3, Arch=4, Pass1=320, Pass2=9216, clm=4 (6 cores, 1 worker): 2.73 ms. Throughput: 366.08 iter/sec.
FFTlen=2880K, Type=3, Arch=4, Pass1=320, Pass2=9216, clm=4 (6 cores, 2 workers): 6.80, 6.79 ms. Throughput: 294.25 iter/sec.
FFTlen=2880K, Type=3, Arch=4, Pass1=320, Pass2=9216, clm=4 (6 cores, 4 workers): 21.49, 21.41, 10.72, 10.71 ms. Throughput: 279.89 iter/sec.
FFTlen=2880K, Type=3, Arch=4, Pass1=320, Pass2=9216, clm=4 (8 cores, 1 worker): 2.76 ms. Throughput: 362.61 iter/sec.
FFTlen=2880K, Type=3, Arch=4, Pass1=320, Pass2=9216, clm=4 (8 cores, 2 workers): 7.24, 7.23 ms. Throughput: 276.43 iter/sec.
FFTlen=2880K, Type=3, Arch=4, Pass1=320, Pass2=9216, clm=4 (8 cores, 4 workers): 14.98, 15.23, 15.15, 15.02 ms. Throughput: 265.06 iter/sec.
FFTlen=2880K, Type=3, Arch=4, Pass1=320, Pass2=9216, clm=4 (8 cores, 8 workers): 30.88, 30.40, 31.11, 30.18, 30.40, 30.64, 30.48, 30.48 ms. Throughput: 261.71 iter/sec.
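
The throughput column in these lines looks consistent with summing each worker's iterations per second (1000 divided by its time per iteration in ms). That relationship is inferred from the numbers above, not taken from Prime95 documentation, so treat this as a minimal sketch; the function name is made up for illustration.

```python
def throughput_iter_per_sec(ms_per_iter):
    """Sum iterations per second over all workers.

    ms_per_iter: list of per-worker times for one iteration, in milliseconds.
    """
    return sum(1000.0 / ms for ms in ms_per_iter)


# Per-worker timings copied from the benchmark lines above.
six_cores_one_worker = [2.73]
six_cores_four_workers = [21.49, 21.41, 10.72, 10.71]

print(round(throughput_iter_per_sec(six_cores_one_worker), 2))
print(round(throughput_iter_per_sec(six_cores_four_workers), 2))
```

Small differences from the reported figures are expected because the printed timings are rounded to two decimals.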


Double-checking work with 6 cores, 1 worker draws 62 W, as reported by HWMonitor.

Same test on the 9900K with a 4.7 GHz boost on all cores:
6 cores, 1 worker was still the fastest at 380 iter/sec, and power usage was 125 W.

Conclusion: doubling the power consumption yields only about a 4% performance increase in Prime95 (366 to 380 iter/sec).
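
A quick check of that conclusion, using only the two 6-core, 1-worker figures quoted in this post (366.08 iter/sec at 62 W and 380 iter/sec at 125 W). The helper name is hypothetical, and the 380 figure may itself be rounded.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


stock_iter_per_sec, stock_watts = 366.08, 62.0   # 3.6 GHz all cores
boost_iter_per_sec, boost_watts = 380.0, 125.0   # 4.7 GHz all cores

speed_gain_pct = percent_change(stock_iter_per_sec, boost_iter_per_sec)
power_ratio = boost_watts / stock_watts

print(f"throughput gain: {speed_gain_pct:.1f}%")
print(f"power ratio: {power_ratio:.2f}x")
```

With these inputs the throughput gain comes out a little under 4% for roughly twice the package power.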
Old 2020-01-06, 04:28   #24
VBCurtis
 
"Curtis"
Feb 2005
Riverside, CA

3×1,543 Posts

Conclusion: Your CPU is waiting on your memory to provide data.
A faster CPU clock, in this case, just means hurry up and wait.

You could try *under*clocking the CPU if you are looking for peak efficiency; I imagine you could drop wattage by 10% or more while still waiting on memory a bit.
Old 2020-01-06, 05:56   #25
nomead
 
"Sam Laur"
Dec 2018
Turku, Finland

330₁₀ Posts

I have an i5-8400 with crappy slow OEM memory at work. It's a 6-core processor but I'm actually running Prime95 on just 4 cores because the throughput was best at that setting. So it is starved for memory bandwidth even sooner.
Old 2020-01-25, 11:25   #26
juza89
 
Jan 2020
Finland

3 Posts

I did some more testing on the 9900K.
I overclocked the memory to 3600 MHz at 1.38 V and got it stable.
I tested different CPU speeds from 800 MHz to 4000 MHz. There is no need to go faster because memory starts to bottleneck.

The fastest result was 455.32 iter/sec with 6 cores @ 4000 MHz, consuming 84.5 W. That works out to 5.39 iters/watt.

For peak efficiency per watt, 1500 MHz gave 283.27 iter/sec at 25 W. That's 11.33 iters/watt!

For anyone interested, I've attached all the data I collected in a spreadsheet.
Attached file: 9900K efficiency chart.7z (44.5 KB)
Old 2020-01-25, 11:35   #27
retina
Undefined
 
"The unspeakable one"
Jun 2006
My evil lair

177F₁₆ Posts

Quote:
Originally Posted by juza89
I did some more testing on the 9900K.
I overclocked the memory to 3600 MHz at 1.38 V and got it stable.
I tested different CPU speeds from 800 MHz to 4000 MHz. There is no need to go faster because memory starts to bottleneck.

The fastest result was 455.32 iter/sec with 6 cores @ 4000 MHz, consuming 84.5 W. That works out to 5.39 iters/watt.

For peak efficiency per watt, 1500 MHz gave 283.27 iter/sec at 25 W. That's 11.33 iters/watt!

For anyone interested, I've attached all the data I collected in a spreadsheet.

Thanks for the values.

Note that if you compute iterations/sec per Watt then the output unit is iterations/Joule (because 1 Watt = 1 Joule/sec).
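
To make the unit point concrete, here is a minimal Python sketch that redoes the two efficiency figures from juza89's post as iterations per Joule. The function name is an illustrative choice; the inputs are the quoted package-power readings, not wall-plug measurements.

```python
def iterations_per_joule(iter_per_sec, watts):
    """(iterations / second) / (Joules / second) = iterations / Joule."""
    return iter_per_sec / watts


fastest = iterations_per_joule(455.32, 84.5)        # 6 cores @ 4000 MHz
most_efficient = iterations_per_joule(283.27, 25.0)  # @ 1500 MHz

print(f"fastest setting:        {fastest:.2f} iter/J")
print(f"most efficient setting: {most_efficient:.2f} iter/J")
print(f"efficiency ratio:       {most_efficient / fastest:.2f}x")
```

The inverse, Joules per iteration, is just `watts / iter_per_sec` and may be the handier form when estimating the energy for a whole test.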
Old 2020-01-25, 23:04   #28
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

3²·7²·11 Posts

Iterations/Joule is an interesting measure, but a great deal depends on the FFT length.
It also depends on whether the system's many auxiliary loads are fed by those Joules, or only the CPU. Where is the power consumption measured: at the wall plug, the CPU's sensors, or elsewhere? Computational effort per iteration is O(n log n log log n), not constant.
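
As a rough illustration of why iterations/Joule figures from different FFT lengths are not directly comparable, the sketch below evaluates the n log n log log n growth term for two lengths. The constant factor is unknown, natural logarithms are an arbitrary choice, and real timings also depend on cache and memory effects, so only the ratio is even loosely meaningful. The 5760K length is just an example of doubling 2880K.

```python
import math


def relative_effort(n):
    """Growth term n * log(n) * log(log(n)), up to an unknown constant."""
    return n * math.log(n) * math.log(math.log(n))


def effort_ratio(n_small, n_large):
    """How much more work one iteration at n_large takes than at n_small."""
    return relative_effort(n_large) / relative_effort(n_small)


fft_2880k = 2880 * 1024
fft_5760k = 5760 * 1024  # example only: twice the length tested in this thread

print(f"{effort_ratio(fft_2880k, fft_5760k):.2f}x work per iteration")
```

Under this model, doubling the FFT length costs slightly more than twice the work per iteration, so iterations/Joule drops even if the hardware efficiency is unchanged.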
Old 2020-01-26, 08:07   #29
juza89
 
Jan 2020
Finland

3 Posts

Quote:
Originally Posted by kriesel
Iterations/Joule is an interesting measure, but a great deal depends on the FFT length.
It also depends on whether the system's many auxiliary loads are fed by those Joules, or only the CPU. Where is the power consumption measured: at the wall plug, the CPU's sensors, or elsewhere? Computational effort per iteration is O(n log n log log n), not constant.

All measurements were done using the 2880K FFT.
Power usage was measured by reading the CPU's sensors (package power) with HWMonitor while the first FFT implementation was running in the Prime95 benchmark, and I averaged it by eye. Say I measured the wattage as 25 W: in that case it was actually fluctuating between 24.7 and 25.3. Random spikes were ignored; I figured they were probably background processes occasionally consuming CPU cycles.
There was little difference in consumption between the different FFT implementations, but I didn't bother taking a measurement for every implementation. It would have been too time-consuming.
The whole point of the iters/Joule measure was to find the most power-efficient speed for the CPU. I would have guessed it to be the slowest speed and lowest core voltage, but that turned out not to be the case.
First I changed CPU speeds in the BIOS with voltages on auto, but for some reason the voltages didn't go lower than 0.9 V. Then I found a program called ThrottleStop, which lets you change the CPU speed on the fly from Windows. The voltages at the different speeds were the stock voltages the CPU asked for.
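
The "average by eye, ignore random spikes" step could be made repeatable if the sensor readings were logged. A minimal sketch, assuming the readings are available as a list of watts: keep only samples close to the median and average those. The sample values and the 1 W tolerance below are hypothetical, not data from this thread.

```python
from statistics import mean, median


def steady_power(samples_w, spike_tolerance_w=1.0):
    """Average the samples lying within spike_tolerance_w of the median.

    Samples further away are treated as spikes (e.g. background activity)
    and dropped before averaging.
    """
    mid = median(samples_w)
    kept = [s for s in samples_w if abs(s - mid) <= spike_tolerance_w]
    return mean(kept)


# Hypothetical package-power readings in watts, with two spikes.
readings = [24.7, 25.1, 25.3, 24.9, 31.8, 25.0, 24.8, 29.4, 25.2]
print(f"{steady_power(readings):.2f} W")
```

Using the median as the reference keeps a few large spikes from shifting the baseline, which a plain mean would not.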
Old 2020-01-28, 22:07   #30
TheJudger
 
"Oliver"
Mar 2005
Germany

2×3×5×37 Posts

Stock Ryzen 9 3900X with dual DDR4-3200 (dual rank).

Code:
Prime95 64-bit version 29.8, RdtscTiming=1
Timings for 2880K FFT length (12 cores, 1 worker):  1.33 ms.  Throughput: 750.80 iter/sec.
Timings for 2880K FFT length (12 cores, 2 workers):  2.10,  2.10 ms.  Throughput: 954.10 iter/sec.

The L3 cache works well! Once you increase the FFT size (somewhere near 4M), you want to switch to 12 cores, 1 worker, because the data no longer fits twice into the L3 cache.
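
A back-of-the-envelope version of that cache argument, as a minimal sketch. It assumes 8 bytes (one double) per FFT element and treats the 3900X's 64 MB of L3 as one pool; in reality the L3 is split across core complexes and Prime95 needs extra tables besides the FFT data, so the real threshold sits lower than this simple model suggests. The function names are made up for illustration.

```python
BYTES_PER_ELEMENT = 8            # assumption: one double per FFT element
L3_BYTES = 64 * 1024 * 1024      # Ryzen 9 3900X total L3, treated as one pool


def fft_bytes(fft_len_k):
    """Approximate FFT data size for a length given in K (1K = 1024)."""
    return fft_len_k * 1024 * BYTES_PER_ELEMENT


def workers_fit_in_l3(fft_len_k, workers, l3_bytes=L3_BYTES):
    """True if every worker's FFT data fits in L3 at the same time."""
    return workers * fft_bytes(fft_len_k) <= l3_bytes


for fft_len_k in (2880, 4096, 4608):
    mib = fft_bytes(fft_len_k) / (1024 * 1024)
    print(fft_len_k, f"{mib:.1f} MiB per worker,",
          "2 workers fit:", workers_fit_in_l3(fft_len_k, 2))
```

Under these assumptions two 2880K workers need about 45 MiB, while two 4096K workers would already fill all 64 MiB with nothing to spare, which is in line with the "somewhere near 4M" remark.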

Oliver
Old 2020-01-31, 21:21   #31
TheJudger
 
"Oliver"
Mar 2005
Germany

2126₈ Posts

Hi,

full benchmarks here: https://mersenneforum.org/showpost.p...&postcount=788

Oliver
Old 2020-10-25, 06:38   #32
DrobinsonPE
 
Aug 2020

2×41 Posts

I decided that my computers were using too much power, so I have started tuning them for efficiency. My first set of results is for an i3-9100.

With the setting I chose to use, mprime speed dropped by less than 10% with almost a 46% drop in power use. In addition to the significant drop in power use, the computer fans spin much slower now and make significantly less noise. The one thing I should have measured during the testing but didn't was the CPU temperature. I will look into adding that in the future. Most likely the temperature is much lower with the reduced power use.
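
Those two percentages can be turned into an efficiency figure. A minimal sketch, taking the quoted "less than 10%" and "almost 46%" as exactly 10% and 46%, so the result is only approximate; the function name is hypothetical.

```python
def efficiency_gain(speed_drop, power_drop):
    """Relative change in work per Joule.

    speed_drop, power_drop: fractional reductions (0.10 means 10% lower).
    Work per Joule scales as throughput / power.
    """
    return (1.0 - speed_drop) / (1.0 - power_drop) - 1.0


gain = efficiency_gain(speed_drop=0.10, power_drop=0.46)
print(f"about {gain * 100:.0f}% more work per Joule")
```

With those rounded inputs the tuned settings do roughly two-thirds more work per Joule than before.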

See the attached picture for the details.
Attached image: I3-9100 Efficiency Data.png (67.6 KB)
Old 2020-10-25, 07:55   #33
axn
 
Jun 2003

2·5²·97 Posts

Quote:
Originally Posted by DrobinsonPE
With the setting I chose to use, mprime speed dropped by less than 10% with almost a 46% drop in power use. In addition to the significant drop in power use, the computer fans spin much slower now and make significantly less noise. The one thing I should have measured during the testing but didn't was the CPU temperature. I will look into adding that in the future. Most likely the temperature is much lower with the reduced power use.

Very nice. How was the power measured?

All times are UTC. The time now is 16:22.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.