mersenneforum.org New Raspberry PI 4

2019-07-05, 18:11   #12
hansl

Apr 2019

5·41 Posts

I don't have one to play with, but I wonder what sort of difference in idle power it makes if HDMI is disabled. From what I just read, on the earlier pi3 it saves about 30mA. Maybe more savings available for rpi4 since it has two HDMI ports, or at least more powerful graphics processing?

Also maybe USB ports could be disabled too (assuming you just access via SSH) for some savings?

Is there a build of powertop or similar program which breaks down what devices power is going towards?

2019-07-05, 19:03   #13

"Sam Laur"
Dec 2018
Turku, Finland

101001000₂ Posts

Quote:
 Originally Posted by hansl I don't have one to play with, but I wonder what sort of difference in idle power it makes if HDMI is disabled. From what I just read, on the earlier pi3 it saves about 30mA. Maybe more savings available for rpi4 since it has two HDMI ports, or at least more powerful graphics processing?
I can test this, but I left the thing on my desk at work... Monday at the earliest, then.

Also, I don't know what power saving tricks the official Raspbian distribution does by default, maybe it runs cooler? But yeah, Monday.

Quote:
 Originally Posted by hansl Also maybe USB ports could be disabled too (assuming you just access via SSH) for some savings?
Hmm... will have to look into it. Anyway, there is apparently a firmware update now available for the USB chip that can reduce the power consumption somewhat - by about 300 mW.
https://www.raspberrypi.org/forums/v...vl805#p1490467
But apparently this needs to be done under 32-bit Linux (for example plain old Raspbian); trying to run the upgrade utility just gives an error message for me.

Quote:
 Originally Posted by hansl Is there a build of powertop or similar program which breaks down what devices power is going towards?
Not to my knowledge, no. I was under the impression that it can only tell where the CPU power consumption is going, not the peripherals.

2019-07-05, 19:13   #14
ewmayer
2ω=0

Sep 2002
República de California

2×13×443 Posts

Nomead, thanks for the data. Re. idle-power, on the Odroid-C2 there is a removable jumper whose pulling-off saves some power, not sure if anything similar on your board.

Quote:
 Originally Posted by nomead The Pi3B (and 3A) likes to use radix-352 at 2816K FFT size, but the Pi4 for some reason is slower with it (not by much, 78.82 ms/iter for radix-352 vs. 76.70 ms for radix-176). By the way, the same thing happens on the Cortex-A57 on the Jetson Nano. Also, only 2 of 5 radix sets for 2304K passed, so it was skipped, no entry in mlucas.cfg .
352 is mainly geared for 5632K where it makes a significant difference on most of my ARM devices, whether it also helps at 2816K is hit or miss.

Do you still have the screen log from your self-tests? I'd like to look at the 2304K self-tests outputs to see why only 2 of the various FFT-radix combos at that length passed. Thanks.

2019-07-05, 19:34   #15

"Sam Laur"
Dec 2018
Turku, Finland

328₁₀ Posts

Quote:
 Originally Posted by ewmayer Nomead, thanks for the data. Re. idle-power, on the Odroid-C2 there is a removable jumper whose pulling-off saves some power, not sure if anything similar on your board. 352 is mainly geared for 5632K where it makes a significant difference on most of my ARM devices, whether it also helps at 2816K is hit or miss. Do you still have the screen log from your self-tests? I'd like to look at the 2304K self-tests outputs to see why only 2 of the various FFT-radix combos at that length passed. Thanks.
No jumpers on the Pi4... if something can be disabled, it needs to be done via software or firmware.

352 happens to help on Pi3 / BCM2837 so there it's a definite hit.

I'll attach the screenlog to this message.
Attached Files
 screenlog_pi4.zip (6.9 KB, 47 views)

2019-07-05, 20:12   #16
ewmayer
2ω=0

Sep 2002
República de California

2·13·443 Posts

Quote:
 Originally Posted by nomead 352 happens to help on Pi3 / BCM2837 so there it's a definite hit. I'll attach the screenlog to this message.
Thanks - looking at the 2304K self-tests in the log, 3 of the 5 runs suffer ROE >= 0.4375, and since that means fewer than half the tests passed, the code treats that FFT length as having failed the self-tests. Based on the good data points, here is a manually created cfg-file line for that length:
Code:
      2304  msec/iter =   61.16  ROE[avg,max] = [0.249911153, 0.343750000]  radices = 288 16 16 16  0  0  0  0  0  0
I quick-checked that length on my Odroid C2 just now and 3 of 5 tests passed so it wrote a cfgfile entry for me ... that difference had me puzzled - same code, same CPU hardware - until I recalled that the random residue shift can lead to such otherwise-identical-everything differences. If you try rerunning just that one FFT length in self-test mode via

./Mlucas -fftlen 2304 -iters 100 -cpu 0:3

you should see different residue shifts from your Mlucas -s m run, and perhaps will get the one more good data point that is needed for the cfg-file to get written. The self-test exponents are already set at the extreme high end of the range computed for each FFT length, so sometimes a little manual hackery of this kind is needed to get a complete set of cfg-file entries.
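
The pass/fail rule described in this post can be sketched in a few lines of Python. This is only an illustration, not the Mlucas source: the function name is made up, the ROE values in the examples are hypothetical, and the handling of an exact tie (possible only with an even number of runs) is an assumption read literally from "fewer than half the tests passed". The 0.4375 threshold is the one quoted above.

```python
ROE_LIMIT = 0.4375  # threshold quoted in the post; a run with max ROE >= this counts as failed


def fft_length_passes(max_roes, roe_limit=ROE_LIMIT):
    """Return True unless fewer than half of the radix-set runs passed.

    max_roes: one maximum roundoff error per radix set tried at this FFT length.
    Assumption: a run passes when its max ROE is strictly below roe_limit.
    """
    passed = sum(1 for roe in max_roes if roe < roe_limit)
    # "fewer than half passed" means the FFT length is treated as failed
    return 2 * passed >= len(max_roes)


# Hypothetical ROE values, chosen only to reproduce the two outcomes in the thread.
two_of_five = [0.34375, 0.3125, 0.4375, 0.5, 0.46875]   # 2 runs pass
three_of_five = [0.34375, 0.3125, 0.375, 0.5, 0.4375]   # 3 runs pass

print(fft_length_passes(two_of_five))    # no cfg-file entry for this length
print(fft_length_passes(three_of_five))  # cfg-file entry gets written
```

With five runs per length, a single borderline run flips the result, which is why a different residue shift can be enough to get the entry written.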

2019-07-06, 00:20   #17

"Sam Laur"
Dec 2018
Turku, Finland

510₈ Posts

Quote:
 Originally Posted by ewmayer ./Mlucas -fftlen 2304 -iters 100 -cpu 0:3 you should see different residue shifts from your Mlucas -s m run, and perhaps will get the one more good data point that is needed for the cfg-file to get written. The self-test exponents are already set at the extreme high end of the range computed for each FFT length, so sometimes a little manual hackery of this kind is needed to get a complete set of cfg-file entries.
Yup - and it gives the same 288 16 16 16 radix set on three consecutive tries, with 3 of 5 sets passed.

Oh, and here are the rest of the self-test runs.
v18.0:
Code:
      4096  msec/iter =  116.38  ROE[avg,max] = [0.000227303, 0.312500000]  radices = 256 16 16 32  0  0  0  0  0  0
      4608  msec/iter =  129.56  ROE[avg,max] = [0.000248429, 0.312500000]  radices = 288 16 16 32  0  0  0  0  0  0
      5120  msec/iter =  181.85  ROE[avg,max] = [0.000234485, 0.281250000]  radices = 160 32 32 16  0  0  0  0  0  0
      5632  msec/iter =  204.47  ROE[avg,max] = [0.000257845, 0.343750000]  radices = 176 32 32 16  0  0  0  0  0  0
      6144  msec/iter =  225.43  ROE[avg,max] = [0.000247003, 0.312500000]  radices = 192 32 32 16  0  0  0  0  0  0
      6656  msec/iter =  242.89  ROE[avg,max] = [0.000266479, 0.375000000]  radices = 208 32 32 16  0  0  0  0  0  0
      7168  msec/iter =  262.44  ROE[avg,max] = [0.000226100, 0.281250000]  radices = 224 32 32 16  0  0  0  0  0  0
      7680  msec/iter =  290.09  ROE[avg,max] = [0.000236377, 0.312500000]  radices = 240 32 32 16  0  0  0  0  0  0
preview version:
Code:
      4096  msec/iter =  124.18  ROE[avg,max] = [0.227270067, 0.281250000]  radices = 256 16 16 32  0  0  0  0  0  0
      4608  msec/iter =  130.03  ROE[avg,max] = [0.249110271, 0.312500000]  radices = 288 16 16 32  0  0  0  0  0  0
      5120  msec/iter =  154.51  ROE[avg,max] = [0.296955541, 0.375000000]  radices = 320 16 16 32  0  0  0  0  0  0
      5632  msec/iter =  166.74  ROE[avg,max] = [0.223459145, 0.281250000]  radices = 352 16 16 32  0  0  0  0  0  0
      6144  msec/iter =  226.16  ROE[avg,max] = [0.246091736, 0.343750000]  radices = 192 32 32 16  0  0  0  0  0  0
      6656  msec/iter =  243.35  ROE[avg,max] = [0.230394501, 0.312500000]  radices = 208 32 32 16  0  0  0  0  0  0
      7168  msec/iter =  265.73  ROE[avg,max] = [0.236601462, 0.312500000]  radices = 224 32 32 16  0  0  0  0  0  0
      7680  msec/iter =  283.72  ROE[avg,max] = [0.235477282, 0.343750000]  radices = 240 32 32 16  0  0  0  0  0  0
Indeed, there's a huge difference in 5120K and 5632K FFT sizes' speeds, because of those radix sets with 320 and 352.
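
To put numbers on the difference between the two builds, here is a minimal Python sketch that parses cfg-style lines and compares msec/iter per FFT length. The regular expression and the helper names (parse_timings, percent_change) are my own and not part of Mlucas; only the 4096K, 5120K and 5632K rows from the tables above are copied in.

```python
import re

# Matches the leading "<fftlen>  msec/iter = <time>" part of a cfg-style line.
CFG_LINE = re.compile(r"^\s*(\d+)\s+msec/iter\s*=\s*([\d.]+)")


def parse_timings(text):
    """Map FFT length (in K) to msec/iter for every cfg-style line in text."""
    timings = {}
    for line in text.splitlines():
        match = CFG_LINE.match(line)
        if match:
            timings[int(match.group(1))] = float(match.group(2))
    return timings


def percent_change(old, new):
    """Relative change in percent; negative means the new build is faster."""
    return {k: 100.0 * (new[k] - old[k]) / old[k] for k in old if k in new}


V18 = """\
4096  msec/iter =  116.38  ROE[avg,max] = [0.000227303, 0.312500000]  radices = 256 16 16 32  0  0  0  0  0  0
5120  msec/iter =  181.85  ROE[avg,max] = [0.000234485, 0.281250000]  radices = 160 32 32 16  0  0  0  0  0  0
5632  msec/iter =  204.47  ROE[avg,max] = [0.000257845, 0.343750000]  radices = 176 32 32 16  0  0  0  0  0  0
"""

PREVIEW = """\
4096  msec/iter =  124.18  ROE[avg,max] = [0.227270067, 0.281250000]  radices = 256 16 16 32  0  0  0  0  0  0
5120  msec/iter =  154.51  ROE[avg,max] = [0.296955541, 0.375000000]  radices = 320 16 16 32  0  0  0  0  0  0
5632  msec/iter =  166.74  ROE[avg,max] = [0.223459145, 0.281250000]  radices = 352 16 16 32  0  0  0  0  0  0
"""

changes = percent_change(parse_timings(V18), parse_timings(PREVIEW))
for fft_k in sorted(changes):
    print(f"{fft_k}K: {changes[fft_k]:+.1f}% msec/iter")
```

With these rows the preview build comes out roughly 15% faster at 5120K and roughly 18% faster at 5632K, while 4096K is about 7% slower.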

Last fiddled with by nomead on 2019-07-06 at 00:25 Reason: added tables

2019-07-06, 02:42   #18
ewmayer
2ω=0

Sep 2002
República de California

2·13·443 Posts

Quote:
 Originally Posted by nomead Yup - and it gives the same 288 16 16 16 radix set on three consecutive tries, with 3 of 5 sets passed.
If you rerun the same single-FFT-length way, you will get the same initial residue shift, and thus run-to-run data will be identical. You can, however, manually fiddle the initial shift via the -shift flag, if you like.
[timings snipped]
Quote:
 Indeed, there's a huge difference in 5120K and 5632K FFT sizes' speeds, because of those radix sets with 320 and 352.
Hmm ... a healthy speedup at 5632K I can believe because radix-352 is new in v19, but radix-320 was already there in v18. Maybe rerun the 5120K self-test once more using each of the v18 and v19 builds?

2019-07-06, 06:59   #19

"Sam Laur"
Dec 2018
Turku, Finland

2³×41 Posts

Quote:
 Originally Posted by ewmayer Hmm ... a healthy speedup at 5632K I can believe because radix-352 is new in v19, but radix-320 was already there in v18. Maybe rerun the 5120K self-test once more using each of the v18 and v19 builds?
Apparently v18 happened to give excessive roundoff on both 320 16 16 32 and 320 32 16 16 so that's why it wasn't using it. So again, yes, hand-massaging the test would help here.

2019-07-06, 18:40   #20
ewmayer
2ω=0

Sep 2002
República de California

11518₁₀ Posts

Quote:
 Originally Posted by nomead Apparently v18 happened to give excessive roundoff on both 320 16 16 32 and 320 32 16 16 so that's why it wasn't using it. So again, yes, hand-massaging the test would help here.
Sounds like I need to back off a bit on the self-test exponents in v19, to make sure faster but slightly more roundoff-prone FFT radix combos don't go by the wayside like that.

2019-07-08, 08:10   #21
nomead

"Sam Laur"
Dec 2018
Turku, Finland

148₁₆ Posts

Okay, power saving measurements:

force_turbo=1 in the configuration file for both Gentoo and Debian to keep it at 1.5 GHz even when idle.

Baseline (Gentoo 64-bit, nothing disabled yet):
0.69A idle -> 1.29A Mlucas running a doublecheck at 2816K FFT

Raspbian (because the firmware updater only runs on 32-bit Linux):
0.65A idle before USB update
0.59A idle after USB update

So yes, Raspbian does something different and saves a bit more power at idle.

Gentoo after USB firmware update, HDMI still on:
0.61A idle -> 1.21A Mlucas

Turning HDMI off with tvservice -o saves a further 0.02 Amps apparently.
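
For comparing these current readings with figures given in watts, a small Python sketch of the conversion follows. The 5.0 V supply voltage is an assumption (the nominal USB supply level; the actual voltage at the board was not reported), and the variable names are only illustrative.

```python
SUPPLY_VOLTS = 5.0  # assumption: nominal supply voltage, not measured in the thread


def amps_to_watts(amps, volts=SUPPLY_VOLTS):
    """Convert a current reading to power, P = V * I."""
    return amps * volts


# Current readings taken from the measurements above.
usb_fw_saving_w = amps_to_watts(0.65 - 0.59)   # Raspbian idle, before vs. after USB update
hdmi_saving_w = amps_to_watts(0.02)            # tvservice -o
mlucas_load_w = amps_to_watts(1.29 - 0.69)     # Gentoo baseline, idle vs. Mlucas running

print(f"USB firmware update saves about {usb_fw_saving_w:.2f} W")
print(f"HDMI off saves about {hdmi_saving_w:.2f} W")
print(f"Mlucas adds about {mlucas_load_w:.2f} W over idle")
```

At an assumed 5 V, the 0.06 A drop from the USB firmware update works out to about 0.3 W, which is in line with the roughly 300 mW figure quoted earlier in the thread.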

2019-07-08, 08:56   #22
M344587487

"Composite as Heck"
Oct 2017

2·313 Posts

powertop is worth a shot if it works; on my laptop it can disable controllers for USB, ethernet, SATA and other PCI devices. The older Pi's USB/ethernet controller was a power hog if I remember rightly.
