mersenneforum.org  

2019-06-24, 08:19   #1
ET_
Banned
"Luigi"
Aug 2002
Team Italia
2×2,383 Posts

New Raspberry PI 4

Please post your impressions of using it with Mlucas.

https://www.raspberrypi.org/blog/ras...e-now-from-35/

2019-06-24, 18:52   #2
ewmayer
2ω=0
Sep 2002
República de California
10110011111110₂ Posts

Looks good from a cost/expected-performance perspective. The Odroid N2's a73 cores run at 1.8GHz vs the Pi4's a72 @1.5GHz, so assuming the RAM/caches are similarly good at keeping the CPU fed, the Pi4 should run at ~80% of the speed of the N2's a73 CPU, for a little over half the cost, including the minimum accessories. As I've noted over in the next-gen Odroid thread, running a second Mlucas job on the 2xa53 sub-CPU of the N2 only boosts total throughput by around 1/8th, due to memory contention. Looking forward to seeing some timings from a purchaser of one of these. Luigi, did you order one, or are you waiting for someone else to try before you decide whether to buy?

Edit: Only saw this thread on the Pi4 over in Hardware after writing the above ... so a 28nm process vs the N2's 12nm; that will likely need an added heatsink/fan to prevent throttling (the N2 comes with a big honking heatsink covering the bottom of the board; in my monthlong all-6-cores Mlucas run I detected no throttling). And the Pi4 ships with 32-bit Raspbian, ugh - you'll want to load a 64-bit Linux to run Mlucas, since the Mlucas SIMD build requires it.

Last fiddled with by ewmayer on 2019-06-24 at 18:59

2019-06-25, 07:23   #3
nomead
"Sam Laur"
Dec 2018
Turku, Finland
148₁₆ Posts

I'll get one as soon as I can. Yesterday, when I heard the news, there were 15 pcs of the 2GB version left at Farnell, but being at work, I had to do other things for a short while, and when I came back, those were gone. Now their website shows that the next batch (both 2GB and 4GB models, but no info on 1GB) will arrive on September 23rd, but those estimates have been wildly inaccurate in the past, at least for the last couple of launches (3B+ and 3A+).

I just wonder how much magic will be needed to make all the new bits work in e.g. Gentoo, but let's hope that someone else does the work by the time I get the device.

2019-06-26, 18:59   #4
ewmayer
2ω=0
Sep 2002
República de California
11518₁₀ Posts

Nomead, when your Pi4 arrives and you have a 64-bit install, I'll be interested in the CPU temps running Mlucas on all cores. Over in the next-gen Odroid thread someone mentioned using 'cat /sys/class/thermal/thermal_zone0/temp' (divide the result by 1000 to get the temp in C) to monitor temps. My Odroid C2 has no fan but a 40x40mm factory-mounted heatsink, and the board sits inside a clear plastic case, which is surely not good for temps but crucial to allow the thing to be handled to plug in stuff. It typically ranges 60-70C, and only starts to show signs of throttling (based on Mlucas checkpoint timings) around 70C and above.
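
For anyone who wants to log that reading rather than cat it by hand, here is a minimal Python sketch. It assumes the kernel reports millidegrees Celsius at the sysfs path quoted above (the zone number may differ between boards), and the function names are made up for illustration.

```python
from pathlib import Path

# Path quoted in the post; the zone index may differ on other boards.
THERMAL_ZONE = Path("/sys/class/thermal/thermal_zone0/temp")

def millideg_to_celsius(raw: str) -> float:
    """Convert the raw sysfs string (millidegrees C) to degrees C."""
    return int(raw.strip()) / 1000.0

def read_cpu_temp(path: Path = THERMAL_ZONE) -> float:
    """Read the current temperature in degrees C from sysfs."""
    return millideg_to_celsius(path.read_text())
```

On the board itself, calling read_cpu_temp() in a loop alongside the Mlucas checkpoint timings should make it easier to see at what temperature the timings start to degrade.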

2019-06-27, 09:20   #5
ldesnogu
Jan 2008
France
1000010000₂ Posts

https://www.cnx-software.com/2019/06...comment-564167
Quote:
With all packages updated my RPi 4 now idles at close to 65°C (ambient temp 27°C).
That doesn't look good. I hope it's not representative...

Last fiddled with by ldesnogu on 2019-06-27 at 09:20

2019-06-27, 19:46   #6
ewmayer
2ω=0
Sep 2002
República de California
2CFE₁₆ Posts

Quote:
Originally Posted by ldesnogu
https://www.cnx-software.com/2019/06...comment-564167
That doesn't look good I hope it's not representative...
Oof - I mean, how much effort would it have required, during Pi4 board development, to experiment with a couple different CPU+RAM configurations to find a quasi-optimal one from a thermal standpoint, and slapping a small Alu heatsink on at the factory would cost an extra what, maybe 50 cents? We'd be happy to pay a couple bucks more for a Pi4 which can actually run under load without massive throttling, guys. The article you link does note this:

so far one of the cheapest RK3399 boards was NanoPi M4 going for $65. FriendlyELEC has now decided, certainly in response to Raspberry Pi 4 offering, to lower the price to $50 for the 2GB RAM version which compares to $45 with Raspberry Pi 4 2GB

The NanoPi M4 is 2xa72 + 4xa53, so using the general rule that an a72/a73 core = 2 a53 cores and ignoring throttling and multi-CPU memory contention issues, here are the relative strengths of the various offerings being discussed around here in terms of 'equivalent a53 cores':

o Odroid C2: 4
o NanoPi M4: 8
o RPi4: 8
o Odroid N2: 10
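
The arithmetic behind that list can be sketched in a few lines of Python. The NanoPi M4 and Odroid N2 core counts are the ones mentioned in this thread; the Odroid C2 (4x a53) and Pi 4 (4x a72) counts are filled in from the boards' published specs, and the names below are illustrative only.

```python
# Rule of thumb from the post: one a72/a73 core counts as two a53 cores.
A53_EQUIV = {"a53": 1, "a72": 2, "a73": 2}

# Core counts per board; C2 and RPi4 entries are taken from published specs.
BOARDS = {
    "Odroid C2": {"a53": 4},
    "NanoPi M4": {"a72": 2, "a53": 4},
    "RPi4": {"a72": 4},
    "Odroid N2": {"a73": 4, "a53": 2},
}

def a53_equivalents(cores):
    """Sum the cores of one board, weighted by the rule of thumb above."""
    return sum(A53_EQUIV[kind] * count for kind, count in cores.items())

for name, cores in BOARDS.items():
    print(f"{name}: {a53_equivalents(cores)}")
```

This ignores clock speed, throttling and memory contention, exactly as the post does, so it is only a rough ranking.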

But designed-to-throttle-under-load-ness will knock the number down, potentially by as much as 2x. Hopefully one of our RPi4 guinea-piggers will find a relatively cheap way to affix a heatsink and get some airflow around same. In a sense it's not that different from my Galaxy S7-compute-cluster phones, which need external airflow to keep throttling levels reasonable under load, although those at least use a highly sophisticated compact internal heat-spreading system to get the heat out to the case.

Last fiddled with by ewmayer on 2019-06-27 at 19:47

2019-06-27, 22:08   #7
ldesnogu
Jan 2008
France
1020₈ Posts

Also the RK3399 has higher frequency than the Pi 4. The NanoPi M4 looks a better choice to me.

An article about the throttling issue in Pi 4: https://www.cnx-software.com/2019/06...tsink-edition/

The heatsink is definitely required.

2019-06-27, 23:15   #8
ewmayer
2ω=0
Sep 2002
República de California
10110011111110₂ Posts

I'm guessing that under an all-cores SIMD-using load, the throttling penalty will be much larger than a mere 20%, sans heatsink. With a heatsink, a good "free airflow" place for the unit might be next to a vent fan in an ATX-cased system - even if the exhaust air is warm from the PC's CPU, warm moving air beats room-temp still air, as long as its temperature is still appreciably less than that of the Pi4 CPU. Or one could stack several Pi4s-with-heatsink next to a USB fan.

2019-06-28, 08:01   #9
M344587487
"Composite as Heck"
Oct 2017
2³·79 Posts

Quote:
Originally Posted by ldesnogu
Also the RK3399 has higher frequency than the Pi 4. The NanoPi M4 looks a better choice to me.

An article about the throttling issue in Pi 4: https://www.cnx-software.com/2019/06...tsink-edition/

The heatsink is definitely required.
When someone does the 100M jest benchmark with the pi 4 we'll be able to directly compare with the M4 ( https://www.mersenneforum.org/showpo...&postcount=229 ). From what I remember I was somewhat underwhelmed with how the RK3399 performed. Testing was done with the included massive aluminium heatsink and no airflow.

The pi 4 definitely needs a $2 copper heatsink at least, airflow might be optional but it'll need testing to confirm.

2019-06-28, 14:35   #10
nomead
"Sam Laur"
Dec 2018
Turku, Finland
148₁₆ Posts

Quote:
Originally Posted by M344587487
The pi 4 definitely needs a $2 copper heatsink at least, airflow might be optional but it'll need testing to confirm.
My favourite for the Pi3B+ and 3A+ :
https://uk.farnell.com/fischer-elekt...4mm/dp/1850037
It's aluminium though, but taller than most heat sinks sold for the Pi so its thermal resistance is lower. Then I use thermally conductive glue to stick it on. The conductive two-sided tapes aren't that good. Handy, but crap. The conductivity might seem OK but the tape is much thicker than a layer of that glue. But still, even the 3A+ at stock clock needs moving air in addition to this heat sink, if running Mlucas. The whole card draws about 0.9 A then, no idea how much goes to the CPU. This current draw, by the way, is also much higher than most "full load" measurements on the web. I guess they don't use the NEON instruction set, then.

Now, the BCM2711 on the Pi 4 is about the same size, so I'll probably use the same cooling solution for it. I managed to order one from Denmark, so maybe next week... but getting it to run any flavour of 64-bit Linux is still likely to be a problem.

2019-07-05, 13:26   #11
nomead
"Sam Laur"
Dec 2018
Turku, Finland
2³×41 Posts

Okay I'm in a bit of a hurry now. So I only have incomplete results for now. Received one 1GB Pi4 yesterday. I got lucky, the maintainer of the Raspberry Pi 3-compatible 64-bit Gentoo image was very fast and managed to make a quick patch set that misses some features (like, ahem, accessing memory over 1 GB) but it's good enough to check the performance. And I have to say I'm disappointed. It seems to be about 2.2x faster than a Pi3B+ but let's hope that some of it is due to the kernel being a quick hack and some part of the memory subsystem is somehow being used wrong. I mean, the Jetson Nano with 4x Cortex-A57 running at a slower clock speed seems to be about 15% faster than this.

So I fetched a fresh v1.4.2 image of the Pi3 64-bit Gentoo, on which the update package is based. See instructions at https://www.raspberrypi.org/forums/v...91136#p1492322 . There were some other performance-related surprises earlier in that thread (a merge sort benchmark compiled with GCC was actually slower than on a Pi3, but with Clang it was faster). Same guesses there, probably something wrong with the memory access.

Running idle without a heat sink, core temperature 55C.

Running Mlucas self-tests without a heat sink will result in throttling in under two minutes. First from 1.5 to 1.0 GHz and after a longer period the hard thermal limit of 85C is reached and it will throttle down to 750 MHz. Maybe even lower, but I didn't feel like waiting that long.

Idle, 14x14x14 mm heat sink but no fan, core temperature 51C after about 10 minutes. So no huge difference there. But running Mlucas self-tests, the core temperature now only rises to about 73C and there is no throttling. Running as a bare board on the table though, so in a case, some extra airflow would be needed.

So next I placed a small undervolted fan (12V fan fed with 5V) next to the Pi and now it stays at around 52-55C while running self tests, even for a longer period. And the fan really only makes a slight breeze and is pretty much silent in an office environment. I might be able to hear it at home though.

mlucas.cfg, as far as I let it run, precompiled Mlucas v18.0 :
Code:
18.0
      2048  msec/iter =   56.03  ROE[avg,max] = [0.000306411, 0.375000000]  radices = 128 32 16 16  0  0  0  0  0  0
      2304  msec/iter =   61.13  ROE[avg,max] = [0.000249863, 0.343750000]  radices = 288 16 16 16  0  0  0  0  0  0
      2560  msec/iter =   67.98  ROE[avg,max] = [0.000236003, 0.312500000]  radices = 160 16 16 32  0  0  0  0  0  0
      2816  msec/iter =   77.15  ROE[avg,max] = [0.000259256, 0.343750000]  radices = 176 16 16 32  0  0  0  0  0  0
      3072  msec/iter =   85.38  ROE[avg,max] = [0.000267585, 0.375000000]  radices = 192 16 16 32  0  0  0  0  0  0
      3328  msec/iter =   93.26  ROE[avg,max] = [0.000280428, 0.406250000]  radices = 208 16 16 32  0  0  0  0  0  0
      3584  msec/iter =   99.89  ROE[avg,max] = [0.000254826, 0.343750000]  radices = 224 16 16 32  0  0  0  0  0  0
      3840  msec/iter =  111.78  ROE[avg,max] = [0.000247071, 0.312500000]  radices = 240 16 16 32  0  0  0  0  0  0
mlucas.cfg, another run with the "preview" binary posted a short while ago:
Code:
18.0
      2048  msec/iter =   55.83  ROE[avg,max] = [0.256171472, 0.312500000]  radices = 128 16 16 32  0  0  0  0  0  0
      2560  msec/iter =   67.83  ROE[avg,max] = [0.235211654, 0.312500000]  radices = 160 16 16 32  0  0  0  0  0  0
      2816  msec/iter =   76.70  ROE[avg,max] = [0.276159794, 0.343750000]  radices = 176 16 16 32  0  0  0  0  0  0
      3072  msec/iter =   85.49  ROE[avg,max] = [0.266928258, 0.406250000]  radices = 192 16 16 32  0  0  0  0  0  0
      3328  msec/iter =   91.66  ROE[avg,max] = [0.254067332, 0.343750000]  radices = 208 16 16 32  0  0  0  0  0  0
      3584  msec/iter =  100.41  ROE[avg,max] = [0.271899162, 0.375000000]  radices = 224 16 16 32  0  0  0  0  0  0
      3840  msec/iter =  107.90  ROE[avg,max] = [0.247359254, 0.312500000]  radices = 240 16 16 32  0  0  0  0  0  0
The Pi3B (and 3A) likes to use radix-352 at 2816K FFT size, but the Pi4 for some reason is slower with it (not by much, 78.82 ms/iter for radix-352 vs. 76.70 ms for radix-176). By the way, the same thing happens on the Cortex-A57 on the Jetson Nano. Also, in the second run only 2 of 5 radix sets for 2304K passed, so it was skipped and has no entry in mlucas.cfg.
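
To compare runs like these without eyeballing the columns, the timing lines can be pulled out with a small Python sketch. It assumes only the line layout visible in the listings above (FFT length in K, then "msec/iter = value"); the function name and the sample excerpt are illustrative, not part of Mlucas.

```python
import re

# Matches the FFT length (in K) and the per-iteration time of one cfg line.
CFG_LINE = re.compile(r"^\s*(\d+)\s+msec/iter\s*=\s*([\d.]+)")

def parse_cfg(text):
    """Return {fft_length_K: msec_per_iter} for every timing line in text."""
    timings = {}
    for line in text.splitlines():
        match = CFG_LINE.match(line)
        if match:
            timings[int(match.group(1))] = float(match.group(2))
    return timings

# Two lines copied from the second listing above.
sample = """18.0
      2816  msec/iter =   76.70  ROE[avg,max] = [0.276159794, 0.343750000]  radices = 176 16 16 32  0  0  0  0  0  0
      3072  msec/iter =   85.49  ROE[avg,max] = [0.266928258, 0.406250000]  radices = 192 16 16 32  0  0  0  0  0  0
"""
timings = parse_cfg(sample)

# Relative slowdown of radix-352 (78.82 ms/iter, from the post) vs. radix-176.
slowdown = (78.82 - timings[2816]) / timings[2816]
print(f"radix-352 is {slowdown:.1%} slower at 2816K")
```

With the numbers quoted in the post, the radix-352 penalty at 2816K works out to just under 3%.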

For monitoring, I used these Raspberry Pi - specific commands:
Code:
vcgencmd measure_temp
vcgencmd measure_clock arm
vcgencmd get_throttled
The last one is especially important: it shows whether any throttling has occurred at all. The clock rate might seem normal at the time of polling, but only if this value stays at 0x0 has no throttling happened for any reason at any time.
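
For reference, a non-zero get_throttled value can be decoded with a short Python sketch. The bit meanings below follow the Raspberry Pi documentation for this command as I understand it, so treat the table as an assumption to check against the current docs; the function name is made up for illustration.

```python
# Bit meanings for `vcgencmd get_throttled` (assumed from the Pi documentation).
FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}

def decode_throttled(output):
    """Decode a line such as 'throttled=0x50005' into a list of flag names."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [name for bit, name in FLAGS.items() if value & (1 << bit)]
```

The low bits describe the state at the moment of polling, while bits 16 and up are the sticky "has occurred" flags, which is why a value of 0x0 means nothing has happened so far.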

Current consumption on the 5V supply:
0.69A idle, wired LAN in use, WLAN off
1.32A highest during Mlucas self-tests
The idle consumption seems really high, no wonder the board is running so hot...
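
Converted to power, assuming the supply sits at its nominal 5 V (the actual rail voltage is not given in the post), those currents work out as in this small Python sketch:

```python
SUPPLY_VOLTS = 5.0  # nominal rail voltage, an assumption; not a measured value

def power_watts(current_amps, volts=SUPPLY_VOLTS):
    """Electrical power P = V * I."""
    return volts * current_amps

idle = power_watts(0.69)  # idle, wired LAN in use, WLAN off
load = power_watts(1.32)  # highest reading during Mlucas self-tests
print(f"idle: {idle:.2f} W, load: {load:.2f} W, delta: {load - idle:.2f} W")
```

That is roughly 3.5 W at idle and 6.6 W under load for the whole board, not just the CPU.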

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.