mersenneforum.org Version 0.7.8 Private Beta

2019-07-20, 07:37   #1
Mysticial

Sep 2016

2^3×43 Posts

Version 0.7.8 Private Beta

As of last week, I've feature-frozen v0.7.8 for a ~September/October release. So while it's still a couple months away, the changelist is mostly set.

In the past, I've mostly done feature-freeze testing myself. But that clearly didn't work for v0.7.7 since a gazillion bugs were found afterwards anyway. So this time, I'm going to try something different and open it up to a semi-private beta.

Summary of changes for the new version:

Most of the big changes in v0.7.8 are centered around large computations and record breaking:

- Swap Mode now has checksumming for all disk I/O. This will hopefully go a long way to solving the problem of silent data corruption for disk I/O. The performance impact of this is non-zero, but acceptable. The checksumming can be disabled if you want maximum performance.
- You can now configure the program to run a system command after each checkpoint. This makes it much easier and safer to automate backups for the really long computations.
- For Swap Mode computations, the program will now give you an upper bound for the size of the largest checkpoint. This makes it possible to budget storage space for the checkpoint backups.

Internally, there were major changes to the entire Swap Mode functionality. They are significant enough that it would be better if more people than just myself test it before somebody spends a lot of resources to attempt a Pi record with it.

Other Notable Changes:

- The BBP digit extractor is now faster and has higher offset limits. The previous limit was the 2^46'th hexadecimal digit of Pi. That means you can't use it to verify a Pi computation larger than 84.7 trillion decimal digits. That number is totally within range of modern hardware. The new limit varies depending on the ISA, but will go all the way up to the 2^59'th hex digit.
- Jesús Guillera recently discovered a very fast formula for Catalan's Constant. It's the 2nd fastest known formula. Version 0.7.8 will have a native implementation of it.
- The custom formulas can now do Nth root radicals. This lets y-cruncher compute Gamma(1/3). With this, the program is finally capable of computing all of the popular quasi-linear run-time constants.

A big unknown for this release is adding a Zen 2 binary. I'm currently waiting until September to make a decision on that.

- The 16-core 3950X becomes available in September. (Barring surprises, I will be getting the 3950X.)
- AMD will have revealed more Zen 2 details at Hot Chips 2019.
- Details of Zen 2 Threadripper and Epyc are (more likely) to be out.

The Zen 2 binary will only happen if there are worthwhile performance gains over the existing binaries. And if it does, it will be the only (planned) exception to the feature-freeze.

------

If anyone is interested in trying out the private beta for v0.7.8, contact me either by PM or other means. The Windows binaries are ready now; the Linux ones may take a few more days.

-----

EDIT: This is now a public beta.

Last fiddled with by Mysticial on 2019-10-19 at 08:06
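For reference, the 84.7 trillion figure in post #1 follows from the old 2^46 hex-digit limit, since each hexadecimal digit carries log10(16) ≈ 1.204 decimal digits. A quick sketch of that arithmetic in Python (the function name is only illustrative and is not part of y-cruncher):

```python
import math

def hex_to_decimal_digits(hex_digits: int) -> float:
    """Number of decimal digits covered by `hex_digits` hexadecimal digits."""
    # One hex digit holds log10(16) ~= 1.2041 decimal digits.
    return hex_digits * math.log10(16)

old_limit = hex_to_decimal_digits(2**46)  # previous BBP offset limit
new_limit = hex_to_decimal_digits(2**59)  # best-case new limit (ISA-dependent)

print(f"2^46 hex digits ~ {old_limit / 1e12:.1f} trillion decimal digits")
print(f"2^59 hex digits ~ {new_limit / 1e15:.0f} quadrillion decimal digits")
```

The same conversion applied to 2^59 shows how much headroom the new limit leaves for verifying future Pi records.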
2019-09-18, 07:43   #2
Mysticial

Sep 2016

2^3×43 Posts

We're now half-way through September:

- There is still no official release date for the 3950X.
- The 3900X can't stay in stock for more than a few minutes at a time. None of the stores I've talked to have received any 3900X shipments in a month now.
- There are (credible) rumors that AMD's clock speeds are too aggressive, making it difficult to bin chips into these higher bins (thus affecting supply of these high-end parts).
- TSMC is facing severe supply problems. There's an upcoming iPhone launch that's competing for TSMC's manufacturing capacity.

While I'm not willing to put any money on it yet, I'm getting the feeling that AMD will not be able to meet their promise of delivering the chip in September. If we assume the worst based on the reports about TSMC, I won't be surprised if the 3950X doesn't show up until next year.

I already have about $1000 of dead-weight hardware for this build that's been accumulating since July. And it looks like it will have to wait even longer.

------

This puts me in a somewhat stupid dilemma of whether I should keep holding the v0.7.8 release for the 3950X so I can test and (potentially) add a Zen 2 binary. I don't really want to do two releases since that breaks benchmark comparability within major releases. Deferring a Zen 2 binary to v0.7.9 is probably not a good idea since v0.7.9 is probably going to be very far out.

2019-09-18, 11:55   #3
mackerel

Feb 2016
UK

440₁₀ Posts

Since AMD committed to September I suspect they'll do a last minute launch regardless, even if only a handful of units will actually be available and sell out instantly. There is a lot of built up demand and who knows how long that will take to be filled. The only possible positive note is that some used 3900X might start to appear if people who want to sit on the leading edge upgrade already.

I have wondered if AMD could produce "lower clock" 12/16 core models in volume, but it would probably make their lineup appear confusing.

I presume you want the 2 CCD models in particular to optimise for? Otherwise you could get a 3600 as a temporary measure to start core optimisations. AMD had stated CCX-CCX latency is the same regardless of which CCD it is on, so is it more a bandwidth optimisation at that point?

2019-09-18, 16:31   #4
Mysticial

Sep 2016

2^3×43 Posts

Quote:
 Originally Posted by mackerel Since AMD committed to September I suspect they'll do a last minute launch regardless, even if only a handful of units will actually be available and sell out instantly. There is a lot of built up demand and who knows how long that will take to be filled. The only possible positive note is that some used 3900X might start to appear if people who want to sit on the leading edge upgrade already. I have wondered if AMD could produce "lower clock" 12/16 core models in volume, but it would probably make their lineup appear confusing.
Paper launch at best. They're not fooling anyone. But I'd love to be proven wrong.

Quote:
 I presume you want the 2 CCD models in particular to optimise for? Otherwise you could get a 3600 as a temporary measure to start core optimisations. AMD had stated CCX-CCX latency is the same regardless of which CCD it is on, so is it more a bandwidth optimisation at that point?
The 16-core is the one that has a compute/bandwidth ratio comparable to the high-end Rome Epyc (which is what I'm really targeting). Technically, there's nothing stopping me from running an 8-core single channel or severely downclocking the memory (aside from the unbalanced read/writes from 1 die). But I do intend to use it as a workhorse for everything non-AVX512. At the very least, this will take a huge load off my 7940X.

Quote:
 Otherwise you could get a 3600 as a temporary measure to start core optimisations.
I kinda hate doing that unless there's an immediate place to dump the chip afterwards. (I generally don't sell my stuff - or at least I've never bothered to.) In any case, I'm currently not working on optimizations for anything (and probably won't until next year). So there's kinda no point to getting anything now.

I'm holding off on Ice Lake for similar reasons. Everything I need ISA-wise in the short-term is in Cannon Lake which I have as my HTPC (yes, that 8121U NUC). And that Dell XPS 13, for my purposes, is not suitable as a laptop replacement. I can't justify spending $2300 for what's essentially an HTPC that doubles as a glorified iPad. But if nothing better comes out before the end of the year, I'll have to get it anyway due to complicated reasons.

Last fiddled with by Mysticial on 2019-09-18 at 17:11

2019-09-20, 17:53   #5
mackerel

Feb 2016
UK

2^3·5·11 Posts

https://www.anandtech.com/show/14895...oming-november

3950X pushed back to November, but some new Threadripper will also be out then.

2019-09-20, 18:06   #6
Mysticial

Sep 2016

2^3·43 Posts

Quote:
 Originally Posted by mackerel https://www.anandtech.com/show/14895...oming-november 3950X pushed back to November, but some new Threadripper will also be out then.
Looks like you beat me to it.

Well then, screw that and the Zen 2 binary/optimizations. Public beta for v0.7.8 will go out this weekend since it's now feature-complete. Looking at my priorities and to-do list, v0.7.9 will probably be around this time next year at best.

Gonna try an experiment one of these weekends: Will a B350 mobo take 4 x 32GB DIMMs? I'm gonna need to at least test all my hardware for DOAs before I let them sit out the return periods.

Last fiddled with by Mysticial on 2019-09-21 at 01:11

2019-09-22, 06:41   #7
Mysticial

Sep 2016

530₈ Posts

As promised, v0.7.8 public betas:

These are also release candidates. But I do expect a ton of bugs to show up.

-----

Off topic: 1st-gen Ryzen with a crappy 1st-gen AM4 mobo will handle 32 GB DIMMs. So if anybody has any crazy ideas...

https://twitter.com/Mysticial/status...43213101289477

It did take a couple BIOS updates though.

Last fiddled with by Mysticial on 2019-09-22 at 06:43

2019-10-11, 23:59   #8
Mysticial

Sep 2016

530₈ Posts

Looks like 9502 will not be the final release. 3 issues came up. 2 are trivial and harmless, the 3rd is embarrassing and non-trivial to fix. Not sure how I screwed this one up.

The non-trivial one only affects the NthRoot function in the custom formulas. So no effect on Pi computations.

Last fiddled with by Mysticial on 2019-10-12 at 00:00

2019-10-19, 08:05   #9
Mysticial

Sep 2016

530₈ Posts

New beta is out:

Changes since 9502:

- Fixed a crash for very small computations.
- Fixed a corner case of non-convergence for Nth root radicals in the custom formulas.
- Fixed an issue in swap mode where it may underestimate how much disk space is needed for checkpoints.
- Untested fix for a crash in Linux when no NUMA nodes are detected.

Last fiddled with by Mysticial on 2019-10-19 at 08:06
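The thread doesn't say what the Nth root non-convergence corner case was. As a generic illustration only (not y-cruncher's code, which works on large floating-point numbers), here is a minimal sketch of an integer Nth-root Newton iteration whose stopping rule guarantees termination: start at or above the true root and stop as soon as the iterate stops decreasing. The function name is made up for this example.

```python
def integer_nth_root(a: int, n: int) -> int:
    """Return floor(a ** (1/n)) for integers a >= 0 and n >= 1."""
    if a < 0 or n < 1:
        raise ValueError("need a >= 0 and n >= 1")
    if a < 2 or n == 1:
        return a
    # Initial guess: a power of two that is >= the true root,
    # because a < 2**bits implies root < 2**ceil(bits / n).
    x = 1 << -(-a.bit_length() // n)
    while True:
        # Newton step for f(x) = x**n - a, done in integer arithmetic.
        y = ((n - 1) * x + a // x ** (n - 1)) // n
        if y >= x:
            # The iterates decrease strictly until they reach the floor
            # of the root, so the first non-decrease means we are done.
            return x
        x = y

# Fixed-point usage: cube root of 2 scaled by 10**50.
root = integer_nth_root(2 * 10 ** (3 * 50), 3)
print(root)
```

Comparing successive iterates, rather than waiting for them to become equal, is one common way to avoid an iteration that oscillates between two neighbouring values forever.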
2019-10-21, 18:48   #10
kruoli

"Oliver"
Sep 2017
Porta Westfalica, DE

2×3×127 Posts

Quote:
 Originally Posted by Mysticial Untested fix for a crash in Linux when no NUMA nodes are detected.
Fixed for me.

By the way, every time I'm on Linux with y-cruncher, I will try to close the program with Ctrl+D (like bash etc.), but instead, the program freaks out. Would it be a lot of work to integrate something like "on every input, check for input = EOF, if true, exit"?
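The check suggested in post #10 can be sketched generically as follows. This is Python rather than y-cruncher's C++, and the function names are hypothetical. On a Unix terminal, Ctrl+D on an empty line makes the next read report end-of-file; a prompt loop that treats that empty result as just another invalid input will re-prompt forever, which is probably what "freaks out" looks like.

```python
import io
import sys
from typing import List, Optional, TextIO

def read_option(stream: TextIO) -> Optional[str]:
    """Read one menu option; return None once the input has hit EOF."""
    line = stream.readline()
    if line == "":
        # readline() returns "" only at EOF (e.g. Ctrl+D on Linux);
        # a blank line typed by the user arrives as "\n" instead.
        return None
    return line.strip()

def menu_loop(stream: TextIO = sys.stdin) -> List[str]:
    """Collect options until the user types 'q' or closes the input."""
    chosen: List[str] = []
    while True:
        option = read_option(stream)
        if option is None or option == "q":
            return chosen  # exit cleanly instead of re-prompting forever
        if option:  # ignore blank lines
            chosen.append(option)

# Simulated input that ends without a 'q', as Ctrl+D would.
print(menu_loop(io.StringIO("0\n1\n")))  # prints ['0', '1']
```

The key point is distinguishing "empty line" from "no more input", which is the platform-specific part mentioned in the reply below.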

2019-10-21, 20:14   #11
Mysticial

Sep 2016

2^3×43 Posts

Quote:
 Originally Posted by kruoli Fixed for me.
Awesome!

Quote:
 By the way, every time I'm on Linux with y-cruncher, I will try to close the program with Ctrl+D (like bash etc.), but instead, the program freaks out. Would it be a lot of work to integrate something like "on every input, check for input = EOF, if true, exit"?
I've actually never heard of Ctrl+D in this context, lol. Ctrl+C is how I break out of the program.

In either case, I'm not sure how "cleanly" it terminates the program, or whether any unflushed file buffers could cause data corruption in swap-mode/checkpointing. At the very least, the new version will detect any such corruption sometime after resume.
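How y-cruncher actually checksums its swap I/O isn't described in this thread. The general idea can be sketched like this: store a checksum next to every block that is written and verify it on every read, so a damaged block raises an error instead of being silently used. The record layout, the choice of CRC-32, and the function names below are all assumptions made for illustration.

```python
import os
import struct
import tempfile
import zlib

HEADER = struct.Struct("<II")  # (payload length, CRC-32 of payload)

def write_block(f, payload: bytes) -> None:
    """Write a length + checksum header followed by the payload."""
    f.write(HEADER.pack(len(payload), zlib.crc32(payload)))
    f.write(payload)

def read_block(f) -> bytes:
    """Read one block and verify its checksum before returning it."""
    header = f.read(HEADER.size)
    if len(header) != HEADER.size:
        raise IOError("truncated block header")
    length, expected = HEADER.unpack(header)
    payload = f.read(length)
    if len(payload) != length or zlib.crc32(payload) != expected:
        raise IOError("checksum mismatch: block is corrupted")
    return payload

def demo() -> bool:
    """Write a block, corrupt one byte on disk, report whether it was caught."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "swap.bin")
        with open(path, "wb") as f:
            write_block(f, b"0123456789" * 100)
            f.flush()
            os.fsync(f.fileno())  # push buffered data to the device
        with open(path, "r+b") as f:
            f.seek(HEADER.size + 5)
            f.write(b"X")  # simulate silent corruption of one byte
        with open(path, "rb") as f:
            try:
                read_block(f)
            except IOError:
                return True
    return False

print(demo())
```

A scheme like this only detects corruption when the block is read back, which matches the "sometime after resume" caveat above.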

I can look at the Ctrl+D case. But I can't promise anything since I'm not sure if I want to bother with all the platform-specific things.
