#1 |
Aug 2020
79*6581e-4;3*2539e-3
2·293 Posts |
Somewhat embarrassingly, I underestimated how different a 172-digit composite, which I could comfortably do on a 10-core i10 including sieving, is from just the post-processing step of the 192-digit 5p2_950M.
The server that is supposed to do it is a 24-core Epyc 7401P with 128 GB RAM. Is it idiotic to try that or not? I wouldn't mind having it busy for 3 weeks or so, but if it's months, I'd start worrying about instability or about the hoster (Hetzner) deciding they don't like that kind of usage. |
#2 |
"Oliver"
Sep 2017
Porta Westfalica, DE
2⁶·17 Posts |
This should be okay. The c204 from the last team sieve took a month on 16 cores, and that was much more difficult.
#3 |
Apr 2020
19·43 Posts |
Yeah, you'll be fine. As a rule of thumb, matrix solving time roughly multiplies by 4.5 for each doubling of the matrix dimensions.
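To make the rule of thumb concrete: a factor of 4.5 per doubling corresponds to a power law with exponent log2(4.5), roughly 2.17, so the growth is faster than quadratic but not exponential. The following is only a minimal sketch; the function name and the sample numbers (15M dimensions, 100 hours) are made up for illustration, and the rule itself is just the rough estimate quoted above.

```python
import math

# Rule of thumb from the post above: solve time multiplies by ~4.5
# whenever the matrix dimension doubles.
FACTOR_PER_DOUBLING = 4.5


def scaled_solve_time(known_time, known_dim, new_dim, factor=FACTOR_PER_DOUBLING):
    """Scale a known solve time to a different matrix dimension."""
    doublings = math.log2(new_dim / known_dim)
    return known_time * factor ** doublings


# Equivalent power-law exponent: time ~ dim ** log2(4.5)
exponent = math.log2(FACTOR_PER_DOUBLING)
print(f"exponent = {exponent:.2f}")

# Hypothetical example: a 15M matrix that took 100 hours, scaled to 30M.
print(f"{scaled_solve_time(100.0, 15e6, 30e6):.0f} hours")
```

The same function can be fed a timing from one of your own older logs to get a rough expectation for a larger matrix on the same hardware.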
#4 |
Aug 2020
79*6581e-4;3*2539e-3
24A₁₆ Posts |
Ok, thanks. I panicked a bit... ;)
#5 |
"Curtis"
Feb 2005
Riverside, CA
17×317 Posts |
Your machine is plenty. A C192 should produce a matrix of around 30M dimensions, which ought to solve on a machine with 24 GB of RAM. My 16-core Ryzen would take 10 days or so; your machine might complete it in a week, give or take a day or two. I'm not sure how much more RAM channels help speed, so you might be as fast as 5 days if you use the whole machine?
#6 |
Aug 2020
79*6581e-4;3*2539e-3
2·293 Posts |
So, the matrix was built successfully using TD=130:
Code:
Mon Jun 13 09:13:59 2022 matrix includes 64 packed rows
Mon Jun 13 09:14:03 2022 matrix is 31962806 x 31963031 (15913.7 MB) with weight 4274411069 (133.73/col)
Mon Jun 13 09:14:03 2022 sparse part has weight 3852042710 (120.52/col)
Mon Jun 13 09:14:03 2022 using block size 8192 and superblock size 6291456 for processor cache size 65536 kB
Mon Jun 13 09:16:54 2022 commencing Lanczos iteration (20 threads)
Mon Jun 13 09:16:54 2022 memory use: 15106.0 MB
linear algebra completed 56625 of 31963031 dimensions (0.2%, ETA 805h36m)
It's using about 20 GB RAM btw.
Last fiddled with by bur on 2022-06-13 at 08:49 |
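For anyone who wants the ETA in days rather than hours, here is a small sketch that parses the progress line from the log above. The regular expression is an assumption based on this one sample line only; other msieve builds may format the line differently.

```python
import re

# Progress line copied from the log above.
LINE = "linear algebra completed 56625 of 31963031 dimensions (0.2%, ETA 805h36m)"

# Assumed format, derived from the single sample line.
PATTERN = re.compile(
    r"completed (\d+) of (\d+) dimensions \([\d.]+%, ETA (\d+)h\s*(\d+)m\)"
)


def parse_progress(line):
    """Return (fraction_done, eta_hours) from a progress line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a progress line")
    done, total, hours, minutes = map(int, match.groups())
    return done / total, hours + minutes / 60


fraction, eta_hours = parse_progress(LINE)
print(f"{fraction:.3%} done, ETA {eta_hours / 24:.1f} days")
```

An ETA taken this early in the run is only a rough indication, since very few dimensions have been completed at that point.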
#7 |
"Oliver"
Sep 2017
Porta Westfalica, DE
2⁶×17 Posts |
What does the line "found 21680656 cycles, need 21619192" say for you?
I would suggest trying different ways to execute msieve like in the 3,748+ team sieve, maybe trying MPI. What I found:
#8 | |
Aug 2020
79*6581e-4;3*2539e-3
586₁₀ Posts |
found 32078384 cycles, need 31964190.
Quote:
I don't have the hardware to use MPI, just this one server. I'm mainly wondering if the dimensions are ok, it's more than the 30M vbcurtis mentioned. Decreasing that would have the biggest impact, I guess.
edit: Browsing through old logs, it seems 800 h for a 30M matrix is very long. Does LA time increase linearly with dimensions?
I also noticed that the threads are not always running at 100%. Average according to htop is 18.5 cores, i.e. 92.5%. They even regularly switch from running to sleeping. It's not a general problem, yafu runs ECM with 100%+ utilization. Is that normal behavior for msieve?
Last fiddled with by bur on 2022-06-13 at 09:53 |
#9 |
"Oliver"
Sep 2017
Porta Westfalica, DE
2⁶×17 Posts |
You can use MPI on a single machine to optimize the usage of the single CPU's "apartments".
I cannot really comment on the matrix size since I am not experienced enough with this. Have you had a look at the corresponding NFS@home post-processing thread to get a glimpse of what others got with similar-sized numbers and similar TD?
As Robert pointed out, matrix solving time roughly multiplies by 4.5 for each doubling of the matrix dimensions, and since I do not think you will be able to decrease the dimensions much further, I would leave it running. The time you could save may well be smaller than the time it would take to optimize the matrix.
Please double-check your msieve filtering invocation; more than once I made the error of having the TD ignored on the command line because of a wrong parameter order. But msieve will log the TD if it detects it correctly.
Regarding the Linux scheduler: yes, it is often better than the Windows one, but by no means perfect! Especially when working with CPUs that have divided L3 caches or chiplets, or if you have multiple CPUs, manually setting the affinity will usually help immensely. For "basic" CPUs, I usually see only a few percent improvement. But this changes drastically again if you run multiple things in parallel on a single machine. |
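As a rough illustration of what "manually setting the affinity" can look like on Linux, here is a minimal Python sketch. The helper names are hypothetical, and the layout used in the example (24 cores numbered 0 to 23, with every 3 consecutive IDs sharing one L3 slice) is purely an assumption; check the real topology of your machine, for example with lscpu, before pinning anything.

```python
import os


def core_groups(core_ids, group_size):
    """Split a list of core IDs into consecutive groups of group_size."""
    return [core_ids[i:i + group_size] for i in range(0, len(core_ids), group_size)]


def pin_current_process(cores):
    """Pin this process to the given cores (Linux only); return True on success."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # not available on e.g. macOS or Windows
    # Only keep cores this process is actually allowed to run on.
    target = set(cores) & os.sched_getaffinity(0)
    if not target:
        return False
    os.sched_setaffinity(0, target)
    return True


# ASSUMPTION: cores 0..23, three consecutive IDs per shared L3 slice.
groups = core_groups(list(range(24)), 3)
print(groups[0], groups[-1])
```

Calling `pin_current_process(groups[0])` in a launcher script before starting a worker would restrict that worker (and its children) to the first group; whether that helps depends on the real cache layout and on what else runs on the machine.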
#10 | |
"Oliver"
Sep 2017
Porta Westfalica, DE
2⁶·17 Posts |
Quote:
#11 | |
Aug 2020
79*6581e-4;3*2539e-3
2×293 Posts |
I didn't know MPI helps even on a single CPU; I'd have expected the overhead alone to rule it out. VBITS is a compiler setting? No idea about octa channel. Isn't that a hardware thing that I can't change anyway?
I will leave it running for now. I don't really have time to get into MPI at the moment, and I suspect it just is that slow, now that I remember the steep, faster-than-linear increase in LA time you mentioned. Swellman even estimated much more than 30 days.
Could someone confirm whether it's normal that msieve tasks sleep a lot? That's the only thing I find really weird.
Quote: