mersenneforum.org  

2021-10-26, 09:35   #67
charybdis

Apr 2020

3B6₁₆ Posts

What was your general impression of 34-bit vs 33-bit? Will the extra bit allow slightly larger jobs to be run as I'd hoped?

2021-10-26, 15:47   #68
VBCurtis

"Curtis"
Feb 2005
Riverside, CA

165E₁₆ Posts

Quote:
Originally Posted by frmky
2,2174M is in LA, so here's one more data point. Running on eight NVLink-connected V100's,
It'll take a bit longer due to queue logistics, but hopefully it'll be done within the week.
How many relations did you collect? Was the unique ratio better than 2,2174L's? The matrices came out pretty similar in size, so a comparison of relation counts (raw and unique) gives a nice 33 vs 34 data point.

2021-10-26, 18:06   #69
frmky

Jul 2003
So Cal

2,621 Posts

For 2,2174L we sieved from 20M - 6B, and collected 1.36B relations. This gave 734M uniques, so about 46% duplicates.

For 2,2174M we sieved from 20M - 4B, and collected 2.19B relations. This gave 1.29B uniques, so about 41% duplicates. However, we sieved a considerably narrower range of q, and it was overall much faster.
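
As a quick check of the percentages above, the duplicate rate is just one minus uniques over raw relations. A minimal sketch (the function name is only illustrative, and the inputs are the rounded figures quoted in this post, so the results are approximate):

```python
def duplicate_fraction(raw_relations: float, unique_relations: float) -> float:
    """Fraction of collected relations that were discarded as duplicates."""
    if raw_relations <= 0:
        raise ValueError("raw_relations must be positive")
    return 1.0 - unique_relations / raw_relations


# Rounded figures quoted in this post: (raw relations, unique relations).
jobs = {
    "2,2174L": (1.36e9, 734e6),
    "2,2174M": (2.19e9, 1.29e9),
}

for name, (raw, unique) in jobs.items():
    print(f"{name}: {duplicate_fraction(raw, unique):.0%} duplicates")
```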

2021-10-27, 03:14   #70
LaurV
Romulan Interpreter

"name field"
Jun 2011
Thailand

10100000101101₂ Posts

[offtopic] I changed the thread title. The old one made me nostalgic every time someone posted in it... The new title is easier to search too, as the thread contains a lot of useful info... [/offtopic]

Last fiddled with by LaurV on 2021-10-27 at 03:17

2021-10-31, 18:58   #71
frmky

Jul 2003
So Cal

2,621 Posts

Quote:
Originally Posted by frmky
2,2174M is in LA, so here's one more data point.
It's done.

2022-02-20, 15:14   #72
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

1010100101011₂ Posts

I'm contemplating playing with Colab to see if it could be used with smaller matrices. But I wonder whether it's really worth it.

If I do everything but LA locally and only upload the necessary files for the matrix work, I'm still looking at a pretty large relations file for anything of value. But, I'm currently looking at more than a day of local CPU LA for ~c170 candidates. If I could knock that down to a few hours, maybe it would be "fun" to try.

The assigned GPUs vary widely as well. My last two experiments (sessions with GPU ECM) yielded a P100 and a K80. I do normally get some longer session times, but it's not guaranteed. Also, I may have only been getting half the card. (I'm still confused about shader/core/SM/etc.)

If my source is correct, the K80 is only CUDA compute capability 3.7. Is this current enough to work?

Would downloading the checkpoint file at regular intervals be enough to be able to restart a timed-out session later?

What else would I need to consider?

Sorry for the questions. Thanks for any help.

An extra question: Since the K80 is only compute capability 3.7, would it even be worth obtaining one? It seems the current minimum is at 3.5 and I'd hate to have another obsolete card right after getting one.

2022-02-21, 02:38   #73
frmky

Jul 2003
So Cal

2,621 Posts

Yes, it will work on a K80. My updated version requires CC 3.5 or greater.

You don't need to transfer the large relations file. Do this:
1. Complete the filtering and build the matrix locally. You can stop it manually once you see "commencing Lanczos iteration".
2. Transfer the ini, fb, and mat files (and mat.idx if using multiple GPUs with MPI, not covered here) to the GPU node.
3. On the GPU node, start the LA with options like ./msieve -nc2 skip_matbuild=1 -g 0 -v
4. You can interrupt it and restart it with "-ncr -g 0".
5. Once it's complete, transfer the dep file to the local node and run sqrt with -nc3 as usual.
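
The steps above could be scripted roughly as follows. This is only a sketch: the command-line options are the ones quoted in the steps, but the helper names and the default file names (`msieve.dat`, `msieve.fb`, `worktodo.ini`, and the `.chk` suffix) are assumptions, so substitute whatever your job actually uses.

```python
from typing import List


def la_command(gpu: int = 0, resume: bool = False, binary: str = "./msieve") -> List[str]:
    """Build the argument list for GPU linear algebra, as quoted in the steps."""
    if resume:
        # Step 4: restart an interrupted run from its checkpoint.
        return [binary, "-ncr", "-g", str(gpu)]
    # Step 3: start LA on a matrix that was already built locally.
    return [binary, "-nc2", "skip_matbuild=1", "-g", str(gpu), "-v"]


def files_for_gpu_node(savefile: str = "msieve.dat",
                       fb: str = "msieve.fb",
                       ini: str = "worktodo.ini",
                       resume: bool = False) -> List[str]:
    """Files to copy to the GPU node (step 2); the names are assumed defaults."""
    files = [ini, fb, savefile + ".mat"]
    if resume:
        files.append(savefile + ".chk")  # checkpoint needed to restart (assumed name)
    return files


print(" ".join(la_command()))
print(files_for_gpu_node(resume=True))
```

The argument list can be handed directly to `subprocess.run`, which avoids a layer of shell quoting.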

The local and GPU msieve binaries can be compiled with different values for VBITS since the LA is run entirely using the GPU binary. And yes, you just need the chk file in addition to the other files above to restart.
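
For saving the chk file at regular intervals, something along these lines might do. It is a sketch under assumptions: the paths are hypothetical, and it does not guard against copying a checkpoint while msieve is in the middle of writing it.

```python
import os
import shutil


def backup_checkpoint(chk_path: str, backup_dir: str) -> str:
    """Copy the checkpoint into backup_dir and return the path of the copy.

    The copy is written under a temporary name and then renamed, so an
    interrupted copy never replaces a good earlier backup.
    """
    os.makedirs(backup_dir, exist_ok=True)
    final_path = os.path.join(backup_dir, os.path.basename(chk_path))
    tmp_path = final_path + ".part"
    shutil.copy2(chk_path, tmp_path)
    os.replace(tmp_path, final_path)
    return final_path


# Example with hypothetical paths, called every 15 minutes while LA runs:
#     backup_checkpoint("msieve.dat.chk", "/content/drive/MyDrive/msieve")
#     time.sleep(15 * 60)
```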

A K80 is a dual GPU card, so without using MPI you will only be using half the card. And each half is only a little bit faster than a K20. It will be slower than a P100 as you would expect.

2022-02-21, 03:29   #74
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

152B₁₆ Posts

Thanks frmky! This helps a bunch. I will pursue the Colab session. I also have a 3.5 card to play with, but it only has 2GB. Not sure if that's enough to fit even a small matrix.

I'm off to study. . .

Last fiddled with by EdH on 2022-02-21 at 13:54 Reason: It's only 3.0! (frown)

2022-02-22, 01:05   #75
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

5,419 Posts

I'm saddened to report that even had I been successful with my Colab experiments, it would still be impractical.

I was able to compile Msieve for two different GPUs, a K80 (3.7) and a T4 (7.5). However, Msieve refused to understand the options although I tried all the variations I could think of in both Python and Bash scripts, with and without single/double quotes around various portions, and in a variety of orders. In all cases, Msieve simply displayed all the available options.

In any case, the impracticality is that for a c160, the msieve.dat.mat file is just short of 2GB. The two tested methods of getting the file loaded into the Colab sessions were via SSH and via Google Drive. SSH took just under two hours. Uploading the file to Google Drive took just under two hours. The first method held the session open without using the GPU for anything, for which Colab complained, while the second allowed the session to start rather quickly (after the two-hour upload to Google Drive). But, since a c160 created a 2GB file, I'm expecting larger matrices will just take a much longer time to load into a Colab session.
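
To put rough numbers on that, here is a back-of-the-envelope sketch. It rounds "just short of 2GB" and "just under two hours" to 2 GB and 2 hours, and the 5 GB matrix is a purely hypothetical size chosen for illustration.

```python
def effective_mbit_per_s(size_bytes: float, seconds: float) -> float:
    """Average transfer rate in megabits per second."""
    return size_bytes * 8 / seconds / 1e6


def transfer_hours(size_bytes: float, mbit_per_s: float) -> float:
    """Hours needed to move size_bytes at the given average rate."""
    return size_bytes * 8 / (mbit_per_s * 1e6) / 3600


rate = effective_mbit_per_s(2e9, 2 * 3600)  # ~2 GB in ~2 hours
print(f"effective rate: about {rate:.1f} Mbit/s")
print(f"hypothetical 5 GB matrix at that rate: about {transfer_hours(5e9, rate):.1f} h")
```

At a fixed rate the upload time scales linearly with matrix size, which is why larger jobs look worse on this connection.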

I may try again later to get Msieve to process the test case, since at this point I have the needed files in Google Drive, but the practicality is in doubt.

Thank you for the assistance. I will surely put this to use when I finally acquire a usable CUDA GPU. (I'm even eying some K20s ATM.)

Last fiddled with by EdH on 2022-02-22 at 01:07 Reason: spillin'

2022-02-22, 23:40   #76
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

5,419 Posts

Quote:
Originally Posted by EdH
. . .
I may try again later to get Msieve to process the test case, since at this point I have the needed files in Google Drive, but the practicality is in doubt.

Thank you for the assistance. I will surely put this to use when I finally acquire a usable CUDA GPU. (I'm even eying some K20s ATM.)
I'm going to claim success!

I got a Colab session to run Msieve LA on a Tesla T4! I didn't let it complete, but the log claims:
Code:
Tue Feb 22 22:48:53 2022  linear algebra at 0.0%, ETA 3h44m
The best time I could get on a 40-thread Xeon was about twice that long.
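
For anyone wanting to track progress from the log, a small parser for lines like the one above might look like this. It is a sketch that assumes the log always uses the `NhMMm` form shown here (with the hours part treated as optional); other ETA formats are not handled.

```python
import re
from typing import Optional, Tuple

# Pattern based only on the single log line quoted above.
_LA_LINE = re.compile(r"linear algebra at (\d+(?:\.\d+)?)%, ETA (?:(\d+)h)?\s*(\d+)m")


def parse_la_progress(line: str) -> Optional[Tuple[float, int]]:
    """Return (percent_done, eta_minutes), or None if the line doesn't match."""
    m = _LA_LINE.search(line)
    if m is None:
        return None
    percent = float(m.group(1))
    hours = int(m.group(2) or 0)
    minutes = int(m.group(3))
    return percent, hours * 60 + minutes


print(parse_la_progress("Tue Feb 22 22:48:53 2022  linear algebra at 0.0%, ETA 3h44m"))
```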

I was able to compress the .mat file to almost half the size, but it still takes an hour to upload it to Google Drive and a little bit of time to decompress it. (Others may be able to upload a lot faster.)
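
The compression step could be done along these lines. Note that gzip is an assumption here (the compressor actually used isn't stated above), and the ratio achieved will depend on the tool and the matrix.

```python
import gzip
import shutil


def gzip_file(src_path: str, dest_path: str, level: int = 6) -> None:
    """Compress src_path into dest_path, streaming so memory use stays small."""
    with open(src_path, "rb") as src, gzip.open(dest_path, "wb", compresslevel=level) as dst:
        shutil.copyfileobj(src, dst)


def gunzip_file(src_path: str, dest_path: str) -> None:
    """Decompress a gzip file (for example, inside the Colab session)."""
    with gzip.open(src_path, "rb") as src, open(dest_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
```

Usage would be something like `gzip_file("msieve.dat.mat", "msieve.dat.mat.gz")` locally and `gunzip_file` on the other end.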

The actual details are much more complicated than my other sessions, so I need to work quite a bit on them before I can publish them. As to the earlier comments of practicality, I will have to study this further for my use. On one hand, it takes a lot of manual intervention and timely success is not guaranteed. On the other hand, all of this work being done by Colab is letting the local machines perform other work. Perhaps the value can be realized for larger jobs.

I don't seem to be getting the screen output I expected from the -v option.

Is there a way to redirect the checkpoint file? I couldn't find an option that I thought existed.

Thanks again for all the help.

2022-02-24, 01:24   #77
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

5,419 Posts

Sorry if you're tired of these reports, but here's another:

I have a full-fledged Colab session that works through completion of LA. Today I let a c157 finish that I had recently run on my 20c/40t Xeon. The times were nearly identical:
Code:
Xeon  04:17:41 elapsed time
Colab 04:19:08 elapsed time
I hope to do the same test with a different GPU, to compare.
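
For the record, the gap between the two runs works out to about a minute and a half, roughly half a percent. A small sketch of the arithmetic (the helper name is only illustrative):

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


xeon = hms_to_seconds("04:17:41")
colab = hms_to_seconds("04:19:08")
print(f"Colab was {colab - xeon} s slower ({(colab - xeon) / xeon:.2%})")
```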

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.