478__891_13m1 factored
1 Attachment(s)
[QUOTE=richs;584027]Taking 478__891_13m1[/QUOTE]
[CODE]p76 factor: 3431470368665622242184217103591598096099932327907456020495877029935727723147
p82 factor: 7615720048137620822959331660523770456031672477991612884633860118579072277536159943[/CODE]
Approximately 6 hours on 6 threads of a Core i7-10510U with 12 GB memory for a 2.67M matrix at TD=100. Log attached and at [URL="https://pastebin.com/ursm0Hmb"]https://pastebin.com/ursm0Hmb[/URL]. Factors added to factordb. |
[QUOTE=RichD;584512]Pretty impressive. Do you know how big of a matrix can be solved on that 16G card?[/QUOTE]
It's the 32 GB version, but we need to store both the matrix and its transpose on the card in CSR format; otherwise, random reads from global memory kill performance. So we can go up to roughly 25M matrices, probably a little larger, on a single card. Larger matrices can be divided across multiple GPUs with MPI. Currently, though, we have to transfer vectors off and back onto the GPU for the MPI communications several times in each iteration, which introduces a large performance hit. I've added support for CUDA-aware MPI, but OpenMPI still transfers off the card for collective reductions. MVAPICH2-GDR, I believe, supports collective reductions on the card, but it's still being tested on the SDSC Expanse cluster. Hopefully that will be working in a few weeks. For now, quick tests show a 43M matrix on two cards in ~70 hours and an 84M matrix on four cards in ~350 hours. |
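To illustrate the point about CSR and the transpose: block Lanczos needs products with both A and A^T each iteration, and a CSR layout only gives sequential memory reads for the matrix it was built from. A minimal CPU sketch in Python (illustrative only, not the actual GPU code; the matrix format and function names here are invented for the example, and over GF(2) multiply/add reduce to AND/XOR):

```python
def to_csr(rows):
    """Build CSR (row_ptr, col_idx) for a GF(2) matrix given as a list
    of per-row lists of nonzero column indices."""
    row_ptr, col_idx = [0], []
    for r in rows:
        col_idx.extend(sorted(r))
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx

def transpose(rows, n_cols):
    """Per-row nonzero lists of A^T, built once up front."""
    cols = [[] for _ in range(n_cols)]
    for i, r in enumerate(rows):
        for j in r:
            cols[j].append(i)
    return cols

def spmv_gf2(row_ptr, col_idx, x):
    """y = A*x over GF(2): each output bit is the parity (XOR) of the
    selected bits of x. col_idx is read sequentially, row by row --
    this is the access pattern that keeps a GPU kernel coalesced."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc ^= x[col_idx[k]]
        y.append(acc)
    return y

# Example: a 3x4 GF(2) matrix, rows given by their nonzero columns.
A_rows = [[0, 2], [1, 2, 3], [0, 3]]
rp, ci = to_csr(A_rows)
x = [1, 0, 1, 0]
print(spmv_gf2(rp, ci, x))      # A*x  -> [0, 1, 1]

# A^T*x reuses the same row-ordered kernel on a second CSR copy,
# instead of doing scattered writes against the original CSR of A.
rpt, cit = to_csr(transpose(A_rows, 4))
xt = [1, 1, 0]
print(spmv_gf2(rpt, cit, xt))   # A^T*x -> [1, 1, 0, 1]
```

The memory cost of the second copy is what caps the matrix size per card, which is the trade-off described above.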
275__471_7m1 is factored and posted. The 4.9M matrix took 37 minutes to solve on a V100.
|
[QUOTE=frmky;584943]The 4.9M matrix took 37 minutes to solve on a V100.[/QUOTE]
It takes me longer than that to download the dataset of that size. :smile: |
228__877_13m1 factored
1 Attachment(s)
[QUOTE=richs;584462]Taking 228__877_13m1[/QUOTE]
[CODE]p71 factor: 30764958729565508484880341083575244195277785972334468896433685365982679
p84 factor: 678970423122703057140573081857968878682989180158386921603029858822918272257102908581[/CODE]
Approximately 12 hours on 6 threads of a Core i7-10510U with 12 GB memory for a 3.92M matrix at TD=100. Log attached and at [URL="https://pastebin.com/4azJHm9V"]https://pastebin.com/4azJHm9V[/URL]. Factors added to factordb. |
Taking 261__601_13m1
|
474__749_11m1 factored
1 Attachment(s)
[QUOTE=richs;584501]Taking 474__749_11m1[/QUOTE]
[CODE]p59 factor: 73119398814213571649205971000941430624353762947445533933047
p99 factor: 268848587983605823488306320117291030386483771174564863043022982200674143258795499226789114717733577[/CODE]
Approximately 6 hours on 6 threads of a Core i7-10510U with 12 GB memory for a 2.87M matrix at TD=100. Log attached and at [URL="https://pastebin.com/DFSQmvg2"]https://pastebin.com/DFSQmvg2[/URL]. Factors added to factordb. |
261__601_13m1 factored
1 Attachment(s)
[QUOTE=richs;585183]Taking 261__601_13m1[/QUOTE]
[CODE]p62 factor: 56146592317121920255303081963338793425677534352049844673096611
p128 factor: 64457733833344456440386887077326733647815241253008397942166314360406672207208434693484135241565811229366738777648026384819776361[/CODE]
Approximately 14 hours on 6 threads of a Core i7-10510U with 12 GB memory for a 4.18M matrix at TD=100. Log attached and at [URL="https://pastebin.com/P22jhaHu"]https://pastebin.com/P22jhaHu[/URL]. Factors added to factordb. |
Taking 274__353_7m1
|
Taking 7305457_31m1
|
274__353_7m1 is in LA. Results Tuesday morning.
|