mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > Factoring

Reply
 
Thread Tools
Old 2019-10-14, 14:43   #210
bsquared
 
bsquared's Avatar
 
"Ben"
Feb 2007

E5B16 Posts
Default

Quote:
Originally Posted by Till View Post
Thanks for the explanation; in my ignorance I half-ways expected that to be a bug.
Honestly, not a bad assumption for any code I write :) Thank you for looking everything over so closely.
bsquared is offline   Reply With Quote
Old 2022-09-19, 13:57   #211
bsquared
 
bsquared's Avatar
 
"Ben"
Feb 2007

3·52·72 Posts
Default

Quote:
Originally Posted by bsquared View Post
I've been playing around with ECM lately and I thought I'd try to implement a toy version that works for inputs that fit in a single machine word (64-bit). Here are timings for lists of 100k random semiprimes compared to other methods explored in this thread:

Code:
Bits   ECM     Lehman  Brent   Factor64    SIQS
42     1.03    0.65    1.31    1.32       
44     1.19    1.18    1.73    1.75       
46     1.41    1.63    2.27    2.35       
48     1.72    2.67    3.15    3.25       
50     2.17    4.31    4.35    4.54       
52     2.82    6.89    6.1     6.29       
54     3.33    10.7    8.5     8.82       
56     3.97    16.8    11.9    12.5       
58     5.02    26.5    16.9    17.6        16.44
60     6.25    42.2    23.6    24.8        18.2
62     7.57            33.1    35.2        
64     9.83            48.6    53.1        21.6
It scales quite well, crossing over with Lehman at around 44 bits and remaining well ahead of all others up to 64 bits. Combined with a bit of trial division I imagine it would work well for arbitrary inputs too.
Over the past few weeks, Jeff Hurchalla has driven a series of updates to the microecm code. Foremost among them a faster way to do Montgomery arithmetic on single-limb inputs. He has also been very helpful in cleaning up the code and creating a much improved interface. The latest and greatest is now available in the yafu github repository.

Below is an updated table of timings on the benchmark semiprime input files (100,000 semiprime inputs each composed of 2 factors of approximately equal size). Unfortunately I don't have the same hardware anymore, so the table incorporates speedups from the hardware as well. (spbrent didn't change, so the hardware portion of the speed increase could be extracted from that.) The code was compiled with
icc -Ofast -march=skylake-avx512 -o uecm microecm.c -lm

Code:
Bits   ECM     Lehman  Brent  
42     0.41    0.38    1.13
44     0.51    0.59    1.50 
46     0.64    0.94    1.99   
48     0.81            2.75
50     0.99            3.76   
52     1.25            5.23
54     1.55            7.29
56     1.95            10.3
58     2.46            14.4
60     3.06            20.3
62     3.90            28.7
64     4.76
Thank you Jeff!
bsquared is offline   Reply With Quote
Old 2022-09-21, 18:21   #212
bsquared
 
bsquared's Avatar
 
"Ben"
Feb 2007

3·52·72 Posts
Default

Update:
I borrowed some single-limb vector multiply functions from avx-ecm to make a vectorized version of microecm that operates up to 52 bits (the precision limit of the multipliers). Here it is compared to the others when processing the 100k input test sets. vec_uecm is structured to process large-ish lists of inputs... that turned out to be necessary in order to keep vector occupancy high.

Code:
timings in seconds for 100k inputs.
Bits   vec_uecm   uecm     Lehman  Brent  
42     0.24       0.41      0.38   1.13
44     0.29       0.51      0.59   1.50 
46     0.35       0.64      0.94   1.99   
48     0.44       0.81             2.75
50     0.53       0.99             3.76   
52     0.65       1.25             5.23
54                1.55             7.29
56                1.95             10.3
58                2.46             14.4
60                3.06             20.3
62                3.90             28.7
64                4.76
Code is available in the yafu github.

Last fiddled with by bsquared on 2022-09-21 at 18:37
bsquared is offline   Reply With Quote
Old 2022-09-21, 18:59   #213
Till
 
Till's Avatar
 
"Tilman Neumann"
Jan 2016
Germany

5·101 Posts
Default

Amazing!
Till is offline   Reply With Quote
Old 2022-09-21, 21:29   #214
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

3×31×59 Posts
Default

Might this be useful as some sort of patch for GGNFS? Then again, is anyone conversant-enough in the code to try changing the current cofactor-splitting routine for this one?
VBCurtis is online now   Reply With Quote
Old 2022-09-22, 12:46   #215
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Liverpool (GMT/BST)

135578 Posts
Default

Quote:
Originally Posted by VBCurtis View Post
Might this be useful as some sort of patch for GGNFS? Then again, is anyone conversant-enough in the code to try changing the current cofactor-splitting routine for this one?
lasieve5 contains ecm/p-1 code that no one uses. It is only activated with a parameter file present. The format for this parameter file is pretty hard to understand and there is no documentation. Code comments are the best we have as clues.

The cofactorisation strategy is handled in:
https://github.com/gchilders/lasieve...ows/strategy.w

Probably not too hard to substitute in an alternate ecm method. It is also worth bearing in mind that the majority of time is spent on larger numbers than this thread covers.

I suspect that switching to the CADO siever is probably a better use of time though.

Last fiddled with by henryzz on 2022-09-22 at 13:27
henryzz is offline   Reply With Quote
Old 2022-09-22, 15:43   #216
Till
 
Till's Avatar
 
"Tilman Neumann"
Jan 2016
Germany

7718 Posts
Default

Quote:
Originally Posted by henryzz View Post
It is also worth bearing in mind that the majority of time is spent on larger numbers than this thread covers.

Maybe Ben's 128 bit ecm (should be somewhere in this thread to find, I'm too lazy) would be more adequate? It looked pretty fast and might profit from recent improvements, too.
Till is offline   Reply With Quote
Old 2022-09-28, 02:25   #217
bsquared
 
bsquared's Avatar
 
"Ben"
Feb 2007

3×52×72 Posts
Default

Quote:
Originally Posted by henryzz View Post
Probably not too hard to substitute in an alternate ecm method. It is also worth bearing in mind that the majority of time is spent on larger numbers than this thread covers.

I suspect that switching to the CADO siever is probably a better use of time though.
Agreed on the CADO note. However I think on most jobs, microecm is completely sufficient. Anything with mfba/r of 64 or less can use microecm as a complete replacement for mpqs. For really big jobs or jobs with three large primes on one side, then tinyecm would need to be used as well as Till said.

So I went ahead and experimented with this, and indeed it does help! microecm is almost 10x faster than the assembly mpqs in ggnfs. But, that portion of the job is pretty small compared to the whole. So overall the speed increase is not that big. Still, it is something.

edit:
more details here

Last fiddled with by bsquared on 2022-09-28 at 03:32
bsquared is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Factoring Mersenne numbers paulunderwood Miscellaneous Math 18 2017-08-27 14:56
Factoring Big Numbers In c# ShridharRasal Factoring 10 2008-03-20 17:17
Question about factoring code Peter Nelson Software 9 2005-03-30 18:28
Factoring Fermat Numbers Axel Fox Software 14 2003-07-04 18:57
Questions about SSE2 code and Factoring Joe O Software 2 2002-09-13 23:39

All times are UTC. The time now is 07:01.


Tue Oct 4 07:01:00 UTC 2022 up 47 days, 4:29, 0 users, load averages: 0.81, 0.91, 0.88

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2022, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.

≠ ± ∓ ÷ × · − √ ‰ ⊗ ⊕ ⊖ ⊘ ⊙ ≤ ≥ ≦ ≧ ≨ ≩ ≺ ≻ ≼ ≽ ⊏ ⊐ ⊑ ⊒ ² ³ °
∠ ∟ ° ≅ ~ ‖ ⟂ ⫛
≡ ≜ ≈ ∝ ∞ ≪ ≫ ⌊⌋ ⌈⌉ ∘ ∏ ∐ ∑ ∧ ∨ ∩ ∪ ⨀ ⊕ ⊗ 𝖕 𝖖 𝖗 ⊲ ⊳
∅ ∖ ∁ ↦ ↣ ∩ ∪ ⊆ ⊂ ⊄ ⊊ ⊇ ⊃ ⊅ ⊋ ⊖ ∈ ∉ ∋ ∌ ℕ ℤ ℚ ℝ ℂ ℵ ℶ ℷ ℸ 𝓟
¬ ∨ ∧ ⊕ → ← ⇒ ⇐ ⇔ ∀ ∃ ∄ ∴ ∵ ⊤ ⊥ ⊢ ⊨ ⫤ ⊣ … ⋯ ⋮ ⋰ ⋱
∫ ∬ ∭ ∮ ∯ ∰ ∇ ∆ δ ∂ ℱ ℒ ℓ
𝛢𝛼 𝛣𝛽 𝛤𝛾 𝛥𝛿 𝛦𝜀𝜖 𝛧𝜁 𝛨𝜂 𝛩𝜃𝜗 𝛪𝜄 𝛫𝜅 𝛬𝜆 𝛭𝜇 𝛮𝜈 𝛯𝜉 𝛰𝜊 𝛱𝜋 𝛲𝜌 𝛴𝜎𝜍 𝛵𝜏 𝛶𝜐 𝛷𝜙𝜑 𝛸𝜒 𝛹𝜓 𝛺𝜔