mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > Factoring

Reply
 
Thread Tools
Old 2019-08-26, 19:10   #1
Branger
 
Oct 2018

2·3·5 Posts
Default Factorization factory

Hello everyone,

During the past couple of years I have been playing around with Coppersmith's factorization factory approach for factoring SNFS numbers sharing a common rational polynomial, which was inspired by the factorization of several Mersenne numbers by this approach. Since I have not really found any publically available code to do this, I ended up writing some. I decided to make the code I have been using available, in case it is useful to anyone. It is available at GitHub, at https://github.com/ebranger/NFS_factory

The implementation is fairly straight-forward, taking as input gnfs-lasieve relations valid for the shared polynomial, using a batch GCD to remove most factors, and any remainder is tested in a double large prime fashion, using code I grabbed from msieve. Any found relations is then factored and outputted in a msieve format, to allow msieve to do the postprocessing. I have so far used the code to do about 250 SNFS-220 factorizations.

There are still some optimizations that can be done, but the speed is not too bad. On my system using 6 cores, a normal SNFS-220 factorization takes about 12~14 days to finish. Using the factorization factory approach, batch factoring and postprocessing takes on the order of 2-4 days for numbers of that size. However, to this comes the amortized cost of finding the relations for the shared polynomial, which depends on the number of SNFS factorizations performed. In my case, the amortized cost is a bit under 2 days per number at the moment, for a comparable total time per factorization of about 4-6 days, which is still faster than the regular approach.

I would also like to take this opportunity to thank all the people on mersenneforum. I have been lurking this forum for quite some time, and I have learnt a lot from your discussions. Hopefully I can contribute something back to you all!

Erik
Branger is offline   Reply With Quote
Old 2019-08-26, 20:16   #2
R.D. Silverman
 
R.D. Silverman's Avatar
 
Nov 2003

22×5×373 Posts
Default

Quote:
Originally Posted by Branger View Post
Hello everyone,

During the past couple of years I have been playing around with Coppersmith's factorization factory approach for factoring SNFS numbers sharing a common rational polynomial, which was inspired by the factorization of several Mersenne numbers by this approach. Since I have not really found any publically available code to do this, I ended up writing some. I decided to make the code I have been using available, in case it is useful to anyone. It is available at GitHub, at https://github.com/ebranger/NFS_factory

The implementation is fairly straight-forward, taking as input gnfs-lasieve relations valid for the shared polynomial, using a batch GCD to remove most factors, and any remainder is tested in a double large prime fashion, using code I grabbed from msieve. Any found relations is then factored and outputted in a msieve format, to allow msieve to do the postprocessing. I have so far used the code to do about 250 SNFS-220 factorizations.

There are still some optimizations that can be done, but the speed is not too bad. On my system using 6 cores, a normal SNFS-220 factorization takes about 12~14 days to finish. Using the factorization factory approach, batch factoring and postprocessing takes on the order of 2-4 days for numbers of that size. However, to this comes the amortized cost of finding the relations for the shared polynomial, which depends on the number of SNFS factorizations performed. In my case, the amortized cost is a bit under 2 days per number at the moment, for a comparable total time per factorization of about 4-6 days, which is still faster than the regular approach.

I would also like to take this opportunity to thank all the people on mersenneforum. I have been lurking this forum for quite some time, and I have learnt a lot from your discussions. Hopefully I can contribute something back to you all!

Erik
Kudos.
R.D. Silverman is offline   Reply With Quote
Old 2019-08-26, 21:04   #3
mart_r
 
mart_r's Avatar
 
Dec 2008
you know...around...

12558 Posts
Default

A certain Alberico Lepore might be interested in this...
mart_r is offline   Reply With Quote
Old 2019-08-30, 18:06   #4
Nooks
 
Jul 2018

19 Posts
Default

Thank you for posting this!

I'd noticed your work in progress by finding the github page when researching the factorization factory idea, and later realizing you were behind a number of factorizations with shared polynomials on stdkmd.com.

This project is probably going to be the final push I need to build and deploy msieve myself and (try to) finish off factoring 37w1 composites.
Nooks is offline   Reply With Quote
Old 2019-08-31, 21:32   #5
Branger
 
Oct 2018

2·3·5 Posts
Default

Quote:
Originally Posted by Nooks View Post
Thank you for posting this!

I'd noticed your work in progress by finding the github page when researching the factorization factory idea, and later realizing you were behind a number of factorizations with shared polynomials on stdkmd.com.

This project is probably going to be the final push I need to build and deploy msieve myself and (try to) finish off factoring 37w1 composites.
You have been using CADO for the near-repdigit factorizations you have done, right? I have mainly been using a combination of the gnfs-lasieve sievers and msieve, in part because those were the programs I started with, and in part because they are fast, works well, and that a number of people on this forum have provided compiled binaries for windows, which made everything easier. I am a bit curious to see if CADO plays nice under the Ubuntu tools available thorugh windows store, othewise CADO on windows does seem to be a challenge to use.

The near-repdigit project is really nice for the SNFS factorization factory approach, since there are typically hundreds of numbers sharing a rational polynomial. Right now I am working on clearing all numbers with a SNFS-difficulty of ~219-222, which should keep me occupied for the forseeable futureā€¦ I am also looking into if I can sieve more to use the same relation set to factor SNFS-275 numbers. I still need to figure out good parameters for that, which is a challene since the factorization factory apporach has not been widely used. And I probably have to buy a new hard drive to store the ~100G or so additional relations needed
Branger is offline   Reply With Quote
Old 2019-09-02, 18:39   #6
Nooks
 
Jul 2018

19 Posts
Default

Correct, my factorizations have been done with CADO-NFS via SNFS or sometimes GNFS. I have not really been focused on optimizing NFS parameters, however; rather I've been working on trying to streamline and manage a pipeline of factoring jobs.

Mersenne Forum has been very helpful, pointing me in the right direction to the point that I can find my own algebraic polynomials and understand the factorization factory paper at a high level, if not the details.

I wouldn't be surprised if CADO could be built and run reasonably on the Ubuntu Windows tools, but neither would I be shocked to learn that they don't; one of my earliest CADO-NFS jobs nearly failed because of some oddness in macOS, I think.

That said I have saved most of my relations, so I'm curious to see if any of them can be reused or if I can adopt the factory approach going forward: clearing out 37w1 is going to be difficult verging on perhaps impossible.
Nooks is offline   Reply With Quote
Old 2019-09-03, 08:58   #7
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Cambridge (GMT/BST)

172F16 Posts
Default

Quote:
Originally Posted by Nooks View Post
Correct, my factorizations have been done with CADO-NFS via SNFS or sometimes GNFS. I have not really been focused on optimizing NFS parameters, however; rather I've been working on trying to streamline and manage a pipeline of factoring jobs.

Mersenne Forum has been very helpful, pointing me in the right direction to the point that I can find my own algebraic polynomials and understand the factorization factory paper at a high level, if not the details.

I wouldn't be surprised if CADO could be built and run reasonably on the Ubuntu Windows tools, but neither would I be shocked to learn that they don't; one of my earliest CADO-NFS jobs nearly failed because of some oddness in macOS, I think.

That said I have saved most of my relations, so I'm curious to see if any of them can be reused or if I can adopt the factory approach going forward: clearing out 37w1 is going to be difficult verging on perhaps impossible.
I believe I had CADO running under WSL on 2803. I don't know whether WSL 2 will break it. I would hope not. Failing that a virtual machine will work.

I have been working through using the factorization factory attempting to use it for a constant algebraic polynomial(using deg 8 rather than 4 to keep the rational small). I have managed to produce relations and passed them to msieve, however, I didn't have enough. I also got a -11 error on about 10% of the relations.
I am thinking of switching sieving to the CADO siever as I am hoping it will be more resilient with very small numbers.
henryzz is offline   Reply With Quote
Old 2019-09-03, 09:57   #8
Branger
 
Oct 2018

2×3×5 Posts
Default

Quote:
Originally Posted by Nooks View Post
That said I have saved most of my relations, so I'm curious to see if any of them can be reused or if I can adopt the factory approach going forward: clearing out 37w1 is going to be difficult verging on perhaps impossible.
I would be a bit surprised if the old relations are useful for a new SNFS factorization. With normal SNFS parameters very few of the old relations should be valid for another polynomial, not enough to contribute noticeably to the relations needed for a new factorization. As for the possibility of factoring several 37w1 numbers using the factory approach, that is an interesting question. I expect a degree-8 shared algebraic polynomial to be necessary, and that leaves relatively few numbers in the sequence sharing the polynomial. But it might just be enough many numbers that it could be worth it, depending on how big numbers you intend to factor.

Quote:
Originally Posted by henryzz View Post
I have been working through using the factorization factory attempting to use it for a constant algebraic polynomial(using deg 8 rather than 4 to keep the rational small). I have managed to produce relations and passed them to msieve, however, I didn't have enough. I also got a -11 error on about 10% of the relations.
I am thinking of switching sieving to the CADO siever as I am hoping it will be more resilient with very small numbers.
So far I have only used the code for shared rational polynomial factorizations, apparently some bugs remain for shared algebraic runs. If i read the msieve code correctly, error -11 indicates that there are factors missing in the rational relation, so it appears that some factors were not printed. If possible, could you send me the polynomial and a few of the lines that gives errors? I'll try to track down the bug.

Also, when changing the polynomial degree away from the optimal one, a bit more relations seems to be needed to finish, even if all other parameters are the same.

The code reads the relations in gnfs-lasieve/msieve format, so if CADO relations are provided some new input/output functions would have to be written. Reading the relations should not be that hard, since manly the (a,b) values of the relations are needed, but outputting relations in CADO format will probably require more work.
Branger is offline   Reply With Quote
Old 2019-09-03, 13:59   #9
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Cambridge (GMT/BST)

593510 Posts
Default

Quote:
Originally Posted by Branger View Post
So far I have only used the code for shared rational polynomial factorizations, apparently some bugs remain for shared algebraic runs. If i read the msieve code correctly, error -11 indicates that there are factors missing in the rational relation, so it appears that some factors were not printed. If possible, could you send me the polynomial and a few of the lines that gives errors? I'll try to track down the bug.

Also, when changing the polynomial degree away from the optimal one, a bit more relations seems to be needed to finish, even if all other parameters are the same.

The code reads the relations in gnfs-lasieve/msieve format, so if CADO relations are provided some new input/output functions would have to be written. Reading the relations should not be that hard, since manly the (a,b) values of the relations are needed, but outputting relations in CADO format will probably require more work.
I should update my copy of msieve before confirming that there is an error. I believe that the minimum size of factor that can be omitted has changed in the past. I could be falling afoul of that. If that doesn't fix it I will send you the relevant info.

CADO uses the same format for relations as the ggnfs siever. I would still use msieve for the postprocessing.
henryzz is offline   Reply With Quote
Old 2019-09-03, 18:29   #10
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Cambridge (GMT/BST)

5·1,187 Posts
Default

I have worked out the error. 7213133561 divided one of my rational relations. The hex for 7213133561-2^32=2918166265 was what was output to file.
It looks like the current solution is to reduce the large prime bound to 32(which makes sense anyway in my case).

Last fiddled with by henryzz on 2019-09-03 at 18:30
henryzz is offline   Reply With Quote
Old 2019-09-03, 19:38   #11
Branger
 
Oct 2018

2·3·5 Posts
Default

Quote:
Originally Posted by henryzz View Post
I have worked out the error. 7213133561 divided one of my rational relations. The hex for 7213133561-2^32=2918166265 was what was output to file.
It looks like the current solution is to reduce the large prime bound to 32(which makes sense anyway in my case).
That is indeed a bug on my part, large primes are stored as 64 bit uints but were printed as 32 bit. I have fixed it on GitHub if you want to download a new version, alternatively the bug is on lines 614 and 623 in Batch_smooth.cpp, change "%X" to "%llX".

And thank you for the clarification about the CADO relation format, I was under the impression that the differences were more significant, and that some conversion needed to take place to make gnfs-lasieve relations into CADO ones. But if they are the same then it should be no problem to take CADO relations as input. You are also correct in that factors less than 1000 are not printed, since I assume msieve will handle that.
Branger is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Factorization of RSA-180 Robert Holmes Factoring 19 2010-11-08 18:46
Factorization of 7,254+ dleclair NFSNET Discussion 1 2006-03-21 05:11
Factorization of 11,212+ Wacky NFSNET Discussion 1 2006-03-20 23:43
Factorization of 5,307- Jeff Gilchrist NFSNET Discussion 7 2005-02-23 19:46
Factorization of M(738) McBryce Factoring 2 2003-09-19 19:32

All times are UTC. The time now is 02:10.


Sat Nov 27 02:10:52 UTC 2021 up 126 days, 20:39, 0 users, load averages: 1.19, 1.15, 1.15

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.