20160107, 04:48  #1  
Jan 2016
101_{2} Posts 
Understanding NFS
Hello! I have been recently helping victims of a particular ransomware that has been discovered to have a weakness in it's implementation of storing the AES key. In assisting with this, I (and several others) have seemingly been thrust into the exciting world of factoring large numbers!
I've been slowly understanding atleast how to use Yafu to factor these numbers (ranging from C78 to C156 on average), but would like to learn more to make sure we are utilizing the program to its fullest potential. The goal is to factor as fast as possible in order to help all of these victims. My current interest is in learning how to interpret the NFS step, and most importantly, have a rough idea on an ETA possibly. It's fine if a number legitimately needs to take a day or two to factor, but it's the mystery of whether it will finish in a minute or keep running through the night that is frustrating. Can I have some help on interpreting these lines? Quote:
Quote:
Thanks for any help, and for this very helpful program. 

20160107, 13:50  #2 
Tribal Bullet
Oct 2004
3547_{10} Posts 
The source of the uncertainty here is that the numbers everyone is trying to factor from Teslacrypt are random, and hence have factors of random sizes. If a particular number has small factors they can be found with specialized factorization methods very quickly; but if it has no small factors you are left having to use the quadratic sieve and number field sieve. Those methods don't care whether factors are large or small, and the time they take is predictable and based only on the size of the number to be factored. Unfortunately they take a significant amount of time (i.e. overnight for a 120 digit input, 12 weeks for a 150 digit input).
So the time needed to factor a random number completely depends on the size of the factors. Of course for a random number you don't know the size of the factors until you find them, so the best we can do is hope that the number contains small factors, try methods which will only work if that's the the case, and if they succeed the uncertainty is reduced because the number to factor has been reduced in size. Once you try all of the specialized methods for a little while and they stop succeeding, and the number hasn't been completely factored yet, you can be fairly sure any remaining factors are large and can switch to the highpowered factorization methods. If the Teslacrypt authors had not made a mistake but instead made the numbers to be factored strong semiprimes, i.e. numbers with only two factors and both of them large, then you can skip all the specialized methods because only QS and NFS have a chance of succeeding, and every number to factor will take the worstcase time. 
20160107, 14:06  #3 
"Ben"
Feb 2007
2·1,831 Posts 
Jason's answer is excellent  I'll just try to add a little bit more.
As you've probably seen, NFS can be thought of as 3 separate processes: 1) polynomial search 2) sieving 3) post processing although 3) can be further broken down into a) filtering b) linear algebra (matrix step) c) square root The lines you posted come from the sieving step. After this has gone on awhile you should start to see yafu printing lines like this (you may have to run the command with the v option): Code:
nfs: found 105308 relations, need at least 1936352 (filtering ETA: 0h 9m), continuing with sieving ... Once you start filtering, you are nearly done, so the ETA for filtering is close to the ETA for the whole job. Postprocessing time also increases with size of the input number, but for C150 or less it will be a small fraction of the total time. Last fiddled with by bsquared on 20160107 at 14:08 Reason: v is required when running the job... 
20160107, 15:03  #4 
Jan 2016
5 Posts 
Thanks for the information. How did you ever guess it was TeslaCrypt. ;)
I paused that particular task, reran tune(), and resumed it overnight. It took only 3 hours this time  maybe something goofed up my first tune(). I've paused and resumed another task with the verbose flag, and it is telling me 7h ETA. Very helpful! Is 10M relations reasonable for a C121? 
20160107, 15:16  #5  
Tribal Bullet
Oct 2004
6733_{8} Posts 
Quote:


20160107, 15:22  #6 
Jan 2016
5 Posts 
I figured, lol. Sorry I missed that post  I jumped straight to yafu as it has been (for the most part) quicker at getting factors for most of my cases. I tried msieve with CUDA on my GTX 760, and it didn't seem like it was doing as well. I'll take a deeper read through that topic.
Never got hit by the virus myself, I'm just helping others. Thanks for the help. Hello from BleepingComputer! Last fiddled with by Demonslay335 on 20160107 at 15:23 
20160107, 15:40  #7 
"Ben"
Feb 2007
2·1,831 Posts 

20160107, 17:06  #8  
"Robert Gerbicz"
Oct 2005
Hungary
627_{16} Posts 
Quote:
Note that this is similar to a standard choice for RSA, but in those cases K and L is prime, but not here, and this gives that you can use this fact to early abort in Ecm (or to continue it to a larger level !). Example: check this Tesla number: http://factordb.com/index.php?query=...98158773983475 after Ecm finds the p27 say at t30 and no other large factor. Say p27 divides K, then with large probability there is only one missing prime factor of K since log(sqrt(2^256/p27))/log(10)=25.3, and we can also see the size of this unknown factor, since K is in (2^252,2^256), if P is the missing factor, then Code:
2^252<K<2^256, and P*p27<=K<=P*p27*5^2*251, from these: 10^45.61<P<10^50.62 Last fiddled with by R. Gerbicz on 20160107 at 17:15 Reason: edit url 

20160108, 15:26  #9  
Jan 2016
5_{8} Posts 
Quote:
Is there any other pattern I can use to my advantage? Kind of along the lines you are saying, I am noticing when I get down to a C130 or less after ECM, I end up with only two primes once factored, and the sum of their digits is close to the digits of the composite. Is there a way to make yafu skip a block of primes, or give it a suggested range maybe? Or does it kind of already figure that out on its own based on maths above my head. :P 

20160108, 15:34  #10  
Tribal Bullet
Oct 2004
3,547 Posts 
Quote:
The sieve methods don't care what the factors look like; even if you know they are very nearly the same size, it has no effect on the runtime of NFS. 

20160108, 15:48  #11  
Jan 2016
5 Posts 
Quote:
Man, 6 years out of calculus class and I feel like I've forgotten everything about number theory... thanks for all of the info, even if it's basic stuff, lol. easy for this stuff to go over your head when you've not had to do it everyday in a long time. Even if I'm mostly using this for helping with this current ransomware decryption efforts, this whole process has definitely peeked my interest in how these algorithms work. Factoring is kind of fun. :) 

Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Understanding assignment rules  Fred  PrimeNet  3  20160519 13:40 
Understanding CPUs on the Benchmark Page  Fred  Hardware  33  20160427 04:42 
Understanding the LL proof and then more related stuff following it  Raman  Math  4  20120524 05:37 
LL Test: Understanding the math  zacariaz  Homework Help  32  20070516 15:18 
Are Parts of Human Experience Beyond Scientific Understanding?  edorajh  Science & Technology  11  20070122 06:04 