20190121, 08:04  #23 
Nov 2005
101 Posts 
Has anybody tried the Trial Division algorithm : https://github.com/TilmanNeumann/jav...63Inverse.java ?
The idea is pretty simple: Since we anyway compute a list of primes, why not also calculate the inverse of the prime as a double value and store it as well. Then instead of doing a trial division with this prime we multiply with its reciprocal. We have to check if this value is an integer. So we convert the value to long and see if this multiplied by the prime gives the back the original number to test. We observed a reasonable speedup over standard trial division. 
20190121, 14:04  #24  
"Ben"
Feb 2007
2·3^{2}·191 Posts 
Quote:


20190121, 14:58  #25  
"Tilman Neumann"
Jan 2016
Germany
2^{2}×109 Posts 
Quote:
You would see just that if you test the Lehman with a dataset of semiprimes N with smaller factor > cbrt(N) and parameter doTDivFirst=false ;) The minimal improvement at bigger sizes would result from the few cases where the "correction loop" is needed, which we put behind the (second) trial division loop. Anyway, thanks a lot for crosschecking our stuff in C, that is very helpful! 

20190203, 06:37  #26 
Feb 2019
California
1 Posts 
R Sherman Lehman
I have nothing mathematically to add to this discussion – just wanted to convey that my father, R Sherman Lehman, is intrigued by your work and pleased that even after all these years his factoring work is of value. He will continue to monitor your progress.

20190205, 09:52  #27 
"Tilman Neumann"
Jan 2016
Germany
2^{2}·109 Posts 
Thanks for the note, John; please tell your father that it is a great honor to us.
We are still trying to improve the implementation, but it is getting more difficult. Nonetheless, in the current form your father's algorithm is fast enough to have a prominent place not only in history but in practical applications, too. 
20190210, 13:16  #28 
"Tilman Neumann"
Jan 2016
Germany
664_{8} Posts 
Thilo and me found another fast algorithm.
This developed as follows: 1. The long tail of our splitted Lehman implementation is just a Hart oneline factorizer (HOLF). This let us believe that the congruences would speed up HOLF as well. Indeed that gave an immediate improvement like factor 3 compared to a simple HOLF implementation. 2. Next we noticed that only the most discriminative kloop with k==0 % 3 is needed. Having a single loop, we need no bound check either. 3. We found a better multiplier 315=3*3*5*7, so now we test k with k==0 % 315 only. This has no negative consequences for memory requirements, because only sqrts of k that are multiples of 315 need to be stored. 4. Finally, it showed to be beneficious to _race_ HOLF against trial division in a single loop. This makes the algorithm robust and very fast for test numbers having small prime factors. In Java, the new algorithm is factoring hard semiprimes about 20% faster than the Lehman implementation we proposed before. The smaller the factors can be, the bigger the speedup gets. The crossover point of the new algorithm with trial division is at ~25 bit and with PollardRho at ~50 bit. Arrays with 1 million entries are sufficient to factor any number <= 54 bit. Here is the source code: https://github.com/TilmanNeumann/jav...TDiv_Race.java We'll try next to apply the racing idea to our Lehman implementation. P.S. Segmenting the arrays might be another idea to speed the algorithms up in C(++). It didn't help in Java, though. Last fiddled with by Till on 20190210 at 13:18 Reason: minor text improvements 
20190210, 16:53  #29 
"Tilman Neumann"
Jan 2016
Germany
110110100_{2} Posts 
We would like to point out two more things:
1. Fast trial division by "multiplication of the inverse" is essential for the performance of Hart_TDiv_Race. With standard trial division, it would take twice the time, and then Hart_Fast (in the same package as Hart_TDiv_Race) would be much faster for hard semiprimes (Hart_Fast is faster in that case than our Lehman implementation, too). 2. Lehman's and Hart's algorithms seem to be very close relatives. Our "Lehman implementation" spends ~90% of time and finds ~90% of factors in the "Hart loops" ("medium/long range", "long tail"). On the other hand, our "Hart implementation" profits immensely from the congruences derived by Lehman. 
20190215, 20:46  #30 
Nov 2005
1100101_{2} Posts 
brief java implementation
Here is a brief java implementation of our last version.
It works fine for odd numbers up to 40 bits. For simplicity we have no primes table so it does not works super fast for numbers with factors below n^1/3. The gcd implementation is also missing. For numbers with factors above n^1/3 it has the same performance as the latest version of our lehman implementation. The main loop basically checks for small numbers with a modified trial division and for big numbers with the lehman approach. In the Lehman part we focus just on numbers of the form k = 3*3*5*7 * i. Even and odd k cause some limitations on the number 'a' of a^2  N = y^2. By choosing k as a multiple of some primes a satisfies a^2  N = y^2 modulus this primes. Enough talking here is the code: Code:
public class HartSimpleMin { private static final double DISCRIMINATOR = 1.0/(1<<10); // experimental result /** * We only test kvalues that are multiples of this constant. * Best values for performance are 315, 45, 105, 15 and 3, in that order. * The multiplier ensures that the generated test values are always a square mod K_MULT * Since the K_MULT consists out of 4 primes these numbers have a 2^4 = 16 times * higher chance of being a square then random numbers. This is very helpful */ private static final int K_MULT = 3 * 3 * 5* 7; /** This constant is used for fast rounding of double values to long. */ private static final double ROUND_UP_DOUBLE = 0.9999999665; private static double[] sqrt; // static double[] primes; with a table of primes the algorithm is much faster for small factors; here we just use odd numbers 3,5,7,9,.. static double[] primesInv; // works for numbers up to 40 Bits private static int maxFactor = 1 << 19; static { // Precompute sqrts for all k required for N <= MAX_N and multiplier K_MULT sqrt = new double[maxFactor+1]; primesInv = new double[maxFactor+1]; for (int i = 1; i <= maxFactor; i++) { sqrt[i] = Math.sqrt(i*K_MULT); primesInv[i] = 1.0 / i; } } public HartSimpleMin() { } /** * Find a factor of long N. * @param N * @return factor of N */ public long findSingleFactor(long N) { long a,b,test, gcd; final long fourN = N<<2; final double sqrt4N = Math.sqrt(fourN); for (int sqrtIndex = 1, k = sqrtIndex * K_MULT, prime = 3; ;k += K_MULT, prime += 2, sqrtIndex++) { // do trial division if ((long) (N * primesInv[prime] + DISCRIMINATOR) * prime == N) return prime; a = (long) (sqrt4N * sqrt[sqrtIndex] + ROUND_UP_DOUBLE); // adjust a mod 8 if ((sqrtIndex & 1) == 0) { a = 1; } else { final long kPlusN = k + N; a += (kPlusN & 3) == 0 ? ((kPlusN  a) & 7) : ((kPlusN  a) & 3); } test = a*a  k * fourN; b = (long) Math.sqrt(test); if (b*b == test && (gcd = gcd(a+b, N))>1 && gcd < N) { return gcd; } } } } Last fiddled with by ThiloHarich on 20190215 at 21:11 
Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Do normal adults give themselves an allowance? (...to fast or not to fast  there is no question!)  jasong  jasong  35  20161211 00:57 
SQUFOF implementation  alpertron  Factoring  15  20100412 19:16 
ECM/FPGA Implementation  rdotson  Hardware  12  20060326 22:58 
need C implementation of divb(r,k)  Leith  Miscellaneous Math  4  20050118 23:14 
RSA implementation  flava  Programming  12  20041026 03:51 