#1
Jun 2005
lehigh.edu
1024_10 Posts
Tucked away inside Thorsten's report of a new DL record (mod a 160-digit prime) is what would appear to be the first public announcement of the latest development in the type of linear algebra used in NFS factorizations. Here's the NMBRTHRY paragraph:

Quote:
that each job may not need the complete matrix.

Each part seems to have gotten (the usual?) parallel linear algebra, after which there presumably would be some final assembly. One hopes that more explicit details of the distribution will appear. -bdodson
#2
"Bob Silverman"
Nov 2003
North of Boston
2^3×3×311 Posts

Quote:
Note that this was a DL computation. The matrix must be solved modulo the order of the group. One typically does this by solving it modulo small primes, then pasting the results together with the CRT. Solving the system modulo one of the small primes can be done completely independently of the others; this is a natural parallel partition of the problem. One then brings everything together with the CRT. Such a partition does not seem to exist for solving it over GF(2). I wish it did.
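The "pasting together with the CRT" step can be sketched as follows. This is a toy illustration of the recombination only (the primes and the recovered value are made up for the example), not code from any of the computations discussed: given an unknown's residues modulo several small primes, the CRT recovers it modulo their product.

```python
def crt(residues, moduli):
    # incremental CRT: fold in x ≡ r (mod m_i) constraints, pairwise-coprime moduli
    x, m = 0, 1
    for r, mi in zip(residues, moduli):
        t = ((r - x) * pow(m, -1, mi)) % mi  # lift x to satisfy the next congruence
        x, m = x + m * t, m * mi
    return x % m, m

# a value "solved for" independently modulo three small primes
secret = 123456
moduli = [101, 103, 107]
residues = [secret % p for p in moduli]

value, modulus = crt(residues, moduli)
print(value)  # 123456, since 101*103*107 = 1113121 exceeds the secret
```

Each residue could come from a separate, fully independent job; only this cheap recombination needs all of them, which is what makes the partition natural.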
#3
Oct 2006
vomit_frame_pointer
2^3×3^2×5 Posts

Quote:
So many factorization algorithms depend on solving x^2 = y^2 mod n, which ties us to a characteristic 2 computation in the matrix stage. How about jumping to the congruence x^k = y^k mod n (k large and smooth)? Two huge problems with this, of course: (i) sieving, or whatever approach one uses to collect relations, could be inefficient -- but we'd already be at the point where the matrix was much more inefficient anyway; (ii) the linear algebra is now over Z/(k), which ain't a field, but this could be a minor issue. Anyone think it might go in this direction some day?
#4
"Bob Silverman"
Nov 2003
North of Boston
16450_8 Posts

Quote:
Hint: ask yourself what the requirements are on the factors of n (and on n itself) for a nontrivial solution to the congruence x^3 = y^3 mod n to even exist. Further hint: such a solution does not exist for most n and almost all k.
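A quick numerical illustration of the hint (my own toy example, with made-up moduli): when every prime factor of n is ≡ 2 (mod 3), cubing is a bijection mod n, so x^3 ≡ y^3 (mod n) forces x ≡ y and the congruence is useless for factoring; a factor ≡ 1 (mod 3) is needed for nontrivial solutions to exist.

```python
def cube_image_size(n):
    # number of distinct cubes mod n; equals n exactly when cubing is a bijection
    return len({pow(x, 3, n) for x in range(n)})

# 55 = 5*11, both primes ≡ 2 (mod 3): cubing permutes Z/55
print(cube_image_size(55))   # 55 -- only trivial solutions to x^3 ≡ y^3
# 91 = 7*13, both primes ≡ 1 (mod 3): cubing is many-to-one
print(cube_image_size(91))   # 15 -- nontrivial solutions abound
```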
#5
Jun 2005
lehigh.edu
10000000000_2 Posts

Quote:
Quote:
section of Thorsten's post on solving logs for small primes, which was "done at the end of the matrix step".

I stand by my description; let's discuss again in six months. -Bruce
#6
Oct 2006
vomit_frame_pointer
2^3·3^2·5 Posts

Quote:
For a prime exponent k, x^k ≡ x and y^k ≡ y (mod k), so x^(k-1) + x^(k-2)y + ... + y^(k-1) = (x^k - y^k)/(x - y) will be congruent to 1 whenever x and y are incongruent mod k (since x^k - y^k ≡ x - y, and x - y is then invertible mod k). Hence you can only split off factors which are congruent to 1 modulo k. Yech.
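The argument can be checked numerically; here's a quick sketch with illustrative values of k, x, y (any coprime pairs would do):

```python
def prime_factors(n):
    # trial division; fine for small illustrative numbers
    fs, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            fs.add(d)
            n //= d
        d += 1
    if n > 1:
        fs.add(n)
    return fs

k = 7  # a prime exponent
for x, y in [(10, 3), (11, 2), (8, 3)]:  # coprime pairs
    s = (x**k - y**k) // (x - y)
    # every factor split off from this cofactor is k itself or ≡ 1 (mod k)
    assert all(p == k or p % k == 1 for p in prime_factors(s))
    # and when x, y are incongruent mod k, the cofactor itself is ≡ 1 (mod k)
    if (x - y) % k:
        assert s % k == 1
print("all factors are 7 or 1 mod 7")
```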
#7
"Bob Silverman"
Nov 2003
North of Boston
2^3×3×311 Posts

Quote:
Bingo! The exponent 2 works because all primes (except 2) are 1 mod 2.
#8
Jun 2005
lehigh.edu
2^10 Posts

Quote:
there's a reference to Joux-Lercier, Math. Comp., Nov. 2002 (from a revision in Oct. 2000). From that article ("improvements in gnfs for DL ..."), Thorsten reports three "small differences", including an improvement in the linear algebra.

I'm still somewhat puzzled by Bob's suggestion that DL gnfs is more inherently parallel than gnfs for integer factorization. One possible reduction uses the prime factors of p-1; but Thorsten takes the prime p in the "usual form", with q = (p-1)/2 also prime. The linear algebra is then mod q (not 2, indeed). J-L describe an initial structured Gauss step, which sounds as though it may correspond to merging large primes during filtering (where J-L don't use large primes). I'm still seeing a central computation on a single large matrix --- parallel Lanczos for J-L, "Wiedemann" for Thorsten.

The result of this "matrix step" does supply logs for some small primes ("good" ones in J-L, "many" in Th). We haven't yet gotten to the particular number whose log we're supposed to be computing --- the entire matrix step is a "pre-computation" (J-L) before getting to the particular log; but the post-computation is then relatively routine, and not so intensive. So that's 14 cpu-years for the matrix, and "a few hours per individual log" (Th). J-L report an additional sieving step (after the matrix solution) to write x = A/B; Th has y = -(d/n) mod p. In either case, A, B or d, n are supposed to be products of small "good" primes or medium primes ("larger" in Th). The logs of these primes are, in fact, reduced to logs of primes below 2^32. Is this the step Bob wants to distribute? If so, it's just this final post-computation, not the main matrix "pre-computation" (independent of, and before, getting to the specific number whose log is being computed). I do know for a fact (I was there to hear Contini's version) that block Lanczos mod 2 was adaptable to mod q.

So far as I can tell (pending further enlightenment, please), the other stuff being discussed doesn't touch the main step of the computation: the "linear algebra" before getting to the specific log being computed. The central question (to me) remains those 8 jobs of 12-24 cpus each in the main matrix computation. I'd like to be able to say that mod q vs. mod 2 doesn't matter, and that it's incidental that those 8 jobs are on a single cluster instead of being done separately in 8 different countries, across several continents. I'll admit that doesn't appear to fit Thorsten's description of a "small difference", unless it's included in the "Wiedemann" that's currently part of the gnfs suite of Franke-Bahr-Kleinjung. Anyone here that's seen current code? -bdodson
#9
"Bob Silverman"
Nov 2003
North of Boston
1110100101000_2 Posts

Quote:
for fairly large q we can do the following instead. Select a set of 32-bit primes p_i such that product(p_i) > q. Reduce the matrix mod each p_i, then solve it mod p_i for each p_i. This can be done totally independently for each p_i. Then find the null space mod q by pasting together the solutions for the p_i via the Chinese Remainder Theorem.

This reduces the problem of solving a big matrix modulo a (moderately large) multi-precision integer to one of solving multiple matrices modulo single-precision primes. One solves these smaller-coefficient matrices totally independently; each of the reduced matrices can, of course, also be solved in parallel. We have changed the problem from solving a matrix with big numbers using many processors in parallel to one of solving multiple smaller-number matrices. Each of the smaller matrices is still solved in parallel, but we use fewer processors for each. Processor utilization *decreases* for large numbers of processors working on a parallel linear algebra problem, so breaking a large number of processors into smaller clusters, each cluster working independently, should increase per-processor utilization.
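A toy sketch of the scheme (my own illustration, with a tiny dense system and made-up primes; the real computations use huge sparse matrices and block Lanczos/Wiedemann, not Gaussian elimination): solve the same integer system modulo several single-precision primes, independently, then paste the per-prime solutions together with the CRT.

```python
def solve_mod_p(A, b, p):
    # Gauss-Jordan elimination over GF(p); returns x with A*x ≡ b (mod p), or None if singular
    n = len(A)
    M = [[a % p for a in row] + [bi % p] for row, bi in zip(A, b)]
    for col in range(n):
        piv = next((r for r in range(col, n) if M[r][col]), None)
        if piv is None:
            return None  # singular mod this prime: discard it and pick another
        M[col], M[piv] = M[piv], M[col]
        inv = pow(M[col][col], -1, p)
        M[col] = [v * inv % p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(v - f * w) % p for v, w in zip(M[r], M[col])]
    return [row[n] for row in M]

def crt2(r1, m1, r2, m2):
    # combine x ≡ r1 (mod m1) and x ≡ r2 (mod m2), with m1, m2 coprime
    return r1 + m1 * (((r2 - r1) * pow(m1, -1, m2)) % m2)

# toy system with a known small integer solution
A = [[3, 1], [1, 2]]
b = [17, 14]  # = A * [4, 5]

primes = [97, 101, 103]  # stand-ins for the 32-bit p_i
partial = {p: solve_mod_p(A, b, p) for p in primes}  # totally independent jobs

# paste the per-prime solutions together coordinate by coordinate
x, m = partial[primes[0]], primes[0]
for p in primes[1:]:
    x = [crt2(xi, m, si, p) for xi, si in zip(x, partial[p])]
    m *= p
print(x)  # [4, 5]: recovered exactly, since prod(primes) exceeds the entries
```

In this setting the per-prime solves are the expensive part and need no communication between them; the CRT pasting at the end is trivial by comparison.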
#10
Aug 2004
New Zealand
2^2·3·19 Posts

Quote:
All this was for runs I did with MPQS in the C110 range.
#11
Jun 2005
lehigh.edu
2^10 Posts

Quote:
ahead of the curve on deciphering/anticipating posts from Franke-Bahr-Kleinjung, I'm now inclined towards the view that Thorsten's description wasn't referring to a new feature at all. (That would fit the "small difference ... in the linear algebra" in his post.) Rather, we seem to be getting, at long last, new details about how the matrix for RSA200 was done. Anyone care to wager on whether there was a version of these eight "chunks" that we have to thank for RSA200 being factored?

On Bob's post: thanks for a clear description of the classical linear algebra for mod p DL. I recognize the Chinese Remaindering from Peter's ecm/fft output. Does anyone know whether the method described is still applied post-Lanczos (much less post-Wiedemann)? I certainly didn't hear it in Scott's description (which appears to have been the first use of Peter's Lanczos applied to the mod p DL matrix; if I recall, this would have been in 1995, before Crypto95); and it's not referred to in Thorsten's reference to Joux-Lercier's paper (revised, 2000), which otherwise seems to be fairly complete in its detail. -Bruce