mersenneforum.org 20-year-old MIT LCS35 Time Capsule Crypto-Puzzle solved

2019-05-09, 21:38   #23
R. Gerbicz

"Robert Gerbicz"
Oct 2005
Hungary

2×3²×83 Posts

Quote:
 Originally Posted by TacticalCoder I'm also doing my own "revue de presse" (press review — searching for the articles talking about that problem) and I stumbled upon the message saying that on the same hardware as I used it's actually doable in 2.16 years (I could speed up but couldn't reproduce the same speedup: maybe because my RAM is slower, but that looks weird?), so I created an account to ask questions about that :)
Sometimes there is a pre-installed GMP on the system, and installing additional GMP version(s) can make it unclear which version is actually being used, so insert this in the code to check:
Code:
printf("Gmp version number=%s\n",gmp_version);
Also, update gcc to at least 8.3.0:
Code:
gerbicz@gerbicz-MS-7972:~/gmp-6.1.2$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/8/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 8.3.0-6ubuntu1~18.10' --with-bugurl=file:///usr/share/doc/gcc-8/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-8 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 8.3.0 (Ubuntu 8.3.0-6ubuntu1~18.10)
Today I tried MPIR (http://mpir.org/index.html), but for this problem it was slower than GMP by roughly 15-20%.

2019-05-16, 02:09   #24
TacticalCoder

May 2019

13 Posts

Re all: it's super late and I'm very tired after an amazing day at the Stata Center, but... I updated the factordb DB with the two factors of LCS35's n. The secret message was simply: "!!! Happy Birthday LCS !!! (seed value b for p = 712238904468723561162000937465778229877598711342253664788091132335)" Here goes the factorization...
Code:
n from LCS35: 6314466083072888893799357126131292332363298818330841375588990772701957128924885547308446055753206513618346628848948088663500368480396588171361987660521897267810162280557475393838308261759713218926668611776954526391570120690939973680089721274464666423319187806830552067951253070082020241246233982410737753705127344494169501180975241890667963858754856319805507273709904397119733614666701543905360152543373982524579313575317653646331989064651402133985265800341991903982192844710212464887459388853582070318084289023209710907032396934919962778995323320184064522476463966355937367009369212758092086293198727008292431243681
prime 1: 154562048657421937084682913138460711844778329290535727168950717733424202505601392625877205195325207546299861196433571846578805282932208177729566895723130005216848832881503977502006366130238044889916634521661721389848859842071584845041109033358607085246937424599502955074736164230571310927294787034883784001451
prime 2: 40853923313791905056390006230327028982596606945346318846549337657379367189284051739393623398120773967666443351405399733470781263258210328375697986182263865342952431430451931199880262185216184022204644820522254304557640038220504811993426511340296997627585129527426806995411578324937406976125239089383993333731
Last fiddled with by jasonp on 2019-05-16 at 11:48 Reason: added code tags
2019-05-21, 00:19   #26
paulunderwood

Sep 2002
Database er0rr

2·1,933 Posts

Quote:
 Originally Posted by kevinhake Put another way: solving the puzzle by squaring requires 79685186856218 iterations. If we want to solve it in one week, how fast does our machine need to crunch? 1 week = 604800 seconds. 79685186856218/604800 = 131754608 iterations/sec. Now imagine you could build a machine that only did this calculation, one iteration per cycle. That's 132 MHz. Doesn't sound so impossible now, does it? You can buy FPGAs today that clock to 500 MHz. In 1999, 60 MHz FPGAs were available. I'm not sure what kind of area they had, but I suspect that if someone at Altera or Xilinx had given the puzzle a serious look in 1999, it would've been solved quickly then, maybe only a year or two later!
I have a very small basic Xilinx FPGA -- so I am under-qualified to say much about 2048x2048-->4096-->2048. One would need a hell of a lot of mult units on an FPGA board to achieve it. I really don't know what is possible these days, nor the cost of such boards.
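As a quick sanity check on the arithmetic in the quoted estimate (a short Python snippet, purely illustrative):

```python
# LCS35 requires t successive squarings; check the iterations/sec needed
# to finish in one week, as estimated in the quoted post.
t = 79685186856218              # squarings required by the puzzle
seconds_per_week = 7 * 24 * 3600
rate = t / seconds_per_week     # iterations per second for a one-week solve
print(round(rate))              # ~131.75 million, i.e. a ~132 MHz single-cycle squarer
```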

2019-05-21, 00:49   #27
kevinhake

"Kevin W. Hake"
May 2019

101₂ Posts

Quote:
 Originally Posted by paulunderwood I have a very small basic Xilinx FPGA -- so I am under-qualified to say much about 2048x2048-->4096-->2048. One would need a hell of lot of mult units on a FPGA board to achieve it. I really don't know what is possible these days, nor the cost of such boards.
I was going to try it on an Arty A7-35T; it cost less than $100 USD. I wasn't even going to use the built-in multiply units, as each one is not very wide... I think you might then end up with a hierarchy with longer delays, but I'm not sure. I was going to use only look-up tables, using the squaring circuit from a paper I found, "An Improved Squaring Circuit for Binary Numbers" by Kabiraj Sethi and Rutuparna Panda (https://pdfs.semanticscholar.org/29f...e93286160d.pdf)

Somewhat related, I found in MPIR that performing a square operation followed by a modulo was actually faster than the powmod function... but perhaps that's only because MPIR's implementation isn't good. I'm curious to try GMP's. I also didn't quite grasp Montgomery multiplication, and maybe I'm missing something that would've made it crunch much faster.

2019-05-21, 01:07   #28
paulunderwood

Sep 2002
Database er0rr

3866₁₀ Posts

Quote:
 Originally Posted by kevinhake I was going to try it on an Arty A7-35T; it cost less than $100 USD. I wasn't even going to use the built-in multiply units, as each one is not very wide... I think you might then end up with a hierarchy with longer delays, but I'm not sure. I was going to use only look-up tables, using the squaring circuit from a paper I found, "An Improved Squaring Circuit for Binary Numbers" by Kabiraj Sethi and Rutuparna Panda (https://pdfs.semanticscholar.org/29f...e93286160d.pdf) Somewhat related, I found in MPIR that performing a square operation followed by a modulo was actually faster than the powmod function... but perhaps that's only because MPIR's implementation isn't good. I'm curious to try GMP's. I also didn't quite grasp Montgomery multiplication, and maybe I'm missing something that would've made it crunch much faster.
I reinvented the wheel by devising a mod reduction that had the same run time as a multiplication in my own multi-precision JavaScript code. Apparently, the technique can be found in "Modern Computer Algebra" by Joachim von zur Gathen and Jürgen Gerhard, Cambridge University Press, 2nd edition, 2003. [118]. I have no idea whether this particular technique is applicable to FPGAs.

You might also find it helps to align n (over which you are taking the mod) by shifting it left to the nearest 64-bit boundary: take mod n*2^i in the loop and then do a final mod n after the loop. If n is already aligned then this point is moot. This made a difference of about 2% when n was not aligned.

HTH

Last fiddled with by paulunderwood on 2019-05-21 at 01:09
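For illustration, a minimal Python sketch of the alignment idea (variable names are mine). Python's own bignums won't show the ~2% win — that comes from limb-oriented libraries such as GMP — but the sketch shows why the answer is unchanged: the loop modulus is a multiple of n, so one final reduction recovers the true residue.

```python
import random

# Toy modulus, typically not aligned to a 64-bit limb boundary.
n = random.getrandbits(2000) | 1

# Shift n left to the next multiple of 64 bits: N = n * 2^shift.
shift = (-n.bit_length()) % 64
N = n << shift

r = 2
for _ in range(1000):
    r = r * r % N      # the whole loop reduces mod the aligned modulus N

result = r % n          # a single final reduction mod n after the loop
assert result == pow(2, 1 << 1000, n)   # same as squaring mod n throughout
```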

2019-05-21, 02:30   #29
kevinhake

"Kevin W. Hake"
May 2019

5 Posts

Quote:
 Originally Posted by paulunderwood I reinvented the wheel by devising a mod reduction that had a the same run time as a multiplication in my own muti-precision JavaScript code. Apparently, the technique can be found in "Modern Computer Arithmetic" by Joachim von zur Gathen and Jürgen Gerhard, Cambridge University Press, 2nd edition, 2003. [118]. I have no idea whether this particular technique is applicable to FPGA. HTH
Oh cool! I was thinking that because we are dividing by a constant, there must be some shortcut for the division, especially in hardware. I barely got started figuring that out, though.
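There is indeed a classic shortcut for reduction by a fixed modulus: Barrett reduction, which replaces the division with two multiplications by a precomputed reciprocal. A minimal Python sketch (not from the thread; variable names are mine):

```python
import random

# Fixed 2048-bit odd modulus with the top bit set.
n = random.getrandbits(2048) | (1 << 2047) | 1
k = n.bit_length()                 # k = 2048
mu = (1 << (2 * k)) // n           # precomputed once: floor(2^(2k) / n)

def barrett_reduce(x):
    """Reduce 0 <= x < n^2 modulo n without performing a division by n."""
    q = ((x >> (k - 1)) * mu) >> (k + 1)   # estimate of x // n, low by at most 2
    r = x - q * n
    while r >= n:                           # at most two corrective subtractions
        r -= n
    return r

x = random.getrandbits(4000)        # some x < n^2
assert barrett_reduce(x) == x % n
```

In hardware the appeal is that the two multiplications can use the same wide-multiplier datapath as the squaring itself.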

2019-05-22, 13:34   #30
theyowguy

May 2019

2₁₆ Posts

Quote:
 As far as protecting from errors... I'll be honest, Shamir's suggestion sounded very intuitive in the paper but when I sat down to implement it, I couldn't make heads or tails of it.

Let me preface all this by saying I'm not a math person but I love puzzles and the solutions to puzzles.

I first started looking at LCS35 after I read that someone had solved it (I had never heard of it before then). Writing a pure Java implementation of the solver was simple (even for me), but the portion about protecting against errors threw me. The thing I loved most about the solver is that you could have it chugging away (I would save the progress every 100 million t's) while working on improving its speed. Even though I was stuck on the error checking, I managed to enable GMP in my Java implementation, and it takes my 2-year-old machine ~18.5 minutes to process 100 million t's using GMP (I think that works out to almost 29 years to get a solution).

Finally though, after about 3 weeks of searching and reading, I managed to get a working error check.

Instead of 2^2^t mod n, you calculate 2^2^t mod cn where c is a 50-bit prime.

If you take your result mod n, you get the same answer as if you had been doing 2^2^t mod n all along.

If you take your result mod c, you get one side of your verification number. To get the other side, you perform the following (where k is the current value of t):

k' = 2^k mod (c-1)
2^(2^k) mod c = 2^k' mod c

Fermat's little theorem to the rescue.
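The check described above can be sketched in a few lines of Python, using toy sizes (a small t, a small prime c, and a small composite n instead of the 2048-bit LCS35 modulus); variable names are mine:

```python
# Error-checked repeated squaring: do the work modulo c*n instead of n.
n = 253        # toy composite modulus (11 * 23); LCS35's n is 2048 bits
c = 1000003    # a prime (the smallest prime above 10^6); the post suggests ~50 bits
t = 1000       # number of squarings (the real puzzle used t = 79685186856218)

r = 2
for _ in range(t):
    r = r * r % (c * n)       # 2^(2^t) mod c*n

# Working mod c*n leaves the answer mod n unchanged:
answer = r % n

# Independent verification via Fermat's little theorem:
# since c is prime, 2^(2^t) mod c == 2^(2^t mod (c-1)) mod c.
k_prime = pow(2, t, c - 1)
expected = pow(2, k_prime, c)
assert r % c == expected      # any mid-run error would (almost surely) break this
print(answer)
```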

2019-05-23, 06:31   #31
2M215856352p1

May 2019

3·37 Posts

Hi. I am also new to the forum. I believe that you have gotten the theory correct.

The error check you proposed has a 2-5% overhead (my estimate) and has a 1/c chance of leaving an error undetected.

However, there is an even stronger error check, called the Gerbicz error check. It has negligible overhead and has a 2/n probability or less of leaving an error undetected, much smaller than 1/c. (Is that right?)

Link to the Gerbicz error check: https://www.mersenneforum.org/showpo...36&postcount=1. It works for any N since only squaring is involved.

Correct me if I am wrong.
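For completeness, a small Python sketch of the Gerbicz check as I understand it from the linked post (toy parameters, variable names mine). Every B squarings the running checksum d picks up one cheap multiplication; an occasional verification redoes one block step of d with B squarings. An earlier error propagates differently into the two sides (squared repeatedly in x, but entering d linearly), so it is caught with overwhelming probability.

```python
N = 1000003 * 999983    # toy modulus (for LCS35 this would be n, or c*n)
B = 50                  # block length; real implementations use ~1000
x0 = 3                  # starting value; LCS35 uses 2
x = x0
d = x0                  # running checksum: product of x at block boundaries

for block in range(20):             # 20*B squarings in total
    d_prev = d
    for _ in range(B):
        x = x * x % N               # the main work: repeated squaring
    d = d * x % N                   # cheap checksum update: one mult per block

    # Verification: d must equal x0 * d_prev^(2^B) mod N.  Done here every
    # block for clarity; in practice it is done only every ~B blocks, so the
    # B extra squarings amount to negligible (~1/B) overhead.
    check = d_prev
    for _ in range(B):
        check = check * check % N
    assert d == x0 * check % N

print(x)    # equals pow(x0, 2**(20*B), N)
```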

2019-05-23, 11:22   #32
theyowguy

May 2019

2 Posts

I can try to calculate the overhead later. Shouldn't be too difficult. I had a quick look at your link and I'm not sure whether I would be able to implement this check. Thanks, I will keep reading up on it though.
2019-05-24, 18:43   #33
R. Gerbicz

"Robert Gerbicz"
Oct 2005
Hungary

2×3²×83 Posts

Wikipedia:
Code:
The idea behind the challenge is that the only known way to find the value of w without knowing the factorisation of n is by t successive squarings.
This is just wrong; the final step is tricky, but it uses close-to-standard ideas. Skipping more iterations works the same way, so I show only the (r^4) mod m case.

You can compute r^4 mod m without computing r^2 mod m (and then r^4 with one more squaring): compute (r^4) mod p for many primes p, and with the CRT reconstruct r^4: val=sum(c[i]*M/P[i]) for M=vecprod(P). Here comes the trick: we can skip the computation of val and directly compute the mod m value. The problem with this is that:
Code:
r^4 mod m=(sum(c[i]*M/P[i]) mod M) mod m
and here you just can't swap mod M and mod m (!), but if you know the range of val:
Code:
d*M<=val<(d+1)*M
then r^4 mod m=v mod m, where
v=sum(c[i]*M/P[i])-d*M
v==sum(c[i]*H[i])-d*G mod m, where H[i]=(M/P[i])%m and G=M%m
and a final observation: here abs(sum(c[i]*H[i]-d*G))<256*m, so the mod m reduction is easy.

Standard ideas: if you can compute c*H with 256 threads in T time, then the computation of sum(c[i]*H[i]) takes only log2(256)*T=8*T time (the same applies to eff).

A working example in Pari-GP:
Code:
f(r,m)={P=[];pr=2^32;for(i=1,256,pr=precprime(pr-1);P=concat(P,pr));
c=vector(256);
M=vecprod(P);
G=M%m;
H=vector(256,i,(M/P[i])%m);
inv=vector(256,i,lift(Mod(M/P[i],P[i])^(-1)));
eff=vector(256,i,1.0/P[i]);
for(i=1,256,p=P[i];res=lift(Mod(r,p)^4);c[i]=(res*inv[i])%p);
d=floor(sum(i=1,256,c[i]*eff[i]));
return((sum(i=1,256,c[i]*H[i])-G*d)%m)}

m=random(2^2047);
r=random(m);
f(r,m)==(r^4)%m
r=random(10^20);
f(r,m)==(r^4)%m

%16 = 1
%18 = 0
The last example shows that there was a computation error in d, but you can handle that easily (without the floor, the sum would have been suspiciously close to an integer). Of course you can precompute P,M,G,H,inv,eff; you don't need to store all of these in all threads.
Last fiddled with by R. Gerbicz on 2019-05-24 at 18:44 Reason: grammar
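The same residue-number-system trick can be sketched in Python (names mine). One deviation from the Pari code above: for robustness this sketch computes the quotient d exactly with integer arithmetic, where the Pari version estimates it in floating point and must guard against the rounding failure shown in the %18 example.

```python
import random

def is_prime(n):
    """Deterministic Miller-Rabin, valid for n < 3.3e24."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True

def pow4_mod(r, m, P, H, inv, G, M):
    """Compute r^4 mod m from residues mod many small primes, skipping val."""
    c = [pow(r, 4, p) * ip % p for p, ip in zip(P, inv)]   # CRT coefficients
    # Exact quotient d = floor(val / M); Gerbicz estimates this with floats.
    d = sum(ci * (M // p) for ci, p in zip(c, P)) // M
    # v = sum(c[i]*H[i]) - G*d is congruent to r^4 mod m.
    return (sum(ci * hi for ci, hi in zip(c, H)) - G * d) % m

# Precomputation: collect primes below 2^32 until their product exceeds m^4.
m = random.getrandbits(1000) | 1
P, M, q = [], 1, 2 ** 32
while M <= m ** 4:
    q -= 1
    while not is_prime(q):
        q -= 1
    P.append(q)
    M *= q
G = M % m
H = [(M // p) % m for p in P]
inv = [pow(M // p, -1, p) for p in P]   # (M/P[i])^-1 mod P[i]

r = random.randrange(m)
assert pow4_mod(r, m, P, H, inv, G, M) == pow(r, 4, m)
```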

