EdH 2021-02-03 19:04

How I Create a Colab Session That Extends Aliquot Sequences Working Directly with factordb
 
(Note: I expect to keep the first post of each of these "How I..." threads up-to-date with the latest version. Please read the rest of each thread to see what may have led to the current set of instructions.)

I will take the liberty of expecting readers to already be somewhat familiar with Google's Colaboratory sessions. There are several threads already on Colab and these should be reviewed by interested readers:

[URL="https://mersenneforum.org/showthread.php?t=24646"]Google Colaboratory Notebook?[/URL]
[URL="https://www.mersenneforum.org/showthread.php?t=24818"]GPU72 Notebook Integration...[/URL]
[URL="https://mersenneforum.org/showthread.php?p=527912"]Notebook Instance Reverse SSH and HTTP Tunnels.[/URL]
[URL="https://www.mersenneforum.org/showthread.php?t=24875"]Colab question[/URL]

I do not yet have a GitHub account, so I have not uploaded this to GitHub. Others may feel free to do so, if desired.

The following is a way to work with factordb.com to extend an Aliquot sequence. The session retrieves the last composite in a sequence and factors it, returning the factors to factordb.com, which in turn extends the sequence. The session asks for the initial sequence term to be entered. It uses YAFU to perform the factoring, which in turn uses GMP-ECM for part of its process. The initial run installs all the necessary packages.
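For readers new to the terminology, here is a minimal sketch of why a complete factorization is what extends a sequence: the next Aliquot term is s(n), the sum of the proper divisors of n, and it follows directly from the prime factorization. factordb.com does this computation itself once the factors are reported; the function name below is only illustrative, and the trial division is usable only for small n.

```python
def sum_proper_divisors(n):
    """Return s(n), the sum of the divisors of n excluding n itself."""
    if n < 2:
        return 0
    sigma = 1          # sigma(n) is multiplicative over prime powers
    remaining = n
    p = 2
    while p * p <= remaining:
        if remaining % p == 0:
            term = 1   # builds 1 + p + p^2 + ... + p^k
            power = 1
            while remaining % p == 0:
                remaining //= p
                power *= p
                term += power
            sigma *= term
        p += 1
    if remaining > 1:  # one prime factor larger than sqrt(n) is left over
        sigma *= 1 + remaining
    return sigma - n


# One step of the sequence starting at 12: divisors 1 + 2 + 3 + 4 + 6
print(sum_proper_divisors(12))
```

For the large terms this session deals with, the hard part is the factorization, which is why the work is handed to YAFU.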

Note: Currently, this session is only using SIQS for factoring that GMP-ECM doesn't solve. The NFS switch is not used during compilation of YAFU and neither Msieve nor the GNFS sievers are installed.

To use Colab, you need a Gmail account and will be required to log into that account to run a session.

On to the specifics:

Open a [URL="https://colab.research.google.com/notebooks/welcome.ipynb"]Google Colaboratory[/URL] session.
Sign in with your Google/Gmail account info.
Choose New notebook:
[code]
Menu->File->New notebook (or within popup)
[/code]Click Connect to start a session.
Edit title from Untitled... to whatever you like.
Paste the following into the Codeblock:
[code]
##########################################################
### This Colaboratory session is designed to extend an ###
### Aliquot sequence. It works directly with factordb  ###
### and uses YAFU to find factors, which are then sent ###
### back to factordb.                                  ###
###                                                    ###
### The only interaction with this session is to enter ###
### the initial Aliquot term when prompted.            ###
##########################################################

import os
import subprocess
import urllib.request

#reports factors to factordb
def send2db(composite, factors):
    factorline = str(factors)
    sendline = 'report=' + str(composite) + '%3D' + factorline
    dbcall = sendline.encode('utf-8')
    temp2 = urllib.request.urlopen('http://factordb.com/report.php', dbcall)

#checks to see if yafu already exists
#if it does, this portion is skipped
exists = os.path.isfile('yafu')
if not exists:
    print("Installing system packages. . .")
    # subprocess.call(["chmod", "777", "/tmp"])
    subprocess.call(["apt", "update"])
    subprocess.call(["apt", "install", "g++", "m4", "make", "subversion", "libgmp-dev", "libtool", "p7zip", "autoconf"])
    #retrieves ecm
    print("Retrieving GMP-ECM. . .")
    subprocess.call(["svn", "co", "svn://scm.gforge.inria.fr/svn/ecm/trunk", "ecm"])
    os.chdir("/content/ecm")
    subprocess.call(["libtoolize"])
    subprocess.call(["autoreconf", "-i"])
    subprocess.call(["./configure", "--with-gmp=/usr/local/"])
    print("Compiling GMP-ECM. . .")
    subprocess.call(["make"])
    subprocess.call(["make", "install"])
    print("Finished installing GMP-ECM. . .")
    os.chdir("/content")
    #retrieves YAFU
    print("Retrieving YAFU. . .")
    subprocess.call(["svn", "co", "https://svn.code.sf.net/p/yafu/code/trunk", "/content/yafu"])
    os.chdir("/content/yafu")
    subprocess.call(["mv", "yafu.ini", "yafu.ini.orig"])
    #writes a fresh yafu.ini with the settings used by this session
    with open("yafu.ini", "a+") as yafuini:
        yafuini.write("B1pm1=100000\n")
        yafuini.write("B1pp1=20000\n")
        yafuini.write("B1ecm=11000\n")
        yafuini.write("rhomax=200\n")
        yafuini.write("threads=2\n")
        yafuini.write("pretest_ratio=0.25\n")
        yafuini.write("ecm_path=/usr/local/bin/ecm\n")
        yafuini.write("xover=93\n")
        yafuini.write("no_clk_test=1")
    print("Compiling YAFU. . .")
    subprocess.call(["make", "x86_64", "USE_SSE41=1", "USE_AVX512=1"])
    print("Finished compiling YAFU. . .")
    #writes the bash script that loops: fetch composite, factor, report
    with open("/content/yafu/AliWork.sh", "a+") as AliWork:
        AliWork.write("#!/bin/bash\n")
        AliWork.write("\n")
        AliWork.write("function getComp(){\n")
        AliWork.write(" wget -q -U Mozilla/5.0 \"http://factordb.com/sequences.php?se=1&aq=${seq}&action=last\" -O dbTemp\n")
        AliWork.write(" exec <\"dbTemp\"\n")
        AliWork.write(" while read line2\n")
        AliWork.write(" do\n")
        AliWork.write(" case $line2 in\n")
        AliWork.write(" *\"index.php?id=\"*) ind=${line2##*002099}\n")
        AliWork.write(" ind1=${#line2}\n")
        AliWork.write(" let ind1=${ind1}-${#ind}\n")
        AliWork.write(" let ind1=${ind1}-41\n")
        AliWork.write(" ind2=http://factordb.com/index.php?id=${line2:${ind1}:19}\n")
        AliWork.write(" ;;\n")
        AliWork.write(" esac\n")
        AliWork.write(" done\n")
        AliWork.write("\n")
        AliWork.write(" wget -q -U Mozilla/5.0 $ind2 -O dbTemp\n")
        AliWork.write("\n")
        AliWork.write(" exec <\"dbTemp\"\n")
        AliWork.write(" while read line3\n")
        AliWork.write(" do\n")
        AliWork.write(" case $line3 in\n")
        AliWork.write(" *\"query\"*) comp=${line3##*\'value=\"\'}\n")
        AliWork.write(" comp2=${comp:0:${#comp}-2}\n")
        AliWork.write(" complen=${#comp2}\n")
        AliWork.write(" ;;\n")
        AliWork.write(" esac\n")
        AliWork.write(" done\n")
        AliWork.write("}\n")
        AliWork.write("\n")
        AliWork.write("rm /content/yafu/stopAliWork 2>/dev/null\n")
        AliWork.write("cd /content/yafu\n")
        AliWork.write("test=\"\"\n")
        AliWork.write("\n")
        AliWork.write("if [ ${#1} -lt 3 ]\n")
        AliWork.write(" then\n")
        AliWork.write(" printf \"Enter seq#: \"\n")
        AliWork.write(" read line in\n")
        AliWork.write(" seq=$line\n")
        AliWork.write(" else\n")
        AliWork.write(" seq=$1\n")
        AliWork.write("fi\n")
        AliWork.write("\n")
        AliWork.write("while [ ! -e ~/stopAliWork ]\n")
        AliWork.write(" do\n")
        AliWork.write(" rm *.log 2>/dev/null\n")
        AliWork.write(" rm *.dat 2>/dev/null\n")
        AliWork.write(" rm factors 2>/dev/null\n")
        AliWork.write(" rm db* 2>/dev/null\n")
        AliWork.write(" rm nfs.* 2>/dev/null\n")
        AliWork.write(" getComp\n")
        AliWork.write("\n")
        AliWork.write(" echo sequence is $seq and composite is ${comp2:0:10}...${comp2:${complen}-2}\'< \'${complen}\'>\'\n")
        AliWork.write(" echo \"$comp2\" | ./yafu -of factors\n")
        AliWork.write(" exec <\"factors\"\n")
        AliWork.write(" read line in\n")
        AliWork.write(" report=http://factordb.com/report.php?report=$(echo ${line:1:$complen}%3D${line:$complen+3} | tr -s \'/\' \'*\')\n")
        AliWork.write(" wget -q -U Mozilla/5.0 \"$report\" -O dbRsave\n")
        AliWork.write("\n")
        AliWork.write(" done\n")
    print("All compilations completed!\n")
#the lines below are Colab magics and run on every execution of the cell
%cd /content/yafu/
print("Enter sequence into box and press Enter:\n")
!bash AliWork.sh
!rm ~/stopAliWork
[/code] Click on the Run cell icon or use CTRL-Enter. The first run will install packages for a few minutes and then prompt for the seq# (initial term). It will then work toward extending the sequence.

Subsequent runs should proceed without the compilation and installation steps.
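As an illustration of the reporting step, here is a minimal Python sketch of what the bash slicing in AliWork.sh appears to do. It assumes (inferred only from the `${line:1:$complen}` and `${line:$complen+3}` offsets, not verified against YAFU itself) that the first line of the `factors` file looks like `(N)/p1/p2/...`. The function names are hypothetical, and nothing is sent to factordb here; the code only builds the payload.

```python
from urllib.parse import urlencode


def parse_factor_line(line):
    """Split an assumed '(N)/p1/p2/...' line into (N, [p1, p2, ...])."""
    head, _, tail = line.strip().partition(")")
    composite = int(head.lstrip("("))
    factors = [int(f) for f in tail.split("/") if f]
    return composite, factors


def build_report(composite, factors):
    """Return the URL-encoded body 'report=N%3Dp1%2Ap2...' as bytes."""
    equation = f"{composite}=" + "*".join(str(f) for f in factors)
    return urlencode({"report": equation}).encode("utf-8")


composite, factors = parse_factor_line("(15)/3/5")
print(build_report(composite, factors))
```

The bash script writes `%3D` by hand for the `=` sign and turns `/` into `*` with `tr`; `urlencode` handles both the `=` and the `*` encoding in one step.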

LaurV 2021-02-04 03:34

:tu: :goodposting:

I am trying to run this. It works fine for small composites, as long as the internal ECM is used, but it seems to crash when it comes to the external ECM and GGNFS steps (after running the third internal "pm1" step). Are you sure the path to the ecm is in usr/bin and not in the home/content/ecm/blahblah? And do you need the ggnfs files in Linux? (In Windows you do; you can't run yafu without them for composites larger than ~110 digits. But maybe they are installed by default in Linux? I have no idea.)

Anyhow, a nice beginning.

EdH 2021-02-04 14:24

[QUOTE=LaurV;570822]:tu: :goodposting:

I am trying to run this. It works fine for small composites, as long as the internal ECM is used, but it seems to crash when it comes to the external ECM and GGNFS steps (after running the third internal "pm1" step). Are you sure the path to the ecm is in usr/bin and not in the home/content/ecm/blahblah? And do you need the ggnfs files in Linux? (In Windows you do; you can't run yafu without them for composites larger than ~110 digits. But maybe they are installed by default in Linux? I have no idea.)

Anyhow, a nice beginning.[/QUOTE]Ah, sorry, but this does not run NFS, only SIQS ATM. I forgot to mention that it is compiled without the NFS switch (done now). I need to see if B[SUP]2[/SUP] has reinstated the link to his sievers and possibly add NFS. (I would also have to add Msieve.) But, in the overall scheme, will a Colab session survive the time required for two threads to perform an NFS run?

As to the GMP-ECM, it should be in the normal linux place. From a Colab instance:
[code]
!whereis ecm
ecm: /usr/local/bin/ecm
[/code]I haven't seen any complaint about not finding GMP-ECM during any of the runs.
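The same check can be made from Python in a notebook cell. This is a minimal sketch using the standard library's `shutil.which`; the helper name `locate_tool` is only illustrative.

```python
import shutil


def locate_tool(name="ecm"):
    """Return the full path of an executable on PATH, or None if absent."""
    return shutil.which(name)


path = locate_tool()
print(path if path else "ecm not found on PATH")
```

If the printed path differs from the `ecm_path` entry written to yafu.ini, that mismatch would be the first thing to check.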


For fun, I started a Colab instance against 4788 to see what it would do (or not do):
[code]
fac: factoring 521522190388785541331160787195976801952432005240028067707250869703871311778809493591678394522430885815662329064194127806602359010987662524451917763292547675317210495917002745173292704484659
fac: using pretesting plan: normal
fac: no tune info: using qs/gnfs crossover of 93 digits
div: primes less than 10000
rho: x^2 + 3, starting 200 iterations on C189
rho: x^2 + 2, starting 200 iterations on C189
rho: x^2 + 1, starting 200 iterations on C189
pm1: starting B1 = 150K, B2 = gmp-ecm default on C189
ecm: 30/30 curves on C189, B1=2K, B2=gmp-ecm default
ecm: 74/74 curves on C189, B1=11K, B2=gmp-ecm default
ecm: 214/214 curves on C189, B1=50K, B2=gmp-ecm default, ETA: 1 sec
pm1: starting B1 = 3750K, B2 = gmp-ecm default on C189
ecm: 430/430 curves on C189, B1=250K, B2=gmp-ecm default, ETA: 2 sec
pm1: starting B1 = 15M, B2 = gmp-ecm default on C189[/code]Then I tried something more reasonable:
[code]fac: factoring 141210648600325541705449110385440866025525895296676434777837870153850236136035700644737172874485493349
fac: using pretesting plan: normal
fac: no tune info: using qs/gnfs crossover of 93 digits
div: primes less than 10000
rho: x^2 + 3, starting 200 iterations on C102
rho: x^2 + 2, starting 200 iterations on C102
rho: x^2 + 1, starting 200 iterations on C102
pm1: starting B1 = 150K, B2 = gmp-ecm default on C102
ecm: 30/30 curves on C102, B1=2K, B2=gmp-ecm default
ecm: 74/74 curves on C102, B1=11K, B2=gmp-ecm default
ecm: 214/214 curves on C102, B1=50K, B2=gmp-ecm default, ETA: 0 sec
pm1: starting B1 = 3750K, B2 = gmp-ecm default on C102
ecm: 430/430 curves on C102, B1=250K, B2=gmp-ecm default, ETA: 1 sec
pm1: starting B1 = 15M, B2 = gmp-ecm default on C102
ecm: 170/170 curves on C102, B1=1M, B2=gmp-ecm default, ETA: 5 sec
nfs has not been enabled

starting SIQS on c102: 141210648600325541705449110385440866025525895296676434777837870153850236136035700644737172874485493349

==== sieving in progress ( 2 threads): 131760 relations needed ====
==== Press ctrl-c to abort and save state ====[/code]Thanks for the reply.

firejuggler 2021-02-04 18:00

And a really small problem: if the sequence terminates or the term is prime, it doesn't work.

firejuggler 2021-02-04 18:54

Ah! It gets stuck if the largest factor is prime and one of the others is composite, e.g. sequence

20210202

EdH 2021-02-04 19:02

[QUOTE=firejuggler;570870]Ah! It gets stuck if the largest factor is prime and one of the others is composite, e.g. sequence

20210202[/QUOTE]Ah, yes. I forgot about that trouble. I will have to address that. Thanks for reminding me.

LaurV 2021-02-05 10:49

I have already added some terms to sequence 2880388 using Colab and your script (yes, I properly reserved it; the sequences I was working on were all at NFS level, but this one is still SIQS-ing OK).

I hit the mentioned loop (at index 756, the 41-digit factor is prime and the cofactor is a 42-digit composite), but the DB elves got me out of it without my intervention. If you report the factors, the elves will factor the composite after a while, and the Colab script will continue normally. Of course, this is not the desired solution, as the time in between is lost to looping.

Please fix it and post it back at your convenience. :smile:

EdH 2021-02-05 15:23

[QUOTE=LaurV;570907]I have already added some terms to sequence 2880388 using Colab and your script (yes, I properly reserved it; the sequences I was working on were all at NFS level, but this one is still SIQS-ing OK).

I hit the mentioned loop (at index 756, the 41-digit factor is prime and the cofactor is a 42-digit composite), but the DB elves got me out of it without my intervention. If you report the factors, the elves will factor the composite after a while, and the Colab script will continue normally. Of course, this is not the desired solution, as the time in between is lost to looping.

Please fix it and post it back at your convenience. :smile:[/QUOTE]Thanks for the reply. I have a fix that I did somewhere else, but I haven't had a chance to dig it out and modify it for Colab. I think I had to do something odd like look for a color (representing composite) instead of the last term.

I hope to do this soon, but other things are distracting me.

chalsall 2021-02-05 15:30

[QUOTE=EdH;570918]I hope to do this soon, but other things are distracting me.[/QUOTE]

"Life? Don't talk to me about life." :wink:

EdH 2021-02-06 21:23

I believe I have fixed the composite retrieval issue and edited the first post to reflect the changes. It should now retrieve the composite whether it is the last entry or not.

Please let me know if you run into this again.
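To illustrate the idea of picking the composite regardless of its position, here is a minimal Python sketch. It is not the fix used in the script, which reads the status from the factordb page; instead it assumes the factors of the current term are already available as integers and applies a local Miller-Rabin test. The function names are hypothetical, and for very large inputs the test is only a probable-prime check.

```python
def is_probable_prime(n, bases=(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)):
    """Miller-Rabin with fixed bases; probabilistic for very large n."""
    if n < 2:
        return False
    for p in bases:
        if n % p == 0:
            return n == p
    d, r = n - 1, 0
    while d % 2 == 0:      # write n - 1 as d * 2^r with d odd
        d //= 2
        r += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False   # base a is a witness that n is composite
    return True


def first_composite(factors):
    """Return the first factor that is not (probably) prime, else None."""
    for f in factors:
        if not is_probable_prime(f):
            return f
    return None


# The composite 91 = 7 * 13 sits before the largest (prime) factor 101.
print(first_composite([2, 3, 91, 101]))
```

A `None` result corresponds to a fully factored term, in which case there is nothing left to hand to YAFU.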

Thanks for all the feedback!

