mersenneforum.org > EdH How I Create a Colab Session That Factors factordb Composites with YAFU

2019-11-09, 15:50   #1
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

1000000010110₂ Posts

How I Create a Colab Session That Factors factordb Composites with YAFU

(Note: I expect to keep the first post of each of these "How I..." threads up-to-date with the latest version. Please read the rest of each thread to see what may have led to the current set of instructions.)

I will take the liberty of expecting readers to already be somewhat familiar with Google's Colaboratory sessions. There are several threads already on Colab and these should be reviewed by interested readers:

Google Colaboratory Notebook?
GPU72 Notebook Integration...
Notebook Instance Reverse SSH and HTTP Tunnels.
Colab question

I do not, as of yet, have a GitHub account, so I have not uploaded this to GitHub. Others may feel free to do so, if desired.

The following is one way to compile and install a minimally working package of YAFU. For this instance, a repository version of GMP is installed, the current version of GMP-ECM is retrieved and compiled, and YAFU is retrieved and compiled. This is not a fully working version of YAFU, in that it does not include any support for NFS. Since the composites retrieved from factordb are well under 95 digits in length, SIQS is used for any composite not factored by ECM. When run, this session retrieves composites of a chosen size from factordb, factors them and submits the factors back to the db.

To use Colab, you need a Gmail account and will be required to log into that account to run a session.

On to the specifics:

Open a Google Colaboratory session.
Sign in with your Google/Gmail account info.
Choose New Python3 notebook:
Code:
Menu->File->New Python3 notebook (or within popup)
Click Connect to start a session.
Edit title from Untitled... to whatever you like.
Paste the following into the Codeblock:
Code:
#########################################################
### This Colaboratory session is designed to retrieve ###
### composites from factordb.com and factor them with ###
### YAFU. The factors are then sent to factordb.      ###
###                                                   ###
### To adjust the number of composites to retrieve as ###
### well as their size, change the variables          ###
### below this comment block. The size of the random  ###
### number to be used to help avoid collisions (1000) ###
### can also be changed, as well as the offset.       ###
#########################################################

compNum = 3    # Number of composites to run
compSize = 70  # Size of composites to run
ranNum = 1000  # Number for random count
offset = 10

import fileinput
import os
import random
import subprocess
import time
import urllib.request

#reports factors to factordb
def send2db(composite, factors):
    factorline = str(factors)
    sendline = 'report=' + str(composite) + '%3D' + factorline
    dbcall = sendline.encode('utf-8')
    temp2 = urllib.request.urlopen('http://factordb.com/report.php', dbcall)

#checks to see if yafu already exists
#if it does, this portion is skipped
exists = os.path.isfile('yafu')
if not exists:
    print("Installing system packages. . .")
    subprocess.call(["chmod", "777", "/tmp"])
    subprocess.call(["apt", "update"])
    subprocess.call(["apt", "install", "g++", "m4", "make", "subversion", "libgmp-dev", "libtool", "p7zip", "autoconf"])
    #retrieves ecm
    print("Retrieving GMP-ECM. . .")
    subprocess.call(["svn", "co", "svn://scm.gforge.inria.fr/svn/ecm/trunk", "ecm"])
    os.chdir("/content/ecm")
    subprocess.call(["libtoolize"])
    subprocess.call(["autoreconf", "-i"])
    subprocess.call(["./configure", "--with-gmp=/usr/local/"])
    print("Compiling GMP-ECM. . .")
    subprocess.call(["make"])
    subprocess.call(["make", "install"])
    print("Finished installing GMP-ECM. . .")
    os.chdir("/content")
    #retrieves YAFU
    print("Retrieving YAFU. . .")
    subprocess.call(["svn", "co", "https://svn.code.sf.net/p/yafu/code/branches/wip", "/content/yafu"])
    os.chdir("/content/yafu")
    #adjusts the compiler name, thread count and ecm path
    for line in fileinput.input('Makefile', inplace=True):
        print(line.rstrip().replace('CC = gcc-7.3.0', 'CC = gcc'))
    for line in fileinput.input('yafu.ini', inplace=True):
        print(line.rstrip().replace('% threads=1', 'threads=2'))
    for line in fileinput.input('yafu.ini', inplace=True):
        print(line.rstrip().replace('ecm_path=../gmp-ecm/bin/ecm', 'ecm_path=/usr/local/bin/ecm'))
    print("Compiling YAFU. . .")
    subprocess.call(["make", "USE_SSE41=1"])
    print("Finished compiling YAFU. . .")

print("Starting the factoring of", compNum, "composites. . .\n")

#main loop
for x in range(compNum):
    randnum = random.randrange(ranNum) + offset
    #fetch a number from factordb
    dbcall = 'http://factordb.com/listtype.php?t=3&mindig=' + str(compSize) + '&perpage=1&start=' + str(randnum) + '&download=1'
    #some file processing to get the number into a format usable by yafu
    temp0 = urllib.request.urlopen(dbcall)
    temp1 = temp0.read()
    composite = temp1.decode(encoding='UTF-8')
    composite = composite.strip("\n")
    fstart = time.time()
    #print number being worked on
    # print("Composite", x + 1,":", composite, "<",len(composite),">")
    print("Composite {0}: {1} <{2}>".format(x + 1, composite, len(composite)))
    #run yafu
    factorT = subprocess.run(['./yafu', '-silent'], stdout=subprocess.PIPE, input=temp1)
    #find factors from the yafu run in factor.log
    string = ", prp"
    fcheck = 0
    factors = ""
    with open('factor.log', 'r') as file:
        for line in file:
            found = line.rfind(string)
            if found > 0:
                #keep only the value that follows the last " = "
                line = line.rstrip("\n")
                ind = line.rfind(" = ")
                ind += 3
                line = line[ind:]
                if fcheck > 0:
                    factors = factors + "*"
                line = line.split(" ", 1)[0]
                factors = factors + line
                fcheck += 1
    os.remove("factor.log")
    runtime = time.time() - fstart
    #print factors found
    # print("Factors:", factors)
    # print("Factors (%d:%02d):" %(int(runtime / 60), int(runtime % 60)), factors, "\n")
    print("Factors ({0:0>1}:{1:0>2}): {2}\n".format(int(runtime / 60), int(runtime % 60), factors))
    # print("Elapsed time:", int(runtime / 60), "minutes and", int(runtime % 60), "seconds.\n")
    # print("Elapsed time:", runtime("%H:%M:%S"))
    #send number and factors to factordb
    send2db(composite, factors)

#all numbers are completed
print("Completed all", compNum, "composites!")

Click on the Run cell icon or use CTRL-Enter.

The compilations will run for about two and a half minutes. When YAFU finishes its compilation, after a couple of message blocks, if all went well, the factoring process will begin. With the values shown above (compNum = 3, compSize = 70), the default is to factor three 70-digit composites and stop. The factors are sent to the db automatically, so no other manual intervention is needed.

To change the number of composites to work on for each run, edit the compNum variable. To change the size of the composites to work on, edit the compSize variable.

Eventually, I hope to add a more detailed description of all the code.

Last fiddled with by EdH on 2021-07-28 at 19:49
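The log-scanning step in the main loop can be tried on its own, without a YAFU build. The following is only a sketch: `factors_from_log` is a name made up here, and the sample lines are invented stand-ins for factor.log entries, not real YAFU output. It keeps the same idea as the posted loop: pick lines containing ", prp", take what follows the last " = ", and join the values with "*".

```python
def factors_from_log(lines, marker=", prp"):
    """Collect factor values from log lines and join them with '*'.

    Hypothetical helper: mirrors the loop in the first post, but takes
    any iterable of lines so it can be tested without factor.log.
    """
    factors = []
    for line in lines:
        if marker in line:
            # Text after the last " = ", cut at the first following space.
            value = line.rstrip("\n").rsplit(" = ", 1)[-1]
            factors.append(value.split(" ", 1)[0])
    return "*".join(factors)


# Invented sample lines (assumed layout, for illustration only).
sample_log = [
    "01/01/21 00:00:00 v1.34 @ host, prp3 = 101 (curve 1 stg1)\n",
    "01/01/21 00:00:01 v1.34 @ host, starting SIQS\n",
    "01/01/21 00:00:02 v1.34 @ host, prp3 = 103\n",
]
print(factors_from_log(sample_log))
```

Inside the session, the same function could be fed the open factor.log file object in place of the sample list.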
2019-11-09, 16:46   #2
mathwiz

Mar 2019

5·41 Posts

Now to automate it...

Awesome guide! Now we just have to figure out how to wire this up to FactorDB.com so composites are factored automatically.
2019-11-09, 17:07   #3
Dylan14

"Dylan"
Mar 2017

2·293 Posts

I can confirm that the code works. A few things:

1. It suffices to just comment out the lines above the imports after you made the code once.
2. Is there a reason why you use the build option USE_SSE41=1, instead of something that is faster like AVX2? As it appears all of the Colab entities have at least this.
3. I added some more comments to the code below the compilation:
Code:
import random
import subprocess
import urllib.request

compNum = 2    # Number of composites to run
compSize = 80  # Size of composites to run
ranNum = 1000  # Number for random count

def send2db(composite, lastfactor):
    #reports factors to factordb
    factorline = str(lastfactor)
    sendline = 'report=' + str(composite) + '%3D' + factorline
    dbcall = sendline.encode('utf-8')
    temp2 = urllib.request.urlopen('http://factordb.com/report.php', dbcall)

#main loop
for x in range(compNum):  #run compnum composites
    randnum = random.randrange(ranNum)  #pick a random number
    #fetch number from factordb
    dbcall = 'http://factordb.com/listtype.php?t=3&mindig=' + str(compSize) + '&perpage=1&start=' + str(randnum) + '&download=1'
    #some file processing to get the number into a format usable by yafu
    temp0 = urllib.request.urlopen(dbcall)
    temp1 = temp0.read()
    composite = temp1.decode(encoding='UTF-8')
    composite = composite.strip("\n")
    #print composite to test
    print("The composite is", composite)
    #run yafu
    factorT = subprocess.run(['./yafu'], stdout=subprocess.PIPE, input=temp1)
    #find factors from a yafu run
    factor = factorT.stdout.decode('utf-8')
    factorloc = factor.index('***factors found***')
    factorloc += 22
    tail = factor[factorloc:]
    factors = tail[:-34]
    facind = factors.rfind('=')
    facind += 2
    lastfactor = factors[facind:]
    #print last factor found
    print("The last factor is", lastfactor)
    #send factors to fdb
    send2db(composite, lastfactor)

#run complete
print("Completed requested number of composites!")
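For anyone adapting the two factordb requests above, here is a sketch that builds the same query string and report payload with `urllib.parse.urlencode` instead of string concatenation. The helper names `list_url` and `report_payload` are made up for this example. One difference to be aware of: `urlencode` also escapes the "*" between factors as %2A, where the posted code sends it unescaped; that factordb treats both forms alike is an assumption here, not something tested in this thread.

```python
from urllib.parse import urlencode

FACTORDB = "http://factordb.com"


def list_url(min_digits, start):
    """Build the listtype.php URL used to fetch one composite (hypothetical helper)."""
    query = urlencode({
        "t": 3,                # same list type as in the posted code
        "mindig": min_digits,  # minimum digit count (compSize)
        "perpage": 1,
        "start": start,        # random offset to help avoid collisions
        "download": 1,
    })
    return f"{FACTORDB}/listtype.php?{query}"


def report_payload(composite, factors):
    """Build the POST body for report.php as bytes (hypothetical helper)."""
    return urlencode({"report": f"{composite}={factors}"}).encode("utf-8")


print(list_url(80, 15))
print(report_payload("15", "3*5"))
```

The bytes returned by `report_payload` would be passed as the second argument to `urllib.request.urlopen`, as `dbcall` is in `send2db`.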
2019-11-09, 18:11   #4
EdH

"Ed Hall"
Dec 2009

2·29·71 Posts

Quote:
 Originally Posted by mathwiz Awesome guide! Now we just have to figure out how to wire this up to FactorDB.com so composites are factored automatically
Thanks, but I must not understand your comment.

Once started, the composites are retrieved, factored and uploaded to factordb.com automatically. The only manual part is the session start and choosing how many composites to work. Then, all is fully automated.

2019-11-09, 18:26   #5
EdH

"Ed Hall"
Dec 2009

2×29×71 Posts

Quote:
 Originally Posted by Dylan14
 I can confirm that the code works. A few things:
 1. It suffices to just comment out the lines above the imports after you made the code once.
 2. Is there a reason why you use the build option USE_SSE41=1, instead of something that is faster like AVX2? As it appears all of the Colab entities have at least this.
 3. I added some more comments to the code below the compilation:
Thanks Dylan,

I tried a direct copy/paste and lost some formatting. I had to go back to my original. I'm being pulled away ATM, but plan to address all else later.

1. I considered a block delete easier than commenting out lines.
2. I have experienced segmentation faults with AVX2 in the past.
3. Thanks! I'll work on those later.
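On the SSE4.1 versus AVX2 question, one thing that can at least be checked from inside a session is which instruction sets the assigned CPU advertises. This is a minimal sketch assuming a Linux host with /proc/cpuinfo (which Colab sessions appear to provide); `cpu_flags` and `read_cpuinfo` are names invented here, and the sample text is made up. It says nothing about why an AVX2 build might segfault; it only reports the flags.

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line as a set."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the raw cpuinfo text (Linux only)."""
    with open(path) as handle:
        return handle.read()


# Made-up sample so the parsing can be tried anywhere.
sample = "processor\t: 0\nflags\t\t: fpu sse4_1 avx avx2\n"
flags = cpu_flags(sample)
print("sse4_1:", "sse4_1" in flags, "avx2:", "avx2" in flags)
```

In a live session, `cpu_flags(read_cpuinfo())` would replace the sample text.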

2019-11-09, 23:30   #6
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

2×29×71 Posts

I made some changes, but unfortunately, the AVX2 option causes SIQS to return before completing, and the zero value for factorloc crashes the run. I'll work on this more later.
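A run that ends early does not have to take the whole session down. The sketch below looks for the "***factors found***" marker with `str.find` and returns None when it is missing, so the caller can skip that composite. It is not the code from either post: `factors_from_stdout` is a made-up name, and the assumption that factors appear after the marker as lines such as "P2 = 13" or "PRP3 = 101" is mine and should be checked against real YAFU output.

```python
import re

MARKER = "***factors found***"
# Assumed line layout after the marker: "P<digits> = <value>" or "PRP<digits> = <value>".
FACTOR_LINE = re.compile(r"^[ \t]*(?:P|PRP)\d+ = (\d+)[ \t]*$", re.MULTILINE)


def factors_from_stdout(text):
    """Return a list of factor strings, or None if the marker never appears."""
    start = text.find(MARKER)
    if start < 0:
        return None  # yafu returned early; let the caller decide what to do
    return FACTOR_LINE.findall(text[start + len(MARKER):])


# Invented output for illustration only.
sample_out = "noise\n***factors found***\n\nP2 = 13\nPRP3 = 101\n\nans = 1\n"
print(factors_from_stdout(sample_out))
print(factors_from_stdout("siqs stopped early\n"))
```

In the main loop, a None result could simply `continue` to the next composite instead of reporting to the db.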
2019-11-10, 03:52   #7
LaurV
Romulan Interpreter

"name field"
Jun 2011
Thailand

2654₁₆ Posts

Quote:
 Originally Posted by EdH Thanks, but I must not understand your comment. Once started, the composites are retrieved, factored and uploaded to factordb.com automatically. The only manual part is the session start and choosing how many composites to work. Then, all is fully automated.
I think he meant, more or less seriously, something along the lines of factordb itself being "wired" to run such a script on Colab by itself.

2019-11-10, 04:01   #8
EdH

"Ed Hall"
Dec 2009

10026₈ Posts

Quote:
 Originally Posted by LaurV I think he meant, more or less seriously, something along the lines of factordb itself being "wired" to run such a script on Colab by itself.
AH! Thank you! I was correct that I must not have understood. Indeed, I did not. But now I do see how it was meant, with your assistance. I fear factordb would overrun Colab if such was the case, though. . .

2019-11-10, 04:18   #9
LaurV
Romulan Interpreter

"name field"
Jun 2011
Thailand

10011001010100₂ Posts

Quote:
 Originally Posted by EdH I fear factordb would overrun Colab if such was the case, though. . .
That's for sure. One cannot compare the 20 or 50 real cores that Syd has with the 1 virtual core that Colab gives you. But 101 miles per hour is better than 100 miles per hour (this I learned on this forum!)

2019-11-11, 16:10   #10
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

2·29·71 Posts

I made some major changes, all reflected in the original post. All comments welcome. . .
2019-11-11, 17:20   #11
bsquared

"Ben"
Feb 2007

3·1,193 Posts

Are Colab sessions single-threaded? If not, it would be helpful to run multithreaded.
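The number of logical CPUs a session exposes can be read from Python directly. This is a small sketch, with `thread_setting` being a name made up here: it formats a "threads=N" line of the kind the first post writes into yafu.ini (where it is fixed at threads=2), using `os.cpu_count()` and an optional cap. Whether more threads actually help on Colab's virtual cores is not something this shows.

```python
import os


def thread_setting(max_threads=None):
    """Return a 'threads=N' line based on the visible CPU count (hypothetical helper)."""
    count = os.cpu_count() or 1  # cpu_count() may return None
    if max_threads is not None:
        count = min(count, max_threads)
    return f"threads={count}"


print("logical CPUs:", os.cpu_count())
print(thread_setting())
```

The returned string could stand in for the hard-coded 'threads=2' in the yafu.ini replacement step.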
